The ASM instruction you always wanted, but never had?

category: code [glöplog]
architect1: why yes, yes it does. But we want it on 6502 :)

sigflup: that makes no logical sense. It must be:

mov cl,3

There. NOW it's perfect :)
added on the 2010-07-12 20:48:30 by ferris ferris

divz <register> ; divide by zero so I could destroy the world without Devpac throwing a hissy fit!

added on the 2010-07-12 20:55:33 by Gmitts Gmitts

There's no chip with this instruction! Not even one! How could the developers have fucked up so badly?
added on the 2010-07-12 20:58:15 by sigflup sigflup
sse add-all-components-and-store-in-first (or somewhere else for that matter) for dotproducts et cetera w/o tedious shuffling

yes, i know sse3 had haddps. getting there.
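
For reference, a minimal Python model of what haddps does and how two of them give the "add all components" result for a dot product. The lists stand in for 4-float SSE registers (lane 0 first); the function names mulps, haddps and dot4 are just labels for this sketch, not a real API.

```python
def mulps(a, b):
    """Lane-wise multiply of two 4-float vectors."""
    return [x * y for x, y in zip(a, b)]

def haddps(a, b):
    """Model of SSE3 HADDPS: pairwise sums of a in the low lanes, of b in the high lanes."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

def dot4(a, b):
    """Dot product via one multiply and two horizontal adds, no shuffles."""
    p = mulps(a, b)
    s = haddps(p, p)   # [p0+p1, p2+p3, p0+p1, p2+p3]
    s = haddps(s, s)   # every lane now holds p0+p1+p2+p3
    return s

print(dot4([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]))
```

So the sum ends up in every lane, which covers the "store in first" case, at the cost of two dependent horizontal adds.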
added on the 2010-07-12 21:03:02 by superplek superplek
added on the 2010-07-12 21:03:34 by superplek superplek
at the time of the i386 DOS coding, I always missed the

Code: sdmp

instruction (Set Demo Mode Please) which would

1. initialize the 32bit protected mode
2. setup the 320x240 32bpp linear framebuffer gfx mode
3. initialize the sound system

so we wouldn't have to mess with dos4gw, dpmi, vesa, sb and all that crap. Alternatively, a DOS interrupt would have rocked.

Code:
mov eax, GFX_320_240_32 | SND_SB_220_44100_16
int 21h

but it never happened :(
added on the 2010-07-12 21:29:56 by iq iq
perlinnoise cof, cof, cof;
added on the 2010-07-12 21:45:07 by xernobyl xernobyl
rlwinm rA, rS, SH, MB, ME

but on a 68000.
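
For anyone who hasn't met it: rlwinm is PowerPC's "rotate left word immediate then AND with mask". A rough Python model, assuming 32-bit registers and PowerPC bit numbering (bit 0 is the most significant bit); the helper names rotl32 and ppc_mask are made up for this sketch.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def ppc_mask(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB); the run wraps around when mb > me."""
    hi = MASK32 >> mb                     # ones from bit mb down to bit 31
    lo = (MASK32 << (31 - me)) & MASK32   # ones from bit 0 down to bit me
    return (hi & lo) if mb <= me else (hi | lo)

def rlwinm(rs, sh, mb, me):
    """rA = ROTL32(rS, SH) & MASK(MB, ME)"""
    return rotl32(rs, sh) & ppc_mask(mb, me)

print(hex(rlwinm(0x12345678, 8, 24, 31)))   # extract the top byte
print(hex(rlwinm(0x12345678, 4, 0, 27)))    # plain shift left by 4
```

One instruction covers shifts, rotates and bit-field extracts, which is why it gets missed on other CPUs.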
added on the 2010-07-12 21:47:30 by Paranoid Paranoid
arm code:
Code:add r0, r1, r2, lsl #4

which adds r1 to r2 left-shifted by 4 bits, and stores the result in r0.

the shift value could be in a register too.
Code:
add r0, r0, r0, lsl #8   ; multiplies r0 by 257 mod 2^32
add r0, r0, #47          ; adds a prime number

each time you execute this two-instruction LCG pseudo-random generator you get a uniformly-distributed 32-bit random number. Ideal for starfields and other noise generation.
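
A minimal Python sketch of the same recurrence, r0 = r0 * 257 + 47 (mod 2^32). The bits parameter and the period helper are additions for this sketch only, so the cycle length can be brute-forced on a 16-bit version of the generator.

```python
def lcg_step(r0, bits=32):
    """One step of the two-instruction generator: r0 = r0 * 257 + 47 (mod 2**bits)."""
    mask = (1 << bits) - 1
    r0 = (r0 + (r0 << 8)) & mask   # add r0, r0, r0, lsl #8  -> r0 * 257
    r0 = (r0 + 47) & mask          # add r0, r0, #47
    return r0

def period(bits, seed=0):
    """Count steps until the state returns to the seed (brute force, small widths only)."""
    x, n = lcg_step(seed, bits), 1
    while x != seed:
        x, n = lcg_step(x, bits), n + 1
    return n

print([lcg_step(x) for x in (0, 1, 47)])   # next value from a few seeds
print(period(16))                          # cycle length of the 16-bit version
```

Since 47 is odd and 257 - 1 is a multiple of 4, the Hull-Dobell conditions hold and the generator visits every state once per cycle. Note that 257 is 1 mod 256, so the low byte just steps by 47 each call; taking the high bits is probably the safer choice.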
added on the 2010-07-12 23:39:09 by zerkman zerkman
those are just limited FMAD instructions :p
added on the 2010-07-13 00:24:33 by shuffle2 shuffle2
What about:

FpuUnconditionalCheckKernel ?
added on the 2010-07-13 01:08:04 by T$ T$
no, because:
- to use fmad, you need to initialize a destination register with the value to be added, hence at least two instructions and two registers. My solution only uses one single register, and can be called without initialization as many times as needed to generate as many random numbers as needed.
- fmad is floating-point, hence not suitable for an LCG random number generator.
added on the 2010-07-13 08:09:39 by zerkman zerkman
6510 move zero page
added on the 2010-07-13 14:23:00 by linde linde
I miss a simple swap.w on 68000, a rol.w is much too slow (on 020+ the rol is fast though), and a swap.b wouldn't hurt either.
added on the 2010-07-13 14:32:37 by evil evil
hmm, yeah. Every 16-bit+ processor should have a swap instruction. Except in the case of SIMD, e.g. x86/SSE, where the 'shuffle' instructions are a result of poor SSE (1.0, 2.0, 3.0?) instruction set design.
added on the 2010-07-13 15:05:04 by trc_wm trc_wm
clz or div on ARM7tdmi aka GBA.
added on the 2010-07-13 15:11:37 by raer raer
Mooooar registersss
added on the 2010-07-13 15:11:58 by Optimonk Optimonk
or more Brainsss, eh? ;P
added on the 2010-07-13 15:13:46 by raer raer
In the past, I've always wondered why some processors do not include more registers. I would have killed for more (general purpose) registers on the 8086.

You'd think that adding a few registers wouldn't hurt anyone. However, adding more registers means more bits are needed in the opcode to define which register is accessed. In addition, most processors have multi-port register banks (multiple registers can be read, or sometimes written, at once). These register banks can get quite complex, especially if you have a pipelined design. For instance, you need many more multiplexers and have to handle write contention if the CPU can write more than one register at a time.
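
To put rough numbers on the opcode cost, a small Python sketch. It assumes a plain binary register field per operand, which is a simplification; register_field_bits is a name made up for this example.

```python
def register_field_bits(num_registers, operands):
    """Opcode bits spent on naming registers: ceil(log2(num_registers)) bits per operand."""
    bits_per_operand = (num_registers - 1).bit_length()
    return bits_per_operand * operands

for regs, ops in [(8, 2), (16, 2), (32, 3)]:
    print(regs, "registers,", ops, "operands:", register_field_bits(regs, ops), "bits")
```

With the 8086's 8 general-purpose registers, each register field takes 3 bits; going to 16 registers costs one extra bit per operand in every instruction that names a register.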

Adding more registers not only adds to the chip area, making it more expensive; the added complexity also increases the access time of the register bank, so the CPU will be slower. With modern IC processes (0.18 µm and smaller), these things are less of a problem, but in the '70s, '80s and '90s these were undoubtedly some of the reasons for register-limited designs.

added on the 2010-07-13 15:22:01 by trc_wm trc_wm
I love the auxiliary registers on the z80. Love to do exx, ex af,af' (I rarely use it though), or carefully select my registers so that I can do ex de,hl especially to add directly to (hl). I play a lot with them. I am wondering why the x86 (where the 8086 is from 8080 that is from z80 ancestor I think?) doesn't have them.

Also the z80 has IX and IY, which are slow, but you can use IXh, IXl, IYh, IYl, which proved very useful when I just need a loop counter in outer loops and can't find anything else (or am too lazy to write some ugly self-modifying code just before the jump). Recent versions of the Winape32 assembler let you use them directly without having to remember the hex codes as in old assemblers on the CPC.

But yes, the register thing, probably you would need more space in the opcode structure. I was talking with Antitec of Dirty Minds and we always thought that if there were just one more 16-bit register, it would be just fine, because there are a lot of routines where you struggle, and with just one more register it would fit perfectly. That's what we thought. Although today I am better with the Z80 and I love to play with the regs and always find a good solution for my code to fit well with the regs (and a pair of well-placed EXXs is a solution too and only loses 2 NOP cycles).
added on the 2010-07-13 15:38:59 by Optimonk Optimonk
iq: that's a bit silly don't you think? you're basically asking for a wholly different and complex OS/bios :)
added on the 2010-07-13 15:45:14 by superplek superplek
I am wondering why the x86 (where the 8086 is from 8080 that is from z80 ancestor I think?) doesn't have them.

Some guys (including Federico Faggin) left Intel to form Zilog after a disagreement on the architecture of the 8085. Intel finalized the 8085 and Zilog produced the z80. They were direct competitors, so neither was probably keen to 'borrow' each other's ideas.
added on the 2010-07-13 15:49:43 by trc_wm trc_wm
Z80 auxiliary registers are a hack. The designers understood the desire for extra registers but needed to keep compatibility with the mainstream-at-the-time 8080. Maybe it seemed like a great idea at the time, but it was a hack nevertheless: very sharply aimed at the present, but without much of a future. From Intel's perspective it was a much better idea to move on to the more advanced 16-bitters instead, which they did.
added on the 2010-07-13 16:05:39 by svo svo
6502, mov. How I miss it.
added on the 2010-07-13 16:09:03 by visy visy
A question for those who have experience programming the Amiga or Atari ST..

How often does the CPU wait for the graphics processor? I assume graphics memory is shared and the CPU is denied memory access when the GFX processor is reading data to build the screen. I'm not familiar with the memory architecture on these machines.
added on the 2010-07-13 18:47:14 by trc_wm trc_wm