
Emu10k1 Instructions

The instructions, again, are: [1]

Opcode  Mnemonic  Instruction                      Comment
0x0     MACS      R = A + (X * Y >> 31)            saturation
0x1     MACS1     R = A + (-X * Y >> 31)           saturation
0x2     MACW      R = A + (X * Y >> 31)            wraparound
0x3     MACW1     R = A + (-X * Y >> 31)           wraparound
0x4     MACINTS   R = A + X * Y                    saturation
0x5     MACINTW   R = A + X * Y                    wraparound (31-bit)
0x6     ACC3      R = A + X + Y                    saturation
0x7     MACMV     R = A, acc += X * Y              67-bit accumulator; you can grab the MS 32 bits or the LS 32 bits
0x8     ANDXOR    R = (A & X) ^ Y
0x9     TSTNEG    R = (A >= Y) ? X : ~X
0xa     LIMIT     R = (A >= Y) ? X : Y
0xb     LIMIT1    R = (A < Y) ? X : Y
0xc     LOG       ...
0xd     EXP       ...
0xe     INTERP    R = A + (X * (Y - A) >> 31)      saturation

Operand Formats:

      opcode R,A,X,Y

Important: The operands always represent the address of a register. The Emu10k1 architecture does not implement indirect, immediate or any other type of addressing mode.

Instruction Specifics

In this section I will try to give a detailed description of each instruction. Since Creative/Emu have never released any official documents, some of the instructions are still not fully understood. As well, in some instances, special quirks arise (such as the ACCUM quirk); more quirks may also be lurking, so beware.

If you discover any inaccuracies or would like to add something, don't hesitate to email me.



MACS

Formula:     R = A + (X * Y >> 31)

MACS1

Formula:     R = A + (-X * Y >> 31)


These instructions are frequently referred to as "fractional" macs. They perform the multiplication of operands X (or -X) and Y. Of the resulting 63 bit value, the 32 most significant bits (the "High" value) are taken and added to operand A. The result of the operation is then stored in the R operand.

They are called fractional macs because the operation of using the 32 MSB is equivalent to dividing by 2^31. Thus one could say that the formula is also given by:     R = A+X*Y/(2^31)

When defining constants to be stored into general purpose registers (GPRs), a special operand can be used for the macs instruction. The "#" indicates to the assembler that the value of the operand should be multiplied by (2^31-1).

For example: #0.5 would be multiplied by (2^31-1) to give a value (in hex) of $3fffffff. Fractional macs allow us to implement non-integer values (such as filter coefficients) using integer mathematics. The real numbers (between 1 and -1) are essentially quantized to 2^31 different values, giving us a delta (smallest increment between two values) of 2^(-31)=4.65661287307739e-10, which is more than accurate enough for most calculations.
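The fractional arithmetic described above can be tried out on a host machine. The following is a minimal Python sketch, not a description of the hardware: the helper names (saturate, frac_const, macs, macs1) are made up for illustration, the "#" conversion is assumed to truncate toward zero (which matches the $3fffffff example), and saturation is assumed to apply to the final sum.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -2**31

def saturate(v):
    """Clamp v to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, v))

def frac_const(x):
    """Model of the '#' operand: scale by (2^31 - 1).
    Truncation toward zero is an assumption."""
    return int(x * INT32_MAX)

def macs(a, x, y):
    """Model of MACS: R = A + (X * Y >> 31), saturated."""
    return saturate(a + ((x * y) >> 31))

def macs1(a, x, y):
    """Model of MACS1: R = A + (-X * Y >> 31), saturated.
    Python's >> rounds toward minus infinity; how the hardware
    rounds negative products is not confirmed here."""
    return saturate(a + ((-x * y) >> 31))

half = frac_const(0.5)
print(hex(half))             # 0x3fffffff
print(macs(0, half, 1000))   # 499
```

Note that scaling 1000 by #0.5 gives 499 rather than 500 in this model, because $3fffffff is very slightly less than one half.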

For more info on storing values in GPRs see the DC directive.


These instructions saturate on overflow.



MACW

Formula:     R = A + (X * Y >> 31)

MACW1

Formula:     R = A + (-X * Y >> 31)


These instructions behave in exactly the same way as the MACS and MACS1 instructions, except that they handle overflows differently.

Overflow Behaviour

These instructions wrap around upon overflow. As a simple illustration, imagine a number system with possible values between -9 and 9.

With saturation: 8+4=9
With wraparound: 8+4=-7

The saturate method produces a smaller error on overflow (errors result in noise), and thus at first glance offers a better handling of overflow. However, the saturate method prevents us from using an important property of two's complement arithmetic:

"If several 2's-complement numbers whose sum would not overflow are added, then the result of 2's-complement accumulation of these numbers is correct, even though intermediate sums might overflow"[4].

This property can be used in FIR filtering where one has taken care to design the system such that the total output does not overflow. Note, however, that this property does not hold true through a multiplication (thus you cannot blindly use this in IIR filtering).
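The quoted two's complement property can be checked numerically. The sketch below is plain Python with hypothetical helper names (wrap32, sat32, accumulate); it adds the same terms once with wraparound and once with saturation. The true sum fits in 32 bits, but two of the intermediate sums do not.

```python
def wrap32(v):
    """Wrap v into the signed 32-bit range (two's complement)."""
    return (v + 2**31) % 2**32 - 2**31

def sat32(v):
    """Clamp v to the signed 32-bit range."""
    return max(-2**31, min(2**31 - 1, v))

def accumulate(values, fix):
    """Add the values one by one, applying fix() after every addition."""
    acc = 0
    for v in values:
        acc = fix(acc + v)
    return acc

# True sum is 2**30, but the running sum exceeds 2**31 - 1 on the way.
terms = [2**30, 2**30, 2**30, -2**30, -2**30]

print(accumulate(terms, wrap32) == 2**30)   # True: wraparound recovers
print(accumulate(terms, sat32))             # -1: saturation lost information
```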


MACINTS, MACINTW

Formula:     R = A + X * Y


Both of these perform an integer mac operation.


MACINTS saturates upon overflow.

With MACINTW, the result is wrapped around but the sign bit (bit 31) is zeroed. Essentially the wrap around occurs around bit 30 instead of bit 31 (I have no idea why this would be useful).
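The difference between the two overflow behaviours can be modelled as follows. This is only a sketch of the description above: the function names are illustrative, and clearing bit 31 with a mask is an assumption about how "wrap around bit 30" is realized.

```python
def macints(a, x, y):
    """Model of MACINTS: R = A + X * Y, saturated to signed 32 bits."""
    return max(-2**31, min(2**31 - 1, a + x * y))

def macintw(a, x, y):
    """Model of MACINTW: R = A + X * Y, keeping only bits 0..30
    so that the sign bit (bit 31) is always zero (assumed behaviour)."""
    return (a + x * y) & 0x7FFFFFFF

print(hex(macints(0x7FFFFFFF, 1, 1)))   # 0x7fffffff
print(hex(macintw(0x7FFFFFFF, 1, 1)))   # 0x0
```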


ACC3

Formula:     R = A + X + Y


ACC3 is pretty straightforward. It simply sums the three operands, placing the result in R. The result is saturated upon overflow.

It should be noted that the accumulation occurs in the High Accumulator.


MACMV

Formula:     R = A, acc += X * Y

The MACMV instruction combines a multiply-accumulate and a parallel move into one instruction. The result of the X * Y multiplication is added to the accumulator. The result must be fetched via a MAC, MACINT, or ACC3 instruction following the series of MACMV instructions. The accumulator register address (0x56) can only be specified as the A operand; if it is used as X or Y, the emu10k1 will use 0 instead.

The accumulator is 67 bits wide and will wrap around on overflow. The ACC3 and MACS instructions fetch the HIGH accumulator, whereas the MACINT instructions fetch the LOW accumulator. When fetched, if the accumulator contains a value more than 63 bits in length, the accumulator will be saturated.

The MACMV instruction is most useful for FIR filters, as it can process each delayed unit in one instruction, including shifting the delays. Hence an Nth-order FIR filter uses just a little over N+1 instructions (plus a few overhead instructions).
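To illustrate how one MACMV per tap can both accumulate and shift the delay line, here is a small Python model. It is a sketch under several assumptions: fir_step and its arguments are hypothetical names, reading the HIGH accumulator is modelled as a right shift by 31, and the 67-bit wraparound and fetch-time saturation are not modelled at all.

```python
def fir_step(delay, coeffs, sample):
    """Process one input sample.

    delay  : list with len(coeffs) + 1 entries (the last one is scratch)
    coeffs : fractional coefficients scaled by 2^31
    """
    delay[0] = sample
    acc = 0
    # Walk from the oldest tap to the newest so that each move
    # lands in a slot whose value has already been used.
    for k in range(len(coeffs) - 1, -1, -1):
        # One loop iteration models one MACMV:
        #   R = delay[k+1], A = delay[k], X = delay[k], Y = coeffs[k]
        delay[k + 1] = delay[k]          # R = A
        acc += delay[k] * coeffs[k]      # acc += X * Y
    return acc >> 31                     # assumed "high accumulator" read

coeffs = [2**30, 2**30]                  # two taps of 0.5: a moving average
delay = [0] * (len(coeffs) + 1)
print([fir_step(delay, coeffs, s) for s in (1000, 3000, 0)])
# [500, 2000, 1500]
```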


ANDXOR

Formula:     R = (A & X) ^ Y


The ANDXOR instruction can be used to synthesize standard logical operations such as AND, OR, XOR, NAND, NOR and NOT.[5]
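The operand choices that turn R = (A & X) ^ Y into each logical operation follow directly from Boolean algebra. The Python sketch below checks them on 32-bit values. The function names are illustrative, and the operand choices are derived from the formula rather than copied from the assembler's macro file, so the macros may be written differently. Note that OR and NOR need the complement of one input, which on the DSP would have to be computed into a register first.

```python
MASK32 = 0xFFFFFFFF

def andxor(a, x, y):
    """Model of ANDXOR: R = (A & X) ^ Y on 32-bit values."""
    return ((a & x) ^ y) & MASK32

def and_(a, b):
    return andxor(a, b, 0)                 # Y = 0

def xor_(a, b):
    return andxor(a, MASK32, b)            # X = all ones

def not_(a):
    return andxor(a, MASK32, MASK32)       # X = Y = all ones

def nand_(a, b):
    return andxor(a, b, MASK32)            # Y = all ones

def or_(a, b):
    return andxor(a, ~b & MASK32, b)       # (a & ~b) ^ b == a | b

def nor_(a, b):
    nb = ~b & MASK32
    return andxor(a, nb, nb)               # (a & ~b) ^ ~b == ~(a | b)

a, b = 0x0F0F00FF, 0x00FF0F0F
print(hex(and_(a, b)), hex(or_(a, b)), hex(xor_(a, b)))
```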


The as10k1 assembler is distributed with a file called "emu_constants.asm"; this file contains macros with these logical operations already synthesized.



TSTNEG

Formula:     R = (A >= Y) ? X : ~X

LIMIT

Formula:     R = (A >= Y) ? X : Y

LIMIT1

Formula:     R = (A < Y) ? X : Y


The results of these operations are conditional upon the values of A and Y. The result of the TSTNEG instruction will be complemented if A<Y. The LIMIT and LIMIT1 instructions function in a similar manner, but the resultant can be X or Y depending on the condition.
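The three formulas translate almost literally into code. The following Python sketch uses illustrative function names and assumes that the comparison is a signed one, which the text above does not state explicitly.

```python
def tstneg(a, x, y):
    """Model of TSTNEG: R = (A >= Y) ? X : ~X  (~ is the 1's complement)."""
    return x if a >= y else ~x

def limit(a, x, y):
    """Model of LIMIT: R = (A >= Y) ? X : Y."""
    return x if a >= y else y

def limit1(a, x, y):
    """Model of LIMIT1: R = (A < Y) ? X : Y."""
    return x if a < y else y

print(tstneg(1, 5, 2))    # -6, since A < Y gives ~5
print(limit(3, 10, 5))    # 5
print(limit1(3, 10, 5))   # 10
```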


LOG

Behaviour: The LOG instruction converts linear data into a sign-exponent-mantissa form. The size of the exponent (i.e. the number of bits it occupies) is variable between 2 and 5 bits. The sign occupies one bit and the mantissa occupies the rest. The LOG instruction can also perform absolute value, negative absolute value, as well as negation on the resultant.[2]

The resultant is stored in the following exponential form:

Sign     Exponent    Mantissa
1 bit    2-5 bits    29-26 bits

Instruction Format:      LOG      R,Lin_data,Max_exponent,Sign

where

2 <= Max_exponent <= 31 (=0x1f);

and "Sign" is a register containing a two-bit number that has the following properties:

Value in Sign operand   Description
00                      Normal
01                      Absolute value*
10                      Negative of absolute value*
11                      Negative*

* Note that a 1 bit error exists in inverted values because it is actually the 1's complement that is taken. (Still close enough for Rock and Roll! :-)

The way it works [?]:

Example of conversions (max_exp=7 (3 bits), sign=00):

                           Break Down
Linear Value  LOG value    Sign  Exponent  Mantissa    Implicit MSB?
0x40004000    0x70001000   0     7         0x0001000   Yes
0x20002000    0x60001000   0     6         0x0001000   Yes
0x10001000    0x50001000   0     5         0x0001000   Yes
0x08000800    0x40001000   0     4         0x0001000   Yes
0x04000400    0x30001000   0     3         0x0001000   Yes
0x02000200    0x20001000   0     2         0x0001000   Yes
0x01000100    0x10001000   0     1         0x0001000   Yes
0x00800080    0x08000800   0     0         0x8000800   No
0xff008000    0xf008000f   1     0         0x008000f   No
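The conversions in the table above can be reproduced with a small model. The Python sketch below was worked out from the table rows only and has been checked against nothing else, so treat it as a guess at the encoding rather than a specification. The function name log_encode is made up, the sign operand is fixed at 00, deriving the exponent width from max_exp is an assumption, and negative inputs are assumed to be handled by 1's complementing before and after the conversion.

```python
MASK32 = 0xFFFFFFFF

def log_encode(value, max_exp=7):
    """Encode a 32-bit pattern into sign-exponent-mantissa form.
    Only checked against the max_exp=7 table rows."""
    exp_bits = max_exp.bit_length()       # 7 -> 3 bits (assumption)
    man_bits = 31 - exp_bits              # 28 mantissa bits for max_exp=7
    negative = bool(value & 0x80000000)
    if negative:
        value = ~value & MASK32           # work on the 1's complement
    msb = value.bit_length() - 1          # position of leading 1 (-1 if zero)
    exp = msb - 30 + max_exp
    if exp >= 1:
        frac = value - (1 << msb)         # drop the implicit MSB
        shift = man_bits - msb
    else:
        exp = 0                           # small values keep their MSB
        frac = value
        shift = man_bits - (31 - max_exp) # same alignment as exponent 1
    mantissa = frac << shift if shift >= 0 else frac >> -shift
    result = (exp << man_bits) | mantissa
    if negative:
        result = ~result & MASK32         # complement the whole result
    return result

print(hex(log_encode(0x40004000)))   # 0x70001000
print(hex(log_encode(0xff008000)))   # 0xf008000f
```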



EXP

This instruction performs the opposite of the LOG instruction.


INTERP

Formula:    R = A + (X * (Y - A) >> 31)


Used for linearly interpolating between two points. X should be positive and represents a fractional value between 0 and 1: it is the fraction of the interval between A and Y where the desired value is located.

The INTERP instruction is not only useful for linear interpolation; it can also be used for rescaling values. In such a case, the input must be bounded by [0,1], and the output will be bounded by [A,Y]. Thus the instruction can be thought of as:

   interp    R,MIN,X,MAX

where MIN and MAX are the bounds of your output.
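As a quick numerical check of the formula, here is a Python sketch of INTERP. The function name is illustrative, X is a fraction scaled by 2^31, and applying saturation to the final sum is an assumption based on the opcode table.

```python
def interp(a, x, y):
    """Model of INTERP: R = A + (X * (Y - A) >> 31), saturated."""
    r = a + ((x * (y - a)) >> 31)
    return max(-2**31, min(2**31 - 1, r))

HALF = 2**30   # 0.5 as a fraction scaled by 2^31

print(interp(100, HALF, 300))   # 200, halfway between MIN=100 and MAX=300
print(interp(100, 0, 300))      # 100, X = 0 returns MIN
```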


SKIP

The skip command is available to provide some flow control in DSP programs. The skip command takes the following operands:


COUNT is a register containing the number of instructions to skip.

CC_TEST is a register containing a value which indicates the conditions under which to skip.

CCR is the address of the CCR register. Any register can be used here, so a previously saved CCR can be reused, or a constant can be used to implement an always skip (a NOP).

R is an address to which the value of the CCR is copied for future use.

The CC_TEST operand uses one of 4 equations.

Form 1:   ( S * Z * M * B * N * S' * Z' * M' * B' * N')
        + ( S * Z * M * B * N * S' * Z' * M' * B' * N')
        + ( S * Z * M * B * N * S' * Z' * M' * B' * N')

Form 2:   ( S + Z + M + B + N + S' + Z' + M' + B' + N')
        * ( S + Z + M + B + N + S' + Z' + M' + B' + N')
        * ( S + Z + M + B + N + S' + Z' + M' + B' + N')

Form 3:   ( S * Z * M * B * N * S' * Z' * M' * B' * N')
        + ( S * Z * M * B * N * S' * Z' * M' * B' * N')
        + ( S + Z + M + B + N + S' + Z' + M' + B' + N')

Form 4:   ( S + Z + M + B + N + S' + Z' + M' + B' + N')
        * ( S + Z + M + B + N + S' + Z' + M' + B' + N')
        + ( S * Z * M * B * N * S' * Z' * M' * B' * N')

Presumably, the two most significant bits of CC_TEST select one of the above equations. The remaining 30 bits act as a mask that controls whether an element is active.

Through informed trial and error, the following CC_TEST values have been discovered:

Name  Branch on                  Formula  CC_TEST
beq   Equal-to (zero)            ==0      0x00000008
bne   Not equal-to               !=0      0x00000100
blt   Less-than                  <0       0x00000004
ble   Less-than or equal-to      <=0      TBD
bgt   Greater-than               >0       0x00000180
bge   Greater-than or equal-to   >=0      0x00000180
bsa   On saturation              --       0x00000010
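The discovered values are consistent with one possible reading of the mask, in which bits 0 to 4 select N, B, M, Z and S, and bits 5 to 9 select their complements. That layout is a hypothesis inferred from the table, not documented behaviour. The Python sketch below, with made-up names, evaluates only the low ten mask bits as a single product term under that hypothesis and ignores the form-select bits and the other two terms.

```python
# Hypothesised bit positions of the plain flags; the primed
# (complemented) flags are assumed to sit five bits higher.
FLAG_BITS = {"N": 0, "B": 1, "M": 2, "Z": 3, "S": 4}

def low_term_matches(cc_test, flags):
    """flags maps each flag name to True/False.
    Returns True if every selected element of the low term holds."""
    mask = cc_test & 0x3FF
    if mask == 0:
        return False                      # nothing selected: never skip (assumed)
    for name, bit in FLAG_BITS.items():
        if mask & (1 << bit) and not flags[name]:
            return False                  # plain flag required but clear
        if mask & (1 << (bit + 5)) and flags[name]:
            return False                  # primed flag required but flag set
    return True

clear = dict.fromkeys(FLAG_BITS, False)
zero = dict(clear, Z=True)       # result was zero
minus = dict(clear, M=True)      # result was negative

print(low_term_matches(0x008, zero))    # True  (beq)
print(low_term_matches(0x180, clear))   # True  (bgt on a positive result)
print(low_term_matches(0x180, minus))   # False
```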

The file "emu_constants.asm" contains macros with these branch-on-condition tests already synthesized.

The file "emu_constants.asm" contains the following macros:

Logical Operations
and    AND   R,src1,src2    R = src1 & src2
or     OR    R,src1,src2    R = src1 | src2
xor    XOR   R,src1,src2    R = src1 ^ src2
nand   NAND  R,src1,src2    R = (src1&src2)'
nor    NOR   R,src1,src2    R = (src1|src2)'
not    NOT   R,src1         R = src1'

Miscellaneous Operations
neg    NEG   R,src1         R = -src1 (==src1'+1)
move   MOVE  R,src1         R = src1
test   TEST  src1           null=src1 (sets CCR for skip)
cmp    CMP   src1,src2      null=src1-src2 (sets CCR for skip)