Instruction encoding

An instruction is encoded as a 32-bit word. There are four formats: A, B, C and D. The format is determined by the six most significant bits of the instruction word. When those six bits are all zero, another five bits (bits 2 to 6) distinguish format A from format B.

Format	# Instr.	Operands
A	124	Reg, Reg, Reg
B	256	Reg, Reg
C	47	Reg, Reg, 14-bit immediate
D	15	Reg, 21-bit immediate

      3             2               1
     |1| | | | | | |4| | | | | | | |6| | | | | | | |8| | | | | | | |0|
     +-----------+---------+---------+---+---------+---+-------------+
 A:  |0 0 0 0 0 0|REG1     |REG2     |VM |REG3     |F  |OP (7b)      |
     +-----------+---------+---------+-+-+---------+---+---------+---+
 B:  |0 0 0 0 0 0|REG1     |REG2     |V|FUNC (6b)  |F  |1 1 1 1 1|OP |
     +-----------+---------+---------+-+-+---------+---+---------+---+
 C:  |OP (6b)    |REG1     |REG2     |V|H|IMM (14b)                  |
     +---+-------+---------+---------+-+-+---------------------------+
 D:  |1 1|OP (4b)|REG1     |IMM (21b)                                |
     +---+-------+---------+-----------------------------------------+

The fields of the instruction word are interpreted as follows:

Field	Description
OP	Operation
FUNC	A 6-bit function identifier
REGn	Register (5 bit identifier)
IMM	Immediate value
H	High (1) or low (0) immediate value
VM	Vector mode (2-bit): 00: scalar <= op(scalar,scalar) 10: vector <= op(vector,scalar) 11: vector <= op(vector,vector) 01: vector <= op(vector,fold(vector))
V	Vector mode (1-bit): 0: scalar <= op(scalar[,scalar]) 1: vector <= op(vector[,scalar])
F	Flavor (see below)
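As an illustration of the layout above, the following Python sketch classifies an instruction word and splits it into fields. The bit positions are read off the diagram (bit 31 on the left, bit 0 on the right); the helper names `bits` and `decode` and the dictionary keys are hypothetical and chosen only for this example.

```python
def bits(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(word):
    """Classify a 32-bit instruction word and split it into named fields."""
    word &= 0xFFFFFFFF
    top6 = bits(word, 31, 26)
    if top6 == 0:
        common = {"reg1": bits(word, 25, 21), "reg2": bits(word, 20, 16),
                  "f": bits(word, 8, 7)}
        if bits(word, 6, 2) == 0b11111:
            # Format B: bits 2..6 are all ones.
            return {"format": "B", **common, "v": bits(word, 15, 15),
                    "func": bits(word, 14, 9), "op": bits(word, 1, 0)}
        # Format A: three register operands and a 7-bit operation.
        return {"format": "A", **common, "vm": bits(word, 15, 14),
                "reg3": bits(word, 13, 9), "op": bits(word, 6, 0)}
    if bits(word, 31, 30) == 0b11:
        # Format D: one register and a 21-bit immediate.
        return {"format": "D", "op": bits(word, 29, 26),
                "reg1": bits(word, 25, 21), "imm": bits(word, 20, 0)}
    # Format C: two registers and a 14-bit immediate.
    return {"format": "C", "op": top6, "reg1": bits(word, 25, 21),
            "reg2": bits(word, 20, 16), "v": bits(word, 15, 15),
            "h": bits(word, 14, 14), "imm": bits(word, 13, 0)}
```

The sketch only separates the fields; it does not check whether an operation code is actually assigned to an instruction.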

The immediate value for format C instructions is interpreted as follows:

If H=0, the value is the 32-bit sign-extended version of IMM.
If H=1, the value is IMM << 18, with the 18 lowest bits equal to the least significant bit of IMM.
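The two rules can be sketched in Python as follows. The function name `decode_imm_c` is hypothetical; the result is returned as an unsigned 32-bit value.

```python
MASK32 = 0xFFFFFFFF


def decode_imm_c(imm14, h):
    """Expand the 14-bit IMM field of a format C instruction to 32 bits."""
    imm14 &= 0x3FFF
    if h == 0:
        # Low immediate: sign-extend from bit 13.
        signed = imm14 - 0x4000 if imm14 & 0x2000 else imm14
        return signed & MASK32
    # High immediate: IMM occupies bits 31..18, and the 18 low bits
    # are all copies of the least significant bit of IMM.
    fill = 0x3FFFF if imm14 & 1 else 0
    return ((imm14 << 18) | fill) & MASK32
```

For example, `decode_imm_c(1, 1)` gives `0x0007FFFF` and `decode_imm_c(2, 1)` gives `0x00080000`.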

The interpretation of the F field depends on the instruction type:

For load/store instructions it is interpreted as an Index scale (a multiplication factor for the 3rd operand).
For the SEL instruction it is interpreted as an operand order modifier.
For all other instructions it is interpreted as a Packed mode descriptor.

F	Packed mode	Index scale	SEL mode
00	None (1 x 32 bits)	*1	c <= (a & c) | (b & ~c)
01	Byte (4 x 8 bits)	*2	c <= (b & c) | (a & ~c)
10	Half-word (2 x 16 bits)	*4	c <= (c & a) | (b & ~a)
11	(reserved)	*8	c <= (b & a) | (c & ~a)
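A minimal Python sketch of the Index scale and SEL columns of the table. The function names are hypothetical, and `a`, `b` and `c` follow the operand names used in the table, with `c` being both an input and the destination.

```python
MASK32 = 0xFFFFFFFF


def index_scale(f):
    """Index scale for load/store instructions: *1, *2, *4 or *8."""
    return 1 << (f & 0b11)


def sel(f, a, b, c):
    """Return the new value of c for the SEL variant selected by F."""
    a, b, c = a & MASK32, b & MASK32, c & MASK32
    not_a, not_c = ~a & MASK32, ~c & MASK32
    if f == 0b00:
        return (a & c) | (b & not_c)   # c is the mask, a picked where set
    if f == 0b01:
        return (b & c) | (a & not_c)   # c is the mask, b picked where set
    if f == 0b10:
        return (c & a) | (b & not_a)   # a is the mask, c picked where set
    return (b & a) | (c & not_a)       # a is the mask, b picked where set
```

In all four variants one operand acts as a bit mask that chooses, bit by bit, between the other two operands.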

Note that most C type instructions are immediate operand versions of corresponding A type instructions. For instance the A type instruction ADD reg1, reg2, reg3 has a corresponding C type instruction ADD reg1, reg2, #imm.

Instruction list

Legend

Name	Description
dst	Destination register
src1	Source operand 1
src2	Source operand 2
src3	Source operand 3
i21	21-bit immediate value
I	Supports immediate operand
V	Supports vector operation
P	Supports packed operation

Load/store instructions

Mnemonic	I	V	Operands	Operation	Description
LDB	x	(1)	dst, src1, src2	dst <= [src1 + src2] (byte)	Load signed byte
LDUB	x	(1)	dst, src1, src2	dst <= [src1 + src2] (byte)	Load unsigned byte
LDH	x	(1)	dst, src1, src2	dst <= [src1 + src2] (halfword)	Load signed halfword
LDUH	x	(1)	dst, src1, src2	dst <= [src1 + src2] (halfword)	Load unsigned halfword
LDW	x	(1)	dst, src1, src2	dst <= [src1 + src2] (word)	Load word
LDEA	x	(1)	dst, src1, src2	dst <= src1 + src2	Load effective address
STB	x	(1)	src1, src2, src3	[src2 + src3] <= src1 (byte)	Store byte
STH	x	(1)	src1, src2, src3	[src2 + src3] <= src1 (halfword)	Store halfword
STW	x	(1)	src1, src2, src3	[src2 + src3] <= src1 (word)	Store word
LDLI	x		dst, #i21	dst <= signextend(i21)	Alt. 1: Load immediate (low 21 bits)
LDHI	x		dst, #i21	dst <= i21 << 11	Alt. 2: Load immediate (high 21 bits)
LDHIO	x		dst, #i21	dst <= (i21 << 11) | 0x7ff	Alt. 3: Load immediate with low ones (high 21 bits)

See addressing modes for more details.

(1): The third operand in vector loads/stores is used as a stride or offset parameter.
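The three load-immediate alternatives (LDLI, LDHI, LDHIO) cover different classes of 32-bit constants. The following Python sketch, built only on the operations listed in the table, checks which of them (if any) can produce a given constant in a single instruction. The function name `pick_load_immediate` and the preference order are assumptions made for this example.

```python
def pick_load_immediate(value):
    """Return (mnemonic, i21) for a one-instruction load of value, or None."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -(1 << 20) <= signed < (1 << 20):
        # Fits in a sign-extended 21-bit immediate.
        return ("LDLI", signed & 0x1FFFFF)
    low = value & 0x7FF
    if low == 0:
        # The low 11 bits are zero: dst <= i21 << 11.
        return ("LDHI", value >> 11)
    if low == 0x7FF:
        # The low 11 bits are ones: dst <= (i21 << 11) | 0x7ff.
        return ("LDHIO", value >> 11)
    return None  # Needs more than one instruction.
```

For example, `0x12345800` maps to `("LDHI", 0x2468B)`, while `0x12345678` cannot be loaded with a single one of these instructions.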

Branch and jump instructions

Mnemonic	I	Operands	Operation	Description
J	x	src1, #i21	pc <= src1+signextend(i21)*4	Jump to register address. Note: src1 can be PC.
JL	x	src1, #i21	lr <= pc+4, pc <= src1+signextend(i21)*4	Jump to register address and link. Note: src1 can be PC.
BZ	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 == 0	Conditionally branch if equal to zero
BNZ	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 != 0	Conditionally branch if not equal to zero
BS	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 == 0xffffffff	Conditionally branch if set (all bits = 1)
BNS	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 != 0xffffffff	Conditionally branch if not set (at least one bit = 0)
BLT	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 < 0	Conditionally branch if less than zero
BGE	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 >= 0	Conditionally branch if greater than or equal to zero
BLE	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 <= 0	Conditionally branch if less than or equal to zero
BGT	x	src1, #i21	pc <= pc+signextend(i21)*4 if src1 > 0	Conditionally branch if greater than zero
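The branch conditions and the target calculation in the table can be sketched in Python as follows. The helper names are hypothetical, and register values are modeled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, nbits):
    """Interpret the low nbits of value as a two's complement number."""
    value &= (1 << nbits) - 1
    return value - (1 << nbits) if value & (1 << (nbits - 1)) else value


def branch_target(pc, i21):
    """pc + signextend(i21) * 4, wrapped to 32 bits."""
    return (pc + sign_extend(i21, 21) * 4) & MASK32


def branch_taken(mnemonic, src1):
    """Evaluate the condition of a conditional branch on src1."""
    s = sign_extend(src1, 32)
    conditions = {
        "BZ": s == 0, "BNZ": s != 0,
        "BS": (src1 & MASK32) == MASK32, "BNS": (src1 & MASK32) != MASK32,
        "BLT": s < 0, "BGE": s >= 0, "BLE": s <= 0, "BGT": s > 0,
    }
    return conditions[mnemonic]
```

Because the immediate is multiplied by 4, a 21-bit immediate reaches targets within roughly +/- 4 MiB of the PC.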

Special PC-addition

Mnemonic	I	V	P	Operands	Operation	Description
ADDPCHI	x			dst, #i21	dst <= pc + (i21 << 11)	Add high immediate to PC

Note: ADDPCHI can be used together with load/store instructions to perform 32-bit PC-relative addressing in just two instructions, or together with jump instructions to perform 32-bit PC-relative jumps in just two instructions.
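A sketch of how a 32-bit PC-relative offset could be split between the two instructions: the high 21 bits go to ADDPCHI and the remaining low 11 bits go to the immediate offset of the following instruction. This split is an assumption based on the `i21 << 11` operation above, and the function names are hypothetical. For a jump the immediate is multiplied by 4, so in that case the low part would have to be a multiple of 4 and be encoded divided by 4.

```python
MASK32 = 0xFFFFFFFF


def split_pc_relative(offset):
    """Split a 32-bit offset into (hi21, lo11) with (hi21 << 11) + lo11 == offset."""
    offset &= MASK32
    return offset >> 11, offset & 0x7FF


def addpchi(pc, i21):
    """dst <= pc + (i21 << 11), wrapped to 32 bits."""
    return (pc + ((i21 & 0x1FFFFF) << 11)) & MASK32


# Example: reach pc + 0x12345678 from pc = 0x1000.
hi21, lo11 = split_pc_relative(0x12345678)
base = addpchi(0x1000, hi21)          # first instruction (ADDPCHI)
address = (base + lo11) & MASK32      # offset applied by the second instruction
```

The low part is always in the range 0 to 0x7ff, so it is never affected by sign extension of the second instruction's immediate.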

Integer ALU instructions

Mnemonic	I	V	P	Operands	Operation	Description
CPUID		x		dst, src1, src2	dst <= cpuid(src1, src2)	Get CPU information based on src1, src2 (see CPUID)
OR	x	x		dst, src1, src2	dst <= src1 | src2	Bitwise or
NOR	x	x		dst, src1, src2	dst <= ~(src1 | src2)	Bitwise nor
AND	x	x		dst, src1, src2	dst <= src1 & src2	Bitwise and
BIC	x	x		dst, src1, src2	dst <= src1 & ~src2	Bitwise clear
XOR	x	x		dst, src1, src2	dst <= src1 ^ src2	Bitwise exclusive or
ADD	x	x	x	dst, src1, src2	dst <= src1 + src2	Addition
SUB	x	x	x	dst, src1, src2	dst <= src1 - src2	Subtraction (note: src1 can be an immediate value)
SEQ	x	x	x	dst, src1, src2	dst <= (src1 == src2) ? 0xffffffff : 0	Set if equal
SNE	x	x	x	dst, src1, src2	dst <= (src1 != src2) ? 0xffffffff : 0	Set if not equal
SLT	x	x	x	dst, src1, src2	dst <= (src1 < src2) ? 0xffffffff : 0	Set if less than (signed)
SLTU	x	x	x	dst, src1, src2	dst <= (src1 < src2) ? 0xffffffff : 0	Set if less than (unsigned)
SLE	x	x	x	dst, src1, src2	dst <= (src1 <= src2) ? 0xffffffff : 0	Set if less than or equal (signed)
SLEU	x	x	x	dst, src1, src2	dst <= (src1 <= src2) ? 0xffffffff : 0	Set if less than or equal (unsigned)
MIN	x	x	x	dst, src1, src2	dst <= min(src1, src2) (signed)	Minimum value
MAX	x	x	x	dst, src1, src2	dst <= max(src1, src2) (signed)	Maximum value
MINU	x	x	x	dst, src1, src2	dst <= min(src1, src2) (unsigned)	Minimum value
MAXU	x	x	x	dst, src1, src2	dst <= max(src1, src2) (unsigned)	Maximum value
ASR	x	x	x	dst, src1, src2	dst <= src1 >> src2 (signed)	Arithmetic shift right
LSL	x	x	x	dst, src1, src2	dst <= src1 << src2	Logical shift left
LSR	x	x	x	dst, src1, src2	dst <= src1 >> src2 (unsigned)	Logical shift right
SHUF	x	x		dst, src1, src2	dst <= shuffle(src1, src2)	Shuffle bytes according to the shuffle descriptor in src2 (see SHUF)
SEL	x	x		dst, src1, src2	dst <= (src1 & dst) | (src2 & ~dst)	Bitwise select (there are three other variants for handling different operand orders)
CLZ		x	x	dst, src1	dst <= clz(src1)	Count leading zeros
POPCNT		x	x	dst, src1	dst <= popcnt(src1)	Count number of bits set to 1 in src1
REV		x	x	dst, src1	dst <= rev(src1)	Reverse bit order
PACK		x	x	dst, src1, src2	dst <= ((src1 & 0x0000ffff) << 16) | (src2 & 0x0000ffff)	Pack two half-words into a word. Use PACK.H to pack four bytes into a word.
PACKS		x	x	dst, src1, src2	dst <= (saturate(src1) << 16) | saturate(src2)	Pack two half-words into a word, with signed saturation. Use PACKS.H to pack four bytes into a word.
PACKSU		x	x	dst, src1, src2	dst <= (saturate(src1) << 16) | saturate(src2)	Pack two half-words into a word, with unsigned saturation. Use PACKSU.H to pack four bytes into a word.
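To illustrate the difference between PACK and PACKS, here is a Python sketch of the unsuffixed (32-bit) forms. It assumes that the signed `saturate()` clamps each signed 32-bit source to the signed 16-bit range; the table does not spell this out, and the function names are hypothetical.

```python
def to_signed32(x):
    """Interpret a 32-bit value as a two's complement number."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def pack(src1, src2):
    """PACK: keep the low half-word of each source (no saturation)."""
    return ((src1 & 0xFFFF) << 16) | (src2 & 0xFFFF)


def saturate_s16(x):
    """Assumed signed saturation: clamp to [-0x8000, 0x7fff]."""
    return max(-0x8000, min(0x7FFF, to_signed32(x))) & 0xFFFF


def packs(src1, src2):
    """PACKS: saturate each source to a half-word, then pack."""
    return (saturate_s16(src1) << 16) | saturate_s16(src2)
```

With `src1 = 0x00012345` and `src2 = 0xfffffffe`, `pack` gives `0x2345fffe` (the upper bits of `src1` are dropped), while `packs` gives `0x7ffffffe` (the out-of-range value is clamped).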

Saturating and halving arithmetic instructions

Mnemonic	V	P	Operands	Operation	Description
ADDS	x	x	dst, src1, src2	dst <= saturate(src1 + src2)	Saturating addition (signed)
ADDSU	x	x	dst, src1, src2	dst <= saturate(src1 + src2)	Saturating addition (unsigned)
ADDH	x	x	dst, src1, src2	dst <= (src1 + src2) / 2	Halving addition (signed)
ADDHU	x	x	dst, src1, src2	dst <= (src1 + src2) / 2	Halving addition (unsigned)
SUBS	x	x	dst, src1, src2	dst <= saturate(src1 - src2)	Saturating subtraction (signed)
SUBSU	x	x	dst, src1, src2	dst <= saturate(src1 - src2)	Saturating subtraction (unsigned)
SUBH	x	x	dst, src1, src2	dst <= (src1 - src2) / 2	Halving subtraction (signed)
SUBHU	x	x	dst, src1, src2	dst <= (src1 - src2) / 2	Halving subtraction (unsigned)
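The following Python sketch shows the intent of a few of these operations for unpacked 32-bit operands. Two things are assumed here and not stated in the table: the intermediate sum or difference is computed without overflow, and the signed halving rounds toward negative infinity (an arithmetic shift right). The function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def to_signed32(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def adds(a, b):
    """Saturating signed addition."""
    s = to_signed32(a) + to_signed32(b)
    return max(INT32_MIN, min(INT32_MAX, s)) & MASK32


def addsu(a, b):
    """Saturating unsigned addition."""
    return min((a & MASK32) + (b & MASK32), MASK32)


def subsu(a, b):
    """Saturating unsigned subtraction."""
    return max((a & MASK32) - (b & MASK32), 0)


def addh(a, b):
    """Halving signed addition (assumed to round toward negative infinity)."""
    return ((to_signed32(a) + to_signed32(b)) >> 1) & MASK32


def addhu(a, b):
    """Halving unsigned addition."""
    return ((a & MASK32) + (b & MASK32)) >> 1
```

Unlike a plain ADD followed by a shift, the halving forms cannot lose the carry out of bit 31: `addhu(0xffffffff, 0xffffffff)` is `0xffffffff`.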

Multiply and divide instructions

Mnemonic	V	P	Operands	Operation	Description
MULQ	x	x	dst, src1, src2	dst <= (src1 * src2) >> 31	Fixed point multiplication (signed Q31 format, or Q15/Q7 for packed operations)
MUL	x	x	dst, src1, src2	dst <= src1 * src2	Multiplication (signed or unsigned, low 32 bits)
MULHI	x	x	dst, src1, src2	dst <= (src1 * src2) >> 32	Multiplication (signed, high 32 bits)
MULHIU	x	x	dst, src1, src2	dst <= (src1 * src2) >> 32	Multiplication (unsigned, high 32 bits)
DIV	x	x	dst, src1, src2	dst <= src1 / src2	Division (signed, integer part)
DIVU	x	x	dst, src1, src2	dst <= src1 / src2	Division (unsigned, integer part)
REM	x	x	dst, src1, src2	dst <= src1 % src2	Remainder (signed)
REMU	x	x	dst, src1, src2	dst <= src1 % src2	Remainder (unsigned)
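A Python sketch of the multiplication variants for unpacked 32-bit operands, using arbitrary-precision integers for the full 64-bit product. The function names mirror the mnemonics. How MULQ treats the one product that does not fit in Q31 (-1.0 * -1.0) is not specified in the table, so the sketch simply truncates to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(a, b):
    """Low 32 bits of the product (same for signed and unsigned)."""
    return (a * b) & MASK32


def mulhi(a, b):
    """High 32 bits of the signed 64-bit product."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32


def mulhiu(a, b):
    """High 32 bits of the unsigned 64-bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def mulq(a, b):
    """Signed Q31 fixed-point product: (a * b) >> 31 (overflow truncated)."""
    return ((to_signed32(a) * to_signed32(b)) >> 31) & MASK32
```

In Q31 format `0x40000000` represents 0.5, so `mulq(0x40000000, 0x40000000)` gives `0x20000000`, which represents 0.25.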

Floating-point instructions

Mnemonic	V	P	Operands	Operation	Description
FMIN	x	x	dst, src1, src2	dst <= min(src1, src2)	Floating-point minimum value
FMAX	x	x	dst, src1, src2	dst <= max(src1, src2)	Floating-point maximum value
FSEQ	x	x	dst, src1, src2	dst <= (src1 == src2) ? 0xffffffff : 0	Set if equal (floating-point)
FSNE	x	x	dst, src1, src2	dst <= (src1 != src2) ? 0xffffffff : 0	Set if not equal (floating-point)
FSLT	x	x	dst, src1, src2	dst <= (src1 < src2) ? 0xffffffff : 0	Set if less than (floating-point)
FSLE	x	x	dst, src1, src2	dst <= (src1 <= src2) ? 0xffffffff : 0	Set if less than or equal (floating-point)
FSUNORD	x	x	dst, src1, src2	dst <= (isNaN(src1) || isNaN(src2)) ? 0xffffffff : 0	Set if unordered (NaN)
FSORD	x	x	dst, src1, src2	dst <= (!isNaN(src1) && !isNaN(src2)) ? 0xffffffff : 0	Set if ordered
ITOF	x	x	dst, src1, src2	dst <= ((float)src1) * 2^-src2	Cast signed integer to float with exponent offset
UTOF	x	x	dst, src1, src2	dst <= ((float)src1) * 2^-src2	Cast unsigned integer to float with exponent offset
FTOI	x	x	dst, src1, src2	dst <= (int)(src1 * 2^src2)	Cast float to signed integer with exponent offset
FTOU	x	x	dst, src1, src2	dst <= (unsigned)(src1 * 2^src2)	Cast float to unsigned integer with exponent offset
FTOIR	x	x	dst, src1, src2	dst <= (int)round(src1 * 2^src2)	Round float to signed integer with exponent offset
FTOUR	x	x	dst, src1, src2	dst <= (unsigned)round(src1 * 2^src2)	Round float to unsigned integer with exponent offset
FPACK	x	x	dst, src1, src2	dst[1] <= (float16)src1, dst[0] <= (float16)src2	Reduce precision and pack two floating-point values into a word. Use FPACK.H to pack four floating-point values.
FADD	x	x	dst, src1, src2	dst <= src1 + src2	Floating-point addition
FSUB	x	x	dst, src1, src2	dst <= src1 - src2	Floating-point subtraction
FMUL	x	x	dst, src1, src2	dst <= src1 * src2	Floating-point multiplication
FDIV	x	x	dst, src1, src2	dst <= src1 / src2	Floating-point division
FUNPL	x	x	dst, src1	dst <= (float)src1[0]	Unpack low half-precision floating-point value. Use FUNPL.H to unpack two quarter-precision floating-point values.
FUNPH	x	x	dst, src1	dst <= (float)src1[1]	Unpack high half-precision floating-point value. Use FUNPH.H to unpack two quarter-precision floating-point values.
FSQRT	x	x	dst, src1	dst <= sqrt(src1)	Floating-point square root
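The exponent offset in ITOF and FTOI makes them usable as fixed-point conversions: with src2 = 16, a 16.16 fixed-point integer converts directly to and from its floating-point value. The Python sketch below models this for in-range values only. It assumes that the `(int)` cast truncates toward zero, as in C, and it rounds results to single precision with the standard `struct` module; the function names are hypothetical.

```python
import math
import struct

MASK32 = 0xFFFFFFFF


def to_signed32(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def to_float32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def itof(src1, src2):
    """dst <= ((float)src1) * 2^-src2, with src1 a signed integer."""
    return to_float32(math.ldexp(float(to_signed32(src1)), -src2))


def ftoi(src1, src2):
    """dst <= (int)(src1 * 2^src2), assuming truncation toward zero."""
    return int(math.ldexp(src1, src2)) & MASK32
```

For example, `itof(0x00018000, 16)` is `1.5`, and `ftoi(1.5, 16)` is `0x00018000`.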

Vector instructions

Most instructions (excluding branch instructions) can be executed in both scalar and vector mode.

For instance the integer instruction ADD has the following operation modes:

ADD Sd,Sa,Sb - scalar <= scalar + scalar
ADD Sd,Sa,#IMM - scalar <= scalar + scalar
ADD Vd,Va,Sb - vector <= vector + scalar
ADD Vd,Va,#IMM - vector <= vector + scalar
ADD Vd,Va,Vb - vector <= vector + vector

See Vector Design for more information.
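The operation modes listed for ADD can be modeled in Python as follows, with a vector register represented as a list of 32-bit elements. The function names and the vector length used in the example are purely illustrative.

```python
MASK32 = 0xFFFFFFFF


def add_ss(sa, sb):
    """scalar <= scalar + scalar (also covers an immediate second operand)."""
    return (sa + sb) & MASK32


def add_vs(va, sb):
    """vector <= vector + scalar: the scalar is applied to every element."""
    return [(x + sb) & MASK32 for x in va]


def add_vv(va, vb):
    """vector <= vector + vector: element-wise addition."""
    return [(x + y) & MASK32 for x, y in zip(va, vb)]


va = [1, 2, 3, 0xFFFFFFFF]
vs_result = add_vs(va, 1)                  # [2, 3, 4, 0]
vv_result = add_vv(va, [10, 20, 30, 1])    # [11, 22, 33, 0]
```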

Packed operation

Many instructions support packed operation. Suffix the instruction with .B for packed byte operation (each word is treated as four bytes, and four operations are performed in parallel), or .H for packed half-word operation (each word is treated as two half-words, and two operations are performed in parallel).

For instance, the following packed operations are possible for the MAX instruction:

MAX.B Sd,Sa,Sb - 4x packed byte MAX, scalar <= max(scalar, scalar)
MAX.B Vd,Va,Sb - 4x packed byte MAX, vector <= max(vector, scalar)
MAX.B Vd,Va,Vb - 4x packed byte MAX, vector <= max(vector, vector)
MAX.H Sd,Sa,Sb - 2x packed half-word MAX, scalar <= max(scalar, scalar)
MAX.H Vd,Va,Sb - 2x packed half-word MAX, vector <= max(vector, scalar)
MAX.H Vd,Va,Vb - 2x packed half-word MAX, vector <= max(vector, vector)

Note that immediate operands are not supported for packed operations.
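As a sketch of what a packed operation does to a single word, the following Python function models MAX on one 32-bit element for a given lane width: 8 bits for MAX.B, 16 bits for MAX.H, and 32 bits for the unsuffixed form. MAX is a signed comparison according to the instruction list; the function name `packed_max` is hypothetical.

```python
def packed_max(a, b, lane_bits):
    """Signed maximum applied independently to each lane of a 32-bit word."""
    mask = (1 << lane_bits) - 1
    sign = 1 << (lane_bits - 1)
    result = 0
    for shift in range(0, 32, lane_bits):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        # Compare the lanes as two's complement numbers.
        sx = x - (1 << lane_bits) if x & sign else x
        sy = y - (1 << lane_bits) if y & sign else y
        result |= (x if sx >= sy else y) << shift
    return result
```

For example, `packed_max(0x01807f00, 0x02ff7e01, 8)` gives `0x02ff7f01`: in the second byte, `0xff` (-1) is greater than `0x80` (-128) when compared as signed bytes.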

See Packed Operations for more information.

Planned instructions

Move scalar registers to/from vector register elements.
More floating-point instructions (round, ...?).
Interrupt / supervisor mode instructions (move to/from user registers, return from interrupt, ...).
Control instructions/registers (cache control, interrupt masks, status flags, ...).
Load Linked (ll) and Store Conditional (sc) for atomic operations.
