Line 4: |
Line 4: |
| A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU at around the same time, but through separate [[GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. In that case, the operand descriptor ID is the offset, in words, of the descriptor within the table. | | A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU at around the same time, but through separate [[GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. In that case, the operand descriptor ID is the offset, in words, of the descriptor within the table. |
| Both instructions and descriptors are encoded in little-endian byte order. | | Both instructions and descriptors are encoded in little-endian byte order. |
− | Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro] | + | Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro]. |
| + | The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx]. |
| Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes. | | Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes. |
| | | |
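The descriptor lookup described in the introduction can be sketched as follows. This is a minimal illustration, not code taken from aemstro or nihstro: the helper names <code>decode_words</code> and <code>lookup_descriptor</code> are hypothetical, and the sketch assumes that a word is 32 bits wide and that each descriptor occupies one word, neither of which is stated above.

```python
import struct

WORD_SIZE = 4  # assumption: one word is 32 bits


def decode_words(blob):
    """Split a raw byte blob into little-endian unsigned words."""
    if len(blob) % WORD_SIZE != 0:
        raise ValueError("blob length is not a multiple of the word size")
    count = len(blob) // WORD_SIZE
    # "<" selects little-endian byte order, "I" is an unsigned 32-bit integer.
    return list(struct.unpack("<%dI" % count, blob))


def lookup_descriptor(table_words, descriptor_id):
    """Return the descriptor word stored at the given word offset.

    The descriptor ID is used directly as an index, because the page
    describes it as the offset, in words, within the descriptor table.
    """
    return table_words[descriptor_id]
```

For instance, a table made of the bytes <code>78 56 34 12 01 00 00 00</code> decodes to two words, and descriptor ID 1 selects the second one.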
Line 354: |
Line 355: |
| | 0x12 | | | 0x12 |
| | 1u | | | 1u |
− | | ARL | + | | MOVA |
− | | Address Register Load; sets (a0, a1, _, _) to SRC1 (cast to integer). | + | | Address Register Load; sets (a0.x, a0.y, _, _) to SRC1 (cast to integer). |
| |- | | |- |
| | 0x13 | | | 0x13 |
Line 470: |
Line 471: |
| | 3 | | | 3 |
| | FORLOOP | | | FORLOOP |
− | | Loops over the code between itself and DST. First sets lcnt to INT.y, then increments lcnt by INT.z after each loop. Loops until lcnt reaches INT.y+INT.x, inclusive (that is : for(aL=INT.y;aL<=INT.y+INT.x;aL+=INT.z)). (INT is i0-i3, an integer vector uniform) | + | | Loops over the code between itself and DST. aL is first set to INT.y, then incremented by INT.z after each iteration. The loop runs until aL reaches INT.y+INT.x, inclusive (that is: for(aL=INT.y;aL<=INT.y+INT.x;aL+=INT.z)). (INT is one of i0-i3, an integer vector uniform) |
| |- | | |- |
| | 0x2A | | | 0x2A |
Line 592: |
Line 593: |
| == Relative addressing == | | == Relative addressing == |
| | | |
− | There are 3 global address registers : a0, a1 and a2 = lcnt (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. | + | There are 3 global address registers : a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. |
| | | |
− | For example, if IDX = 2, a1 = 3 and SRC1 = c8, then instead SRC1+a1 = c11 will be used for the instruction. | + | For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. |
| | | |
− | a0 and a1 can be set manually through the ARL instruction. lcnt is set automatically by the LOOP instruction. Note that lcnt is still accessible and valid after exiting a LOOP block. | + | a0.x and a0.y can be set manually through the MOVA instruction. aL is set automatically by the FORLOOP instruction. Note that aL remains accessible and valid after exiting a FORLOOP block. |
| | | |
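The relative addressing rule and the FORLOOP counter can be modelled with a short sketch. It follows the formulas exactly as written on this page, which is itself reverse-engineered and may be wrong. The function names are hypothetical, registers are represented by their plain index (c8 becomes 8), and the mapping IDX = 1 to a0.x and IDX = 3 to aL is an assumption inferred from the order in which the registers are listed; only IDX = 2 selecting a0.y is given by the example above.

```python
def resolve_src1(src1_index, idx, a0_x, a0_y, a_l):
    """Apply relative addressing to SRC1 for a format 1 instruction."""
    if idx == 0:
        # IDX == 0 means no relative addressing.
        return src1_index
    # Assumed mapping; only IDX == 2 -> a0.y is confirmed by the page.
    address_registers = {1: a0_x, 2: a0_y, 3: a_l}
    return src1_index + address_registers[idx]


def forloop_counter_values(int_x, int_y, int_z):
    """List the values aL takes, per the documented for-loop:

    for (aL = INT.y; aL <= INT.y + INT.x; aL += INT.z)
    """
    if int_z <= 0:
        # Guard for this sketch only; hardware behaviour is not documented here.
        raise ValueError("INT.z must be positive in this model")
    values = []
    a_l = int_y
    while a_l <= int_y + int_x:
        values.append(a_l)
        a_l += int_z
    return values
```

With IDX = 2, a0.y = 3 and SRC1 = c8, <code>resolve_src1(8, 2, 0, 3, 0)</code> gives 11, matching the c11 example above.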
| == Comparison operator == | | == Comparison operator == |