https://www.3dbrew.org/w/api.php?action=feedcontributions&user=Oreo639&feedformat=atom
3dbrew - User contributions [en]
2024-03-29T10:50:39Z
User contributions
MediaWiki 1.35.8
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=22377
GPU/Shader Instruction Set
2023-10-02T09:38:34Z
<p>Oreo639: Clarify LITP instruction</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary comprises two parts: the main instruction sequence and the operand descriptor table. These are both sent to the GPU around the same time but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. When such is the case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
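<br />
As a minimal sketch of the format 1 layout above, the fields can be extracted from a 32-bit instruction word with shifts and masks. The function name <code>decode_format1</code> and the dictionary keys are illustrative only and do not come from any existing tool.<br />

```python
def decode_format1(word):
    """Split a 32-bit format 1 instruction word into its fields.

    Offsets and sizes follow the format 1 table; the key names are
    hypothetical labels chosen for this sketch.
    """
    return {
        "desc":   word & 0x7F,            # offset 0x0,  7 bits
        "src2":   (word >> 0x7) & 0x1F,   # offset 0x7,  5 bits
        "src1":   (word >> 0xC) & 0x7F,   # offset 0xC,  7 bits
        "idx_1":  (word >> 0x13) & 0x3,   # offset 0x13, 2 bits
        "dst":    (word >> 0x15) & 0x1F,  # offset 0x15, 5 bits
        "opcode": (word >> 0x1A) & 0x3F,  # offset 0x1A, 6 bits
    }
```

Format 1i differs only in that SRC2 takes 7 bits at offset 0x7 and SRC1 takes 5 bits at offset 0xE.<br />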
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
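<br />
The format 2 layout can be decoded the same way. This is only a sketch of the table above; <code>decode_format2</code> and its key names are hypothetical.<br />

```python
def decode_format2(word):
    """Split a 32-bit format 2 (flow control) instruction word into fields.

    Bits 8-9 are not described by the table and are ignored here.
    """
    return {
        "num":    word & 0xFF,            # offset 0x0,  8 bits
        "dst":    (word >> 0xA) & 0xFFF,  # offset 0xA,  12 bits
        "condop": (word >> 0x16) & 0x3,   # offset 0x16, 2 bits
        "refy":   (word >> 0x18) & 0x1,   # offset 0x18, 1 bit
        "refx":   (word >> 0x19) & 0x1,   # offset 0x19, 1 bit
        "opcode": (word >> 0x1A) & 0x3F,  # offset 0x1A, 6 bits
    }
```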
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes SRC1's first component exponent with base 2; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes SRC1's first component logarithm with base 2; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Partial lighting computation, may be used in conjunction with EX2, LG2, etc to compute the vertex lighting coefficients. See the [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx Microsoft] and [https://registry.khronos.org/OpenGL/extensions/ARB/ARB_vertex_program.txt ARB] docs for more information on how to implement the full lit function; DST = {max(src.x, 0), max(min(src.y, 127.9961), -127.9961), 0, max(src.w, 0)} and it sets the cmp.x and cmp.y flags based on if the respective src.x and src.w components are >= 0.<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (this is still a working hypothesis and unconfirmed)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
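<br />
A few of the table's formulas can be written out as plain Python to make the component indexing explicit. This is a sketch only: it uses ordinary Python floats rather than the GPU's own floating-point behavior, ignores swizzling and destination masking, and all function names are hypothetical.<br />

```python
def dp3(a, b):
    # DP3: dot product over the first three components.
    return sum(a[i] * b[i] for i in range(3))

def dp4(a, b):
    # DP4: dot product over all four components.
    return sum(a[i] * b[i] for i in range(4))

def dph(a, b):
    # DPH: SRC1 is treated as (x, y, z, 1.0) before a 4-component dot product.
    return dp4([a[0], a[1], a[2], 1.0], b)

def dst_op(a, b):
    # DST: {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}
    return [1.0, a[1] * b[1], a[2], b[3]]

def litp(src):
    # LITP: clamped vector plus the two flags it sets (cmp.x, cmp.y).
    vec = [max(src[0], 0.0),
           max(min(src[1], 127.9961), -127.9961),
           0.0,
           max(src[3], 0.0)]
    flags = (src[0] >= 0, src[3] >= 0)
    return vec, flags
```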
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
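<br />
The descriptor and selector layouts above can be sketched as follows. The function and key names are hypothetical; only the bit offsets and the value-to-component mapping are taken from the tables.<br />

```python
COMPONENTS = "xyzw"  # selector value 0..3 -> component name

def swizzle_string(selector):
    """Turn an 8-bit component selector into a string such as 'xyzw'.

    Component 0 lives in the top two bits, component 3 in the bottom two.
    """
    return "".join(COMPONENTS[(selector >> shift) & 0x3] for shift in (6, 4, 2, 0))

def decode_descriptor(desc):
    """Split an operand descriptor word into its fields."""
    return {
        "dst_mask": desc & 0xF,            # offset 0x0,  4 bits
        "src1_neg": (desc >> 0x4) & 0x1,   # offset 0x4,  1 bit
        "src1_sel": (desc >> 0x5) & 0xFF,  # offset 0x5,  8 bits
        "src2_neg": (desc >> 0xD) & 0x1,   # offset 0xD,  1 bit
        "src2_sel": (desc >> 0xE) & 0xFF,  # offset 0xE,  8 bits
        "src3_neg": (desc >> 0x16) & 0x1,  # offset 0x16, 1 bit
        "src3_sel": (desc >> 0x17) & 0xFF, # offset 0x17, 8 bits
    }
```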
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
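<br />
Destination masking can be pictured as a per-component select between the old register contents and the newly computed result. The sketch below assumes the mask bit order given in the operand descriptor table (bit 3 = x, bit 0 = w); <code>masked_write</code> is a hypothetical helper.<br />

```python
def masked_write(dst, result, mask):
    """Return the new destination value after applying a 4-bit write mask.

    Components whose mask bit is clear keep their previous value.
    """
    # 8 >> i selects bit 3 for x (i=0) down to bit 0 for w (i=3).
    return [result[i] if mask & (8 >> i) else dst[i] for i in range(4)]
```

For example, a mask of 0b1100 corresponds to a destination written as o0.xy.<br />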
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. It is only possible to use address registers on constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction by rounding a float value to integer precision. Hence, they may take negative values. The way out-of-bounds values behave when reading uniforms is as follows:<br />
* If the offset is out of byte bounds (less than -128 or greater than 127), the offset is not applied (treated as 0).<br />
* The offset is added to the constant register index and masked by 0x7F.<br />
* If the resulting index is greater than 95, the result is (1, 1, 1, 1).<br />
* Otherwise, the result is the value at the indexed constant register.<br />
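<br />
The out-of-bounds rules above can be sketched as a small lookup function. It assumes that the "constant register index" is the c-register number (0 for c0) and that <code>constants</code> is a list of 96 four-component tuples; both are assumptions made for illustration, as is the function name.<br />

```python
def read_constant(constants, base, offset):
    """Read constant register c[base] with a relative offset applied.

    constants: list of 96 4-tuples (hypothetical model of c0-c95).
    base:      index of the constant register named in the instruction.
    offset:    value of the selected address register.
    """
    # Offsets outside the signed byte range are not applied.
    if offset < -128 or offset > 127:
        offset = 0
    # The sum is masked to 7 bits.
    index = (base + offset) & 0x7F
    # Indices past c95 read as all ones.
    if index > 95:
        return (1.0, 1.0, 1.0, 1.0)
    return constants[index]
```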
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
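<br />
Following the LOOP description in the instruction table (INT.x+1 iterations, aL starting at INT.y and stepping by INT.z), the sequence of aL values seen by the loop body can be sketched like this. The function name is hypothetical, and any wrap-around of aL on hardware is not modelled.<br />

```python
def loop_al_values(int_x, int_y, int_z):
    """List the value of aL during each iteration of a LOOP block."""
    values = []
    al = int_y                   # aL is initialized to INT.y
    for _ in range(int_x + 1):   # INT.x + 1 iterations in total
        values.append(al)
        al += int_z              # incremented by INT.z after each iteration
    return values
```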
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
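<br />
The operator table can be sketched as a lookup, with values 6 and 7 returning true as observed above. The helper names are hypothetical, and ordinary Python float comparison stands in for the hardware's behavior.<br />

```python
import operator

# Raw CMPX/CMPY value -> comparison function, per the table above.
CMP_OPS = {
    0x0: operator.eq,
    0x1: operator.ne,
    0x2: operator.lt,
    0x3: operator.le,
    0x4: operator.gt,
    0x5: operator.ge,
}

def compare(op, src1, src2):
    """Apply one comparison operator to a pair of scalar values."""
    if op in (0x6, 0x7):
        return True  # observed to always return true
    return CMP_OPS[op](src1, src2)

def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) for two 4-component source vectors."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])
```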
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
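<br />
The four CONDOP expressions can be sketched as a single function. The name <code>evaluate_condition</code> is hypothetical; cmp.x and cmp.y are passed as Python booleans and the reference bits as 0 or 1.<br />

```python
def evaluate_condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate a format 2 condition from CONDOP, REFX, REFY and the cmp flags."""
    x_match = cmp_x == bool(refx)
    y_match = cmp_y == bool(refy)
    if condop == 0x0:
        return x_match or y_match    # OR
    if condop == 0x1:
        return x_match and y_match   # AND
    if condop == 0x2:
        return x_match               # X only
    if condop == 0x3:
        return y_match               # Y only
    raise ValueError("CONDOP is a 2-bit field")
```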
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to the same output register/component more than once appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| o0-o15<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| v0-v15<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
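<br />
The two mapping tables translate directly into a pair of naming helpers, which can be handy when writing a disassembler. The function names are hypothetical; the ranges are those of the DST and SRC mapping tables.<br />

```python
def dst_name(raw):
    """Map a 5-bit raw DST value to a register name."""
    if not 0x0 <= raw <= 0x1F:
        raise ValueError("DST is a 5-bit field")
    if raw < 0x10:
        return f"o{raw}"             # 0x0-0xF   -> o0-o15
    return f"r{raw - 0x10}"          # 0x10-0x1F -> r0-r15

def src_name(raw):
    """Map a raw SRC value (up to 7 bits) to a register name."""
    if not 0x0 <= raw <= 0x7F:
        raise ValueError("SRC is at most a 7-bit field")
    if raw < 0x10:
        return f"v{raw}"             # 0x0-0xF   -> v0-v15
    if raw < 0x20:
        return f"r{raw - 0x10}"      # 0x10-0x1F -> r0-r15
    return f"c{raw - 0x20}"          # 0x20-0x7F -> c0-c95
```

A 5-bit source field can only reach 0x1F, which is why such operands cannot name constant registers.<br />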
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
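<br />
The observation above can be reproduced with a small model of the 24-bit format. This sketch assumes an IEEE-style layout (sign in the top bit, then the exponent, then the mantissa), an exponent bias of 63, and an all-ones exponent for infinity and NaN; none of these details are stated on this page, and the names are hypothetical.<br />

```python
import math

EXPONENT_BITS = 7
MANTISSA_BITS = 16
BIAS = (1 << (EXPONENT_BITS - 1)) - 1  # 63, assumed by analogy with IEEE formats

def decode_float24(bits):
    """Decode a 24-bit value (1 sign, 7 exponent, 16 mantissa bits) to a Python float."""
    sign = -1.0 if (bits >> 23) & 0x1 else 1.0
    exponent = (bits >> MANTISSA_BITS) & 0x7F
    mantissa = bits & 0xFFFF
    if exponent == 0:
        return 0.0  # zero, or a subnormal flushed to +0
    if exponent == 0x7F:
        return sign * math.inf if mantissa == 0 else math.nan
    return sign * (1.0 + mantissa / (1 << MANTISSA_BITS)) * 2.0 ** (exponent - BIAS)

def halvings_until_zero():
    """Count how many times 1.0 can be halved before the result is flushed to zero."""
    exponent = BIAS      # biased exponent of 1.0
    count = 0
    while exponent > 0:  # biased exponent 0 would be subnormal, hence flushed
        exponent -= 1
        count += 1
    return count
```

Under these assumptions <code>halvings_until_zero()</code> returns 63, which matches the hardware test.<br />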
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC ''after'' the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21974
GPU/Shader Instruction Set
2022-10-29T19:41:02Z
<p>Oreo639: </p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary comprises two parts: the main instruction sequence and the operand descriptor table. These are both sent to the GPU around the same time but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. When such is the case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
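Formats 5 and 5i only leave 3 bits for the opcode, so a decoder would presumably look at the top 3 bits first. The sketch below assumes, based on the opcode ranges listed in the Instructions section (0x38-0x3F for MAD, 0x30-0x37 for MADI), that the top bits 0b111 select format 5 and 0b110 select format 5i. The function names are hypothetical.<br />

```python
def bits(word, offset, size):
    """Return `size` bits of `word`, starting at bit `offset` (bit 0 = LSB)."""
    return (word >> offset) & ((1 << size) - 1)


def decode_mad(word):
    """Decode a MAD/MADI word, or return None for any other instruction."""
    top = bits(word, 0x1D, 3)
    common = {
        "desc": bits(word, 0x00, 5),
        "src1": bits(word, 0x11, 5),
        "dst": bits(word, 0x18, 5),
    }
    if top == 0b111:  # 6-bit opcodes 0x38-0x3F: MAD (format 5)
        return dict(common, name="MAD",
                    src3=bits(word, 0x05, 5),
                    src2=bits(word, 0x0A, 7),
                    idx_2=bits(word, 0x16, 2))
    if top == 0b110:  # 6-bit opcodes 0x30-0x37: MADI (format 5i)
        return dict(common, name="MADI",
                    src3=bits(word, 0x05, 7),
                    src2=bits(word, 0x0C, 5),
                    idx_3=bits(word, 0x16, 2))
    return None


# Hand-assembled example words (hypothetical)
mad_word = (0b111 << 0x1D) | (0x10 << 0x18) | (0x11 << 0x11) | (0x20 << 0x0A) | (0x12 << 0x05) | 1
madi_word = (0b110 << 0x1D) | (0x10 << 0x18) | (1 << 0x16) | (0x11 << 0x11) | (0x12 << 0x0C) | (0x20 << 0x05) | 1
```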
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes SRC1's first component exponent with base 2; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes SRC1's first component logarithm with base 2; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i].SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (this is still a working hypothesis and unconfirmed)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
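The arithmetic in the table above can be sketched in Python for a handful of instructions. This is only a reference model under simplifying assumptions: vectors are 4-element lists after swizzling, destination masking is ignored, and ordinary Python floats are used, so none of the PICA200-specific floating-point behavior is reproduced. All function names are hypothetical.<br />

```python
def dp3(a, b):
    """DP3: dot product over the first three components."""
    return sum(a[i] * b[i] for i in range(3))


def dp4(a, b):
    """DP4: dot product over all four components."""
    return sum(a[i] * b[i] for i in range(4))


def dph(a, b):
    """DPH: SRC1 is treated as homogeneous, i.e. its w is replaced by 1.0."""
    return dp4([a[0], a[1], a[2], 1.0], b)


def dst_op(a, b):
    """DST: {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}."""
    return [1.0, a[1] * b[1], a[2], b[3]]


def sge(a, b):
    """SGE: per-component 1.0 if SRC1 >= SRC2, else 0.0."""
    return [1.0 if a[i] >= b[i] else 0.0 for i in range(4)]


def mad(src1, src2, src3):
    """MAD: SRC3 + SRC2*SRC1 per component (hardware rounding not modelled)."""
    return [src3[i] + src2[i] * src1[i] for i in range(4)]
```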
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
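To make the descriptor layout concrete, the following sketch decodes an operand descriptor into a destination mask and three swizzles, following the bit positions in the tables above. The function names and the example descriptor value are hypothetical.<br />

```python
COMPONENTS = "xyzw"


def decode_selector(sel):
    """8-bit selector: component 0 sits in bits 6-7, component 3 in bits 0-1."""
    return "".join(COMPONENTS[(sel >> shift) & 3] for shift in (6, 4, 2, 0))


def decode_mask(mask):
    """4-bit destination mask: bit 3 = x, bit 2 = y, bit 1 = z, bit 0 = w."""
    return "".join(c for i, c in enumerate(COMPONENTS) if mask & (8 >> i))


def decode_descriptor(desc):
    """Split an operand descriptor word into mask, negation bits and swizzles."""
    return {
        "dst_mask": decode_mask(desc & 0xF),
        "src1_neg": bool((desc >> 0x04) & 1),
        "src1_sel": decode_selector((desc >> 0x05) & 0xFF),
        "src2_neg": bool((desc >> 0x0D) & 1),
        "src2_sel": decode_selector((desc >> 0x0E) & 0xFF),
        "src3_neg": bool((desc >> 0x16) & 1),
        "src3_sel": decode_selector((desc >> 0x17) & 0xFF),
    }


# Hypothetical descriptor: dst.xy, src1.xyzw, -src2.yyyy
example = 0xC | (0x1B << 0x05) | (1 << 0x0D) | (0x55 << 0x0E)
decoded = decode_descriptor(example)
```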
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. It is only possible to use address registers on constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction, which converts a float value to an integer (truncating the fractional part, as described for MOVA above). Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
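The addressing rule can be sketched as a small function. It takes the raw SRC value (using the SRC mapping from the Registers section, where constants start at 0x20) and returns the register actually read. The name <code>effective_src</code> is hypothetical, and what happens when the sum leaves the constant range is not covered by this page, so the sketch does not model it.<br />

```python
def effective_src(src, idx, a0x, a0y, aL):
    """Return the raw SRC value actually read, after relative addressing."""
    offset = (0, a0x, a0y, aL)[idx]  # IDX: 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if src < 0x20:
        # Input attribute or temporary register: the address register is ignored.
        return src
    # Constant register: add the address register (out-of-range sums not modelled).
    return src + offset


# The example from the text: IDX = 2, a0.y = 3, SRC1 = c8 -> c11
resolved = effective_src(0x20 + 8, 2, 0, 3, 0)
```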
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
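A possible way to model CMP with these operators is shown below. It follows the observation that operators 6 and 7 seem to always return true; the function names are hypothetical, and the hardware's subnormal handling is not modelled.<br />

```python
import operator

# Raw CMPX/CMPY values 0x0-0x5, as listed in the table above
CMP_OPS = {
    0: operator.eq,
    1: operator.ne,
    2: operator.lt,
    3: operator.le,
    4: operator.gt,
    5: operator.ge,
}


def compare(op, a, b):
    """Evaluate one comparison operator; 6 and 7 are treated as always true."""
    fn = CMP_OPS.get(op)
    return True if fn is None else fn(a, b)


def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of the operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])
```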
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
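The four CONDOP cases can be written out directly. In this sketch, <code>condition</code> is a hypothetical helper taking the raw CONDOP, REFX and REFY fields of a format 2 instruction plus the current cmp.x and cmp.y booleans.<br />

```python
def condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate the condition of a format 2 instruction."""
    x_matches = cmp_x == bool(refx)
    y_matches = cmp_y == bool(refy)
    if condop == 0:  # OR
        return x_matches or y_matches
    if condop == 1:  # AND
        return x_matches and y_matches
    if condop == 2:  # X only
        return x_matches
    return y_matches  # condop == 3: Y only
```

For instance, with CONDOP = 2 and REFX = 0, the instruction would take effect when cmp.x is false, regardless of cmp.y.<br />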
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to the same output register/component more than once appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| o0-o15<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| v0-v15<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
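For disassembly purposes, the two mapping tables above translate into simple range checks. The helper names below are hypothetical.<br />

```python
def dst_name(raw):
    """Map a 5-bit DST field to a register name."""
    if not 0 <= raw <= 0x1F:
        raise ValueError("DST is a 5-bit field")
    return f"o{raw}" if raw < 0x10 else f"r{raw - 0x10}"


def src_name(raw):
    """Map a 7-bit SRC field to a register name (5-bit fields only reach v/r)."""
    if not 0 <= raw <= 0x7F:
        raise ValueError("SRC is at most a 7-bit field")
    if raw < 0x10:
        return f"v{raw}"
    if raw < 0x20:
        return f"r{raw - 0x10}"
    return f"c{raw - 0x20}"
```

Because a 5-bit SRC field cannot exceed 0x1F, it can only name input attribute or temporary registers, which matches the restriction on constant registers described earlier.<br />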
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
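The sketch below decodes the 24-bit format described in this section and replays the halving experiment. It rests on assumptions not spelled out here: an exponent bias of 63, an all-ones exponent encoding inf/NaN as in IEEE formats, and a zero exponent field always reading as +0. The function names are hypothetical.<br />

```python
BIAS = 63                          # assumed exponent bias for a 7-bit exponent
MIN_NORMAL = 2.0 ** (1 - BIAS)     # smallest positive normal value under that bias


def f24_to_float(raw):
    """Decode 1 sign bit, 7 exponent bits and 16 mantissa bits to a Python float."""
    sign = -1.0 if (raw >> 23) & 1 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        return 0.0                 # zero or subnormal: read as +0
    if exponent == 0x7F:           # assumed inf/NaN encoding
        return sign * float("inf") if mantissa == 0 else float("nan")
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - BIAS)


def halvings_until_zero():
    """Count multiplications of 1.0 by 0.5 until the result is flushed to zero."""
    value, count = 1.0, 0
    while value != 0.0:
        value *= 0.5
        if value < MIN_NORMAL:     # result would be subnormal: flushed to +0
            value = 0.0
        count += 1
    return count
```

Under these assumptions the loop stops after 63 halvings, matching the hardware observation.<br />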
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21973
SHBIN
2022-10-29T19:36:08Z
<p>Oreo639: Update DVOJ section</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats like, for example, [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc). There may be multiple executable headers, so in this sense multiple shaders sharing the same program code can be stored in a single SHBIN. Hence for the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
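A minimal parsing sketch for the binary header follows. It assumes all fields are little-endian, and <code>parse_dvlb</code> is a hypothetical name rather than part of any existing tool. The second return value is where the DVLP section would start, given that it directly follows the offset table.<br />

```python
import struct


def parse_dvlb(data):
    """Return (list of DVLE offsets, offset of the DVLP section)."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN: missing DVLB magic")
    # DVLE offsets are u32 values relative to the start of the DVLB section
    dvle_offsets = list(struct.unpack_from(f"<{count}I", data, 8))
    dvlp_offset = 8 + 4 * count
    return dvle_offsets, dvlp_offset


# Synthetic header with two DVLE offsets, followed by the DVLP magic
sample = b"DVLB" + struct.pack("<I", 2) + struct.pack("<2I", 0x40, 0x80) + b"DVLP"
offsets, dvlp_at = parse_dvlb(sample)
```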
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Likely a version number)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8-bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
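The DVLP fields can be read in one <code>struct</code> call. The sketch below assumes little-endian fields and keeps each 8-byte operand descriptor entry as raw bytes, since this page does not describe how the 8 bytes are split. The names are hypothetical and the sample blob is synthetic.<br />

```python
import struct


def parse_dvlp(data, base):
    """Parse a DVLP section starting at byte offset `base` of `data`."""
    (magic, version, code_off, code_words, desc_off, desc_count,
     _unknown1, _unknown2, sym_off, sym_size) = struct.unpack_from("<4s9I", data, base)
    if magic != b"DVLP":
        raise ValueError("missing DVLP magic")
    # Shader binary blob: `code_words` 32-bit instruction words
    code = list(struct.unpack_from(f"<{code_words}I", data, base + code_off))
    # Operand descriptor table: `desc_count` entries of 8 bytes each (kept raw)
    start = base + desc_off
    descriptors = [data[start + 8 * i:start + 8 * (i + 1)] for i in range(desc_count)]
    symbols = data[base + sym_off:base + sym_off + sym_size]
    return {"version": version, "code": code,
            "descriptors": descriptors, "symbols": symbols}


# Synthetic section: 0x28-byte header, 2 code words, 1 descriptor, no symbols
header = struct.pack("<4s9I", b"DVLP", 0, 0x28, 2, 0x30, 1, 0, 0, 0x38, 0)
sample = header + struct.pack("<2I", 0x4C000000, 0x88000000) + bytes(8)
dvlp = parse_dvlp(sample, 0)
```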
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Likely a version number)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex and geometry shader outmaps (geometry shader)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
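The DVLE header is 0x40 bytes long and maps onto a single packed <code>struct</code> format. As before, little-endian fields are assumed, and every field name below is a hypothetical label chosen for this sketch.<br />

```python
import struct

DVLE_FORMAT = "<4sHBBIIHHBBBB10I"   # packed, 0x40 bytes in total
DVLE_FIELDS = [
    "magic", "version", "shader_type", "merge_outmaps",
    "main_offset", "endmain_offset", "input_mask", "output_mask",
    "gs_type", "gs_fixed_start", "gs_variable_count", "gs_fixed_count",
    "const_offset", "const_count", "label_offset", "label_count",
    "out_offset", "out_count", "uniform_offset", "uniform_count",
    "symbol_offset", "symbol_size",
]


def parse_dvle(data, base):
    """Parse a DVLE header at byte offset `base`; table offsets stay DVLE-relative."""
    values = struct.unpack_from(DVLE_FORMAT, data, base)
    header = dict(zip(DVLE_FIELDS, values))
    if header["magic"] != b"DVLE":
        raise ValueError("missing DVLE magic")
    return header


# Synthetic geometry shader header (all values hypothetical)
sample = struct.pack(DVLE_FORMAT, b"DVLE", 0x1002, 1, 0, 0, 10, 0x0003, 0x0001,
                     0, 0, 0, 0, 0x40, 1, 0x54, 0, 0x54, 1, 0x5C, 0, 0x5C, 0)
dvle = parse_dvle(sample, 0)
```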
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Label ID<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Unknown (always 1?)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Size of label's location (in words). 0xFFFFFFFF/(uint32_t)-1 if there is no size.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry is constituted by a header and the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant register ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Boolean constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Integer constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| floating-point constant register ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
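Decoding a 0x14-byte constant table entry could look like the sketch below. It assumes little-endian fields, that each float24 value sits in the low 24 bits of its 4-byte slot, and a float24 exponent bias of 63; none of these is stated on this page. The function names are hypothetical.<br />

```python
import struct


def f24_to_float(raw):
    """Decode 1 sign, 7 exponent, 16 mantissa bits (bias 63 assumed)."""
    sign = -1.0 if (raw >> 23) & 1 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        return 0.0
    if exponent == 0x7F:  # assumed inf/NaN encoding
        return sign * float("inf") if mantissa == 0 else float("nan")
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)


def parse_constant(entry):
    """Return (kind, register ID, value) for one constant table entry."""
    ctype, reg = entry[0], entry[2]
    if ctype == 0:
        return ("bool", reg, bool(entry[4]))
    if ctype == 1:
        return ("ivec4", reg, tuple(entry[4:8]))   # x, y, z, w as u8
    if ctype == 2:
        raw = struct.unpack_from("<4I", entry, 4)
        return ("vec4", reg, tuple(f24_to_float(r & 0xFFFFFF) for r in raw))
    raise ValueError(f"unknown constant type {ctype}")


# Synthetic vec4 entry for register 5 holding (1.0, 2.0, 0.0, -1.0)
sample = bytes([2, 0, 5, 0]) + struct.pack("<4I", 0x3F0000, 0x400000, 0, 0xBF0000)
constant = parse_constant(sample)
```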
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Output type (see table below)<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Register ID<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Output attribute component mask (e.g. 5=xz)<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Unknown (Consistently the same number throughout the DVLE, may vary between DVLEs?)<br />
|-<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Type<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
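An output table entry can be decoded as four 16-bit fields. The sketch assumes little-endian fields and, going by the "5=xz" example, that bit 0 of the component mask is x and bit 3 is w. The names are hypothetical.<br />

```python
import struct

OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}


def parse_output_entry(entry):
    """Return (semantic name, output register ID, component string)."""
    out_type, reg, mask, _unknown = struct.unpack_from("<4H", entry, 0)
    # Assumed bit order: bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    return OUTPUT_TYPES.get(out_type, "unknown"), reg, components


# Synthetic entry: texcoord0 written to o2.xz
sample = struct.pack("<4H", 0x3, 2, 5, 0)
output = parse_output_entry(sample)
```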
<br />
=== Uniform Table Entry ===<br />
<br />
Keep in mind that the term "Uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment) and not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table contains a list of all registers whose initial values are derived by an external source along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
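<br />
The mapping above can be expressed as a small lookup. This is only a sketch: the helper names are hypothetical, and indices that fall outside the listed ranges (such as 0x74-0x77) are rejected because the table does not define them.<br />

```python
import struct

# (first index, last index, register prefix), taken from the table above.
REGISTER_RANGES = [
    (0x00, 0x0F, "v"),
    (0x10, 0x6F, "c"),
    (0x70, 0x73, "i"),
    (0x78, 0x87, "b"),
]

def register_name(index: int) -> str:
    """Translate a unified register index into a name such as 'c12'."""
    for first, last, prefix in REGISTER_RANGES:
        if first <= index <= last:
            return f"{prefix}{index - first}"
    raise ValueError(f"register index {index:#x} is not in a known range")

def parse_uniform_entry(data: bytes, offset: int = 0) -> tuple:
    """Decode one 8-byte uniform table entry: symbol offset, start, end."""
    symbol_offset, start, end = struct.unpack_from("<IHH", data, offset)
    return symbol_offset, register_name(start), register_name(end)
```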
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or multiple DVOJs can be linked to a DVLB file, similarly to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all the information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x2<br />
| Unknown. (Likely a version number)<br />
|-<br />
| 0x06<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x07<br />
| 0x1<br />
| true = merge vertex and geometry shader outmaps (geometry shader)<br />
|-<br />
| 0x08<br />
| 0x2<br />
| Bitmask of used input registers.<br />
|-<br />
| 0x0A<br />
| 0x2<br />
| Bitmask of used output registers.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset to operand descriptor table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
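<br />
The header layout can be sketched as a single little-endian structure of 0x58 bytes. The field names below are illustrative labels rather than official ones, and the sketch assumes that the output-register bitmask sits at offset 0x0A, directly after the input-register bitmask.<br />

```python
import struct

# 4-byte magic, u16, u8, u8, u16, u16, u32, then 18 u32 offset/count fields.
DVOJ_HEADER = struct.Struct("<4sHBBHHI18I")

# Hypothetical field names, in the order the table lists the fields.
DVOJ_FIELDS = (
    "magic", "version", "shader_type", "merge_outmaps",
    "input_mask", "output_mask", "padding",
    "constant_table_offset", "constant_table_count",
    "label_table_offset", "label_table_count",
    "code_offset", "code_size_words",
    "opdesc_table_offset", "opdesc_table_count",
    "unknown1_offset", "unknown1_count",
    "unknown2_offset", "unknown2_count",
    "output_table_offset", "output_table_count",
    "uniform_table_offset", "uniform_table_count",
    "symbol_table_offset", "symbol_table_size",
)

def parse_dvoj_header(data: bytes) -> dict:
    """Unpack the fixed-size DVOJ header found at the start of the file."""
    if len(data) < DVOJ_HEADER.size:
        raise ValueError("buffer too small for a DVOJ header")
    header = dict(zip(DVOJ_FIELDS, DVOJ_HEADER.unpack_from(data, 0)))
    if header["magic"] != b"DVOJ":
        raise ValueError("missing DVOJ magic")
    return header
```

Because all offsets are relative to the DVOJ start, each table can then be sliced straight out of the same buffer, for example <code>data[header["symbol_table_offset"]:][:header["symbol_table_size"]]</code>.<br />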
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
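<br />
If the guess above is right, a per-instruction source map could be recovered as sketched below. The function names are hypothetical, and the sketch assumes that symbol table strings are NUL-terminated ASCII.<br />

```python
import struct

def read_cstring(table: bytes, offset: int) -> str:
    """Read a NUL-terminated string from the symbol table (assumed format)."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii", errors="replace")

def parse_line_info(block: bytes, count: int, symbols: bytes) -> list:
    """Return one (filename, line) pair per instruction; index N maps to instruction N."""
    items = struct.iter_unpack("<II", block[: count * 8])
    return [(read_cstring(symbols, name_offset), line) for name_offset, line in items]
```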
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Applications&diff=21561
Homebrew Applications
2021-08-15T07:28:00Z
<p>Oreo639: Remove irrelevant note</p>
<hr />
<div>== Installing ==<br />
Applications are installed by copying the necessary files directly to the <code>3ds/</code> folder in the root of the SD card (preferred for new designs), or to a subdirectory of <code>3ds/</code>, in which case the subfolder must be named identically to its executable. Most applications come with the following files:<br />
* <code>[appname].3dsx</code>: The executable.<br />
* <code>[appname].smdh</code>: The icon/metadata. (Not required in any case, and may be integrated into the <code>.3dsx</code>)<br />
* <code>[appname].xml</code>: The list of supported targets (i.e. installed titles which the app supports replacing in memory at runtime, thus inheriting its permissions), and of any arguments to be passed to the .3dsx. (Optional)<br />
<br />
A standalone .xml file can point to a differently-named .3dsx, launching it with potentially different arguments so that a single application can run in different modes.<br />
<br />
The [[Homebrew Launcher]] will scan the SD card for all <code>.3dsx</code> files, but will only display an icon for those that have one according to the format described above. Sufficiently recent versions can freely navigate the filesystem to select an application.<br />
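<br />
The layout rules above can be checked on a PC before copying files to the console. The following is a rough sketch, not how the Homebrew Launcher itself scans; the function name is hypothetical and only the two locations described above are considered.<br />

```python
from pathlib import Path

def find_installed_apps(sd_root) -> list:
    """List .3dsx files placed where the layout rules above expect them."""
    base = Path(sd_root) / "3ds"
    if not base.is_dir():
        return []
    # Executables placed directly in 3ds/ ...
    candidates = sorted(base.glob("*.3dsx"))
    # ... and executables in a subfolder that carries the same name.
    candidates += [d / f"{d.name}.3dsx" for d in sorted(base.iterdir()) if d.is_dir()]
    apps = []
    for exe in candidates:
        if exe.is_file():
            apps.append({
                "name": exe.stem,
                "executable": exe,
                "has_smdh": exe.with_suffix(".smdh").is_file(),
                "has_xml": exe.with_suffix(".xml").is_file(),
            })
    return apps
```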
<br />
== List ==<br />
<br />
=== Launchers ===<br />
<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Open-Source<br />
|-<br />
| [https://github.com/fincs/new-hbmenu Homebrew Launcher]<br />
| Run homebrew on your 3DS! Compatible with Rosalina and all prior 3dsx loading solutions<br />
| [https://devkitpro.org devkitPro]<br />
| [https://github.com/fincs/new-hbmenu/releases Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Starter Pack]<br />
| Everything to get you started.<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/starter.zip Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Launcher (v1.x)]<br />
| The old version of the 3DS Homebrew Launcher, originally created for ninjhax 1.x (Discontinued)<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/boot.3dsx Here]<br />
| Yes<br />
|-<br />
| Mashers' HBL<br />
| Homebrew Launcher with grid and folder support. (Discontinued)<br />
| [[User:Mashers|Mashers]]<br />
| [https://github.com/d0k3/3DS-Extended-Homebrew-Starter-Pack/blob/35b8ab7dc40cb550b6ea45da319cdd0a0a3b2b54/boot.3dsx Here]<br />
| Lost in Mashers' retirement<br />
|}<br />
<br />
=== Applications ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/VideahGams/3dsfetch 3dsfetch]<br />
| Small 3DS version of a popular Linux ricing script called screenfetch.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/3dsfetch/tree/master Here]<br />
| Yes<br />
| 2015-09-17<br />
|-<br />
| [https://github.com/Klairm/3DS-PluginsFolder 3DS-PluginsFolder]<br />
| Simple program that creates folders named after each TitleID and copies plugins into them<br />
| [[User:Klairm|Klairm]]<br />
| [https://github.com/Klairm/3DS-PluginsFolder Here]<br />
| Yes<br />
| 2020-09-06<br />
|-<br />
| [https://github.com/JohnodonCode/TSI9 TSI9]<br />
| A simple program for detecting touch screen input.<br />
| [[User:Johnodon|Johnodon]]<br />
| [https://github.com/JohnodonCode/TSI9/releases Here]<br />
| Yes<br />
| 2020-01-18<br />
|-<br />
| [https://github.com/joel16/3DSident/ 3DSident]<br />
| Identity tool for the Nintendo 3DS heavily inspired by PSPident.<br />
| [[User:Joel16|Joel16]]<br />
| [https://github.com/joel16/3DSident/releases Here]<br />
| Yes<br />
| 2018-08-02<br />
|-<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Clear MAC Filter]<br />
| Reset 8-hour per-console StreetPass rate limiting<br />
| tastymeatball<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Here]<br />
| Yes<br />
| 2018-08-24<br />
|-<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases CtrRGBPATTY]<br />
| Generate patches that edit LED notifications<br />
| CPunch<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases Here]<br />
| Yes<br />
| 2017-11-03<br />
|-<br />
| [https://github.com/plutooo/ctrrpc ctrrpc]<br />
| A small and easily extensible RPC server/client written in C/Python. Allows you to quickly poke service-commands and <code>syscall</code>s over Wi-Fi from a Python shell on your PC. Useful during reverse-engineering. ''No longer under (active) development?''<br />
| [[User:plutooo|plutoo]]<br />
| Build from [https://github.com/plutooo/ctrrpc repo]<br />
| Yes<br />
| 2014-11-10<br />
|-<br />
| [https://github.com/yellows8/ctr-streaming-server ctr-streaming-server]<br />
| A 3DS homebrew audio/video playback server. It can also send [[HID_Shared_Memory|HID]] state to the client (see the README) when enabled. The included <code>parse_hidstream</code> tool can be used to parse that HID data to simulate keyboard/mouse input events, via Linux <code>uinput</code>. ''No longer under (active) development?''<br />
| [[User:yellows8|yellows8]]<br />
| Build from [https://github.com/yellows8/ctr-streaming-server repo]<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/DownloadMii/DownloadMii-3DS DownloadMii]<br />
| A WIP repo-based online marketplace for homebrew applications & games. Appears to be inactive and out of date.<br />
| [[User:filfat|filfat]]<br />
| Build from [https://github.com/DownloadMii/DownloadMii-3DS repo]<br />
| Yes<br />
| 2015-11-24<br />
|-<br />
| [https://github.com/linoma/fb43ds fb43ds]<br />
| A simple 3DS Facebook chat client<br />
| [[User:linoma|linoma]]<br />
| Build from [https://github.com/linoma/fb43ds repo]<br />
| Yes<br />
| 2015-04-07<br />
|-<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot for-anyone-who-walks-a-lot]<br />
| Tool to get past the 10 coin per day limit on earning Play Coins by walking.<br />
| [[User:iamevn|iamevn]]<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot/releases Here]<br />
| Yes<br />
| 2016-03-26<br />
|-<br />
| [https://github.com/zeta0134/3ds-homebrew-browser Homebrew Browser]<br />
| Download homebrew from the internet!<br />
| [[User:cromo|cromo]], [[User:zeta0134|zeta0134]]<br />
| [https://github.com/zeta0134/3ds-homebrew-browser/releases Here]<br />
| Yes<br />
| 2015-10-07<br />
|-<br />
| [https://github.com/MrJPGames/NFCReader NFCReader]<br />
| Allows you to use your 3DS as an NFC/RFID UID scanner.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/NFCReader/releases Here]<br />
| Yes<br />
| 2017-01-21<br />
|-<br />
| [https://github.com/SciresM/ScreenInfo ScreenInfo]<br />
| Identify whether New 3DS LCD panels are TN or IPS.<br />
| [[User:SciresM|SciresM]]<br />
| [https://github.com/SciresM/ScreenInfo/releases Here]<br />
| Yes<br />
| 2016-09-04<br />
|}<br />
<br />
=== Game Engines ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/TurtleP/LovePotion LÖVE Potion]<br />
| [https://love2d.org/ LÖVE] for Nintendo 3DS.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/LovePotion/releases Here]<br />
| [https://github.com/TurtleP/LovePotion Yes]<br />
| 2021-01-16<br />
|-<br />
| [https://ctrulua.github.io/ ctrµLua]<br />
| A Lua interpreter for 3DS, brought to life by the remnants of the µLua community.<br />
| [[User:Firew0lf|Firew0lf]], Reuh, Negi<br />
| [https://github.com/ctruLua/ctruLua/releases Here]<br />
| Yes<br />
| 2016-06-27<br />
|-<br />
| [https://blog.easyrpg.org/2016/05/player-for-nintendo-3ds/ EasyRPG Player]<br />
| RPG Maker 2000/2003 interpreter<br />
| [[User:Rinnegatamante|Rinnegatamante]] & EasyRPG Team<br />
| [https://easyrpg.org/player/downloads/ Here]<br />
| [https://github.com/EasyRPG/Player Yes]<br />
| 2019-03-03<br />
|-<br />
| [https://github.com/Rinnegatamante/lpp-3ds LuaPlayer+ 3DS]<br />
| First Lua interpreter 3DS homebrew, under Lua 5.3.1<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [https://github.com/Rinnegatamante/lpp-3ds/releases Here]<br />
| Yes<br />
| 2016-09-21<br />
|-<br />
| [http://vault.digitalmzx.net MegaZeux 3DS]<br />
| A port of the MegaZeux GCS to the 3DS.<br />
| MegaZeux developers<br />
| [http://vault.digitalmzx.net Here]<br />
| [https://github.com/AliceLR/megazeux Yes]<br />
| 2018-03-04<br />
|}<br />
<br />
=== Games ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/MrJPGames/2048-3D 2048-3D]<br />
| A port of the popular game 2048 for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/2048-3D/releases Here]<br />
| Yes<br />
| 2016-02-12<br />
|-<br />
| ''[https://github.com/smealum/3dscraft 3DSCraft]''<br />
| A Minecraft port for the 3DS. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/3dscraft repo] (alt. [https://smealum.github.io/3dscraft/downloads/3dscraft_141120.zip here])<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/markwinap/3DS_Nyan_Cat 3DS Nyan Cat]<br />
| A port of Nyan Cat for the 3DS, using <code>LIBSF2D</code>.<br />
| [[User:markwinap|markwinap]]<br />
| Build from [https://github.com/markwinap/3DS_Nyan_Cat repo] (alt. [https://www.dropbox.com/s/e400my3xm0zw74r/nyan_cat.zip?dl=0 here])<br />
| Yes<br />
| 2015-05-26<br />
|-<br />
| [https://github.com/TurtleP/Antibounce Antibounce]<br />
| "Move your player to bounce around and collect coins. Go between screens through the holes in the sides of the floor. 3D can also be enabled."<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/Antibounce/releases Here]<br />
| Yes<br />
| 2015-12-23<br />
|-<br />
| [https://github.com/Magicrafter13/Breakout Breakout]<br />
| "A 3ds Breakout Clone."<br />
| [[User:Magicrafter13|Magicrafter13]]<br />
| [https://github.com/Magicrafter13/Breakout/releases Here]<br />
| Yes<br />
| 2017-10-17<br />
|-<br />
| ''[https://github.com/UnsureSherlock/checkers3ds checkers3ds]''<br />
| A checkers game in glorious ASCII. ''No longer under development.''<br />
| [[User:UnsureSherlock|UnsureSherlock]]<br />
| Build from [https://github.com/UnsureSherlock/checkers3ds repo]<br />
| Yes<br />
| 2016-02-25<br />
|-<br />
| [https://github.com/Kaisogen/CookieCollector-3DS- Cookie Collector]<br />
| A tiny adaptation of the popular [https://en.wikipedia.org/wiki/Cookie_Clicker Cookie Clicker] game for the 3DS.<br />
| [[User:Kaisogen|Kaisogen]]<br />
| [https://github.com/Kaisogen/CookieCollector-3DS-/releases Here]<br />
| Yes<br />
| 2017-06-04<br />
|-<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS Cookie Clicker 3DS]<br />
| A simple Cookie Clicker type of game inspired by [[User:Kaisogen|Kaisogen]]'s Cookie Collector<br />
| [[User:TheMachinumps|TheMachinumps]]<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS/releases Here]<br />
| Yes<br />
| 2016-08-27<br />
|-<br />
| [https://github.com/masterfeizz/EDuke3D EDuke3D]<br />
| An unofficial port of EDuke32 for the 3DS.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/EDuke3D/releases Here]<br />
| Yes<br />
| 2016-05-09<br />
|-<br />
| [https://github.com/BHSPitMonkey/Helii3DS Helii]<br />
| A port of [https://github.com/BHSPitMonkey/Helii3D Helii] for the 3DS.<br />
| [[User:BHSPitMonkey|BHSPitMonkey]]<br />
| [https://github.com/BHSPitMonkey/Helii3DS/releases Here]<br />
| Yes<br />
| 2015-09-18<br />
|-<br />
| [https://github.com/sgowen/insectoid-defense Insectoid Defense]<br />
| A Sci-Fi Tower Defense game.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/insectoid-defense/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://github.com/VideahGams/NumberFucker3DS NumberFucker3DS]<br />
| Simple math game, originally used as a debug game for LÖVE Potion.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/NumberFucker3DS Here]<br />
| Yes<br />
| 2015-09-19<br />
|-<br />
| [https://github.com/nop90/ZeldaROTH/ Zelda ROTH for 3DS]<br />
| A port of Legend of Zelda: Return of the Hylian, a Zelda fangame, to 3DS.<br />
| [[User:nop90|nop90]]<br />
| [https://github.com/nop90/ZeldaROTH/releases Here]<br />
| Yes<br />
| 2016-09-11<br />
|-<br />
| [https://github.com/MrJPGames/Mastermind-3DS Mastermind 3DS]<br />
| A port of Mastermind for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Mastermind-3DS/releases Here]<br />
| Yes<br />
| 2015-08-15<br />
|-<br />
| [https://pyug.at/PyWeek/2012-09 One Whale Trip]<br />
| Five-lane underwater whale swimming/pearl pickup adventure game in Python.<br />
| [[User:thp|thp]]<br />
| [https://bitbucket.org/pyugat/pyweek1209/downloads/OneWhaleTrip-2016-07-18-3DS.zip Here]<br />
| [https://bitbucket.org/pyugat/pyweek1209/src/bce5156dbee72f38c4fcf5d7b3df9cfb9ddd5b0a/3ds Yes]<br />
| 2016-10-02<br />
|-<br />
| [https://github.com/sergiou87/open-supaplex OpenSupaplex]<br />
| An open source 1:1 reimplementation of [https://en.wikipedia.org/wiki/Supaplex Supaplex] for the 3DS.<br />
| [https://github.com/sergiou87 sergiou87]<br />
| [https://github.com/sergiou87/open-supaplex/releases Here]<br />
| [https://github.com/sergiou87/open-supaplex Yes]<br />
| 2020-06-29<br />
|-<br />
| [https://github.com/gatuno/PaddlePuffle3DS Paddle Puffle 3DS]<br />
| A port of [http://puffles.gatuno.mx Paddle Puffle] for the 3DS.<br />
| [[User:Peanut42|Peanut42]]<br />
| [http://puffles.gatuno.mx/releases/paddlepuffle3ds.zip Here]<br />
| [https://github.com/gatuno/PaddlePuffle3DS Yes]<br />
| 2015-07-05<br />
|-<br />
| [http://david.dantoine.org/proyecto/26/ Pituka Classics]<br />
| Play CPC classics using [http://david.dantoine.org/proyecto/4/ Pituka Emulator-Core] on 3DS.<br />
| [[User:D_Skywalk|D_Skywalk]]<br />
| [http://david.dantoine.org/descargas/72 Rick Dangerous] [http://david.dantoine.org/descargas/2 Core]<br />
| [http://david.dantoine.org/descargas/4 Yes (core)]<br />
| 2016-02-26<br />
|-<br />
| [https://github.com/smealum/portal3DS Portal3DS]<br />
| An adaptation of [https://en.wikipedia.org/wiki/Portal_(video_game) Portal] for the 3DS.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/portal3DS repo] (Precompiled [http://www.mediafire.com/file/yo463wt6y4tybch/portal3DS.rar here])<br />
| Yes<br />
| 2015-08-18<br />
|-<br />
| [https://github.com/masterfeizz/ctrQuake ctrQuake]<br />
| An unofficial port of Quake for the 3DS, fully playable.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/ctrQuake/releases Here]<br />
| Yes<br />
| 2016-09-16<br />
|-<br />
| [https://github.com/MrJPGames/Othello-3DS/ Reversi]<br />
| [https://en.wikipedia.org/wiki/Reversi Reversi] for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Othello-3DS/releases Here]<br />
| Yes<br />
| 2016-03-05<br />
|-<br />
| [https://github.com/landm2000/sokoban Sokoban]<br />
| An unofficial port of the puzzle game [https://en.wikipedia.org/wiki/Sokoban Sokoban] for the 3DS.<br />
| [[User:Landm|Landm]]<br />
| [https://github.com/landm2000/sokoban/tree/master Here]<br />
| Yes<br />
| 2016-03-14<br />
|-<br />
| [https://github.com/TurtleP/SpaceFruit/ Space Fruit]<br />
| Hackathon game by 4 friends ported to 3DS. Asteroids but with fruit.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/SpaceFruit/releases Here]<br />
| Yes<br />
| 2016-04-09<br />
|-<br />
| [https://github.com/derrekr/srb2_3ds/ SRB2 3DS]<br />
| An unofficial port of [https://wiki.srb2.org/wiki/Version_2.1 Sonic Robo Blast 2 Version 2.1.20] to the New 3DS, made by derrek, a known vulnerability researcher and homebrew developer. SRB2 2.2.x versions (2.2.8 was the latest at the time of writing) have not been ported and probably will not be.<br />
| [https://github.com/derrekr/ derrekr]<br />
| [https://github.com/derrekr/srb2_3ds Here (Don't use 2.2 files!)]<br />
| Yes<br />
| 2018-12-23<br />
|-<br />
| [https://github.com/sgowen/tappy-plane Tappy Plane]<br />
| A port of [https://en.wikipedia.org/wiki/Flappy_Bird Flappy Bird] for 3DS, but with a colorful plane.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/tappy-plane/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://thp.itch.io/tetrepetete-3ds Tetrepetete 3DS]<br />
| A game with blocks.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/tetrepetete-3ds Here]<br />
| No<br />
| 2016-06-29<br />
|-<br />
| [https://thp.itch.io/that-rabbit-game-3ds That Rabbit Game 3DS]<br />
| Inverse duck hunt with accelerometer input and stereoscopic 3D.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/that-rabbit-game-3ds Here]<br />
| No<br />
| 2016-07-04<br />
|-<br />
| [https://github.com/Steveice10/WorldOf3DSand World of 3DSand]<br />
| A port of World of Sand for the 3DS.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/WorldOf3DSand/releases Here]<br />
| Yes<br />
| 2016-07-12<br />
|-<br />
| [https://github.com/smealum/yeti3DS Yeti3DS]<br />
| A quick and dirty port of Derek Evans' Yeti3D software rendering engine.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/yeti3DS repo]<br />
| Yes<br />
| 2015-08-07<br />
|-<br />
| [https://thp.itch.io/loonies-8192 Loonies 8192]<br />
| A Mini Retro Puzzle for DOS, the PSP and 3DS (Homebrew)<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/loonies-8192 Here]<br />
| No<br />
| 2019-01-27<br />
|-<br />
| [https://github.com/MrHuu/devilutionX-3ds DevilutionX]<br />
| A 3DS Port of Diablo 1.<br />
| [[User:MrHuu|MrHuu]]<br />
| [https://github.com/MrHuu/devilutionX-3ds Here]<br />
| Yes<br />
| 2020-05-08<br />
|-<br />
|}<br />
<br />
=== Emulators ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| ''[https://github.com/st4rk/3DNES 3DNES]''<br />
| A NES emulator, without sound support. ''No longer under development.''<br />
| st4rk, gdkChan<br />
| [https://github.com/St4rk/3DNES/raw/master/3DNES_old.3dsx Here]<br />
| Yes<br />
| 2015-03-28<br />
|-<br />
| [http://asie.pl/homebrew/#atari800 atari800-3DS]<br />
| An Atari 8-bit home computer emulator.<br />
| asie<br />
| [http://asie.pl/homebrew/#atari800 Here]<br />
| [https://github.com/asiekierka/atari800-3ds Yes]<br />
| 2016-10-29<br />
|-<br />
| [https://github.com/StapleButter/blargSnes blargSnes]<br />
| A Super Nintendo (SNES) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/BlargSnes_Compatibility_List here].<br />
| StapleButter<br />
| [http://blargsnes.kuribo64.net/download/blargSnes_1.3b.zip Here]<br />
| Yes<br />
| 2015-06-12<br />
|-<br />
| [https://github.com/xerpi/CHIP-3DS CHIP-3DS]<br />
| A simple and slow CHIP-8 emulator.<br />
| xerpi<br />
| Build from [https://github.com/xerpi/CHIP-3DS repo] (alt. [https://www.mediafire.com/?y94yjhzf70fsfsi here])<br />
| Yes<br />
| 2015-04-02<br />
|-<br />
| [https://gbatemp.net/threads/chip8-3ds.434425/ CHIP8-2DS]<br />
| CHIP-8 emulator with savestates and touch controls.<br />
| nopy4869<br />
| [https://github.com/nopy4869/CHIP8-2DS/releases Here]<br />
| Yes<br />
| 2016-07-20<br />
|-<br />
| [https://github.com/shinyquagsire23/gpsp CitrAGB]<br />
| Yet another GBA emulator for the 3DS.<br />
| [[User:shinyquagsire23|Shiny Quagsire]]<br />
| Build from [https://github.com/shinyquagsire23/gpsp/tree/master/3ds repo] (alt. [https://www.dropbox.com/s/sxb7x34u58g4zo2/3ds.3dsx?dl=0 here])<br />
| Yes<br />
| 2015-09-21<br />
|-<br />
| [https://github.com/Steveice10/GameYob GameYob]<br />
| A Game Boy (Color) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/GameYob_3DS_Compatibility_List here].<br />
| Drenn/Steveice10<br />
| [https://github.com/Steveice10/GameYob/releases Here]<br />
| Yes<br />
| 2016-07-17<br />
|-<br />
| [https://github.com/mgba-emu/mgba mGBA]<br />
| A GBA emulator that runs well without kernel hax.<br />
| endrift<br />
| [https://mgba.io/downloads.html Here]<br />
| Yes<br />
| 2016-10-13<br />
|-<br />
| [https://github.com/mrdanielps/r3Ddragon r3Ddragon]<br />
| A WIP Virtual Boy emulator for the 3DS based on Reality Boy / Red Dragon.<br />
| mrdanielps<br />
| [https://github.com/mrdanielps/r3Ddragon/releases Here]<br />
| Yes<br />
| 2016-08-16<br />
|-<br />
| [https://github.com/libretro/RetroArch RetroArch]<br />
| A multisystem emulator. (GB, GBA, SNES, Genesis, CPS1, CPS2, etc.)<br />
| libretro<br />
| [http://buildbot.libretro.com/nightly/nintendo/3ds/ Here]<br />
| Yes<br />
| Undergoing rapid development.<br />
|-<br />
| [https://github.com/bubble2k16/snes9x_3ds SNES9x for 3DS]<br />
| A SNES emulator for the old 3DS / 2DS. Optimised from Snes9x 1.43 and runs many games at full speed. Compatibility list [http://wiki.gbatemp.net/wiki/Snes9x_for_3DS here]<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/snes9x_3ds/releases Here]<br />
| Yes<br />
| 2017-02-11<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds VirtuaNES for 3DS]<br />
| A NES emulator for the old 3DS / 2DS. Optimised from VirtuaNES 0.9.7 and runs many games at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/emus3ds/releases Here]<br />
| Yes<br />
| 2017-03-23<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds TemperPCE for 3DS]<br />
| A PC-Engine/Turbografx-16 emulator for the old 3DS / 2DS. Optimised from Temper; runs all games, including CD-ROM and SGX games, at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/temperpce_3ds/releases Here]<br />
| Yes<br />
| 2017-06-19<br />
|-<br />
|}<br />
<br />
===Theme managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool 3DS HomeMenu extdata Tool]<br />
| Tool for accessing the SD extdata which Home Menu uses. This essentially allows writing custom themes to extdata which get loaded at Home Menu startup.<br />
| [[User:yellows8|yellows8]]<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool/releases Here]<br />
| Yes<br />
| 2015-08-17<br />
|-<br />
| [https://github.com/Rinnegatamante/CHMM2 Custom Home Menu Manager 2]<br />
| Theme manager for Nintendo 3DS. Discontinued.<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/CHMM2.rar Here]<br />
| Yes<br />
| 2016-07-04<br />
|-<br />
| [https://github.com/ErmanSayin/Themely/tree/88e93816e3b43a40bcee25b1a7a8c71ef6a37db8 Themely]<br />
| Theme manager for Nintendo 3DS with 3dsthem.es integration.<br />
| ErmanSayin<br />
| [https://github.com/ErmanSayin/Themely/releases/tag/v1.3.1 Here]<br />
| Not anymore, 1.3.1 last FOSS version<br />
| 2017-06-28<br />
|- <br />
| [https://github.com/usagirei/3DS-Theme-Editor Usagi 3DS Theme Editor]<br />
| A simple 3DS theme editor for PC. The .NET library must be installed on your PC before you can use it.<br />
| [https://github.com/usagirei usagirei]<br />
| [https://github.com/usagirei/3DS-Theme-Editor/archive/master.zip Here]<br />
| Not sure<br />
| 2017-05-28<br />
|-<br />
| Anemone3DS<br />
| New theme and Luma splash screen manager, created to fill the gap left by its predecessors.<br />
| [[User:astronautlevel2|astronautlevel2]]<br />
| [https://github.com/astronautlevel2/Anemone3DS/releases/ Here]<br />
| Yes<br />
| 2018-05-13<br />
|}<br />
<br />
===Title managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI]<br />
| Open source CIA (un)installer and launcher.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases?after=2.0.0 Here]<br />
| Yes<br />
| 2015-12-02<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI 2]<br />
| Multipurpose file/title/ticket/save manager<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases Here]<br />
| Yes<br />
| 2018-08-21<br />
|}<br />
<br />
=== Save managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/save-data-manager-and-editor-for-firmware-up-to-9-9.396245/ save_manager]<br />
| Proof of concept save exporter/importer<br />
| [[User:profi200|profi200]]<br />
| Here<br />
| [https://gist.github.com/profi200/d0d092c11d0eb0692748 Yes]<br />
| 2015-09-13<br />
|-<br />
| [https://github.com/meladroit/svdt svdt]<br />
| Save Data Explorer/Manager<br />
| [[User:meladroit|meladroit]]<br />
| [https://github.com/meladroit/svdt/releases Here]<br />
| Yes<br />
| 2015-10-16<br />
|-<br />
| JK's Save Manager<br />
| Save/Extdata Manager<br />
| JK_<br />
| Here<br />
| [https://github.com/J-D-K/JKSM/ Yes]<br />
| 2016-09-29<br />
|-<br />
| JK's Save Manager for Rosalina<br />
| Modded version of JKSM for use as .3dsx on Luma 8+<br />
| Phalk, JK_<br />
| [https://github.com/Phalk/JKSM/releases Here]<br />
| Yes<br />
| 2017-07-12<br />
|-<br />
| [https://github.com/FlagBrew/PKSM PKSM]<br />
| Save editor for Pokémon generations 3 to 7<br />
| Bernardo Giordano<br />
| [https://github.com/FlagBrew/PKSM/releases Here]<br />
| Yes<br />
| 2020-06-13<br />
|-<br />
| [https://github.com/FlagBrew/Checkpoint Checkpoint]<br />
| Fast and simple homebrew save manager for 3DS and Switch written in C++<br />
| Bernardo Giordano<br />
| [https://github.com/FlagBrew/Checkpoint/releases Here]<br />
| Yes<br />
| 2019-12-09<br />
|-<br />
| [https://github.com/phijor/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness)<br />
| phijor<br />
| [https://github.com/phijor/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-1-22<br />
|-<br />
| [https://github.com/rboninsegna/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness and ownership)<br />
| phijor, [[User:Ryccardo|Ryccardo]]<br />
| [https://github.com/rboninsegna/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-8-13<br />
|}<br />
<br />
=== File servers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/mtheall/ftpd ftpd (ftBrony)]<br />
| An FTP server.<br />
| [https://github.com/mtheall mtheall]<br />
| [https://github.com/mtheall/ftpd/releases Here]<br />
| Yes<br />
| 2020-05-30<br />
|-<br />
| ''[https://github.com/iamevn/FTP-3DS FTP-3DS]''<br />
| Fork of ftBrony with a Nintendo theme. ''No longer under development and without repo.''<br />
| [[User:iamevn|iamevn]]<br />
| N/A<br />
| Yes (''No source officially available.'')<br />
| N/A<br />
|-<br />
| [https://github.com/FloatingStar/FTP-GMX FTP - Graphic ModifierX Edition]<br />
| Fork of ftpd with aesthetic modifications.<br />
| [[User:FloatingStar|FloatingStar]]<br />
| [https://github.com/FloatingStar/FTP-GMX/releases Here]<br />
| Yes<br />
| 2016-01-27<br />
|-<br />
| [https://github.com/smealum/ftpony ftpony]<br />
| A basic FTP server, useful for testing new homebrew versions without swapping the SD card. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/ftpony repo]<br />
| Yes<br />
| 2014-11-24<br />
|}<br />
<br />
=== Icon Packs ===<br />
Icon Packs are <code>SMDH</code> Packs for homebrew apps.<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-simplok-for-the-homebrew-launcher.396750/ Simplok]<br />
| The first 3DS Icon pack.<br />
| [[User:link6155|link6155]]<br />
| [http://1drv.ms/1EJCq2e Here]<br />
| 2015-09-12<br />
|-<br />
| ''[https://gbatemp.net/threads/1lp-icon-pack.402018/ 1LP]''<br />
| Another 3DS Icon pack. ''Repo is dead, no alternate downloads available.''<br />
| [[User:100pcrack|100pcrack]]<br />
| N/A<br />
| 2015-12-22<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Modern UI]<br />
| A simple icon pack with a flat and minimalist design.<br />
| [[User:LouchDaishiteru|LouchDaishiteru]]<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Here]<br />
| 2016-02-15<br />
|}<br />
<br />
=== Demos ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/halcy/nordlicht19 Skate Station]<br />
| A demo for the 3DS featuring music and 3D effects <br />
| SVatG<br />
| [https://aka-san.halcy.de/nordlicht2019/Skate%20Station.zip Here]<br />
| Yes<br />
| July 2019<br />
|-<br />
| cubedemo<br />
| A short demo of Homebrew on the 3DS, with working sound.<br />
| [[User:plutoo|plutoo]]<br />
| [https://mega.co.nz/#!KUQFiQYA!pv8HDEyrmuX6Eyw2hW0opL7gf9Ztmjd9J5pPsvs_rD4 Here]<br />
| No<br />
| N/A<br />
|-<br />
| [http://www.pouet.net/prod.php?which=66607 demo ou mourir]<br />
| Small demo for the 3DS with music and 2D effects<br />
| Desire<br />
| [http://mudlord.info/democrap/dsr_demooumourir.zip Here]<br />
| No<br />
| November 2015<br />
|-<br />
| The Night of Interruptions!<br />
| An independently made short film which can be watched on the Nintendo 3DS.<br />
| [[User:Chukoloco08|Chukoloco08]]<br />
| [https://archive.org/details/the-night-of-interruptions-3ds Here]<br />
| No<br />
| December 2020<br />
|-<br />
| Time for The Chickens to go to Sleep<br />
| An animated bedtime story featuring The Clucking Chickens by Sawyer Ique which can be viewed on the Nintendo 3DS.<br />
| [[User:Chukoloco08|Chukoloco08]]<br />
| [https://sawyer-ique.itch.io/sawyers-stories-time-for-the-chickens-to-go-to-sleep Here]<br />
| No<br />
| May 2021<br />
<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21560
GPU/Shader Instruction Set
2021-08-13T23:08:41Z
<p>Oreo639: Forgot to fix the register names too</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU around the same time, but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors; in that case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that a shader may contain at most 512 instructions and 128 operand descriptors.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
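<br />
As an illustration of the format 1 layout above, the following minimal sketch splits a 32-bit instruction word into its fields and packs them back. The function names <code>decode_format1</code> and <code>encode_format1</code> are made up for this example and do not come from aemstro or nihstro; only the bit offsets and sizes are taken from the table.<br />

```python
def decode_format1(word):
    """Split a 32-bit format 1 instruction word into its fields."""
    return {
        "desc":   word & 0x7F,          # offset 0x0,  7 bits
        "src2":   (word >> 7) & 0x1F,   # offset 0x7,  5 bits
        "src1":   (word >> 12) & 0x7F,  # offset 0xC,  7 bits
        "idx_1":  (word >> 19) & 0x3,   # offset 0x13, 2 bits
        "dst":    (word >> 21) & 0x1F,  # offset 0x15, 5 bits
        "opcode": (word >> 26) & 0x3F,  # offset 0x1A, 6 bits
    }


def encode_format1(opcode, dst, idx_1, src1, src2, desc):
    """Pack format 1 fields into a 32-bit word (inverse of decode_format1)."""
    return ((opcode & 0x3F) << 26 | (dst & 0x1F) << 21 | (idx_1 & 0x3) << 19
            | (src1 & 0x7F) << 12 | (src2 & 0x1F) << 7 | (desc & 0x7F))


# MUL (0x08) with DST = 0x10, SRC1 = 0x20, SRC2 = 0x01, descriptor 3
word = encode_format1(0x08, 0x10, 0, 0x20, 0x01, 3)
fields = decode_format1(word)
```

The other format 1 variants (1i, 1u, 1c) can be handled the same way by swapping in the offsets and sizes from their respective tables.<br />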
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
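<br />
Formats 5 and 5i only have a 3-bit opcode field, because DST occupies bits that hold the low opcode bits in the other formats. The sketch below (function names are hypothetical) shows one way a decoder might recognise these instructions from the top three bits before falling back to the 6-bit opcode. The values 0b111 and 0b110 follow from the opcode ranges 0x38-0x3F (MAD) and 0x30-0x37 (MADI) listed in the instruction table.<br />

```python
def classify_mad(word):
    """Return 'MAD', 'MADI' or None based on the top three bits of the word."""
    top3 = (word >> 29) & 0x7
    if top3 == 0b111:   # 6-bit opcodes 0x38-0x3F
        return "MAD"
    if top3 == 0b110:   # 6-bit opcodes 0x30-0x37
        return "MADI"
    return None


def decode_format5(word):
    """Split a 32-bit format 5 (MAD) instruction word into its fields."""
    return {
        "desc":   word & 0x1F,          # offset 0x0,  5 bits
        "src3":   (word >> 5) & 0x1F,   # offset 0x5,  5 bits
        "src2":   (word >> 10) & 0x7F,  # offset 0xA,  7 bits
        "src1":   (word >> 17) & 0x1F,  # offset 0x11, 5 bits
        "idx_2":  (word >> 22) & 0x3,   # offset 0x16, 2 bits
        "dst":    (word >> 24) & 0x1F,  # offset 0x18, 5 bits
        "opcode": (word >> 29) & 0x7,   # offset 0x1D, 3 bits
    }


# A MAD word with DST = 0x1F, SRC1 = 0x02, SRC2 = 0x20, SRC3 = 0x11, descriptor 4
mad_word = (0b111 << 29) | (0x1F << 24) | (0x02 << 17) | (0x20 << 10) | (0x11 << 5) | 4
mad_fields = decode_format5(mad_word)
```

Whatever DST contains, reading such a word with the 6-bit opcode mask of the other formats always yields a value in the 0x38-0x3F range, which is why the instruction table lists a range rather than a single opcode.<br />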
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes SRC1's first component exponent with base 2; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes SRC1's first component logarithm with base 2; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i].SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (this is still a working hypothesis and unconfirmed)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
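<br />
To make the LOOP description in the table above concrete, here is a small sketch listing the aL value seen by each iteration, given the three components of the integer constant register. The function name is hypothetical, and the sketch assumes plain integer arithmetic with no wrap-around, which the description above does not confirm.<br />

```python
def loop_al_values(int_x, int_y, int_z):
    """Return the aL value observed during each iteration of a LOOP block."""
    values = []
    a_l = int_y                    # aL is initialised to INT.y
    for _ in range(int_x + 1):     # INT.x + 1 iterations in total
        values.append(a_l)
        a_l += int_z               # aL is incremented by INT.z after each iteration
    return values


# INT = (3, 0, 2): four iterations, aL stepping by 2 from 0
al_trace = loop_al_values(3, 0, 2)
```

With INT.x = 0 the body still runs once, since the iteration count is INT.x+1.<br />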
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
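<br />
The descriptor layout above can be unpacked as in the following sketch. The function name and the dictionary keys are chosen for this example only; the bit positions come from the table, including the reversed bit order of the destination mask (bit 3 = x, bit 0 = w).<br />

```python
def decode_descriptor(desc):
    """Unpack an operand descriptor word into mask, negation bits and selectors."""
    mask_bits = desc & 0xF
    # Bit 3 selects x, bit 2 selects y, bit 1 selects z, bit 0 selects w.
    mask = "".join(name for bit, name in zip((3, 2, 1, 0), "xyzw")
                   if (mask_bits >> bit) & 1)
    return {
        "dst_mask": mask,
        "src1_neg": bool((desc >> 4) & 1),   # offset 0x4
        "src1_sel": (desc >> 5) & 0xFF,      # offset 0x5,  8 bits
        "src2_neg": bool((desc >> 13) & 1),  # offset 0xD
        "src2_sel": (desc >> 14) & 0xFF,     # offset 0xE,  8 bits
        "src3_neg": bool((desc >> 22) & 1),  # offset 0x16
        "src3_sel": (desc >> 23) & 0xFF,     # offset 0x17, 8 bits
    }


# Mask .xy, negated SRC1 with selector 0x1B, SRC2 with selector 0x55
example_desc = 0xC | (1 << 4) | (0x1B << 5) | (0x55 << 14)
decoded = decode_descriptor(example_desc)
```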
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
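<br />
The two selector values mentioned above can be checked with a short sketch. The helper names are illustrative only; the sketch relies solely on the selector layout given in the tables, where component 0 sits in the top two bits.<br />

```python
def selector_to_swizzle(sel):
    """Turn an 8-bit component selector into a swizzle string such as 'xyzw'."""
    # Component 0 is stored at bits 6-7, component 3 at bits 0-1.
    return "".join("xyzw"[(sel >> shift) & 0x3] for shift in (6, 4, 2, 0))


def apply_swizzle(vec, sel):
    """Return the 4-component vector obtained by swizzling vec with sel."""
    return [vec[(sel >> shift) & 0x3] for shift in (6, 4, 2, 0)]


identity = selector_to_swizzle(0x1B)
broadcast_y = selector_to_swizzle(0x55)
swizzled = apply_swizzle([1.0, 2.0, 3.0, 4.0], 0x55)
```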
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then SRC1+a0.y = c11 will be used for the instruction instead. Address registers can only be applied to constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction by rounding a float value to integer precision. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
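<br />
The addressing rule described in this section can be sketched as follows. The function name and argument order are made up for this example, and what happens when the sum leaves the c0-c95 range is not modelled because it is not documented here.<br />

```python
def resolve_src1(src1, idx, a0_x, a0_y, a_l):
    """Return the effective raw SRC1 value after relative addressing."""
    offset = (0, a0_x, a0_y, a_l)[idx]   # IDX 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if src1 < 0x20:
        # Input attribute and temporary registers ignore the address register.
        return src1
    return src1 + offset


# SRC1 = c8 (raw 0x28), IDX = 2, a0.y = 3  ->  c11 (raw 0x2B)
effective = resolve_src1(0x28, 2, 0, 3, 0)
```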
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
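<br />
A minimal model of CMP using the operator table above is shown below. It uses ordinary Python floats rather than the GPU's own float format, so edge cases such as subnormals or NaN are not reproduced. The names <code>compare</code> and <code>cmp_instruction</code> are illustrative.<br />

```python
import operator

# Raw CMPX/CMPY values 0-5; values 6 and 7 are handled separately below.
CMP_OPS = {
    0: operator.eq,  # EQ
    1: operator.ne,  # NE
    2: operator.lt,  # LT
    3: operator.le,  # LE
    4: operator.gt,  # GT
    5: operator.ge,  # GE
}


def compare(op, a, b):
    """Evaluate one comparison operator; 6 and 7 are treated as always true."""
    func = CMP_OPS.get(op)
    return True if func is None else func(a, b)


def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of both operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])


# x: 1.0 < 3.0 (LT), y: 2.0 >= 2.0 (GE)
flags = cmp_instruction(2, 5, [1.0, 2.0, 0.0, 0.0], [3.0, 2.0, 0.0, 0.0])
```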
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
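<br />
Combining the format 2 layout with the table above, a conditional instruction could be evaluated roughly as in this sketch. The helper names are hypothetical; the bit positions for CONDOP, REFY and REFX are those of the format 2 table.<br />

```python
def decode_condition(word):
    """Extract CONDOP, REFY and REFX from a format 2 instruction word."""
    return {
        "condop": (word >> 22) & 0x3,  # offset 0x16, 2 bits
        "refy":   (word >> 24) & 0x1,  # offset 0x18, 1 bit
        "refx":   (word >> 25) & 0x1,  # offset 0x19, 1 bit
    }


def condition_holds(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate the conditional expression selected by CONDOP."""
    x_ok = bool(cmp_x) == bool(refx)
    y_ok = bool(cmp_y) == bool(refy)
    return (x_ok or y_ok,    # 0x0: OR
            x_ok and y_ok,   # 0x1: AND
            x_ok,            # 0x2: X
            y_ok)[condop]    # 0x3: Y


# IFC (0x28) with CONDOP = AND, REFX = 1, REFY = 0
ifc_word = (0x28 << 26) | (1 << 25) | (0 << 24) | (1 << 22)
cond = decode_condition(ifc_word)
taken = condition_holds(cond["condop"], cond["refx"], cond["refy"], True, False)
```

Because REFX and REFY are compared for equality, setting a reference bit to 0 tests for the corresponding cmp boolean being false.<br />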
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to the same output register/component more than once appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| o0-o15<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| v0-v15<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
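<br />
The two mapping tables above translate directly into a pair of lookup helpers, sketched below for use in a disassembler. The function names are illustrative; the ranges are exactly those of the tables.<br />

```python
def src_name(raw):
    """Map a 7-bit raw SRC value to its register name."""
    if not 0 <= raw <= 0x7F:
        raise ValueError("SRC raw value out of range")
    if raw < 0x10:
        return f"v{raw}"          # 0x00-0x0F: input attribute registers
    if raw < 0x20:
        return f"r{raw - 0x10}"   # 0x10-0x1F: temporary registers
    return f"c{raw - 0x20}"       # 0x20-0x7F: constant registers


def dst_name(raw):
    """Map a 5-bit raw DST value to its register name."""
    if not 0 <= raw <= 0x1F:
        raise ValueError("DST raw value out of range")
    return f"o{raw}" if raw < 0x10 else f"r{raw - 0x10}"


names = [src_name(0x00), src_name(0x1F), src_name(0x28), src_name(0x7F)]
```

A 5-bit SRC field can only reach 0x00-0x1F, which is why such operands cannot name a floating-point constant register.<br />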
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
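<br />
The count of 63 can be reproduced with a small model. The sketch assumes an exponent bias of 63, which makes 2^-62 the smallest positive normal value; that bias is an assumption of this example and is not stated above. Ordinary Python floats stand in for the GPU format, which is sufficient here because only powers of two are involved.<br />

```python
# Smallest positive normal value, ASSUMING a 7-bit exponent with bias 63.
MIN_NORMAL = 2.0 ** -62


def flush(value):
    """Flush results below the smallest normal to zero (output subnormal flushing)."""
    return 0.0 if abs(value) < MIN_NORMAL else value


def halvings_until_zero():
    """Count how often 1.0 can be multiplied by 0.5 before the result is zero."""
    value, count = 1.0, 0
    while value != 0.0:
        value = flush(value * 0.5)
        count += 1
    return count


halvings = halvings_until_zero()
```

Under these assumptions the 62nd multiplication yields the smallest normal value and the 63rd yields a subnormal that is flushed to zero.<br />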
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using the following independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21559
GPU/Shader Instruction Set
2021-08-13T23:05:52Z
<p>Oreo639: Fix SRC/DST mapping</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU around the same time, but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors; in that case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
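<br />
As a minimal sketch of the format 1 layout above, the following Python helper splits a 32-bit instruction word into its fields. The function names and the hand-assembled example word are hypothetical and chosen only for illustration; this is not taken from aemstro or nihstro.<br />

```python
def word_from_bytes(raw):
    """Read one little-endian 32-bit instruction word from 4 bytes."""
    return int.from_bytes(raw, "little")

def decode_format1(word):
    """Split a 32-bit format 1 instruction word into its fields."""
    return {
        "desc":   word & 0x7F,             # bits 0-6
        "src2":   (word >> 0x7) & 0x1F,    # bits 7-11
        "src1":   (word >> 0xC) & 0x7F,    # bits 12-18
        "idx1":   (word >> 0x13) & 0x3,    # bits 19-20
        "dst":    (word >> 0x15) & 0x1F,   # bits 21-25
        "opcode": (word >> 0x1A) & 0x3F,   # bits 26-31
    }

# Hand-assembled example word (field values are illustrative only).
example = (0x08 << 0x1A) | (0x10 << 0x15) | (0x20 << 0xC) | (0x01 << 0x7) | 0x02
fields = decode_format1(word_from_bytes(example.to_bytes(4, "little")))
print(fields)
```

The same shift-and-mask pattern applies to the other formats; only the offsets and sizes change.<br />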
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
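<br />
The format 2 layout can be decoded the same way. This is only a sketch: the function name is hypothetical, and the example word is assembled by hand using the JMPC opcode (0x2C) from the instruction table.<br />

```python
def decode_format2(word):
    """Split a 32-bit format 2 (flow control) word into its fields."""
    return {
        "num":    word & 0xFF,              # bits 0-7
        "dst":    (word >> 0xA) & 0xFFF,    # bits 10-21
        "condop": (word >> 0x16) & 0x3,     # bits 22-23
        "refy":   (word >> 0x18) & 0x1,     # bit 24
        "refx":   (word >> 0x19) & 0x1,     # bit 25
        "opcode": (word >> 0x1A) & 0x3F,    # bits 26-31
    }

# Illustrative word: JMPC to offset 0x40, CONDOP = 2 (test X only), REFX = 1.
example = (0x2C << 0x1A) | (1 << 0x19) | (2 << 0x16) | (0x40 << 0xA)
fields = decode_format2(example)
print(fields)
```

Bits 8-9 are not described by the table and are ignored here.<br />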
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
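<br />
Formats 5 and 5i only have a 3-bit opcode field. Given the opcode ranges listed in the instruction table (0x30-0x37 for MADI, 0x38-0x3F for MAD), the top three bits are 0b110 for MADI and 0b111 for MAD. The sketch below uses that to pick the layout; the function name and example values are hypothetical.<br />

```python
def decode_mad(word):
    """Decode a MAD (format 5) or MADI (format 5i) instruction word."""
    top = (word >> 0x1D) & 0x7
    fields = {
        "desc": word & 0x1F,             # bits 0-4
        "src1": (word >> 0x11) & 0x1F,   # bits 17-21
        "idx":  (word >> 0x16) & 0x3,    # bits 22-23 (IDX_2 or IDX_3)
        "dst":  (word >> 0x18) & 0x1F,   # bits 24-28
    }
    if top == 0x7:    # MAD: 5-bit SRC3, 7-bit SRC2
        fields.update(name="MAD", src3=(word >> 0x5) & 0x1F, src2=(word >> 0xA) & 0x7F)
    elif top == 0x6:  # MADI: 7-bit SRC3, 5-bit SRC2
        fields.update(name="MADI", src3=(word >> 0x5) & 0x7F, src2=(word >> 0xC) & 0x1F)
    else:
        raise ValueError("not a MAD/MADI instruction word")
    return fields

mad = decode_mad((0x7 << 0x1D) | (0x10 << 0x18) | (1 << 0x16) | (0x11 << 0x11) | (0x25 << 0xA) | (0x12 << 0x5) | 0x03)
madi = decode_mad((0x6 << 0x1D) | (0x10 << 0x18) | (2 << 0x16) | (0x11 << 0x11) | (0x12 << 0xC) | (0x25 << 0x5) | 0x03)
print(mad, madi)
```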
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes SRC1's first component exponent with base 2; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes SRC1's first component logarithm with base 2; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (this is still a working hypothesis and unconfirmed)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
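<br />
To make the three dot-product variants in the table concrete, here is a reference sketch of DP3, DP4 and DPH on already-swizzled 4-component inputs. It uses ordinary Python floats, so it does not model the PICA200's precision or its special cases (for example inf * 0 = 0); the function names are hypothetical.<br />

```python
def dp3(src1, src2):
    """Dot product over the first three components."""
    return sum(a * b for a, b in zip(src1[:3], src2[:3]))

def dp4(src1, src2):
    """Dot product over all four components."""
    return sum(a * b for a, b in zip(src1[:4], src2[:4]))

def dph(src1, src2):
    """Dot product with SRC1 treated as (x, y, z, 1.0); SRC1's own w is ignored."""
    return dp3(src1, src2) + 1.0 * src2[3]

print(dp3([1.0, 2.0, 3.0, 0.0], [4.0, 5.0, 6.0, 0.0]))
print(dp4([1.0, 2.0, 3.0, 4.0], [4.0, 5.0, 6.0, 7.0]))
print(dph([1.0, 2.0, 3.0, 99.0], [4.0, 5.0, 6.0, 7.0]))
```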
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
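<br />
A minimal sketch of unpacking an operand descriptor according to the table above. The function name and the example descriptor value are hypothetical; the component selectors are returned as raw 8-bit values.<br />

```python
def decode_operand_descriptor(desc):
    """Split an operand descriptor word into mask, negation bits and selectors."""
    mask_bits = desc & 0xF
    # Bit 3 = x, bit 2 = y, bit 1 = z, bit 0 = w.
    dst_mask = "".join(c for bit, c in zip((3, 2, 1, 0), "xyzw") if (mask_bits >> bit) & 1)
    return {
        "dst_mask": dst_mask,
        "src1_neg": bool((desc >> 0x4) & 1),
        "src1_sel": (desc >> 0x5) & 0xFF,
        "src2_neg": bool((desc >> 0xD) & 1),
        "src2_sel": (desc >> 0xE) & 0xFF,
        "src3_neg": bool((desc >> 0x16) & 1),
        "src3_sel": (desc >> 0x17) & 0xFF,
    }

# Illustrative descriptor: write .xy, negate SRC1 and SRC3.
example = 0xC | (1 << 0x4) | (0x1B << 0x5) | (0x55 << 0xE) | (1 << 0x16) | (0x1B << 0x17)
decoded = decode_operand_descriptor(example)
print(decoded)
```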
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
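<br />
The two selector examples above can be checked with a short sketch. Component 0 sits in the top two bits of the selector, so the helpers walk from bit 6 down to bit 0. The function names are hypothetical.<br />

```python
def selector_to_swizzle(sel):
    """Turn an 8-bit component selector into a swizzle string such as 'xyzw'."""
    return "".join("xyzw"[(sel >> (6 - 2 * i)) & 0x3] for i in range(4))

def apply_swizzle(sel, vec):
    """Return the swizzled copy of a 4-component vector."""
    return [vec[(sel >> (6 - 2 * i)) & 0x3] for i in range(4)]

print(selector_to_swizzle(0x1B))
print(selector_to_swizzle(0x55))
print(apply_swizzle(0x55, [1.0, 2.0, 3.0, 4.0]))
```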
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then instead SRC1+a0.y = c11 will be used for the instruction. It is only possible to use address registers on constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
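<br />
The c8 + a0.y example can be sketched as follows, using the raw SRC encoding in which constant registers start at 0x20. The function name is hypothetical, and what happens when the sum leaves the constant range is not documented here, so the sketch does not model it.<br />

```python
def effective_src(src_raw, idx, a0x=0, a0y=0, aL=0):
    """Apply relative addressing to a raw SRC register value."""
    offset = (0, a0x, a0y, aL)[idx]   # IDX 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if src_raw < 0x20:
        # Input attribute or temporary register: the address register is ignored.
        return src_raw
    return src_raw + offset           # out-of-range results are not modeled

c8 = 0x20 + 8
print(hex(effective_src(c8, 2, a0y=3)))    # raw value of c11
print(hex(effective_src(0x10, 2, a0y=3)))  # r0 stays r0
```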
<br />
a0.x and a0.y are set manually through the MOVA instruction, which casts a float value to an integer (truncating the fractional part, per the MOVA description above). Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
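<br />
A sketch of how CMP could be emulated from the operator table, assuming (as observed above) that raw values 6 and 7 always yield true. The function names are hypothetical, and ordinary Python float comparison is used, which may differ from hardware for special values.<br />

```python
import operator

CMP_OPS = {
    0x0: operator.eq, 0x1: operator.ne, 0x2: operator.lt,
    0x3: operator.le, 0x4: operator.gt, 0x5: operator.ge,
}

def compare(op, a, b):
    """Evaluate one comparison operator; 6 and 7 are treated as always true."""
    fn = CMP_OPS.get(op)
    return True if fn is None else fn(a, b)

def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of the operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])

print(cmp_instruction(0x2, 0x5, [1.0, 2.0], [2.0, 2.0]))
```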
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
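<br />
The four CONDOP expressions can be written as one small function. This is a sketch with a hypothetical name; cmp_x and cmp_y stand for the boolean registers set by CMP.<br />

```python
def eval_condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate a format 2 condition from CONDOP, REFX, REFY and the cmp registers."""
    x_ok = bool(cmp_x) == bool(refx)
    y_ok = bool(cmp_y) == bool(refy)
    return (
        x_ok or y_ok,    # 0x0: OR
        x_ok and y_ok,   # 0x1: AND
        x_ok,            # 0x2: X
        y_ok,            # 0x3: Y
    )[condop]

print(eval_condition(0x1, 1, 0, cmp_x=True, cmp_y=False))
```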
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to the same output register/component more than once appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| o0-o15<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| v0-v15<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
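<br />
A small sketch of turning raw DST and SRC values into register names. It assumes that each raw value in a range names the register with the matching index (for example raw SRC 0x03 is v3); the function names are hypothetical.<br />

```python
def src_register_name(raw):
    """Name the register selected by a raw 7-bit SRC value."""
    if not 0 <= raw <= 0x7F:
        raise ValueError("SRC raw value out of range")
    if raw < 0x10:
        return f"v{raw}"          # input attribute (assumed one index per raw value)
    if raw < 0x20:
        return f"r{raw - 0x10}"   # temporary
    return f"c{raw - 0x20}"       # float constant

def dst_register_name(raw):
    """Name the register selected by a raw 5-bit DST value."""
    if not 0 <= raw <= 0x1F:
        raise ValueError("DST raw value out of range")
    return f"o{raw}" if raw < 0x10 else f"r{raw - 0x10}"

print(src_register_name(0x28), dst_register_name(0x12))
```

A 5-bit SRC field can only reach 0x00-0x1F, which is why those operands cannot name constant registers.<br />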
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
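<br />
The count of 63 halvings can be reproduced with a toy model of the 24-bit format. The model rests on assumptions not confirmed on this page: an IEEE-style bias of 63, the sign in the top bit with the exponent above the mantissa, biased exponent 0 meaning zero or a flushed subnormal, and an all-ones exponent meaning inf/NaN. The function names are hypothetical.<br />

```python
import math

EXP_BITS, MAN_BITS = 7, 16
BIAS = (1 << (EXP_BITS - 1)) - 1          # 63 (assumed IEEE-style bias)

def f24_to_float(bits):
    """Decode a 24-bit value under the assumed layout, flushing subnormals to +0."""
    sign = -1.0 if (bits >> (EXP_BITS + MAN_BITS)) & 1 else 1.0
    exp = (bits >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    man = bits & ((1 << MAN_BITS) - 1)
    if exp == 0:
        return 0.0                         # zero, or a subnormal flushed to +0
    if exp == (1 << EXP_BITS) - 1:
        return math.nan if man else sign * math.inf
    return sign * math.ldexp(1.0 + man / (1 << MAN_BITS), exp - BIAS)

def halvings_until_zero():
    """Count how often 1.0 can be halved before the model reads zero."""
    bits = BIAS << MAN_BITS                # encoding of 1.0
    count = 0
    while f24_to_float(bits) != 0.0:
        bits -= 1 << MAN_BITS              # halving a normal value decrements the exponent
        count += 1
    return count

print(halvings_until_zero())
```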
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
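<br />
The priority rule at the end of the paragraph above can be sketched as a simple selection. This models only the final choice between candidate PCs, not the stack checks that produce them; the function and parameter names are hypothetical, and None means that source produced no update.<br />

```python
def next_pc(pc, jmp_target=None, call_pc=None, if_pc=None, loop_pc=None):
    """Pick the next PC: LOOP beats IF beats CALL beats JMP beats the incremented PC."""
    for candidate in (loop_pc, if_pc, call_pc, jmp_target):
        if candidate is not None:
            return candidate
    return pc + 1   # no stack popped and no jump taken

print(next_pc(10))
print(next_pc(10, jmp_target=40, if_pc=20))
```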
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21558
GPU/Shader Instruction Set
2021-08-13T00:55:30Z
<p>Oreo639: </p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary is comprised of two parts : the main instruction sequence and the operand descriptor table. These are both sent to the GPU around the same time but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instruction) may reference operand descriptors. When such is the case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
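<br />
Going the other way, a format 1i word can be assembled from its fields. The sketch below range-checks each field against its bit width; the function name is hypothetical, and the example uses the SGEI opcode (0x1A) from the instruction table.<br />

```python
def encode_format1i(opcode, dst, src1, src2, desc, idx2=0):
    """Pack format 1i fields into a 32-bit instruction word."""
    fields = (
        (opcode, 0x1A, 6), (dst, 0x15, 5), (idx2, 0x13, 2),
        (src1, 0xE, 5), (src2, 0x7, 7), (desc, 0x0, 7),
    )
    word = 0
    for value, offset, size in fields:
        if not 0 <= value < (1 << size):
            raise ValueError(f"value {value:#x} does not fit in {size} bits")
        word |= value << offset
    return word

# SGEI with DST = r0 (0x10), SRC1 = r1 (0x11), SRC2 = c5 (0x25), DESC = 2.
word = encode_format1i(0x1A, 0x10, 0x11, 0x25, 0x02)
print(word.to_bytes(4, "little").hex())
```

Because SRC1 is only 5 bits wide in this format, passing a constant register (0x20 or above) as SRC1 is rejected.<br />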
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
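<br />
Format 1c shortens the opcode field to 5 bits to make room for the two comparison operators. A decoding sketch follows; the function name is hypothetical, and the 5-bit value 0x17 in the example corresponds to the 6-bit opcodes 0x2E-0x2F listed for CMP.<br />

```python
def decode_format1c(word):
    """Split a 32-bit format 1c (comparison) word into its fields."""
    return {
        "desc":   word & 0x7F,             # bits 0-6
        "src2":   (word >> 0x7) & 0x1F,    # bits 7-11
        "src1":   (word >> 0xC) & 0x7F,    # bits 12-18
        "idx1":   (word >> 0x13) & 0x3,    # bits 19-20
        "cmpy":   (word >> 0x15) & 0x7,    # bits 21-23
        "cmpx":   (word >> 0x18) & 0x7,    # bits 24-26
        "opcode": (word >> 0x1B) & 0x1F,   # bits 27-31
    }

# Illustrative CMP word: CMPX = 2 (LT), CMPY = 5 (GE), SRC1 = 0x20, SRC2 = 0x10.
example = (0x17 << 0x1B) | (2 << 0x18) | (5 << 0x15) | (0x20 << 0xC) | (0x10 << 0x7) | 0x01
fields = decode_format1c(example)
print(fields)
```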
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
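<br />
A decoding sketch for format 3, where a 4-bit constant ID replaces the condition bits of format 2. The function name is hypothetical; the example is built by hand with the IFU opcode (0x27) from the instruction table.<br />

```python
def decode_format3(word):
    """Split a 32-bit format 3 word into its fields."""
    return {
        "num":      word & 0xFF,             # bits 0-7
        "dst":      (word >> 0xA) & 0xFFF,   # bits 10-21
        "const_id": (word >> 0x16) & 0xF,    # bits 22-25 (BOOL or INT index)
        "opcode":   (word >> 0x1A) & 0x3F,   # bits 26-31
    }

# Illustrative IFU word testing b3, with DST = 0x20 and NUM = 4.
example = (0x27 << 0x1A) | (3 << 0x16) | (0x20 << 0xA) | 4
fields = decode_format3(example)
print(fields)
```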
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
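<br />
Format 4 only carries the three SETEMIT parameters. A decoding sketch, with a hypothetical function name and a hand-built example using the SETEMIT opcode (0x2B) from the instruction table:<br />

```python
def decode_format4(word):
    """Split a 32-bit format 4 (SETEMIT) word into its fields."""
    return {
        "flag_winding":  (word >> 0x16) & 0x1,   # bit 22
        "flag_primemit": (word >> 0x17) & 0x1,   # bit 23
        "vtxid":         (word >> 0x18) & 0x3,   # bits 24-25
        "opcode":        (word >> 0x1A) & 0x3F,  # bits 26-31
    }

# Illustrative SETEMIT word: vertex 2, emit a primitive, default winding.
example = (0x2B << 0x1A) | (2 << 0x18) | (1 << 0x17)
fields = decode_format4(example)
print(fields)
```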
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
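<br />
Formats 5 and 5i differ only in which of SRC2 and SRC3 receives the 7-bit field. The following Python sketch (function name hypothetical) decodes both. It relies on the opcode ranges from the instruction table: 0x30-0x37 (MADI) and 0x38-0x3F (MAD) correspond to the values 6 and 7 of the 3-bit opcode field.<br />
<br />
```python
def decode_mad(word):
    """Decode a format 5 (MAD) or format 5i (MADI) instruction word."""
    opcode3 = (word >> 0x1D) & 0x7
    fields = {
        "desc": word & 0x1F,            # offset 0x0,  5 bits
        "src1": (word >> 0x11) & 0x1F,  # offset 0x11, 5 bits
        "idx":  (word >> 0x16) & 0x3,   # IDX_2 for MAD, IDX_3 for MADI
        "dst":  (word >> 0x18) & 0x1F,  # offset 0x18, 5 bits
    }
    if opcode3 == 0x7:    # 6-bit opcodes 0x38-0x3F
        fields.update(name="MAD",
                      src3=(word >> 0x5) & 0x1F,   # 5 bits
                      src2=(word >> 0xA) & 0x7F)   # 7 bits
    elif opcode3 == 0x6:  # 6-bit opcodes 0x30-0x37
        fields.update(name="MADI",
                      src3=(word >> 0x5) & 0x7F,   # 7 bits
                      src2=(word >> 0xC) & 0x1F)   # 5 bits
    else:
        raise ValueError("not a MAD/MADI instruction word")
    return fields


# Hypothetical words: DST=0x10, SRC1=0x11, DESC=1
mad  = decode_mad((7 << 0x1D) | (0x10 << 0x18) | (0x11 << 0x11) | (0x20 << 0xA) | (0x12 << 0x5) | 1)
madi = decode_mad((6 << 0x1D) | (0x10 << 0x18) | (0x11 << 0x11) | (0x12 << 0xC) | (0x20 << 0x5) | 1)
```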
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes 2 raised to the power of SRC1's first component; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes the base-2 logarithm of SRC1's first component; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and a primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (This is still a working hypothesis and is unconfirmed.)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
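<br />
The LOOP description above can be sketched in Python as follows. This models only the iteration count and the aL sequence; the function name is hypothetical, and any wrap-around of aL on real hardware is not modelled.<br />
<br />
```python
def loop_al_values(int_x, int_y, int_z):
    """Return the value of aL observed by each iteration of a LOOP block."""
    values = []
    al = int_y                     # aL is initialized to INT.y
    for _ in range(int_x + 1):     # INT.x + 1 iterations in total
        values.append(al)
        al += int_z                # after each iteration, aL += INT.z
    return values


al_sequence = loop_al_values(3, 0, 2)
```
<br />
With INT = (3, 0, 2) the loop body runs four times, with aL taking the values 0, 2, 4 and 6.<br />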
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
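<br />
The descriptor layout and the swizzle examples above can be checked with a short Python sketch. The function names are hypothetical; the bit positions are taken from the tables in this section.<br />
<br />
```python
COMPONENTS = "xyzw"


def decode_selector(sel):
    """Turn an 8-bit component selector into a swizzle string such as 'xyzw'."""
    # Component 0 sits in the top two bits (offset 0x6), component 3 in the bottom two.
    return "".join(COMPONENTS[(sel >> shift) & 0x3] for shift in (6, 4, 2, 0))


def decode_dest_mask(mask):
    """Turn a 4-bit destination mask into a string such as 'xy'."""
    # Bit 3 = x, bit 2 = y, bit 1 = z, bit 0 = w.
    return "".join(c for bit, c in zip((3, 2, 1, 0), COMPONENTS) if mask & (1 << bit))


def decode_descriptor(desc):
    """Split an operand descriptor word into mask, negation bits and swizzles."""
    return {
        "dest_mask":    decode_dest_mask(desc & 0xF),
        "src1_neg":     bool((desc >> 0x4) & 1),
        "src1_swizzle": decode_selector((desc >> 0x5) & 0xFF),
        "src2_neg":     bool((desc >> 0xD) & 1),
        "src2_swizzle": decode_selector((desc >> 0xE) & 0xFF),
        "src3_neg":     bool((desc >> 0x16) & 1),
        "src3_swizzle": decode_selector((desc >> 0x17) & 0xFF),
    }


# Hypothetical descriptor: mask .xy, SRC1 = .xyzw, SRC2 = -(.yyyy)
example = decode_descriptor(0xC | (0x1B << 0x5) | (1 << 0xD) | (0x55 << 0xE))
```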
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then SRC1+a0.y = c11 will be used for the instruction instead. Address registers can only be applied to constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction by converting a float value to an integer. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
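<br />
A minimal Python sketch of the addressing rule described in this section, using the raw SRC encoding from the Registers section (constant registers start at raw value 0x20). The function name is hypothetical, and out-of-range results are not modelled.<br />
<br />
```python
def effective_src1(src1, idx, a0x, a0y, aL):
    """Return the raw SRC1 value actually used once relative addressing is applied."""
    offset = (0, a0x, a0y, aL)[idx]   # IDX: 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if src1 < 0x20:
        # Input attribute and temporary registers: the address register is ignored.
        return src1
    return src1 + offset


# The example from the text: IDX = 2, a0.y = 3, SRC1 = c8 (raw 0x28) gives c11 (raw 0x2B)
resolved = effective_src1(0x28, 2, a0x=0, a0y=3, aL=0)
```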
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
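<br />
The operator table can be expressed as a small lookup in Python. This is only a sketch: the names are hypothetical, it uses host floating-point comparison rather than PICA200 behaviour, and it treats values 6 and 7 as always true as observed above.<br />
<br />
```python
import operator

CMP_OPS = {
    0x0: operator.eq,  # EQ
    0x1: operator.ne,  # NE
    0x2: operator.lt,  # LT
    0x3: operator.le,  # LE
    0x4: operator.gt,  # GT
    0x5: operator.ge,  # GE
}


def compare(op, src1, src2):
    """Evaluate one CMPX/CMPY operator on a pair of scalars."""
    if op in (0x6, 0x7):
        return True  # seem to always return true
    return CMP_OPS[op](src1, src2)


def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of two operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])


flags = cmp_instruction(0x2, 0x5, (1.0, 2.0), (3.0, 2.0))
```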
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
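<br />
The four CONDOP expressions translate directly into code. A minimal Python sketch, with a hypothetical function name, taking cmp.x and cmp.y as booleans and REFX and REFY as the raw reference bits:<br />
<br />
```python
def evaluate_condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate a format 2 condition from CONDOP, the reference bits and cmp.x/cmp.y."""
    x_ok = cmp_x == bool(refx)
    y_ok = cmp_y == bool(refy)
    if condop == 0x0:
        return x_ok or y_ok    # OR
    if condop == 0x1:
        return x_ok and y_ok   # AND
    if condop == 0x2:
        return x_ok            # X
    if condop == 0x3:
        return y_ok            # Y
    raise ValueError("CONDOP is a 2-bit field")


results = [evaluate_condition(op, 1, 1, True, False) for op in range(4)]
```
<br />
Setting a reference bit to 0 therefore tests for the corresponding cmp flag being false.<br />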
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to the same output register/component more than once appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remains constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x6<br />
| o0-o6<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC1 raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x7<br />
| v0-v7<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
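<br />
The two mapping tables can be turned into lookup helpers. In this Python sketch (names hypothetical), raw values that the tables do not list return None rather than a guessed register.<br />
<br />
```python
def src_register_name(raw):
    """Map a raw 7-bit SRC value to a register name, following the SRC mapping table."""
    if 0x00 <= raw <= 0x07:
        return f"v{raw}"            # input attribute registers
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"     # temporary registers
    if 0x20 <= raw <= 0x7F:
        return f"c{raw - 0x20}"     # constant registers
    return None                     # not listed in the table


def dst_register_name(raw):
    """Map a raw 5-bit DST value to a register name, following the DST mapping table."""
    if 0x00 <= raw <= 0x06:
        return f"o{raw}"            # output registers
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"     # temporary registers
    return None                     # not listed in the table


names = [src_register_name(0x28), src_register_name(0x10), dst_register_name(0x06)]
```
<br />
A 5-bit source field can only reach raw values 0x00-0x1F, which is why it cannot address constant registers.<br />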
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
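<br />
The figure of 63 can be reproduced with a small Python model of the 24-bit format. Several details here are assumptions that this page does not state: an exponent bias of 63, the sign bit in the top position, and an all-ones exponent encoding inf/NaN. Treat it as a sketch rather than a description of the hardware.<br />
<br />
```python
ASSUMED_BIAS = 63  # assumption: bias of 2**(7-1) - 1, not confirmed by this page


def float24_to_float(raw):
    """Convert a 24-bit value (1 sign, 7 exponent, 16 mantissa bits) to a Python float."""
    sign = -1.0 if (raw >> 23) & 1 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        return 0.0  # zero and subnormals: flushed to +0, there is no -0
    if exponent == 0x7F:
        # assumption: all-ones exponent encodes inf (mantissa 0) or NaN
        return sign * float("inf") if mantissa == 0 else float("nan")
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - ASSUMED_BIAS)


def halvings_until_zero():
    """Count multiplications of 1.0 by 0.5 until the result is flushed to zero."""
    exponent = ASSUMED_BIAS  # 1.0 has a biased exponent equal to the bias
    count = 0
    while exponent != 0:
        exponent -= 1        # multiplying by 0.5 decrements the exponent
        count += 1           # a biased exponent of 0 is subnormal, hence flushed to 0
    return count


halvings = halvings_until_zero()
```
<br />
Under these assumptions the 62nd halving yields the smallest positive normal and the 63rd is flushed to zero, matching the hardware observation.<br />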
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21557
GPU/Shader Instruction Set
2021-08-13T00:52:32Z
<p>Oreo639: </p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary is comprised of two parts : the main instruction sequence and the operand descriptor table. These are both sent to the GPU around the same time but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. When such is the case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
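<br />
To make the format 1 layout concrete, here is a minimal Python encoder sketch. The function name and the sample operands are hypothetical; the raw register values follow the DST and SRC mapping tables in the Registers section (r0 = 0x10, r1 = 0x11, c8 = 0x28).<br />
<br />
```python
def encode_format1(opcode, dst, src1, src2, desc, idx_1=0):
    """Pack format 1 fields into a 32-bit instruction word."""
    fields = (
        (opcode, 6, 0x1A),
        (dst,    5, 0x15),
        (idx_1,  2, 0x13),
        (src1,   7, 0xC),
        (src2,   5, 0x7),
        (desc,   7, 0x0),
    )
    word = 0
    for value, bits, offset in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value:#x} does not fit in {bits} bits")
        word |= value << offset
    return word


# ADD r0, c8, r1 using operand descriptor 0
add_word = encode_format1(0x00, dst=0x10, src1=0x28, src2=0x11, desc=0)
```
<br />
Passing a constant register as SRC2 (for example 0x28) raises ValueError, because SRC2 only has 5 bits in this format.<br />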
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
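<br />
The difference between formats 1 and 1i shows up only in how the two source fields are extracted. A Python sketch, with a hypothetical function name, that selects the layout from the opcode (0x18-0x1B are the format 1i opcodes in the instruction table); the non-inverted branch is only meaningful for format 1 opcodes:<br />
<br />
```python
INVERTED_OPCODES = {0x18: "DPHI", 0x19: "DSTI", 0x1A: "SGEI", 0x1B: "SLTI"}


def decode_sources(word):
    """Return (opcode, src1, src2) for a format 1 or format 1i instruction word."""
    opcode = (word >> 0x1A) & 0x3F
    if opcode in INVERTED_OPCODES:
        src2 = (word >> 0x7) & 0x7F   # 7 bits: may address constant registers
        src1 = (word >> 0xE) & 0x1F   # 5 bits
    else:
        src2 = (word >> 0x7) & 0x1F   # 5 bits
        src1 = (word >> 0xC) & 0x7F   # 7 bits: may address constant registers
    return opcode, src1, src2


# Hypothetical SGEI word: DST = 0x10, SRC1 = 0x11, SRC2 = 0x28, DESC = 0
sgei = decode_sources((0x1A << 0x1A) | (0x10 << 0x15) | (0x11 << 0xE) | (0x28 << 0x7))
```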
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes 2 raised to the power of SRC1's first component; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes the base-2 logarithm of SRC1's first component; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (This is still a working hypothesis and unconfirmed.)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i].SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
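<br />
The LOOP entry in the table above can be illustrated with a minimal sketch. The helper name <code>loop_al_values</code> is hypothetical, and the sketch assumes the INT components are small non-negative integers; it does not model any wrap-around of aL.<br />

```python
def loop_al_values(int_x, int_y, int_z):
    """Return the value aL holds during each iteration of a LOOP block.

    Follows the LOOP description: INT.x + 1 iterations in total,
    aL starts at INT.y and is incremented by INT.z after each iteration.
    """
    values = []
    al = int_y
    for _ in range(int_x + 1):
        values.append(al)
        al += int_z
    return values


print(loop_al_values(3, 10, 2))
```

With INT = (3, 10, 2) the block body would run four times, with aL stepping by 2 from 10.<br />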
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
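<br />
As a rough illustration of the layout above, the following sketch splits an operand descriptor word into its fields. The function names are hypothetical and only the bit offsets come from the tables; the descriptor is assumed to be available as a plain integer.<br />

```python
COMPONENTS = "xyzw"


def decode_selector(selector):
    """Turn an 8-bit component selector into a swizzle string such as 'xyzw'."""
    # Component 0 lives in bits 6-7, component 3 in bits 0-1.
    return "".join(COMPONENTS[(selector >> shift) & 0x3] for shift in (6, 4, 2, 0))


def decode_dest_mask(mask):
    """Turn the 4-bit destination mask into a string (bit 3 = x ... bit 0 = w)."""
    return "".join(c for i, c in enumerate(COMPONENTS) if mask & (8 >> i))


def decode_operand_descriptor(desc):
    """Split a descriptor word into its fields, using the bit offsets from the table."""
    fields = {"dest_mask": decode_dest_mask(desc & 0xF)}
    # Negation bits sit at offsets 0x4, 0xD and 0x16; each selector follows its bit.
    for n, base in enumerate((4, 13, 22), start=1):
        fields[f"src{n}_negate"] = bool((desc >> base) & 0x1)
        fields[f"src{n}_swizzle"] = decode_selector((desc >> (base + 1)) & 0xFF)
    return fields


print(decode_selector(0x1B), decode_selector(0x55))
```

The two selectors printed at the end are the examples given in the text (.xyzw and .yyyy).<br />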
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then c11 (SRC1+a0.y) is used for the instruction instead. It is only possible to use address registers on constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction, which converts a float value to an integer by truncating the fractional part. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
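<br />
The addressing rule described in this section might be modelled as follows. This is only a sketch: <code>effective_src</code> is a hypothetical helper, raw values use the SRC encoding in which 0x20-0x7F stands for c0-c95, and what happens when the sum leaves the constant range is not covered by this page and is not modelled.<br />

```python
def effective_src(src_raw, idx, a0_x=0, a0_y=0, a_l=0):
    """Return the raw register index actually read by a format 1 instruction.

    src_raw uses the SRC encoding (0x20-0x7F = c0-c95). The address register
    is only applied to constant registers; for anything else it is ignored.
    """
    # IDX raw value: 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    offset = (0, a0_x, a0_y, a_l)[idx]
    if idx == 0 or src_raw < 0x20:
        return src_raw
    return src_raw + offset


# IDX = 2, a0.y = 3, SRC1 = c8 -> c11
print(effective_src(0x20 + 8, 2, a0_y=3) - 0x20)
```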
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
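<br />
A minimal sketch of how CMP could be emulated from the table above. The names are hypothetical, operators 6 and 7 are modelled as always true (matching the tentative observation), and NaN comparisons follow Python's IEEE semantics, which may not match the hardware.<br />

```python
import operator

# Raw CMPX/CMPY value -> (name, predicate).
CMP_OPS = {
    0x0: ("EQ", operator.eq),
    0x1: ("NE", operator.ne),
    0x2: ("LT", operator.lt),
    0x3: ("LE", operator.le),
    0x4: ("GT", operator.gt),
    0x5: ("GE", operator.ge),
    0x6: ("??", lambda a, b: True),  # assumption: always true
    0x7: ("??", lambda a, b: True),  # assumption: always true
}


def run_cmp(src1, src2, cmpx, cmpy):
    """Return (cmp.x, cmp.y) computed from the x and y components of both operands."""
    return (
        CMP_OPS[cmpx][1](src1[0], src2[0]),
        CMP_OPS[cmpy][1](src1[1], src2[1]),
    )


print(run_cmp((1.0, 2.0), (1.0, 3.0), 0x0, 0x4))
```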
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
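<br />
The CONDOP table translates directly into code. The sketch below uses a hypothetical function name and treats cmp.x, cmp.y, REFX and REFY as Python booleans.<br />

```python
def eval_condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate the condition of a format 2 instruction from cmp.x and cmp.y."""
    x_ok = cmp_x == refx
    y_ok = cmp_y == refy
    if condop == 0x0:  # OR
        return x_ok or y_ok
    if condop == 0x1:  # AND
        return x_ok and y_ok
    if condop == 0x2:  # X only
        return x_ok
    if condop == 0x3:  # Y only
        return y_ok
    raise ValueError(f"invalid CONDOP {condop}")


print(eval_condition(0x1, True, False, True, False))
```

Note that REFX/REFY make it possible to test for a ''false'' comparison result without a separate negation flag.<br />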
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to an output register's component twice appears to cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remains constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| o0-o15<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC1 raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0xF<br />
| v0-v15<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
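<br />
The format described in this section can be sketched in code. Several points here are assumptions rather than confirmed facts: an exponent bias of 63 (by analogy with IEEE formats), an all-ones exponent encoding inf/NaN, and a zero exponent always decoding to +0. The function names are hypothetical.<br />

```python
import math

EXPONENT_BITS = 7
MANTISSA_BITS = 16
BIAS = (1 << (EXPONENT_BITS - 1)) - 1  # 63, assumed


def float24_to_float(raw):
    """Decode a 24-bit value (1 sign, 7 exponent, 16 mantissa bits)."""
    sign = -1.0 if (raw >> 23) & 0x1 else 1.0
    exponent = (raw >> MANTISSA_BITS) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        return 0.0  # zero and subnormals are treated as +0 (flushed)
    if exponent == 0x7F:
        return math.nan if mantissa else sign * math.inf  # assumed encoding
    return sign * math.ldexp(1.0 + mantissa / (1 << MANTISSA_BITS), exponent - BIAS)


def halvings_until_zero():
    """Count how often 1.0 can be multiplied by 0.5 before the result is flushed to 0."""
    exponent = BIAS  # biased exponent of 1.0
    count = 0
    while exponent > 0:  # a biased exponent of 0 is no longer a normal number
        exponent -= 1
        count += 1
    return count


print(float24_to_float(0x3F0000), halvings_until_zero())
```

Under these assumptions the halving count comes out as 63, which agrees with the hardware observation above.<br />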
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21556
SHBIN
2021-08-13T00:45:09Z
<p>Oreo639: </p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats like, for example, [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc). There may be multiple executable headers, so in this sense multiple shaders sharing the same program code can be stored in a single SHBIN. Hence for the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
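<br />
Reading the binary header could look like the sketch below. The function name is hypothetical, and little-endian byte order is an assumption (this page does not state it, although the instruction words themselves are documented as little endian).<br />

```python
import struct


def parse_dvlb(data):
    """Return the list of DVLE offsets and the offset at which DVLP starts."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a DVLB header")
    # One u32 per DVLE, each relative to the start of the DVLB section.
    dvle_offsets = list(struct.unpack_from(f"<{count}I", data, 8))
    dvlp_offset = 8 + 4 * count  # DVLP comes directly after the header
    return dvle_offsets, dvlp_offset


sample = b"DVLB" + struct.pack("<I", 2) + struct.pack("<2I", 0x40, 0x80) + b"DVLP"
print(parse_dvlb(sample))
```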
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Likely a version number)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8-bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
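<br />
A possible way to read the DVLP header and extract the instruction words is sketched below. The field names are made up for this example, little-endian order is assumed, and the internal layout of the 8-byte operand descriptor entries is deliberately left untouched since it is not specified here.<br />

```python
import struct
from collections import namedtuple

DvlpHeader = namedtuple("DvlpHeader", [
    "magic", "version", "blob_offset", "blob_size_words",
    "opdesc_offset", "opdesc_count", "unknown_18", "unknown_1c",
    "filename_table_offset", "filename_table_size",
])


def parse_dvlp(data, dvlp_offset):
    """Parse the 0x28-byte DVLP header located at dvlp_offset."""
    header = DvlpHeader(*struct.unpack_from("<4s9I", data, dvlp_offset))
    if header.magic != b"DVLP":
        raise ValueError("not a DVLP header")
    return header


def read_instructions(data, dvlp_offset, header):
    """Return the shader binary blob as a list of 32-bit instruction words."""
    start = dvlp_offset + header.blob_offset  # offsets are relative to DVLP start
    return list(struct.unpack_from(f"<{header.blob_size_words}I", data, start))
```

<code>dvlp_offset</code> would typically be the end of the DVLB header, i.e. 8 + 4*N.<br />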
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Likely a version number)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex and geometry shader outmaps (geometry shader)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
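<br />
The DVLE header maps onto a single <code>struct</code> format string. In the sketch below the field names are invented for readability and little-endian order is assumed; the sizes and offsets are those of the table above.<br />

```python
import struct

DVLE_FORMAT = "<4sHBBIIHHBBBB10I"
DVLE_FIELDS = (
    "magic", "version", "shader_type", "merge_outmaps",
    "main_offset", "endmain_offset", "input_mask", "output_mask",
    "gsh_type", "gsh_fixed_start_reg",
    "gsh_variable_vertex_count", "gsh_fixed_vertex_count",
    "constant_table_offset", "constant_count",
    "label_table_offset", "label_count",
    "output_table_offset", "output_count",
    "uniform_table_offset", "uniform_count",
    "symbol_table_offset", "symbol_table_size",
)


def parse_dvle(data, dvle_offset):
    """Parse the 0x40-byte DVLE header located at dvle_offset into a dict."""
    values = struct.unpack_from(DVLE_FORMAT, data, dvle_offset)
    header = dict(zip(DVLE_FIELDS, values))
    if header["magic"] != b"DVLE":
        raise ValueError("not a DVLE header")
    return header
```

All table offsets in the result are relative to the DVLE start, so <code>dvle_offset</code> has to be added before indexing into the file.<br />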
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Label ID<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Unknown (always 1?)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Size of label's location (in words). 0xFFFFFFFF/(uint32_t)-1 if there is no size.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
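<br />
Label entries could be read along these lines. This is a sketch with assumptions: the entry is taken to be two 16-bit fields followed by three 32-bit fields (which adds up to the 0x10-byte entry size), little-endian order is used, and symbols are assumed to be NUL-terminated ASCII strings. <code>parse_labels</code> is a hypothetical name.<br />

```python
import struct


def parse_labels(data, table_offset, count, symbols):
    """Parse `count` label entries; `symbols` is the DVLE symbol table as bytes."""
    labels = []
    for i in range(count):
        label_id, _unknown, location, size, symbol_offset = struct.unpack_from(
            "<HHIII", data, table_offset + 0x10 * i)
        end = symbols.find(b"\0", symbol_offset)
        if end == -1:
            end = len(symbols)
        labels.append({
            "id": label_id,
            "location": location,  # in words, relative to the program blob
            "size": None if size == 0xFFFFFFFF else size,
            "name": symbols[symbol_offset:end].decode("ascii", errors="replace"),
        })
    return labels
```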
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry consists of a header followed by the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant register ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Boolean constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Integer constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| floating-point constant register ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
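<br />
The three entry layouts above can be handled by one small dispatcher. In this sketch the function name is hypothetical, little-endian order is assumed, and each float24 component is assumed to occupy the low 24 bits of its 4-byte slot; the raw 24-bit values are returned without conversion.<br />

```python
import struct


def parse_constant(entry):
    """Parse one 0x14-byte constant table entry into (type, register ID, value)."""
    if len(entry) != 0x14:
        raise ValueError("constant entries are 0x14 bytes long")
    ctype, reg = entry[0], entry[2]
    if ctype == 0:
        return ("bool", reg, bool(entry[4]))
    if ctype == 1:
        return ("ivec4", reg, tuple(entry[4:8]))  # x, y, z, w as u8
    if ctype == 2:
        raw = struct.unpack_from("<4I", entry, 4)
        return ("vec4", reg, tuple(v & 0xFFFFFF for v in raw))  # raw float24 bits
    raise ValueError(f"unknown constant type {ctype}")
```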
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Output type (see table below)<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Register ID<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Output attribute component mask (e.g. 5=xz)<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Unknown (Consistently the same number throughout the DVLE, may vary between DVLEs?)<br />
|-<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Type<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
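<br />
An output table entry can be turned into a readable description as sketched below. The helper name is hypothetical and little-endian order is assumed; the component mask is read with bit 0 = x, which is what the "5=xz" example implies.<br />

```python
import struct

OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}


def parse_output_entry(entry):
    """Return (semantic, register name, component string) for an 8-byte entry."""
    out_type, reg, mask, _unknown = struct.unpack_from("<4H", entry, 0)
    components = "".join(c for bit, c in enumerate("xyzw") if mask & (1 << bit))
    return OUTPUT_TYPES.get(out_type, "?"), f"o{reg}", components


print(parse_output_entry(struct.pack("<4H", 2, 1, 5, 0)))
```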
<br />
=== Uniform Table Entry ===<br />
<br />
Keep in mind that the term "Uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment) and not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table contains a list of all registers whose initial values are derived from an external source, along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
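<br />
The unified register space above can be mapped back to register names with a small lookup. <code>uniform_register_name</code> is a hypothetical helper; indices that fall into the gaps of the table return <code>None</code>.<br />

```python
# (first index, last index, register prefix), taken from the table above.
REGISTER_RANGES = (
    (0x00, 0x0F, "v"),
    (0x10, 0x6F, "c"),
    (0x70, 0x73, "i"),
    (0x78, 0x87, "b"),
)


def uniform_register_name(index):
    """Map a uniform table register index to a name such as 'c12', or None."""
    for first, last, prefix in REGISTER_RANGES:
        if first <= index <= last:
            return f"{prefix}{index - first}"
    return None


print(uniform_register_name(0x10), uniform_register_name(0x87))
```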
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or multiple DVOJs can be linked to a DVLB file, similarly to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8-byte long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12-byte long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21555
GPU/Shader Instruction Set
2021-08-13T00:38:00Z
<p>Oreo639: Update the output attribute section</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary is comprised of two parts : the main instruction sequence and the operand descriptor table. These are both sent to the GPU around the same time but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instruction) may reference operand descriptors. When such is the case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that there may not be more than 512 instructions and 128 operand descriptors in a shader.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
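<br />
As a minimal sketch, the format 1 and format 1i tables above can be turned into a field decoder. It assumes that bit offset 0x0 is the least significant bit of the 32-bit instruction word; the helper names <code>bits</code> and <code>decode_format1</code> are illustrative only.<br />

```python
def bits(word, offset, size):
    """Extract `size` bits of `word` starting at bit `offset` (assumed: bit 0 = LSB)."""
    return (word >> offset) & ((1 << size) - 1)


def decode_format1(word, inverted=False):
    """Split a 32-bit word into format 1 fields (format 1i when inverted=True)."""
    fields = {
        "desc": bits(word, 0x0, 7),     # operand descriptor ID
        "idx": bits(word, 0x13, 2),     # IDX_1 (format 1) or IDX_2 (format 1i)
        "dst": bits(word, 0x15, 5),
        "opcode": bits(word, 0x1A, 6),
    }
    if inverted:
        # Format 1i: SRC2 gets 7 bits, SRC1 gets 5 bits.
        fields["src2"] = bits(word, 0x7, 7)
        fields["src1"] = bits(word, 0xE, 5)
    else:
        # Format 1: SRC2 gets 5 bits, SRC1 gets 7 bits.
        fields["src2"] = bits(word, 0x7, 5)
        fields["src1"] = bits(word, 0xC, 7)
    return fields
```

The only difference between the two layouts is where the boundary between SRC2 and SRC1 falls, which is why a single flag is enough to select between them.<br />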
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
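<br />
The format 5 and format 5i tables above can also be written as data and decoded generically. This is only a sketch: it assumes bit offset 0x0 is the least significant bit, and the names <code>FORMAT_5</code>, <code>FORMAT_5I</code>, <code>decode</code> and <code>decode_mad</code> are made up for illustration. The layout is chosen from the 3-bit opcode field, which is 0b111 for MAD (0x38-0x3F) and 0b110 for MADI (0x30-0x37).<br />

```python
# (name, bit offset, bit size) triples transcribed from the two tables above.
FORMAT_5 = [("desc", 0x0, 5), ("src3", 0x5, 5), ("src2", 0xA, 7),
            ("src1", 0x11, 5), ("idx", 0x16, 2), ("dst", 0x18, 5),
            ("opcode", 0x1D, 3)]
FORMAT_5I = [("desc", 0x0, 5), ("src3", 0x5, 7), ("src2", 0xC, 5),
             ("src1", 0x11, 5), ("idx", 0x16, 2), ("dst", 0x18, 5),
             ("opcode", 0x1D, 3)]


def decode(word, layout):
    """Extract every field of `layout` from a 32-bit instruction word."""
    return {name: (word >> off) & ((1 << size) - 1) for name, off, size in layout}


def decode_mad(word):
    """Return (mnemonic, fields) for a MAD or MADI instruction word."""
    top3 = (word >> 0x1D) & 0x7
    if top3 == 0b111:
        return "MAD", decode(word, FORMAT_5)
    if top3 == 0b110:
        return "MADI", decode(word, FORMAT_5I)
    raise ValueError("not a MAD/MADI instruction word")
```

Both layouts add up to exactly 32 bits, which is a quick sanity check on the tables.<br />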
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes 2 raised to the power of SRC1's first component; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes the base-2 logarithm of SRC1's first component; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions if BOOL is true<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (this is still a working hypothesis and unconfirmed)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
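<br />
The table above can be sketched as a mnemonic lookup on the top 6 bits of an instruction word. CMP, MADI and MAD use shorter opcode fields, so they each cover a range of 6-bit values. The names <code>OPCODES</code> and <code>opcode_name</code> are illustrative, and bit 0 is assumed to be the least significant bit.<br />

```python
# 6-bit opcode values with a known mnemonic, transcribed from the table above.
OPCODES = {
    0x00: "ADD", 0x01: "DP3", 0x02: "DP4", 0x03: "DPH", 0x04: "DST",
    0x05: "EX2", 0x06: "LG2", 0x07: "LITP", 0x08: "MUL", 0x09: "SGE",
    0x0A: "SLT", 0x0B: "FLR", 0x0C: "MAX", 0x0D: "MIN", 0x0E: "RCP",
    0x0F: "RSQ", 0x12: "MOVA", 0x13: "MOV", 0x18: "DPHI", 0x19: "DSTI",
    0x1A: "SGEI", 0x1B: "SLTI", 0x20: "BREAK", 0x21: "NOP", 0x22: "END",
    0x23: "BREAKC", 0x24: "CALL", 0x25: "CALLC", 0x26: "CALLU", 0x27: "IFU",
    0x28: "IFC", 0x29: "LOOP", 0x2A: "EMIT", 0x2B: "SETEMIT", 0x2C: "JMPC",
    0x2D: "JMPU",
}


def opcode_name(word):
    """Return the mnemonic for a 32-bit instruction word, or '???' if unknown."""
    op = (word >> 26) & 0x3F
    if op >= 0x38:
        return "MAD"    # 0x38-0x3F, 3-bit opcode field
    if op >= 0x30:
        return "MADI"   # 0x30-0x37, 3-bit opcode field
    if op >= 0x2E:
        return "CMP"    # 0x2E-0x2F, 5-bit opcode field
    return OPCODES.get(op, "???")
```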
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
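<br />
A minimal sketch of unpacking the descriptor fields listed above, assuming bit offset 0x0 is the least significant bit of the descriptor word. The function name and the returned dictionary layout are illustrative only.<br />

```python
def decode_operand_descriptor(desc):
    """Unpack a descriptor word into a destination mask string and three source entries."""
    mask = desc & 0xF
    # Bit 3 = x, bit 2 = y, bit 1 = z, bit 0 = w.
    dest = "".join(c for bit, c in zip((3, 2, 1, 0), "xyzw") if (mask >> bit) & 1)
    sources = []
    for base in (0x4, 0xD, 0x16):  # offsets of the three negation bits
        sources.append({
            "negate": bool((desc >> base) & 1),
            "selector": (desc >> (base + 1)) & 0xFF,  # 8-bit component selector
        })
    return {"dest_mask": dest, "sources": sources}
```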
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
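<br />
The two examples above can be checked with a short sketch. Component 0 occupies the top two bits of the selector, so the shifts run from 6 down to 0. The function names are illustrative only.<br />

```python
def swizzle_string(selector):
    """Render an 8-bit component selector as text, e.g. 0x1B -> 'xyzw'."""
    return "".join("xyzw"[(selector >> shift) & 0x3] for shift in (6, 4, 2, 0))


def apply_swizzle(selector, vec):
    """Reorder a 4-component sequence according to the selector."""
    return tuple(vec[(selector >> shift) & 0x3] for shift in (6, 4, 2, 0))
```

For instance, <code>swizzle_string(0xE4)</code> gives <code>wzyx</code>, the reverse of the identity selector 0x1B.<br />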
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then SRC1+a0.y = c11 will be used for the instruction instead. Address registers can only be applied to constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
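<br />
The rule above can be sketched as follows. It assumes that constant registers start at raw SRC value 0x20 (as in the SRC mapping table) and does not model what happens when the sum leaves the c0-c95 range, which is not described here. The function name is illustrative only.<br />

```python
CONST_BASE = 0x20  # raw SRC values 0x20-0x7F map to c0-c95


def effective_source(src_raw, idx, a0x=0, a0y=0, aL=0):
    """Return the raw register index actually read for a given IDX value."""
    offset = (0, a0x, a0y, aL)[idx]  # IDX 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if src_raw < CONST_BASE:
        # Input attribute and temporary registers ignore the address register.
        return src_raw
    return src_raw + offset
```

With IDX = 2, a0.y = 3 and SRC1 = c8 (raw 0x28), this yields raw 0x2B, i.e. c11.<br />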
<br />
a0.x and a0.y are set manually through the MOVA instruction, which converts a float value to an integer. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
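<br />
As a sketch of how aL evolves, the LOOP description (INT.x+1 iterations, aL starting at INT.y and incremented by INT.z after each iteration) can be written as below. Any wrap-around of aL at a register width is not modeled, since it is not described here; the function name is illustrative only.<br />

```python
def loop_al_values(int_x, int_y, int_z):
    """List the aL value seen by each iteration of a LOOP block."""
    return [int_y + i * int_z for i in range(int_x + 1)]
```

For example, INT = (3, 0, 1) gives four iterations with aL = 0, 1, 2, 3.<br />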
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
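<br />
The operator table can be sketched as below. Python floats are used, so the PICA200-specific handling of NaN and subnormals is not reproduced, and the function names are illustrative only.<br />

```python
import operator

# Raw CMPX/CMPY values 0-5; values 6 and 7 are handled separately below.
CMP_OPS = {0: operator.eq, 1: operator.ne, 2: operator.lt,
           3: operator.le, 4: operator.gt, 5: operator.ge}


def compare(op, src1, src2):
    """Evaluate one comparison operator on two scalars."""
    if op in (6, 7):
        return True  # both seem to always return true
    return CMP_OPS[op](src1, src2)


def cmp_instruction(cmpx, cmpy, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of the operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])
```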
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
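<br />
A minimal sketch of the four expressions above, together with extracting CONDOP, REFX and REFY from a format 2 instruction word. Bit offset 0x0 is assumed to be the least significant bit; the function names are illustrative only.<br />

```python
def condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate a format 2 condition against the cmp.x and cmp.y registers."""
    x_ok = bool(cmp_x) == bool(refx)
    y_ok = bool(cmp_y) == bool(refy)
    # CONDOP: 0 = OR, 1 = AND, 2 = X only, 3 = Y only.
    return (x_ok or y_ok, x_ok and y_ok, x_ok, y_ok)[condop]


def condition_from_word(word, cmp_x, cmp_y):
    """Pull CONDOP/REFY/REFX out of a format 2 word and evaluate the condition."""
    condop = (word >> 0x16) & 0x3
    refy = (word >> 0x18) & 0x1
    refx = (word >> 0x19) & 0x1
    return condition(condop, refx, refy, cmp_x, cmp_y)
```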
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output registers hold the data to be passed to the later GPU stages and are write-only. Each output register is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
Keep in mind that writing to an output register a second time within 8 cycles of the first write appears to cause the GPU to hang. After 8 cycles, all further writes to that output register appear to be ignored and do not cause a GPU hang.<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x6<br />
| o0-o6<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC1 raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x7<br />
| v0-v7<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
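<br />
The two mapping tables above can be sketched as name lookups. Raw values that the tables do not list return <code>None</code> rather than a guessed name; the function names are illustrative only.<br />

```python
def src_name(raw):
    """Map a raw SRC value to a register name, following the SRC mapping table."""
    if 0x0 <= raw <= 0x7:
        return f"v{raw}"
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"
    if 0x20 <= raw <= 0x7F:
        return f"c{raw - 0x20}"
    return None  # not listed in the table


def dst_name(raw):
    """Map a raw DST value to a register name, following the DST mapping table."""
    if 0x0 <= raw <= 0x6:
        return f"o{raw}"
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"
    return None  # not listed in the table
```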
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
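<br />
The 24-bit format described in this section can be sketched as a decoder. Several details are assumptions not stated above: the sign is taken to be bit 23, the exponent bias is taken to be 63, and an all-ones exponent is taken to encode inf/NaN in the IEEE style. Under those assumptions the halving count of 63 falls out directly. The function names are illustrative only.<br />

```python
def float24_to_float(bits):
    """Decode a 24-bit value: 1 sign bit, 7 exponent bits, 16 mantissa bits."""
    sign = -1.0 if (bits >> 23) & 1 else 1.0   # assumed: sign in bit 23
    exponent = (bits >> 16) & 0x7F
    mantissa = bits & 0xFFFF
    if exponent == 0:
        return 0.0                             # zero; subnormals flushed to +0
    if exponent == 0x7F:                       # assumed: IEEE-style inf/NaN
        return sign * float("inf") if mantissa == 0 else float("nan")
    return sign * (1 + mantissa / 0x10000) * 2.0 ** (exponent - 63)  # assumed bias 63


def halvings_until_zero():
    """Count how often 1.0 can be halved before the result is flushed to zero."""
    bits, count = 63 << 16, 0                  # encoding of 1.0 under the assumed bias
    while float24_to_float(bits) != 0.0:
        bits -= 1 << 16                        # halving lowers the exponent field by one
        count += 1
    return count
```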
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC ''after'' the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21554
SHBIN
2021-08-13T00:37:22Z
<p>Oreo639: Update output table</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc.). There may be multiple executable headers, so in this sense multiple shaders sharing the same program code can be stored in a single SHBIN. Hence for the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
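<br />
A minimal sketch of reading the binary header with Python's <code>struct</code> module. Little-endian fields are an assumption here, and the function name is illustrative only.<br />

```python
import struct


def parse_dvlb(data):
    """Return (DVLE offsets, offset of the DVLP section) from a SHBIN blob."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN: missing DVLB magic")
    # One u32 per DVLE, each relative to the start of the DVLB section.
    offsets = struct.unpack_from(f"<{count}I", data, 8)
    dvlp_offset = 8 + 4 * count  # DVLP comes directly after the header
    return list(offsets), dvlp_offset
```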
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Likely a version number)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
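<br />
The DVLP table above can be sketched as a parser. Little-endian fields are assumed, and reading only the first 4 bytes of each 8-byte operand descriptor entry is also an assumption (the contents of the remaining 4 bytes are not described here). The names <code>Dvlp</code> and <code>parse_dvlp</code> and the field names are illustrative only.<br />

```python
import struct
from collections import namedtuple

Dvlp = namedtuple("Dvlp", "magic version code_offset code_words desc_offset "
                          "desc_count unk_18 unk_1c names_offset names_size")


def parse_dvlp(data, start):
    """Return (header, instruction words, descriptor words) for a DVLP at `start`."""
    dvlp = Dvlp._make(struct.unpack_from("<4s9I", data, start))
    if dvlp.magic != b"DVLP":
        raise ValueError("missing DVLP magic")
    # The blob size is given in words, so read that many u32 values.
    code = struct.unpack_from(f"<{dvlp.code_words}I", data, start + dvlp.code_offset)
    # Assumed: the descriptor proper is the first u32 of each 8-byte entry.
    descs = [struct.unpack_from("<I", data, start + dvlp.desc_offset + 8 * i)[0]
             for i in range(dvlp.desc_count)]
    return dvlp, list(code), descs
```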
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Likely a version number)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex and geometry shader outmaps (geometry shader)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
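<br />
As a sketch, the DVLE table above maps onto a single <code>struct</code> format string of 0x40 bytes. Little-endian fields are assumed, and all field names are illustrative only.<br />

```python
import struct

# magic, u16, 2 x u8, 2 x u32, 2 x u16, 4 x u8, then ten u32 offset/count fields.
DVLE_FORMAT = "<4sHBBIIHHBBBB10I"
DVLE_FIELDS = (
    "magic version shader_type merge_outmaps main_offset endmain_offset "
    "input_mask output_mask gs_type gs_const_start gs_variable_vertices "
    "gs_fixed_vertices const_offset const_count label_offset label_count "
    "output_offset output_count uniform_offset uniform_count "
    "symbol_offset symbol_size"
).split()


def parse_dvle(data, start):
    """Return the DVLE header at `start` as a dict keyed by field name."""
    values = struct.unpack_from(DVLE_FORMAT, data, start)
    dvle = dict(zip(DVLE_FIELDS, values))
    if dvle["magic"] != b"DVLE":
        raise ValueError("missing DVLE magic")
    return dvle
```

The table offsets in the result are relative to the DVLE start, so they need <code>start</code> added before indexing into the file.<br />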
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Label ID<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Unknown (always 1?)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Size of label's location (in words). 0xFFFFFFFF/(uint32_t)-1 if there is no size.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
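<br />
A sketch of reading one label table entry, assuming the 0x10-byte entry is packed as two little-endian u16 fields followed by three u32 fields, and that symbols in the DVLE symbol table are NUL-terminated ASCII strings. Both points are assumptions, and the function name is illustrative only.<br />

```python
import struct


def parse_label(entry, symbols):
    """Decode a 0x10-byte label entry; `symbols` is the DVLE symbol table bytes."""
    label_id, _unknown, location, size, name_off = struct.unpack("<HHIII", entry)
    end = symbols.index(b"\0", name_off)  # assumed: NUL-terminated symbol
    return {
        "id": label_id,
        "offset_words": location,
        "size_words": None if size == 0xFFFFFFFF else size,  # 0xFFFFFFFF = no size
        "name": symbols[name_off:end].decode("ascii"),
    }
```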
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry consists of a header followed by the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant register ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Boolean constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Integer constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| floating-point constant register ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
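The three entry formats above could be decoded as sketched below. This is not a reference implementation: it assumes that float24 means 1 sign bit, 7 exponent bits (bias 63) and 16 mantissa bits, that each float24 occupies the low 24 bits of its 4-byte slot, and that a zero exponent means zero. The function names are hypothetical.<br />

```python
import struct


def float24_to_float(raw: int) -> float:
    """Decode a float24 (assumed: 1 sign, 7 exponent, 16 mantissa bits)."""
    raw &= 0xFFFFFF
    sign = -1.0 if raw >> 23 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        # Assumption: no denormals, a zero exponent is treated as zero.
        return sign * 0.0
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)


def parse_constant(entry: bytes):
    """Return (kind, register_id, value) for one 0x14-byte constant entry."""
    if len(entry) != 0x14:
        raise ValueError("constant entries are expected to be 0x14 bytes long")
    ctype, reg = entry[0x0], entry[0x2]
    if ctype == 0:
        return "bool", reg, bool(entry[0x4])
    if ctype == 1:
        return "ivec4", reg, tuple(entry[0x4:0x8])
    if ctype == 2:
        words = struct.unpack_from("<4I", entry, 0x4)
        return "vec4", reg, tuple(float24_to_float(w) for w in words)
    raise ValueError(f"unknown constant type {ctype}")
```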
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Output type (see table below)<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Register ID<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Output attribute component mask (e.g. 5=xz)<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Unknown (Consistently the same number throughout the entire shader?)<br />
|-<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Type<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
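As an illustration of the output entry layout and the type table above, the sketch below unpacks one 8-byte entry and expands the component mask. It assumes that bit 0 of the mask is x and bit 3 is w, which matches the "5=xz" example; the function name is hypothetical.<br />

```python
import struct

OUTPUT_TYPES = {
    0x0: "result.position", 0x1: "result.normalquat", 0x2: "result.color",
    0x3: "result.texcoord0", 0x4: "result.texcoord0w", 0x5: "result.texcoord1",
    0x6: "result.texcoord2", 0x8: "result.view",
}


def parse_output(entry: bytes):
    """Return (type name, register ID, component string) for one entry."""
    out_type, register, mask, _unknown = struct.unpack("<4H", entry)
    # Assumed bit order: bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w.
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    name = OUTPUT_TYPES.get(out_type, f"unknown(0x{out_type:x})")
    return name, register, components
```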
<br />
=== Uniform Table Entry ===<br />
<br />
Note that the term "Uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment) and not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table lists all registers whose initial values are supplied by an external source, along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
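The mapping above amounts to a range lookup. A minimal sketch follows, with a hypothetical helper name; indices that fall in the gaps of the table (for example 0x74-0x77) are rejected.<br />

```python
# (first index, last index, register prefix), taken from the table above.
REGISTER_RANGES = [
    (0x00, 0x0F, "v"),  # v0-v15
    (0x10, 0x6F, "c"),  # c0-c95
    (0x70, 0x73, "i"),  # i0-i3
    (0x78, 0x87, "b"),  # b0-b15
]


def register_name(index: int) -> str:
    """Translate a unified register index into a name such as 'c12'."""
    for first, last, prefix in REGISTER_RANGES:
        if first <= index <= last:
            return f"{prefix}{index - first}"
    raise ValueError(f"0x{index:02X} is not a mapped register index")
```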
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all available information about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
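Since every field after the magic is a 4-byte value, the 0x58-byte header above can be unpacked in one step. The sketch below uses hypothetical field names and simply follows the order of the table.<br />

```python
import struct

# Hypothetical names for the 21 u32 fields at 0x04..0x54, in table order.
DVOJ_FIELDS = (
    "unk_04", "unk_08", "padding",
    "const_off", "const_count", "label_off", "label_count",
    "code_off", "code_words", "ext_off", "ext_count",
    "block1_off", "block1_count", "block2_off", "block2_count",
    "out_off", "out_count", "uniform_off", "uniform_count",
    "symbol_off", "symbol_size",
)
DVOJ_HEADER = struct.Struct("<4s21I")  # 0x58 bytes in total


def parse_dvoj_header(data: bytes) -> dict:
    """Return the DVOJ header fields as a dict keyed by DVOJ_FIELDS."""
    magic, *values = DVOJ_HEADER.unpack_from(data, 0)
    if magic != b"DVOJ":
        raise ValueError("not a DVOJ object")
    return dict(zip(DVOJ_FIELDS, values))
```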
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21548
SHBIN
2021-08-11T20:38:16Z
<p>Oreo639: Update label table and fix typo</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (e.g. entry point, constant values, debug symbols). There may be multiple executable headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
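A minimal sketch of reading the binary header follows. It returns the DVLE offsets and the position where the DVLP is expected, computed as 8 + 4*N from the layout above. The function name is hypothetical.<br />

```python
import struct


def parse_dvlb(data: bytes):
    """Return (list of DVLE offsets, offset of the DVLP section)."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN (missing DVLB magic)")
    # The offset table starts at 0x8 and holds one u32 per DVLE.
    dvle_offsets = struct.unpack_from(f"<{count}I", data, 8)
    dvlp_offset = 8 + 4 * count  # DVLP directly follows the offset table
    return list(dvle_offsets), dvlp_offset
```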
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Likely a version number)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
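The DVLP header above is ten 4-byte fields, so it can be unpacked in one step and then used to locate the shader binary blob. The sketch below assumes 32-bit words and uses hypothetical field names.<br />

```python
import struct
from typing import NamedTuple


class DVLP(NamedTuple):
    magic: bytes
    version: int            # "Unknown" at 0x4, likely a version number
    code_off: int
    code_words: int
    opdesc_off: int
    opdesc_count: int
    unk_18: int
    unk_1c: int
    filename_sym_off: int
    filename_sym_size: int


DVLP_HEADER = struct.Struct("<4s9I")  # 0x28 bytes in total


def parse_dvlp(data: bytes, dvlp_start: int):
    """Return (header, tuple of instruction words) for the DVLP section."""
    header = DVLP(*DVLP_HEADER.unpack_from(data, dvlp_start))
    if header.magic != b"DVLP":
        raise ValueError("missing DVLP magic")
    # Offsets are relative to the DVLP start; sizes are in 32-bit words.
    code_start = dvlp_start + header.code_off
    code = struct.unpack_from(f"<{header.code_words}I", data, code_start)
    return header, code
```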
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Likely a version number)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
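The DVLE header mixes 1-, 2- and 4-byte fields, so a packed little-endian format string is a convenient way to mirror the table above. In the sketch below the field names are hypothetical, and the reading of the register bitmasks (bit i stands for register i) is an assumption.<br />

```python
import struct

# Hypothetical field names; the order follows the DVLE table above.
DVLE_FIELDS = (
    "magic", "version", "shader_type", "merge_outmaps",
    "main_off", "endmain_off", "input_mask", "output_mask",
    "gs_type", "gs_fixed_start", "gs_variable_count", "gs_fixed_count",
    "const_off", "const_count", "label_off", "label_count",
    "out_off", "out_count", "uniform_off", "uniform_count",
    "symbol_off", "symbol_size",
)
DVLE_HEADER = struct.Struct("<4sHBBIIHHBBBB10I")  # 0x40 bytes in total


def parse_dvle(data: bytes, dvle_start: int) -> dict:
    """Return the DVLE header fields as a dict keyed by DVLE_FIELDS."""
    header = dict(zip(DVLE_FIELDS, DVLE_HEADER.unpack_from(data, dvle_start)))
    if header["magic"] != b"DVLE":
        raise ValueError("missing DVLE magic")
    return header


def used_registers(mask: int):
    # Assumption: bit i of the mask corresponds to register i.
    return [i for i in range(16) if mask & (1 << i)]
```

The table offsets in the returned dict are relative to the DVLE start, so they have to be added to <code>dvle_start</code> before indexing into the file.<br />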
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Label ID<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Unknown (always 1?)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Size of label's location (in words). 0xFFFFFFFF/(uint32_t)-1 if there is no size.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. ctrulib's SHDR framework uses this information to automatically send those values to the GPU when switching to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant register ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Boolean constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Integer constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| floating-point constant register ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
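The bit positions above can be extracted with shifts and masks. The sketch below assumes the entry is 8 bytes long (the size the DVLE header gives for output register table entries) and that bits are numbered from the least significant bit of the entry read as one little-endian integer. The function name is hypothetical.<br />

```python
def unpack_output_bits(entry: bytes):
    """Return (output type, register ID, component mask) from an 8-byte entry."""
    value = int.from_bytes(entry, "little")
    out_type = value & 0xF          # bits 0-3
    register = (value >> 16) & 0xF  # bits 16-19
    mask = (value >> 32) & 0xF      # bits 32-35
    return out_type, register, mask
```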
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Type<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
<br />
=== Uniform Table Entry ===<br />
<br />
Note that the term "Uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment) and not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table lists all registers whose initial values are supplied by an external source, along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all available information about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
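If the guess about line information holds, each 8-byte item could be paired with its instruction index as sketched below. The sketch additionally assumes that filenames in the symbol table are NUL-terminated ASCII strings; the function name is hypothetical.<br />

```python
import struct


def parse_block1(block: bytes, symbols: bytes):
    """Yield (instruction index, source filename, line number) per item."""
    for index, (name_off, line) in enumerate(struct.iter_unpack("<II", block)):
        # Assumption: the filename runs up to the next NUL byte.
        end = symbols.index(b"\0", name_off)
        yield index, symbols[name_off:end].decode("ascii"), line
```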
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21534
SHBIN
2021-07-25T04:43:29Z
<p>Oreo639: </p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (e.g. entry point, constant values, debug symbols). There may be multiple executable headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. ctrulib's SHDR framework uses this information to automatically send those values to the GPU when switching to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant register ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Boolean constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Integer constant register ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| floating-point constant register ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Type<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
<br />
=== Uniform Table Entry ===<br />
<br />
Note that the term "Uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment) and not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table lists all registers whose initial values are supplied by an external source, along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
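A uniform entry can be resolved to a name and a register span as sketched below. The sketch assumes that symbols are NUL-terminated ASCII strings and that the end register is inclusive, which is consistent with it being equal to the start register for non-arrays. The function names are hypothetical.<br />

```python
import struct


def read_symbol(symbols: bytes, offset: int) -> str:
    # Assumption: symbols are NUL-terminated ASCII strings.
    end = symbols.index(b"\0", offset)
    return symbols[offset:end].decode("ascii")


def parse_uniform(entry: bytes, symbols: bytes):
    """Return (name, first register, last register, register count)."""
    symbol_off, first, last = struct.unpack("<IHH", entry)
    return read_symbol(symbols, symbol_off), first, last, last - first + 1
```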
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all available information about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
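<br />
A minimal sketch of reading this header follows. It assumes the fields are little-endian u32 values, which the table does not state explicitly, and every field name is a hypothetical label chosen for this example rather than an official one.<br />

```python
import struct

# Hypothetical field names, in file order, for the 21 u32 values after the magic.
DVOJ_FIELDS = (
    "unknown_04", "unknown_08", "padding_0c",
    "const_table_offset", "const_table_count",
    "label_table_offset", "label_table_count",
    "blob_offset", "blob_size_words",
    "ext_table_offset", "ext_table_count",
    "unknown1_offset", "unknown1_count",
    "unknown2_offset", "unknown2_count",
    "outreg_table_offset", "outreg_table_count",
    "uniform_table_offset", "uniform_table_count",
    "symbol_table_offset", "symbol_table_size",
)


def parse_dvoj_header(data: bytes) -> dict:
    """Split the 0x58-byte DVOJ header into named fields."""
    # Assumption: little-endian layout, 4-byte magic followed by 21 u32 values.
    magic, *values = struct.unpack_from("<4s21I", data, 0)
    if magic != b"DVOJ":
        raise ValueError("not a DVOJ object")
    return dict(zip(DVOJ_FIELDS, values))
```

Since all offsets are relative to the DVOJ start, a table can then be located with, for example, <code>data[fields["symbol_table_offset"]:]</code>.<br />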
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21533
GPU/Shader Instruction Set
2021-07-25T04:31:24Z
<p>Oreo639: Fix typo</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU around the same time, but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. In that case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are coded in little endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that a shader may contain at most 512 instructions and 128 operand descriptors.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
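<br />
The bit layouts above can be turned into decoders with a single shift-and-mask helper. The sketch below covers formats 1 and 1i only; it assumes that offsets count from the least significant bit of the 32-bit instruction word, and the function names are hypothetical.<br />

```python
def bits(word: int, offset: int, size: int) -> int:
    """Extract `size` bits starting at bit `offset` (bit 0 = least significant)."""
    return (word >> offset) & ((1 << size) - 1)


def decode_format1(word: int) -> dict:
    """Decode a format 1 instruction: 5-bit SRC2, 7-bit SRC1."""
    return {
        "desc": bits(word, 0x0, 7),
        "src2": bits(word, 0x7, 5),
        "src1": bits(word, 0xC, 7),
        "idx": bits(word, 0x13, 2),   # address register index for SRC1
        "dst": bits(word, 0x15, 5),
        "opcode": bits(word, 0x1A, 6),
    }


def decode_format1i(word: int) -> dict:
    """Decode a format 1i instruction: 7-bit SRC2, 5-bit SRC1."""
    return {
        "desc": bits(word, 0x0, 7),
        "src2": bits(word, 0x7, 7),
        "src1": bits(word, 0xE, 5),
        "idx": bits(word, 0x13, 2),   # address register index for SRC2
        "dst": bits(word, 0x15, 5),
        "opcode": bits(word, 0x1A, 6),
    }
```

The other formats follow the same pattern with their own offsets and sizes.<br />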
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes 2 raised to the power of SRC1's first component; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes the base-2 logarithm of SRC1's first component; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| If BOOL is true, jumps to DST and executes instructions until it reaches DST+NUM<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (This is still a working hypothesis and unconfirmed.)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operands' x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA: the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA: the intermediate result is rounded<br />
|}<br />
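<br />
To make the less obvious formulas in the table concrete, here is a rough reference model of a few arithmetic instructions. It is a sketch under strong assumptions: it uses ordinary Python floats instead of the hardware's number format, ignores swizzling and destination masking, and does not reproduce special cases such as inf * 0. The function names simply mirror the opcode names.<br />

```python
def dp3(src1, src2):
    """DP3: dot product over the first three components."""
    return sum(src1[i] * src2[i] for i in range(3))


def dp4(src1, src2):
    """DP4: dot product over all four components."""
    return sum(src1[i] * src2[i] for i in range(4))


def dph(src1, src2):
    """DPH: SRC1 is treated as (x, y, z, 1.0) before the 4-component dot product."""
    return dp4([src1[0], src1[1], src1[2], 1.0], src2)


def dst(src1, src2):
    """DST: {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}."""
    return [1.0, src1[1] * src2[1], src1[2], src2[3]]


def mad(src1, src2, src3):
    """MAD: DST[i] = SRC3[i] + SRC2[i]*SRC1[i], product rounded before the add."""
    # Python floats round the product too, but to double precision, not to the
    # hardware's precision; this only models the order of operations.
    return [src3[i] + src2[i] * src1[i] for i in range(4)]
```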
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
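<br />
The descriptor layout and the swizzle encoding can be checked with a short decoder. This is an illustrative sketch: the function names and dictionary keys are hypothetical, and bit offsets are assumed to count from the least significant bit of the descriptor word.<br />

```python
COMPONENTS = "xyzw"


def swizzle_string(selector: int) -> str:
    """Decode an 8-bit component selector; component 0 sits in bits 6-7."""
    return "".join(COMPONENTS[(selector >> (6 - 2 * i)) & 0x3] for i in range(4))


def dest_mask_string(mask: int) -> str:
    """Decode the 4-bit destination mask; bit 3 = x, 2 = y, 1 = z, 0 = w."""
    return "".join(c for i, c in enumerate(COMPONENTS) if mask & (0x8 >> i))


def decode_operand_descriptor(desc: int) -> dict:
    """Split an operand descriptor word into mask, negation bits and swizzles."""
    return {
        "dest_mask": dest_mask_string(desc & 0xF),
        "src1_neg": bool((desc >> 0x4) & 1),
        "src1_swizzle": swizzle_string((desc >> 0x5) & 0xFF),
        "src2_neg": bool((desc >> 0xD) & 1),
        "src2_swizzle": swizzle_string((desc >> 0xE) & 0xFF),
        "src3_neg": bool((desc >> 0x16) & 1),
        "src3_swizzle": swizzle_string((desc >> 0x17) & 0xFF),
    }
```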
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then c11 (SRC1+a0.y) is used for the instruction instead. Address registers can only be applied to constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction by rounding a float value to integer precision. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
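<br />
The addressing rule for format 1 instructions can be sketched as follows. The helper name is hypothetical, the constant range 0x20-0x7F is taken from the SRC mapping table in the Registers section, and what happens when the sum leaves that range is not covered here.<br />

```python
def effective_src1(src1: int, idx: int, a0x: int, a0y: int, al: int) -> int:
    """Apply relative addressing to a raw 7-bit SRC1 value (hypothetical helper)."""
    # IDX raw value -> address register contents (0 means no relative addressing).
    offsets = {0: 0, 1: a0x, 2: a0y, 3: al}
    is_constant = 0x20 <= src1 <= 0x7F
    if not is_constant:
        # Address registers are ignored for input and temporary registers.
        return src1
    return src1 + offsets[idx]
```

With the example from the text, SRC1 = c8 has raw value 0x28, so <code>effective_src1(0x28, 2, 0, 3, 0)</code> yields 0x2B, which is c11.<br />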
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
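<br />
As an illustration, the operator table can be modelled like this. The function names are hypothetical, operators 6 and 7 are modelled as always true (which matches the observation above but is unconfirmed), and ordinary Python floats stand in for the hardware's number format.<br />

```python
import operator

# CMPX/CMPY raw value -> comparison, as listed in the table above.
CMP_OPERATORS = {
    0x0: operator.eq,
    0x1: operator.ne,
    0x2: operator.lt,
    0x3: operator.le,
    0x4: operator.gt,
    0x5: operator.ge,
}


def compare(op: int, a: float, b: float) -> bool:
    """Evaluate one comparison operator; 6 and 7 are assumed to return true."""
    if op in (0x6, 0x7):
        return True
    return CMP_OPERATORS[op](a, b)


def cmp(cmpx: int, cmpy: int, src1, src2):
    """Return (cmp.x, cmp.y) from the x and y components of both operands."""
    return compare(cmpx, src1[0], src2[0]), compare(cmpy, src1[1], src2[1])
```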
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP: cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters: CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
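<br />
The four expressions in the table translate directly into code. The sketch below is illustrative only; the function name and argument order are hypothetical.<br />

```python
def condition(condop: int, refx: int, refy: int, cmp_x: bool, cmp_y: bool) -> bool:
    """Evaluate a format 2 condition from CONDOP, REFX, REFY and the cmp registers."""
    x_matches = cmp_x == bool(refx)
    y_matches = cmp_y == bool(refy)
    if condop == 0x0:      # OR
        return x_matches or y_matches
    if condop == 0x1:      # AND
        return x_matches and y_matches
    if condop == 0x2:      # X only
        return x_matches
    if condop == 0x3:      # Y only
        return y_matches
    raise ValueError("CONDOP is a 2-bit field")
```

For example, testing "cmp.x is false" uses CONDOP = 2 with REFX = 0, so <code>condition(2, 0, 0, False, True)</code> is true.<br />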
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Vertex shader<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output attribute registers hold the data to be passed to the later GPU stages and are write-only. Each of the output attribute register components is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
It appears that writing more than once to the same component of an output register can cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x6<br />
| o0-o6<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC1 raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x7<br />
| v0-v7<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
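<br />
The halving experiment can be reproduced with a small model of the number format. This is a sketch under assumptions that the text does not confirm: the exponent bias is taken to be 63, an all-ones exponent is taken to encode inf/NaN as in IEEE 754, and both function names are hypothetical.<br />

```python
import math

EXPONENT_BIAS = 63  # assumption: bias of the 7-bit exponent


def float24_to_float(raw: int) -> float:
    """Decode a 24-bit value: 1 sign bit, 7 exponent bits, 16 mantissa bits."""
    sign = (raw >> 23) & 0x1
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        # Zero and subnormals: modelled as +0 (flushing, no negative zero).
        return 0.0
    if exponent == 0x7F:
        # Assumption: all-ones exponent encodes inf (mantissa 0) or NaN.
        value = math.inf if mantissa == 0 else math.nan
    else:
        value = (1.0 + mantissa / 0x10000) * 2.0 ** (exponent - EXPONENT_BIAS)
    return -value if sign else value


# Smallest positive normal value under the assumed bias.
MIN_NORMAL = 2.0 ** (1 - EXPONENT_BIAS)


def halvings_until_zero() -> int:
    """Count multiplications of 1.0 by 0.5 until the flushed result is zero."""
    x, count = 1.0, 0
    while x != 0.0:
        x *= 0.5
        if abs(x) < MIN_NORMAL:
            x = 0.0  # output subnormals are flushed to zero
        count += 1
    return count
```

Under these assumptions the loop needs 63 halvings, the same count as observed on hardware.<br />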
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using the following independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC *after* the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21532
SHBIN
2021-07-25T02:23:15Z
<p>Oreo639: </p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc.). There may be multiple executable headers, so in this sense multiple shaders sharing the same program code can be stored in a single SHBIN. Hence for the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
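<br />
A minimal sketch of reading the binary header follows. It assumes little-endian fields (only the offsets are stated to be u32 above), and the function name is hypothetical.<br />

```python
import struct


def parse_dvlb(data: bytes):
    """Return (DVLE offsets, DVLP offset) from the start of a SHBIN file."""
    # Assumption: little-endian; 4-byte magic followed by the u32 DVLE count.
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN file")
    # N u32 offsets, each relative to the start of the DVLB section.
    dvle_offsets = list(struct.unpack_from(f"<{count}I", data, 0x8))
    # The DVLP section comes directly after the offset table.
    dvlp_offset = 0x8 + 4 * count
    return dvle_offsets, dvlp_offset
```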
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
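<br />
The program header can be read in the same way. The sketch below assumes little-endian u32 fields and uses hypothetical names; operand descriptor entries are returned as raw 8-byte slices because their internal layout is not described in this table.<br />

```python
import struct


def parse_dvlp(data: bytes, dvlp_offset: int) -> dict:
    """Extract instruction words and raw operand descriptor entries from a DVLP."""
    # Assumption: little-endian; 4-byte magic followed by nine u32 fields.
    (magic, version, blob_offset, blob_words, desc_offset, desc_count,
     _unknown_18, _unknown_1c, symbols_offset, symbols_size) = struct.unpack_from(
        "<4s9I", data, dvlp_offset)
    if magic != b"DVLP":
        raise ValueError("DVLP magic not found")
    # All offsets are relative to the DVLP start; the blob size is in words.
    blob_start = dvlp_offset + blob_offset
    instructions = list(struct.unpack_from(f"<{blob_words}I", data, blob_start))
    desc_start = dvlp_offset + desc_offset
    descriptors = [data[desc_start + 8 * i: desc_start + 8 * (i + 1)]
                   for i in range(desc_count)]
    symbols_start = dvlp_offset + symbols_offset
    return {
        "version": version,
        "instructions": instructions,
        "descriptors": descriptors,
        "symbols": data[symbols_start: symbols_start + symbols_size],
    }
```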
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
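<br />
The DVLE header above maps onto a single <code>struct</code> format string. The sketch below is only an illustration: it assumes little-endian fields, uses made-up field names, and assumes that bit ''i'' of a register bitmask corresponds to register ''i'':<br />

```python
import struct

DVLE_FORMAT = "<4sHBBIIHHBBBB10I"  # 0x40 bytes in total
DVLE_FIELDS = (
    "magic", "version", "shader_type", "merge_outmaps",
    "main_offset", "endmain_offset", "input_mask", "output_mask",
    "gs_type", "gs_fixed_start_reg", "gs_variable_vertices", "gs_fixed_vertices",
    "const_offset", "const_count", "label_offset", "label_count",
    "outreg_offset", "outreg_count", "uniform_offset", "uniform_count",
    "symbol_offset", "symbol_size",
)

def parse_dvle(data, base):
    """Parse the DVLE header at byte offset `base` into a dict."""
    dvle = dict(zip(DVLE_FIELDS, struct.unpack_from(DVLE_FORMAT, data, base)))
    if dvle["magic"] != b"DVLE":
        raise ValueError("missing DVLE magic")
    return dvle

def used_registers(mask):
    """List register numbers set in a 16-bit mask (assumes bit i = register i)."""
    return [i for i in range(16) if mask & (1 << i)]
```

Table offsets in the resulting dict are relative to the DVLE start, so they need to be added to <code>base</code> before use.<br />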
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
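<br />
A label table can be walked as follows. This is a sketch under two assumptions that the table above does not state: fields are little-endian, and symbols in the DVLE symbol table are NUL-terminated ASCII strings. Bytes 0x1-0x3 of each entry are not described above and are skipped here:<br />

```python
import struct

LABEL_ENTRY = "<B3xIII"  # 0x10 bytes per entry

def read_cstring(table, offset):
    """Read a NUL-terminated string (assumed encoding: ASCII)."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii", errors="replace")

def parse_labels(data, table_offset, count, symbols):
    """Return one dict per label table entry."""
    labels = []
    for i in range(count):
        label_id, location, unknown, sym_off = struct.unpack_from(
            LABEL_ENTRY, data, table_offset + i * 0x10)
        labels.append({
            "id": label_id,
            "word_offset": location,  # in words, relative to the program blob
            "unknown": unknown,
            "name": read_cstring(symbols, sym_off),
        })
    return labels
```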
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. ctrulib's SHDR framework uses this information to automatically send those values to the GPU when switching to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
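<br />
The three entry formats can be decoded with one function. The sketch below makes assumptions that go beyond the tables above: fields are little-endian, each float24 value sits in the low 24 bits of its 4-byte slot, and float24 uses 1 sign bit, 7 exponent bits (bias 63) and 16 mantissa bits. Denormals and special values are not handled:<br />

```python
import struct

def float24_to_float(raw):
    """Decode a float24 (assumed: 1 sign, 7 exponent with bias 63, 16 mantissa)."""
    raw &= 0xFFFFFF
    sign = -1.0 if raw >> 23 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0 and mantissa == 0:
        return sign * 0.0
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)

def parse_constant(entry):
    """Decode one 0x14-byte constant table entry into (type, id, value)."""
    if len(entry) != 0x14:
        raise ValueError("constant entries are 0x14 bytes long")
    ctype, cid = entry[0], entry[2]
    if ctype == 0:
        return ("bool", cid, entry[4] != 0)
    if ctype == 1:
        return ("ivec4", cid, tuple(entry[4:8]))  # x, y, z, w as u8
    if ctype == 2:
        words = struct.unpack_from("<4I", entry, 4)
        return ("vec4", cid, tuple(float24_to_float(w) for w in words))
    raise ValueError("unknown constant type %d" % ctype)
```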
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
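<br />
To illustrate the bit layout and the output type table, the sketch below decodes one 8-byte output table entry. It assumes the entry is a little-endian 64-bit value with bit 0 as the least significant bit, and that mask bit 0 selects x (consistent with the "5=xz" example above):<br />

```python
import struct

OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}

def parse_output_entry(entry):
    """Return (output type name, register ID, component string)."""
    (value,) = struct.unpack("<Q", entry)
    out_type = value & 0xF
    register = (value >> 16) & 0xF
    mask = (value >> 32) & 0xF
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    return OUTPUT_TYPES.get(out_type, "unknown"), register, components
```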
<br />
=== Uniform Table Entry ===<br />
<br />
Note that the term "uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment), not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table lists all registers whose initial values are supplied by an external source, along with their layout and associated symbol.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
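<br />
The mapping above translates directly into a lookup function. The following is a small sketch (the function names are hypothetical, and the entry parser assumes little-endian fields):<br />

```python
import struct

def uniform_register_name(index):
    """Map a unified register index to its register name."""
    if 0x00 <= index <= 0x0F:
        return "v%d" % index
    if 0x10 <= index <= 0x6F:
        return "c%d" % (index - 0x10)
    if 0x70 <= index <= 0x73:
        return "i%d" % (index - 0x70)
    if 0x78 <= index <= 0x87:
        return "b%d" % (index - 0x78)
    raise ValueError("index 0x%X is outside the documented ranges" % index)

def parse_uniform_entry(entry):
    """Return (symbol offset, first register, last register) for one entry."""
    symbol_offset, start, end = struct.unpack("<IHH", entry)
    return symbol_offset, uniform_register_name(start), uniform_register_name(end)
```

For example, an entry whose start and end indices are 0x10 and 0x13 describes a four-register array occupying c0 through c3.<br />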
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8-byte long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12-byte long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
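<br />
If the guess above is right, the block can be read as a per-instruction list of (filename offset, line number) pairs. The sketch below assumes little-endian fields and is as speculative as the table itself:<br />

```python
import struct

def parse_line_info(data, block_offset, count):
    """Return a list where item N is (filename symbol offset, source line)
    for the Nth instruction of the shader binary."""
    return [
        struct.unpack_from("<II", data, block_offset + 8 * n)
        for n in range(count)
    ]
```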
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21531
SHBIN
2021-07-25T02:18:30Z
<p>Oreo639: Update usage of the term uniform and add clarification</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN starts with a binary header (DVLB), followed by a single program header (DVLP) and then one or more executable headers (DVLEs). The binary header specifies the number and location of the DVLEs. The program header specifies the generic parts of the shader (the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (entry point, constant values, debug symbols, etc.). Because there may be multiple executable headers, a single SHBIN can store multiple shaders that share the same program code. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float constant register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. ctrulib's SHDR framework uses this information to automatically send those values to the GPU when switching to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Constant vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
<br />
=== Uniform Table Entry ===<br />
<br />
Note that the term "uniform" is used here as [https://developer.download.nvidia.com/CgTutorial/cg_tutorial_chapter03.html defined by Nvidia] (a variable that obtains its initial value from an external environment), not as defined by RenderMan/GLSL (variables whose values are constant over a shaded surface).<br />
<br />
The uniform table lists all registers whose initial values are supplied by an external source.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Register index of the start of the uniform<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Register index of the end of the uniform (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8-byte long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12-byte long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=GPU/Shader_Instruction_Set&diff=21530
GPU/Shader Instruction Set
2021-07-25T01:48:09Z
<p>Oreo639: Add register table and idx table</p>
<hr />
<div>[[Category:GPU]]<br />
<br />
== Overview ==<br />
A compiled shader binary consists of two parts: the main instruction sequence and the operand descriptor table. Both are sent to the GPU around the same time, but using separate [[GPU/Internal_Registers|GPU Commands]]. Instructions (such as format 1 instructions) may reference operand descriptors. In that case, the operand descriptor ID is the offset, in words, of the descriptor within the table.<br />
Both instructions and descriptors are encoded in little-endian.<br />
Basic implementations of the following specification can be found at [https://github.com/smealum/aemstro] and [https://github.com/neobrain/nihstro].<br />
The instruction set seems to have been heavily inspired by Microsoft's vs_3_0 [http://msdn.microsoft.com/en-us/library/windows/desktop/bb172938%28v=vs.85%29.aspx] and the Direct3D shader code [https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891%28v=vs.85%29.aspx].<br />
Please note that this page is being written as the instruction set is reverse engineered; as such it may very well contain mistakes.<br />
<br />
Debug information found in the code.bin of "Ironfall: Invasion" suggests that a shader may contain at most 512 instructions and 128 operand descriptors.<br />
<br />
== Nomenclature ==<br />
<br />
* opcode names with I appended to them are the same as their non-I version, except they use the inverted instruction format, giving 7 bits to SRC2 (and access to constant registers) and 5 bits to SRC1<br />
<br />
* opcode names with U appended to them are the same as their non-U version, except they are executed conditionally based on the value of a constant boolean register.<br />
<br />
* opcode names with C appended to them are the same as their non-C version, except they are executed conditionally based on a logical expression specified in the instruction.<br />
<br />
== Instruction formats ==<br />
<br />
Format 1 : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
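<br />
The format 1 layout can be unpacked from a 32-bit instruction word with shifts and masks. This sketch reads the "Offset" column as a bit position counted from the least significant bit, which matches the opcode occupying the top 6 bits:<br />

```python
def decode_format1(word):
    """Split a 32-bit format 1 instruction word into its fields."""
    return {
        "desc": word & 0x7F,            # bits 0-6
        "src2": (word >> 7) & 0x1F,     # bits 7-11
        "src1": (word >> 12) & 0x7F,    # bits 12-18
        "idx1": (word >> 19) & 0x3,     # bits 19-20
        "dst": (word >> 21) & 0x1F,     # bits 21-25
        "opcode": (word >> 26) & 0x3F,  # bits 26-31
    }
```

Format 1i differs only in the SRC2 and SRC1 fields, which become 7 bits at offset 0x7 and 5 bits at offset 0xE respectively.<br />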
<br />
Format 1i : (used for register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xE<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1u : (used for unary register operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 1c : (used for comparison operations)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x7<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x7<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0xC<br />
| 0x7<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x13<br />
| 0x2<br />
| Address register index for SRC1 (IDX_1)<br />
|-<br />
| 0x15<br />
| 0x3<br />
| Comparison operator for Y (CMPY)<br />
|-<br />
| 0x18<br />
| 0x3<br />
| Comparison operator for X (CMPX)<br />
|-<br />
| 0x1B<br />
| 0x5<br />
| Opcode<br />
|}<br />
<br />
Format 2 : (used for flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Condition boolean operator (CONDOP)<br />
|-<br />
| 0x18<br />
| 0x1<br />
| Y reference bit (REFY)<br />
|-<br />
| 0x19<br />
| 0x1<br />
| X reference bit (REFX)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
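<br />
Format 2 can be decoded the same way with a generic bitfield helper. The sketch below assumes offsets count from the least significant bit; bits 8-9 are not described in the table above and are ignored. The helper name <code>bits</code> is made up for this example:<br />

```python
def bits(word, offset, size):
    """Extract `size` bits of `word` starting at bit `offset` (LSB = bit 0)."""
    return (word >> offset) & ((1 << size) - 1)

def decode_format2(word):
    """Split a 32-bit format 2 (flow control) instruction word into its fields."""
    return {
        "num": bits(word, 0x0, 8),
        "dst": bits(word, 0xA, 12),    # destination offset, in words
        "condop": bits(word, 0x16, 2),
        "refy": bits(word, 0x18, 1),
        "refx": bits(word, 0x19, 1),
        "opcode": bits(word, 0x1A, 6),
    }
```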
<br />
Format 3 : (used for constant-based conditional flow control instructions)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x8<br />
| Number of instructions ? (NUM)<br />
|-<br />
| 0xA<br />
| 0xC<br />
| Destination offset (in words) (DST)<br />
|-<br />
| 0x16<br />
| 0x4<br />
| Constant ID (BOOL/INT)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 4 : (used for SETEMIT)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Winding flag (FLAG_WINDING)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Primitive emit flag (FLAG_PRIMEMIT)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| Vertex ID (VTXID)<br />
|-<br />
| 0x1A<br />
| 0x6<br />
| Opcode<br />
|}<br />
<br />
Format 5 : (used for MAD)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x5<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xA<br />
| 0x7<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC2 (IDX_2)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
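<br />
Format 5 packs three sources into one word, so the operand descriptor ID shrinks to 5 bits and the opcode to 3 bits. A sketch of the decoding, again treating offsets as bit positions from the least significant bit:<br />

```python
def decode_format5(word):
    """Split a 32-bit format 5 (MAD) instruction word into its fields."""
    return {
        "desc": word & 0x1F,           # bits 0-4
        "src3": (word >> 5) & 0x1F,    # bits 5-9
        "src2": (word >> 10) & 0x7F,   # bits 10-16
        "src1": (word >> 17) & 0x1F,   # bits 17-21
        "idx2": (word >> 22) & 0x3,    # bits 22-23
        "dst": (word >> 24) & 0x1F,    # bits 24-28
        "opcode": (word >> 29) & 0x7,  # bits 29-31
    }
```

Because only 5 bits remain for the descriptor ID, an instruction in this format can only reference the first 32 operand descriptors.<br />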
<br />
Format 5i : (used for MADI)<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size (bits)<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x5<br />
| Operand descriptor ID (DESC)<br />
|-<br />
| 0x5<br />
| 0x7<br />
| Source 3 register (SRC3)<br />
|-<br />
| 0xC<br />
| 0x5<br />
| Source 2 register (SRC2)<br />
|-<br />
| 0x11<br />
| 0x5<br />
| Source 1 register (SRC1)<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Address register index for SRC3 (IDX_3)<br />
|-<br />
| 0x18<br />
| 0x5<br />
| Destination register (DST)<br />
|-<br />
| 0x1D<br />
| 0x3<br />
| Opcode<br />
|}<br />
<br />
== Instructions ==<br />
Unless noted otherwise, SRC1 and SRC2 refer to their respectively indexed float[4] registers (after swizzling). Similarly, DST refers to its indexed register modulo destination component masking, i.e. an expression like DST=SRC1 might actually just set DST.y to SRC1.y.<br />
<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Opcode<br />
! Format<br />
! Name<br />
! Description<br />
|-<br />
| 0x00<br />
| 1<br />
| ADD<br />
| Adds two vectors component by component; DST[i] = SRC1[i]+SRC2[i] for all i<br />
|-<br />
| 0x01<br />
| 1<br />
| DP3<br />
| Computes dot product on 3-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x02<br />
| 1<br />
| DP4<br />
| Computes dot product on 4-component vectors; DST = SRC1.SRC2<br />
|-<br />
| 0x03<br />
| 1<br />
| DPH<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x04<br />
| 1<br />
| DST<br />
| Equivalent to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb219790.aspx dst] instruction: DST = {1, SRC1[1]*SRC2[1], SRC1[2], SRC2[3]}<br />
|-<br />
| 0x05<br />
| 1u<br />
| EX2<br />
| Computes 2 raised to the power of SRC1's first component; DST[i] = EXP2(SRC1[0]) for all i<br />
|-<br />
| 0x06<br />
| 1u<br />
| LG2<br />
| Computes the base-2 logarithm of SRC1's first component; DST[i] = LOG2(SRC1[0]) for all i<br />
|-<br />
| 0x07<br />
| 1u<br />
| LITP<br />
| Appears to be related to Microsoft's [https://msdn.microsoft.com/en-us/library/windows/desktop/bb174703.aspx lit] instruction; DST = clamp(SRC1, min={0, -127.9961, 0, 0}, max={inf, 127.9961, 0, inf}); n.b.: 127.9961 = 0x7FFF / 0x100<br />
|-<br />
| 0x08<br />
| 1<br />
| MUL<br />
| Multiplies two vectors component by component; DST[i] = SRC1[i]*SRC2[i] for all i<br />
|-<br />
| 0x09<br />
| 1<br />
| SGE<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0A<br />
| 1<br />
| SLT<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x0B<br />
| 1u<br />
| FLR<br />
| Computes SRC1's floor component by component; DST[i] = FLOOR(SRC1[i]) for all i<br />
|-<br />
| 0x0C<br />
| 1<br />
| MAX<br />
| Takes the max of two vectors, component by component; DST[i] = MAX(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0D<br />
| 1<br />
| MIN<br />
| Takes the min of two vectors, component by component; DST[i] = MIN(SRC1[i], SRC2[i]) for all i<br />
|-<br />
| 0x0E<br />
| 1u<br />
| RCP<br />
| Computes the reciprocal of the vector's first component; DST[i] = 1/SRC1[0] for all i<br />
|-<br />
| 0x0F<br />
| 1u<br />
| RSQ<br />
| Computes the reciprocal of the square root of the vector's first component; DST[i] = 1/sqrt(SRC1[0]) for all i<br />
|-<br />
| 0x10<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x11<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x12<br />
| 1u<br />
| MOVA<br />
| Move to address register; Casts the float value given by SRC1 to an integer (truncating the fractional part) and assigns the result to (a0.x, a0.y, _, _), respecting the destination component mask.<br />
|-<br />
| 0x13<br />
| 1u<br />
| MOV<br />
| Moves value from one register to another; DST = SRC1.<br />
|-<br />
| 0x14<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x15<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x16<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x17<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x18<br />
| 1i<br />
| DPHI<br />
| Computes dot product on a 3-component vector with 1.0 appended to it and a 4-component vector; DST = SRC1.SRC2 (with SRC1 homogeneous)<br />
|-<br />
| 0x19<br />
| 1i<br />
| DSTI<br />
| DST with sources swapped.<br />
|-<br />
| 0x1A<br />
| 1i<br />
| SGEI<br />
| Sets output if SRC1 is greater than or equal to SRC2; DST[i] = (SRC1[i] >= SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1B<br />
| 1i<br />
| SLTI<br />
| Sets output if SRC1 is strictly less than SRC2; DST[i] = (SRC1[i] < SRC2[i]) ? 1.0 : 0.0 for all i<br />
|-<br />
| 0x1C<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1D<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1E<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x1F<br />
| ?<br />
| ???<br />
| ?<br />
|-<br />
| 0x20<br />
| 0<br />
| BREAK<br />
| Breaks out of LOOP block; do not use while in nested IF/CALL block inside LOOP block.<br />
|-<br />
| 0x21<br />
| 0<br />
| NOP<br />
| Does literally nothing.<br />
|-<br />
| 0x22<br />
| 0<br />
| END<br />
| Signals the shader unit that processing for this vertex/primitive is done.<br />
|-<br />
| 0x23<br />
| 2<br />
| BREAKC<br />
| If condition (see [[#Conditions|below]] for details) is true, then breaks out of LOOP block.<br />
|-<br />
| 0x24<br />
| 2<br />
| CALL<br />
| Jumps to DST and executes instructions until it reaches DST+NUM instructions<br />
|-<br />
| 0x25<br />
| 2<br />
| CALLC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x26<br />
| 3<br />
| CALLU<br />
| If BOOL is true, then jumps to DST and executes instructions until it reaches DST+NUM instructions, else does nothing.<br />
|-<br />
| 0x27<br />
| 3<br />
| IFU<br />
| If condition BOOL is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST.<br />
|-<br />
| 0x28<br />
| 2<br />
| IFC<br />
| If condition (see [[#Conditions|below]] for details) is true, then executes instructions until DST, then jumps to DST+NUM; else, jumps to DST<br />
|-<br />
| 0x29<br />
| 3<br />
| LOOP<br />
| Loops over the code between itself and DST (inclusive), performing INT.x+1 iterations in total. First, aL is initialized to INT.y. After each iteration, aL is incremented by INT.z.<br />
|-<br />
| 0x2A<br />
| 0 (no param)<br />
| EMIT<br />
| (geometry shader only) Emits a vertex (and primitive if FLAG_PRIMEMIT was set in the corresponding SETEMIT). SETEMIT must be called before this.<br />
|-<br />
| 0x2B<br />
| 4<br />
| SETEMIT<br />
| (geometry shader only) Sets VTXID, FLAG_WINDING and FLAG_PRIMEMIT for the next EMIT instruction. VTXID is the ID of the vertex about to be emitted within the primitive, while FLAG_PRIMEMIT is zero if we are just emitting a single vertex and non-zero if we are emitting a vertex and primitive simultaneously. FLAG_WINDING controls the output primitive's winding. Note that the output vertex buffer (which holds 4 vertices) is '''not''' cleared when the primitive is emitted, meaning that vertices from the previous primitive can be reused for the current one. (This is still a working hypothesis and unconfirmed.)<br />
|-<br />
| 0x2C<br />
| 2<br />
| JMPC<br />
| If condition (see [[#Conditions|below]] for details) is true, then jumps to DST, else does nothing.<br />
|-<br />
| 0x2D<br />
| 3<br />
| JMPU<br />
| If condition BOOL is true, then jumps to DST, else does nothing. Having bit 0 of NUM = 1 will invert the test, jumping if BOOL is false instead.<br />
|-<br />
| 0x2E-0x2F<br />
| 1c<br />
| CMP<br />
| Sets booleans cmp.x and cmp.y based on the operand's x and y components and the CMPX and CMPY comparison operators respectively. See [[#Comparison_operator|below]] for details about operators. It's unknown whether CMP respects the destination component mask or not.<br />
|-<br />
| 0x30-0x37<br />
| 5i<br />
| MADI<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|-<br />
| 0x38-0x3F<br />
| 5<br />
| MAD<br />
| Multiplies two vectors and adds a third one component by component; DST[i] = SRC3[i] + SRC2[i]*SRC1[i] for all i; this is not an FMA, the intermediate result is rounded<br />
|}<br />
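<br />
As an illustration of the LOOP description in the table above, the sequence of aL values seen by the loop body can be sketched as follows. The function name <code>loop_al_values</code> is hypothetical, and the sketch ignores any wrap-around of the register, whose width is not stated here.<br />

```python
def loop_al_values(int_x, int_y, int_z):
    """Return the value of aL seen by each iteration of a LOOP block."""
    values = []
    al = int_y                   # aL is first initialized to INT.y
    for _ in range(int_x + 1):   # INT.x + 1 iterations in total
        values.append(al)
        al += int_z              # aL is incremented by INT.z after each iteration
    return values
```

For instance, with INT = (3, 0, 2) the body would run four times, with aL taking the values 0, 2, 4 and 6.<br />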
<br />
== Operand descriptors ==<br />
Sizes below are in bits, not bytes.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Destination component mask. Bit 3 = x, 2 = y, 1 = z, 0 = w.<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Source 1 negation bit<br />
|-<br />
| 0x5<br />
| 0x8<br />
| Source 1 component selector<br />
|-<br />
| 0xD<br />
| 0x1<br />
| Source 2 negation bit<br />
|-<br />
| 0xE<br />
| 0x8<br />
| Source 2 component selector<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Source 3 negation bit<br />
|-<br />
| 0x17<br />
| 0x8<br />
| Source 3 component selector<br />
|}<br />
<br />
Component selector :<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Component 3 value<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Component 2 value<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Component 1 value<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Component 0 value<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Value<br />
! Component<br />
|-<br />
| 0x0<br />
| x<br />
|-<br />
| 0x1<br />
| y<br />
|-<br />
| 0x2<br />
| z<br />
|-<br />
| 0x3<br />
| w<br />
|}<br />
<br />
The component selector enables swizzling. For example, component selector 0x1B is equivalent to .xyzw, while 0x55 is equivalent to .yyyy.<br />
<br />
Depending on the current shader opcode, source components are disabled implicitly by setting the destination component mask. For example, ADD o0.xy, r0.xyzw, r1.xyzw will not make use of r0's or r1's z/w components, while DP4 o0.xy, r0.xyzw, r1.xyzw will use all input components regardless of the used destination component mask.<br />
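<br />
The descriptor layout and the swizzle examples above can be checked with a small decoder. This is a sketch only: the function names are hypothetical, and only the low 31 bits described in the tables above are interpreted.<br />

```python
COMPONENTS = "xyzw"


def swizzle(selector):
    """Turn an 8-bit component selector into a string such as 'xyzw'."""
    # Component 0 sits in bits 7-6, component 3 in bits 1-0.
    return "".join(COMPONENTS[(selector >> shift) & 3] for shift in (6, 4, 2, 0))


def dest_mask(mask):
    """Turn a 4-bit destination mask into a string (bit 3 = x ... bit 0 = w)."""
    return "".join(c for i, c in enumerate(COMPONENTS) if mask & (8 >> i))


def decode_descriptor(desc):
    """Split an operand descriptor word into mask and (negate, swizzle) pairs."""
    return {
        "mask": dest_mask(desc & 0xF),
        "src1": (bool((desc >> 0x04) & 1), swizzle((desc >> 0x05) & 0xFF)),
        "src2": (bool((desc >> 0x0D) & 1), swizzle((desc >> 0x0E) & 0xFF)),
        "src3": (bool((desc >> 0x16) & 1), swizzle((desc >> 0x17) & 0xFF)),
    }
```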
<br />
== Relative addressing ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! IDX raw value<br />
! Register name<br />
|-<br />
| 0x0<br />
| None<br />
|-<br />
| 0x1<br />
| a0.x<br />
|-<br />
| 0x2<br />
| a0.y<br />
|-<br />
| 0x3<br />
| aL<br />
|}<br />
<br />
There are 3 address registers: a0.x, a0.y and aL (loop counter). For format 1 instructions, when IDX != 0, the value of the corresponding address register is added to SRC1's value. For example, if IDX = 2, a0.y = 3 and SRC1 = c8, then c11 (SRC1+a0.y) is used for the instruction instead. It is only possible to use address registers on constant registers; attempting to use them on input attribute or temporary registers results in the address register being ignored (i.e. read as zero).<br />
<br />
a0.x and a0.y are set manually through the MOVA instruction, which converts a float value to an integer. Hence, they may take negative values.<br />
<br />
aL can only be set indirectly by the LOOP instruction. It is still accessible and valid after exiting a LOOP block, though.<br />
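<br />
The address-register rule described in this section can be sketched as below. The function name <code>effective_src</code> is hypothetical, raw values 0x20-0x7F are taken to be the constant registers (as in the SRC mapping table), and what the hardware does when the sum leaves that range is not covered.<br />

```python
def effective_src(src, idx, a0x=0, a0y=0, al=0):
    """Return the raw SRC1 value actually used by a format 1 instruction."""
    offset = (0, a0x, a0y, al)[idx]   # IDX: 0 = none, 1 = a0.x, 2 = a0.y, 3 = aL
    if 0x20 <= src <= 0x7F:           # constant registers c0-c95
        return src + offset
    return src                        # address register ignored on v/r registers
```

With IDX = 2, a0.y = 3 and SRC1 = c8 (raw 0x28), this yields raw 0x2B, i.e. c11, matching the example above.<br />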
<br />
== Comparison operator ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CMPX/CMPY raw value<br />
! Operator name<br />
! Expression<br />
|-<br />
| 0x0<br />
| EQ<br />
| src1 == src2<br />
|-<br />
| 0x1<br />
| NE<br />
| src1 != src2<br />
|-<br />
| 0x2<br />
| LT<br />
| src1 < src2<br />
|-<br />
| 0x3<br />
| LE<br />
| src1 <= src2<br />
|-<br />
| 0x4<br />
| GT<br />
| src1 > src2<br />
|-<br />
| 0x5<br />
| GE<br />
| src1 >= src2<br />
|-<br />
| 0x6<br />
| ??<br />
| true ?<br />
|-<br />
| 0x7<br />
| ??<br />
| true ?<br />
|}<br />
<br />
6 and 7 seem to always return true.<br />
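<br />
A sketch of the operator table above follows. The name <code>compare</code> is hypothetical, and the comparisons use Python float semantics, which may differ from the hardware for NaN and subnormal inputs.<br />

```python
import operator

# Raw CMPX/CMPY values 0x0-0x5, as listed in the table above.
CMP_OPS = {
    0: operator.eq,
    1: operator.ne,
    2: operator.lt,
    3: operator.le,
    4: operator.gt,
    5: operator.ge,
}


def compare(op, src1, src2):
    """Evaluate one CMP comparison for a single component."""
    if op in (6, 7):   # these seem to always return true
        return True
    return CMP_OPS[op](src1, src2)
```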
<br />
== Conditions ==<br />
<br />
A number of format 2 instructions are executed conditionally. These conditions are based on two boolean registers which can be set with CMP : cmp.x and cmp.y.<br />
<br />
Conditional instructions include 3 parameters : CONDOP, REFX and REFY. REFX and REFY are reference values which are tested for equality against cmp.x and cmp.y, respectively. CONDOP describes how the final truth value is constructed from the results of the two tests. There are four conditional expression formats :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! CONDOP raw value<br />
! Expression<br />
! Description<br />
|-<br />
| 0x0<br />
| <nowiki>cmp.x == REFX || cmp.y == REFY</nowiki><br />
| OR<br />
|-<br />
| 0x1<br />
| <nowiki>cmp.x == REFX && cmp.y == REFY</nowiki><br />
| AND<br />
|-<br />
| 0x2<br />
| cmp.x == REFX<br />
| X<br />
|-<br />
| 0x3<br />
| cmp.y == REFY<br />
| Y<br />
|}<br />
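<br />
The four CONDOP expressions above can be written out as a small function. This is a sketch with a hypothetical name (<code>condition</code>); all arguments other than CONDOP are booleans.<br />

```python
def condition(condop, refx, refy, cmp_x, cmp_y):
    """Evaluate the condition of a format 2 instruction."""
    x_ok = cmp_x == refx
    y_ok = cmp_y == refy
    if condop == 0:   # OR
        return x_ok or y_ok
    if condop == 1:   # AND
        return x_ok and y_ok
    if condop == 2:   # X only
        return x_ok
    if condop == 3:   # Y only
        return y_ok
    raise ValueError("CONDOP is a 2-bit field")
```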
<br />
== Registers ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Name<br />
! Format<br />
! Type<br />
! Access<br />
! Written by<br />
! Description<br />
|-<br />
| v0-v15<br />
| vector<br />
| float<br />
| Read only<br />
| Application/Vertex-stream<br />
| Input registers.<br />
|-<br />
| o0-o15<br />
| vector<br />
| float<br />
| Write only<br />
| Vertex shader<br />
| Output registers.<br />
|-<br />
| r0-r15<br />
| vector<br />
| float<br />
| Read/Write<br />
| Vertex shader<br />
| Temporary registers.<br />
|-<br />
| c0-c95<br />
| vector<br />
| float<br />
| Read only<br />
| Application<br />
| Floating-point Constant registers.<br />
|-<br />
| i0-i3<br />
| vector<br />
| integer<br />
| Read only<br />
| Application<br />
| Integer Constant registers. (special purpose)<br />
|-<br />
| b0-b15<br />
| scalar<br />
| boolean<br />
| Read only<br />
| Application<br />
| Boolean Constant registers. (special purpose)<br />
|-<br />
| a0.x & a0.y<br />
| scalar<br />
| integer<br />
| Use/Write<br />
| Vertex shader<br />
| Address registers.<br />
|-<br />
| aL<br />
| scalar<br />
| integer<br />
| Use<br />
| Application<br />
| Loop count register.<br />
|}<br />
<br />
Input attribute registers store the per-vertex data given by the CPU and hence are read-only.<br />
<br />
Output attribute registers hold the data to be passed to the later GPU stages and are write-only. Each of the output attribute register components is assigned a semantic by setting the corresponding [[GPU_Internal_Registers]]. Output registers o7-o15 are only available in vertex shaders.<br />
It appears that writing to the same component of an output register more than once can cause problems (e.g. GPU hangs).<br />
<br />
Temporary registers can be used for intermediate calculations and can be both read and written.<br />
<br />
Constant registers hold data uploaded by the application which remain constant throughout all processed vertices. There are 96 float[4] constant registers (c0-c95), sixteen boolean constant registers (b0-b15), and four int[4] constant registers (i0-i3).<br />
Many shader instructions which take float arguments can only provide the full 7 bits for one SRC operand. All other source operands can only be used to refer to input attributes or temporary registers and cannot be passed Floating-point Constant registers.<br />
<br />
Address registers and the Loop count register can be used to provide relative addressing for the designated SRC operand. For more information, see the section on [[#Relative_addressing|relative addressing]].<br />
<br />
DST mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! DST raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x6<br />
| o0-o6<br />
| Output registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|}<br />
<br />
SRC mapping :<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! SRC1 raw value<br />
! Register name<br />
! Description<br />
|-<br />
| 0x0-0x7<br />
| v0-v7<br />
| Input attribute registers.<br />
|-<br />
| 0x10-0x1F<br />
| r0-r15<br />
| Temporary registers.<br />
|-<br />
| 0x20-0x7F<br />
| c0-c95<br />
| Constant registers.<br />
|}<br />
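<br />
The DST and SRC mapping tables above translate directly into lookup helpers. This is a sketch with hypothetical function names; raw values not listed in the tables are reported as <code>None</code> rather than guessed.<br />

```python
def src_name(raw):
    """Map a raw SRC value to a register name, following the SRC table."""
    if 0x00 <= raw <= 0x07:
        return f"v{raw}"          # input attribute registers
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"   # temporary registers
    if 0x20 <= raw <= 0x7F:
        return f"c{raw - 0x20}"   # constant registers
    return None                   # not listed in the table


def dst_name(raw):
    """Map a raw DST value to a register name, following the DST table."""
    if 0x00 <= raw <= 0x06:
        return f"o{raw}"          # output registers
    if 0x10 <= raw <= 0x1F:
        return f"r{raw - 0x10}"   # temporary registers
    return None                   # not listed in the table
```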
<br />
== Floating-Point Behavior ==<br />
<br />
The PICA200 is not IEEE-compliant. It has positive and negative infinities and NaN, but does not seem to have negative 0. Input and output subnormals are flushed to +0. The internal floating point format seems to be the same as used in shader binaries: 1 sign bit, 7 exponent bits, 16 (explicit) mantissa bits. Several instructions also have behavior that differs from the IEEE functions. Here are the results from some tests done on hardware (s = largest subnormal, n = smallest positive normal):<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Computation<br />
! Result<br />
! Notes<br />
|-<br />
| inf * 0<br />
| 0<br />
| Including inside MUL, MAD, DP4, etc.<br />
|-<br />
| NaN * 0<br />
| NaN<br />
| <br />
|-<br />
| +inf - +inf<br />
| NaN<br />
| Indicates +inf is real inf, not FLT_MAX<br />
|-<br />
| rsq(rcp(-inf))<br />
| +inf<br />
| Indicates that there isn't -0.0.<br />
<br />
|- style="border-top: double"<br />
| rcp(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rcp(-0) = -inf <br />
|-<br />
| rcp(0)<br />
| +inf<br />
| <br />
|-<br />
| rcp(+inf)<br />
| 0<br />
| <br />
|-<br />
| rcp(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| rsq(-0)<br />
| +inf<br />
| no -0 so differs from IEEE where rsq(-0) = -inf <br />
|-<br />
| rsq(-2)<br />
| NaN<br />
| <br />
|-<br />
| rsq(+inf)<br />
| 0<br />
| <br />
|-<br />
| rsq(-inf)<br />
| NaN<br />
| <br />
|-<br />
| rsq(NaN)<br />
| NaN<br />
| <br />
<br />
|- style="border-top: double"<br />
| max(0, +inf)<br />
| +inf<br />
| <br />
|-<br />
| max(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| max(0, NaN)<br />
| NaN<br />
| max violates IEEE but matches the GLSL spec<br />
|-<br />
| max(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| max(-inf, +inf)<br />
| +inf<br />
| <br />
<br />
|- style="border-top: double"<br />
| min(0, +inf)<br />
| 0<br />
| <br />
|-<br />
| min(0, -inf)<br />
| -inf<br />
| <br />
|-<br />
| min(0, NaN)<br />
| NaN<br />
| min violates IEEE but matches the GLSL spec<br />
|-<br />
| min(NaN, 0)<br />
| 0<br />
| <br />
|-<br />
| min(-inf, +inf)<br />
| -inf<br />
|<br />
<br />
|- style="border-top: double"<br />
| cmp(s, 0)<br />
| false<br />
| cmp does not flush input subnormals<br />
|-<br />
| max(s, 0)<br />
| s<br />
| max does not flush input or output subnormals<br />
|-<br />
| mul(s, 2)<br />
| 0<br />
| input subnormals are flushed in arithmetic instructions<br />
|-<br />
| mul(n, 0.5)<br />
| 0<br />
| output subnormals are flushed in arithmetic instructions<br />
|}<br />
<br />
1.0 can be multiplied 63 times by 0.5 until the result compares equal to zero. This is consistent with a 7-bit exponent and output subnormal flushing.<br />
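<br />
The 63-halvings observation can be reproduced with a small model of the 24-bit format. This is a sketch under stated assumptions: the sign is taken to be in bit 23, the exponent in bits 22-16 with a bias of 63, the mantissa in bits 15-0, and an all-ones exponent is assumed to encode inf/NaN. The function names are hypothetical.<br />

```python
import math


def f24_to_float(raw):
    """Decode a 24-bit float (1 sign, 7 exponent, 16 mantissa bits)."""
    sign = (raw >> 23) & 1
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0:
        return 0.0   # zero and subnormals (flushed); no negative zero
    if exponent == 0x7F:
        # Assumption: all-ones exponent encodes inf (mantissa 0) or NaN.
        value = math.inf if mantissa == 0 else math.nan
    else:
        value = (1 + mantissa / 0x10000) * 2.0 ** (exponent - 63)
    return -value if sign else value


def halvings_until_zero():
    """Count multiplications of 1.0 by 0.5 until the result flushes to zero."""
    exponent = 63        # biased exponent of 1.0 (assumed bias of 63)
    count = 0
    while exponent > 0:  # exponent field 0 means subnormal, flushed to 0
        exponent -= 1    # halving a normal number decrements the exponent
        count += 1
    return count
```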
<br />
== Control Flow ==<br />
<br />
Control flow is implemented using three independent stacks:<br />
<br />
* 4-deep CALL stack<br />
* 8-deep IF stack<br />
* 4-deep LOOP stack<br />
<br />
All stacks are initially empty. After every instruction but before JMP takes effect, the PC is incremented and a copy is sent to each stack. Each stack is checked against its copy of the PC. If an entry is popped from the stack, the copied PC is updated and used for the next check of this stack, although the IF/LOOP stacks can each only pop one entry per instruction, whereas the CALL stack is checked again until it doesn't match or the stack is empty. The updated PC copy with the highest priority wins: LOOP (highest), IF, CALL, JMP, original PC (lowest).<br />
<br />
Special cases:<br />
* JMP overwrites the PC ''after'' the stack checks (and only if no stack was popped).<br />
* Executing a BREAK on an empty LOOP stack hangs the GPU.<br />
* A stack overflow discards the oldest element, so you could think of it as a queue or a ring buffer.<br />
* If the CALL stack is popped four times in a row, the fourth update to its copy of the PC is missed (the third PC update will be propagated). Probably a hardware bug.</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21524
SHBIN
2021-05-30T03:43:54Z
<p>Oreo639: </p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats like, for example, [[CGFX]] (with the extension .bcsdr). They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc). There may be multiple executable headers, so in this sense multiple shaders sharing the same program code can be stored in a single SHBIN. Hence for the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
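<br />
Reading the binary header can be sketched as follows. The function name <code>parse_dvlb</code> is hypothetical, and the little-endian byte order is an assumption not stated in this section.<br />

```python
import struct


def parse_dvlb(data):
    """Return (DVLE offsets, DVLP offset) from the start of a SHBIN file."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN file")
    # One u32 per DVLE, relative to the start of the DVLB section.
    offsets = list(struct.unpack_from(f"<{count}I", data, 8))
    dvlp_offset = 8 + 4 * count   # the DVLP follows the offset table directly
    return offsets, dvlp_offset
```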
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
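<br />
The DVLP fields above map onto a fixed 0x28-byte record. The following is a sketch only: the tuple and function names are hypothetical, and little-endian byte order is assumed.<br />

```python
import struct
from collections import namedtuple

DVLP = namedtuple(
    "DVLP",
    "magic version blob_offset blob_words desc_offset desc_count "
    "unknown_18 unknown_1c sym_offset sym_size",
)


def parse_dvlp(data, offset=0):
    """Unpack the 0x28-byte DVLP header found at `offset`."""
    header = DVLP._make(struct.unpack_from("<4s9I", data, offset))
    if header.magic != b"DVLP":
        raise ValueError("DVLP magic not found")
    return header


def shader_words(data, dvlp_offset, header):
    """Return the compiled shader blob as a list of 32-bit words."""
    start = dvlp_offset + header.blob_offset   # offsets are relative to DVLP start
    return list(struct.unpack_from(f"<{header.blob_words}I", data, start))
```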
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float uniform register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
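<br />
The DVLE header above is a fixed 0x40-byte record and can be unpacked in one call. This is a sketch: the field and function names are hypothetical, and little-endian byte order is assumed.<br />

```python
import struct
from collections import namedtuple

DVLE_FORMAT = "<4sHBBIIHHBBBB10I"   # 0x40 bytes, no padding

DVLE = namedtuple(
    "DVLE",
    "magic version shader_type merge_outmaps main_offset endmain_offset "
    "input_mask output_mask gsh_type gsh_fixed_start gsh_variable_count "
    "gsh_fixed_count const_offset const_count label_offset label_count "
    "outreg_offset outreg_count uniform_offset uniform_count "
    "symbol_offset symbol_size",
)


def parse_dvle(data, offset=0):
    """Unpack the DVLE header found at `offset`."""
    header = DVLE._make(struct.unpack_from(DVLE_FORMAT, data, offset))
    if header.magic != b"DVLE":
        raise ValueError("DVLE magic not found")
    return header
```

The table offsets in the result are relative to the start of the DVLE, so they need to be added to the DVLE's own position in the file before use.<br />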
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry consists of a header followed by the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
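<br />
The three entry layouts above can be told apart by the type byte. The following is a sketch with a hypothetical function name; little-endian byte order is assumed, and vec4 components are returned as raw float24 words rather than converted.<br />

```python
import struct


def parse_constant(entry):
    """Decode one 0x14-byte constant table entry."""
    ctype, uniform_id = entry[0], entry[2]
    if ctype == 0:   # bool: one value byte at offset 0x4
        return ("bool", uniform_id, bool(entry[4]))
    if ctype == 1:   # ivec4: four u8 components at offsets 0x4-0x7
        return ("ivec4", uniform_id, tuple(entry[4:8]))
    if ctype == 2:   # vec4: four 32-bit words holding float24 values
        return ("vec4", uniform_id, struct.unpack_from("<4I", entry, 4))
    raise ValueError("unknown constant type")
```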
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
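<br />
An output table entry can be decoded from its 8 bytes as sketched below. The function name is hypothetical, little-endian byte order is assumed, and the mask bit order (bit 0 = x) is inferred from the "5=xz" example above.<br />

```python
OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}


def parse_output_entry(entry):
    """Return (output type, register name, components) for one 8-byte entry."""
    value = int.from_bytes(entry[:8], "little")
    out_type = value & 0xF           # bits 0-3
    register = (value >> 16) & 0xF   # bits 16-19
    mask = (value >> 32) & 0xF       # bits 32-35
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    return OUTPUT_TYPES.get(out_type, "?"), f"o{register}", components
```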
<br />
=== Uniform Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Variable start register<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Variable end register (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
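<br />
The unified register space above can be mapped back to register names as follows. This is a sketch with a hypothetical function name; indices outside the listed ranges are reported as <code>None</code>.<br />

```python
def uniform_register(index):
    """Map a uniform table register index to a register name."""
    if 0x00 <= index <= 0x0F:
        return f"v{index}"          # input registers
    if 0x10 <= index <= 0x6F:
        return f"c{index - 0x10}"   # float constant registers
    if 0x70 <= index <= 0x73:
        return f"i{index - 0x70}"   # integer constant registers
    if 0x78 <= index <= 0x87:
        return f"b{index - 0x78}"   # boolean constant registers
    return None                     # not listed in the table
```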
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or multiple DVOJs can be linked into a DVLB file, similarly to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21523
SHBIN
2021-05-30T00:42:23Z
<p>Oreo639: Clarify introduction</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) format is used to contain compiled and linked shader programs. These can include vertex shaders and geometry shaders. In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or within container formats such as [[CGFX]]. They are typically compiled from .vsh files, .gsh files, and sometimes .asm files.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The binary header specifies the number and location of DVLEs. The program header specifies the generic parts of the shader (i.e. the shader program data, the operand descriptor data, and a filename symbol table). The executable headers specify the contextual details (i.e. entry point, constant values, debug symbols, etc.). There may be multiple executable headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
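The header layout above can be read with a few lines of Python. This is only a sketch: it assumes all fields are little-endian u32 values, and the function names <code>parse_dvlb</code> and <code>dvlp_offset</code> are made up for illustration.

```python
import struct

def parse_dvlb(data: bytes) -> list:
    """Return the DVLE offsets listed in a DVLB header (relative to DVLB start)."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a DVLB header")
    # The offset table starts at 0x8 and holds one u32 per DVLE.
    return list(struct.unpack_from(f"<{count}I", data, 8))

def dvlp_offset(dvle_count: int) -> int:
    """The DVLP follows the header directly: 0x8 bytes plus 0x4 per DVLE offset."""
    return 8 + 4 * dvle_count
```

For a file with two DVLEs, the DVLP would therefore be expected at offset 0x10.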
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
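As an illustration of the table above, the following sketch unpacks the 0x28-byte DVLP header and extracts the shader blob. It assumes little-endian fields and 32-bit instruction words; the class and field names are hypothetical and only mirror the table rows.

```python
import struct
from typing import NamedTuple

class DVLPHeader(NamedTuple):
    magic: bytes
    version: int            # unknown field at 0x4, possibly a version number
    blob_offset: int        # relative to DVLP start
    blob_size_words: int
    opdesc_offset: int      # relative to DVLP start
    opdesc_count: int
    unknown_18: int
    unknown_1c: int
    symtab_offset: int      # relative to DVLP start
    symtab_size: int

def parse_dvlp(data: bytes, base: int) -> DVLPHeader:
    """Unpack the DVLP header located at byte offset `base`."""
    hdr = DVLPHeader(*struct.unpack_from("<4s9I", data, base))
    if hdr.magic != b"DVLP":
        raise ValueError("not a DVLP header")
    return hdr

def shader_words(data: bytes, base: int, hdr: DVLPHeader) -> list:
    """Return the shader blob as a list of 32-bit words (assumed word size)."""
    start = base + hdr.blob_offset
    return list(struct.unpack_from(f"<{hdr.blob_size_words}I", data, start))
```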
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float uniform register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
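The DVLE header is 0x40 bytes long according to the table above. A minimal parsing sketch follows; it assumes little-endian fields, and the field names are invented here for readability. The helper <code>used_registers</code> additionally assumes that bit i of a register bitmask corresponds to register i, which the table does not state.

```python
import struct

DVLE_FORMAT = "<4sHBBIIHHBBBB10I"
DVLE_FIELDS = (
    "magic", "version", "shader_type", "merge_outmaps",
    "main_offset", "endmain_offset", "input_mask", "output_mask",
    "gsh_type", "gsh_fixed_start", "gsh_variable_count", "gsh_fixed_count",
    "const_offset", "const_count", "label_offset", "label_count",
    "outreg_offset", "outreg_count", "uniform_offset", "uniform_count",
    "symbol_offset", "symbol_size",
)

def parse_dvle(data: bytes, base: int) -> dict:
    """Unpack the DVLE header located at byte offset `base`."""
    values = struct.unpack_from(DVLE_FORMAT, data, base)
    hdr = dict(zip(DVLE_FIELDS, values))
    if hdr["magic"] != b"DVLE":
        raise ValueError("not a DVLE header")
    return hdr

def used_registers(mask: int) -> list:
    """List register numbers set in a 16-bit mask (assumes bit i = register i)."""
    return [i for i in range(16) if mask & (1 << i)]
```

Table offsets in the returned dictionary are relative to the DVLE start, so they need <code>base</code> added before indexing into the file.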
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
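A sketch of how the label table might be walked. It assumes little-endian fields, treats the three bytes after the label ID as padding, and assumes symbols are NUL-terminated ASCII strings; none of this is confirmed by the table above, and the function names are hypothetical.

```python
import struct

def read_symbol(symtab: bytes, offset: int) -> str:
    """Read a NUL-terminated string from the symbol table (assumed encoding)."""
    end = symtab.index(b"\x00", offset)
    return symtab[offset:end].decode("ascii")

def parse_labels(table: bytes, count: int, symtab: bytes) -> list:
    """Return (label ID, location in words, symbol name) for each 0x10-byte entry."""
    labels = []
    for i in range(count):
        label_id, location, _unknown, sym_off = struct.unpack_from(
            "<B3xIII", table, i * 0x10)
        labels.append((label_id, location, read_symbol(symtab, sym_off)))
    return labels
```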
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant uniform table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry consists of a header followed by the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
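The three entry formats above could be decoded as sketched below. Two assumptions are made that the tables do not spell out: each float24 component occupies the low 24 bits of a little-endian u32, and float24 uses 1 sign bit, 7 exponent bits (bias 63) and 16 mantissa bits. Denormals and special values are not handled. The function names are illustrative.

```python
import struct

def float24_to_float(raw: int) -> float:
    """Convert an assumed 1/7/16-bit float24 (bias 63) to a Python float."""
    raw &= 0xFFFFFF
    sign = -1.0 if raw >> 23 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0 and mantissa == 0:
        return sign * 0.0
    # Every other value is treated as a normal number in this sketch.
    return sign * (1 + mantissa / 65536) * 2.0 ** (exponent - 63)

def parse_constant(entry: bytes) -> tuple:
    """Decode one 0x14-byte constant table entry into (type, uniform ID, value)."""
    ctype, uniform_id = entry[0], entry[2]
    if ctype == 0:
        return ("bool", uniform_id, bool(entry[4]))
    if ctype == 1:
        return ("ivec4", uniform_id, tuple(entry[4:8]))
    if ctype == 2:
        words = struct.unpack_from("<4I", entry, 4)
        return ("vec4", uniform_id, tuple(float24_to_float(w) for w in words))
    raise ValueError(f"unknown constant type {ctype}")
```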
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
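The bit fields of an output table entry can be extracted as in the following sketch. It assumes the 8-byte entry is read as one little-endian 64-bit value and that mask bit 0 is x, bit 1 is y, bit 2 is z and bit 3 is w, which is consistent with the "5=xz" example above but not otherwise confirmed.

```python
import struct

OUTPUT_TYPES = {
    0x0: "result.position", 0x1: "result.normalquat", 0x2: "result.color",
    0x3: "result.texcoord0", 0x4: "result.texcoord0w", 0x5: "result.texcoord1",
    0x6: "result.texcoord2", 0x8: "result.view",
}

def parse_output_entry(entry: bytes) -> tuple:
    """Return (output type name, register ID, component string) for one entry."""
    (value,) = struct.unpack_from("<Q", entry, 0)
    out_type = value & 0xF             # bits 0-3
    register = (value >> 16) & 0xF     # bits 16-19
    mask = (value >> 32) & 0xF         # bits 32-35
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    return OUTPUT_TYPES.get(out_type, "unknown"), register, components
```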
<br />
=== Uniform Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Variable start register<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Variable end register (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
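The register mapping above translates directly into a small lookup. The sketch below also unpacks a uniform table entry, assuming little-endian fields; <code>register_name</code> and <code>parse_uniform</code> are hypothetical helper names.

```python
import struct

def register_name(index: int) -> str:
    """Map a unified register index to its register name, per the table above."""
    if 0x00 <= index <= 0x0F:
        return f"v{index}"
    if 0x10 <= index <= 0x6F:
        return f"c{index - 0x10}"
    if 0x70 <= index <= 0x73:
        return f"i{index - 0x70}"
    if 0x78 <= index <= 0x87:
        return f"b{index - 0x78}"
    raise ValueError(f"register index {index:#x} is not in the mapping")

def parse_uniform(entry: bytes) -> tuple:
    """Return (symbol offset, start register, end register) for one 8-byte entry."""
    sym_off, start, end = struct.unpack_from("<IHH", entry, 0)
    return sym_off, register_name(start), register_name(end)
```

An entry with start 0x10 and end 0x13 would thus describe a four-register float array spanning c0 to c3.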
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to how object files are linked in the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVOJ start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8-byte long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12-byte long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
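Since the DVOJ header is a flat run of u32 fields after the magic, it can be unpacked in one call. The sketch below assumes little-endian fields; the field names are invented to match the table rows and carry no official meaning.

```python
import struct

DVOJ_FORMAT = "<4s21I"   # 0x58 bytes in total
DVOJ_FIELDS = (
    "magic", "unknown_04", "unknown_08", "padding_0c",
    "const_offset", "const_count", "label_offset", "label_count",
    "blob_offset", "blob_size_words", "ext_offset", "ext_count",
    "unk1_offset", "unk1_count", "unk2_offset", "unk2_count",
    "outreg_offset", "outreg_count", "uniform_offset", "uniform_count",
    "symbol_offset", "symbol_size",
)

def parse_dvoj(data: bytes) -> dict:
    """Unpack a DVOJ header from the start of `data`."""
    hdr = dict(zip(DVOJ_FIELDS, struct.unpack_from(DVOJ_FORMAT, data, 0)))
    if hdr["magic"] != b"DVOJ":
        raise ValueError("not a DVOJ header")
    return hdr
```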
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
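If the guess above is right, a per-instruction source location could be recovered as sketched here. Everything in this example is speculative: it assumes little-endian fields and NUL-terminated ASCII filenames in the symbol table, and <code>parse_line_info</code> is a made-up name.

```python
import struct

def parse_line_info(block: bytes, count: int, symtab: bytes) -> list:
    """Return (source filename, line number) for each instruction index."""
    info = []
    for i in range(count):
        name_off, line = struct.unpack_from("<II", block, i * 8)
        end = symtab.index(b"\x00", name_off)
        info.append((symtab[name_off:end].decode("ascii"), line))
    return info
```

The Nth element of the returned list would then describe the Nth instruction of the shader binary.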
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21522
SHBIN
2021-05-25T05:27:34Z
<p>Oreo639: Update DVLP information</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) file is used to contain compiled and linked shader programs. These can include vertex shaders (typically compiled from .vsh files) and geometry shaders (typically compiled from .gsh files, though .asm files have been observed). In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or contained within .bcsdr files. BCSDR files use [[CGFX]] as a container, but the underlying DVLB/DVLP/DVLE structure remains unchanged.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The program header specifies the generic parts of the shader, i.e. the shader program data, the operand descriptor data, and a filename symbol table. The contextual details (entry point, constant values, debug symbols, etc.) are specified in an executable header (DVLE). There may be multiple DVLE headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown (Same value as offset to filename symbol table?)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown (Always zero?)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of filename symbol table<br />
|-<br />
|}<br />
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float uniform register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant uniform table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. An entry consists of a header followed by the constant data, the latter of which uses a format specific to the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types :<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
<br />
=== Uniform Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Variable start register<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Variable end register (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or more DVOJs can be linked into a DVLB file, similar to how object files are linked in the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVOJ start) to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8-byte long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12-byte long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21521
SHBIN
2021-05-24T07:32:11Z
<p>Oreo639: Clarify unknowns in DVLE and DVLP and further fill out DVLP header</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) file is used to contain compiled and linked shader programs. These can include vertex shaders (typically compiled from .vsh files) and geometry shaders (typically compiled from .gsh files, though .asm files have been observed). In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or contained within .bcsdr files. BCSDR files use [[CGFX]] as a container, but the underlying DVLB/DVLP/DVLE structure remains unchanged.<br />
<br />
A SHBIN's structure starts with a binary header (DVLB), then a single program header (DVLP), then one or more executable headers (DVLEs). The program header specifies the generic parts of the shader, i.e. the shader program data, the operand descriptor data, and a filename symbol table. The contextual details (entry point, constant values, debug symbols, etc.) are specified in an executable header (DVLE). There may be multiple DVLE headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in the following, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page : [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the binary header.<br />
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Unknown, same value as in DVLE. (Possibly a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to operand descriptor table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of operand descriptor table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Unknown<br />
|-<br />
|}<br />
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown, same value as in DVLP. (Possibly a version number?)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's program's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float uniform register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14-byte long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10-byte long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8-byte long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8-byte long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
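<br />
The layout above can be read with a few lines of Python. This is only a minimal sketch: the names <code>LabelEntry</code>, <code>parse_label_entry</code> and <code>read_symbol</code> are hypothetical, and it assumes little-endian fields and NUL-terminated ASCII symbols, neither of which is stated in the table.<br />

```python
import struct
from typing import NamedTuple


class LabelEntry(NamedTuple):
    label_id: int       # 0x0, one byte
    location: int       # 0x4, offset into the shader program blob, in words
    unknown: int        # 0x8, meaning not documented
    symbol_offset: int  # 0xC, offset into the DVLE symbol table


def parse_label_entry(data: bytes, offset: int = 0) -> LabelEntry:
    # "<B3xIII": u8 label ID, three skipped bytes, then three u32 fields.
    # Little-endian byte order is an assumption.
    return LabelEntry(*struct.unpack_from("<B3xIII", data, offset))


def read_symbol(symbol_table: bytes, offset: int) -> str:
    # Assumption: symbols are NUL-terminated ASCII strings.
    end = symbol_table.index(b"\0", offset)
    return symbol_table[offset:end].decode("ascii")
```

Entry ''i'' of the label table would then start at <code>label_table_offset + 0x10 * i</code>, relative to the DVLE start.<br />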
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in a constant uniform table. This information is used by ctrulib's SHDR framework to automatically send those values to the GPU when changing to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
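<br />
The three entry formats can be decoded as sketched below. The function names are hypothetical. The sketch assumes little-endian fields, and it assumes that each float24 value sits in the low 24 bits of its 4-byte slot with 1 sign bit, 7 exponent bits (bias 63) and 16 mantissa bits; that bit layout is not given in the tables above, and infinities or denormals are not handled.<br />

```python
import struct


def decode_float24(raw: int) -> float:
    # Assumed layout: bit 23 = sign, bits 16-22 = exponent (bias 63),
    # bits 0-15 = mantissa with an implicit leading one.
    raw &= 0xFFFFFF
    sign = -1.0 if (raw >> 23) & 1 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0 and mantissa == 0:
        return sign * 0.0
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)


def parse_constant_entry(data: bytes, offset: int = 0) -> dict:
    # Header: type at 0x0, uniform ID at 0x2 (the byte at 0x1 is skipped).
    ctype, uniform_id = struct.unpack_from("<BxB", data, offset)
    if ctype == 0:  # bool: one byte at 0x4
        (value,) = struct.unpack_from("<B", data, offset + 4)
        return {"type": "bool", "id": uniform_id, "value": bool(value)}
    if ctype == 1:  # ivec4: four u8 components at 0x4..0x7
        xyzw = struct.unpack_from("<4B", data, offset + 4)
        return {"type": "ivec4", "id": uniform_id, "value": xyzw}
    if ctype == 2:  # vec4: four float24 values, one per 4-byte slot
        raw = struct.unpack_from("<4I", data, offset + 4)
        xyzw = tuple(decode_float24(r) for r in raw)
        return {"type": "vec4", "id": uniform_id, "value": xyzw}
    raise ValueError("unknown constant type %d" % ctype)
```

Every entry occupies 0x14 bytes regardless of type, so shorter formats simply leave the remaining bytes unused.<br />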
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types:<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
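<br />
As an illustration of the bit layout and type IDs above, an 8-byte output table entry could be decoded as follows. This is a sketch under assumptions: the entry is treated as one little-endian 64-bit value, the names are hypothetical, and the component order (bit 0 = x up to bit 3 = w) is inferred from the "5=xz" example.<br />

```python
import struct

# Type 0x7 is unknown in the table above, so it is deliberately left out.
OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}


def parse_output_entry(data: bytes, offset: int = 0):
    # Assumption: the 8-byte entry is a single little-endian u64.
    (raw,) = struct.unpack_from("<Q", data, offset)
    out_type = raw & 0xF             # bits 0-3
    register_id = (raw >> 16) & 0xF  # bits 16-19
    mask = (raw >> 32) & 0xF         # bits 32-35
    components = "".join(c for i, c in enumerate("xyzw") if mask & (1 << i))
    name = OUTPUT_TYPES.get(out_type, "unknown(0x%X)" % out_type)
    return name, register_id, components
```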
<br />
=== Uniform Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Variable start register<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Variable end register (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
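<br />
The mapping above translates directly into a small lookup. The sketch below uses hypothetical function names and assumes little-endian fields in the uniform table entry.<br />

```python
import struct


def register_name(index: int) -> str:
    # Ranges follow the mapping table above; gaps raise an error.
    if 0x00 <= index <= 0x0F:
        return "v%d" % index
    if 0x10 <= index <= 0x6F:
        return "c%d" % (index - 0x10)
    if 0x70 <= index <= 0x73:
        return "i%d" % (index - 0x70)
    if 0x78 <= index <= 0x87:
        return "b%d" % (index - 0x78)
    raise ValueError("register index 0x%X is not mapped" % index)


def parse_uniform_entry(data: bytes, offset: int = 0):
    # u32 symbol offset, u16 start register, u16 end register.
    symbol_offset, start, end = struct.unpack_from("<IHH", data, offset)
    return symbol_offset, register_name(start), register_name(end)
```

A four-register array starting at c0 would therefore be stored with start register 0x10 and end register 0x13.<br />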
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or multiple DVOJs can be linked into a DVLB file, similarly to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
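<br />
If the guess above holds, looking up the source location of the Nth instruction could look like the sketch below. It inherits the uncertainty of that guess, and it additionally assumes little-endian fields and NUL-terminated ASCII filenames; <code>source_location</code> is a hypothetical name.<br />

```python
import struct


def source_location(block1: bytes, symbol_table: bytes, instruction_index: int):
    # Item N of Unknown Block 1 describes the Nth instruction; items are 8 bytes.
    name_offset, line = struct.unpack_from("<II", block1, instruction_index * 8)
    # Assumption: the filename is a NUL-terminated ASCII string.
    end = symbol_table.index(b"\0", name_offset)
    return symbol_table[name_offset:end].decode("ascii"), line
```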
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Libraries_and_Tools&diff=21507
Homebrew Libraries and Tools
2021-03-24T05:49:05Z
<p>Oreo639: Remove direct master zip links</p>
<hr />
<div>This is a list of libraries and tools that can be used to develop 3DS Homebrew.<br />
<br />
== Libraries ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/devkitPro/libctru libctru]<br />
| C library for writing user mode ARM11 code for the 3DS (CTR) <br />
| [https://twitter.com/smealum smea] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro3d citro3d]<br />
| Stateful PICA200 GPU wrapper library for the Nintendo 3DS<br />
| [https://github.com/fincs fincs]<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro2d citro2d]<br />
| Library for drawing 2D graphics using the Nintendo 3DS's PICA200 GPU<br />
| [https://github.com/fincs fincs]<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/xerpi/sf2dlib sf2dlib]<br />
| Simple and Fast 2D library for the Nintendo 3DS (using libctru and citro3d)<br />
| [https://github.com/xerpi xerpi]<br />
| [https://github.com/xerpi/sf2dlib/ Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
| [https://github.com/cpp3ds/gl3ds gl3ds]<br />
| OpenGL implementation for Nintendo 3DS using libctru<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/gl3ds/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/machinamentum/Caelina Caelina]<br />
| An OpenGL implementation for (N)3DS<br />
| [https://github.com/machinamentum machinamentum]<br />
| [https://github.com/machinamentum/Caelina/releases/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/Myriachan/libkhax libkhax]<br />
| Library for modifying kernel memory on a certain handheld game console.<br />
| [https://github.com/Myriachan Myria] et al.<br />
| [https://github.com/Myriachan/libkhax/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/cpp3ds/cpp3ds cpp3ds]<br />
| Object-oriented C++ game library and port of [http://www.sfml-dev.org/ SFML]<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/cpp3ds/releases/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/BtheDestroyer/SpriteTools SpriteTools]<br />
| Extension to SF2D, adding support for things like animations<br />
| [https://github.com/BtheDestroyer BtheDestroyer]<br />
| [https://github.com/BtheDestroyer/SpriteTools/releases/ Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
|}<br />
<br />
== PC Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [http://devkitpro.org/ devkitARM]<br />
| GCC-based toolchain tuned for homebrew development for ARM-based consoles.<br />
| [https://github.com/WinterMute WinterMute] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| [https://github.com/devkitPro Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/aemstro aemstro]<br />
| Set of tools used to disassemble and assemble shader code for DMP's MAESTRO shader extension used in the 3DS's PICA200 GPU<br />
| [https://twitter.com/smealum smea]<br />
| [https://github.com/smealum/aemstro/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/devkitPro/picasso picasso]<br />
| Homebrew PICA200 shader assembler<br />
| [https://github.com/fincs fincs]<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [http://4dsdev.org/thread.php?id=14 nihstro]<br />
| 3DS shader assembler and disassembler <br />
| [https://github.com/neobrain neobrain]<br />
| [http://4dsdev.org/thread.php?id=14 Here]<br />
| [https://github.com/neobrain/nihstro Yes]<br />
| No<br />
|-<br />
| [https://github.com/Lectem/3ds-cmake 3ds-cmake]<br />
| CMake files for devkitARM and 3DS homebrew development<br />
| [https://github.com/Lectem Lectem]<br />
| [https://github.com/Lectem/3ds-cmake/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [[Makerom|makerom]]<br />
| Tool which can be used to create NCCH, CCI, and CIA files. <br />
| [[User:3dsguy|3dsguy]], maintained by [https://github.com/profi200 profi200]<br />
| [https://github.com/profi200/Project_CTR/releases/ Here]<br />
| [https://github.com/profi200/Project_CTR/tree/master/makerom Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/Steveice10/bannertool bannertool]<br />
| Tool to create NCCH banners<br />
| [https://github.com/Steveice10 Steveice10]<br />
| [https://github.com/Steveice10/bannertool/releases/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/socram8888/amiitool amiitool]<br />
| Tool to decrypt, encrypt and sign amiibo dumps<br />
| [https://github.com/socram8888 socram8888]<br />
| [https://github.com/socram8888/amiitool/releases/ Here]<br />
| Yes<br />
| No<br />
|}<br />
<br />
== 3DS Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/neobrain/braindump braindump]<br />
| Tool to dump ExeFS/RomFS data from games and other applications<br />
| [https://github.com/neobrain neobrain]<br />
| [https://github.com/neobrain/braindump/releases/ Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/citra-emu/uncart uncart]<br />
| Utility to dump game cartridges to the SD card<br />
| [https://github.com/neobrain neobrain] et al.<br />
| Build from [https://github.com/citra-emu/uncart repo]<br />
| Yes<br />
| No<br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Libraries_and_Tools&diff=21506
Homebrew Libraries and Tools
2021-03-24T05:36:26Z
<p>Oreo639: Remove parx-3ds and lovepotion</p>
<hr />
<div>This is a list of libraries and tools that can be used to develop 3DS Homebrew.<br />
<br />
== Libraries ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/devkitPro/libctru libctru]<br />
| C library for writing user mode ARM11 code for the 3DS (CTR) <br />
| [https://twitter.com/smealum smea] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro3d citro3d]<br />
| Stateful PICA200 GPU wrapper library for the Nintendo 3DS<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/devkitPro/citro3d/archive/master.zip Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro2d citro2d]<br />
| Library for drawing 2D graphics using the Nintendo 3DS's PICA200 GPU<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/fincs/citro2d/archive/master.zip Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/xerpi/sf2dlib sf2dlib]<br />
| Simple and Fast 2D library for the Nintendo 3DS (using libctru and citro3d)<br />
| [https://github.com/xerpi xerpi]<br />
| [https://github.com/xerpi/sf2dlib/archive/master.zip Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
| [https://github.com/cpp3ds/gl3ds gl3ds]<br />
| OpenGL implementation for Nintendo 3DS using libctru<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/gl3ds/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/machinamentum/Caelina Caelina]<br />
| An OpenGL implementation for (N)3DS<br />
| [https://github.com/machinamentum machinamentum]<br />
| [https://github.com/machinamentum/Caelina/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/Myriachan/libkhax libkhax]<br />
| Library for modifying kernel memory on a certain handheld game console.<br />
| [https://github.com/Myriachan Myria] et al.<br />
| [https://github.com/Myriachan/libkhax/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/cpp3ds/cpp3ds cpp3ds]<br />
| Object-oriented C++ game library and port of [http://www.sfml-dev.org/ SFML]<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/cpp3ds/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/BtheDestroyer/SpriteTools SpriteTools]<br />
| Extension to SF2D, adding support for things like animations<br />
| [https://github.com/BtheDestroyer BtheDestroyer]<br />
| [https://github.com/BtheDestroyer/SpriteTools/releases Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
|}<br />
<br />
== PC Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [http://devkitpro.org/ devkitARM]<br />
| GCC-based toolchain tuned for homebrew development for ARM-based consoles.<br />
| [https://github.com/WinterMute WinterMute] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| [https://github.com/devkitPro Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/aemstro aemstro]<br />
| Set of tools used to disassemble and assemble shader code for DMP's MAESTRO shader extension used in the 3DS's PICA200 GPU<br />
| [https://twitter.com/smealum smea]<br />
| [https://github.com/smealum/aemstro/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/devkitPro/picasso picasso]<br />
| Homebrew PICA200 shader assembler<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/devkitPro/picasso/releases Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [http://4dsdev.org/thread.php?id=14 nihstro]<br />
| 3DS shader assembler and disassembler <br />
| [https://github.com/neobrain neobrain]<br />
| [http://4dsdev.org/thread.php?id=14 Here]<br />
| [https://github.com/neobrain/nihstro Yes]<br />
| No<br />
|-<br />
| [https://github.com/Lectem/3ds-cmake 3ds-cmake]<br />
| CMake files for devkitARM and 3DS homebrew development<br />
| [https://github.com/Lectem Lectem]<br />
| [https://github.com/Lectem/3ds-cmake/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [[Makerom|makerom]]<br />
| Tool which can be used to create NCCH, CCI, and CIA files. <br />
| [[User:3dsguy|3dsguy]], maintained by [https://github.com/profi200 profi200]<br />
| [https://github.com/profi200/Project_CTR/archive/master.zip Here]<br />
| [https://github.com/profi200/Project_CTR/tree/master/makerom Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/Steveice10/bannertool bannertool]<br />
| Tool to create NCCH banners<br />
| [https://github.com/Steveice10 Steveice10]<br />
| [https://github.com/Steveice10/bannertool/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/socram8888/amiitool amiitool]<br />
| Tool to decrypt, encrypt and sign amiibo dumps<br />
| [https://github.com/socram8888 socram8888]<br />
| [https://github.com/socram8888/amiitool/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|}<br />
<br />
== 3DS Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/neobrain/braindump braindump]<br />
| Tool to dump ExeFS/RomFS data from games and other applications<br />
| [https://github.com/neobrain neobrain]<br />
| [https://github.com/neobrain/braindump/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/citra-emu/uncart uncart]<br />
| Utility to dump game cartridges to the SD card<br />
| [https://github.com/neobrain neobrain] et al.<br />
| Build from [https://github.com/citra-emu/uncart repo]<br />
| Yes<br />
| No<br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Libraries_and_Tools&diff=21505
Homebrew Libraries and Tools
2021-03-24T05:33:57Z
<p>Oreo639: Update links and add citro2d</p>
<hr />
<div>This is a list of libraries and tools that can be used to develop 3DS Homebrew.<br />
<br />
== Libraries ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/devkitPro/libctru libctru]<br />
| C library for writing user mode ARM11 code for the 3DS (CTR) <br />
| [https://twitter.com/smealum smea] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro3d citro3d]<br />
| Stateful PICA200 GPU wrapper library for the Nintendo 3DS<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/devkitPro/citro3d/archive/master.zip Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/devkitPro/citro2d citro2d]<br />
| Library for drawing 2D graphics using the Nintendo 3DS's PICA200 GPU<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/fincs/citro2d/archive/master.zip Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [https://github.com/xerpi/sf2dlib sf2dlib]<br />
| Simple and Fast 2D library for the Nintendo 3DS (using libctru and citro3d)<br />
| [https://github.com/xerpi xerpi]<br />
| [https://github.com/xerpi/sf2dlib/archive/master.zip Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
| [https://github.com/cpp3ds/gl3ds gl3ds]<br />
| OpenGL implementation for Nintendo 3DS using libctru<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/gl3ds/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/machinamentum/Caelina Caelina]<br />
| An OpenGL implementation for (N)3DS<br />
| [https://github.com/machinamentum machinamentum]<br />
| [https://github.com/machinamentum/Caelina/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/Parx-3DS Three-DS, computers]<br />
| Canvas/GDI Parx-Pas tested in FreePascal, public stubs <br />
| [https://twitter.com/Kenny_D_Lee Kenneth Dwayne Lee]<br />
| [http://flying-dutchmen.github.io/3DS-Sails Here]<br />
| No<br />
| No<br />
|-<br />
| [https://github.com/Myriachan/libkhax libkhax]<br />
| Library for modifying kernel memory on a certain handheld game console.<br />
| [https://github.com/Myriachan Myria] et al.<br />
| [https://github.com/Myriachan/libkhax/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/cpp3ds/cpp3ds cpp3ds]<br />
| Object-oriented C++ game library and port of [http://www.sfml-dev.org/ SFML]<br />
| [https://github.com/Cruel Cruel] et al.<br />
| [https://github.com/cpp3ds/cpp3ds/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/BtheDestroyer/SpriteTools SpriteTools]<br />
| Extension to SF2D, adding support for things like animations<br />
| [https://github.com/BtheDestroyer BtheDestroyer]<br />
| [https://github.com/BtheDestroyer/SpriteTools/releases Here]<br />
| Yes<br />
| Deprecated<br />
|-<br />
| [https://github.com/TurtleP/LovePotion LovePotion]<br />
| Love2d port, a lua game engine <br />
| [https://github.com/TurtleP/ TurtleP]<br />
| [https://github.com/TurtleP/LovePotion/releases Here]<br />
| Yes<br />
| Yes<br />
|}<br />
<br />
== PC Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [http://devkitpro.org/ devkitARM]<br />
| GCC-based toolchain tuned for homebrew development for ARM-based consoles.<br />
| [https://github.com/WinterMute WinterMute] et al.<br />
| [[Setting_up_Development_Environment|See here]]<br />
| [https://github.com/devkitPro Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/aemstro aemstro]<br />
| Set of tools used to disassemble and assemble shader code for DMP's MAESTRO shader extension used in the 3DS's PICA200 GPU<br />
| [https://twitter.com/smealum smea]<br />
| [https://github.com/smealum/aemstro/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/devkitPro/picasso picasso]<br />
| Homebrew PICA200 shader assembler<br />
| [https://github.com/fincs fincs]<br />
| [https://github.com/devkitPro/picasso/releases Here]<br />
| Yes<br />
| Yes<br />
|-<br />
| [http://4dsdev.org/thread.php?id=14 nihstro]<br />
| 3DS shader assembler and disassembler <br />
| [https://github.com/neobrain neobrain]<br />
| [http://4dsdev.org/thread.php?id=14 Here]<br />
| [https://github.com/neobrain/nihstro Yes]<br />
| No<br />
|-<br />
| [https://github.com/Lectem/3ds-cmake 3ds-cmake]<br />
| CMake files for devkitARM and 3DS homebrew development<br />
| [https://github.com/Lectem Lectem]<br />
| [https://github.com/Lectem/3ds-cmake/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [[Makerom|makerom]]<br />
| Tool which can be used to create NCCH, CCI, and CIA files. <br />
| [[User:3dsguy|3dsguy]], maintained by [https://github.com/profi200 profi200]<br />
| [https://github.com/profi200/Project_CTR/archive/master.zip Here]<br />
| [https://github.com/profi200/Project_CTR/tree/master/makerom Yes]<br />
| Yes<br />
|-<br />
| [https://github.com/Steveice10/bannertool bannertool]<br />
| Tool to create NCCH banners<br />
| [https://github.com/Steveice10 Steveice10]<br />
| [https://github.com/Steveice10/bannertool/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/socram8888/amiitool amiitool]<br />
| Tool to decrypt, encrypt and sign amiibo dumps<br />
| [https://github.com/socram8888 socram8888]<br />
| [https://github.com/socram8888/amiitool/archive/master.zip Here]<br />
| Yes<br />
| No<br />
|}<br />
<br />
== 3DS Tools ==<br />
{| class="wikitable" border="1" width="100%"<br />
! width="16%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="8%" | Download<br />
! width="8%" | Open-Source<br />
! width="8%" | Maintained<br />
|-<br />
| [https://github.com/neobrain/braindump braindump]<br />
| Tool to dump ExeFS/RomFS data from games and other applications<br />
| [https://github.com/neobrain neobrain]<br />
| [https://github.com/neobrain/braindump/releases Here]<br />
| Yes<br />
| No<br />
|-<br />
| [https://github.com/citra-emu/uncart uncart]<br />
| Utility to dump game cartridges to the SD card<br />
| [https://github.com/neobrain neobrain] et al.<br />
| Build from [https://github.com/citra-emu/uncart repo]<br />
| Yes<br />
| No<br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=SHBIN&diff=21334
SHBIN
2020-09-14T04:41:57Z
<p>Oreo639: Fill in missing DVLE sections based on information from picasso and libctru</p>
<hr />
<div>[[Category:File formats]]<br />
<br />
The SHBIN (SHader BINary) file is used to contain compiled and linked shader programs. These can include vertex shaders (typically compiled from .vsh files) and geometry shaders (typically compiled from .gsh files, though .asm files have been observed). In commercial applications, SHBIN files can be found as standalone files with the extension .shbin, or contained within .bcsdr files. BCSDR files use CGFX as a container, but the underlying DVLB/DVLP/DVLE structure remains unchanged.<br />
<br />
A SHBIN's structure starts with a generic header (DVLB), then a single program header (DVLP), then one or more DVLEs. The program header specifies the generic parts of the shader, i.e. the shader program data, the operand descriptor data, and a filename symbol table. The contextual details (entry point, constant values, debug symbols, etc.) are specified in an executable header (DVLE). There may be multiple DVLE headers, so multiple shaders sharing the same program code can be stored in a single SHBIN. Hence, in what follows, note the distinction between "program" and "executable".<br />
<br />
For a description of the instruction set, see the following page: [[Shader Instruction Set]]<br />
<br />
== Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLB"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| N = number of DVLEs in SHBIN<br />
|-<br />
| 0x8<br />
| 0x4*N<br />
| DVLE offset table; each offset is a u32 relative to the start of the DVLB section<br />
|-<br />
|}<br />
<br />
The DVLP section comes directly after the header.<br />
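<br />
A minimal sketch of reading this header in Python follows. The name <code>parse_dvlb</code> is hypothetical, little-endian fields are assumed, and the DVLP position is derived purely from the statement that it directly follows the header.<br />

```python
import struct


def parse_dvlb(data: bytes):
    magic, count = struct.unpack_from("<4sI", data, 0)
    if magic != b"DVLB":
        raise ValueError("not a SHBIN: missing DVLB magic")
    # N u32 offsets, each relative to the start of the DVLB section.
    dvle_offsets = list(struct.unpack_from("<%dI" % count, data, 8))
    # The DVLP is taken to start right after the offset table.
    dvlp_offset = 8 + 4 * count
    return dvle_offsets, dvlp_offset
```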
<br />
== DVLP ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLP"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| ? (Maybe a version number?)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset (relative to DVLP start) to the compiled shader binary blob<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset (relative to DVLP start) to shader instruction extension table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLP start) to filename symbol table<br />
|-<br />
|}<br />
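<br />
The fields above are enough to pull the instruction words and the extension table out of a file. The sketch below assumes little-endian fields; <code>parse_dvlp</code> is a hypothetical name, and the 8-byte extension entries are returned as raw bytes because their layout is not described here.<br />

```python
import struct


def parse_dvlp(data: bytes, dvlp_offset: int):
    (magic, _unknown, blob_off, blob_words,
     ext_off, ext_count, sym_off) = struct.unpack_from("<4s6I", data, dvlp_offset)
    if magic != b"DVLP":
        raise ValueError("missing DVLP magic")
    # All offsets in the DVLP header are relative to the DVLP start.
    blob_start = dvlp_offset + blob_off
    code = list(struct.unpack_from("<%dI" % blob_words, data, blob_start))
    ext_start = dvlp_offset + ext_off
    ext_entries = [data[ext_start + 8 * i:ext_start + 8 * (i + 1)]
                   for i in range(ext_count)]
    filename_table = dvlp_offset + sym_off
    return code, ext_entries, filename_table
```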
<br />
== DVLE ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Magic "DVLE"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Unknown<br />
|-<br />
| 0x6<br />
| 0x1<br />
| Shader type (0x0 = vertex shader, 0x1 = geometry shader; might contain other flags)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| true = merge vertex/geometry shader outmaps ('dummy' output attribute is present)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Executable's main offset in binary blob (in words)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Executable's endmain offset in binary blob (in words)<br />
|-<br />
| 0x10<br />
| 0x2<br />
| Bitmask of used input registers<br />
|-<br />
| 0x12<br />
| 0x2<br />
| Bitmask of used output registers<br />
|-<br />
| 0x14<br />
| 0x1<br />
| Geometry shader type (point = 0x0, variable/subdivide = 0x1, fixed/particle = 0x2)<br />
|-<br />
| 0x15<br />
| 0x1<br />
| Starting float uniform register number for storing the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x16<br />
| 0x1<br />
| Number of fully-defined vertices in the variable-size primitive vertex array (geometry shader, variable mode)<br />
|-<br />
| 0x17<br />
| 0x1<br />
| Number of vertices in the fixed-size primitive vertex array (geometry shader, fixed mode)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset (relative to DVLE start) to constant table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset (relative to DVLE start) to label table<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset (relative to DVLE start) to output register table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset (relative to DVLE start) to uniform table<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset (relative to DVLE start) to symbol table<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
=== Label Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Label ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset (relative to shader program blob start) to label's location, in words<br />
|-<br />
| 0x8<br />
| 0x4<br />
| ?<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to label's symbol<br />
|-<br />
|}<br />
<br />
=== Constant Table Entry ===<br />
<br />
Each executable's constants are stored in the constant table. ctrulib's SHDR framework uses this information to automatically send those values to the GPU when switching to a given program. Each entry consists of a header followed by the constant data, whose format depends on the constant type.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| Constant type (0=bool, 1=ivec4, 2=vec4)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform ID<br />
|}<br />
<br />
Corresponding constant entry formats:<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x0<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform bool ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| Value (boolean)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x1<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform integer vector ID<br />
|-<br />
| 0x4<br />
| 0x1<br />
| x (u8)<br />
|-<br />
| 0x5<br />
| 0x1<br />
| y (u8)<br />
|-<br />
| 0x6<br />
| 0x1<br />
| z (u8)<br />
|-<br />
| 0x7<br />
| 0x1<br />
| w (u8)<br />
|}<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x1<br />
| 0x2<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Uniform vector ID<br />
|-<br />
| 0x4<br />
| 0x4<br />
| x (float24)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| y (float24)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| z (float24)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| w (float24)<br />
|}<br />
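The three entry formats above can be decoded with one dispatch on the type byte. The Python sketch below makes two assumptions that are not stated in this section: each float24 occupies the low 24 bits of its 4-byte field, and float24 uses 1 sign bit, a 7-bit exponent with a bias of 63 and a 16-bit mantissa. Special exponent values are treated as ordinary numbers for simplicity. Function names are hypothetical.<br />

```python
import struct


def float24_to_float(raw: int) -> float:
    """Decode a float24, assuming 1 sign, 7 exponent (bias 63), 16 mantissa bits."""
    raw &= 0xFFFFFF
    sign = -1.0 if raw >> 23 else 1.0
    exponent = (raw >> 16) & 0x7F
    mantissa = raw & 0xFFFF
    if exponent == 0 and mantissa == 0:
        return sign * 0.0
    # Simplification: no special handling of denormals or infinities.
    return sign * (1.0 + mantissa / 65536.0) * 2.0 ** (exponent - 63)


def parse_constant(entry: bytes):
    """Decode one 0x14-byte constant table entry into (kind, uniform ID, value)."""
    if len(entry) != 0x14:
        raise ValueError("constant entries are 0x14 bytes long")
    const_type, uniform_id = entry[0], entry[2]
    if const_type == 0:
        return ("bool", uniform_id, bool(entry[4]))
    if const_type == 1:
        return ("ivec4", uniform_id, tuple(entry[4:8]))
    if const_type == 2:
        words = struct.unpack_from("<4I", entry, 4)
        return ("vec4", uniform_id, tuple(float24_to_float(w) for w in words))
    raise ValueError(f"unknown constant type {const_type}")
```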
<br />
=== Output Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Bit<br />
! Description<br />
|-<br />
| 0-3<br />
| Output type (see table below)<br />
|-<br />
| 16-19<br />
| Register ID<br />
|-<br />
| 32-35<br />
| Output attribute component mask (e.g. 5=xz)<br />
|}<br />
<br />
Output types:<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! Description<br />
|-<br />
| 0x0<br />
| result.position<br />
|-<br />
| 0x1<br />
| result.normalquat<br />
|-<br />
| 0x2<br />
| result.color<br />
|-<br />
| 0x3<br />
| result.texcoord0<br />
|-<br />
| 0x4<br />
| result.texcoord0w<br />
|-<br />
| 0x5<br />
| result.texcoord1<br />
|-<br />
| 0x6<br />
| result.texcoord2<br />
|-<br />
| 0x7<br />
| ?<br />
|-<br />
| 0x8<br />
| result.view<br />
|}<br />
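A short Python sketch of decoding one output table entry follows. It assumes that the bit positions above count from the least significant bit of the 8-byte entry read as a little-endian integer; the names <code>OUTPUT_TYPES</code> and <code>parse_output_entry</code> are hypothetical.<br />

```python
# Known output types from the table above; 0x7 is left out because it is unknown.
OUTPUT_TYPES = {
    0x0: "result.position",
    0x1: "result.normalquat",
    0x2: "result.color",
    0x3: "result.texcoord0",
    0x4: "result.texcoord0w",
    0x5: "result.texcoord1",
    0x6: "result.texcoord2",
    0x8: "result.view",
}


def parse_output_entry(entry: bytes):
    """Return (output type name, register ID, component string) for one entry."""
    value = int.from_bytes(entry[:8], "little")
    output_type = value & 0xF
    register_id = (value >> 16) & 0xF
    mask = (value >> 32) & 0xF
    # Bit 0 selects x, bit 1 y, bit 2 z, bit 3 w, so a mask of 5 gives "xz".
    components = "".join(c for bit, c in enumerate("xyzw") if mask & (1 << bit))
    return OUTPUT_TYPES.get(output_type, "unknown"), register_id, components
```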
<br />
=== Uniform Table Entry ===<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Offset (relative to DVLE symbol table start) to variable's symbol<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Variable start register<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Variable end register (equal to start register for non-arrays)<br />
|-<br />
|}<br />
<br />
The register indices refer to a unified register space for non-output registers. The mapping of register index values to registers is the following:<br />
{| class="wikitable" border="1"<br />
|-<br />
! Values<br />
! Registers<br />
|-<br />
| 0x00-0x0F<br />
| v0-v15<br />
|-<br />
| 0x10-0x6F<br />
| c0-c95<br />
|-<br />
| 0x70-0x73<br />
| i0-i3<br />
|-<br />
| 0x78-0x87<br />
| b0-b15<br />
|-<br />
|}<br />
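The mapping above translates directly into a range lookup. The following Python sketch decodes a uniform table entry and names its start and end registers; <code>register_name</code> and <code>parse_uniform_entry</code> are hypothetical names, and indices outside the listed ranges are rejected because their meaning is not documented here.<br />

```python
import struct

# u32 symbol offset, u16 start register, u16 end register (0x8 bytes).
_UNIFORM_ENTRY = struct.Struct("<IHH")

# (first index, last index, register prefix), from the mapping table above.
_REGISTER_RANGES = (
    (0x00, 0x0F, "v"),
    (0x10, 0x6F, "c"),
    (0x70, 0x73, "i"),
    (0x78, 0x87, "b"),
)


def register_name(index: int) -> str:
    """Map a unified register index to a name such as 'c12'."""
    for first, last, prefix in _REGISTER_RANGES:
        if first <= index <= last:
            return f"{prefix}{index - first}"
    raise ValueError(f"register index {index:#x} is outside the documented ranges")


def parse_uniform_entry(entry: bytes):
    """Return (symbol offset, start register name, end register name)."""
    symbol_offset, start, end = _UNIFORM_ENTRY.unpack_from(entry)
    return symbol_offset, register_name(start), register_name(end)
```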
<br />
== DVOJ ==<br />
There is another file format for shaders, which starts with the string "DVOJ". This format seems to be used for unlinked shader objects. It seems likely that one or multiple DVOJs can be linked to a DVLB file, similarly to the C compilation model.<br />
<br />
Structurally, a DVOJ header captures all information there is about a single shader instance. It uses the same fields as the DVLB, DVLP, and DVLE structures, but also stores two unknown blocks of data. It seems that the entry point of a DVOJ is always the first shader instruction.<br />
<br />
All offsets in the following table are given relative to the DVOJ start.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x00<br />
| 0x4<br />
| Magic "DVOJ"<br />
|-<br />
| 0x04<br />
| 0x4<br />
| Unknown. Seems to be related to the DVLE shader type.<br />
|-<br />
| 0x08<br />
| 0x4<br />
| Unknown.<br />
|-<br />
| 0x0C<br />
| 0x4<br />
| Padding? (usually 0xFFFFFFFF)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to constant table<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Number of entries in constant table (each entry is 0x14 bytes long)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Offset to label table<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Number of entries in label table (each entry is 0x10 bytes long)<br />
|-<br />
| 0x20<br />
| 0x4<br />
| Offset to the compiled shader binary blob <br />
|-<br />
| 0x24<br />
| 0x4<br />
| Size of compiled shader binary blob, in words<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Offset to shader instruction extension table<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Number of shader instruction extension table entries (each entry is 8 bytes long)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Offset to unknown block 1<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Number of items in unknown block 1 (each item is 8 bytes long). This seems to be equal to the total number of instructions.<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Offset to unknown block 2<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Number of items in unknown block 2 (each item is 12 bytes long). This seems to be equal to the number of instructions taking arguments (i.e. excluding NOP, END, ...)<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Offset to output register table<br />
|-<br />
| 0x44<br />
| 0x4<br />
| Number of entries in output register table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x48<br />
| 0x4<br />
| Offset to uniform table<br />
|-<br />
| 0x4C<br />
| 0x4<br />
| Number of entries in uniform table (each entry is 0x8 bytes long)<br />
|-<br />
| 0x50<br />
| 0x4<br />
| Offset to symbol table<br />
|-<br />
| 0x54<br />
| 0x4<br />
| Size of symbol table (in bytes)<br />
|-<br />
|}<br />
<br />
<br />
=== Unknown Block 1 Item ===<br />
A wild guess is that this denotes shader source line information. Take the information with a grain of salt, though, since it hasn't been backed by any empirical data so far.<br />
<br />
The index N of the item within Unknown Block 1 corresponds to the Nth instruction in the shader binary.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Byte offset within symbol table pointing to a source shader filename.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Line number of the corresponding shader instruction within the shader source code.<br />
|-<br />
|}<br />
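If the guess above holds, the block could be turned into a per-instruction source map as in the following Python sketch. Both the interpretation of the fields and the assumption that filenames are NUL-terminated ASCII strings are unverified; <code>parse_line_info</code> is a hypothetical name.<br />

```python
import struct


def parse_line_info(block: bytes, count: int, symbol_table: bytes):
    """Return one (filename, line number) pair per instruction index."""
    info = []
    for n in range(count):
        # Each item is 8 bytes: filename offset, then line number (little-endian).
        name_offset, line_number = struct.unpack_from("<II", block, n * 8)
        end = symbol_table.index(b"\0", name_offset)  # assumes NUL termination
        filename = symbol_table[name_offset:end].decode("ascii", "replace")
        info.append((filename, line_number))
    return info
```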
<br />
=== Unknown Block 2 Item ===<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! Offset<br />
! Size<br />
! Description<br />
|-<br />
| 0x0<br />
| 0x4<br />
| This seems to be an index of a shader instruction. All non-nullary instructions seem to be referenced exactly once.<br />
|-<br />
| 0x4<br />
| 0x4<br />
| <br />
|-<br />
| 0x8<br />
| 0x4<br />
| <br />
|-<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Applications&diff=21282
Homebrew Applications
2020-06-04T00:17:09Z
<p>Oreo639: /* Save managers */ Update PKSM url and add Checkpoint</p>
<hr />
<div>== Installing ==<br />
Applications are installed by copying the necessary files either directly to the <code>3ds/</code> folder in the root of the SD card (preferred for new designs) or to a subdirectory of <code>3ds/</code>, in which case the subfolder must have the same name as its executable. Most applications come with the following files:<br />
* <code>[appname].3dsx</code>: The executable.<br />
* <code>[appname].smdh</code>: The icon/metadata. (Optional; may instead be integrated into the <code>.3dsx</code>)<br />
* <code>[appname].xml</code>: The list of supported targets (i.e. installed titles which the app supports replacing in memory at runtime, thus inheriting its permissions), and of any arguments to be passed to the .3dsx. (Optional)<br />
<br />
A standalone .xml file can point to a differently-named .3dsx, launching it with potentially different arguments so that a single application can run in different modes.<br />
<br />
The [[Homebrew Launcher]] will scan the SD card for all <code>.3dsx</code> files, but will only display an icon for those that have one according to the format described above. Recent enough versions can freely navigate the filesystem to select an application.<br />
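The two install layouts described above can be summarised in a small Python sketch that lists where an application's files would go, relative to the SD card root. The function name <code>install_paths</code> and the application name <code>demo</code> are hypothetical; the sketch only illustrates the naming rule and does not touch the filesystem.<br />

```python
from pathlib import PurePosixPath


def install_paths(app_name: str, use_subfolder: bool = False) -> list:
    """List the expected file paths for an application, relative to the SD root.

    With use_subfolder=True the files go into 3ds/<app_name>/, whose name must
    match the executable; otherwise they sit directly in 3ds/.
    """
    base = PurePosixPath("3ds")
    if use_subfolder:
        base = base / app_name
    # Only the .3dsx is required; the .smdh and .xml files are optional.
    return [base / f"{app_name}{ext}" for ext in (".3dsx", ".smdh", ".xml")]
```

For example, <code>install_paths("demo", use_subfolder=True)</code> starts with <code>3ds/demo/demo.3dsx</code>.<br />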
<br />
== List ==<br />
<br />
=== Launchers ===<br />
<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Open-Source<br />
|-<br />
| [https://github.com/fincs/new-hbmenu Homebrew Launcher]<br />
| Run homebrew on your 3DS! Compatible with Rosalina and all prior 3dsx loading solutions<br />
| [https://devkitpro.org devkitPro]<br />
| [https://github.com/fincs/new-hbmenu/releases Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Starter Pack]<br />
| Everything to get you started.<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/starter.zip Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Launcher (v1.x)]<br />
| The old version of the 3DS Homebrew Launcher, originally created for ninjhax 1.x (Discontinued)<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/boot.3dsx Here]<br />
| Yes<br />
|-<br />
| [http://gbatemp.net/threads/release-homebrew-launcher-with-grid-layout.397527/ Mashers' HBL]<br />
| Homebrew Launcher with grid and folder support. (Discontinued)<br />
| [[User:Mashers|Mashers]]<br />
| [https://github.com/d0k3/3DS-Extended-Homebrew-Starter-Pack/blob/35b8ab7dc40cb550b6ea45da319cdd0a0a3b2b54/boot.3dsx Here]<br />
| Lost in Mashers' retirement<br />
|}<br />
<br />
=== Applications ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/VideahGams/3dsfetch 3dsfetch]<br />
| Small 3DS version of a popular Linux ricing script called screenfetch.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/3dsfetch/tree/master Here]<br />
| Yes<br />
| 2015-09-17<br />
|-<br />
| [https://github.com/JohnodonCode/TSI9 TSI9]<br />
| A simple program for detecting touch screen input.<br />
| [[User:Johnodon|Johnodon]]<br />
| [https://github.com/JohnodonCode/TSI9/releases Here]<br />
| Yes<br />
| 2020-01-18<br />
|-<br />
| [https://github.com/joel16/3DSident/ 3DSident]<br />
| Identity tool for the Nintendo 3DS heavily inspired by PSPident.<br />
| [[User:Joel16|Joel16]]<br />
| [https://github.com/joel16/3DSident/releases Here]<br />
| Yes<br />
| 2018-08-02<br />
|-<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Clear MAC Filter]<br />
| Reset 8-hour per-console StreetPass rate limiting<br />
| tastymeatball<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Here]<br />
| Yes<br />
| 2018-08-24<br />
|-<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases CtrRGBPATTY]<br />
| Generate patches that edit LED notifications<br />
| CPunch<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases Here]<br />
| Yes<br />
| 2017-11-03<br />
|-<br />
| [https://github.com/plutooo/ctrrpc ctrrpc]<br />
| A small and easily extensible RPC server/client written in C/Python. Allows you to quickly poke service-commands and <code>syscall</code>s over Wi-Fi from a Python shell on your PC. Useful during reverse-engineering. ''No longer under (active) development?''<br />
| [[User:plutooo|plutoo]]<br />
| Build from [https://github.com/plutooo/ctrrpc repo]<br />
| Yes<br />
| 2014-11-10<br />
|-<br />
| [https://github.com/yellows8/ctr-streaming-server ctr-streaming-server]<br />
| A 3DS homebrew audio/video playback server. It can also send [[HID_Shared_Memory|HID]] state to the client (see the README) when enabled. The included <code>parse_hidstream</code> tool can be used to parse that HID data to simulate keyboard/mouse input events, via Linux <code>uinput</code>. ''No longer under (active) development?''<br />
| [[User:yellows8|yellows8]]<br />
| Build from [https://github.com/yellows8/ctr-streaming-server repo]<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/DownloadMii/DownloadMii-3DS DownloadMii]<br />
| A WIP repo-based online marketplace for homebrew applications & games.<br />
| [[User:filfat|filfat]]<br />
| Build from [https://github.com/DownloadMii/DownloadMii-3DS repo]<br />
| Yes<br />
| 2015-11-24<br />
|-<br />
| [https://github.com/linoma/fb43ds fb43ds]<br />
| A simple 3DS Facebook chat client<br />
| [[User:linoma|linoma]]<br />
| Build from [https://github.com/linoma/fb43ds repo]<br />
| Yes<br />
| 2015-04-07<br />
|-<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot for-anyone-who-walks-a-lot]<br />
| Tool to get past the 10 coin per day limit on earning Play Coins by walking.<br />
| [[User:iamevn|iamevn]]<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot/releases Here]<br />
| Yes<br />
| 2016-03-26<br />
|-<br />
| [https://github.com/zeta0134/3ds-homebrew-browser Homebrew Browser]<br />
| Download homebrew from the internet!<br />
| [[User:cromo|cromo]], [[User:zeta0134|zeta0134]]<br />
| [https://github.com/zeta0134/3ds-homebrew-browser/releases Here]<br />
| Yes<br />
| 2015-10-07<br />
|-<br />
| [https://github.com/MrJPGames/NFCReader NFCReader]<br />
| Allows you to use your 3DS as a NFC/RFID UID Scanner.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/NFCReader/releases Here]<br />
| Yes<br />
| 2017-01-21<br />
|-<br />
| [https://github.com/SciresM/ScreenInfo ScreenInfo]<br />
| Identify whether New 3DS LCD panels are TN or IPS.<br />
| [[User:SciresM|SciresM]]<br />
| [https://github.com/SciresM/ScreenInfo/releases Here]<br />
| Yes<br />
| 2016-09-04<br />
|}<br />
<br />
=== Game Engines ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/TurtleP/LovePotion Löve Potion]<br />
| [https://love2d.org/ LOVE2D] for 3DS Homebrew.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/LovePotion/releases Here]<br />
| [https://github.com/TurtleP/LovePotion Yes]<br />
| 2018-08-27<br />
|-<br />
| [https://ctrulua.github.io/ ctrµLua]<br />
| A Lua interpreter for 3DS, brought to life by the remnants of the µLua community.<br />
| [[User:Firew0lf|Firew0lf]], Reuh, Negi<br />
| [https://github.com/ctruLua/ctruLua/releases Here]<br />
| Yes<br />
| 2016-06-27<br />
|-<br />
| [https://blog.easyrpg.org/2016/05/player-for-nintendo-3ds/ EasyRPG Player]<br />
| RPG Maker 2000/2003 interpreter<br />
| [[User:Rinnegatamante|Rinnegatamante]] & EasyRPG Team<br />
| [https://easyrpg.org/player/downloads/ Here]<br />
| [https://github.com/EasyRPG/Player Yes]<br />
| 2019-03-03<br />
|-<br />
| [https://github.com/Rinnegatamante/lpp-3ds LuaPlayer+ 3DS]<br />
| First Lua interpreter 3DS homebrew, under Lua 5.3.1<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [https://github.com/Rinnegatamante/lpp-3ds/releases Here]<br />
| Yes<br />
| 2016-09-21<br />
|-<br />
| [http://vault.digitalmzx.net MegaZeux 3DS]<br />
| A port of the MegaZeux GCS to the 3DS.<br />
| MegaZeux developers<br />
| [http://vault.digitalmzx.net Here]<br />
| [https://github.com/AliceLR/megazeux Yes]<br />
| 2018-03-04<br />
|}<br />
<br />
=== Games ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [http://gbatemp.net/threads/release-100-boxes-2ds.384714/ 100 Boxes 2DS]<br />
| A remake of homebrew "100 Boxes puzzle" for DS and GBA.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/100Boxes2DS/100_Boxes_2DS.rar Here]<br />
| No<br />
| 2015-11-11<br />
|-<br />
| [https://github.com/MrJPGames/2048-3D 2048-3D]<br />
| A port of the popular game 2048 for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/2048-3D/releases Here]<br />
| Yes<br />
| 2016-02-12<br />
|-<br />
| ''[https://github.com/smealum/3dscraft 3DSCraft]''<br />
| A Minecraft port for the 3DS. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/3dscraft repo] (alt. [https://smealum.github.io/3dscraft/downloads/3dscraft_141120.zip here])<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/markwinap/3DS_Nyan_Cat 3DS Nyan Cat]<br />
| A port of Nyan Cat for the 3DS, using <code>LIBSF2D</code>.<br />
| [[User:markwinap|markwinap]]<br />
| Build from [https://github.com/markwinap/3DS_Nyan_Cat repo] (alt. [https://www.dropbox.com/s/e400my3xm0zw74r/nyan_cat.zip?dl=0 here])<br />
| Yes<br />
| 2015-05-26<br />
|-<br />
| [https://gbatemp.net/threads/preview-ld-34-port-antibounce.406361 Antibounce]<br />
| "Move your player to bounce around and collect coins. Go between screens through the holes in the sides of the floor. 3D can also be enabled."<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/Antibounce/releases Here]<br />
| Yes<br />
| 2015-12-23<br />
|-<br />
| [https://github.com/Magicrafter13/Breakout Breakout]<br />
| "A 3ds Breakout Clone."<br />
| [[User:Magicrafter13|Magicrafter13]]<br />
| [https://github.com/Magicrafter13/Breakout/releases Here]<br />
| Yes<br />
| 2017-10-17<br />
|-<br />
| ''[https://github.com/UnsureSherlock/checkers3ds checkers3ds]''<br />
| A checkers game in glorious ASCII. ''No longer under development.''<br />
| [[User:UnsureSherlock|UnsureSherlock]]<br />
| Build from [https://github.com/UnsureSherlock/checkers3ds repo]<br />
| Yes<br />
| 2016-02-25<br />
|-<br />
| [https://github.com/Kaisogen/CookieCollector-3DS- Cookie Collector]<br />
| A tiny adaptation of the popular [https://en.wikipedia.org/wiki/Cookie_Clicker Cookie Clicker] game for the 3DS.<br />
| [[User:Kaisogen|Kaisogen]]<br />
| [https://github.com/Kaisogen/CookieCollector-3DS-/releases Here]<br />
| Yes<br />
| 2017-06-04<br />
|-<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS Cookie Clicker 3DS]<br />
| A simple Cookie Clicker type of game inspired by [[User:Kaisogen|Kaisogen]]'s Cookie Collector<br />
| [[User:TheMachinumps|TheMachinumps]]<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS/releases Here]<br />
| Yes<br />
| 2016-08-27<br />
|-<br />
| [https://gbatemp.net/threads/release-drawattack-networked-drawing-game.402291/ DrawAttack]<br />
| Online multiplayer drawing game, like Pictionary.<br />
| [[User:Cruel|Cruel]]<br />
| [https://github.com/Cruel/DrawAttack/releases Here]<br />
| Yes<br />
| 2016-04-17<br />
|-<br />
| [https://github.com/masterfeizz/EDuke3D EDuke3D]<br />
| An unofficial port of EDuke32 for the 3DS.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/EDuke3D/releases Here]<br />
| Yes<br />
| 2016-05-09<br />
|-<br />
| [https://gbatemp.net/threads/release-hamsters-2ds.383457/ Hamsters 2DS]<br />
| A text-based hamster breeding game.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Hamsters2DS/Hamsters_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/BHSPitMonkey/Helii3DS Helii]<br />
| A port of [https://github.com/BHSPitMonkey/Helii3D Helii] for the 3DS.<br />
| [[User:BHSPitMonkey|BHSPitMonkey]]<br />
| [https://github.com/BHSPitMonkey/Helii3DS/releases Here]<br />
| Yes<br />
| 2015-09-18<br />
|-<br />
| [https://github.com/sgowen/insectoid-defense Insectoid Defense]<br />
| A Sci-Fi Tower Defense game.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/insectoid-defense/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://github.com/VideahGams/NumberFucker3DS NumberFucker3DS]<br />
| Simple math game, originally used as a debug game for LövePotion.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/NumberFucker3DS Here]<br />
| Yes<br />
| 2015-09-19<br />
|-<br />
| [https://gbatemp.net/threads/release-zelda-roth-for-3ds.425503/ Zelda ROTH for 3DS]<br />
| A port of Legend of Zelda: Return of the Hylian, a Zelda fangame, to 3DS.<br />
| [[User:nop90|nop90]]<br />
| [https://github.com/nop90/ZeldaROTH/releases Here]<br />
| Yes<br />
| 2016-09-11<br />
|-<br />
| [https://gbatemp.net/threads/release-mastermind-3ds.394710/ Mastermind 3DS]<br />
| A port of Mastermind for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Mastermind-3DS/releases Here]<br />
| Yes<br />
| 2015-08-15<br />
|-<br />
| [http://gbatemp.net/threads/release-minesweeper-2ds.384185/ Minesweeper 2DS]<br />
| A port of Minesweeper for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Minesweeper2DS/Minesweeper_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://pyug.at/PyWeek/2012-09 One Whale Trip]<br />
| Five-lane underwater whale swimming/pearl pickup adventure game in Python.<br />
| [[User:thp|thp]]<br />
| [https://bitbucket.org/pyugat/pyweek1209/downloads/OneWhaleTrip-2016-07-18-3DS.zip Here]<br />
| [https://bitbucket.org/pyugat/pyweek1209/src/bce5156dbee72f38c4fcf5d7b3df9cfb9ddd5b0a/3ds Yes]<br />
| 2016-10-02<br />
|-<br />
| [http://gbatemp.net/threads/release-paddle-puffle-3ds.392215/ Paddle Puffle 3DS]<br />
| A port of [http://puffles.gatuno.mx Paddle Puffle] for the 3DS.<br />
| [[User:Peanut42|Peanut42]]<br />
| [http://puffles.gatuno.mx/releases/paddlepuffle3ds.zip Here]<br />
| [https://github.com/gatuno/PaddlePuffle3DS Yes]<br />
| 2015-07-05<br />
|-<br />
| [http://david.dantoine.org/proyecto/26/ Pituka Classics]<br />
| Play CPC classics using [http://david.dantoine.org/proyecto/4/ Pituka Emulator-Core] on 3DS.<br />
| [[User:D_Skywalk|D_Skywalk]]<br />
| [http://david.dantoine.org/descargas/72 Rick Dangerous] [http://david.dantoine.org/descargas/2 Core]<br />
| [http://david.dantoine.org/descargas/4 Yes (core)]<br />
| 2016-02-26<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-shuffle-2ds.398540/ Pixel Shuffle 2DS]<br />
| An adaptation of the puzzle game [http://www.gimme5games.com/play-game/pixelshuffle Pixel Shuffle] for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelShuffle2DS/Pixel_Shuffle_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-swap-2ds.395749/ Pixel Swap 2DS]<br />
| An adaptation of puzzle games Pixel Swap 1 & 2 for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelSwap2DS/Pixel_Swap_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/smealum/portal3DS Portal3DS]<br />
| An adaptation of [https://en.wikipedia.org/wiki/Portal_(video_game) Portal] for the 3DS.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/portal3DS repo] (Precompiled [http://www.mediafire.com/file/yo463wt6y4tybch/portal3DS.rar here])<br />
| Yes<br />
| 2015-08-18<br />
|-<br />
| [https://github.com/masterfeizz/ctrQuake ctrQuake]<br />
| An unofficial port of Quake for the 3DS, fully playable.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/ctrQuake/releases Here]<br />
| Yes<br />
| 2016-09-16<br />
|-<br />
| [https://gbatemp.net/threads/release-reversi-othello-for-3ds.395442/ Reversi]<br />
| [https://en.wikipedia.org/wiki/Reversi Reversi] for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Othello-3DS/releases Here]<br />
| Yes<br />
| 2016-03-05<br />
|-<br />
| [https://github.com/landm2000/sokoban Sokoban]<br />
| An unofficial port of the puzzle game [https://en.wikipedia.org/wiki/Sokoban Sokoban] for the 3DS.<br />
| [[User:Landm|Landm]]<br />
| [https://github.com/landm2000/sokoban/tree/master Here]<br />
| Yes<br />
| 2016-03-14<br />
|-<br />
| [https://gbatemp.net/threads/release-space-fruit.399088/ Space Fruit]<br />
| Hackathon game by 4 friends ported to 3DS. Asteroids but with fruit.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/SpaceFruit/releases Here]<br />
| Yes<br />
| 2016-04-09<br />
|-<br />
| [https://github.com/sgowen/tappy-plane Tappy Plane]<br />
| A port of [https://en.wikipedia.org/wiki/Flappy_Bird Flappy Bird] for 3DS, but with a colorful plane.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/tappy-plane/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://thp.itch.io/tetrepetete-3ds Tetrepetete 3DS]<br />
| A game with blocks.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/tetrepetete-3ds Here]<br />
| No<br />
| 2016-06-29<br />
|-<br />
| [http://gbatemp.net/threads/release-tilemap-2ds.386733/ TileMap 2DS]<br />
| An adaptation of the puzzle game TileMap for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/TileMap2DS/TileMap_2DS.rar Here]<br />
| No<br />
| 2015-11-03<br />
|-<br />
| [http://gbatemp.net/threads/release-tiles-2ds.385796/ Tiles 2DS]<br />
| An adaptation of the puzzle game Lights Out for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Tiles2DS/Tiles_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://thp.itch.io/that-rabbit-game-3ds That Rabbit Game 3DS]<br />
| Inverse duck hunt with accelerometer input and stereoscopic 3D.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/that-rabbit-game-3ds Here]<br />
| No<br />
| 2016-07-04<br />
|-<br />
| [http://gbatemp.net/threads/trucmuche-2ds-09.404859// Trucmuche 2DS 09]<br />
| An adaptation of the hidden objects game Trucmuche for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Trucmuche2DS09/Trucmuche_2DS_09.rar Here]<br />
| No<br />
| 2015-12-03<br />
|-<br />
| [https://github.com/Steveice10/WorldOf3DSand World of 3DSand]<br />
| A port of World of Sand for the 3DS.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/WorldOf3DSand/releases Here]<br />
| Yes<br />
| 2016-07-12<br />
|-<br />
| [https://github.com/smealum/yeti3DS Yeti3DS]<br />
| A quick and dirty port of Derek Evans' Yeti3D software rendering engine.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/yeti3DS repo]<br />
| Yes<br />
| 2015-08-07<br />
|-<br />
| [https://thp.itch.io/loonies-8192 Loonies 8192]<br />
| A Mini Retro Puzzle for DOS, the PSP and 3DS (Homebrew)<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/loonies-8192 Here]<br />
| No<br />
| 2019-01-27<br />
|-<br />
|}<br />
<br />
=== Emulators ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| ''[https://github.com/st4rk/3DNES 3DNES]''<br />
| A NES emulator, without sound support. ''No longer under development.''<br />
| st4rk, gdkChan<br />
| [https://github.com/St4rk/3DNES/raw/master/3DNES_old.3dsx Here]<br />
| Yes<br />
| 2015-03-28<br />
|-<br />
| [http://asie.pl/homebrew/#atari800 atari800-3DS]<br />
| An Atari 8-bit home computer emulator.<br />
| asie<br />
| [http://asie.pl/homebrew/#atari800 Here]<br />
| [https://github.com/asiekierka/atari800-3ds Yes]<br />
| 2016-10-29<br />
|-<br />
| [https://github.com/StapleButter/blargSnes blargSnes]<br />
| A Super Nintendo (SNES) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/BlargSnes_Compatibility_List here].<br />
| StapleButter<br />
| [http://blargsnes.kuribo64.net/download/blargSnes_1.3b.zip Here]<br />
| Yes<br />
| 2015-06-12<br />
|-<br />
| [https://github.com/xerpi/CHIP-3DS CHIP-3DS]<br />
| A simple and slow CHIP-8 emulator.<br />
| xerpi<br />
| Build from [https://github.com/xerpi/CHIP-3DS repo] (alt. [https://www.mediafire.com/?y94yjhzf70fsfsi here])<br />
| Yes<br />
| 2015-04-02<br />
|-<br />
| [https://gbatemp.net/threads/chip8-3ds.434425/ CHIP8-2DS]<br />
| CHIP-8 emulator with savestates and touch controls.<br />
| nopy4869<br />
| [https://github.com/nopy4869/CHIP8-2DS/releases Here]<br />
| Yes<br />
| 2016-07-20<br />
|-<br />
| [https://github.com/shinyquagsire23/gpsp CitrAGB]<br />
| Yet another GBA emulator for the 3DS.<br />
| [[User:shinyquagsire23|Shiny Quagsire]]<br />
| Build from [https://github.com/shinyquagsire23/gpsp/tree/master/3ds repo] (alt. [https://www.dropbox.com/s/sxb7x34u58g4zo2/3ds.3dsx?dl=0 here])<br />
| Yes<br />
| 2015-09-21<br />
|-<br />
| [https://github.com/Steveice10/GameYob GameYob]<br />
| A Game Boy (Color) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/GameYob_3DS_Compatibility_List here].<br />
| Drenn/Steveice10<br />
| [https://github.com/Steveice10/GameYob/releases Here]<br />
| Yes<br />
| 2016-07-17<br />
|-<br />
| [https://github.com/mgba-emu/mgba mGBA]<br />
| A GBA emulator that runs well without kernel hax.<br />
| endrift<br />
| [https://mgba.io/downloads.html Here]<br />
| Yes<br />
| 2016-10-13<br />
|-<br />
| [https://github.com/mrdanielps/r3Ddragon r3Ddragon]<br />
| A WIP Virtual Boy emulator for the 3DS based on Reality Boy / Red Dragon.<br />
| mrdanielps<br />
| [https://github.com/mrdanielps/r3Ddragon/releases Here]<br />
| Yes<br />
| 2016-08-16<br />
|-<br />
| [https://github.com/libretro/RetroArch RetroArch]<br />
| A multisystem emulator. (GB, GBA, SNES, Genesis, CPS1, CPS2, etc.)<br />
| libretro<br />
| [http://buildbot.libretro.com/nightly/nintendo/3ds/ Here]<br />
| Yes<br />
| Undergoing rapid development.<br />
|-<br />
| [https://github.com/bubble2k16/snes9x_3ds SNES9x for 3DS]<br />
| A SNES emulator for the old 3DS / 2DS. Optimised from Snes9x 1.43 and runs many games at full speed. A compatibility list can be found [http://wiki.gbatemp.net/wiki/Snes9x_for_3DS here].<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/snes9x_3ds/releases Here]<br />
| Yes<br />
| 2017-02-11<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds VirtuaNES for 3DS]<br />
| A NES emulator for the old 3DS / 2DS. Optimised from VirtuaNES 0.9.7 and runs many games at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/emus3ds/releases Here]<br />
| Yes<br />
| 2017-03-23<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds TemperPCE for 3DS]<br />
| A PC-Engine/TurboGrafx-16 emulator for the old 3DS / 2DS. Optimised from Temper; runs all games, including CD-ROM and SGX games, at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/temperpce_3ds/releases Here]<br />
| Yes<br />
| 2017-06-19<br />
|-<br />
|}<br />
<br />
===Theme managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool 3DS HomeMenu extdata Tool]<br />
| Tool for accessing the SD extdata which Home Menu uses. This essentially allows writing custom themes to extdata which get loaded at Home Menu startup.<br />
| [[User:yellows8|yellows8]]<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool/releases Here]<br />
| Yes<br />
| 2015-08-17<br />
|-<br />
| [https://github.com/Rinnegatamante/CHMM2 Custom Home Menu Manager 2]<br />
| Theme manager for Nintendo 3DS. Discontinued.<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/CHMM2.rar Here]<br />
| Yes<br />
| 2016-07-04<br />
|-<br />
| [https://github.com/ErmanSayin/Themely/tree/88e93816e3b43a40bcee25b1a7a8c71ef6a37db8 Themely]<br />
| Theme manager for Nintendo 3DS with 3dsthem.es integration.<br />
| ErmanSayin<br />
| [https://github.com/ErmanSayin/Themely/releases/tag/v1.3.1 Here]<br />
| No longer; 1.3.1 is the last FOSS version<br />
| 2017-06-28<br />
|- <br />
|[https://github.com/usagirei/3DS-Theme-Editor Usagi 3DS Theme Editor]<br />
|A simple 3DS theme editor for PC. Requires .NET to be installed on your PC.<br />
|[https://github.com/usagirei usagirei]<br />
|[https://github.com/usagirei/3DS-Theme-Editor/archive/master.zip Here]<br />
|Unknown<br />
|2017-05-28<br />
|-<br />
| [https://gbatemp.net/threads/release-anemone3ds-a-complete-theme-and-splash-manager-for-your-3ds.482804/ Anemone3DS]<br />
| New theme and Luma splash screen manager, created to fill the gap left by its predecessors.<br />
| [[User:astronautlevel2|astronautlevel2]]<br />
| [https://github.com/astronautlevel2/Anemone3DS/releases/ Here]<br />
| Yes<br />
| 2018-05-13<br />
|}<br />
<br />
===Title managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI]<br />
| Open source CIA (un)installer and launcher.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases?after=2.0.0 Here]<br />
| Yes<br />
| 2015-12-02<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI 2]<br />
| Multipurpose file/title/ticket/save manager<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases Here]<br />
| Yes<br />
| 2018-08-21<br />
|-<br />
| [https://gbatemp.net/threads/no-longer-working-community-freeshop-fork-open-source-eshop-alternative.483159/ FreeShop]<br />
| GUI CDN title installer<br />
| TheCruel/arc13/Paul/evi<br />
| [https://notabug.org/evi/freeShop/releases Here]<br />
| Yes<br />
| 2018-05-17<br />
|-<br />
| [https://gbatemp.net/threads/release-nasa-universal-cia-manager-for-fw-4-1-10-3.409806/ NASA]<br />
| Universal CIA Manager for FWs 4.1 - 10.7<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/site/3ds_hbs.php Here]<br />
| No<br />
| 2016-04-13<br />
|}<br />
<br />
Note: downloading non-system applications from the CDN no longer works in any known homebrew, regardless of whether a signed ticket is installed (see also: [[11.8.0-41#Server-side_changes]]).<br />
<br />
=== Save managers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/save-data-manager-and-editor-for-firmware-up-to-9-9.396245/ save_manager]<br />
| Proof of concept save exporter/importer<br />
| [[User:profi200|profi200]]<br />
| [http://gbatemp.net/attachments/save_manager_-with_smdh-zip.24349/ Here]<br />
| [https://gist.github.com/profi200/d0d092c11d0eb0692748 Yes]<br />
| 2015-09-13<br />
|-<br />
| [https://github.com/meladroit/svdt svdt]<br />
| Save Data Explorer/Manager<br />
| [[User:meladroit|meladroit]]<br />
| [https://github.com/meladroit/svdt/releases Here]<br />
| Yes<br />
| 2015-10-16<br />
|-<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ JK's Save Manager]<br />
| Save/Extdata Manager<br />
| JK_<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ Here]<br />
| [https://github.com/J-D-K/JKSM/ Yes]<br />
| 2016-09-29<br />
|-<br />
| JK's Save Manager for Rosalina<br />
| Modded version of JKSM for use as .3dsx on Luma 8+<br />
| Phalk, JK_<br />
| [https://github.com/Phalk/JKSM/releases Here]<br />
| Yes<br />
| 2017-07-12<br />
|-<br />
| [https://github.com/FlagBrew/PKSM PKSM]<br />
| Save editor for Pokémon generations 4 to 7<br />
| Bernardo Giordano<br />
| [https://github.com/FlagBrew/PKSM/releases Here]<br />
| Yes<br />
| 2020-01-29<br />
|-<br />
| [https://github.com/FlagBrew/Checkpoint Checkpoint]<br />
| Fast and simple homebrew save manager for 3DS and Switch written in C++<br />
| Bernardo Giordano<br />
| [https://github.com/FlagBrew/Checkpoint/releases Here]<br />
| Yes<br />
| 2019-12-09<br />
|-<br />
| [https://github.com/phijor/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness)<br />
| phijor<br />
| [https://github.com/phijor/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-01-22<br />
|-<br />
| [https://github.com/rboninsegna/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness and ownership)<br />
| phijor, [[User:Ryccardo|Ryccardo]]<br />
| [https://github.com/rboninsegna/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-08-13<br />
|}<br />
<br />
=== File servers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/mtheall/ftpd ftpd (ftBrony)]<br />
| An FTP server.<br />
| [https://github.com/mtheall mtheall]<br />
| [https://github.com/mtheall/ftpd/releases Here]<br />
| Yes<br />
| 2016-09-17<br />
|-<br />
| ''[https://github.com/iamevn/FTP-3DS FTP-3DS]''<br />
| Fork of ftBrony with a Nintendo theme. ''No longer under development and without repo.''<br />
| [[User:iamevn|iamevn]]<br />
| N/A<br />
| Yes (''No source officially available.'')<br />
| N/A<br />
|-<br />
| [https://github.com/FloatingStar/FTP-GMX FTP - Graphic ModifierX Edition]<br />
| Fork of ftpd with aesthetic modifications.<br />
| [[User:FloatingStar|FloatingStar]]<br />
| [https://github.com/FloatingStar/FTP-GMX/releases Here]<br />
| Yes<br />
| 2016-01-27<br />
|-<br />
| [https://github.com/smealum/ftpony ftpony]<br />
| A basic FTP server, useful for testing new homebrew versions without swapping the SD card. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/ftpony repo] (alt. [https://mega.co.nz/#!nchBkL7B!T3vXnX4q8Uwp6APYYTDSZi2bkm25la-Qyz6j4CjsllI here])<br />
| Yes<br />
| 2014-11-24<br />
|}<br />
<br />
=== Icon Packs ===<br />
Icon Packs are <code>SMDH</code> Packs for homebrew apps.<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-simplok-for-the-homebrew-launcher.396750/ Simplok]<br />
| The first 3DS Icon pack.<br />
| [[User:link6155|link6155]]<br />
| [http://1drv.ms/1EJCq2e Here]<br />
| 2015-09-12<br />
|-<br />
| ''[https://gbatemp.net/threads/1lp-icon-pack.402018/ 1LP]''<br />
| Another 3DS Icon pack. ''Repo is dead, no alternate downloads available.''<br />
| [[User:100pcrack|100pcrack]]<br />
| N/A<br />
| 2015-12-22<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Modern UI]<br />
| A simple icon pack with a flat and minimalist design.<br />
| [[User:LouchDaishiteru|LouchDaishiteru]]<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Here]<br />
| 2016-02-15<br />
|}<br />
<br />
=== Demos ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/halcy/nordlicht19 Skate Station]<br />
| A demo for the 3DS featuring music and 3D effects.<br />
| SVatG<br />
| [https://aka-san.halcy.de/nordlicht2019/Skate%20Station.zip Here]<br />
| Yes<br />
| July 2019<br />
|-<br />
| cubedemo<br />
| A short demo of Homebrew on the 3DS, with working sound.<br />
| [[User:plutoo|plutoo]]<br />
| [https://mega.co.nz/#!KUQFiQYA!pv8HDEyrmuX6Eyw2hW0opL7gf9Ztmjd9J5pPsvs_rD4 Here]<br />
| No<br />
| N/A<br />
|-<br />
| [https://gbatemp.net/threads/release-3ds-rgb-led-test-program.441633/ MCU Bricker / LED Rave]<br />
| Make the notification LED glow in different colors<br />
| [[User:MarcusD|MarcusD]]<br />
| [https://gbatemp.net/attachments/rgb-zip.124119/ Here]<br />
| Yes, but down<br />
| Late 2016?<br />
|-<br />
| Spine 2D<br />
| Demo of [http://esotericsoftware.com/ Spine]'s 2D skeletal animations<br />
| [[User:Cruel|Cruel]]<br />
| [https://mega.nz/#!Xg411B5R!kcVHP69Ilggmjh4q5OYmr2cFvf5UGdHWA98-_VttDTo 3DSX]; [https://mega.nz/#!z8gxHSQb!H0as1A4wqYrdKBhXJwdYik7nPd_msXJhz5N1CeZm1Iw CIA]<br />
| No<br />
| N/A<br />
|-<br />
| [http://www.pouet.net/prod.php?which=66607 demo ou mourir]<br />
| Small demo for the 3DS with music and 2D effects<br />
| Desire<br />
| [http://mudlord.info/democrap/dsr_demooumourir.zip Here]<br />
| No<br />
| November 2015<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Applications&diff=21153
Homebrew Applications
2020-01-19T16:23:31Z
<p>Oreo639: Correct mistaken changes to the wrong application in the previous change to this page</p>
<hr />
<div>== Installing ==<br />
Applications are installed by copying the necessary files either directly to the <code>3ds/</code> folder in the root of the SD card (preferred for new designs) or to a subdirectory of <code>3ds/</code>, in which case the subfolder must have the same name as its executable. Applications come with up to three files:<br />
* <code>[appname].3dsx</code>: The executable.<br />
* <code>[appname].smdh</code>: The icon/metadata. (Optional; it may also be embedded in the <code>.3dsx</code>.)<br />
* <code>[appname].xml</code>: The list of supported targets (i.e. installed titles which the app supports replacing in memory at runtime, thus inheriting their permissions), and of any arguments to be passed to the <code>.3dsx</code>. (Optional)<br />
<br />
A standalone .xml file can point to a differently-named .3dsx, launching it with potentially different arguments so that a single application can run in different modes.<br />
<br />
The [[Homebrew Launcher]] will scan the SD card for all <code>.3dsx</code> files, but will only display an icon for those that have one according to the format described above. Recent enough versions can freely navigate the filesystem to select an application.<br />
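The folder layout described above can be checked on a PC before the SD card goes back into the console. The following is only a rough sketch in Python: the function name <code>classify_3dsx</code> and the status strings are made up for illustration, and it does not reproduce what the Homebrew Launcher itself does when scanning.<br />

```python
from pathlib import Path


def classify_3dsx(sd_root):
    """Map each .3dsx found under <sd_root>/3ds to a layout status string.

    Hypothetical helper: it only applies the naming rule from the text
    (file in 3ds/ directly, or in a subfolder named like the executable).
    """
    base = Path(sd_root) / "3ds"
    result = {}
    for exe in sorted(base.rglob("*.3dsx")):
        rel = exe.relative_to(base)
        if len(rel.parts) == 1:
            # e.g. 3ds/app.3dsx
            status = "ok: directly in 3ds/"
        elif len(rel.parts) == 2 and rel.parts[0] == exe.stem:
            # e.g. 3ds/app/app.3dsx
            status = "ok: subfolder matches executable name"
        else:
            # e.g. 3ds/foo/bar.3dsx or deeper nesting
            status = "check: subfolder name differs from executable"
        result[rel.as_posix()] = status
    return result


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py /path/to/sdcard
    for path, status in classify_3dsx(sys.argv[1]).items():
        print(f"{path}: {status}")
```

Files flagged with "check" are not necessarily broken, since recent launcher versions can browse the filesystem freely; the flag only marks entries that do not follow the naming rule.<br />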
<br />
== List ==<br />
<br />
=== Launchers ===<br />
<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Open-Source<br />
|-<br />
| [https://github.com/fincs/new-hbmenu Homebrew Launcher]<br />
| Run homebrew on your 3DS! Compatible with Rosalina and all prior 3dsx loading solutions.<br />
| [https://devkitpro.org devkitPro]<br />
| [https://github.com/fincs/new-hbmenu/releases Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Starter Pack]<br />
| Everything to get you started.<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/starter.zip Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Launcher (v1.x)]<br />
| The old version of the 3DS Homebrew Launcher, originally created for ninjhax 1.x (Discontinued)<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/boot.3dsx Here]<br />
| Yes<br />
|-<br />
| [http://gbatemp.net/threads/release-homebrew-launcher-with-grid-layout.397527/ Mashers' HBL]<br />
| Homebrew Launcher with grid and folder support. (Discontinued)<br />
| [[User:Mashers|Mashers]]<br />
| [https://github.com/d0k3/3DS-Extended-Homebrew-Starter-Pack/blob/35b8ab7dc40cb550b6ea45da319cdd0a0a3b2b54/boot.3dsx Here]<br />
| No (source lost when Mashers retired)<br />
|}<br />
<br />
=== Applications ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/VideahGams/3dsfetch 3dsfetch]<br />
| Small 3DS version of a popular Linux ricing script called screenfetch.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/3dsfetch/tree/master Here]<br />
| Yes<br />
| 2015-09-17<br />
|-<br />
| [https://github.com/JohnodonCode/TSI9 TSI9]<br />
| A simple program for detecting touch screen input.<br />
| [[User:Johnodon|Johnodon]]<br />
| [https://github.com/JohnodonCode/TSI9/releases Here]<br />
| Yes<br />
| 2020-01-18<br />
|-<br />
| [https://github.com/joel16/3DSident/ 3DSident]<br />
| Identity tool for the Nintendo 3DS heavily inspired by PSPident.<br />
| [[User:Joel16|Joel16]]<br />
| [https://github.com/joel16/3DSident/releases Here]<br />
| Yes<br />
| 2018-08-02<br />
|-<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Clear MAC Filter]<br />
| Reset 8-hour per-console StreetPass rate limiting<br />
| tastymeatball<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Here]<br />
| Yes<br />
| 2018-08-24<br />
|-<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases CtrRGBPATTY]<br />
| Generate patches that edit LED notifications<br />
| CPunch<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases Here]<br />
| Yes<br />
| 2017-11-03<br />
|-<br />
| [https://github.com/plutooo/ctrrpc ctrrpc]<br />
| A small and easily extensible RPC server/client written in C/Python. Allows you to quickly poke service-commands and <code>syscall</code>s over Wi-Fi from a Python shell on your PC. Useful during reverse-engineering. ''No longer under (active) development?''<br />
| [[User:plutooo|plutoo]]<br />
| Build from [https://github.com/plutooo/ctrrpc repo]<br />
| Yes<br />
| 2014-11-10<br />
|-<br />
| [https://github.com/yellows8/ctr-streaming-server ctr-streaming-server]<br />
| A 3DS homebrew audio/video playback server. It can also send [[HID_Shared_Memory|HID]] state to the client (see the README) when enabled. The included <code>parse_hidstream</code> tool can be used to parse that HID data to simulate keyboard/mouse input events, via Linux <code>uinput</code>. ''No longer under (active) development?''<br />
| [[User:yellows8|yellows8]]<br />
| Build from [https://github.com/yellows8/ctr-streaming-server repo]<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/DownloadMii/DownloadMii-3DS DownloadMii]<br />
| A WIP repo-based online marketplace for homebrew applications & games.<br />
| [[User:filfat|filfat]]<br />
| Build from [https://github.com/DownloadMii/DownloadMii-3DS repo]<br />
| Yes<br />
| 2015-11-24<br />
|-<br />
| [https://github.com/linoma/fb43ds fb43ds]<br />
| A simple 3DS Facebook chat client<br />
| [[User:linoma|linoma]]<br />
| Build from [https://github.com/linoma/fb43ds repo]<br />
| Yes<br />
| 2015-04-07<br />
|-<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot for-anyone-who-walks-a-lot]<br />
| Tool to get past the 10 coin per day limit on earning Play Coins by walking.<br />
| [[User:iamevn|iamevn]]<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot/releases Here]<br />
| Yes<br />
| 2016-03-26<br />
|-<br />
| [https://github.com/zeta0134/3ds-homebrew-browser Homebrew Browser]<br />
| Download homebrew from the internet!<br />
| [[User:cromo|cromo]], [[User:zeta0134|zeta0134]]<br />
| [https://github.com/zeta0134/3ds-homebrew-browser/releases Here]<br />
| Yes<br />
| 2015-10-07<br />
|-<br />
| [https://github.com/MrJPGames/NFCReader NFCReader]<br />
| Allows you to use your 3DS as an NFC/RFID UID Scanner.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/NFCReader/releases Here]<br />
| Yes<br />
| 2017-01-21<br />
|-<br />
| [https://github.com/SciresM/ScreenInfo ScreenInfo]<br />
| Identify whether New 3DS LCD panels are TN or IPS.<br />
| [[User:SciresM|SciresM]]<br />
| [https://github.com/SciresM/ScreenInfo/releases Here]<br />
| Yes<br />
| 2016-09-04<br />
|}<br />
<br />
=== Game Engines ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/TurtleP/LovePotion Löve Potion]<br />
| [https://love2d.org/ LOVE2D] for 3DS Homebrew.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/LovePotion/releases Here]<br />
| [https://github.com/TurtleP/LovePotion Yes]<br />
| 2018-08-27<br />
|-<br />
| [https://ctrulua.github.io/ ctrµLua]<br />
| A Lua interpreter for 3DS, brought to life by the remnants of the µLua community.<br />
| [[User:Firew0lf|Firew0lf]], Reuh, Negi<br />
| [https://github.com/ctruLua/ctruLua/releases Here]<br />
| Yes<br />
| 2016-06-27<br />
|-<br />
| [https://blog.easyrpg.org/2016/05/player-for-nintendo-3ds/ EasyRPG Player]<br />
| RPG Maker 2000/2003 interpreter<br />
| [[User:Rinnegatamante|Rinnegatamante]] & EasyRPG Team<br />
| [https://easyrpg.org/player/downloads/ Here]<br />
| [https://github.com/EasyRPG/Player Yes]<br />
| 2019-03-03<br />
|-<br />
| [https://github.com/Rinnegatamante/lpp-3ds LuaPlayer+ 3DS]<br />
| The first Lua interpreter for 3DS homebrew, based on Lua 5.3.1.<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [https://github.com/Rinnegatamante/lpp-3ds/releases Here]<br />
| Yes<br />
| 2016-09-21<br />
|-<br />
| [http://vault.digitalmzx.net MegaZeux 3DS]<br />
| A port of the MegaZeux GCS to the 3DS.<br />
| MegaZeux developers<br />
| [http://vault.digitalmzx.net Here]<br />
| [https://github.com/AliceLR/megazeux Yes]<br />
| 2018-03-04<br />
|}<br />
<br />
=== Games ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [http://gbatemp.net/threads/release-100-boxes-2ds.384714/ 100 Boxes 2DS]<br />
| A remake of homebrew "100 Boxes puzzle" for DS and GBA.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/100Boxes2DS/100_Boxes_2DS.rar Here]<br />
| No<br />
| 2015-11-11<br />
|-<br />
| [https://github.com/MrJPGames/2048-3D 2048-3D]<br />
| A port of the popular game 2048 for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/2048-3D/releases Here]<br />
| Yes<br />
| 2016-02-12<br />
|-<br />
| ''[https://github.com/smealum/3dscraft 3DSCraft]''<br />
| A Minecraft port for the 3DS. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/3dscraft repo] (alt. [https://smealum.github.io/3dscraft/downloads/3dscraft_141120.zip here])<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/markwinap/3DS_Nyan_Cat 3DS Nyan Cat]<br />
| A port of Nyan Cat for the 3DS, using <code>LIBSF2D</code>.<br />
| [[User:markwinap|markwinap]]<br />
| Build from [https://github.com/markwinap/3DS_Nyan_Cat repo] (alt. [https://www.dropbox.com/s/e400my3xm0zw74r/nyan_cat.zip?dl=0 here])<br />
| Yes<br />
| 2015-05-26<br />
|-<br />
| [https://gbatemp.net/threads/preview-ld-34-port-antibounce.406361 Antibounce]<br />
| "Move your player to bounce around and collect coins. Go between screens through the holes in the sides of the floor. 3D can also be enabled."<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/Antibounce/releases Here]<br />
| Yes<br />
| 2015-12-23<br />
|-<br />
| [https://github.com/Magicrafter13/Breakout Breakout]<br />
| "A 3ds Breakout Clone."<br />
| [[User:Magicrafter13|Magicrafter13]]<br />
| [https://github.com/Magicrafter13/Breakout/releases Here]<br />
| Yes<br />
| 2017-10-17<br />
|-<br />
| ''[https://github.com/UnsureSherlock/checkers3ds checkers3ds]''<br />
| A checkers game in glorious ASCII. ''No longer under development.''<br />
| [[User:UnsureSherlock|UnsureSherlock]]<br />
| Build from [https://github.com/UnsureSherlock/checkers3ds repo]<br />
| Yes<br />
| 2016-02-25<br />
|-<br />
| [https://github.com/Kaisogen/CookieCollector-3DS- Cookie Collector]<br />
| A tiny adaptation of the popular [https://en.wikipedia.org/wiki/Cookie_Clicker Cookie Clicker] game for the 3DS.<br />
| [[User:Kaisogen|Kaisogen]]<br />
| [https://github.com/Kaisogen/CookieCollector-3DS-/releases Here]<br />
| Yes<br />
| 2017-06-04<br />
|-<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS Cookie Clicker 3DS]<br />
| A simple Cookie Clicker type of game inspired by [[User:Kaisogen|Kaisogen]]'s Cookie Collector<br />
| [[User:TheMachinumps|TheMachinumps]]<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS/releases Here]<br />
| Yes<br />
| 2016-08-27<br />
|-<br />
| [https://gbatemp.net/threads/release-drawattack-networked-drawing-game.402291/ DrawAttack]<br />
| Online multiplayer drawing game, like Pictionary.<br />
| [[User:Cruel|Cruel]]<br />
| [https://github.com/Cruel/DrawAttack/releases Here]<br />
| Yes<br />
| 2016-04-17<br />
|-<br />
| [https://github.com/masterfeizz/EDuke3D EDuke3D]<br />
| An unofficial port of EDuke32 for the 3DS.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/EDuke3D/releases Here]<br />
| Yes<br />
| 2016-05-09<br />
|-<br />
| [https://gbatemp.net/threads/release-hamsters-2ds.383457/ Hamsters 2DS]<br />
| A text-based hamster breeding game.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Hamsters2DS/Hamsters_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/BHSPitMonkey/Helii3DS Helii]<br />
| A port of [https://github.com/BHSPitMonkey/Helii3D Helii] for the 3DS.<br />
| [[User:BHSPitMonkey|BHSPitMonkey]]<br />
| [https://github.com/BHSPitMonkey/Helii3DS/releases Here]<br />
| Yes<br />
| 2015-09-18<br />
|-<br />
| [https://github.com/sgowen/insectoid-defense Insectoid Defense]<br />
| A Sci-Fi Tower Defense game.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/insectoid-defense/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://github.com/VideahGams/NumberFucker3DS NumberFucker3DS]<br />
| Simple math game, originally used as a debug game for LövePotion.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/NumberFucker3DS Here]<br />
| Yes<br />
| 2015-09-19<br />
|-<br />
|[https://gbatemp.net/threads/release-zelda-roth-for-3ds.425503/ Zelda ROTH for 3DS]<br />
|A port of Legend of Zelda: Return of the Hylian, a Zelda fangame, to 3DS.<br />
|[[User:nop90|nop90]]<br />
|[https://github.com/nop90/ZeldaROTH/releases Here]<br />
|Yes<br />
|2016-09-11<br />
|-<br />
| [https://gbatemp.net/threads/release-mastermind-3ds.394710/ Mastermind 3DS]<br />
| A port of Mastermind for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Mastermind-3DS/releases Here]<br />
| Yes<br />
| 2015-08-15<br />
|-<br />
| [http://gbatemp.net/threads/release-minesweeper-2ds.384185/ Minesweeper 2DS]<br />
| A port of Minesweeper for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Minesweeper2DS/Minesweeper_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://pyug.at/PyWeek/2012-09 One Whale Trip]<br />
| Five-lane underwater whale swimming/pearl pickup adventure game in Python.<br />
| [[User:thp|thp]]<br />
| [https://bitbucket.org/pyugat/pyweek1209/downloads/OneWhaleTrip-2016-07-18-3DS.zip Here]<br />
| [https://bitbucket.org/pyugat/pyweek1209/src/bce5156dbee72f38c4fcf5d7b3df9cfb9ddd5b0a/3ds Yes]<br />
| 2016-10-02<br />
|-<br />
| [http://gbatemp.net/threads/release-paddle-puffle-3ds.392215/ Paddle Puffle 3DS]<br />
| A port of [http://puffles.gatuno.mx Paddle Puffle] for the 3DS.<br />
| [[User:Peanut42|Peanut42]]<br />
| [http://puffles.gatuno.mx/releases/paddlepuffle3ds.zip Here]<br />
| [https://github.com/gatuno/PaddlePuffle3DS Yes]<br />
| 2015-07-05<br />
|-<br />
| [http://david.dantoine.org/proyecto/26/ Pituka Classics]<br />
| Play CPC classics using [http://david.dantoine.org/proyecto/4/ Pituka Emulator-Core] on 3DS.<br />
| [[User:D_Skywalk|D_Skywalk]]<br />
| [http://david.dantoine.org/descargas/72 Rick Dangerous] [http://david.dantoine.org/descargas/2 Core]<br />
| [http://david.dantoine.org/descargas/4 Yes (core)]<br />
| 2016-02-26<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-shuffle-2ds.398540/ Pixel Shuffle 2DS]<br />
| An adaptation of the puzzle game [http://www.gimme5games.com/play-game/pixelshuffle Pixel Shuffle] for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelShuffle2DS/Pixel_Shuffle_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-swap-2ds.395749/ Pixel Swap 2DS]<br />
| An adaptation of the puzzle games Pixel Swap 1 & 2 for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelSwap2DS/Pixel_Swap_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/smealum/portal3DS Portal3DS]<br />
| An adaptation of [https://en.wikipedia.org/wiki/Portal_(video_game) Portal] for the 3DS.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/portal3DS repo] (Precompiled [http://www.mediafire.com/file/yo463wt6y4tybch/portal3DS.rar here])<br />
| Yes<br />
| 2015-08-18<br />
|-<br />
| [https://github.com/masterfeizz/ctrQuake ctrQuake]<br />
| An unofficial port of Quake for the 3DS, fully playable.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/ctrQuake/releases Here]<br />
| Yes<br />
| 2016-09-16<br />
|-<br />
| [https://gbatemp.net/threads/release-reversi-othello-for-3ds.395442/ Reversi]<br />
| [https://en.wikipedia.org/wiki/Reversi Reversi] for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Othello-3DS/releases Here]<br />
| Yes<br />
| 2016-03-05<br />
|-<br />
| [https://github.com/landm2000/sokoban Sokoban]<br />
| An unofficial port of the puzzle game [https://en.wikipedia.org/wiki/Sokoban Sokoban] for the 3DS.<br />
| [[User:Landm|Landm]]<br />
| [https://github.com/landm2000/sokoban/tree/master Here]<br />
| Yes<br />
| 2016-03-14<br />
|-<br />
| [https://gbatemp.net/threads/release-space-fruit.399088/ Space Fruit]<br />
| Hackathon game by 4 friends ported to 3DS. Asteroids but with fruit.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/SpaceFruit/releases Here]<br />
| Yes<br />
| 2016-04-09<br />
|-<br />
| [https://github.com/sgowen/tappy-plane Tappy Plane]<br />
| A port of [https://en.wikipedia.org/wiki/Flappy_Bird Flappy Bird] for 3DS, but with a colorful plane.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/tappy-plane/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://thp.itch.io/tetrepetete-3ds Tetrepetete 3DS]<br />
| A game with blocks.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/tetrepetete-3ds Here]<br />
| No<br />
| 2016-06-29<br />
|-<br />
| [http://gbatemp.net/threads/release-tilemap-2ds.386733/ TileMap 2DS]<br />
| An adaptation of the puzzle game TileMap for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/TileMap2DS/TileMap_2DS.rar Here]<br />
| No<br />
| 2015-11-03<br />
|-<br />
| [http://gbatemp.net/threads/release-tiles-2ds.385796/ Tiles 2DS]<br />
| An adaptation of the puzzle game Lights Out for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Tiles2DS/Tiles_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://thp.itch.io/that-rabbit-game-3ds That Rabbit Game 3DS]<br />
| Inverse duck hunt with accelerometer input and stereoscopic 3D.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/that-rabbit-game-3ds Here]<br />
| No<br />
| 2016-07-04<br />
|-<br />
| [http://gbatemp.net/threads/trucmuche-2ds-09.404859// Trucmuche 2DS 09]<br />
| An adaptation of the hidden objects game Trucmuche for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Trucmuche2DS09/Trucmuche_2DS_09.rar Here]<br />
| No<br />
| 2015-12-03<br />
|-<br />
| [https://github.com/Steveice10/WorldOf3DSand World of 3DSand]<br />
| A port of World of Sand for the 3DS.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/WorldOf3DSand/releases Here]<br />
| Yes<br />
| 2016-07-12<br />
|-<br />
| [https://github.com/smealum/yeti3DS Yeti3DS]<br />
| A quick and dirty port of Derek Evans' Yeti3D software rendering engine.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/yeti3DS repo]<br />
| Yes<br />
| 2015-08-07<br />
|-<br />
| [https://thp.itch.io/loonies-8192 Loonies 8192]<br />
| A Mini Retro Puzzle for DOS, the PSP and 3DS (Homebrew)<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/loonies-8192 Here]<br />
| No<br />
| 2019-01-27<br />
|-<br />
|}<br />
<br />
=== Emulators ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| ''[https://github.com/st4rk/3DNES 3DNES]''<br />
| A NES emulator, without sound support. ''No longer under development.''<br />
| st4rk, gdkChan<br />
| [https://github.com/St4rk/3DNES/raw/master/3DNES_old.3dsx Here]<br />
| Yes<br />
| 2015-03-28<br />
|-<br />
| [http://asie.pl/homebrew/#atari800 atari800-3DS]<br />
| An Atari 8-bit home computer emulator.<br />
| asie<br />
| [http://asie.pl/homebrew/#atari800 Here]<br />
| [https://github.com/asiekierka/atari800-3ds Yes]<br />
| 2016-10-29<br />
|-<br />
| [https://github.com/StapleButter/blargSnes blargSnes]<br />
| A Super Nintendo (SNES) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/BlargSnes_Compatibility_List here].<br />
| StapleButter<br />
| [http://blargsnes.kuribo64.net/download/blargSnes_1.3b.zip Here]<br />
| Yes<br />
| 2015-06-12<br />
|-<br />
| [https://github.com/xerpi/CHIP-3DS CHIP-3DS]<br />
| A simple and slow CHIP-8 emulator.<br />
| xerpi<br />
| Build from [https://github.com/xerpi/CHIP-3DS repo] (alt. [https://www.mediafire.com/?y94yjhzf70fsfsi here])<br />
| Yes<br />
| 2015-04-02<br />
|-<br />
| [https://gbatemp.net/threads/chip8-3ds.434425/ CHIP8-2DS]<br />
| CHIP-8 emulator with savestates and touch controls.<br />
| nopy4869<br />
| [https://github.com/nopy4869/CHIP8-2DS/releases Here]<br />
| Yes<br />
| 2016-07-20<br />
|-<br />
| [https://github.com/shinyquagsire23/gpsp CitrAGB]<br />
| Yet another GBA emulator for the 3DS.<br />
| [[User:shinyquagsire23|Shiny Quagsire]]<br />
| Build from [https://github.com/shinyquagsire23/gpsp/tree/master/3ds repo] (alt. [https://www.dropbox.com/s/sxb7x34u58g4zo2/3ds.3dsx?dl=0 here])<br />
| Yes<br />
| 2015-09-21<br />
|-<br />
| [https://github.com/Steveice10/GameYob GameYob]<br />
| A Game Boy (Color) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/GameYob_3DS_Compatibility_List here].<br />
| Drenn/Steveice10<br />
| [https://github.com/Steveice10/GameYob/releases Here]<br />
| Yes<br />
| 2016-07-17<br />
|-<br />
| [https://github.com/mgba-emu/mgba mGBA]<br />
| A GBA emulator that runs well without kernel hax.<br />
| endrift<br />
| [https://mgba.io/downloads.html Here]<br />
| Yes<br />
| 2016-10-13<br />
|-<br />
| [https://github.com/mrdanielps/r3Ddragon r3Ddragon]<br />
| A WIP Virtual Boy emulator for the 3DS based on Reality Boy / Red Dragon.<br />
| mrdanielps<br />
| [https://github.com/mrdanielps/r3Ddragon/releases Here]<br />
| Yes<br />
| 2016-08-16<br />
|-<br />
| [https://github.com/libretro/RetroArch RetroArch]<br />
| A multisystem emulator. (GB, GBA, SNES, Genesis, CPS1, CPS2, etc.)<br />
| libretro<br />
| [http://buildbot.libretro.com/nightly/nintendo/3ds/ Here]<br />
| Yes<br />
| Undergoing rapid development.<br />
|-<br />
| [https://github.com/bubble2k16/snes9x_3ds SNES9x for 3DS]<br />
| A SNES emulator for the old 3DS / 2DS. Optimised from Snes9x 1.43 and runs many games at full speed. A compatibility list can be found [http://wiki.gbatemp.net/wiki/Snes9x_for_3DS here].<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/snes9x_3ds/releases Here]<br />
| Yes<br />
| 2017-02-11<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds VirtuaNES for 3DS]<br />
| A NES emulator for the old 3DS / 2DS. Optimised from VirtuaNES 0.9.7 and runs many games at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/emus3ds/releases Here]<br />
| Yes<br />
| 2017-03-23<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds TemperPCE for 3DS]<br />
| A PC-Engine/TurboGrafx-16 emulator for the old 3DS / 2DS. Optimised from Temper; runs all games, including CD-ROM and SGX games, at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/temperpce_3ds/releases Here]<br />
| Yes<br />
| 2017-06-19<br />
|-<br />
|}<br />
<br />
===Theme managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool 3DS HomeMenu extdata Tool]<br />
| Tool for accessing the SD extdata which Home Menu uses. This essentially allows writing custom themes to extdata which get loaded at Home Menu startup.<br />
| [[User:yellows8|yellows8]]<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool/releases Here]<br />
| Yes<br />
| 2015-08-17<br />
|-<br />
| [https://github.com/Rinnegatamante/CHMM2 Custom Home Menu Manager 2]<br />
| Theme manager for Nintendo 3DS. Discontinued.<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/CHMM2.rar Here]<br />
| Yes<br />
| 2016-07-04<br />
|-<br />
| [https://github.com/ErmanSayin/Themely/tree/88e93816e3b43a40bcee25b1a7a8c71ef6a37db8 Themely]<br />
| Theme manager for Nintendo 3DS with 3dsthem.es integration.<br />
| ErmanSayin<br />
| [https://github.com/ErmanSayin/Themely/releases/tag/v1.3.1 Here]<br />
| Not anymore, 1.3.1 last FOSS version<br />
| 2017-6-28<br />
|- <br />
|[https://github.com/usagirei/3DS-Theme-Editor Usagi 3DS Theme Editor]<br />
|A simple 3DS theme editor for PC. You will need to have the .NET Library installed on your PC first before you can use it.<br />
|[https://github.com/usagirei usagirei]<br />
|[https://github.com/usagirei/3DS-Theme-Editor/archive/master.zip Here]<br />
|Not sure<br />
|2017-05-28<br />
|-<br />
| [https://gbatemp.net/threads/release-anemone3ds-a-complete-theme-and-splash-manager-for-your-3ds.482804/ Anemone3DS]<br />
| New theme and Luma splash screen manager, created to fill the gap left by its predecessors.<br />
| [[User:astronautlevel2]]<br />
| [https://github.com/astronautlevel2/Anemone3DS/releases/ Here]<br />
| Yes<br />
| 2018-5-13<br />
|}<br />
<br />
===Title managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI]<br />
| Open source CIA (un)installer and launcher.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases?after=2.0.0 Here]<br />
| Yes<br />
| 2015-12-02<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI 2]<br />
| Multipurpose file/title/ticket/save manager<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases Here]<br />
| Yes<br />
| 2018-8-21<br />
|-<br />
| [https://gbatemp.net/threads/no-longer-working-community-freeshop-fork-open-source-eshop-alternative.483159/ FreeShop]<br />
| GUI CDN title installer<br />
| TheCruel/arc13/Paul/evi<br />
| [https://notabug.org/evi/freeShop/releases Here]<br />
| Yes<br />
| 2018-5-17<br />
|-<br />
| [https://gbatemp.net/threads/release-nasa-universal-cia-manager-for-fw-4-1-10-3.409806/ NASA]<br />
| Universal CIA Manager for FWs 4.1 - 10.7<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/site/3ds_hbs.php Here]<br />
| No<br />
| 2016-04-13<br />
|}<br />
<br />
Note: downloading non-system applications from the CDN is broken in all known homebrew, regardless of whether a signed ticket is installed (see also: [[11.8.0-41#Server-side_changes]]).<br />
<br />
=== Save managers===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/save-data-manager-and-editor-for-firmware-up-to-9-9.396245/ save_manager]<br />
| Proof of concept save exporter/importer<br />
| [[User:profi200|profi200]]<br />
| [http://gbatemp.net/attachments/save_manager_-with_smdh-zip.24349/ Here]<br />
| [https://gist.github.com/profi200/d0d092c11d0eb0692748 Yes]<br />
| 2015-09-13<br />
|-<br />
| [https://github.com/meladroit/svdt svdt]<br />
| Save Data Explorer/Manager<br />
| [[User:meladroit|meladroit]]<br />
| [https://github.com/meladroit/svdt/releases Here]<br />
| Yes<br />
| 2015-10-16<br />
|-<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ JK's Save Manager]<br />
| Save/Extdata Manager<br />
| JK_<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ Here]<br />
| [https://github.com/J-D-K/JKSM/ Yes]<br />
| 2016-09-29<br />
|-<br />
| JK's Save Manager for Rosalina<br />
| Modded version of JKSM for use as .3dsx on Luma 8+<br />
| Phalk, JK_<br />
| [https://github.com/Phalk/JKSM/releases Here]<br />
| Yes<br />
| 2017-7-12<br />
|-<br />
| [https://github.com/BernardoGiordano/PKSM PKSM]<br />
| Save editor for Pokémon generations 4 to 7<br />
| Bernardo Giordano<br />
| [https://github.com/BernardoGiordano/PKSM/releases Here]<br />
| Yes<br />
| 2017-8-3<br />
|-<br />
| [https://github.com/phijor/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness)<br />
| phijor<br />
| [https://github.com/phijor/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-1-22<br />
|-<br />
| [https://github.com/rboninsegna/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness and ownership)<br />
| phijor, [[User:Ryccardo|Ryccardo]]<br />
| [https://github.com/rboninsegna/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-8-13<br />
|}<br />
<br />
=== File servers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/mtheall/ftpd ftpd (ftBrony)]<br />
| An FTP server.<br />
| [https://github.com/mtheall mtheall]<br />
| [https://github.com/mtheall/ftpd/releases Here]<br />
| Yes<br />
| 2016-09-17<br />
|-<br />
| ''[https://github.com/iamevn/FTP-3DS FTP-3DS]''<br />
| Fork of ftBrony with a Nintendo theme. ''No longer under development and without repo.''<br />
| [[User:iamevn|iamevn]]<br />
| N/A<br />
| Yes (''No source officially available.'')<br />
| N/A<br />
|-<br />
| [https://github.com/FloatingStar/FTP-GMX FTP - Graphic ModifierX Edition]<br />
| Fork of ftpd with aesthetic modifications.<br />
| [[User:FloatingStar|FloatingStar]]<br />
| [https://github.com/FloatingStar/FTP-GMX/releases Here]<br />
| Yes<br />
| 2016-01-27<br />
|-<br />
| [https://github.com/smealum/ftpony ftpony]<br />
| A basic FTP server, useful for testing new homebrew versions without swapping the SD card. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/ftpony repo] (alt. [https://mega.co.nz/#!nchBkL7B!T3vXnX4q8Uwp6APYYTDSZi2bkm25la-Qyz6j4CjsllI here])<br />
| Yes<br />
| 2014-11-24<br />
|}<br />
<br />
=== Icon Packs ===<br />
Icon Packs are <code>SMDH</code> Packs for homebrew apps.<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-simplok-for-the-homebrew-launcher.396750/ Simplok]<br />
| The first 3DS Icon pack.<br />
| [[User:link6155|link6155]]<br />
| [http://1drv.ms/1EJCq2e Here]<br />
| 2015-09-12<br />
|-<br />
| ''[https://gbatemp.net/threads/1lp-icon-pack.402018/ 1LP]''<br />
| Another 3DS Icon pack. ''Repo is dead, no alternate downloads available.''<br />
| [[User:100pcrack|100pcrack]]<br />
| N/A<br />
| 2015-12-22<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Modern UI]<br />
| A simple icon pack with a flat and minimalist design.<br />
| [[User:LouchDaishiteru|LouchDaishiteru]]<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Here]<br />
| 2016-02-15<br />
|}<br />
<br />
=== Demos ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/halcy/nordlicht19 Skate Station]<br />
| A demo for the 3DS featuring music and 3D effects <br />
| SVatG<br />
| [https://aka-san.halcy.de/nordlicht2019/Skate%20Station.zip Here]<br />
| Yes<br />
| July 2019<br />
|-<br />
| cubedemo<br />
| A short demo of Homebrew on the 3DS, with working sound.<br />
| [[User:plutoo|plutoo]]<br />
| [https://mega.co.nz/#!KUQFiQYA!pv8HDEyrmuX6Eyw2hW0opL7gf9Ztmjd9J5pPsvs_rD4 Here]<br />
| No<br />
| N/A<br />
|-<br />
| [https://gbatemp.net/threads/release-3ds-rgb-led-test-program.441633/ MCU Bricker / LED Rave]<br />
| Make the notification LED glow in different colors<br />
| [[User:MarcusD]]<br />
| [https://gbatemp.net/attachments/rgb-zip.124119/ Here]<br />
| Yes, but down<br />
| Late 2016?<br />
|-<br />
| Spine 2D<br />
| Demo of [http://esotericsoftware.com/ Spine]'s 2D skeletal animations<br />
| [[User:Cruel|Cruel]]<br />
| [https://mega.nz/#!Xg411B5R!kcVHP69Ilggmjh4q5OYmr2cFvf5UGdHWA98-_VttDTo 3DSX]; [https://mega.nz/#!z8gxHSQb!H0as1A4wqYrdKBhXJwdYik7nPd_msXJhz5N1CeZm1Iw CIA]<br />
| No<br />
| N/A<br />
|-<br />
| [http://www.pouet.net/prod.php?which=66607 demo ou mourir]<br />
| Small demo for the 3DS with music and 2D effects<br />
| Desire<br />
| [http://mudlord.info/democrap/dsr_demooumourir.zip Here]<br />
| No<br />
| November 2015<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=BCSAR&diff=21099
BCSAR
2019-11-15T02:14:36Z
<p>Oreo639: Update names</p>
<hr />
<div>[[Category:File formats]]<br />
== Overview ==<br />
<br />
The BCSAR (Binary CTR Sound ARchive) format is the 3DS's equivalent of the Wii's BRSAR format. The structures differ, but the two formats serve the same purpose.<br />
<br />
BCSAR files are located in the RomFS, usually under "romfs:/sound/<name>.bcsar". A BCSAR contains audio data in various formats, such as CSTM, CWSD, CSEQ, and CWAV.<br />
<br />
== BCSAR Header ==<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| MAGIC "CSAR"<br />
|-<br />
| 0x4<br />
| 0x2<br />
| Byte order mark (0xFEFF = Big Endian, 0xFFFE = Little Endian)<br />
|-<br />
| 0x6<br />
| 0x2<br />
| Length of BCSAR header<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Version<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Length of the entire BCSAR (starting from 0x0)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Number of main partitions in the BCSAR (STRG + INFO + FILE = 3)<br />
|-<br />
| 0x14<br />
| 0x4<br />
| STRG partition reference ID? (Always 0x2000)<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Location of STRG partition<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Length of STRG partition<br />
|-<br />
| 0x20<br />
| 0x4<br />
| INFO partition reference ID? (Always 0x2001)<br />
|-<br />
| 0x24<br />
| 0x4<br />
| Location of INFO partition<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Length of INFO partition<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| Main FILE partition reference ID? (Always 0x2002)<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Location of main FILE partition<br />
|-<br />
| 0x34<br />
| 0x4<br />
| Length of main FILE partition<br />
|-<br />
| 0x38<br />
| 0x4<br />
| Reserved for 4th main partition location?<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| Reserved for 4th main partition length?<br />
|-<br />
|}<br />
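<br />
The header table above can be read with a few fixed-offset unpacks. The following is only a minimal sketch: it assumes the whole file is in memory, that the byte order mark is compared in file order (so the bytes FF FE mean little endian), and that the partition records repeat every 0xC bytes from 0x14. The names <code>parse_bcsar_header</code> and <code>PARTITION_NAMES</code> are made up for this example.<br />

```python
import struct

# Reference IDs taken from the header table; the names are for display only.
PARTITION_NAMES = {0x2000: "STRG", 0x2001: "INFO", 0x2002: "FILE"}


def parse_bcsar_header(data):
    """Parse the BCSAR header fields listed in the table above."""
    if data[0:4] != b"CSAR":
        raise ValueError("missing CSAR magic")

    # Assumption: the BOM is compared in file order, so FF FE is little endian.
    bom = data[4:6]
    if bom == b"\xff\xfe":
        endian = "<"
    elif bom == b"\xfe\xff":
        endian = ">"
    else:
        raise ValueError("unexpected byte order mark")

    # 0x6: header length (u16), 0x8: version, 0xC: file length, 0x10: count
    header_size, version, total_size, count = struct.unpack_from(
        endian + "HIII", data, 0x6)

    partitions = {}
    for i in range(count):
        # Each record is (reference ID, location, length), 0xC bytes in total.
        ref_id, offset, length = struct.unpack_from(
            endian + "III", data, 0x14 + 0xC * i)
        partitions[PARTITION_NAMES.get(ref_id, hex(ref_id))] = (offset, length)

    return {
        "endian": endian,
        "header_size": header_size,
        "version": version,
        "total_size": total_size,
        "partitions": partitions,
    }
```

The returned <code>endian</code> prefix can be passed on when unpacking the individual partitions.<br />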
<br />
== Partitions ==<br />
<br />
=== STRG ===<br />
<br />
STRG contains the names of the audio files in the BCSAR.<br />
<br />
==== Header ====<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| MAGIC "STRG"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Length of STRG partition (also in CSAR header)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| String table type magic (always 0x2400)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| This + 8 points to the string table (always 0x10)<br />
|-<br />
| 0x10<br />
| 0x4<br />
| String table lookup type magic (always 0x2401)<br />
|-<br />
| 0x14<br />
| 0x4<br />
| This + 8 points to the string lookup table<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Filename count<br />
|-<br />
| 0x1C<br />
| 0xC * count<br />
| String offset table<br />
|-<br />
|}<br />
<br />
==== String offset table entry ====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Type of the node (should be 0x1F01)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Offset to the string data, relative to the end of the STRG header (the header is 0x18 bytes long)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Length of the data buffer (includes NUL terminator)<br />
|-<br />
|}<br />
<br />
The filenames themselves are stored raw after the offset table, each terminated by a NUL byte. To read them, walk the offset table in order and use each entry's offset and length to slice out the string; an entry's position in the table is the index of its filename.<br />
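<br />
As a rough illustration of reading the names, the sketch below walks the string offset table of an STRG partition held in memory. It assumes that entry offsets are relative to the filename count field at 0x18 (the end of the STRG header) and that names are plain ASCII; <code>read_strg_names</code> is a hypothetical helper name.<br />

```python
import struct


def read_strg_names(strg, endian="<"):
    """Return the filenames of an STRG partition, in table order."""
    if strg[0:4] != b"STRG":
        raise ValueError("missing STRG magic")

    # 0xC holds an offset; adding 8 gives the string table (normally 0x18).
    (table_rel,) = struct.unpack_from(endian + "I", strg, 0xC)
    table = table_rel + 8

    (count,) = struct.unpack_from(endian + "I", strg, table)
    names = []
    for i in range(count):
        # Entry layout: node type (0x1F01), offset, length including the NUL.
        _node_type, offset, length = struct.unpack_from(
            endian + "III", strg, table + 4 + 0xC * i)
        # Assumption: offsets are measured from the start of the string table.
        start = table + offset
        raw = strg[start:start + length]
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names
```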
<br />
==== String lookup table ====<br />
<br />
===== Header =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Index of the root entry<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Entry count<br />
|-<br />
| 0x8<br />
| 0x14 * count<br />
| Lookup entry<br />
|-<br />
|}<br />
<br />
===== Entry =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x2<br />
| Nonzero if contains data<br />
|-<br />
| 0x2<br />
| 0x2<br />
| Bit test condition (index = (this >> 3), bit = (~this & 7)), -1 if unused<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Fail condition leaf index (-1 if unused)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Success condition leaf index (-1 if unused)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| String lookup table index (-1 if unused)<br />
|-<br />
| 0x10<br />
| 0x3<br />
| 3-byte Resource ID, Little Endian (-1 if unused)<br />
|-<br />
| 0x13<br />
| 0x1<br />
| Resource type (01=sound, 02=sound list, 03=sound bank, 04=sound player name?, 06=sound group, FF=unused)<br />
|-<br />
|}<br />
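<br />
The bit test, fail and success fields suggest a binary trie keyed on the bits of the name. The sketch below shows how such a lookup could work. It is an interpretation rather than confirmed behaviour: it assumes a set bit selects the success branch, that bytes past the end of the name count as zero, and that the final match is confirmed against the STRG filename list. All function names and dictionary keys are invented for this example.<br />

```python
import struct

UNUSED = 0xFFFFFFFF


def parse_lookup_table(buf, endian="<"):
    """Split a string lookup table into its root index and entry list."""
    root, count = struct.unpack_from(endian + "II", buf, 0)
    entries = []
    for i in range(count):
        base = 8 + 0x14 * i
        has_data, bit, fail, success, string_index = struct.unpack_from(
            endian + "HHIII", buf, base)
        entries.append({
            "has_data": has_data != 0,
            "bit": bit,
            "fail": fail,
            "success": success,
            "string_index": string_index,
            # The table above describes the resource ID as little endian.
            "resource_id": int.from_bytes(buf[base + 0x10:base + 0x13], "little"),
            "resource_type": buf[base + 0x13],
        })
    return root, entries


def lookup(root, entries, names, name):
    """Find the entry for `name`, or None. `names` is the STRG filename list."""
    key = name.encode("ascii")
    index = root
    # Bound the walk so a malformed table cannot loop forever.
    for _ in range(len(entries) + 1):
        if index == UNUSED or index >= len(entries):
            return None
        node = entries[index]
        if node["has_data"]:
            # Leaf reached: the bit tests only narrow the search, so compare.
            if node["string_index"] >= len(names):
                return None
            return node if names[node["string_index"]] == name else None
        byte_index = node["bit"] >> 3
        bit = ~node["bit"] & 7
        byte = key[byte_index] if byte_index < len(key) else 0
        index = node["success"] if (byte >> bit) & 1 else node["fail"]
    return None
```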
<br />
=== INFO ===<br />
<br />
INFO presumably contains information on the audio files, and is possibly used to connect names from STRG to data in FILE.<br />
<br />
Only part of this partition's header is currently understood; the rest has yet to be reverse engineered.<br />
<br />
==== Header ====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| MAGIC "INFO"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Length of INFO partition (also in CSAR header)<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Audio Table Reference ID (0x2100)<br />
|-<br />
| 0xC<br />
| 0x4<br />
| This + 8 points to the Audio Table<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Set Table Reference ID (0x2104)<br />
|-<br />
| 0x14<br />
| 0x4<br />
| This + 8 points to the Set Table<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Bank Table Reference ID (0x2101)<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| This + 8 points to the Bank Table<br />
|-<br />
| 0x20<br />
| 0x4<br />
| WAV Archive Table Reference ID (0x2103)<br />
|-<br />
| 0x24<br />
| 0x4<br />
| This + 8 points to the WAV Archive Table<br />
|-<br />
| 0x28<br />
| 0x4<br />
| Group Table Reference ID (0x2105)<br />
|-<br />
| 0x2C<br />
| 0x4<br />
| This + 8 points to the Group Table<br />
|-<br />
| 0x30<br />
| 0x4<br />
| Player Table Reference ID (0x2102)<br />
|-<br />
| 0x34<br />
| 0x4<br />
| This + 8 points to Player Table<br />
|-<br />
| 0x38<br />
| 0x4<br />
| FILE Table Reference ID (0x2106)<br />
|-<br />
| 0x3C<br />
| 0x4<br />
| This + 8 points to the FILE Table<br />
|-<br />
| 0x40<br />
| 0x4<br />
| Unknown Table Reference ID (0x220B)<br />
|-<br />
| 0x44<br />
| 0x4<br />
| This + 8 points to unknown<br />
|-<br />
|}<br />
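<br />
Since the INFO header is a run of eight (reference ID, offset) pairs, it can be read in a loop. This is a sketch under the assumption that the pairs are contiguous from 0x8, as the table above lists them; the short table names and <code>parse_info_header</code> are labels chosen for the example.<br />

```python
import struct

# Reference IDs from the INFO header table; the short names are illustrative.
INFO_TABLE_NAMES = {
    0x2100: "audio",
    0x2104: "set",
    0x2101: "bank",
    0x2103: "wav_archive",
    0x2105: "group",
    0x2102: "player",
    0x2106: "file",
    0x220B: "unknown",
}


def parse_info_header(info, endian="<"):
    """Map each table name to its offset from the start of the INFO partition."""
    if info[0:4] != b"INFO":
        raise ValueError("missing INFO magic")
    tables = {}
    for i in range(8):
        ref_id, rel = struct.unpack_from(endian + "II", info, 0x8 + 8 * i)
        # "This + 8" in the table above: offsets are relative to 0x8.
        tables[INFO_TABLE_NAMES.get(ref_id, hex(ref_id))] = rel + 8
    return tables
```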
<br />
==== Blocks ====<br />
<br />
Every offset in the header points to data laid out like this:<br />
* 4-byte entry count<br />
* array of that many entries of the struct below<br />
** u32 type<br />
** u32 offset, relative to the address of the count field (the beginning of the block)<br />
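<br />
A block of this shape could be read as follows. This is a minimal sketch, assuming <code>block_offset</code> is the position of the leading 4-byte value within the INFO partition; the function name is hypothetical.<br />

```python
import struct


def read_block_entries(info, block_offset, endian="<"):
    """Return (type, absolute offset) pairs for one INFO block."""
    (count,) = struct.unpack_from(endian + "I", info, block_offset)
    entries = []
    for i in range(count):
        entry_type, rel = struct.unpack_from(
            endian + "II", info, block_offset + 4 + 8 * i)
        # Offsets are relative to the beginning of the block.
        entries.append((entry_type, block_offset + rel))
    return entries
```

Each returned offset then points at a struct whose layout depends on the entry type.<br />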
<br />
The data that the offset points to depends on the type field of the struct above:<br />
<br />
===== 0x2200 =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Sound player ID<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Type of the extended info<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Offset to extended info *relative to the beginning of this struct*<br />
|-<br />
| 0x14<br />
| ???<br />
| Unknown...<br />
|-<br />
|}<br />
<br />
===== 0x2204 =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| First Sound ID in this sequence set<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Last Sound ID in this sequence set<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Type of the extended info<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Offset to extended info *relative to the beginning of this struct*<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Type of the extended info<br />
|-<br />
| 0x14<br />
| 0x4<br />
| Offset to extended info *relative to the beginning of this struct*<br />
|-<br />
| 0x18<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x1C<br />
| 0x4<br />
| Unknown<br />
|-<br />
|}<br />
<br />
===== 0x2206 =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Type of the extended info<br />
|-<br />
| 0x8<br />
| 0x4<br />
| Offset to extended info *relative to the beginning of this struct*<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Unknown<br />
|-<br />
| 0x10<br />
| 0x4<br />
| Unknown<br />
|-<br />
|}<br />
<br />
===== Table IDs =====<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! ID<br />
! NAME<br />
|-<br />
| 0x2200<br />
| Audio Table<br />
|-<br />
| 0x2204<br />
| Set Table<br />
|-<br />
| 0x2206<br />
| Bank Table<br />
|-<br />
| 0x2207<br />
| WAV Archive Table<br />
|-<br />
| 0x2208<br />
| Group Table<br />
|-<br />
| 0x2209<br />
| Player Table<br />
|-<br />
| 0x220A<br />
| FILE Table<br />
|-<br />
|}<br />
<br />
=== FILE ===<br />
<br />
FILE contains all of the audio data in the BCSAR.<br />
<br />
{| class="wikitable" border="1"<br />
|-<br />
! OFFSET<br />
! SIZE<br />
! DESCRIPTION<br />
|-<br />
| 0x0<br />
| 0x4<br />
| MAGIC "FILE"<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Length of FILE partition (also in CSAR header)<br />
|-<br />
|}<br />
<br />
Little else can be documented about the FILE partition itself, since its contents are the audio data and therefore vary from game to game.<br />
<br />
There is no table in FILE so a different partition (presumably INFO) must be used to connect the data in FILE with the names from STRG.<br />
<br />
Further research shows that there are multiple FILE partitions, but only one of them is the 'main' FILE partition (the one referenced by the BCSAR header). The 'main' FILE partition contains all of the other sub FILE partitions.<br />
<br />
== Tools ==<br />
* vgmtoolbox's Advanced Cutter/Offset Finder tool can extract BCWAVs without filenames<br />
* [https://github.com/soneek/3DSUSoundArchiveTool 3DSUSoundArchiveTool] reference implementation of CSAR extraction</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Homebrew_Applications&diff=21082
Homebrew Applications
2019-10-25T22:14:10Z
<p>Oreo639: Fix typo</p>
<hr />
<div>== Installing ==<br />
Applications are installed by copying the necessary files directly to the <code>3ds/</code> folder in the root of the SD card (preferred for new designs), or in a subdirectory of <code>3ds/</code>, in which case said subfolder must be named identically to its executable. Most applications come with some or all of the following files:<br />
* <code>[appname].3dsx</code>: The executable.<br />
* <code>[appname].smdh</code>: The icon/metadata. (Not required in any case, and may be integrated into the <code>.3dsx</code>)<br />
* <code>[appname].xml</code>: The list of supported targets (i.e. installed titles which the app supports replacing in memory at runtime, thus inheriting its permissions), and of any arguments to be passed to the .3dsx. (Optional)<br />
<br />
A standalone .xml file can point to a differently-named .3dsx, launching it with potentially different arguments so that a single application can run in different modes.<br />
<br />
The [[Homebrew Launcher]] will scan the SD card for all <code>.3dsx</code> files, but will only display an icon for those that have one according to the format described above. Recent enough versions can freely navigate the filesystem to select an application.<br />
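<br />
To check an SD card layout on a PC before inserting the card, the naming rules above can be expressed as a short script. This is only a sketch of the documented layout, not of how the launcher itself scans: it looks at <code>3ds/</code> and its immediate subfolders only, and it cannot see an icon embedded in the <code>.3dsx</code>. The function name and the dictionary keys are invented for this example.<br />

```python
from pathlib import Path


def find_installed_apps(sd_root):
    """List .3dsx files that follow the 3ds/ layout described above."""
    base = Path(sd_root) / "3ds"
    if not base.is_dir():
        return []

    # Executables placed directly in 3ds/
    candidates = [p for p in base.glob("*.3dsx") if p.is_file()]

    # Executables in a subfolder, which must share the subfolder's name
    for sub in base.iterdir():
        if sub.is_dir():
            exe = sub / (sub.name + ".3dsx")
            if exe.is_file():
                candidates.append(exe)

    apps = []
    for exe in sorted(candidates):
        apps.append({
            "name": exe.stem,
            "executable": exe,
            # Separate icon/metadata and target list files, if present
            "has_smdh": exe.with_suffix(".smdh").is_file(),
            "has_xml": exe.with_suffix(".xml").is_file(),
        })
    return apps
```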
<br />
== List ==<br />
<br />
=== Launchers ===<br />
<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Open-Source<br />
|-<br />
| [https://github.com/fincs/new-hbmenu Homebrew Launcher]<br />
| Run homebrew on your 3DS! Compatible with Rosalina and all prior 3dsx loading solutions<br />
| [https://devkitpro.org devkitPro]<br />
| [https://github.com/fincs/new-hbmenu/releases Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Starter Pack]<br />
| Everything to get you started.<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/starter.zip Here]<br />
| Yes<br />
|-<br />
| [https://github.com/smealum/3ds_hb_menu Homebrew Launcher (v1.x)]<br />
| The old version of the 3DS Homebrew Launcher, originally created for ninjhax 1.x (Discontinued)<br />
| [[User:smea|smea]]<br />
| [https://smealum.github.io/ninjhax2/boot.3dsx Here]<br />
| Yes<br />
|-<br />
| [http://gbatemp.net/threads/release-homebrew-launcher-with-grid-layout.397527/ Mashers' HBL]<br />
| Homebrew Launcher with grid and folder support. (Discontinued)<br />
| [[User:Mashers|Mashers]]<br />
| [https://github.com/d0k3/3DS-Extended-Homebrew-Starter-Pack/blob/35b8ab7dc40cb550b6ea45da319cdd0a0a3b2b54/boot.3dsx Here]<br />
| Lost with Mashers' retirement<br />
|}<br />
<br />
=== Applications ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/VideahGams/3dsfetch 3dsfetch]<br />
| Small 3DS version of a popular Linux ricing script called screenfetch.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/3dsfetch/tree/master Here]<br />
| Yes<br />
| 2015-09-17<br />
|-<br />
| [https://github.com/joel16/3DSident/ 3DSident]<br />
| Identity tool for the Nintendo 3DS heavily inspired by PSPident.<br />
| [[User:Joel16|Joel16]]<br />
| [https://github.com/joel16/3DSident/releases Here]<br />
| Yes<br />
| 2017-7-21<br />
|-<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Clear MAC Filter]<br />
| Reset 8-hour per-console StreetPass rate limiting<br />
| tastymeatball<br />
| [https://gbatemp.net/threads/release-clear-mac-filter.515882/ Here]<br />
| Yes<br />
| 2018-8-24<br />
|-<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases CtrRGBPATTY]<br />
| Generate patches that edit LED notifications<br />
| CPunch<br />
| [https://github.com/CPunch/CtrRGBPATTY/releases Here]<br />
| Yes<br />
| 2017-11-3<br />
|-<br />
| [https://github.com/plutooo/ctrrpc ctrrpc]<br />
| A small and easily extensible RPC server/client written in C/Python. Allows you to quickly poke service-commands and <code>syscall</code>s over Wi-Fi from a Python shell on your PC. Useful during reverse-engineering. ''No longer under (active) development?''<br />
| [[User:plutooo|plutoo]]<br />
| Build from [https://github.com/plutooo/ctrrpc repo]<br />
| Yes<br />
| 2014-11-10<br />
|-<br />
| [https://github.com/yellows8/ctr-streaming-server ctr-streaming-server]<br />
| A 3DS homebrew audio/video playback server. It can also send [[HID_Shared_Memory|HID]] state to the client (see the README) when enabled. The included <code>parse_hidstream</code> tool can be used to parse that HID data to simulate keyboard/mouse input events, via Linux <code>uinput</code>. ''No longer under (active) development?''<br />
| [[User:yellows8|yellows8]]<br />
| Build from [https://github.com/yellows8/ctr-streaming-server repo]<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/DownloadMii/DownloadMii-3DS DownloadMii]<br />
| A WIP repo-based online marketplace for homebrew applications & games.<br />
| [[User:filfat|filfat]]<br />
| Build from [https://github.com/DownloadMii/DownloadMii-3DS repo]<br />
| Yes<br />
| 2015-11-24<br />
|-<br />
| [https://github.com/linoma/fb43ds fb43ds]<br />
| A simple 3DS Facebook chat client<br />
| [[User:linoma|linoma]]<br />
| Build from [https://github.com/linoma/fb43ds repo]<br />
| Yes<br />
| 2015-04-07<br />
|-<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot for-anyone-who-walks-a-lot]<br />
| Tool to get past the 10 coin per day limit on earning Play Coins by walking.<br />
| [[User:iamevn|iamevn]]<br />
| [https://github.com/iamevn/for-anyone-who-walks-a-lot/releases Here]<br />
| Yes<br />
| 2016-03-26<br />
|-<br />
| [https://github.com/zeta0134/3ds-homebrew-browser Homebrew Browser]<br />
| Download homebrew from the internet!<br />
| [[User:cromo|cromo]], [[User:zeta0134|zeta0134]]<br />
| [https://github.com/zeta0134/3ds-homebrew-browser/releases Here]<br />
| Yes<br />
| 2015-10-07<br />
|-<br />
| [https://github.com/MrJPGames/NFCReader NFCReader]<br />
| Allows you to use your 3DS as an NFC/RFID UID Scanner.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/NFCReader/releases Here]<br />
| Yes<br />
| 2017-01-21<br />
|-<br />
| [https://github.com/SciresM/ScreenInfo ScreenInfo]<br />
| Identify whether New 3DS LCD panels are TN or IPS.<br />
| [[User:SciresM|SciresM]]<br />
| [https://github.com/SciresM/ScreenInfo/releases Here]<br />
| Yes<br />
| 2016-09-04<br />
|}<br />
<br />
=== Game Engines ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/TurtleP/LovePotion Löve Potion]<br />
| [https://love2d.org/ LOVE2D] for 3DS Homebrew.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/LovePotion/releases Here]<br />
| [https://github.com/TurtleP/LovePotion Yes]<br />
| 2018-08-27<br />
|-<br />
| [https://ctrulua.github.io/ ctrµLua]<br />
| A Lua interpreter for 3DS, brought to life by the remnants of the µLua community.<br />
| [[User:Firew0lf|Firew0lf]], Reuh, Negi<br />
| [https://github.com/ctruLua/ctruLua/releases Here]<br />
| Yes<br />
| 2016-06-27<br />
|-<br />
| [https://blog.easyrpg.org/2016/05/player-for-nintendo-3ds/ EasyRPG Player]<br />
| RPG Maker 2000/2003 interpreter<br />
| [[User:Rinnegatamante|Rinnegatamante]] & EasyRPG Team<br />
| [https://easyrpg.org/player/downloads/ Here]<br />
| [https://github.com/EasyRPG/Player Yes]<br />
| 2019-03-03<br />
|-<br />
| [https://github.com/Rinnegatamante/lpp-3ds LuaPlayer+ 3DS]<br />
| First Lua interpreter 3DS homebrew, under Lua 5.3.1<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [https://github.com/Rinnegatamante/lpp-3ds/releases Here]<br />
| Yes<br />
| 2016-09-21<br />
|-<br />
| [http://vault.digitalmzx.net MegaZeux 3DS]<br />
| A port of the MegaZeux GCS to the 3DS.<br />
| MegaZeux developers<br />
| [http://vault.digitalmzx.net Here]<br />
| [https://github.com/AliceLR/megazeux Yes]<br />
| 2018-03-04<br />
|}<br />
<br />
=== Games ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [http://gbatemp.net/threads/release-100-boxes-2ds.384714/ 100 Boxes 2DS]<br />
| A remake of homebrew "100 Boxes puzzle" for DS and GBA.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/100Boxes2DS/100_Boxes_2DS.rar Here]<br />
| No<br />
| 2015-11-11<br />
|-<br />
| [https://github.com/MrJPGames/2048-3D 2048-3D]<br />
| A port of the popular game 2048 for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/2048-3D/releases Here]<br />
| Yes<br />
| 2016-02-12<br />
|-<br />
| ''[https://github.com/smealum/3dscraft 3DSCraft]''<br />
| A Minecraft port for the 3DS. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/3dscraft repo] (alt. [https://smealum.github.io/3dscraft/downloads/3dscraft_141120.zip here])<br />
| Yes<br />
| 2014-11-20<br />
|-<br />
| [https://github.com/markwinap/3DS_Nyan_Cat 3DS Nyan Cat]<br />
| A port of Nyan Cat for the 3DS, using <code>LIBSF2D</code>.<br />
| [[User:markwinap|markwinap]]<br />
| Build from [https://github.com/markwinap/3DS_Nyan_Cat repo] (alt. [https://www.dropbox.com/s/e400my3xm0zw74r/nyan_cat.zip?dl=0 here])<br />
| Yes<br />
| 2015-05-26<br />
|-<br />
| [https://gbatemp.net/threads/preview-ld-34-port-antibounce.406361 Antibounce]<br />
| "Move your player to bounce around and collect coins. Go between screens through the holes in the sides of the floor. 3D can also be enabled."<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/Antibounce/releases Here]<br />
| Yes<br />
| 2015-12-23<br />
|-<br />
| [https://github.com/Magicrafter13/Breakout Breakout]<br />
| "A 3ds Breakout Clone."<br />
| [[User:Magicrafter13|Magicrafter13]]<br />
| [https://github.com/Magicrafter13/Breakout/releases Here]<br />
| Yes<br />
| 2017-10-17<br />
|-<br />
| ''[https://github.com/UnsureSherlock/checkers3ds checkers3ds]''<br />
| A checkers game in glorious ASCII. ''No longer under development.''<br />
| [[User:UnsureSherlock|UnsureSherlock]]<br />
| Build from [https://github.com/UnsureSherlock/checkers3ds repo]<br />
| Yes<br />
| 2016-02-25<br />
|-<br />
| [https://github.com/Kaisogen/CookieCollector-3DS- Cookie Collector]<br />
| A tiny adaptation of the popular [https://en.wikipedia.org/wiki/Cookie_Clicker Cookie Clicker] game for the 3DS.<br />
| [[User:Kaisogen|Kaisogen]]<br />
| [https://github.com/Kaisogen/CookieCollector-3DS-/releases Here]<br />
| Yes<br />
| 2017-06-04<br />
|-<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS Cookie Clicker 3DS]<br />
| A simple Cookie Clicker type of game inspired by [[User:Kaisogen|Kaisogen]]'s Cookie Collector<br />
| [[User:TheMachinumps|TheMachinumps]]<br />
| [https://github.com/TheMachinumps/Cookie_Clicker_3DS/releases Here]<br />
| Yes<br />
| 2016-08-27<br />
|-<br />
| [https://gbatemp.net/threads/release-drawattack-networked-drawing-game.402291/ DrawAttack]<br />
| Online multiplayer drawing game, like Pictionary.<br />
| [[User:Cruel|Cruel]]<br />
| [https://github.com/Cruel/DrawAttack/releases Here]<br />
| Yes<br />
| 2016-04-17<br />
|-<br />
| [https://github.com/masterfeizz/EDuke3D EDuke3D]<br />
| An unofficial port of EDuke32 for the 3DS.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/EDuke3D/releases Here]<br />
| Yes<br />
| 2016-05-09<br />
|-<br />
| [https://gbatemp.net/threads/release-hamsters-2ds.383457/ Hamsters 2DS]<br />
| A text-based hamster breeding game.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Hamsters2DS/Hamsters_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/BHSPitMonkey/Helii3DS Helii]<br />
| A port of [https://github.com/BHSPitMonkey/Helii3D Helii] for the 3DS.<br />
| [[User:BHSPitMonkey|BHSPitMonkey]]<br />
| [https://github.com/BHSPitMonkey/Helii3DS/releases Here]<br />
| Yes<br />
| 2015-09-18<br />
|-<br />
| [https://github.com/sgowen/insectoid-defense Insectoid Defense]<br />
| A Sci-Fi Tower Defense game.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/insectoid-defense/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://github.com/VideahGams/NumberFucker3DS NumberFucker3DS]<br />
| Simple math game, originally used as a debug game for Löve Potion.<br />
| [[User:VideahGams|VideahGams]]<br />
| [https://github.com/VideahGams/NumberFucker3DS Here]<br />
| Yes<br />
| 2015-09-19<br />
|-<br />
| [https://gbatemp.net/threads/release-zelda-roth-for-3ds.425503/ Zelda ROTH for 3DS]<br />
| A port of Legend of Zelda: Return of the Hylian, a Zelda fangame, to 3DS.<br />
| [[User:nop90|nop90]]<br />
| [https://github.com/nop90/ZeldaROTH/releases Here]<br />
| Yes<br />
| 2016-09-11<br />
|-<br />
| [https://gbatemp.net/threads/release-mastermind-3ds.394710/ Mastermind 3DS]<br />
| A port of Mastermind for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Mastermind-3DS/releases Here]<br />
| Yes<br />
| 2015-08-15<br />
|-<br />
| [http://gbatemp.net/threads/release-minesweeper-2ds.384185/ Minesweeper 2DS]<br />
| A port of Minesweeper for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Minesweeper2DS/Minesweeper_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://pyug.at/PyWeek/2012-09 One Whale Trip]<br />
| Five-lane underwater whale swimming/pearl pickup adventure game in Python.<br />
| [[User:thp|thp]]<br />
| [https://bitbucket.org/pyugat/pyweek1209/downloads/OneWhaleTrip-2016-07-18-3DS.zip Here]<br />
| [https://bitbucket.org/pyugat/pyweek1209/src/bce5156dbee72f38c4fcf5d7b3df9cfb9ddd5b0a/3ds Yes]<br />
| 2016-10-02<br />
|-<br />
| [http://gbatemp.net/threads/release-paddle-puffle-3ds.392215/ Paddle Puffle 3DS]<br />
| A port of [http://puffles.gatuno.mx Paddle Puffle] for the 3DS.<br />
| [[User:Peanut42|Peanut42]]<br />
| [http://puffles.gatuno.mx/releases/paddlepuffle3ds.zip Here]<br />
| [https://github.com/gatuno/PaddlePuffle3DS Yes]<br />
| 2015-07-05<br />
|-<br />
| [http://david.dantoine.org/proyecto/26/ Pituka Classics]<br />
| Play CPC classics using [http://david.dantoine.org/proyecto/4/ Pituka Emulator-Core] on 3DS.<br />
| [[User:D_Skywalk|D_Skywalk]]<br />
| [http://david.dantoine.org/descargas/72 Rick Dangerous] [http://david.dantoine.org/descargas/2 Core]<br />
| [http://david.dantoine.org/descargas/4 Yes (core)]<br />
| 2016-02-26<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-shuffle-2ds.398540/ Pixel Shuffle 2DS]<br />
| An adaptation of the puzzle game [http://www.gimme5games.com/play-game/pixelshuffle Pixel Shuffle] for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelShuffle2DS/Pixel_Shuffle_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [http://gbatemp.net/threads/release-pixel-swap-2ds.395749/ Pixel Swap 2DS]<br />
| An adaptation of puzzle games Pixel Swap 1 & 2 for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/PixelSwap2DS/Pixel_Swap_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://github.com/smealum/portal3DS Portal3DS]<br />
| An adaptation of [https://en.wikipedia.org/wiki/Portal_(video_game) Portal] for the 3DS.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/portal3DS repo] (Precompiled [http://www.mediafire.com/file/yo463wt6y4tybch/portal3DS.rar here])<br />
| Yes<br />
| 2015-08-18<br />
|-<br />
| [https://github.com/masterfeizz/ctrQuake ctrQuake]<br />
| An unofficial port of Quake for the 3DS, fully playable.<br />
| [[User:MasterFeizz|MasterFeizz]]<br />
| [https://github.com/masterfeizz/ctrQuake/releases Here]<br />
| Yes<br />
| 2016-09-16<br />
|-<br />
| [https://gbatemp.net/threads/release-reversi-othello-for-3ds.395442/ Reversi]<br />
| [https://en.wikipedia.org/wiki/Reversi Reversi] for the 3DS.<br />
| [[User:MrJPGames|Jasper Peters]]<br />
| [https://github.com/MrJPGames/Othello-3DS/releases Here]<br />
| Yes<br />
| 2016-03-05<br />
|-<br />
| [https://github.com/landm2000/sokoban Sokoban]<br />
| An unofficial port of the puzzle game [https://en.wikipedia.org/wiki/Sokoban Sokoban] for the 3DS.<br />
| [[User:Landm|Landm]]<br />
| [https://github.com/landm2000/sokoban/tree/master Here]<br />
| Yes<br />
| 2016-03-14<br />
|-<br />
| [https://gbatemp.net/threads/release-space-fruit.399088/ Space Fruit]<br />
| Hackathon game by 4 friends ported to 3DS. Asteroids but with fruit.<br />
| [[User:TurtleP|TurtleP]]<br />
| [https://github.com/TurtleP/SpaceFruit/releases Here]<br />
| Yes<br />
| 2016-04-09<br />
|-<br />
| [https://github.com/sgowen/tappy-plane Tappy Plane]<br />
| A port of [https://en.wikipedia.org/wiki/Flappy_Bird Flappy Bird] for 3DS, but with a colorful plane.<br />
| [[User:Sgowen|sgowen]]<br />
| [https://github.com/sgowen/tappy-plane/releases Here]<br />
| Yes<br />
| 2015-11-09<br />
|-<br />
| [https://thp.itch.io/tetrepetete-3ds Tetrepetete 3DS]<br />
| A game with blocks.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/tetrepetete-3ds Here]<br />
| No<br />
| 2016-06-29<br />
|-<br />
| [http://gbatemp.net/threads/release-tilemap-2ds.386733/ TileMap 2DS]<br />
| An adaptation of the puzzle game TileMap for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/TileMap2DS/TileMap_2DS.rar Here]<br />
| No<br />
| 2015-11-03<br />
|-<br />
| [http://gbatemp.net/threads/release-tiles-2ds.385796/ Tiles 2DS]<br />
| An adaptation of the puzzle game Lights Out for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Tiles2DS/Tiles_2DS.rar Here]<br />
| No<br />
| 2015-11-01<br />
|-<br />
| [https://thp.itch.io/that-rabbit-game-3ds That Rabbit Game 3DS]<br />
| Inverse duck hunt with accelerometer input and stereoscopic 3D.<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/that-rabbit-game-3ds Here]<br />
| No<br />
| 2016-07-04<br />
|-<br />
| [http://gbatemp.net/threads/trucmuche-2ds-09.404859// Trucmuche 2DS 09]<br />
| An adaptation of the hidden objects game Trucmuche for the 3DS.<br />
| [[User:Cid2mizard|Cid2mizard]]<br />
| [http://3ds.nintendomax.com/Homebrews/Jeux/Trucmuche2DS09/Trucmuche_2DS_09.rar Here]<br />
| No<br />
| 2015-12-03<br />
|-<br />
| [https://github.com/Steveice10/WorldOf3DSand World of 3DSand]<br />
| A port of World of Sand for the 3DS.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/WorldOf3DSand/releases Here]<br />
| Yes<br />
| 2016-07-12<br />
|-<br />
| [https://github.com/smealum/yeti3DS Yeti3DS]<br />
| A quick and dirty port of Derek Evans' Yeti3D software rendering engine.<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/yeti3DS repo]<br />
| Yes<br />
| 2015-08-07<br />
|-<br />
| [https://thp.itch.io/loonies-8192 Loonies 8192]<br />
| A Mini Retro Puzzle for DOS, the PSP and 3DS (Homebrew)<br />
| [[User:thp|thp]]<br />
| [https://thp.itch.io/loonies-8192 Here]<br />
| No<br />
| 2019-01-27<br />
|-<br />
|}<br />
<br />
=== Emulators ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| ''[https://github.com/st4rk/3DNES 3DNES]''<br />
| A NES emulator, without sound support. ''No longer under development.''<br />
| st4rk, gdkChan<br />
| [https://github.com/St4rk/3DNES/raw/master/3DNES_old.3dsx Here]<br />
| Yes<br />
| 2015-03-28<br />
|-<br />
| [http://asie.pl/homebrew/#atari800 atari800-3DS]<br />
| An Atari 8-bit home computer emulator.<br />
| asie<br />
| [http://asie.pl/homebrew/#atari800 Here]<br />
| [https://github.com/asiekierka/atari800-3ds Yes]<br />
| 2016-10-29<br />
|-<br />
| [https://github.com/StapleButter/blargSnes blargSnes]<br />
| A Super Nintendo (SNES) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/BlargSnes_Compatibility_List here].<br />
| StapleButter<br />
| [http://blargsnes.kuribo64.net/download/blargSnes_1.3b.zip Here]<br />
| Yes<br />
| 2015-06-12<br />
|-<br />
| [https://github.com/xerpi/CHIP-3DS CHIP-3DS]<br />
| A simple and slow CHIP-8 emulator.<br />
| xerpi<br />
| Build from [https://github.com/xerpi/CHIP-3DS repo] (alt. [https://www.mediafire.com/?y94yjhzf70fsfsi here])<br />
| Yes<br />
| 2015-04-02<br />
|-<br />
| [https://gbatemp.net/threads/chip8-3ds.434425/ CHIP8-2DS]<br />
| CHIP-8 emulator with savestates and touch controls.<br />
| nopy4869<br />
| [https://github.com/nopy4869/CHIP8-2DS/releases Here]<br />
| Yes<br />
| 2016-07-20<br />
|-<br />
| [https://github.com/shinyquagsire23/gpsp CitrAGB]<br />
| Yet another GBA emulator for the 3DS.<br />
| [[User:shinyquagsire23|Shiny Quagsire]]<br />
| Build from [https://github.com/shinyquagsire23/gpsp/tree/master/3ds repo] (alt. [https://www.dropbox.com/s/sxb7x34u58g4zo2/3ds.3dsx?dl=0 here])<br />
| Yes<br />
| 2015-09-21<br />
|-<br />
| [https://github.com/Steveice10/GameYob GameYob]<br />
| A Game Boy (Color) emulator. A compatibility list can be found [http://wiki.gbatemp.net/wiki/GameYob_3DS_Compatibility_List here].<br />
| Drenn/Steveice10<br />
| [https://github.com/Steveice10/GameYob/releases Here]<br />
| Yes<br />
| 2016-07-17<br />
|-<br />
| [https://github.com/mgba-emu/mgba mGBA]<br />
| A GBA emulator that runs well without kernel hax.<br />
| endrift<br />
| [https://mgba.io/downloads.html Here]<br />
| Yes<br />
| 2016-10-13<br />
|-<br />
| [https://github.com/mrdanielps/r3Ddragon r3Ddragon]<br />
| A WIP Virtual Boy emulator for the 3DS based on Reality Boy / Red Dragon.<br />
| mrdanielps<br />
| [https://github.com/mrdanielps/r3Ddragon/releases Here]<br />
| Yes<br />
| 2016-08-16<br />
|-<br />
| [https://github.com/libretro/RetroArch RetroArch]<br />
| A multisystem emulator. (GB, GBA, SNES, Genesis, CPS1, CPS2, etc.)<br />
| libretro<br />
| [http://buildbot.libretro.com/nightly/nintendo/3ds/ Here]<br />
| Yes<br />
| Undergoing rapid development.<br />
|-<br />
| [https://github.com/bubble2k16/snes9x_3ds SNES9x for 3DS]<br />
| A SNES emulator for the old 3DS / 2DS. Optimised from Snes9x 1.43 and runs many games at full speed. A compatibility list can be found [http://wiki.gbatemp.net/wiki/Snes9x_for_3DS here].<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/snes9x_3ds/releases Here]<br />
| Yes<br />
| 2017-02-11<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds VirtuaNES for 3DS]<br />
| A NES emulator for the old 3DS / 2DS. Optimised from VirtuaNES 0.9.7 and runs many games at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/emus3ds/releases Here]<br />
| Yes<br />
| 2017-03-23<br />
|-<br />
| [https://github.com/bubble2k16/emus3ds_3ds TemperPCE for 3DS]<br />
| A PC-Engine/Turbografx-16 emulator for the old 3DS / 2DS. Optimised from Temper; runs all games, including CD-ROM and SGX games, at full speed.<br />
| bubble2k16<br />
| [https://github.com/bubble2k16/temperpce_3ds/releases Here]<br />
| Yes<br />
| 2017-06-19<br />
|-<br />
|}<br />
<br />
=== Theme managers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool 3DS HomeMenu extdata Tool]<br />
| Tool for accessing the SD extdata which Home Menu uses. This essentially allows writing custom themes to extdata which get loaded at Home Menu startup.<br />
| [[User:yellows8|yellows8]]<br />
| [https://github.com/yellows8/3ds_homemenu_extdatatool/releases Here]<br />
| Yes<br />
| 2015-08-17<br />
|-<br />
| [https://github.com/Rinnegatamante/CHMM2 Custom Home Menu Manager 2]<br />
| Theme manager for Nintendo 3DS. Discontinued.<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/CHMM2.rar Here]<br />
| Yes<br />
| 2016-07-04<br />
|-<br />
| [https://github.com/ErmanSayin/Themely/tree/88e93816e3b43a40bcee25b1a7a8c71ef6a37db8 Themely]<br />
| Theme manager for Nintendo 3DS with 3dsthem.es integration.<br />
| ErmanSayin<br />
| [https://github.com/ErmanSayin/Themely/releases/tag/v1.3.1 Here]<br />
| Not anymore, 1.3.1 last FOSS version<br />
| 2017-06-28<br />
|-<br />
| [https://gbatemp.net/threads/release-anemone3ds-a-complete-theme-and-splash-manager-for-your-3ds.482804/ Anemone3DS]<br />
| New theme and Luma splash screen manager, created to fill the gap left by its predecessors.<br />
| [[User:astronautlevel2|astronautlevel2]]<br />
| [https://github.com/astronautlevel2/Anemone3DS/releases/ Here]<br />
| Yes<br />
| 2018-05-13<br />
|}<br />
<br />
=== Title managers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI]<br />
| Open source CIA (un)installer and launcher.<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases?after=2.0.0 Here]<br />
| Yes<br />
| 2015-12-02<br />
|-<br />
| [https://github.com/Steveice10/FBI FBI 2]<br />
| Multipurpose file/title/ticket/save manager<br />
| [[User:Steveice10|Steveice10]]<br />
| [https://github.com/Steveice10/FBI/releases Here]<br />
| Yes<br />
| 2018-08-21<br />
|-<br />
| [https://gbatemp.net/threads/no-longer-working-community-freeshop-fork-open-source-eshop-alternative.483159/ FreeShop]<br />
| GUI CDN title installer<br />
| TheCruel/arc13/Paul/evi<br />
| [https://notabug.org/evi/freeShop/releases Here]<br />
| Yes<br />
| 2018-05-17<br />
|-<br />
| [https://gbatemp.net/threads/release-nasa-universal-cia-manager-for-fw-4-1-10-3.409806/ NASA]<br />
| Universal CIA Manager for FWs 4.1 - 10.7<br />
| [[User:Rinnegatamante|Rinnegatamante]]<br />
| [http://rinnegatamante.it/site/3ds_hbs.php Here]<br />
| No<br />
| 2016-04-13<br />
|}<br />
<br />
Note: downloading non-system applications from the CDN is broken in all known homebrew, regardless of whether a signed ticket is installed (see also: [[11.8.0-41#Server-side_changes]]).<br />
<br />
=== Save managers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/save-data-manager-and-editor-for-firmware-up-to-9-9.396245/ save_manager]<br />
| Proof of concept save exporter/importer<br />
| [[User:profi200|profi200]]<br />
| [http://gbatemp.net/attachments/save_manager_-with_smdh-zip.24349/ Here]<br />
| [https://gist.github.com/profi200/d0d092c11d0eb0692748 Yes]<br />
| 2015-09-13<br />
|-<br />
| [https://github.com/meladroit/svdt svdt]<br />
| Save Data Explorer/Manager<br />
| [[User:meladroit|meladroit]]<br />
| [https://github.com/meladroit/svdt/releases Here]<br />
| Yes<br />
| 2015-10-16<br />
|-<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ JK's Save Manager]<br />
| Save/Extdata Manager<br />
| JK_<br />
| [https://gbatemp.net/threads/release-jks-savemanager-homebrew-cia-save-manager.413143/ Here]<br />
| [https://github.com/J-D-K/JKSM/ Yes]<br />
| 2016-09-29<br />
|-<br />
| JK's Save Manager for Rosalina<br />
| Modded version of JKSM for use as .3dsx on Luma 8+<br />
| Phalk, JK_<br />
| [https://github.com/Phalk/JKSM/releases Here]<br />
| Yes<br />
| 2017-07-12<br />
|-<br />
| [https://github.com/BernardoGiordano/PKSM PKSM]<br />
| Save editor for Pokémon generations 4 to 7<br />
| Bernardo Giordano<br />
| [https://github.com/BernardoGiordano/PKSM/releases Here]<br />
| Yes<br />
| 2017-08-03<br />
|-<br />
| [https://github.com/phijor/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness)<br />
| phijor<br />
| [https://github.com/phijor/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-01-22<br />
|-<br />
| [https://github.com/rboninsegna/SpecializeMii/ SpecializeMii]<br />
| Editor for Mii database (specialness and ownership)<br />
| phijor, [[User:Ryccardo|Ryccardo]]<br />
| [https://github.com/rboninsegna/SpecializeMii/releases Here]<br />
| Yes<br />
| 2017-08-13<br />
|}<br />
<br />
=== File servers ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/mtheall/ftpd ftpd (ftBrony)]<br />
| An FTP server.<br />
| [https://github.com/mtheall mtheall]<br />
| [https://github.com/mtheall/ftpd/releases Here]<br />
| Yes<br />
| 2016-09-17<br />
|-<br />
| ''[https://github.com/iamevn/FTP-3DS FTP-3DS]''<br />
| Fork of ftBrony with a Nintendo theme. ''No longer under development and without repo.''<br />
| [[User:iamevn|iamevn]]<br />
| N/A<br />
| Yes (''No source officially available.'')<br />
| N/A<br />
|-<br />
| [https://github.com/FloatingStar/FTP-GMX FTP - Graphic ModifierX Edition]<br />
| Fork of ftpd with aesthetic modifications.<br />
| [[User:FloatingStar|FloatingStar]]<br />
| [https://github.com/FloatingStar/FTP-GMX/releases Here]<br />
| Yes<br />
| 2016-01-27<br />
|-<br />
| [https://github.com/smealum/ftpony ftpony]<br />
| A basic FTP server, useful for testing new homebrew versions without swapping the SD card. ''No longer under (active) development?''<br />
| [[User:smea|smea]]<br />
| Build from [https://github.com/smealum/ftpony repo] (alt. [https://mega.co.nz/#!nchBkL7B!T3vXnX4q8Uwp6APYYTDSZi2bkm25la-Qyz6j4CjsllI here])<br />
| Yes<br />
| 2014-11-24<br />
|}<br />
<br />
=== Icon Packs ===<br />
Icon Packs are <code>SMDH</code> Packs for homebrew apps.<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="10%" | Last Updated<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-simplok-for-the-homebrew-launcher.396750/ Simplok]<br />
| The first 3DS Icon pack.<br />
| [[User:link6155|link6155]]<br />
| [http://1drv.ms/1EJCq2e Here]<br />
| 2015-09-12<br />
|-<br />
| ''[https://gbatemp.net/threads/1lp-icon-pack.402018/ 1LP]''<br />
| Another 3DS Icon pack. ''Repo is dead, no alternate downloads available.''<br />
| [[User:100pcrack|100pcrack]]<br />
| N/A<br />
| 2015-12-22<br />
|-<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Modern UI]<br />
| A simple icon pack with a flat and minimalist design.<br />
| [[User:LouchDaishiteru|LouchDaishiteru]]<br />
| [https://gbatemp.net/threads/icon-pack-modern-ui.404366/ Here]<br />
| 2016-02-15<br />
|}<br />
<br />
=== Demos ===<br />
{| class="wikitable" border="1"<br />
! width="20%" | Name<br />
! width="50%" | Description<br />
! width="10%" | Author<br />
! width="10%" | Download<br />
! width="5%" | Open-Source<br />
! width="15%" | Last Updated<br />
|-<br />
| [https://github.com/halcy/nordlicht19 Skate Station]<br />
| A demo for the 3DS featuring music and 3D effects <br />
| SVatG<br />
| [https://aka-san.halcy.de/nordlicht2019/Skate%20Station.zip Here]<br />
| Yes<br />
| July 2019<br />
|-<br />
| cubedemo<br />
| A short demo of Homebrew on the 3DS, with working sound.<br />
| [[User:plutoo|plutoo]]<br />
| [https://mega.co.nz/#!KUQFiQYA!pv8HDEyrmuX6Eyw2hW0opL7gf9Ztmjd9J5pPsvs_rD4 Here]<br />
| No<br />
| N/A<br />
|-<br />
| [https://gbatemp.net/threads/release-3ds-rgb-led-test-program.441633/ MCU Bricker / LED Rave]<br />
| Make the notification LED glow in different colors<br />
| [[User:MarcusD|MarcusD]]<br />
| [https://gbatemp.net/attachments/rgb-zip.124119/ Here]<br />
| Yes, but down<br />
| Late 2016?<br />
|-<br />
| Spine 2D<br />
| Demo of [http://esotericsoftware.com/ Spine]'s 2D skeletal animations<br />
| [[User:Cruel|Cruel]]<br />
| [https://mega.nz/#!Xg411B5R!kcVHP69Ilggmjh4q5OYmr2cFvf5UGdHWA98-_VttDTo 3DSX]; [https://mega.nz/#!z8gxHSQb!H0as1A4wqYrdKBhXJwdYik7nPd_msXJhz5N1CeZm1Iw CIA]<br />
| No<br />
| N/A<br />
|-<br />
| [http://www.pouet.net/prod.php?which=66607 demo ou mourir]<br />
| Small demo for the 3DS with music and 2D effects<br />
| Desire<br />
| [http://mudlord.info/democrap/dsr_demooumourir.zip Here]<br />
| No<br />
| November 2015<br />
|}</div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20925
Mii
2019-04-18T19:43:52Z
<p>Oreo639: My bad</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See [[#CFHE object|CFHE object]].<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section. <br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) followed by 8 zero bytes(?) is appended at the end.<br />
|}<br />
When a Mii is encrypted for a QR code, 4 additional bytes are appended: two null bytes and a CRC-16. This is the same CRC-16 as the one used for the Wii blocks, computed over the first 0x5E bytes. The CRC seems to be ignored; Mii Maker apparently relies on the result of APT:Unwrap to detect integrity loss.<br />
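<br />
The section boundaries in the table above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names <code>split_owned_miis</code> and <code>read_cfhe_list_bounds</code> are hypothetical, and the section magics are assumed to be stored as plain ASCII bytes in file order.<br />
<br />
```python
import struct

# Offsets and sizes taken from the database table above.
CFOG_OFFSET = 0x0
OWNED_MIIS_OFFSET = 0x8
OWNED_MII_SIZE = 0x5C
OWNED_MII_COUNT = 100
CFHE_OFFSET = 0x23F8


def split_owned_miis(db):
    """Return the 100 raw 0x5C-byte slots of the CFOG section.

    Assumption: the magic is stored as the ASCII bytes "CFOG".
    """
    if db[CFOG_OFFSET:CFOG_OFFSET + 4] != b"CFOG":
        raise ValueError("missing CFOG header")
    slots = []
    for i in range(OWNED_MII_COUNT):
        start = OWNED_MIIS_OFFSET + i * OWNED_MII_SIZE
        slots.append(db[start:start + OWNED_MII_SIZE])
    return slots


def read_cfhe_list_bounds(db):
    """Return the (tail, head) indices stored right after the CFHE magic.

    Both are little-endian 16-bit values; 0xFFFF means the list is empty.
    """
    if db[CFHE_OFFSET:CFHE_OFFSET + 4] != b"CFHE":
        raise ValueError("missing CFHE header")
    return struct.unpack_from("<HH", db, CFHE_OFFSET + 4)
```

<br />
The last slot ends at 0x8 + 100 * 0x5C = 0x23F8, which is exactly where the CFHE header begins, so the two helpers never overlap.<br />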
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of the prev and next fields has some special meaning and isn't part of the index value.<br />
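<br />
A minimal Python sketch of decoding one node is shown here. The names <code>CfheNode</code>, <code>parse_cfhe_node</code> and <code>is_invalid</code> are hypothetical. The sketch assumes the two index fields follow the page's default little-endian order, and it leaves the 4-byte Mii ID as raw bytes because its byte order inside the node is not confirmed.<br />
<br />
```python
import struct
from typing import NamedTuple

NODE_SIZE = 0xE
INDEX_MASK = 0x7FFF  # lower 15 bits hold the index
FLAG_MASK = 0x8000   # highest bit: special meaning, not part of the index


class CfheNode(NamedTuple):
    mii_id: bytes    # 4 raw bytes, byte order left uninterpreted
    mac: bytes       # 6 raw bytes
    prev_index: int
    prev_flag: bool
    next_index: int
    next_flag: bool


def parse_cfhe_node(raw):
    """Split a 0xE-byte node into ID, MAC, and prev/next index fields."""
    if len(raw) != NODE_SIZE:
        raise ValueError("a CFHE node is 0xE bytes long")
    mii_id, mac, prev, nxt = struct.unpack("<4s6sHH", raw)
    return CfheNode(
        mii_id, mac,
        prev & INDEX_MASK, bool(prev & FLAG_MASK),
        nxt & INDEX_MASK, bool(nxt & FLAG_MASK),
    )


def is_invalid(node):
    """True for the invalid pattern: ID = 0, MAC = 0, prev = next = 0x7FFF.

    Only the index bits are compared; the flag bits are ignored here.
    """
    return (node.mii_id == bytes(4) and node.mac == bytes(6)
            and node.prev_index == 0x7FFF and node.next_index == 0x7FFF)
```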
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, an implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_ccitt_wii(data: bytes) -> int:<br />
    """Calculate a checksum of data using the Wii's CRC16-CCITT implementation.<br />
<br />
    This implementation uses 0x0000 as the starting value, which is different<br />
    from what CRC16-CCITT specifies.<br />
    """<br />
    # note: a correct implementation of CRC16-CCITT<br />
    # would initialize this to 0xffff<br />
    crc = 0x0000<br />
<br />
    for byte in data:<br />
        # Iterate over each of the 8 bits in byte,<br />
        # beginning with the most significant bit (7, 6, ..., 1, 0).<br />
        for bit in range(7, -1, -1):<br />
            # & - bitwise and; <</>> - shift left/right; ^ - bitwise xor<br />
            poly = 0x1021 if crc & 0x8000 else 0<br />
            # shift the next data bit in, then apply the polynomial<br />
            crc = (((crc << 1) | ((byte >> bit) & 0x1)) ^ poly) & 0xffff<br />
<br />
    # Shift in 16 more zero bits to flush the register.<br />
    for _ in range(16):<br />
        poly = 0x1021 if crc & 0x8000 else 0<br />
        crc = ((crc << 1) ^ poly) & 0xffff<br />
<br />
    # crc is kept within 16 bits at every step<br />
    return crc<br />
<br />
# checksum over the first 0xc81e bytes, with miidb holding the contents of CFL_DB.dat:<br />
# checksum = crc16_ccitt_wii(miidb[:0xc81e])<br />
</source><br />
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-6: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32-bit unsigned integer):<br/>bit 0-27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>bit 28: always set?<br/>bit 29: set for temporary Mii<br/>bit 30: set for DSi Mii?<br/>bit 31: not set if and only if the Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex (0 if male, 1 if female)<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-13: favorite color<br/>bit 14: favorite mii (0 if false, 1 if true)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
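As a rough illustration of the Mii ID layout at offset 0xC, the sketch below unpacks the big-endian 32-bit value into its creation date and flag bits. The function name <code>decode_mii_id</code> and the dictionary keys are made up for this example. It assumes that bit 0 is the least significant bit and that the 2010 epoch is UTC; neither is confirmed by the table.<br />

```python
import struct
from datetime import datetime, timedelta, timezone

# Assumption: the epoch is interpreted as UTC; the table does not name a time zone.
EPOCH_2010 = datetime(2010, 1, 1, tzinfo=timezone.utc)


def decode_mii_id(mii: bytes) -> dict:
    """Split the Mii ID stored at offset 0xC of a raw Mii record."""
    # The Mii ID is the one field the table marks as big-endian.
    (mii_id,) = struct.unpack_from(">I", mii, 0xC)
    # Bits 0..27 hold the creation time in units of 2 seconds.
    seconds = (mii_id & 0x0FFFFFFF) * 2
    return {
        "created": EPOCH_2010 + timedelta(seconds=seconds),
        "bit28": bool(mii_id & (1 << 28)),      # "always set?"
        "temporary": bool(mii_id & (1 << 29)),
        "dsi": bool(mii_id & (1 << 30)),        # "set for DSi mii?"
        "special": not (mii_id & (1 << 31)),    # bit 31 clear means special
    }
```

Keeping the raw flag bits next to the decoded date makes it easier to compare dumps while the meaning of bits 28 and 30 is still uncertain.<br />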
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, add 0x08815000 to the given "NTR address".<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type, and the following bits give the number of seconds since 01/01/2010 00:00:00 UTC+3, divided by 2 (the time zone should be verified on 3DS systems from other countries and regions).<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as at 0x04 through 0x0B of the decrypted Mii. This seems to be NAND-specific: it stays the same for Miis created on the same NAND but on a different 3DS after a System Transfer. It might be a coincidence, but the first two bytes also appear in the ID0 folder name inside the Nintendo 3DS folder.<br />
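<br />
The address arithmetic described at the top of this section is a single addition. The sketch below is only a convenience wrapper: the names <code>NTR_BASE</code>, <code>NTR_OFFSETS</code> and <code>absolute_address</code> are hypothetical, and the three offsets are copied from the table as samples.<br />

```python
# Base that is added to every "NTR address" from the table.
NTR_BASE = 0x08815000

# A few sample offsets copied from the table (hypothetical key names).
NTR_OFFSETS = {
    "face_style": 0x894,
    "hair_style": 0x8A4,
    "mii_height": 0x93C,
}


def absolute_address(ntr_address: int) -> int:
    """Turn a table offset into the address to read in the debugger."""
    return NTR_BASE + ntr_address
```

For instance, offset 0x998 maps to 0x08815998, which is the first of the absolute addresses listed in the notes above.<br />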
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most values are ordered (left button decreases, right button increases, color choices run top to bottom...), but in most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
The JSON-like structure below maps a part, a page and a position to the corresponding hex value. Indices are 0-based (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20924
Mii
2019-04-18T07:44:18Z
<p>Oreo639: Fix bit size for 0x18</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See the [[#CFHE object|CFHE object]] section.<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section.<br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) together with 8 zero bytes(?) is appended at the end.<br />
|}<br />
When encrypted in QR codes, 4 additional bytes are added: two null bytes and a CRC-16. It is the exact same CRC-16 as for the Wii blocks, computed over the first 0x5E bytes. The CRC seems to be ignored, with Mii Maker relying on the result of APT:Unwrap to detect integrity loss.<br />
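<br />
The offsets in the table above can be turned into a small reader. This is a minimal sketch, not a complete parser: the constant and function names are hypothetical, and the CFRA count is read as little-endian only because that is the default stated at the top of this page.<br />

```python
import struct

# Offsets and sizes copied from the table; the names are hypothetical.
OWNED_OFFSET, OWNED_COUNT, OWNED_SIZE = 0x8, 100, 0x5C
CFRA_OFFSET, CFRA_COUNT_OFFSET = 0xC820, 0xC824


def read_owned_miis(db: bytes) -> list:
    """Return the 100 raw 0x5C-byte slots of the CFOG section."""
    if db[0:4] != b"CFOG":
        raise ValueError("missing CFOG header")
    return [
        db[OWNED_OFFSET + i * OWNED_SIZE:OWNED_OFFSET + (i + 1) * OWNED_SIZE]
        for i in range(OWNED_COUNT)
    ]


def read_invitation_count(db: bytes) -> int:
    """Return the Mii count stored right after the CFRA header."""
    if db[CFRA_OFFSET:CFRA_OFFSET + 4] != b"CFRA":
        raise ValueError("missing CFRA header")
    # Assumption: little-endian, the page-wide default.
    (count,) = struct.unpack_from("<I", db, CFRA_COUNT_OFFSET)
    return count
```

The slots are returned as raw bytes because empty and occupied slots are not distinguished by the table; any filtering is left to the caller.<br />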
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of these fields has some special meaning and isn't part of the index value.<br />
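<br />
A node can be unpacked as sketched below. Several points are assumptions rather than facts from this page: that "these fields" means prev and next, that the two indices are little-endian (the page default), and that the Mii ID is big-endian as in the Mii format. The <code>CfheNode</code> type and <code>parse_cfhe_node</code> name are hypothetical.<br />

```python
import struct
from typing import NamedTuple


class CfheNode(NamedTuple):
    mii_id: int
    mac: bytes
    prev_index: int
    next_index: int
    prev_flag: bool   # highest bit of prev, meaning unknown
    next_flag: bool   # highest bit of next, meaning unknown


def parse_cfhe_node(raw: bytes) -> CfheNode:
    """Unpack one 0xE-byte linked list node."""
    if len(raw) != 0xE:
        raise ValueError("a CFHE node is 0xE bytes long")
    (mii_id,) = struct.unpack_from(">I", raw, 0)   # assumption: big-endian ID
    mac = raw[4:10]
    prev, nxt = struct.unpack_from("<HH", raw, 10)  # assumption: little-endian
    return CfheNode(
        mii_id=mii_id,
        mac=mac,
        prev_index=prev & 0x7FFF,
        next_index=nxt & 0x7FFF,
        prev_flag=bool(prev & 0x8000),
        next_flag=bool(nxt & 0x8000),
    )
```

With this split, the invalid node described above decodes to index 0x7FFF with both flags clear.<br />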
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, an implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_CCITTWii(data: bytes) -> int:<br />
    """Calculate a checksum of data using the CRC16-CCITT implementation of the Wii.<br />
<br />
    This implementation uses 0x0000 as the starting value, which is different<br />
    from what CRC16-CCITT specifies.<br />
    """<br />
<br />
    # note: a correct implementation of CRC16-CCITT<br />
    # would initialize this to 0xffff<br />
    crc = 0x0000<br />
<br />
    for byte in data:<br />
        # Iterate over each of the 8 bits in byte,<br />
        # beginning with the most significant bit (7, 6, ..., 1, 0).<br />
        for bit in range(7, -1, -1):<br />
            # The XOR depends on bit 15 of the old value; the next data bit<br />
            # is shifted in from the right before the XOR is applied.<br />
            poly = 0x1021 if crc & 0x8000 else 0<br />
            crc = (((crc << 1) | ((byte >> bit) & 0x1)) ^ poly) & 0xffff<br />
<br />
    # Flush the register by shifting in 16 zero bits.<br />
    for _ in range(16):<br />
        poly = 0x1021 if crc & 0x8000 else 0<br />
        crc = ((crc << 1) ^ poly) & 0xffff<br />
<br />
    # crc is masked after every step, so only the lowest 16 bits remain<br />
    return crc<br />
<br />
# miidb is expected to hold the raw bytes of CFL_DB.dat<br />
checksum = crc16_CCITTWii(miidb[0:0xc81e])  # checksum over the first 0xc81e bytes<br />
</source><br />
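<br />
Building on the algorithm above, the following sketch checks both checksummed regions named in the database table. The helper names are hypothetical, and the byte order of the stored checksum is an assumption (big-endian, as on the Wii); pass <code>byteorder="little"</code> if a real dump disagrees.<br />

```python
def crc16_wii(data: bytes) -> int:
    """Bitwise CRC with polynomial 0x1021 and start value 0x0000."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            poly = 0x1021 if crc & 0x8000 else 0
            crc = (((crc << 1) | ((byte >> bit) & 1)) ^ poly) & 0xFFFF
    for _ in range(16):  # flush with 16 zero bits
        poly = 0x1021 if crc & 0x8000 else 0
        crc = ((crc << 1) ^ poly) & 0xFFFF
    return crc


def check_section(db: bytes, start: int, checksum_offset: int,
                  byteorder: str = "big") -> bool:
    """Compare the stored 2-byte checksum with one computed over db[start:checksum_offset]."""
    computed = crc16_wii(db[start:checksum_offset])
    # Assumption: the stored value is big-endian unless the caller says otherwise.
    stored = int.from_bytes(db[checksum_offset:checksum_offset + 2], byteorder)
    return computed == stored


def check_database(db: bytes) -> dict:
    """Check the two checksummed regions of CFL_DB.dat."""
    return {
        "CFOG+CFHE": check_section(db, 0x0, 0xC81E),
        "CFRA": check_section(db, 0xC820, 0xE4BE),
    }
```

Returning one result per region makes it clear which checksum needs to be rewritten after editing the file.<br />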
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-7: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32bit unsigned integer):<br/>Bit 0..27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>Bit 28: Always set?<br/>Bit 29: set for temporary Mii<br/>Bit 30: Set for DSi mii?<br/>Bit 31: not set iff Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex (0 if male, 1 if female)<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-13: favorite color<br/>bit 14: favorite mii (0 if false, 1 if true)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
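To show how the packed fields in the table are read, the sketch below decodes the 16-bit value at offset 0x18 and the Mii name at 0x1A. It follows the bit ranges exactly as listed in this revision of the table and assumes the value is little-endian (the page default) with bit 0 as the least significant bit. The function name and dictionary keys are hypothetical.<br />

```python
import struct


def decode_profile(mii: bytes) -> dict:
    """Decode the 0x18 bitfield and the UTF-16 name of a raw Mii record."""
    (value,) = struct.unpack_from("<H", mii, 0x18)  # assumption: little-endian
    # The name field is 0x14 bytes of UTF-16, terminated by 0000 if shorter.
    name = mii[0x1A:0x2E].decode("utf-16-le").split("\x00")[0]
    return {
        "female": bool(value & 0x1),              # bit 0
        "birthday_month": (value >> 1) & 0xF,     # bits 1-4
        "birthday_day": (value >> 5) & 0x1F,      # bits 5-9
        "favorite_color": (value >> 10) & 0xF,    # bits 10-13
        "favorite": bool((value >> 14) & 0x1),    # bit 14
        "name": name,
    }
```

The other bitfields in the table can be read the same way, with a shift by the lowest bit number and a mask as wide as the range.<br />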
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, add 0x08815000 to the given "NTR address".<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type, and the following bits give the number of seconds since 01/01/2010 00:00:00 UTC+3, divided by 2 (the time zone should be verified on 3DS systems from other countries and regions).<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as at 0x04 through 0x0B of the decrypted Mii. This seems to be NAND-specific: it stays the same for Miis created on the same NAND but on a different 3DS after a System Transfer. It might be a coincidence, but the first two bytes also appear in the ID0 folder name inside the Nintendo 3DS folder.<br />
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most values are ordered (left button decreases, right button increases, color choices run top to bottom...), but in most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
The JSON-like structure below maps a part, a page and a position to the corresponding hex value. Indices are 0-based (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20923
Mii
2019-04-18T03:32:29Z
<p>Oreo639: Fix bit size for 0x3</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See the [[#CFHE object|CFHE object]] section.<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section.<br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) together with 8 zero bytes(?) is appended at the end.<br />
|}<br />
When encrypted in QR codes, 4 additional bytes are added: two null bytes and a CRC-16. It is the exact same CRC-16 as for the Wii blocks, computed over the first 0x5E bytes. The CRC seems to be ignored, with Mii Maker relying on the result of APT:Unwrap to detect integrity loss.<br />
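<br />
The 4-byte trailer used for QR codes can be sketched as follows. This only builds the plaintext that would then be encrypted; the encryption itself is not covered. The function names are hypothetical, and storing the CRC big-endian is an assumption carried over from the Wii format.<br />

```python
def crc16_wii(data: bytes) -> int:
    """Bitwise CRC with polynomial 0x1021 and start value 0x0000 (the Wii variant)."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            poly = 0x1021 if crc & 0x8000 else 0
            crc = (((crc << 1) | ((byte >> bit) & 1)) ^ poly) & 0xFFFF
    for _ in range(16):  # flush with 16 zero bits
        poly = 0x1021 if crc & 0x8000 else 0
        crc = ((crc << 1) ^ poly) & 0xFFFF
    return crc


def append_qr_trailer(mii: bytes) -> bytes:
    """Append two null bytes and a CRC-16 to a 0x5C-byte Mii record."""
    if len(mii) != 0x5C:
        raise ValueError("expected a 0x5C-byte Mii record")
    padded = mii + b"\x00\x00"  # 0x5E bytes, the range covered by the CRC
    # Assumption: the CRC is stored big-endian, as in Wii Mii data.
    return padded + crc16_wii(padded).to_bytes(2, "big")
```

Since Mii Maker appears to ignore this CRC, a wrong byte order here would probably go unnoticed on real hardware.<br />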
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of these fields has some special meaning and isn't part of the index value.<br />
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, an implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_CCITTWii(data: bytes) -> int:<br />
    """Calculate a checksum of data using the CRC16-CCITT implementation of the Wii.<br />
<br />
    This implementation uses 0x0000 as the starting value, which is different<br />
    from what CRC16-CCITT specifies.<br />
    """<br />
<br />
    # note: a correct implementation of CRC16-CCITT<br />
    # would initialize this to 0xffff<br />
    crc = 0x0000<br />
<br />
    for byte in data:<br />
        # Iterate over each of the 8 bits in byte,<br />
        # beginning with the most significant bit (7, 6, ..., 1, 0).<br />
        for bit in range(7, -1, -1):<br />
            # The XOR depends on bit 15 of the old value; the next data bit<br />
            # is shifted in from the right before the XOR is applied.<br />
            poly = 0x1021 if crc & 0x8000 else 0<br />
            crc = (((crc << 1) | ((byte >> bit) & 0x1)) ^ poly) & 0xffff<br />
<br />
    # Flush the register by shifting in 16 zero bits.<br />
    for _ in range(16):<br />
        poly = 0x1021 if crc & 0x8000 else 0<br />
        crc = ((crc << 1) ^ poly) & 0xffff<br />
<br />
    # crc is masked after every step, so only the lowest 16 bits remain<br />
    return crc<br />
<br />
# miidb is expected to hold the raw bytes of CFL_DB.dat<br />
checksum = crc16_CCITTWii(miidb[0:0xc81e])  # checksum over the first 0xc81e bytes<br />
</source><br />
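<br />
For larger inputs, a table-driven variant is a common alternative to the bit-by-bit loop. The sketch below is a swapped-in technique, not something taken from this page: it relies on the fact that, with a start value of 0x0000, shifting in 16 trailing zero bits gives the same result as the usual byte-wise table update for polynomial 0x1021. The function names are hypothetical.<br />

```python
def _build_table(poly: int = 0x1021) -> list:
    """Precompute the CRC contribution of every possible top byte."""
    table = []
    for i in range(256):
        crc = i << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
        table.append(crc)
    return table


_TABLE = _build_table()


def crc16_wii_table(data: bytes) -> int:
    """Table-driven CRC with polynomial 0x1021 and start value 0x0000."""
    crc = 0
    for byte in data:
        # The top byte of the register selects the table entry.
        crc = ((crc << 8) & 0xFFFF) ^ _TABLE[(crc >> 8) ^ byte]
    return crc
```

This processes one byte per step instead of one bit, which matters mostly when recomputing the checksum of the whole database repeatedly.<br />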
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-7: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32bit unsigned integer):<br/>Bit 0..27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>Bit 28: Always set?<br/>Bit 29: set for temporary Mii<br/>Bit 30: Set for DSi mii?<br/>Bit 31: not set iff Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex (0 if male, 1 if female)<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-14: favorite color<br/>bit 15: favorite mii (0 if false, 1 if true)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
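<br />
As an illustration of how the bitfields in the table above can be read, here is a minimal sketch that unpacks the 2-byte field at offset 0x18. It assumes the field is a little-endian u16 with bit 0 as the least significant bit; the function name <code>decode_personal_info</code> and the dictionary keys are hypothetical and not taken from any existing tool.<br />

```python
import struct


def decode_personal_info(mii: bytes) -> dict:
    """Unpack the u16 at offset 0x18 of a decrypted Mii.

    Assumption: little-endian storage, bit 0 = least significant bit.
    """
    (value,) = struct.unpack_from("<H", mii, 0x18)
    return {
        "sex": value & 0x1,                      # bit 0: 0 = male, 1 = female
        "birthday_month": (value >> 1) & 0xF,    # bits 1-4
        "birthday_day": (value >> 5) & 0x1F,     # bits 5-9
        "favorite_color": (value >> 10) & 0x1F,  # bits 10-14
        "favorite": bool((value >> 15) & 0x1),   # bit 15
    }
```

The same shift-and-mask pattern applies to the other multi-bit fields in the table, provided the bit numbering assumption holds.<br />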
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, take the given "NTR address" and add 0x08815000.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type and the remaining 28 bits give the number of seconds since 01/01/2010 00:00:00 UTC+3, divided by 2 (the time zone should be verified on 3DS systems from other countries and regions).<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as the decrypted Mii at 0x04 through 0x0B. Seems NAND-specific: the value stays the same for Miis created on the same NAND but on a different 3DS via System Transfer. It might be a coincidence, but the first two bytes appear in the ID0 folder name in the Nintendo 3DS folder.<br />
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most of the values are ordered (left button decreases, right increases, color choices are top to bottom...), but for most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
Here is a JSON-like table that maps a part, a page, and a position to the corresponding hex value. It is 0-indexed (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20922
Mii
2019-04-18T02:11:50Z
<p>Oreo639: Make more consistent</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See [[#CFHE object|CFHE object]].<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section. <br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) followed by 8 zero bytes(?) is appended at the end.<br />
|}<br />
When encrypted in QR codes, 4 additional bytes are added: two null bytes and a CRC-16. It is the exact same CRC-16 as for the Wii blocks, computed over the first 0x5E bytes. It seems that the CRC is ignored, with Mii Maker instead relying on the result of APT:Unwrap to detect integrity loss.<br />
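<br />
The offsets and lengths in the database table can be cross-checked mechanically. The following sketch lists each row as a (name, offset, length) triple and verifies that every field starts where the previous one ends. The field names are informal labels chosen for this example, and the end offset it computes is only the end of the last documented section, not a confirmed file size.<br />

```python
# (name, offset, length) triples copied from the CFL_DB.dat table.
# The names are informal labels for this sketch only.
CFL_DB_LAYOUT = [
    ("CFOG magic", 0x0, 0x4),
    ("CFOG header", 0x4, 0x4),
    ("owned Miis", 0x8, 100 * 0x5C),
    ("CFHE magic", 0x23F8, 0x4),
    ("list tail index", 0x23FC, 0x2),
    ("list head index", 0x23FE, 0x2),
    ("CFHE nodes", 0x2400, 3000 * 0xE),
    ("padding", 0xC810, 0xE),
    ("first checksum", 0xC81E, 0x2),
    ("CFRA magic", 0xC820, 0x4),
    ("CFRA Mii count", 0xC824, 0x4),
    ("CFRA order index", 0xC828, 100 * 0x1),
    ("CFRA Miis", 0xC88C, 100 * 0x48),
    ("CFRA trailer", 0xE4AC, 0x12),
    ("second checksum", 0xE4BE, 0x2),
    ("CFHE Miis", 0xE4C0, 3000 * 0x54),
]


def check_contiguous(layout):
    """Return the end offset after checking that the fields are back to back."""
    expected = 0
    for name, offset, length in layout:
        if offset != expected:
            raise ValueError(
                f"{name}: expected offset {expected:#x}, table says {offset:#x}"
            )
        expected = offset + length
    return expected


END_OF_DOCUMENTED_DATA = check_contiguous(CFL_DB_LAYOUT)
```

The entry sizes are also consistent with each other: 0x5C minus the 0x14-byte author field gives 0x48, and adding the 4-byte timestamp and 8 trailing bytes gives 0x54.<br />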
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of these fields has some special meaning and isn't part of the index value.<br />
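<br />
A node as described above could be parsed as in the following sketch. It assumes the Mii ID is stored big-endian (as in the Mii format) and the two indices little-endian (the page default); both are assumptions rather than verified facts. The names <code>parse_cfhe_node</code> and <code>is_invalid</code> are hypothetical.<br />

```python
import struct

NODE_SIZE = 0xE
INDEX_MASK = 0x7FFF  # the top bit of prev/next is a flag, not part of the index


def parse_cfhe_node(raw: bytes) -> dict:
    """Split one 0xE-byte CFHE node into its fields."""
    if len(raw) != NODE_SIZE:
        raise ValueError("a CFHE node is 0xE bytes long")
    # Assumption: Mii ID big-endian, indices little-endian.
    (mii_id,) = struct.unpack_from(">I", raw, 0)
    mac = raw[4:10]
    prev_raw, next_raw = struct.unpack_from("<HH", raw, 10)
    return {
        "mii_id": mii_id,
        "mac": mac,
        "prev": prev_raw & INDEX_MASK,
        "next": next_raw & INDEX_MASK,
        "prev_flag": bool(prev_raw & 0x8000),
        "next_flag": bool(next_raw & 0x8000),
    }


def is_invalid(node: dict) -> bool:
    """True for the 'invalid node' pattern: zero ID, zero MAC, both indices 0x7FFF."""
    return (
        node["mii_id"] == 0
        and node["mac"] == bytes(6)
        and node["prev"] == INDEX_MASK
        and node["next"] == INDEX_MASK
    )
```

Masking with 0x7FFF before comparing means the check ignores the flag bit, whose meaning is not yet known.<br />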
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, a pseudocode implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_CCITTWii(u8[]: data) -> u16:<br />
"""Calculate a checksum of data using the CRC16-CCITT implementation of the Wii<br />
<br />
This implementation uses 0x0000 as the starting value, which is different<br />
from what CRC16-CCITT specifies.<br />
"""<br />
<br />
# note: a correct implementation of CRC16-CCITT<br />
# would initialize this to 0xffff<br />
u32 crc := 0x0<br />
<br />
for byte in data:<br />
# Iterate over each of the 8 bits in byte.<br />
# Begin with the most significant bit. (7, 6, ... , 1, 0)<br />
for bit in 7..0:<br />
# & - binary `and'; | - binary `or'; <</>> - bitshift left/right; ^ - binary `xor'<br />
crc := (<br />
((crc << 1) | ((byte >> bit) & 0x1))<br />
^ (0x1021 if crc & 0x8000 else 0)<br />
)<br />
<br />
for _ in 0..15:<br />
crc := (crc << 1) ^ (0x1021 if crc & 0x8000 else 0)<br />
<br />
# only return the lowest 16 bits of crc<br />
return (u16) (crc & 0xffff)<br />
<br />
checksum := crc16_CCITTWii(miidb[0:0xc81e]) # checksum over the first 0xc81e bytes<br />
</source><br />
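<br />
For readers who want something directly executable, the pseudocode can be transcribed to Python roughly as follows. This is a sketch, not the code used by any official tool; the name <code>crc16_ccitt_wii</code> is made up for this example, and masking to 16 bits on every step is a choice made here instead of keeping a 32-bit accumulator.<br />

```python
import binascii


def crc16_ccitt_wii(data: bytes) -> int:
    """Bitwise CRC with polynomial 0x1021 and a start value of 0x0000."""
    crc = 0x0000
    for byte in data:
        # Shift the message in one bit at a time, most significant bit first.
        for bit in range(7, -1, -1):
            feedback = 0x1021 if crc & 0x8000 else 0
            crc = (((crc << 1) | ((byte >> bit) & 1)) ^ feedback) & 0xFFFF
    # Flush the register with 16 zero bits.
    for _ in range(16):
        feedback = 0x1021 if crc & 0x8000 else 0
        crc = ((crc << 1) ^ feedback) & 0xFFFF
    return crc
```

With a database image loaded into a <code>bytes</code> object named <code>miidb</code> (a hypothetical variable), the first checksum would be <code>crc16_ccitt_wii(miidb[:0xC81E])</code>. With a zero start value and the 16-bit flush, the result should agree with Python's <code>binascii.crc_hqx(data, 0)</code>, which can serve as a faster drop-in.<br />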
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-6: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32bit unsigned integer):<br/>Bit 0..27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>Bit 28: Always set?<br/>Bit 29: set for temporary Mii<br/>Bit 30: Set for DSi mii?<br/>Bit 31: not set iff Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex (0 if male, 1 if female)<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-14: favorite color<br/>bit 15: favorite mii (0 if false, 1 if true)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
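<br />
The Mii ID at offset 0xC packs a creation time and several flags into one big-endian 32-bit value. The sketch below decodes it according to the table above; the function name <code>decode_mii_id</code> and the dictionary keys are hypothetical, the flag meanings marked with "?" in the table remain unconfirmed, and the returned datetime is naive because the time zone of the epoch is not established.<br />

```python
import struct
from datetime import datetime, timedelta

EPOCH_2010 = datetime(2010, 1, 1)  # time zone unknown, so kept naive


def decode_mii_id(mii: bytes) -> dict:
    """Decode the big-endian u32 Mii ID at offset 0xC of a decrypted Mii."""
    (mii_id,) = struct.unpack_from(">I", mii, 0xC)
    seconds = (mii_id & 0x0FFFFFFF) * 2  # bits 0-27 count 2-second units
    return {
        "created": EPOCH_2010 + timedelta(seconds=seconds),
        "bit28": bool(mii_id & (1 << 28)),        # "always set?"
        "temporary": bool(mii_id & (1 << 29)),
        "dsi": bool(mii_id & (1 << 30)),          # unconfirmed
        "special": not (mii_id & (1 << 31)),      # bit 31 clear means special
    }
```

Because the timestamp is stored in 2-second units, 28 bits cover roughly 17 years from the 2010 epoch.<br />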
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, take the given "NTR address" and add 0x08815000.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type and the remaining 28 bits give the number of seconds since 01/01/2010 00:00:00 UTC+3, divided by 2 (the time zone should be verified on 3DS systems from other countries and regions).<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as the decrypted Mii at 0x04 through 0x0B. Seems NAND-specific: the value stays the same for Miis created on the same NAND but on a different 3DS via System Transfer. It might be a coincidence, but the first two bytes appear in the ID0 folder name in the Nintendo 3DS folder.<br />
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most of the values are ordered (left button decreases, right increases, color choices are top to bottom...), but for most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
Here is a JSON-like table that maps a part, a page, and a position to the corresponding hex value. It is 0-indexed (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20921
Mii
2019-04-18T02:09:31Z
<p>Oreo639: Add further clarification</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See [[#CFHE object|CFHE object]].<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section. <br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) followed by 8 zero bytes(?) is appended at the end.<br />
|}<br />
When encrypted in QR codes, 4 additional bytes are added: two null bytes and a CRC-16. It is the exact same CRC-16 as for the Wii blocks, computed over the first 0x5E bytes. It seems that the CRC is ignored, with Mii Maker instead relying on the result of APT:Unwrap to detect integrity loss.<br />
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of these fields has some special meaning and isn't part of the index value.<br />
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, an implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_CCITTWii(data: bytes) -> int:<br />
    """Calculate a checksum of data using the Wii's CRC16-CCITT implementation.<br />
<br />
    This implementation uses 0x0000 as the starting value, which is different<br />
    from what CRC16-CCITT specifies.<br />
    """<br />
<br />
    # note: a correct implementation of CRC16-CCITT<br />
    # would initialize this to 0xffff<br />
    crc = 0x0<br />
<br />
    for byte in data:<br />
        # Iterate over each of the 8 bits in byte,<br />
        # beginning with the most significant bit (7, 6, ..., 1, 0).<br />
        for bit in range(7, -1, -1):<br />
            # Shift the next message bit in, then xor with the polynomial<br />
            # if bit 15 was set before the shift.<br />
            crc = (<br />
                ((crc << 1) | ((byte >> bit) & 0x1))<br />
                ^ (0x1021 if crc & 0x8000 else 0)<br />
            ) & 0xffff  # keep only the lowest 16 bits<br />
<br />
    # Flush the register with 16 zero bits.<br />
    for _ in range(16):<br />
        crc = ((crc << 1) ^ (0x1021 if crc & 0x8000 else 0)) & 0xffff<br />
<br />
    return crc<br />
<br />
# miidb holds the raw bytes of the database file (CFL_DB.dat)<br />
with open("CFL_DB.dat", "rb") as f:<br />
    miidb = f.read()<br />
checksum = crc16_CCITTWii(miidb[0:0xc81e])  # checksum over the first 0xc81e bytes<br />
</source><br />
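<br />
As a hedged cross-check that is not taken from the original tools: with the starting value 0x0000 and polynomial 0x1021, processing the message bits followed by 16 zero bits should give the same result as the CRC variant commonly called CRC-16/XMODEM. Python's standard <code>binascii.crc_hqx</code> uses that polynomial and takes the starting value as its second argument. The helper name <code>wii_crc16_bitwise</code> below is hypothetical and only restates the bit-by-bit algorithm so the two can be compared.<br />

```python
import binascii


def wii_crc16_bitwise(data: bytes) -> int:
    """Bit-by-bit checksum with a 0x0000 starting value (hypothetical helper)."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):  # most significant bit first
            poly = 0x1021 if crc & 0x8000 else 0
            crc = (((crc << 1) | ((byte >> bit) & 1)) ^ poly) & 0xFFFF
    for _ in range(16):  # flush with 16 zero bits
        poly = 0x1021 if crc & 0x8000 else 0
        crc = ((crc << 1) ^ poly) & 0xFFFF
    return crc


sample = b"123456789"
# Both computations are expected to agree when crc_hqx starts from 0.
assert wii_crc16_bitwise(sample) == binascii.crc_hqx(sample, 0)
print(hex(wii_crc16_bitwise(sample)))
```

If the two agree on real database contents as well, <code>binascii.crc_hqx(miidb[0:0xC81E], 0)</code> would be a shorter way to compute the same value.<br />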
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-6: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32bit unsigned integer):<br/>Bit 0..27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>Bit 28: Always set?<br/>Bit 29: set for temporary Mii<br/>Bit 30: Set for DSi mii?<br/>Bit 31: not set iff Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex (0 if male, 1 if female)<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-14: favorite color<br/>bit 15: favorite mii (1 if true, 0 if false)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
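<br />
The Mii ID layout at offset 0xC can be illustrated with a short Python sketch. The function name <code>decode_mii_id</code> and the dictionary keys are hypothetical; the bit meanings follow the table above, including the entries it marks as uncertain. The time zone of the 2010 epoch is not specified, so the result is a naive datetime.<br />

```python
from datetime import datetime, timedelta

EPOCH_2010 = datetime(2010, 1, 1)  # 01/01/2010 00:00:00, time zone unspecified


def decode_mii_id(raw: bytes) -> dict:
    """Decode the 4-byte big-endian Mii ID (hypothetical helper)."""
    if len(raw) != 4:
        raise ValueError("the Mii ID is exactly 4 bytes long")
    value = int.from_bytes(raw, "big")
    ticks = value & 0x0FFFFFFF  # bits 0..27, in units of 2 seconds
    return {
        "created": EPOCH_2010 + timedelta(seconds=ticks * 2),
        "bit28": bool(value & (1 << 28)),        # "always set?" in the table
        "temporary": bool(value & (1 << 29)),
        "dsi": bool(value & (1 << 30)),          # uncertain in the table
        "special": not (value & (1 << 31)),      # bit 31 clear means special
    }


info = decode_mii_id(bytes.fromhex("90000001"))
print(info["created"], info["special"])
```
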
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, take the given "NTR address" and add 0x08815000.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type, and the following bits hold the number of seconds since 01/01/2010 00:00:00 UTC+3 (to be verified on 3DS consoles from other countries and regions) divided by 2.<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as the decrypted Mii at 0x04 through 0x0B. Seems NAND-specific: the value stays the same for Miis created on the same NAND but on a different 3DS via System Transfer. It might be a coincidence, but the first two bytes appear in the ID0 folder name in the Nintendo 3DS folder.<br />
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most values are ordered (left button decreases, right button increases, color choices run top to bottom...), but in most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
Here is a JSON-like mapping from a part, a page and a position to the corresponding hex value. Indices are 0-based (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639
https://www.3dbrew.org/w/index.php?title=Mii&diff=20920
Mii
2019-04-18T02:04:54Z
<p>Oreo639: Document offset 0x18 in mii format</p>
<hr />
<div>Originally [http://wiibrew.org/wiki/Mii_Data created for the Nintendo Wii] (and backported to a selection of DS/i games), the '''Mii''' format was expanded with a larger selection of facial features and a new "copying" permission for the 3DS family, and later implemented as-is on Wii U.<br />
<br />
See [[Mii Maker]] for the application chiefly designed to create, edit, delete, and trade Miis or convert them from and to a QR code.<br />
<br />
The default endianness in this page is little-endian, unless explicitly specified.<br />
<br />
==Mii Database==<br />
Format of the Mii main database '''CFL_DB.dat''', found in [[Extdata#NAND_Shared_Extdata|shared extdata]] archive f0000000b.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x4<br />
| Header "CFOG" (Mii Maker section)<br />
|-<br />
| 0x4<br />
| 0x4<br />
| Header 0x00000100<br />
|-<br />
| 0x8<br />
| 0x23F0 (100 * 0x5C)<br />
| Array of owned (saved in Mii Maker) Miis. Order in file is unrelated to canonical order in-app.<br />
|-<br />
| 0x23F8<br />
| 0x4<br />
| Header "CFHE"<br />
|-<br />
| 0x23FC<br />
| 0x2<br />
| Linked list tail index. 0xFFFF if the list is empty<br />
|-<br />
| 0x23FE<br />
| 0x2<br />
| Linked list head index. 0xFFFF if the list is empty<br />
|-<br />
| 0x2400<br />
| 0xA410 (3000 * 0xE)<br />
| Linked list of objects? See the CFHE object section below.<br />
|-<br />
| 0xC810<br />
| 0xE<br />
| Padding?<br />
|-<br />
| 0xC81E<br />
| 0x2<br />
| Checksum of all of the above (the first 0xC81E bytes). See section [[#Checksum|below]].<br />
|-<br />
| 0xC820<br />
| 0x4<br />
| Header "CFRA" (Invitations section)<br />
|-<br />
| 0xC824<br />
| 0x4<br />
| Mii count in this section. Maximum 100<br />
|-<br />
| 0xC828<br />
| 0x64 (100 * 0x1)<br />
| Order index of Mii in this section?<br />
|-<br />
| 0xC88C<br />
| 0x1C20 (100 * 0x48)<br />
| Array of Miis contributed from games, used for Mii Plaza "invitations" feature.<br/>The format isn't that of a full Mii. The "author" field is missing<br />
|-<br />
| 0xE4AC<br />
| 0x12<br />
| 01 00 [..] 00<br />
|-<br />
| 0xE4BE<br />
| 0x2<br />
| Checksum over the data above starting from 0xC820<br />
|-<br />
| 0xE4C0<br />
| 0x3D860 (3000 * 0x54)<br />
| Another array of Miis. Seems related to the CFHE section. <br/>The Mii format in this section is modified: the "author" field is missing, and a 4-byte timestamp (seconds since 2000) followed by 8 zero bytes(?) is appended at the end.<br />
|}<br />
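<br />
The offsets and lengths in the table above can be cross-checked with a small Python sketch: each field should start where the previous one ends. The list name <code>FIELDS</code>, the short field labels and the function <code>check_contiguous</code> are hypothetical; the numbers are copied from the table.<br />

```python
# (offset, length, label) for every row of the database table.
FIELDS = [
    (0x0000, 0x4, "CFOG magic"),
    (0x0004, 0x4, "header 0x00000100"),
    (0x0008, 100 * 0x5C, "owned Miis"),
    (0x23F8, 0x4, "CFHE magic"),
    (0x23FC, 0x2, "linked list tail index"),
    (0x23FE, 0x2, "linked list head index"),
    (0x2400, 3000 * 0xE, "CFHE nodes"),
    (0xC810, 0xE, "padding?"),
    (0xC81E, 0x2, "checksum of first 0xC81E bytes"),
    (0xC820, 0x4, "CFRA magic"),
    (0xC824, 0x4, "Mii count"),
    (0xC828, 100 * 0x1, "order indices"),
    (0xC88C, 100 * 0x48, "invitation Miis"),
    (0xE4AC, 0x12, "01 00 [..] 00"),
    (0xE4BE, 0x2, "checksum from 0xC820"),
    (0xE4C0, 3000 * 0x54, "second Mii array"),
]


def check_contiguous(fields) -> int:
    """Return the end offset, raising if a field leaves a gap or overlaps."""
    expected = 0
    for offset, length, label in fields:
        if offset != expected:
            raise ValueError(f"{label}: expected 0x{expected:X}, got 0x{offset:X}")
        expected = offset + length
    return expected


end_of_documented_data = check_contiguous(FIELDS)
print(hex(end_of_documented_data))
```

The returned value is only the end of the last documented field; whether the file ends there is not stated in the table.<br />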
When encrypted in QR codes, 4 additional bytes are added: two null bytes and a CRC-16. It is the same CRC-16 as for the Wii blocks, computed over the first 0x5E bytes. The CRC seems to be ignored; Mii Maker appears to rely on the result of APT:Unwrap to detect integrity loss.<br />
<br />
==CFHE object==<br />
<br />
A 0xE-byte long linked list node. The format is 4-byte Mii ID (See Mii format) + 6-byte MAC + 2-byte previous node index (prev) + 2-byte next node index (next).<br />
<br />
An invalid node has value: ID = 0, MAC = 0, prev = 0x7FFF, next = 0x7FFF.<br />
<br />
The highest bit of the prev and next fields has some special meaning and isn't part of the index value.<br />
<br />
==Checksum==<br />
<br />
The algorithm used to verify the integrity of the database is based on [http://srecord.sourceforge.net/crc16-ccitt.html CRC16-CCITT], though it's an incorrect implementation. It is the same algorithm used to verify [http://wiibrew.org/wiki/Mii_Data#Block_format Mii Data on the Wii].<br />
<br />
To obtain the correct value for the checksum, apply the algorithm to the first 0xC81E bytes of the database. This can be done using [https://gbatemp.net/threads/tutorial-give-your-mii-gold-pants-and-use-it-for-streetpass.379146/page-24#post-6569186 FixCRC]; alternatively, an implementation of the checksum algorithm is given below:<br />
<br />
<source lang="python"><br />
def crc16_CCITTWii(data: bytes) -> int:<br />
    """Calculate a checksum of data using the Wii's CRC16-CCITT implementation.<br />
<br />
    This implementation uses 0x0000 as the starting value, which is different<br />
    from what CRC16-CCITT specifies.<br />
    """<br />
<br />
    # note: a correct implementation of CRC16-CCITT<br />
    # would initialize this to 0xffff<br />
    crc = 0x0<br />
<br />
    for byte in data:<br />
        # Iterate over each of the 8 bits in byte,<br />
        # beginning with the most significant bit (7, 6, ..., 1, 0).<br />
        for bit in range(7, -1, -1):<br />
            # Shift the next message bit in, then xor with the polynomial<br />
            # if bit 15 was set before the shift.<br />
            crc = (<br />
                ((crc << 1) | ((byte >> bit) & 0x1))<br />
                ^ (0x1021 if crc & 0x8000 else 0)<br />
            ) & 0xffff  # keep only the lowest 16 bits<br />
<br />
    # Flush the register with 16 zero bits.<br />
    for _ in range(16):<br />
        crc = ((crc << 1) ^ (0x1021 if crc & 0x8000 else 0)) & 0xffff<br />
<br />
    return crc<br />
<br />
# miidb holds the raw bytes of the database file (CFL_DB.dat)<br />
with open("CFL_DB.dat", "rb") as f:<br />
    miidb = f.read()<br />
checksum = crc16_CCITTWii(miidb[0:0xc81e])  # checksum over the first 0xc81e bytes<br />
</source><br />
<br />
==Mii format==<br />
<br />
{| class="wikitable"<br />
|-<br />
! Offset<br />
! Length<br />
! <br />
|-<br />
| 0x0<br />
| 0x1<br />
| Always 3?<br />
|-<br />
| 0x1<br />
| 0x1<br />
| bit 0: allow copying<br/>bit 1: private name?<br/>bit 2-3: region lock (0=no lock, 1=JPN, 2=USA, 3=EUR)<br/>bit 4-5: character set (0=JPN+USA+EUR, 1=CHN, 2=KOR, 3=TWN)<br />
|-<br />
| 0x2<br />
| 0x1<br />
| Mii position shown on the selection screen<br/>bit 0-3: page index <br/>bit 4-7: slot index<br />
|-<br />
| 0x3<br />
| 0x1<br />
| bit 0-3: ?<br/>bit 4-6: version? (1=Wii, 2=DSi, 3=3DS)<br />
|-<br />
| 0x4<br />
| 0x8<br />
| System ID (identifies owner, for purpose of enforcing editing restrictions and blue pants).<br/>Is not tied to the MAC address anymore.<br />
|-<br />
| 0xC<br />
| 0x4<br />
| Mii ID (big-endian 32bit unsigned integer):<br/>Bit 0..27: (bit[0..27] * 2) = date of creation (seconds since 01/01/2010 00:00:00)<br/>Bit 28: Always set?<br/>Bit 29: set for temporary Mii<br/>Bit 30: Set for DSi mii?<br/>Bit 31: not set iff Mii is special<br />
|-<br />
| 0x10<br />
| 0x6<br />
| Creator's full MAC<br />
|-<br />
| 0x16<br />
| 0x2<br />
| Padding (0000)<br />
|-<br />
| 0x18<br />
| 0x2<br />
| bit 0: sex<br/>bit 1-4: birthday month<br/>bit 5-9: birthday day<br/>bit 10-14: favorite color<br/>bit 15: favorite mii (1 if true, 0 if false)<br />
|-<br />
| 0x1A<br />
| 0x14<br />
| UTF-16 Mii Name (10 chars max, 0000 terminated)<br />
|-<br />
| 0x2E<br />
| 0x2<br />
| width & height<br />
|-<br />
| 0x30<br />
| 0x1<br />
| bit 0: disable sharing<br/>bit 1-4: face shape<br/>bit 5-7: skin color<br />
|-<br />
| 0x31<br />
| 0x1<br />
| bit 0-3: wrinkles<br/>bit 4-7: makeup<br />
|-<br />
| 0x32<br />
| 0x1<br />
| hair style<br />
|-<br />
| 0x33<br />
| 0x1<br />
| bit 0-2: hair color<br/>bit 3: flip hair<br />
|-<br />
| 0x34<br />
| 0x4<br />
| bit 0-5: eye style<br/>bit 6-8: eye color <br/>bit 9-12: eye scale <br/>bit 13-15: eye yscale<br/>bit 16-20: eye rotation<br/>bit 21-24: eye x spacing<br/>bit 25-29: eye y position<br />
|-<br />
| 0x38<br />
| 0x4<br />
| bit 0-4: eyebrow style<br/>bit 5-7: eyebrow color <br/>bit 8-11: eyebrow scale<br/>bit 12-14: eyebrow yscale <br/>bit 16-19: eyebrow rotation<br/>bit 21-24: eyebrow x spacing<br/>bit 25-29: eyebrow y position<br />
|-<br />
| 0x3C<br />
| 0x2<br />
| bit 0-4: nose style<br/>bit 5-8: nose scale<br/>bit 9-13: nose y position<br />
|-<br />
| 0x3E<br />
| 0x2<br />
| bit 0-5: mouth style<br/>bit 6-8: mouth color<br/>bit 9-12: mouth scale<br/>bit 13-15: mouth yscale<br />
|-<br />
| 0x40<br />
| 0x2<br />
| bit 0-4: mouth y position<br/>bit 5-7: mustache style<br />
|-<br />
| 0x42<br />
| 0x2<br />
| bit 0-2: beard style<br/>bit 3-5: beard color<br/>bit 6-9: mustache scale<br/>bit 10-14: mustache y position<br />
|-<br />
| 0x44<br />
| 0x2<br />
| bit 0-3: glasses style<br/>bit 4-6: glasses color<br/>bit 7-10: glasses scale<br/>bit 11-15: glasses y position<br />
|-<br />
| 0x46<br />
| 0x2<br />
| bit 0: enable mole<br/>bit 1-4: mole scale<br/>bit 5-9: mole x position<br/>bit 10-14: mole y position<br />
|-<br />
| 0x48<br />
| 0x14<br />
| UTF-16 Author Name (10 chars max, 0000 terminated)<br />
|}<br />
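<br />
Most rows in the table above pack several values into one integer. As a sketch of how such a row can be unpacked, the Python below decodes the 2-byte field at offset 0x18. The names <code>bits</code> and <code>decode_field_0x18</code> are hypothetical, and the field is assumed to be little-endian, the default stated at the top of this page.<br />

```python
def bits(value: int, low: int, high: int) -> int:
    """Extract bits low..high (inclusive, bit 0 = least significant)."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_field_0x18(raw: bytes) -> dict:
    """Decode the 2-byte field at offset 0x18 (hypothetical helper)."""
    if len(raw) != 2:
        raise ValueError("the field at 0x18 is exactly 2 bytes long")
    value = int.from_bytes(raw, "little")  # assumption: little-endian
    return {
        "sex": bits(value, 0, 0),
        "birthday_month": bits(value, 1, 4),
        "birthday_day": bits(value, 5, 9),
        "favorite_color": bits(value, 10, 14),
        "favorite": bool(bits(value, 15, 15)),
    }


# Example value built by hand: sex=1, month=12, day=31, color=5, favorite=1.
packed = 1 | (12 << 1) | (31 << 5) | (5 << 10) | (1 << 15)
decoded = decode_field_0x18(packed.to_bytes(2, "little"))
print(decoded)
```

The same <code>bits</code> helper can be applied to the other rows by reading the integer at the row's offset and length and passing the bit ranges from the table.<br />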
<br />
==Mii categories (pants colors)==<br />
<br />
====Special (gold) Miis====<br />
Specialness will override any other color and make the Mii non-editable.<br />
<br />
Copying is rumored to have to be disabled.<br />
<br />
Zeroed system-id and timestamp?<br />
<br />
====Imported (blue) Miis====<br />
Any (non-gold) Mii with a different System ID will appear as a foreign one.<br />
<br />
There is also a range of Mii IDs that are always foreign and uneditable, regardless of the System ID:<br />
<br />
<br />
====Regular (black/red) Miis====<br />
Always editable, since they can only appear as such on the console that created them.<br />
<br />
<br />
====Personal (red) Mii====<br />
A red Mii that happens to be the first in the file!<br />
<br />
==Mii values==<br />
Each of the following values was found with NTR Debugger.<br />
To access a value, take the given "NTR address" and add 0x08815000.<br />
<br />
{| class="wikitable"<br />
|-<br />
! Data<br />
! NTR address<br />
! Variation (hex)<br />
! Notes<br />
|-<br />
| Face style<br />
| 0x894<br />
| 00-0B<br />
| Not ordered as in editor, read below<br />
|-<br />
| Face color<br />
| 0x898<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Wrinkles<br />
| 0x89C<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Makeup<br />
| 0x8A0<br />
| 00-0B<br />
| Same order as displayed in editor<br />
|-<br />
| Hair style<br />
| 0x8A4<br />
| 00-84<br />
| Not ordered as in editor, read below<br />
|-<br />
| Hair color<br />
| 0x8A8<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Hair flipped<br />
| 0x8AC<br />
| 1 if true<br />
| From top to bottom<br />
|-<br />
| Eye style<br />
| 0x8B0<br />
| 00-3C<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyes color<br />
| 0x8B4<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Eyes size<br />
| 0x8B8<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Eyes thickness<br />
| 0x8BC<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyes rotation<br />
| 0x8C0<br />
| 00-07<br />
| <br />
|-<br />
| Eyes spacing<br />
| 0x8C4<br />
| 00-0C<br />
| <br />
|-<br />
| Eyes height<br />
| 0x8C8<br />
| 00-12<br />
| <br />
|-<br />
| Eyebrows style<br />
| 0x8CC<br />
| 00-18<br />
| Not ordered as in editor, read below<br />
|-<br />
| Eyebrows color<br />
| 0x8D0<br />
| 00-07<br />
| From top to bottom<br />
|-<br />
| Eyebrows size<br />
| 0x8D4<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows thickness<br />
| 0x8D8<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Eyebrows rotation<br />
| 0x8DC<br />
| 00-0B<br />
| <br />
|-<br />
| Eyebrows spacing<br />
| 0x8E0<br />
| 00-0C<br />
| <br />
|-<br />
| Eyebrows height<br />
| 0x8E4<br />
| 03-12<br />
| Yup, minimum is 0x03<br />
|-<br />
| Nose style<br />
| 0x8E8<br />
| 00-11<br />
| Not ordered as in editor, read below<br />
|-<br />
| Nose size<br />
| 0x8EC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Nose height<br />
| 0x8F0<br />
| 00-12<br />
| <br />
|-<br />
| Mouth style<br />
| 0x8F4<br />
| 00-23<br />
| Not ordered as in editor, read below<br />
|-<br />
| Mouth color<br />
| 0x8F8<br />
| 00-04<br />
| From top to bottom.<br />
|-<br />
| Mouth size<br />
| 0x8FC<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mouth thickness<br />
| 0x900<br />
| 06-00<br />
| Left button increases value.<br />
|-<br />
| Mouth height<br />
| 0x904<br />
| 00-12<br />
| <br />
|-<br />
| Mustache style<br />
| 0x908<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Beard style<br />
| 0x90C<br />
| 00-05<br />
| Order like in editor.<br />
|-<br />
| Mustache/Beard color<br />
| 0x910<br />
| 00-07<br />
| From top to bottom.<br />
|-<br />
| Mustache size<br />
| 0x914<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mustache height<br />
| 0x918<br />
| 00-10<br />
| <br />
|-<br />
| Glasses style<br />
| 0x91C<br />
| 00-08<br />
| Order like in editor.<br />
|-<br />
| Glasses color<br />
| 0x920<br />
| 00-05<br />
| From top to bottom<br />
|-<br />
| Glasses size<br />
| 0x924<br />
| 07-00<br />
| Left button increases value.<br />
|-<br />
| Glasses height<br />
| 0x928<br />
| 00-14<br />
| <br />
|-<br />
| Mole enable<br />
| 0x92C<br />
| 1 if enabled, 0 else.<br />
| <br />
|-<br />
| Mole size<br />
| 0x930<br />
| 08-00<br />
| Left button increases value.<br />
|-<br />
| Mole horiz pos<br />
| 0x934<br />
| 00-10<br />
| <br />
|-<br />
| Mole vert pos<br />
| 0x938<br />
| 00-1E<br />
| <br />
|-<br />
| Mii height<br />
| 0x93C<br />
| 00-7F<br />
| <br />
|-<br />
| Mii weight<br />
| 0x940<br />
| 00-7F<br />
| <br />
|-<br />
| Mii name<br />
| 0x944-0x959<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Creator's name<br />
| 0x95A-0x96F<br />
| UTF-16<br />
| Terminated with 0x0000. Not updated immediately?<br />
|-<br />
| Mii gender<br />
| 0x970<br />
| 0: Male, 1: Female<br />
| <br />
|-<br />
| Birthdate month<br />
| 0x974<br />
| 01-0C<br />
| <br />
|-<br />
| Birthdate day<br />
| 0x978<br />
| 01-1F<br />
| <br />
|-<br />
| Mii shirt color<br />
| 0x97C<br />
| 00-0B<br />
| Ordered like in editor.<br />
|-<br />
| Favorite<br />
| 0x980<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Allow copy<br />
| 0x981<br />
| 0: false, 1: true<br />
| <br />
|-<br />
| Unused byte?<br />
| 0x982<br />
| <br />
| <br />
|-<br />
| Allow sharing<br />
| 0x983<br />
| 0: true, 1: false<br />
|<br />
|-<br />
| ???<br />
| 0x984-0x98F<br />
| All zero?<br />
|<br />
|-<br />
| ???<br />
| 0x990-0x997<br />
| 4?<br />
|<br />
|}<br />
0x08815998: Same 4 bytes as in the encrypted Mii: the first 4 bits give the Mii type, and the following bits hold the number of seconds since 01/01/2010 00:00:00 UTC+3 (to be verified on 3DS consoles from other countries and regions) divided by 2.<br />
0x0881599C: 6 bytes of MAC address of the 3DS that created the Mii.<br />
0x088159A2: 6 bytes of unknown use.<br />
0x088159A8: Same 8 bytes as the decrypted Mii at 0x04 through 0x0B. Seems NAND-specific: the value stays the same for Miis created on the same NAND but on a different 3DS via System Transfer. It might be a coincidence, but the first two bytes appear in the ID0 folder name in the Nintendo 3DS folder.<br />
<br />
===Mapped Editor <-> Hex values===<br />
<br />
Most values are ordered (left button decreases, right button increases, color choices run top to bottom...), but in most "main" parts of the UI, where you choose the style of the part being edited, the hex values have no correlation with the displayed order.<br />
Here is a JSON-like mapping from a part, a page and a position to the corresponding hex value. Indices are 0-based (e.g. datas["face"][0][11]).<br />
<br />
<nowiki>{<br />
face: [<br />
0x00,0x01,0x08,<br />
0x02,0x03,0x09,<br />
0x04,0x05,0x0a,<br />
0x06,0x07,0x0b<br />
],<br />
hairs: [<br />
[0x21,0x2f,0x28,<br />
0x25,0x20,0x6b,<br />
0x30,0x33,0x37,<br />
0x46,0x2c,0x42],<br />
[0x34,0x32,0x26,<br />
0x31,0x2b,0x1f,<br />
0x38,0x44,0x3e,<br />
0x73,0x4c,0x77],<br />
[0x40,0x51,0x74,<br />
0x79,0x16,0x3a,<br />
0x3c,0x57,0x7d,<br />
0x75,0x49,0x4b],<br />
[0x2a,0x59,0x39,<br />
0x36,0x50,0x22,<br />
0x17,0x56,0x58,<br />
0x76,0x27,0x24],<br />
[0x2d,0x43,0x3b,<br />
0x41,0x29,0x1e,<br />
0x0c,0x10,0x0a,<br />
0x52,0x80,0x81],<br />
[0x0e,0x5f,0x69,<br />
0x64,0x06,0x14,<br />
0x5d,0x66,0x1b,<br />
0x04,0x11,0x6e],<br />
[0x7b,0x08,0x6a,<br />
0x48,0x03,0x15,<br />
0x00,0x62,0x3f,<br />
0x5a,0x0b,0x78],<br />
[0x05,0x4a,0x6c,<br />
0x5e,0x7c,0x19,<br />
0x63,0x45,0x23,<br />
0x0d,0x7a,0x71],<br />
[0x35,0x18,0x55,<br />
0x53,0x47,0x83,<br />
0x60,0x65,0x1d,<br />
0x07,0x0f,0x70],<br />
[0x4f,0x01,0x6d,<br />
0x7f,0x5b,0x1a,<br />
0x3d,0x67,0x02,<br />
0x4d,0x12,0x5c],<br />
[0x54,0x09,0x13,<br />
0x82,0x61,0x68,<br />
0x2e,0x4e,0x1c,<br />
0x72,0x7e,0x6f]<br />
],<br />
eyebrows: [<br />
[0x06,0x00,0x0c,<br />
0x01,0x09,0x13,<br />
0x07,0x15,0x08,<br />
0x11,0x05,0x04],<br />
[0x0b,0x0a,0x02,<br />
0x03,0x0e,0x14,<br />
0x0f,0x0d,0x16,<br />
0x12,0x10,0x17]<br />
],<br />
nose: [<br />
[0x01,0x0a,0x02,<br />
0x03,0x06,0x00,<br />
0x05,0x04,0x08,<br />
0x09,0x07,0x0B],<br />
[0x0d,0x0e,0x0c,<br />
0x11,0x10,0x0f]<br />
],<br />
mouth: [<br />
[0x17,0x01,0x13,<br />
0x15,0x16,0x05,<br />
0x00,0x08,0x0a,<br />
0x10,0x06,0x0d],<br />
[0x07,0x09,0x02,<br />
0x11,0x03,0x04,<br />
0x0f,0x0b,0x14,<br />
0x12,0x0e,0x0c],<br />
[0x1b,0x1e,0x18,<br />
0x19,0x1d,0x1c,<br />
0x1a,0x23,0x1f,<br />
0x22,0x21,0x20]<br />
]<br />
}</nowiki></div>
Oreo639