Difference between revisions of "GPU/Pitfalls"

From 3dbrew
m (style)
m
 
(2 intermediate revisions by the same user not shown)
Line 7: Line 7:
 
=== Vertex attribute alignment ===
 
  
Vertex components which are defined through [[GPU/Internal_Registers#GPUREG_ATTRIBBUFFERi_CONFIG1|GPUREG_ATTRIBBUFFERi_CONFIG1]] will be aligned.
+
Vertex components which are defined through [[GPU/Internal_Registers#GPUREG_ATTRIBBUFFERi_CONFIG1|GPUREG_ATTRIBBUFFERi_CONFIG1]] will be accessed aligned by the GPU.
 
* Vertex attributes will be aligned to their component element size.
 
 
* Padding attributes (Component type > 11) will always aligned to 4 byte offets into the buffer.
 
 +
* The stride which is passed to the GPU should be passed unaligned.
  
 
=== Vertex stride in GPUREG_ATTRIBBUFFERi_CONFIG2 ===
 
Line 19: Line 20:
 
=== Output mapping in GPUREG_SH_OUTMAP_MASK ===
 
  
The output masking in [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_MASK|GPUREG_SH_OUTMAP_MASK]] influences how the registers starting at [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_Oi|GPUREG_SH_OUTMAP_Oi]] will map to outputs in the shader.
+
The output masking in [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_MASK|GPUREG_SH_OUTMAP_MASK]] influences how the registers starting at [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_Oi|GPUREG_SH_OUTMAP_Oi]] map to outputs in the shader.
  
 
If an output is disabled in [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_MASK|GPUREG_SH_OUTMAP_MASK]] it means that no slot in the [[GPU/Internal_Registers#GPUREG_SH_OUTMAP_Oi|GPUREG_SH_OUTMAP_Oi]] registers is consumed.
 
Line 54: Line 55:
 
== Shaders ==
 
  
=== Write output components exactly once ===
+
=== Configued Output components must be written exactly once ===
  
 
Each configured output component has to be written exactly once or the PICA freezes.
 

Latest revision as of 20:52, 14 March 2016

This page collects oddities and pitfalls of the PICA GPU used in the 3DS.

Internal Registers

Vertex attribute alignment

Vertex components defined through GPUREG_ATTRIBBUFFERi_CONFIG1 are accessed by the GPU at aligned offsets.

  • Vertex attributes are aligned to their component element size.
  • Padding attributes (component type > 11) are always aligned to 4-byte offsets into the buffer.
  • The stride passed to the GPU should be passed unaligned, that is, without rounding it up.
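
The alignment rules above can be sketched as a small offset calculator. This is only an illustration: the function name, the tuple format for components, and the choice to measure offsets from the start of a single vertex are assumptions made for the example, not something the hardware documentation states.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout_vertex(components):
    """Compute the offset of each component and the total vertex size.

    components is a hypothetical description of one vertex:
      ("attr", element_size, element_count)  for a regular attribute
      ("pad", num_bytes)                     for a padding attribute
    Offsets are assumed to be relative to the start of the vertex.
    """
    offsets = []
    cursor = 0
    for comp in components:
        if comp[0] == "pad":
            # Padding attributes start on a 4-byte boundary.
            cursor = align_up(cursor, 4)
            offsets.append(cursor)
            cursor += comp[1]
        else:
            _, element_size, element_count = comp
            # Regular attributes are aligned to their element size.
            cursor = align_up(cursor, element_size)
            offsets.append(cursor)
            cursor += element_size * element_count
    # The size is returned as-is, without rounding it up.
    return offsets, cursor


# Three bytes, then two floats, then one 16-bit value.
print(layout_vertex([("attr", 1, 3), ("attr", 4, 2), ("attr", 2, 1)]))
# -> ([0, 4, 12], 14)
```

In this reading, the float pair is pushed from offset 3 to offset 4, and the resulting size of 14 bytes is what would be used as the stride, left unaligned.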

Vertex stride in GPUREG_ATTRIBBUFFERi_CONFIG2

The vertex stride set in GPUREG_ATTRIBBUFFERi_CONFIG2 must match the actual size of a vertex in the buffer; otherwise the PICA will either freeze or draw nothing.

If you want to use a different stride, you have to pad the data accordingly with padding attributes.

Output mapping in GPUREG_SH_OUTMAP_MASK

The output masking in GPUREG_SH_OUTMAP_MASK influences how the registers starting at GPUREG_SH_OUTMAP_Oi map to outputs in the shader.

If an output is disabled in GPUREG_SH_OUTMAP_MASK, it consumes no slot in the GPUREG_SH_OUTMAP_Oi registers. GPUREG_SH_OUTMAP_TOTAL configures the number of consecutive slots used in the outmap.

Example:

Register                Value       Meaning
GPUREG_SH_OUTMAP_TOTAL  0x00000002  2 outputs enabled
GPUREG_SH_OUTMAP_MASK   0x00000011  o0 enabled, o4 enabled
GPUREG_SH_OUTMAP_O0     0x03020100  o0 = pos.xyzw
GPUREG_SH_OUTMAP_O1     0x0B0A0908  o4 = color.rgba
GPUREG_SH_OUTMAP_O2     ...         (unused)
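
The slot assignment in this example can be reproduced with a short decoder. It is a sketch under assumptions: the function name is made up, the semantic names cover only the values that appear in the example, and the low byte of each GPUREG_SH_OUTMAP_Oi value is assumed to describe the first (x) component.

```python
# Semantic IDs seen in the example only; other values are shown as raw hex.
SEMANTIC_NAMES = {
    0x00: "pos.x", 0x01: "pos.y", 0x02: "pos.z", 0x03: "pos.w",
    0x08: "color.r", 0x09: "color.g", 0x0A: "color.b", 0x0B: "color.a",
}


def map_outputs(mask, total, slots):
    """Pair each enabled shader output with the next consecutive outmap slot.

    mask  : value of GPUREG_SH_OUTMAP_MASK
    total : value of GPUREG_SH_OUTMAP_TOTAL
    slots : values of GPUREG_SH_OUTMAP_O0, O1, ... in order
    """
    # Outputs that are disabled in the mask are skipped and consume no slot.
    enabled = [i for i in range(mask.bit_length()) if (mask >> i) & 1]
    mapping = {}
    for out_reg, value in zip(enabled, slots[:total]):
        # Assumption: one byte per component, lowest byte first.
        semantics = [(value >> (8 * c)) & 0xFF for c in range(4)]
        mapping["o%d" % out_reg] = [SEMANTIC_NAMES.get(s, hex(s)) for s in semantics]
    return mapping


print(map_outputs(0x00000011, 2, [0x03020100, 0x0B0A0908]))
```

For the register values above, slot 0 ends up attached to o0 and slot 1 to o4, even though o1 through o3 sit between them.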

Shaders

Configured output components must be written exactly once

Each configured output component has to be written exactly once or the PICA freezes.
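
A simple offline check for this rule might look like the following. The data format is hypothetical: writes are given as (register, component string) pairs collected from a straight-line shader, and the check counts writes statically, so it cannot account for branches or loops.

```python
from collections import Counter


def check_output_writes(configured, writes):
    """Return the configured components that are not written exactly once.

    configured : iterable of (register, component) pairs, e.g. ("o0", "x")
    writes     : iterable of (register, components) pairs, e.g. ("o0", "xy")
    The result maps each offending component to its write count.
    """
    counts = Counter()
    for register, components in writes:
        for component in components:
            counts[(register, component)] += 1
    return {key: counts[key] for key in configured if counts[key] != 1}


configured = [("o0", "x"), ("o0", "y"), ("o1", "x")]
writes = [("o0", "xy"), ("o0", "x")]
print(check_output_writes(configured, writes))
# -> {('o0', 'x'): 2, ('o1', 'x'): 0}
```

Here o0.x is reported because it is written twice and o1.x because it is never written; both cases violate the rule.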

MOVA instructions can't be adjacent

Two consecutive MOVA instructions will freeze the PICA. This can be avoided by placing a NOP between the two MOVAs or by reordering the code.
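
The NOP workaround can be sketched as a pass over a list of instruction strings. The textual instruction syntax and the function names are illustrative assumptions; a real assembler would also have to fix up jump targets after inserting instructions, which this sketch does not attempt.

```python
def is_mova(instruction):
    """True if the instruction's mnemonic is MOVA (case-insensitive)."""
    parts = instruction.split()
    return bool(parts) and parts[0].lower() == "mova"


def separate_movas(instructions):
    """Insert a NOP wherever two MOVA instructions would be adjacent."""
    result = []
    for instruction in instructions:
        if result and is_mova(instruction) and is_mova(result[-1]):
            result.append("nop")
        result.append(instruction)
    return result


program = ["mova a0.x, r0.x", "mova a0.y, r1.x", "mov r2, c0"]
print(separate_movas(program))
# -> ['mova a0.x, r0.x', 'nop', 'mova a0.y, r1.x', 'mov r2, c0']
```

Reordering the code so that another useful instruction sits between the two MOVAs achieves the same effect without spending a slot on a NOP.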