Changes

Update GPUREG_VSH_COM_MODE from hardware findings
Line 1: Line 1: −
[[Category:GFX]]
+
[[Category:GPU]]
(this page is hugely WIP)
      
== Overview ==
 
== Overview ==
    
GPU internal registers are written to through GPU commands. They are used to control the GPU's behavior, that is to say tell it to draw stuff and how we want it drawn.
 
GPU internal registers are written to through GPU commands. They are used to control the GPU's behavior, that is to say tell it to draw stuff and how we want it drawn.
 +
 +
Each command is at least 8 bytes wide. The first word is the command parameter and the second word constitutes the command header. Optionally, more parameter words may follow (potentially including a padding word to align commands to multiples of 8 bytes).
 +
 +
In the simplest case, a command is exactly 8 bytes wide. You can think of such a command as writing the parameter word to an internal register (the index of which is given in the command header). The more general case where more than one parameter word is given is equivalent to multiple simple commands (one for each parameter word). If consecutive writing mode is enabled in the command header, the current command index will be incremented after each parameter write. Otherwise, the parameters will be consecutively written to the same register.
 +
 +
For example, the sequence "0xAAAAAAAA 0x802F011C 0xBBBBBBBB 0xCCCCCCCC" is equivalent to a call to commands 0xF011C with parameter 0xAAAAAAAA, 0xF011D with parameter 0xBBBBBBBB and 0xF011E with parameter 0xCCCCCCCC. If consecutive writing mode were disabled, the command would be equivalent to three consecutive calls to 0xF011C (once with parameter 0xAAAAAAAA, once with 0xBBBBBBBB, and finally with 0xCCCCCCCC).
 +
 +
Invalid GPU command parameters including NaN floats can cause the GPU to hang, which then causes the GSP module to hang as well.
 +
 +
The size of GPU command buffers must be 0x10-byte aligned; the lower 3 bits of the size are cleared. A common pitfall is having the finalization command (write to register 0x0010) not executed because it was the last 8 bytes of a non-0x10 byte aligned command buffer, and having the GPU hang as a result.
 +
 +
=== Command Header ===
 +
{| class="wikitable" border="1"
 +
!  Bit
 +
!  Description
 +
|-
 +
| 0-15
 +
| Command ID
 +
|-
 +
| 16-19
 +
| Parameter mask
 +
|-
 +
| 20-27
 +
| Number of extra parameters (may be zero)
 +
|-
 +
| 28-30
 +
| Unused
 +
|-
 +
| 31
 +
| Consecutive writing mode
 +
|}
 +
 +
=== Parameter masking ===
 +
 +
Using a value other than 0xF, parts of a word in internal GPU memory can be updated without touching the other bits of it. For example, setting bit 16 to zero indicates that the least significant byte of the parameter will not be overwritten, setting bit 17 to zero indicates that the parameter's second LSB will not be overwritten, etc. This means that for instance commands 0x00010107 and 0x00020107 refer to the same thing but write different parts of the parameter.
    
=== Types ===
 
=== Types ===
    
There are three main types of registers :
 
There are three main types of registers :
* configuration registers, which directly map to various rendering properties (for example : [[#GPUREG_FACECULLING_CONFIG|GPUREG_FACECULLING_CONFIG]])
+
* configuration registers, which directly map to various rendering properties (for example: [[#GPUREG_FACECULLING_CONFIG|GPUREG_FACECULLING_CONFIG]])
* data transfer registers, which can be seen as FIFOs that let us send sequential chunks of data to the GPU, such as shader code or 1D samplers (for example : [[#GPUREG_GSH_CODETRANSFER_DATA|GPUREG_GSH_CODETRANSFER_DATA]])
+
* data transfer registers, which can be seen as FIFOs that let us send sequential chunks of data to the GPU, such as shader code or 1D samplers (for example: [[#GPUREG_SH_CODETRANSFER_DATA|GPUREG_GSH_CODETRANSFER_DATA]])
* action triggering registers, which tell the GPU to do something, like draw a primitive (for example : [[#GPUREG_DRAWARRAYS|GPUREG_DRAWARRAYS]])
+
* action triggering registers, which tell the GPU to do something, like draw a primitive (for example: [[#GPUREG_DRAWARRAYS|GPUREG_DRAWARRAYS]])
    
=== Aliases ===
 
=== Aliases ===
   −
It is possible for multiple register (sequential) IDs to correspond to the same register. This is done to leverage the consecutive writing mode for [[GPU Commands]], which makes it possible for a single command to write data to multiple sequential register IDs. For example, register IDs 02C1 through 02C8 all correspond to [[#GPUREG_VSH_FLOATUNIFORM_DATA|GPUREG_VSH_FLOATUNIFORM_DATA]] so that a consecutively writing command based at 02C0 will write its first parameter to [[#GPUREG_VSH_FLOATUNIFORM_CONFIG|GPUREG_VSH_FLOATUNIFORM_CONFIG]] and ever subsequent ones to [[#GPUREG_VSH_FLOATUNIFORM_DATA|GPUREG_VSH_FLOATUNIFORM_DATA]]
+
It is possible for multiple register (sequential) IDs to correspond to the same register. This is done to leverage the consecutive writing mode for GPU commands, which makes it possible for a single command to write data to multiple sequential register IDs. For example, register IDs 02C1 through 02C8 all correspond to [[#GPUREG_VSH_FLOATUNIFORM_DATAi|GPUREG_VSH_FLOATUNIFORM_DATA''i'']] so that a consecutively writing command based at 02C0 will write its first parameter to [[#GPUREG_VSH_FLOATUNIFORM_INDEX|GPUREG_VSH_FLOATUNIFORM_INDEX]] and ever subsequent ones to [[#GPUREG_VSH_FLOATUNIFORM_DATAi|GPUREG_VSH_FLOATUNIFORM_DATA''i'']]
    
=== Data Types ===
 
=== Data Types ===
Line 2,907: Line 2,941:  
|-
 
|-
 
| 0233
 
| 0233
| [[#GPUREG_FIXEDATTRIB_DATA0|GPUREG_FIXEDATTRIB_DATA0]]
+
| [[#GPUREG_FIXEDATTRIB_DATAi|GPUREG_FIXEDATTRIB_DATA0]]
 
|?
 
|?
 
|PICA_REG_VS_FIXED_ATTR_DATA0
 
|PICA_REG_VS_FIXED_ATTR_DATA0
 
|-
 
|-
 
| 0234
 
| 0234
| [[#GPUREG_FIXEDATTRIB_DATA1|GPUREG_FIXEDATTRIB_DATA1]]
+
| [[#GPUREG_FIXEDATTRIB_DATAi|GPUREG_FIXEDATTRIB_DATA1]]
 
|?
 
|?
 
|PICA_REG_VS_FIXED_ATTR_DATA1
 
|PICA_REG_VS_FIXED_ATTR_DATA1
 
|-
 
|-
 
| 0235
 
| 0235
| [[#GPUREG_FIXEDATTRIB_DATA2|GPUREG_FIXEDATTRIB_DATA2]]
+
| [[#GPUREG_FIXEDATTRIB_DATAi|GPUREG_FIXEDATTRIB_DATA2]]
 
|?
 
|?
 
|PICA_REG_VS_FIXED_ATTR_DATA2
 
|PICA_REG_VS_FIXED_ATTR_DATA2
Line 4,127: Line 4,161:     
These registers map components of the corresponding vertex shader output register to specific fixed-function semantics.
 
These registers map components of the corresponding vertex shader output register to specific fixed-function semantics.
 +
 +
Semantics that have not been mapped to a component of an output register have a value of 1
    
Semantic values:
 
Semantic values:
Line 4,333: Line 4,369:  
|-
 
|-
 
| 0-9
 
| 0-9
| unsigned, X
+
| signed, X
 
|-
 
|-
 
| 16-25
 
| 16-25
| unsigned, Y
+
| signed, Y
 
|}
 
|}
   Line 4,532: Line 4,568:  
|-
 
|-
 
| 4-5
 
| 4-5
| unsigned, ETC1 (0 = not ETC1, 2 = ETC1)
+
| unsigned, ETC1 (0 = not ETC1, 2 = ETC1) note: still 0 for ETC1A4
 
|-
 
|-
 
| 8-10
 
| 8-10
Line 4,650: Line 4,686:  
|}
 
|}
   −
This register is used to set a texture unit's physical address(es) in memory.
+
This register is used to set a texture unit's physical address(es) in memory. Individual texels in a texture are laid out in memory as a [http://en.wikipedia.org/wiki/Z-order_curve Z-order curve]. Mipmap data is stored directly following the main texture data.
    
If the texture is a cube:
 
If the texture is a cube:
Line 4,686: Line 4,722:  
|-
 
|-
 
| 0
 
| 0
| unsigned, Perspective (0 = not perspective, 1 = perspective)
+
| unsigned, Perspective (0 = perspective, 1 = not perspective)
 
|-
 
|-
 
| 1-23
 
| 1-23
Line 4,701: Line 4,737:  
|-
 
|-
 
| 0-3
 
| 0-3
| unsigned, [[GPU_Textures#Texture_color_types|Format]]
+
| unsigned, Format
 
|}
 
|}
    
This register is used to set a texture unit's data format.
 
This register is used to set a texture unit's data format.
 +
 +
Format values:
 +
 +
{| class="wikitable" border="1"
 +
!  Value
 +
!  Description
 +
!  GL Format
 +
!  GL Data Type
 +
|-
 +
| 0x0
 +
| RGBA8888
 +
| GL_RGBA
 +
| GL_UNSIGNED_BYTE
 +
|-
 +
| 0x1
 +
| RGB888
 +
| GL_RGB
 +
| GL_UNSIGNED_BYTE
 +
|-
 +
| 0x2
 +
| RGBA5551
 +
| GL_RGBA
 +
| GL_UNSIGNED_SHORT_5_5_5_1
 +
|-
 +
| 0x3
 +
| RGB565
 +
| GL_RGB
 +
| GL_UNSIGNED_SHORT_5_6_5
 +
|-
 +
| 0x4
 +
| RGBA4444
 +
| GL_RGBA
 +
| GL_UNSIGNED_SHORT_4_4_4_4
 +
|-
 +
| 0x5
 +
| IA8
 +
| GL_LUMINANCE_ALPHA
 +
| GL_UNSIGNED_BYTE
 +
|-
 +
| 0x6
 +
| HILO8
 +
|
 +
|
 +
|-
 +
| 0x7
 +
| I8
 +
| GL_LUMINANCE
 +
| GL_UNSIGNED_BYTE
 +
|-
 +
| 0x8
 +
| A8
 +
| GL_ALPHA
 +
| GL_UNSIGNED_BYTE
 +
|-
 +
| 0x9
 +
| IA44
 +
| GL_LUMINANCE_ALPHA
 +
| GL_UNSIGNED_BYTE_4_4_EXT
 +
|-
 +
| 0xA
 +
| I4
 +
|
 +
|
 +
|-
 +
| 0xB
 +
| A4
 +
| GL_ALPHA
 +
| GL_UNSIGNED_NIBBLE_EXT
 +
|-
 +
| 0xC
 +
| ETC1
 +
| GL_ETC1_RGB8_OES
 +
|
 +
|-
 +
| 0xD
 +
| ETC1A4
 +
|
 +
|
 +
|}
    
=== GPUREG_LIGHTING_ENABLE0 ===
 
=== GPUREG_LIGHTING_ENABLE0 ===
Line 4,786: Line 4,901:  
|-
 
|-
 
| 1
 
| 1
| U2
+
|
 
|-
 
|-
 
| 2
 
| 2
Line 4,792: Line 4,907:  
|-
 
|-
 
| 3
 
| 3
| V2
+
|
 
|-
 
|-
 
| 4
 
| 4
| U + V
+
| (U + V) / 2
 
|-
 
|-
 
| 5
 
| 5
| U2 + V2
+
| (U² + V²) / 2
 
|-
 
|-
 
| 6
 
| 6
| sqrt(U2 + V2)
+
| sqrt(+ )
 
|-
 
|-
 
| 7
 
| 7
Line 4,883: Line 4,998:  
| unsigned, Minification filter
 
| unsigned, Minification filter
 
|-
 
|-
| 3-10
+
| 3-6
| 0x60
+
| Min LOD (usually 0)
 +
|-
 +
| 7-10
 +
| Max LOD (usually 6)
 
|-
 
|-
 
| 11-18
 
| 11-18
Line 4,927: Line 5,045:  
|-
 
|-
 
| 0-7
 
| 0-7
| unsigned, Texture offset
+
| unsigned, Texture offset (Mipmap level 0 / base level)
 
|-
 
|-
| 8-31
+
| 8-15
| 0xE0C080
+
| unsigned, mipmap level 1 offset (usually 0x80)
 +
|-
 +
| 16-23
 +
| unsigned, mipmap level 2 offset (usually 0xC0)
 +
|-
 +
| 24-31
 +
| unsigned, mipmap level 3 offset (usually 0xE0)
 
|}
 
|}
   −
This register is used to set the procedural texture unit's offset.
+
This register is used to set the procedural texture unit's offset. Mipmap level 4-7 seems to be hardcoded at offset 0xF0, 0xF8, 0xFC and 0xFE .
    
=== GPUREG_PROCTEX_LUT ===
 
=== GPUREG_PROCTEX_LUT ===
Line 4,995: Line 5,119:  
|-
 
|-
 
| 12-23
 
| 12-23
| fixed1.0.11, Difference from next element
+
| fixed0.0.12 with two's complement ( [0.5,1.0) mapped to [-1.0,0) ), Difference from next element
 
|}
 
|}
   Line 5,009: Line 5,133:  
|-
 
|-
 
| 12-23
 
| 12-23
| fixed1.0.11, Difference from next element
+
| fixed0.0.12 with two's complement, Difference from next element
 
|}
 
|}
   Line 5,023: Line 5,147:  
|-
 
|-
 
| 12-23
 
| 12-23
| fixed1.0.11, Difference from next element
+
| fixed0.0.12 with two's complement, Difference from next element
 
|}
 
|}
   Line 5,054: Line 5,178:  
|-
 
|-
 
| 0-7
 
| 0-7
| fixed1.0.7, Red difference between current and next color table elements
+
| signed, Half of red difference between current and next color table elements
 
|-
 
|-
 
| 8-15
 
| 8-15
| fixed1.0.7, Green difference between current and next color table elements
+
| signed, Half of green difference between current and next color table elements
 
|-
 
|-
 
| 16-23
 
| 16-23
| fixed1.0.7, Blue difference between current and next color table elements
+
| signed, Half of blue difference between current and next color table elements
 
|-
 
|-
 
| 24-31
 
| 24-31
| fixed1.0.7, Alpha difference between current and next color table elements
+
| signed, Half of alpha difference between current and next color table elements
 
|}
 
|}
   Line 5,129: Line 5,253:  
| Previous
 
| Previous
 
|}
 
|}
 +
 +
Using previous source in the first TEV stage returns the primary color, while previous buffer returns zero.
    
=== GPUREG_TEXENV''i''_OPERAND ===
 
=== GPUREG_TEXENV''i''_OPERAND ===
Line 5,593: Line 5,719:  
This register is used to configure the blending function.
 
This register is used to configure the blending function.
   −
Equation values:
+
'''Equation values:'''
    
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
Line 5,615: Line 5,741:  
|}
 
|}
   −
Function values:
+
Blend equations 5, 6, 7 appear to behave the same as blend equation 0 (Add)
 +
 
 +
'''Function values:'''
    
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
Line 5,937: Line 6,065:     
This register is used to depth testing and framebuffer write masking.
 
This register is used to depth testing and framebuffer write masking.
 +
 +
Note that setting the "Depth test enabled" bit to 0 will ''not'' also disable depth writes. It will instead behave as if the depth function were set to "Always". To completely disable depth-related operations both the depth test and depth write bits must be disabled.
    
Depth function values:
 
Depth function values:
Line 6,251: Line 6,381:  
| 0-7
 
| 0-7
 
| unsigned, View shading effect in line-of-sight direction
 
| unsigned, View shading effect in line-of-sight direction
 +
|-
 +
| 8
 +
| Gas color LUT input
 
|}
 
|}
   −
This register configures gas light shading in the line-of-sight direction.
+
This register configures gas light shading in the line-of-sight direction, and the input to the gas color LUT.
 +
 
 +
Color LUT input values:
 +
 
 +
{| class="wikitable" border="1"
 +
! Value
 +
! Description
 +
|-
 +
| 0
 +
| Gas density
 +
|-
 +
| 1
 +
| Light factor
 +
|}
    
=== GPUREG_GAS_LUT_INDEX ===
 
=== GPUREG_GAS_LUT_INDEX ===
Line 6,321: Line 6,467:  
| 0-23
 
| 0-23
 
| fixed0.16.8, Depth direction attenuation proportion
 
| fixed0.16.8, Depth direction attenuation proportion
 +
|-
 +
| 24-25
 +
| unsigned, Depth function
 
|}
 
|}
   −
This register is used to configure the gas depth direction attenuation proportion.
+
This register is used to configure the gas depth direction attenuation proportion, as well as the gas depth function.
 +
 
 +
Gas depth function values:
 +
 
 +
{| class="wikitable" border="1"
 +
! Value
 +
! Description
 +
|-
 +
| 0
 +
| Never
 +
|-
 +
| 1
 +
| Always
 +
|-
 +
| 2
 +
| Greater than/Greater than or equal
 +
|-
 +
| 3
 +
| Less than/Less than or equal/Equal/Not equal
 +
|}
    
=== GPUREG_FRAGOP_SHADOW ===
 
=== GPUREG_FRAGOP_SHADOW ===
Line 6,738: Line 6,906:  
| unsigned, Fresnel FR LUT disabled (0 = enabled, 1 = disabled)
 
| unsigned, Fresnel FR LUT disabled (0 = enabled, 1 = disabled)
 
|-
 
|-
| 20-22
+
| 20
| unsigned, Term 1 reflection component RB LUT disabled (0 = enabled, 7 = disabled)
+
| unsigned, Term 1 reflection component RB LUT disabled (0 = enabled, 1 = disabled)
 
|-
 
|-
 
| 21
 
| 21
| unsigned, Term 1 reflection component RG LUT disabled (0 = enabled, 7 = disabled)
+
| unsigned, Term 1 reflection component RG LUT disabled (0 = enabled, 1 = disabled)
 
|-
 
|-
 
| 22
 
| 22
| unsigned, Term 1 reflection component RR LUT disabled (0 = enabled, 7 = disabled)
+
| unsigned, Term 1 reflection component RR LUT disabled (0 = enabled, 1 = disabled)
 
|-
 
|-
 
| 24
 
| 24
Line 7,435: Line 7,603:  
This register selects the index of the fixed attribute to be input with GPUREG_FIXEDATTRIB_DATA''i''. See [[GPU/Fixed Vertex Attributes]] and [[GPU/Immediate-Mode Vertex Submission]] for usage info.
 
This register selects the index of the fixed attribute to be input with GPUREG_FIXEDATTRIB_DATA''i''. See [[GPU/Fixed Vertex Attributes]] and [[GPU/Immediate-Mode Vertex Submission]] for usage info.
   −
=== GPUREG_FIXEDATTRIB_DATA0 ===
+
=== GPUREG_FIXEDATTRIB_DATA''i'' ===
    
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
Line 7,441: Line 7,609:  
! Description
 
! Description
 
|-
 
|-
| 0-23
+
| colspan="2" | '''DATA0:'''
| float1.7.16, Vertex attribute element 1
+
|-
 +
| 0-7
 +
| float1.7.16, Vertex attribute element 3 (Z) (bits 16-23)
 +
|-
 +
| 8-31
 +
| float1.7.16, Vertex attribute element 4 (W)
 +
|-
 +
| colspan="2" | '''DATA1:'''
 
|-
 
|-
| 24-31
+
| 0-15
| float1.7.16, Vertex attribute element 2 (lower 8 bits)
+
| float1.7.16, Vertex attribute element 2 (Y) (bits 8-23)
|}
  −
 
  −
Accepts the first part of the four 24-bit floating-point values that make up a vertex attribute. Stored in the fixed attribute currently specified with GPUREG_FIXEDATTRIB_INDEX. If immediate-mode vertex submission is enabled (by writing 0xF to the index register) then vertex data is input here directly.
  −
 
  −
=== GPUREG_FIXEDATTRIB_DATA1 ===
  −
 
  −
{| class="wikitable" border="1"
  −
! Bits
  −
! Description
   
|-
 
|-
| 0-23
+
| 16-31
| float1.7.16, Vertex attribute element 2 (upper 16 bits)
+
| float1.7.16, Vertex attribute element 3 (Z) (bits 0-15)
 
|-
 
|-
| 24-31
+
| colspan="2" | '''DATA2:'''
| float1.7.16, Vertex attribute element 3 (lower 16 bits)
  −
|}
  −
 
  −
Accepts the second part of the four 24-bit floating-point values that make up a vertex attribute. Stored in the fixed attribute currently specified with GPUREG_FIXEDATTRIB_INDEX. If immediate-mode vertex submission is enabled (by writing 0xF to the index register) then vertex data is input here directly.
  −
 
  −
=== GPUREG_FIXEDATTRIB_DATA2 ===
  −
 
  −
{| class="wikitable" border="1"
  −
! Bits
  −
! Description
   
|-
 
|-
 
| 0-23
 
| 0-23
| float1.7.16, Vertex attribute element 3 (upper 8 bits)
+
| float1.7.16, Vertex attribute element 1 (X)
 
|-
 
|-
 
| 24-31
 
| 24-31
| float1.7.16, Vertex attribute element 4
+
| float1.7.16, Vertex attribute element 2 (Y) (bits 0-7)
 
|}
 
|}
   −
Accepts the third part of the four 24-bit floating-point values that make up a vertex attribute. Stored in the fixed attribute currently specified with GPUREG_FIXEDATTRIB_INDEX. If immediate-mode vertex submission is enabled (by writing 0xF to the index register) then vertex data is input here directly.
+
Accepts four 24-bit floating-point values that make up a vertex attribute. Stored in the fixed attribute currently specified with GPUREG_FIXEDATTRIB_INDEX. If immediate-mode vertex submission is enabled (by writing 0xF to the index register) then vertex data is input here directly.
    
=== GPUREG_CMDBUF_SIZE0 ===
 
=== GPUREG_CMDBUF_SIZE0 ===
Line 7,575: Line 7,731:     
This register sets whether to use the geometry shader configuration or reuse the vertex shader configuration for the geometry shader shading unit.
 
This register sets whether to use the geometry shader configuration or reuse the vertex shader configuration for the geometry shader shading unit.
 +
When disabled and the geometry unit is not in use, as configured by GPUREG_GEOSTAGE_CONFIG, uniforms, outmap mask, program code and swizzle data are propagated to the geometry shader unit.
    
=== GPUREG_START_DRAW_FUNC0 ===
 
=== GPUREG_START_DRAW_FUNC0 ===
Trusted
34

edits