Difference between revisions of "GSP Shared Memory"

From 3dbrew
(47 intermediate revisions by 11 users not shown)
Line 1: Line 1:
This page describes the structure of the GSP [[GSPGPU:RegisterInterruptRelayQueue|shared]] memory. GX commands and framebuffer info is stored here, and other unknown data. After writing the command data to GSP shared memory, [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]] must be used to trigger GSP processing for the command.
+
This page describes the structure of the GSP [[GSPGPU:RegisterInterruptRelayQueue|shared]] memory. GX commands and framebuffer info is stored here, and other unknown data.
 +
 
 +
 
 +
=Interrupt info=
 +
The Interrupt info structure is located at sharedmemvadr + process_gsp_index*0x40.
 +
 
 +
It is a list of interrupts (id's 0-6 exist).
 +
 
 +
{| class="wikitable" border="1"
 +
|-
 +
!  Index Byte
 +
!  Description
 +
|-
 +
| 0x0
 +
| Index of the last processed data (field size is 0x33) (must be updated manually)
 +
|-
 +
| 0x1
 +
| To be processed datafields, (max 0x20 for PDC interrupts else the missed PDC filds are used,max 0x34 for all other if more interrupts happen and the Errorflag is 0 the Errorflag is set to 1)
 +
|-
 +
| 0x2
 +
| Errorflag (if the first bit of Errorflag is set future PDC interrupts are ignored)
 +
|-
 +
| 0x3
 +
| not used
 +
|-
 +
| 0x4-0x7
 +
| missed PDC0
 +
|-
 +
| 0x8-0xB
 +
| missed PDC1
 +
|-
 +
| 0xC-0x3F
 +
| u8 Interrupttypefield (0=PSC0, 1=PSC1, 2=PDC0/VBlankTop (sent to all threads), 3=PDC1/VBlankBottom (sent to all threads), 4=PPF, 5=P3D, 6=DMA)
 +
|}
  
 
=Framebuffer info=
 
=Framebuffer info=
Line 20: Line 53:
 
|}
 
|}
  
When a process sets this framebuffer info, it sets index to <nowiki>(index+1) & 1</nowiki>. Then it writes the framebuffer info entry, and sets flag to value 1. The GSP module seems to load this framebuffer info entry data into GSP state once the [[GPU]] finishes processing GX commands 3 or 4. Once the GSP module finishes loading this framebuffer info, it sets flag to value 0, then it will not load the framebuffer info again until flag is value 1. After loading this entry data into GSP state, the GSP module then writes this framebuffer state to the [[LCD]] registers.
+
When a process sets this framebuffer info, it sets index to <nowiki>(index+1) & 1</nowiki>. Then it writes the framebuffer info entry, and sets flag to value 1. The GSP module loads this framebuffer info entry data into GSP state once the [[GPU]] finishes processing GX commands 3 or 4. Once the GSP module finishes loading this framebuffer info, it sets flag to value 0, then it will not load the framebuffer info again until flag is value 1. After loading this entry data into GSP state, the GSP module then writes this framebuffer state to the [[LCD]] registers. GSP module automatically updates the LCD framebuffer registers each time GX commands 3 or 4 finish, even when this shared memory data was not updated by the application.(GSP module toggles the active framebuffer register when automatically updating LCD registers, when shared memory data is not used)
  
 
The two 0x1C-byte framebuffer info entries are located at framebufferinfo+4.
 
The two 0x1C-byte framebuffer info entries are located at framebufferinfo+4.
 +
 +
=3D Slider and 3D [[GSPGPU:SetLedForceOff|LED]]=
 +
See [[Configuration Memory]].
  
 
=Command Buffer Header=
 
=Command Buffer Header=
 +
 +
The command buffer is located at sharedmem + 0x800 + [[GSPGPU:RegisterInterruptRelayQueue|threadindex]]*0x200. After writing the command data to shared memory, [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]] must be used to trigger GSP processing for the command when the total commands field is value 1.
 +
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 31: Line 70:
 
|-
 
|-
 
|  0
 
|  0
Command Index, must be <=15
+
Current command index. This index is updated by GSP module after loading the command data, right before the command is processed. When this index is updated by GSP module, the total commands field is decreased by one as well.
 
|-
 
|-
 
|  1
 
|  1
Must not be value 0
+
Total commands to process, must not be value 0 when GSP module handles commands. This must be <=15 when writing a command to shared memory. This is incremented by the application when writing a command to shared memory, after increasing this value [[GSPGPU:TriggerCmdReqQueue|TriggerCmdReqQueue]] is only used if this field is value 1.
 
|-
 
|-
 
|  2
 
|  2
Line 45: Line 84:
 
|  u32 Error code for the last GX command which failed
 
|  u32 Error code for the last GX command which failed
 
|}
 
|}
 
The command buffer is located at sharedmem + 0x800 + [[GSPGPU:RegisterInterruptRelayQueue|threadindex]]*0x200.
 
  
 
=Command Header=
 
=Command Header=
Line 61: Line 98:
 
|-
 
|-
 
|  3
 
|  3
|  When non-zero GSP module may check flags for the specified cmdID, command handling is aborted when the flags are set.
+
|  When non-zero GSP module may check flags for the specified cmdID, command handling is aborted when the flags are set. The corresponding flag for each CmdID is set once the command is handled by GSP module, this flag is likely cleared once the GPU finishes processing the command.
 
|}
 
|}
  
Line 68: Line 105:
 
=Commands=
 
=Commands=
  
==GX Command 0==
+
== Trigger DMA Request ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 86: Line 123:
 
| Size
 
| Size
 
|-
 
|-
| 7-4
+
| 6-4
 
| Unused
 
| Unused
 +
|-
 +
| 7
 +
| Flush source (0 = don't flush, 1 = flush)
 
|}
 
|}
  
This command is normally used to DMA data from the application GSP [[Memory_layout|heap]] to VRAM.
+
This command is normally used to DMA data from the application GSP [[Memory_layout|heap]] to VRAM. When flushing is enabled and the source buffer is not located within VRAM, svcFlushProcessDataCache is used to flush the source buffer.
  
==GX Command 1==
+
== Trigger Command List Processing ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 108: Line 148:
 
|-
 
|-
 
| 3
 
| 3
| Flag, bit0 is written to GSP module state
+
| Update gas additive blend results (0 = don't update, 1 = update)
 
|-
 
|-
 
| 6-4
 
| 6-4
Line 114: Line 154:
 
|-
 
|-
 
| 7
 
| 7
| When non-zero, call svcFlushProcessDataCache() with the specified buffer
+
| Flush buffer (0 = don't flush, 1 = flush)
 
|}
 
|}
  
This command converts the specified address to a physical address, then writes the physical address and size to the [[GPU]] registers at 0x1EF018E0. This buffer contains [[GPU_Commands|GPU commands]].
+
This command converts the specified address to a physical address, then writes the physical address and size to the [[GPU]] registers at 0x1EF018E0. This buffer contains [[GPU/Internal_Registers|GPU commands]]. When flushing is enabled, svcFlushProcessDataCache is used to flush the buffer.
  
==GX Command 2==
+
== Trigger Memory Fill ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 129: Line 169:
 
|-
 
|-
 
| 1
 
| 1
| Buf0 address
+
| Buf0 start address (0 = don't fill anything)
 
|-
 
|-
 
| 2
 
| 2
| Buf0 size
+
| Buf0 value
 
|-
 
|-
 
| 3
 
| 3
| Associated buf0 address
+
| Buf0 end address
 
|-
 
|-
 
| 4
 
| 4
| Buf1 address
+
| Buf1 start address (0 = don't fill anything)
 
|-
 
|-
 
| 5
 
| 5
| Buf1 size
+
| Buf1 value
 
|-
 
|-
 
| 6
 
| 6
| Associated buf1 address
+
| Buf1 end address
 
|-
 
|-
 
| 7
 
| 7
| The low u16 is used with buf0, while the high u16 is used with buf1
+
| Control0 <nowiki>|</nowiki> (Control1 << 16)
 
|}
 
|}
  
This commands converts the specified addresses to physical addresses, then writes these addresses and the specified parameters to the [[GPU]] registers at 0x1EF00010 and 0x1EF00020. The associated buffer address must not be <= to the main buffer address, thus the associated buffer address must not be zero as well. When the bufX address is zero, processing for the bufX parameters is skipped.
+
This command converts the specified addresses to physical addresses, then writes these addresses and the specified parameters to the [[GPU]] registers at 0x1EF00010 and 0x1EF00020. Doing so fills the specified buffers with the associated 4-byte value. This is used to clear GPU framebuffers.
 +
The associated buffer address must not be <= to the main buffer address, thus the associated buffer address must not be zero as well. When the bufX address is zero, processing for the bufX parameters is skipped.
 +
 
 +
The values of Control0 and Control1 give information about the type of memory fill. See [[GPU/External_Registers#Memory Fill|here]] for more information about memory fill parameters.
  
==GX Command 3==
+
== Trigger Display Transfer ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 162: Line 205:
 
|-
 
|-
 
| 1
 
| 1
| VRAM framebuffer address
+
| Input framebuffer address
 
|-
 
|-
 
| 2
 
| 2
Line 168: Line 211:
 
|-
 
|-
 
| 3
 
| 3
| VRAM framebuffer [[GPU|dimensions]]
+
| Input framebuffer [[GPU|dimensions]]
 
|-
 
|-
 
| 4
 
| 4
Line 174: Line 217:
 
|-
 
|-
 
| 5
 
| 5
| Unknown, for applications this is 0x1001000 for the main screen, and 0x1000 for the sub screen. When bit24 is set, the actual framebuffer width is input width>>1.
+
| [[GPU|Flags]], for applications this is 0x1001000 for the main screen, and 0x1000 for the sub screen.
 
|-
 
|-
 
| 7-6
 
| 7-6
Line 180: Line 223:
 
|}
 
|}
  
This command converts the specified addresses to physical addresses, then writes these physical addresses and parameters to the [[GPU]] registers at 0x1EF00C00. This command writes the rendered framebuffer data from the VRAM framebuffer address to the specified output framebuffer.
+
This command converts the specified addresses to physical addresses, then writes these physical addresses and parameters to the [[GPU]] registers at 0x1EF00C00. This GPU command copies the already rendered framebuffer data from the input GPU framebuffer address to the specified output LCD framebuffer. The input framebuffer is normally located in VRAM.
 +
 
 +
The GPU color buffer is stored in the same Z-curve (tiled) format as textures. By default, SetDisplayTransfer converts the given buffer from the tiled format to a linear format adapted to the LCD framebuffers.
 +
 
 +
Display transfers are performed asynchronously, so after requesting a display transfer you should wait for the PPF interrupt to fire before reading the output data.
 +
 
 +
Some color formats seem to require specific input / output sizes when performing a display transfer, doing an RGB5A1->RGBA4 display transfer would never fire the PPF interrupt with a 32x32 buffer, increasing the buffer to 128x128 made it fire correctly.
  
==GX Command 4==
+
== Trigger Texture Copy ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-
Line 192: Line 241:
 
|-
 
|-
 
| 1
 
| 1
| Buf0 address
+
| Input buffer address.
 
|-
 
|-
 
| 2
 
| 2
| Buf1 address
+
| Output buffer address.
 
|-
 
|-
 
| 3
 
| 3
| ?
+
| Total bytes to copy, not including gaps.
 
|-
 
|-
 
| 4
 
| 4
| ?
+
| Bits 0-15: Size of input line, in bytes. Bits 16-31: Gap between input lines, in bytes.
 
|-
 
|-
 
| 5
 
| 5
| ?
+
| Same as 4, but for the output.
 
|-
 
|-
 
| 6
 
| 6
| ?
+
| Flags, corresponding to the [[GPU/External_Registers#Transfer_Engine|Transfer Engine flags]]. However, for TextureCopy commands, bit 3 is always set, bit 2 is set if any output dimension is smaller than the input, and other bits are always 0.
 
|-
 
|-
 
| 7
 
| 7
Line 213: Line 262:
 
|}
 
|}
  
This command is similar to cmd3, this command also writes to the [[GPU]] registers at 0x1EF00C00.
+
This command is similar to cmd3. It also triggers the [[GPU/External_Registers#Transfer_Engine|GPU Transfer Engine]], but setting the TextureCopy parameters.
  
==GX Command 5==
+
== Flush Cache Regions ==
 
{| class="wikitable" border="1"
 
{| class="wikitable" border="1"
 
|-
 
|-

Revision as of 04:12, 11 December 2015

This page describes the structure of the GSP shared memory. GX commands and framebuffer info are stored here, along with other data whose purpose is unknown.


Interrupt info

The Interrupt info structure is located at sharedmemvadr + process_gsp_index*0x40.

It is a list of interrupts (IDs 0-6 exist).

{| class="wikitable" border="1"
|-
! Index Byte
! Description
|-
| 0x0
| Index of the last processed data (field size is 0x33). This must be updated manually.
|-
| 0x1
| Number of data fields still to be processed. The maximum is 0x20 for PDC interrupts; beyond that, the missed PDC fields are used instead. The maximum is 0x34 for all other interrupts; if more interrupts happen while the error flag is 0, the error flag is set to 1.
|-
| 0x2
| Error flag (if the first bit of the error flag is set, future PDC interrupts are ignored)
|-
| 0x3
| Not used
|-
| 0x4-0x7
| Missed PDC0
|-
| 0x8-0xB
| Missed PDC1
|-
| 0xC-0x3F
| u8 interrupt type field (0=PSC0, 1=PSC1, 2=PDC0/VBlankTop (sent to all threads), 3=PDC1/VBlankBottom (sent to all threads), 4=PPF, 5=P3D, 6=DMA)
|}
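The layout above can be modelled on a host machine to see how a process might drain the interrupt list. This is only a minimal sketch: the function names are hypothetical, the shared memory is simulated with a `bytearray`, and it assumes that byte 0x0 indexes the next entry to read and wraps around at the end of the 0x34-byte entry area. Real code would presumably also need to update the header atomically, since the GSP module writes to it concurrently.

```python
import struct

INTERRUPT_INFO_SIZE = 0x40   # one structure per process_gsp_index
QUEUE_OFFSET = 0x0C          # u8 interrupt type entries occupy 0xC-0x3F
QUEUE_LENGTH = 0x34          # 0x40 - 0xC entries (assumed ring size)

INTERRUPT_NAMES = ("PSC0", "PSC1", "PDC0", "PDC1", "PPF", "P3D", "DMA")


def interrupt_info_offset(process_gsp_index):
    """Offset of the interrupt info structure from sharedmemvadr."""
    return process_gsp_index * INTERRUPT_INFO_SIZE


def pop_interrupt(shared_mem, process_gsp_index):
    """Return the name of the next pending interrupt, or None if none is queued."""
    base = interrupt_info_offset(process_gsp_index)
    index, pending = struct.unpack_from("<BB", shared_mem, base)
    if pending == 0:
        return None
    interrupt_id = shared_mem[base + QUEUE_OFFSET + index]
    # Assumption: the index wraps around at the end of the entry area.
    index = (index + 1) % QUEUE_LENGTH
    # The index must be updated manually by the process, as noted in the table.
    struct.pack_into("<BB", shared_mem, base, index, pending - 1)
    return INTERRUPT_NAMES[interrupt_id]
```

Calling `pop_interrupt` in a loop until it returns `None` drains everything that was queued for that process.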

Framebuffer info

The framebuffer info structure for the main LCD is located at sharedmemvadr + 0x200 + threadindex*0x80. The framebuffer info structure for the sub LCD is located at sharedmemvadr + 0x240 + threadindex*0x80.

Framebuffer info header

{| class="wikitable" border="1"
|-
! Index Byte
! Description
|-
| 0
| Framebuffer info entry index
|-
| 1
| Flag
|-
| 3-2
| Padding
|}

When a process sets this framebuffer info, it sets index to (index+1) & 1. Then it writes the framebuffer info entry and sets flag to 1. The GSP module loads this entry into GSP state once the GPU finishes processing GX command 3 or 4. Once loading is complete, the GSP module sets flag to 0 and will not load the framebuffer info again until flag is 1. After loading the entry into GSP state, the GSP module writes this framebuffer state to the LCD registers.

The GSP module also updates the LCD framebuffer registers automatically each time GX command 3 or 4 finishes, even when the application did not update this shared memory data. In that case, where the shared memory data is not used, the GSP module toggles the active framebuffer register.

The two 0x1C-byte framebuffer info entries are located at framebufferinfo+4.
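The update sequence described above can be sketched as follows, again against a simulated `bytearray`. The function names are hypothetical, the 0x1C-byte entry is treated as an opaque blob, and the sketch assumes that the entry to write is the one selected by the new index value.

```python
FB_ENTRY_SIZE = 0x1C


def framebuffer_info_offset(thread_index, sub_screen=False):
    """Offset of the framebuffer info structure from sharedmemvadr."""
    return (0x240 if sub_screen else 0x200) + thread_index * 0x80


def set_framebuffer_info(shared_mem, thread_index, entry, sub_screen=False):
    """Write one framebuffer info entry and flag it for the GSP module."""
    if len(entry) != FB_ENTRY_SIZE:
        raise ValueError("a framebuffer info entry is 0x1C bytes")
    base = framebuffer_info_offset(thread_index, sub_screen)
    # Step 1: index = (index + 1) & 1
    index = (shared_mem[base] + 1) & 1
    shared_mem[base] = index
    # Step 2: write the entry; the two entries start at framebufferinfo+4.
    # Assumption: the new index selects which of the two entries is written.
    entry_offset = base + 4 + index * FB_ENTRY_SIZE
    shared_mem[entry_offset:entry_offset + FB_ENTRY_SIZE] = entry
    # Step 3: set flag to 1 so the GSP module loads the entry.
    shared_mem[base + 1] = 1
    return index
```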

3D Slider and 3D LED

See Configuration Memory.

Command Buffer Header

The command buffer is located at sharedmem + 0x800 + threadindex*0x200. After writing the command data to shared memory, TriggerCmdReqQueue must be used to trigger GSP processing for the command, but only when the total commands field is 1 after the write.

{| class="wikitable" border="1"
|-
! Index Byte
! Description
|-
| 0
| Current command index. This index is updated by the GSP module after loading the command data, right before the command is processed. When the GSP module updates this index, the total commands field is decreased by one as well.
|-
| 1
| Total commands to process. This must not be 0 when the GSP module handles commands, and must be <=15 when writing a command to shared memory. The application increments this field when writing a command to shared memory; after incrementing it, TriggerCmdReqQueue is only used if this field is 1.
|-
| 2
| Must not be value 1. When the error-code u32 is set, this u8 is set to value 0x80.
|-
| 3
| Bit0 must not be set
|-
| 4
| u32 Error code for the last GX command which failed
|}
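As an illustration of the header layout, the following sketch decodes the first 8 bytes of a command buffer from a simulated shared memory block. The function names and dictionary keys are hypothetical labels for the fields in the table, and little-endian byte order is assumed.

```python
import struct


def command_buffer_offset(thread_index):
    """Offset of the command buffer from the start of shared memory."""
    return 0x800 + thread_index * 0x200


def read_command_buffer_header(shared_mem, thread_index):
    """Decode the command buffer header fields listed in the table."""
    base = command_buffer_offset(thread_index)
    current, total, status, flags, error = struct.unpack_from(
        "<BBBBI", shared_mem, base)
    return {
        "current_index": current,    # byte 0
        "total_commands": total,     # byte 1
        "status": status,            # byte 2, 0x80 once the error code is set
        "flags": flags,              # byte 3, bit0 must not be set
        "error_code": error,         # u32 at byte 4
    }
```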

Command Header

{| class="wikitable" border="1"
|-
! Index Byte
! Description
|-
| 0
| Command ID
|-
| 2-1
| ?
|-
| 3
| When non-zero, the GSP module may check flags for the specified cmdID; command handling is aborted when the flags are set. The corresponding flag for each CmdID is set once the command is handled by the GSP module. This flag is likely cleared once the GPU finishes processing the command.
|}

The command is located at cmdbuf + 0x20 + cmdindex*0x20; each command is 0x20 bytes. The command parameters are located at command+4. Addresses specified in parameters are application vaddrs, usually located in either the process GSP heap or VRAM. For applications these addresses are normally located in the GSP heap, while for other processes they are located in VRAM. Addresses and sizes specified in parameters, except for cmd0 and cmd5, must be 8-byte aligned.
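Putting the command buffer header and the command location together, writing a command might look like the sketch below. The function name is hypothetical and the shared memory is simulated. Two points are assumptions rather than facts from this page: the slot for a new command is taken to be (current index + total commands) wrapped to the 15 command slots that fit in the 0x200-byte buffer, and a write is refused when 15 commands are already queued.

```python
import struct

COMMAND_SIZE = 0x20
MAX_COMMANDS = 15   # (0x200 - 0x20 header) / 0x20 bytes per command


def enqueue_command(shared_mem, thread_index, words):
    """Copy an 8-word command into the buffer.

    Returns (slot, must_trigger); must_trigger is True when the caller
    would need to use TriggerCmdReqQueue afterwards.
    """
    if len(words) != 8:
        raise ValueError("a command is 8 u32 words (0x20 bytes)")
    cmdbuf = 0x800 + thread_index * 0x200
    current, total = shared_mem[cmdbuf], shared_mem[cmdbuf + 1]
    if total >= MAX_COMMANDS:
        raise ValueError("command buffer is full")
    # Assumption: new commands go after the ones still pending, wrapping around.
    slot = (current + total) % MAX_COMMANDS
    struct.pack_into("<8I", shared_mem, cmdbuf + 0x20 + slot * COMMAND_SIZE, *words)
    total += 1
    shared_mem[cmdbuf + 1] = total
    # TriggerCmdReqQueue is only used when the total commands field is 1.
    return slot, total == 1
```

Only the first command written to an idle buffer reports `must_trigger`; later commands are picked up as the GSP module works through the queue.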

Commands

Trigger DMA Request

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x00
|-
| 1
| Source address
|-
| 2
| Destination address
|-
| 3
| Size
|-
| 6-4
| Unused
|-
| 7
| Flush source (0 = don't flush, 1 = flush)
|}

This command is normally used to DMA data from the application GSP heap to VRAM. When flushing is enabled and the source buffer is not located within VRAM, svcFlushProcessDataCache is used to flush the source buffer.
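A DMA request could be encoded as in this sketch. The function name is hypothetical, and the remaining bytes of word 0 (the command header bytes other than the command ID) are simply left at zero.

```python
import struct


def build_dma_request(source, destination, size, flush_source=True):
    """Encode GX command 0x00 as 8 little-endian u32 words."""
    words = [
        0x00,                       # word 0: command ID in the low byte
        source,                     # word 1: source address
        destination,                # word 2: destination address
        size,                       # word 3: size
        0, 0, 0,                    # words 4-6: unused
        1 if flush_source else 0,   # word 7: flush source
    ]
    return struct.pack("<8I", *words)
```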

Trigger Command List Processing

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x01
|-
| 1
| Buffer address
|-
| 2
| Buffer size
|-
| 3
| Update gas additive blend results (0 = don't update, 1 = update)
|-
| 6-4
| Unused
|-
| 7
| Flush buffer (0 = don't flush, 1 = flush)
|}

This command converts the specified address to a physical address, then writes the physical address and size to the GPU registers at 0x1EF018E0. This buffer contains GPU commands. When flushing is enabled, svcFlushProcessDataCache is used to flush the buffer.
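The command list trigger can be encoded in the same way. This sketch uses a hypothetical function name and enforces the 8-byte alignment rule that applies to addresses and sizes of every command except cmd0 and cmd5.

```python
import struct


def build_command_list(buffer_address, buffer_size, update_gas=False, flush=True):
    """Encode GX command 0x01 as 8 little-endian u32 words."""
    if buffer_address % 8 or buffer_size % 8:
        raise ValueError("address and size must be 8-byte aligned")
    words = [
        0x01,                     # word 0: command ID in the low byte
        buffer_address,           # word 1: buffer address
        buffer_size,              # word 2: buffer size
        1 if update_gas else 0,   # word 3: update gas additive blend results
        0, 0, 0,                  # words 4-6: unused
        1 if flush else 0,        # word 7: flush buffer
    ]
    return struct.pack("<8I", *words)
```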

Trigger Memory Fill

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x02
|-
| 1
| Buf0 start address (0 = don't fill anything)
|-
| 2
| Buf0 value
|-
| 3
| Buf0 end address
|-
| 4
| Buf1 start address (0 = don't fill anything)
|-
| 5
| Buf1 value
|-
| 6
| Buf1 end address
|-
| 7
| Control0 <nowiki>|</nowiki> (Control1 << 16)
|}

This command converts the specified addresses to physical addresses, then writes these addresses and the specified parameters to the GPU registers at 0x1EF00010 and 0x1EF00020. Doing so fills the specified buffers with the associated 4-byte value. This is used to clear GPU framebuffers. The end address of a buffer must be greater than its start address, so the end address must not be zero either. When the bufX start address is zero, processing for the bufX parameters is skipped.

The values of Control0 and Control1 give information about the type of memory fill. See the Memory Fill section of GPU/External_Registers for more information about memory fill parameters.
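The parameter rules for this command can be sketched as a small encoder. The function name is hypothetical, each buffer is passed as a (start, value, end, control) tuple or `None` to skip it, and the control values are treated as opaque 16-bit numbers because their meaning is documented elsewhere.

```python
import struct


def build_memory_fill(buf0=None, buf1=None):
    """Encode GX command 0x02; each buffer is (start, value, end, control) or None."""
    words = [0x02]
    controls = []
    for buf in (buf0, buf1):
        start, value, end, control = buf if buf is not None else (0, 0, 0, 0)
        if start != 0:
            if start % 8 or end % 8:
                raise ValueError("addresses must be 8-byte aligned")
            if end <= start:
                raise ValueError("end address must be greater than start address")
        if not 0 <= control <= 0xFFFF:
            raise ValueError("control value must fit in 16 bits")
        words += [start, value, end]   # a start address of 0 skips this buffer
        controls.append(control)
    words.append(controls[0] | (controls[1] << 16))   # Control0 | (Control1 << 16)
    return struct.pack("<8I", *words)
```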

Trigger Display Transfer

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x03
|-
| 1
| Input framebuffer address
|-
| 2
| Output framebuffer address
|-
| 3
| Input framebuffer dimensions
|-
| 4
| Output framebuffer dimensions
|-
| 5
| Flags, for applications this is 0x1001000 for the main screen, and 0x1000 for the sub screen.
|-
| 7-6
| Unused
|}

This command converts the specified addresses to physical addresses, then writes these physical addresses and parameters to the GPU registers at 0x1EF00C00. This GPU command copies the already rendered framebuffer data from the input GPU framebuffer address to the specified output LCD framebuffer. The input framebuffer is normally located in VRAM.

The GPU color buffer is stored in the same Z-curve (tiled) format as textures. By default, SetDisplayTransfer converts the given buffer from the tiled format to a linear format adapted to the LCD framebuffers.

Display transfers are performed asynchronously, so after requesting a display transfer you should wait for the PPF interrupt to fire before reading the output data.

Some color formats seem to require specific input/output sizes when performing a display transfer: an RGB5A1->RGBA4 display transfer never fired the PPF interrupt with a 32x32 buffer, while increasing the buffer to 128x128 made it fire correctly.
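A display transfer command could be assembled as in the sketch below. The function and constant names are hypothetical. The dimension words are passed through as opaque values because their encoding is described on the GPU page rather than here, and the two flag constants are the values this page lists for applications.

```python
import struct

FLAGS_MAIN_SCREEN = 0x1001000   # value applications use for the main screen
FLAGS_SUB_SCREEN = 0x1000       # value applications use for the sub screen


def build_display_transfer(input_address, output_address,
                           input_dimensions, output_dimensions, flags):
    """Encode GX command 0x03 as 8 little-endian u32 words."""
    if input_address % 8 or output_address % 8:
        raise ValueError("addresses must be 8-byte aligned")
    words = [
        0x03,                # word 0: command ID in the low byte
        input_address,       # word 1: input framebuffer address
        output_address,      # word 2: output framebuffer address
        input_dimensions,    # word 3: input framebuffer dimensions (opaque here)
        output_dimensions,   # word 4: output framebuffer dimensions (opaque here)
        flags,               # word 5: flags
        0, 0,                # words 6-7: unused
    ]
    return struct.pack("<8I", *words)
```

After queuing such a command, the caller would wait for the PPF interrupt before reading the output buffer, as described above.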

Trigger Texture Copy

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x04
|-
| 1
| Input buffer address.
|-
| 2
| Output buffer address.
|-
| 3
| Total bytes to copy, not including gaps.
|-
| 4
| Bits 0-15: Size of input line, in bytes. Bits 16-31: Gap between input lines, in bytes.
|-
| 5
| Same as 4, but for the output.
|-
| 6
| Flags, corresponding to the Transfer Engine flags. However, for TextureCopy commands, bit 3 is always set, bit 2 is set if any output dimension is smaller than the input, and other bits are always 0.
|-
| 7
| Unused
|}

This command is similar to cmd3. It also triggers the GPU Transfer Engine, but with the TextureCopy parameters set.
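The packed line/gap words and the flag rules can be illustrated with a short encoder. The function name and the `output_smaller` parameter are hypothetical; the caller is assumed to know whether any output dimension is smaller than the input.

```python
import struct


def build_texture_copy(input_address, output_address, total_bytes,
                       input_line, input_gap, output_line, output_gap,
                       output_smaller=False):
    """Encode GX command 0x04 as 8 little-endian u32 words."""
    if input_address % 8 or output_address % 8 or total_bytes % 8:
        raise ValueError("addresses and size must be 8-byte aligned")
    for field in (input_line, input_gap, output_line, output_gap):
        if not 0 <= field <= 0xFFFF:
            raise ValueError("line sizes and gaps must fit in 16 bits")
    flags = 1 << 3                 # bit 3 is always set for TextureCopy
    if output_smaller:
        flags |= 1 << 2            # bit 2: an output dimension is smaller
    words = [
        0x04,                               # word 0: command ID
        input_address,                      # word 1: input buffer address
        output_address,                     # word 2: output buffer address
        total_bytes,                        # word 3: bytes to copy, without gaps
        input_line | (input_gap << 16),     # word 4: input line size and gap
        output_line | (output_gap << 16),   # word 5: output line size and gap
        flags,                              # word 6: flags
        0,                                  # word 7: unused
    ]
    return struct.pack("<8I", *words)
```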

Flush Cache Regions

{| class="wikitable" border="1"
|-
! Index Word
! Description
|-
| 0
| u8 CommandID is 0x05
|-
| 1
| Buf0 address
|-
| 2
| Buf0 size
|-
| 3
| Buf1 address
|-
| 4
| Buf1 size
|-
| 5
| Buf2 address
|-
| 6
| Buf2 size
|-
| 7
| Unused
|}

The application buffer addresses specified in the parameters are used with svcFlushProcessDataCache. The input buf0 size must not be zero. When buf1 size is zero, the svcFlushProcessDataCache() calls for buf1 and buf2 are skipped. When buf2 size is zero, the svcFlushProcessDataCache() call for buf2 is skipped.
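The skipping rules are easy to get wrong, so the sketch below encodes the command and also lists which regions would end up being flushed. Both function names are hypothetical, and regions are passed as (address, size) pairs.

```python
import struct


def build_flush_cache_regions(buf0, buf1=(0, 0), buf2=(0, 0)):
    """Encode GX command 0x05; each buffer is an (address, size) pair."""
    if buf0[1] == 0:
        raise ValueError("buf0 size must not be zero")
    words = [0x05, *buf0, *buf1, *buf2, 0]   # word 7 is unused
    return struct.pack("<8I", *words)


def flushed_regions(command):
    """Return the (address, size) pairs that would be flushed, in order."""
    words = struct.unpack("<8I", command)
    regions = [(words[1], words[2])]          # buf0 is always flushed
    if words[4] == 0:
        return regions                        # buf1 size 0: skip buf1 and buf2
    regions.append((words[3], words[4]))
    if words[6] != 0:                         # buf2 size 0: skip buf2
        regions.append((words[5], words[6]))
    return regions
```

Note that a non-empty buf2 is ignored whenever buf1 has size zero, which follows from the rule that a zero buf1 size skips both later buffers.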