Difference between revisions of "GPU/External Registers"
(→Map: Clarify address mappings) |
(Documented some registers; removed the length column because it's unnecessary) |
Revision as of 21:25, 18 April 2018
This page describes the address range accessible from the ARM11, used to configure the basic GPU functionality. For information about the internal registers used for 3D rendering, see GPU/Internal Registers.
Map
Address mappings for the external registers. GSPGPU:WriteHWRegs takes these addresses relative to 0x1EB00000.
User VA | PA | Length | Name | Comments |
---|---|---|---|---|
0x1EF00004 | 0x10400004 | 4 | ? | |
0x1EF00010 | 0x10400010 | 16 | Memory Fill1 "PSC0" | GX command 2 |
0x1EF00020 | 0x10400020 | 16 | Memory Fill2 "PSC1" | GX command 2 |
0x1EF00030 | 0x10400030 | 4 | ? | |
0x1EF00034 | 0x10400034 | 4 | GPU Busy | Bit31 = cmd-list busy, bit27 = PSC0 busy, bit26 = PSC1 busy. |
0x1EF00050 | 0x10400050 | 4 | ? | Writes 0x22221200 on GPU init. |
0x1EF00054 | 0x10400054 | 4 | ? | Writes 0xFF2 on GPU init. |
0x1EF000C0 | 0x104000C0 | 4 | Backlight control | Writes 0x0 to allow backlights to turn off, 0x20000000 to force them always on. |
0x1EF00400 | 0x10400400 | 0x100 | Framebuffer Setup "PDC0" (top screen) | |
0x1EF00500 | 0x10400500 | 0x100 | Framebuffer Setup "PDC1" (bottom) | |
0x1EF00C00 | 0x10400C00 | ? | Transfer Engine "DMA" | |
0x1EF01000 | 0x10401000 | 0x4 | ? | Writes 0 on GPU init and before the Command List is used |
0x1EF01080 | 0x10401080 | 0x4 | ? | Writes 0x12345678 on GPU init. |
0x1EF010C0 | 0x104010C0 | 0x4 | ? | Writes 0xFFFFFFF0 on GPU init. |
0x1EF010D0 | 0x104010D0 | 0x4 | ? | Writes 1 on GPU init. |
0x1EF014?? | 0x104014?? | 0x14 | "PPF" ? | |
0x1EF018E0 | 0x104018E0 | 0x14 | Command List "P3D" | |
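The relationship between the three address views in the map can be sketched as below. This is only an illustration built from the base addresses listed above; the function names are made up for this example, and no range checking is done because the exact size of the register block is not stated here.

```python
# Base addresses taken from the map above.
USER_VA_BASE = 0x1EF00000      # user virtual address of the register block
PHYS_BASE = 0x10400000         # physical address of the same block
WRITEHWREGS_BASE = 0x1EB00000  # GSPGPU:WriteHWRegs addresses are relative to this

def va_to_pa(va: int) -> int:
    """Translate a user VA from the map into its physical address."""
    return va - USER_VA_BASE + PHYS_BASE

def va_to_writehwregs_offset(va: int) -> int:
    """Offset to pass to GSPGPU:WriteHWRegs for a given user VA."""
    return va - WRITEHWREGS_BASE

def decode_gpu_busy(value: int) -> dict:
    """Split the GPU Busy register (0x1EF00034) into its documented bits."""
    return {
        "cmd_list": bool(value & (1 << 31)),
        "psc0": bool(value & (1 << 27)),
        "psc1": bool(value & (1 << 26)),
    }
```

For instance, the PDC0 block at user VA 0x1EF00400 corresponds to WriteHWRegs offset 0x400400.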
Memory Fill
User VA | Description |
---|---|
0x1EF000X0 | Buffer start physaddr >> 3 |
0x1EF000X4 | Buffer end physaddr >> 3 |
0x1EF000X8 | Fill value |
0x1EF000XC | Control. bit0: start/busy, bit1: finished, bit8-9: fill-width (0 = 16bit, 1 or 3 = 24bit, 2 = 32bit) |
Memory fills are used to initialize buffers in memory with a given value, similar to memset. A memory fill is triggered by setting bit0 in the control register. Doing so aborts any running memory fills on that filling unit. Upon completion, the hardware clears bit0, sets bit1, and fires interrupt PSC0.
These registers are used by GX SetMemoryFill.
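As a rough sketch of how the four registers fit together, the following builds the list of (address, value) writes for one fill. The helper names are hypothetical, the write order is an assumption (only the requirement that bit0 of the control register triggers the fill comes from the text above), and the 8-byte alignment check is inferred from the addresses being stored shifted right by 3.

```python
# Register bases of the two fill units, from the map (PSC0 and PSC1).
PSC_BASES = {0: 0x1EF00010, 1: 0x1EF00020}
# Encoding of control bits 8-9 (value 3 also selects 24bit per the table).
FILL_WIDTH = {16: 0, 24: 1, 32: 2}

def memory_fill_writes(unit, start_pa, end_pa, value, width_bits):
    """Return (register address, value) pairs that set up and start a fill."""
    if start_pa % 8 or end_pa % 8:
        # Assumption: the >> 3 encoding cannot express unaligned addresses.
        raise ValueError("addresses must be multiples of 8")
    base = PSC_BASES[unit]
    control = (FILL_WIDTH[width_bits] << 8) | 1  # bit0 = start
    return [
        (base + 0x0, start_pa >> 3),
        (base + 0x4, end_pa >> 3),
        (base + 0x8, value),
        (base + 0xC, control),
    ]

def fill_finished(control):
    """bit1 of the control register is set by the hardware on completion."""
    return bool(control & 0x2)
```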
LCD Source Framebuffer Setup
All of these registers must be accessed with 32bit operations regardless of the registers' actual bit size.
Offset | Name | Comments |
---|---|---|
0x00 | Pixel clock | 12bits. Higher values are slower. Setting this value too low prevents the screen from syncing any pixels other than a single one from the wrong location. The lowest value the screen can handle is 0x1C1, with rare glitching. |
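0x04 | HBlank timer(?) | Seems to determine the horizontal blanking interval. Setting this to lower than HTotal - HDisp will make the screen not catch up with the scanlines: some will be skipped, some will be misaligned. Setting this to higher than HTotal - HDisp will make the displayed image misaligned to the right. Setting this to higher than HTotal seems to make the horizontal synchronization never happen. |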
0x08 | ? | must be >= REG#0x00 |
0x0C | ? | must be >= REG#0x08 |
0x10 | ??? | Some sort of delay in signal, probably in the pixel clock |
0x14 | HBlank length(?) | ??? Seems to offset the screen to the left if this value is high enough, but can glitch out the syncing on the bottom screen |
0x18 | ??? | should be < REG#0x10 |
0x20 | HFrontPorch(?) | ??? the screen gets vertically offset with wrap-around; horizontal timing gets messed up |
0x24 | HSync timer? | |
0x28 | VDispStart(?) or VFrontPorch | |
0x30 | VTotal | Total amount of vertical scanlines |
0x34 | VDisp(?) | Total amount of vertical scanlines displayed (seemingly only for the top screen) |
0x44 | ??? | similar functionality to 0x10 |
0x4C | Overscan filler color | |
0x50 | Horizontal position counter | read-only |
0x54 | Horizontal scanline (HBlank) counter | read-only |
0x5C | ??? | low u16: framebuffer width; high u16: framebuffer height??? (seems to be unused) |
0x60 | ??? | low u16: timing data(?); high u16: framebuffer total width (amount of pixels blitted regardless of framebuffer width) |
0x64 | ??? | low u16: unknown; high u16: framebuffer total height (amount of scanlines blitted regardless of framebuffer height) |
0x68 | Framebuffer A first address | For top screen, this is the left eye 3D framebuffer. |
0x6C | Framebuffer A second address | For top screen, this is the left eye 3D framebuffer. |
0x70 | Framebuffer format | Bit0-15: framebuffer format, bit16-31: unknown |
0x78 | Framebuffer select | Bit0: which framebuffer to display, bit1-7: unknown |
0x80 | Color lookup table index select | 8bits, write-only |
0x84 | Color lookup table indexed element | Contains the value of the color lookup table indexed by the above register, 24bits, RGB8 (0x00BBGGRR). Accessing this register will increase the index register by one. |
0x90 | Framebuffer stride | Distance in bytes between the start of two framebuffer rows (must be a multiple of 8). |
0x94 | Framebuffer B first address | For top screen, this is the right eye 3D framebuffer. Unused for bottom screen. |
0x98 | Framebuffer B second address | For top screen, this is the right eye 3D framebuffer. Unused for bottom screen. |
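The color lookup table registers (0x80 and 0x84) lend themselves to a short example. The sketch below only builds the sequence of register writes; it relies on the auto-increment behaviour described in the table, and the helper names and the idea of loading the table in one pass are assumptions made for illustration.

```python
# PDC register block bases from the map (top and bottom screen).
PDC_BASES = {"top": 0x1EF00400, "bottom": 0x1EF00500}

def pack_lut_entry(r, g, b):
    """Pack 8-bit components into the documented 0x00BBGGRR layout."""
    for c in (r, g, b):
        if not 0 <= c <= 0xFF:
            raise ValueError("components must fit in 8 bits")
    return (b << 16) | (g << 8) | r

def lut_writes(screen, entries, first_index=0):
    """Return (address, value) writes loading `entries` from `first_index`.

    The index register is written once; each access to 0x84 then advances
    the index by one, so the entries are written back to back.
    """
    if not 0 <= first_index <= 0xFF or first_index + len(entries) > 256:
        raise ValueError("index is 8 bits wide, so the table has 256 slots")
    base = PDC_BASES[screen]
    writes = [(base + 0x80, first_index)]
    writes += [(base + 0x84, pack_lut_entry(r, g, b)) for r, g, b in entries]
    return writes

# Hypothetical usage: an identity ramp for all 256 entries.
identity = lut_writes("top", [(i, i, i) for i in range(256)])
```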
Framebuffer format
Bit | Description |
---|---|
2-0 | Color format |
3 | ? |
4 | Unused? |
5 | Enable parallax barrier (i.e. 3D). |
6 | 1 = main screen, 0 = sub screen. However if bit5 is set, this bit is cleared. |
7 | ? |
9-8 | Value 1 = unknown: get rid of rainbow strip on top of screen, 3 = unknown: black screen. |
15-10 | Unused? |
The GSP module only allows LCD stereoscopy to be enabled when bit5=1 and bit6=0 here. Whenever the GSP module updates this register, it automatically disables stereoscopy if those bits are not set accordingly.
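The bit layout above can be decoded as in the following sketch. The field names are invented for this example and the unknown bits are returned as raw numbers; the test value 0x80340 is the top screen init value written to 0x1EF00470 by nngxInitialize.

```python
def decode_framebuffer_format(reg):
    """Split the framebuffer format register (offset 0x70) into fields."""
    fmt = reg & 0xFFFF  # bit0-15: framebuffer format
    parallax = bool(fmt & (1 << 5))
    main_screen = bool(fmt & (1 << 6))
    return {
        "color_format": fmt & 0x7,       # bits 2-0
        "bit3": (fmt >> 3) & 1,          # unknown
        "parallax_barrier": parallax,    # bit 5
        "main_screen": main_screen,      # bit 6
        "bit7": (fmt >> 7) & 1,          # unknown
        "bits_9_8": (fmt >> 8) & 0x3,    # unknown
        "upper_half": (reg >> 16) & 0xFFFF,  # bit16-31: unknown
        # GSP only allows stereoscopy with bit5=1 and bit6=0.
        "stereoscopy_allowed": parallax and not main_screen,
    }

info = decode_framebuffer_format(0x80340)
```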
Framebuffer color formats
Value | Description |
---|---|
0 | GL_RGBA8_OES |
1 | GL_RGB8_OES |
2 | GL_RGB565_OES |
3 | GL_RGB5_A1_OES |
4 | GL_RGBA4_OES |
Color components are laid out in reverse byte order, with the most significant bits used first (i.e. non-24-bit pixels are stored as little-endian values). For instance, a raw data stream of two GL_RGB565_OES pixels looks like GGGBBBBB RRRRRGGG GGGBBBBB RRRRRGGG.
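The GL_RGB565_OES byte layout described above can be reproduced with a small packing helper. This is a minimal sketch; the function name is an assumption and only the RGB565 case given in the text is covered.

```python
import struct

def pack_rgb565(r, g, b):
    """Pack 5/6/5-bit components into two bytes as stored in memory."""
    if not (0 <= r <= 31 and 0 <= g <= 63 and 0 <= b <= 31):
        raise ValueError("r and b are 5 bits wide, g is 6 bits wide")
    value = (r << 11) | (g << 5) | b  # RRRRRGGG GGGBBBBB as a 16-bit value
    # Little-endian storage puts GGGBBBBB first, then RRRRRGGG.
    return struct.pack("<H", value)
```

Pure red, for example, is the 16-bit value 0xF800, so the byte holding the red bits comes second in memory.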
Transfer Engine
Register address | Description |
---|---|
0x1EF00C00 | Input physical address >> 3 |
0x1EF00C04 | Output physical address >> 3 |
0x1EF00C08 | DisplayTransfer output width (bits 0-15) and height (bits 16-31). |
0x1EF00C0C | DisplayTransfer input width and height. |
0x1EF00C10 | Transfer flags. (See below) |
0x1EF00C14 | GSP module writes value 0 here prior to writing to 0x1EF00C18, for cmd3. |
0x1EF00C18 | Setting bit0 starts the transfer. Upon completion, bit0 is unset and bit8 is set. |
0x1EF00C1C | ? |
0x1EF00C20 | TextureCopy total amount of data to copy, in bytes. |
0x1EF00C24 | TextureCopy input line width (bits 0-15) and gap (bits 16-31), in 16 byte units. |
0x1EF00C28 | TextureCopy output line width and gap. |
These registers are used by GX command 3 and 4. For cmd4, *0x1EF00C18 |= 1 is used instead of just writing value 1. The DisplayTransfer registers are only used if bit 3 of the flags is unset and ignored otherwise. The TextureCopy registers are likewise only used if bit 3 is set, and ignored otherwise.
Flags Register - 0x1EF00C10
Bit | Description |
---|---|
0 | When set, the framebuffer data is flipped vertically. |
1 | When set, the input framebuffer is treated as linear and converted to tiled in the output, converts tiled->linear when unset. |
2 | This bit is required when the output width is less than the input width for the hardware to properly crop the lines, otherwise the output will be mis-aligned. |
3 | Uses a TextureCopy mode transfer. See below for details. |
4 | Not writable |
5 | Don't perform tiled-linear conversion. Incompatible with bit 1, so only tiled-tiled transfers can be done, not linear-linear. |
7-6 | Not writable |
10-8 | Input framebuffer color format, value0 and value1 are the same as the LCD Source Framebuffer Formats (usually zero) |
11 | Not writable |
14-12 | Output framebuffer color format |
15 | Not writable |
16 | Use 32x32 block tiling mode, instead of the usual 8x8 one. Output dimensions must be multiples of 32, even if cropping with bit 2 set above. |
23-17 | Not writable |
25-24 | Scale down the input image using a box filter. 0 = no downscale, 1 = 2x1 downscale, 2 = 2x2 downscale, 3 = invalid |
31-26 | Not writable |
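To make the flag layout concrete, the sketch below assembles a flags word and the packed dimension registers from named options. The parameter names are invented for this example; the bit positions and the bit 1 / bit 5 incompatibility come from the table above.

```python
def transfer_flags(*, flip_vertical=False, linear_to_tiled=False, crop=False,
                   texture_copy=False, no_conversion=False,
                   in_format=0, out_format=0, tile_32x32=False, downscale=0):
    """Build the value for the flags register at 0x1EF00C10."""
    if linear_to_tiled and no_conversion:
        raise ValueError("bit 5 is incompatible with bit 1")
    if not (0 <= in_format <= 7 and 0 <= out_format <= 7):
        raise ValueError("color formats are 3-bit fields")
    if downscale not in (0, 1, 2):
        raise ValueError("downscale value 3 is invalid")
    return (
        (int(flip_vertical) << 0)
        | (int(linear_to_tiled) << 1)
        | (int(crop) << 2)
        | (int(texture_copy) << 3)
        | (int(no_conversion) << 5)
        | (in_format << 8)
        | (out_format << 12)
        | (int(tile_32x32) << 16)
        | (downscale << 24)
    )

def pack_dimensions(width, height):
    """Width in bits 0-15, height in bits 16-31 (0x1EF00C08 / 0x1EF00C0C)."""
    if not (0 <= width <= 0xFFFF and 0 <= height <= 0xFFFF):
        raise ValueError("width and height are 16-bit fields")
    return (height << 16) | width
```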
TextureCopy
When bit 3 of the flags register is set, the hardware performs a TextureCopy-mode transfer. In this mode, all other bits of the flags register (except for bit 2, which still needs to be set correctly) and the regular dimension registers are ignored, and no format conversions are done. Instead, it performs a raw data copy from the source to the destination, but with a configurable gap between lines. The total amount of bytes to copy is specified in the size register, and the hardware loops reading lines from the input and writing them to the output until this amount is copied. The "gap" specified in the input/output dimension register is the number of chunks to skip after each "width" chunks of the input/output, and is NOT counted towards the total size of the transfer.
By correctly calculating the input and output gap sizes it is possible to use this functionality to copy arbitrary sub-rectangles between differently-sized framebuffers or textures, which is one of its main uses over a regular no-conversion DisplayTransfer. When copying tiled textures/framebuffers it's important to remember that the contents of a tile are laid out sequentially in memory, and so this should be taken into account when calculating the transfer parameters.
Specifying invalid/junk values for the TextureCopy dimensions can result in the GPU hanging while attempting to process this TextureCopy.
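The width/gap semantics can be illustrated with a small software model. This is a sketch of the behaviour as described above, not a description of the actual hardware implementation: widths and gaps are given in bytes for readability, the helper names are made up, and edge cases (such as what the GPU does with sizes that are not a multiple of the line width) are not modelled.

```python
CHUNK = 16  # line width and gap registers are in 16 byte units

def pack_line_register(width_bytes, gap_bytes):
    """Width in bits 0-15, gap in bits 16-31 (0x1EF00C24 / 0x1EF00C28)."""
    if width_bytes <= 0 or width_bytes % CHUNK or gap_bytes % CHUNK:
        raise ValueError("width and gap must be multiples of 16 bytes")
    return ((gap_bytes // CHUNK) << 16) | (width_bytes // CHUNK)

def simulate_texture_copy(src, dst, total_bytes,
                          in_width, in_gap, out_width, out_gap):
    """Copy total_bytes, skipping the gap after each full line on each side."""
    if in_width <= 0 or out_width <= 0:
        raise ValueError("line widths must be positive")
    src_pos = dst_pos = 0
    in_left, out_left = in_width, out_width
    for _ in range(total_bytes):  # gaps do not count towards total_bytes
        dst[dst_pos] = src[src_pos]
        src_pos += 1
        dst_pos += 1
        in_left -= 1
        out_left -= 1
        if in_left == 0:      # end of an input line: skip the input gap
            src_pos += in_gap
            in_left = in_width
        if out_left == 0:     # end of an output line: skip the output gap
            dst_pos += out_gap
            out_left = out_width

# Copy the first 16 bytes of each of two 32-byte lines into a packed buffer.
source = bytes(range(64))
packed = bytearray(32)
simulate_texture_copy(source, packed, 32, in_width=16, in_gap=16,
                      out_width=16, out_gap=0)
```

Here the input gap equals the source stride minus the rectangle width, which is the calculation needed when copying a sub-rectangle out of a wider linear buffer.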
Command List
Register address | Description |
---|---|
0x1EF018E0 | Buffer size in bytes >> 3 |
0x1EF018E8 | Buffer physical address >> 3 |
0x1EF018F0 | Setting bit0 to 1 starts GPU command processing. Upon completion, bit0 seems to be reset to 0. |
These 3 registers are used by GX command 1. This is used for GPU commands.
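A possible way to express the register setup for a command list is shown below. The helper name and the order of the writes are assumptions; the shifts by 3 come from the table, and the alignment check is inferred from them.

```python
P3D_SIZE = 0x1EF018E0     # buffer size in bytes >> 3
P3D_ADDRESS = 0x1EF018E8  # buffer physical address >> 3
P3D_START = 0x1EF018F0    # bit0 starts processing

def command_list_writes(buffer_pa, size_bytes):
    """Return (address, value) writes that submit one command buffer."""
    if buffer_pa % 8 or size_bytes % 8:
        # Assumption: values stored >> 3 must be multiples of 8.
        raise ValueError("address and size must be multiples of 8")
    return [
        (P3D_SIZE, size_bytes >> 3),
        (P3D_ADDRESS, buffer_pa >> 3),
        (P3D_START, 1),
    ]
```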
Framebuffers
These LCD framebuffers normally contain the last rendered frames from the GPU. The framebuffers are drawn from left-to-right, instead of top-to-bottom. (Thus the beginning of the framebuffer is drawn starting at the left side of the screen.)
Both of the 3D screen left/right framebuffers are displayed regardless of the 3D slider's state, however when the 3D slider is set to "off" the 3D effect is disabled. Normally when the 3D slider's state is set to "off" the left/right framebuffer addresses are set to the same physical address. When the 3D effect is disabled and the left/right framebuffers are set to separate addresses, the LCD seems to alternate between displaying the left/right framebuffer each frame.
Init Values from nngxInitialize for Top Screen
- 0x1EF00400 = 0x1C2
- 0x1EF00404 = 0xD1
- 0x1EF00408 = 0x1C1
- 0x1EF0040C = 0x1C1
- 0x1EF00410 = 0
- 0x1EF00414 = 0xCF
- 0x1EF00418 = 0xD1
- 0x1EF0041C = 0x1C501C1
- 0x1EF00420 = 0x10000
- 0x1EF00424 = 0x19D
- 0x1EF00428 = 2
- 0x1EF0042C = 0x1C2
- 0x1EF00430 = 0x1C2
- 0x1EF00434 = 0x1C2
- 0x1EF00438 = 1
- 0x1EF0043C = 2
- 0x1EF00440 = 0x1960192
- 0x1EF00444 = 0
- 0x1EF00448 = 0
- 0x1EF0045C = 0x19000F0
- 0x1EF00460 = 0x1C100D1
- 0x1EF00464 = 0x1920002
- 0x1EF00470 = 0x80340
- 0x1EF0049C = 0
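Several of these values pack two 16-bit fields into one register (offsets 0x5C, 0x60 and 0x64 in the LCD Source Framebuffer Setup table). The following sketch splits them; the helper name is made up for this example.

```python
def split_u16(value):
    """Return (low u16, high u16) of a 32-bit register value."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF

# 0x1EF0045C: low = framebuffer width, high = framebuffer height(?)
width, height = split_u16(0x19000F0)
# 0x1EF00460: low = timing data(?), high = framebuffer total width
timing, total_width = split_u16(0x1C100D1)
# 0x1EF00464: low = unknown, high = framebuffer total height
unknown, total_height = split_u16(0x1920002)
```

With the top screen init values this gives a framebuffer width of 240 and a height field of 400.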
More Init Values from nngxInitialize for Top Screen
- 0x1EF00468 = 0x18300000, later changed by GSP module when updating state, framebuffer
- 0x1EF0046C = 0x18300000, later changed by GSP module when updating state, framebuffer
- 0x1EF00494 = 0x18300000
- 0x1EF00498 = 0x18300000
- 0x1EF00478 = 1, doesn't stay 1, read as 0
- 0x1EF00474 = 0x10501