Everything You Have Always Wanted to Know about the Playstation
But Were Afraid to Ask

Memory


The PSX's memory consists of four 512k 60ns SRAM chips creating 2 megabytes of system memory. The RAM is arranged so that the addresses at 0x00xxxxxx, 0xA0xxxxxx, 0x80xxxxxx all point to the same physical memory. The PSX has a special coprocessor called cop0 that handles almost every aspect of memory management. Let us first examine how the memory looks and then how it is managed.

The PSX Memory Map

Address Range              Contents
0x0000_0000-0x0000_ffff    Kernel (64K)
0x0001_0000-0x001f_ffff    User Memory (1.9 Meg)
0x1f00_0000-0x1f00_ffff    Parallel Port (64K)
0x1f80_0000-0x1f80_03ff    Scratch Pad (1024 bytes)
0x1f80_1000-0x1f80_2fff    Hardware Registers (8K)
0x8000_0000-0x801f_ffff    Kernel and User Memory Mirror (2 Meg), Cached
0xa000_0000-0xa01f_ffff    Kernel and User Memory Mirror (2 Meg), Uncached
0xbfc0_0000-0xbfc7_ffff    BIOS (512K)

Address ranges not listed in the map represent the absence of memory. The mirrors are used mostly for caching and exception handling purposes. The kernel is also mirrored in all three user memory spaces.

Virtual Memory

The PSX uses a memory architecture known as "Virtual Memory" to help with general system memory and cache management. In a nutshell, what the PSX does is mirror the two megs of addressable space into three segments at three different virtual addresses. The names of these segments are Kuseg, Kseg0, and Kseg1.

Kuseg spans from 0x0000_0000 to 0x001f_ffff. This is what you might call "real" memory. This facilitates the kernel having direct access to user memory regions.

Kseg0 begins at virtual address 0x8000_0000 and goes to 0x801f_ffff. This segment is always translated to a linear 2MB region of the physical address space starting at physical address 0. All references through this segment are cacheable. When the most significant three bits of the virtual address are "100", the virtual address resides in Kseg0. The physical address is constructed by replacing these three bits of the virtual address with the value "000".

Kseg1 is also a linear 2MB region, from 0xa000_0000 to 0xa01f_ffff, mapped to the same physical memory starting at physical address 0. When the most significant three bits of the virtual address are "101", the virtual address resides in Kseg1. The physical address is constructed by replacing these three bits of the virtual address with the value "000". Unlike Kseg0, references through Kseg1 are not cacheable.
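
The Kseg0 and Kseg1 rule can be sketched in a few lines of Python. This is only a minimal model of the bit replacement described above; the function name kseg_to_physical is made up for illustration and is not part of any PSX library.

```python
# Sketch of the kseg0/kseg1 rule described above. The function name is
# illustrative and does not come from any PSX library.

def kseg_to_physical(vaddr):
    """Return (segment, physical address, cacheable) for a kseg0/kseg1 address."""
    top3 = (vaddr >> 29) & 0b111
    if top3 == 0b100:
        segment, cacheable = "kseg0", True
    elif top3 == 0b101:
        segment, cacheable = "kseg1", False
    else:
        raise ValueError(f"0x{vaddr:08x} is not in kseg0 or kseg1")
    # Replace the three most significant bits with "000".
    return segment, vaddr & 0x1FFF_FFFF, cacheable


for address in (0x8001_0000, 0xA001_0000):
    segment, physical, cacheable = kseg_to_physical(address)
    print(f"0x{address:08x} -> {segment}, physical 0x{physical:08x}, cacheable={cacheable}")
```

Both example addresses resolve to the same physical location; only the cacheable flag differs.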

Looking a little deeper into how virtual memory works, the following shows the anatomy of an R3000A virtual address. The most significant 20-bits of the 32-bit virtual address are called the virtual page number, or VPN. Only the three highest bits (segment number) are involved in the virtual to physical address translation.

Bits     31-12           11-0
Field    VPN (20 bits)   Offset (12 bits)

Bits 31-29   Segment
0xx          kuseg
100          kseg0
101          kseg1

The three most significant bits of the virtual address identify which virtual address segment the processor is currently referencing; these segments have associated with them the mapping algorithm to be employed, and whether virtual addresses in that segment may reside in the cache. Pages are mapped by substituting a 20-bit physical frame number (PFN) for the 20-bit virtual page number field of the virtual address. This substitution is performed through the use of the on-chip Translation Lookaside Buffer (TLB). The TLB is a fully associative memory that holds 64 entries to provide a mapping of 64 4kB pages. When a virtual reference is made to kuseg, each TLB entry is probed to see if it maps the corresponding VPN.
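
As a rough illustration of the VPN-for-PFN substitution, the following Python sketch splits an address into VPN and offset and probes a toy TLB. It deliberately ignores the PID, Valid, and Global fields covered later, and the names split_vaddr and tlb_translate are hypothetical.

```python
# Toy model of the paged translation described above. The TLB here is just a
# list of (vpn, pfn) pairs; PID, Valid and Global handling are left out.

PAGE_SHIFT = 12  # 4kB pages: 20-bit VPN, 12-bit offset


def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (VPN, offset)."""
    return (vaddr >> PAGE_SHIFT) & 0xFFFFF, vaddr & 0xFFF


def tlb_translate(tlb, vaddr):
    """Probe every entry; return the physical address, or None on a miss."""
    vpn, offset = split_vaddr(vaddr)
    for entry_vpn, pfn in tlb:
        if entry_vpn == vpn:
            # Substitute the PFN for the VPN and keep the offset.
            return (pfn << PAGE_SHIFT) | offset
    return None  # real hardware would raise a TLB miss exception here


toy_tlb = [(0x00012, 0x0007F)]
print(hex(tlb_translate(toy_tlb, 0x0001_2345)))
```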

The System Control Coprocessor (Cop0)

This unit is actually part of the R3000A. This particular cop0 has been modified from the original R3000A cop0 architecture with the addition of a few registers and functions. Cop0 contains 16 32-bit control registers that control the various aspects of memory management, system interrupt (exception) management, and breakpoints. Much of it is compatible with the normal R3000A cop0. The following is an overview of the Cop0 registers.

Cop0 Registers

Number Mnemonic Name Read/Write Usage
0 INDX Index r/w Index to an entry in the 64-entry TLB file
1 RAND Random r Provides software with a "suggested" random TLB entry to be written with the correct translation
2 TLBL TLB low r/w Provides the data path for operations which read, write, or probe the TLB file (first 32-bits)
3 BPC Breakpoint PC r/w Sets the breakpoint address to break on execute
4 CTXT Context r Duplicates information in the BADV register, but provides this information in a form that may be more useful for a software TLB exception handler
5 BDA Breakpoint data r/w Sets the breakpoint address for load/store operations
6 PIDMASK PID Mask r/w Process ID mask
7 DCIC Data/Counter interrupt control r/w Breakpoint control
8 BADV Bad Virtual Address r Contains the address whose reference caused an exception
9 BDAM Break data mask r/w Data fetch address is ANDed with this value and then compared to the value in BDA
10 TLBH TLB high r/w Provides the data path for operations which read, write, or probe the TLB file (second 32-bits)
11 BPCM Break point counter mask r/w Program counter is ANDed with this value and then compared to the value in BPC
12 SR System status register r/w Contains all the major status bits
13 CAUSE Cause r Describes the most recently recognized exception
14 EPC Exception Program Counter r Contains the return address after an exception
15 PRID Processor ID r Cop0 type and revision level
16 ERREG ??? ? ????

Note that some of these registers will be explained later in the part on exception handling. But for now we will return to how the Cop0 is used in memory management.

Returning to the TLB

As stated before, the TLB is a fully associative memory that holds 64 entries to provide a mapping of 64 4kB pages. Each TLB entry is 64 bits wide. Entries are accessed through the Index, Random, TLB high, and TLB low registers. The TLB is used for virtual-to-physical address mapping.

The Index Register

The Index register is a 32-bit, read-write register, which has a 6-bit field used to index to a specific entry in the 64-entry TLB file. The high-order bit of the register is a status bit which reflects the success or failure of a TLB Probe (tlbp) instruction. The Index register also specifies the TLB entry that will be affected by the TLB Read (tlbr) and TLB Write Index (tlbwi) instructions. The following shows the format of the Index register.

Bits     31   30-14   13-8    7-0
Field    P    0       Index   0
Width    1    17      6       8
P
Probe failure. Set to 1 when the last TLBProbe (tlbp) instruction was unsuccessful.
Index
Index to the TLB entry that will be affected by the TLBRead and TLBWrite instructions.
0
Reserved. Must be written as 0. Returns 0 when read.
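
A small Python sketch of packing and unpacking this layout follows. The helper names encode_index and decode_index are hypothetical; the bit positions are taken from the field widths shown above.

```python
# Pack/unpack helpers for the Index register layout shown above.
# Helper names are illustrative only.

def encode_index(entry):
    """Build an Index register value selecting TLB entry 0..63."""
    if not 0 <= entry < 64:
        raise ValueError("TLB entry must be in the range 0..63")
    return entry << 8  # Index field occupies bits 13-8


def decode_index(value):
    """Return the P (probe failure) bit and the Index field."""
    return {
        "probe_failed": bool((value >> 31) & 1),
        "index": (value >> 8) & 0x3F,
    }


print(hex(encode_index(5)))
print(decode_index(0x8000_0000))
```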

The Random Register

The Random register is a 32-bit read-only register. The format of the Random register is below. The six-bit Random field indexes a random entry in the TLB. It is basically a counter which decrements on every clock cycle, but is constrained to count in the range of 63 to 8. That is, software is guaranteed that the Random register will never index into the first 8 TLB entries. These entries can be "locked" by software into the TLB file, guaranteeing that no TLB miss exceptions will occur in operations which use those virtual addresses. This is useful for particularly critical areas of the operating system.

Bits     31-14   13-8     7-0
Field    0       Random   0
Width    18      6        8
Random
A random index (with a value from 8 to 63) to a TLB entry.
0
Reserved. Returns 0 when read.

The Random register is typically used in the processing of a TLB miss exception. The Random register provides software with a "suggested" TLB entry to be written with the correct translation; although slightly less efficient than a Least Recently Used (LRU) algorithm, Random replacement offers substantially similar performance while allowing dramatically simpler hardware and software management. To perform a TLB replacement, the TLB Write Random (tlbwr) instruction is used to write the TLB entry indexed by this register. At reset, this counter is preset to the value "63". Thus, it is possible for two processors to operate in "lock-step", even when using the Random TLB replacement algorithm. Also, software may directly read this register, although this feature probably has little utility outside of device testing and diagnostics.
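
The counter behaviour can be modelled with a short Python sketch. It assumes the counter wraps from 8 straight back to 63, which is the natural reading of "constrained to count in the range of 63 to 8" but is not spelled out above; the function names are hypothetical.

```python
# Model of the Random field: preset to 63 at reset, decremented once per
# clock, never dropping below 8. The wrap from 8 back to 63 is an assumption.

RANDOM_MAX = 63
RANDOM_MIN = 8


def next_random(current):
    """Advance the Random field by one clock cycle."""
    return RANDOM_MAX if current <= RANDOM_MIN else current - 1


def random_sequence(cycles, start=RANDOM_MAX):
    """Return the values the Random field takes over the given number of cycles."""
    values, current = [], start
    for _ in range(cycles):
        values.append(current)
        current = next_random(current)
    return values


print(random_sequence(4))
```

Because the sequence is fully determined by the reset value, two processors started together see the same values, which is the "lock-step" property mentioned above.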

TLB High and TLB Low Registers

These two registers provide the data path for operations which read, write, or probe the TLB file. The format of these registers is the same as the format of a TLB entry.

TLB High
Bits     31-12   11-6   5-0
Field    VPN     PID    0
Width    20      6      6

TLB Low
Bits     31-12   11   10   9   8   7-0
Field    PFN     N    D    V   G   0
Width    20      1    1    1   1   8
VPN
Virtual Page Number. Bits 31-12 of virtual address.
PID
Process ID field. A 6-bit field which lets multiple processes share the TLB while each process has a distinct mapping of otherwise identical virtual page numbers.
PFN
Page Frame Number. Bits 31-12 of the physical address.
N
Non-cacheable. If this bit is set, the page is marked as non-cacheable
D
Dirty. If this bit is set, the page is marked as "dirty" and therefore writable. This bit is actually a "write-protect" bit that software can use to prevent alteration of data
V
Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLBL or TLBS Miss occurs.
G
Global. If this bit is set, the R3000A ignores the PID match requirement for valid translation. In kseg2, the Global bit lets the kernel access all mapped data without requiring it to save or restore PID (Process ID) values.
0
Reserved. Must be written as 0. Returns 0 when read.
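
The following Python sketch packs the two halves of a TLB entry. Bit positions are derived from the field widths listed above (20, 6, 6 for the high word and 20, 1, 1, 1, 1, 8 for the low word); the function names are hypothetical.

```python
# Packing helpers for the two 32-bit halves of a TLB entry, using the field
# widths given above. Function names are illustrative only.

def pack_tlb_high(vpn, pid):
    """VPN in bits 31-12, PID in bits 11-6, low six bits reserved."""
    return ((vpn & 0xFFFFF) << 12) | ((pid & 0x3F) << 6)


def pack_tlb_low(pfn, n=False, d=False, v=False, g=False):
    """PFN in bits 31-12, then N, D, V, G in bits 11, 10, 9, 8."""
    return (
        ((pfn & 0xFFFFF) << 12)
        | (int(n) << 11)
        | (int(d) << 10)
        | (int(v) << 9)
        | (int(g) << 8)
    )


# A valid, writable mapping of virtual page 0x00012 to physical frame 0x00010.
print(hex(pack_tlb_high(0x00012, pid=1)), hex(pack_tlb_low(0x00010, d=True, v=True)))
```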

Exception Handling

There are times when it is necessary to suspend a program in order to process a hardware or software function. The exception processing capability of the R3000A is provided to assure an orderly transfer of control from an executing program to the kernel. Exceptions may be broadly divided into two categories: they can be caused by an instruction or instruction sequence, including an unusual condition arising during its execution; or they can be caused by external events such as interrupts. When an R3000A detects an exception, the normal sequence of instruction flow is suspended; the processor is forced to kernel mode where it can respond to the abnormal or asynchronous event. The table below lists the exceptions recognized by the R3000A.

Exception Mnemonic Cause
Reset Reset Assertion of the Reset signal causes an exception that transfers control to the special vector at virtual address 0xbfc0_0000 (The start of the BIOS)
Bus Error IBE (Instruction Fetch), DBE (Data) Assertion of the Bus Error input during a read operation, due to such external events as bus timeout, backplane memory errors, invalid physical address, or invalid access types.
Address Error AdEL (Load), AdES (Store) Attempt to load, fetch, or store an unaligned word; that is, a word or halfword at an address not evenly divisible by four or two, respectively. Also caused by reference to a virtual address with most significant bit set while in User Mode.
Overflow Ovf Twos complement overflow during add or subtract.
System Call Sys Execution of the SYSCALL Trap Instruction
Breakpoint Bp Execution of the break instruction
Reserved Instruction RI Execution of an instruction with an undefined or reserved major operation code (bits 31:26), or a special instruction whose minor opcode (bits 5:0) is undefined.
Co-processor Unusable CpU Execution of a co-processor instruction when the CU (Co-processor usable) bit is not set for the target co-processor.
TLB Miss TLBL (Load), TLBS (Store) A referenced TLB entry's Valid bit isn't set
TLB Modified Mod During a store instruction, the Valid bit is set but the dirty bit is not set in a matching TLB entry.
Interrupt Int Assertion of one of the six hardware interrupt inputs or setting of one of the two software interrupt bits in the Cause register.

Returning to the Cop0

The Cop0 controls the exception handling with the use of the Cause register, the EPC register, the Status register, the BADV register, and the Context register. A brief description of each follows, after which the rest of the Cop0 registers for breakpoint management will be described for the sake of completeness.

The Cause Register

The contents of the Cause register describe the last exception. A 5-bit exception code indicates the cause of the current exception; the remaining fields contain detailed information specific to certain exceptions. All bits in this register, with the exception of the SW bits, are read-only.

Bits     31   30   29-28   27-16   15-10   9-8   7   6-2       1-0
Field    BD   0    CE      0       IP      SW    0   EXECODE   0
Width    1    1    2       12      6       2     1   5         2
BD
Branch Delay. The Branch Delay bit is set (1) if the last exception was taken while the processor was executing in the branch delay slot. If so, then the EPC will be rolled back to point to the branch instruction, so that it can be re-executed and the branch direction re-determined.
CE
Coprocessor Error. Contains the coprocessor number if the exception occurred because of a coprocessor instruction for a coprocessor which wasn't enabled in SR.
IP
Interrupts Pending. It indicates which interrupts are pending. Regardless of which interrupts are masked, the IP field can be used to determine which interrupts are pending.
SW
Software Interrupts. The SW bits can be written to set or reset software interrupts. As long as any of the bits are set within the SW field they will cause an interrupt if the corresponding bit is set in SR under the interrupt mask field.
0
Reserved. Must be written as 0. Returns 0 when read.
EXECODE
Exception Code Field. Describes the type of exception that occurred. The following table lists the possible exception codes.
Number Mnemonic Description
0 INT External Interrupt
1 MOD TLB Modification Exception
2 TLBL TLB miss Exception (Load or instruction fetch)
3 TLBS TLB miss exception (Store)
4 ADEL Address Error Exception (Load or instruction fetch)
5 ADES Address Error Exception (Store)
6 IBE Bus Error Exception (for Instruction Fetch)
7 DBE Bus Error Exception (for data Load or Store)
8 SYS SYSCALL Exception
9 BP Breakpoint Exception
10 RI Reserved Instruction Exception
11 CPU Co-Processor Unusable Exception
12 OVF Arithmetic Overflow Exception
13-31 - Reserved
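
A handler typically starts by picking the Cause register apart. The Python sketch below does this using the field widths and exception codes listed above; the name decode_cause and the dictionary keys are hypothetical.

```python
# Decoder for the Cause register, based on the layout and code table above.
# The function name and dictionary keys are illustrative only.

EXC_NAMES = [
    "INT", "MOD", "TLBL", "TLBS", "ADEL", "ADES", "IBE",
    "DBE", "SYS", "BP", "RI", "CPU", "OVF",
]


def decode_cause(cause):
    """Split a 32-bit Cause value into its documented fields."""
    code = (cause >> 2) & 0x1F
    return {
        "bd": bool((cause >> 31) & 1),   # exception in a branch delay slot
        "ce": (cause >> 28) & 0x3,       # coprocessor number
        "ip": (cause >> 10) & 0x3F,      # pending hardware interrupts
        "sw": (cause >> 8) & 0x3,        # software interrupt bits
        "excode": EXC_NAMES[code] if code < len(EXC_NAMES) else "reserved",
    }


print(decode_cause(0x8000_0020))
```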

The EPC (Exception Program Counter) Register

The 32-bit EPC register contains the virtual address of the instruction which took the exception, from which point processing resumes after the exception has been serviced. When the virtual address of the instruction resides in a branch delay slot, the EPC contains the virtual address of the instruction immediately preceding the exception (that is, the EPC points to the Branch or Jump instruction).

BADV Register

The BADV register saves the entire bad virtual address for any addressing exception.

Context Register

The Context register duplicates some of the information in the BADV register, but provides this information in a form that may be more useful for a software TLB exception handler. The following illustrates the layout of the Context register. The Context register is used to allow software to quickly determine the main memory address of the page table entry corresponding to the bad virtual address, and allows the TLB to be updated by software very quickly (using a nine-instruction code sequence).

Bits     31-21      20-2   1-0
Field    PTE Base   BADV   0
Width    11         19     2
0
Reserved. Must be written as 0. Returns 0 when read.
BADV
Failing virtual page number (set by hardware; read-only; derived from the BADV register).
PTE Base
Base address of page table entry, set by the kernel
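
To show why this layout is convenient, here is a Python sketch of how the Context value could be formed. It assumes the 19-bit field holds bits 30-12 of BADV, as in standard R3000 documentation; that detail is not stated in this text, and the function name make_context is hypothetical.

```python
# Sketch of how the Context register value is assembled.
# ASSUMPTION: the 19-bit BADV field is bits 30-12 of the bad virtual address.

def make_context(pte_base, badv):
    """Combine the kernel-set PTE base with the failing page number."""
    bad_vpn = (badv >> 12) & 0x7FFFF          # 19-bit failing page number
    return (pte_base & 0xFFE0_0000) | (bad_vpn << 2)


# With 4-byte page table entries, the result is directly the address of the
# entry for the failing page, with no further shifting or masking needed.
print(hex(make_context(0x8020_0000, 0x0001_2345)))
```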

The Status Register

The Status register contains all the major status bits; any exception puts the system in Kernel mode. All bits in the status register, with the exception of the TS (TLB Shutdown) bit, are readable and writable; the TS bit is read-only. The layout below shows the various bits in the status register. The status register contains a three level stack (current, previous, and old) of the kernel/user mode bit (KU) and the interrupt enable (IE) bit. The stack is pushed when each exception is taken, and popped by the Restore From Exception instruction. These bits may also be directly read or written. At reset, the SWc, KUc, and IEc bits are set to zero; BEV is set to one; and the value of the TS bit is set to 0 (TS = 0). The rest of the bit fields are undefined after reset.

Bits     Field     Width
31-28    CU        4
27-26    0         2
25       RE        1
24-23    0         2
22       BEV       1
21       TS        1
20       PE        1
19       CM        1
18       PZ        1
17       SwC       1
16       IsC       1
15-8     IntMask   8
7-6      0         2
5        KUo       1
4        IEo       1
3        KUp       1
2        IEp       1
1        KUc       1
0        IEc       1

The various bits of the status register are defined as follows:

CU
Co-processor Usability. These bits individually control user level access to co-processor operations, including the polling of the BrCond input port and the manipulation of the System Control Co-processor (CP0). CU2 is for the GTE, CU1 is for the FPA, which is not available in the PSX.
RE
Reverse Endianness. The R3000A allows the system to determine the byte ordering convention for the Kernel mode, and the default setting for user mode, at reset time. If this bit is cleared, the endianness defined at reset is used for the current user task. If this bit is set, then the user task will operate with the opposite byte ordering convention from that determined at reset. This bit has no effect on kernel mode.
BEV
Bootstrap Exception Vector. The value of this bit determines the locations of the exception vectors of the processor. If BEV = 1, then the processor is in "Bootstrap" mode, and the exception vectors reside in the BIOS ROM. If BEV = 0, then the processor is in normal mode, and the exception vectors reside in RAM.
TS
TLB Shutdown. This bit reflects whether the TLB is functioning.
PE
Parity Error. This field should be written with a "1" at boot time. Once initialized, this field will always be read as "0".
CM
Cache Miss. This bit is set if a cache miss occurred while the cache was isolated. It is useful in determining the size and operation of the internal cache subsystem.
PZ
Parity Zero. This field should always be written with a "0".
SwC
Swap Caches. Setting this bit causes the execution core to use the on-chip instruction cache as a data cache and vice-versa. Resetting the bit to zero unswaps the caches. This is useful for certain operations such as instruction cache flushing. This feature is not intended for normal operation with the caches swapped.
IsC
Isolate Cache. If this bit is set, the data cache is "isolated" from main memory; that is, store operations modify the data cache but do not cause a main memory write to occur, and load operations return the data value from the cache whether or not a cache hit occurred. This bit is also useful in various operations such as flushing.
IM
Interrupt Mask. This 8-bit field can be used to mask the hardware and software interrupts to the execution engine (that is, not allow them to cause an exception). IM(1:0) are used to mask the software interrupts, and IM (7:2) mask the 6 external interrupts. A value of "0" disables a particular interrupt, and a "1" enables it. Note that the IE bit is a global interrupt enable; that is, if the IE is used to disable interrupts, the value of particular mask bits is irrelevant; if IE enables interrupts, then a particular interrupt is selectively masked by this field.
KUo
Kernel/User old. This is the privilege state two exceptions previously. A "0" indicates kernel mode.
IEo
Interrupt Enable old. This is the global interrupt enable state two exceptions previously. A "1" indicates that interrupts were enabled, subject to the IM mask.
KUp
Kernel/User previous. This is the privilege state prior to the current exception. A "0" indicates kernel mode.
IEp
Interrupt Enable previous. This is the global interrupt enable state prior to the current exception. A "1" indicates that interrupts were enabled, subject to the IM mask.
KUc
Kernel/User current. This is the current privilege state. A "0" indicates kernel mode.
IEc
Interrupt Enable current. This is the current global interrupt enable state. A "1" indicates that interrupts are enabled, subject to the IM mask.
0
Fields indicated as "0" are reserved; they must be written as "0", and will return "0" when read.
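
The three-level KU/IE stack in the low six bits can be modelled as simple shifts. The Python sketch below assumes that taking an exception clears both KUc and IEc (kernel mode, interrupts off) and that the Restore From Exception instruction leaves the "old" pair unchanged; this matches standard R3000 behaviour but only the kernel-mode part is stated above. The function names are hypothetical.

```python
# Model of the KU/IE stack in Status register bits 5-0.
# ASSUMPTIONS: exception entry sets KUc = IEc = 0; rfe leaves KUo/IEo as-is.

STACK_MASK = 0x3F


def enter_exception(sr):
    """Push: current -> previous, previous -> old, current cleared."""
    pushed = ((sr & STACK_MASK) << 2) & STACK_MASK
    return (sr & ~STACK_MASK & 0xFFFF_FFFF) | pushed


def rfe(sr):
    """Pop: previous -> current, old -> previous, old unchanged."""
    stack = sr & STACK_MASK
    popped = (stack & 0x30) | (stack >> 2)
    return (sr & ~STACK_MASK & 0xFFFF_FFFF) | popped


sr = 0x0000_0003                 # user mode with interrupts enabled
in_handler = enter_exception(sr)
print(hex(in_handler), hex(rfe(in_handler)))
```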

PRID Register

This register is useful to software in determining which revision of the processor is executing the code. The format of this register is illustrated below.

Bits     31-16   15-8   7-0
Field    0       Imp    Rev
Width    16      8      8
Imp
3 CoP0 type R3000A
7 IDT unique (3041) use REV to determine correct configuration.
Rev
Revision level.

Exception Vector Locations

The R3000A separates exceptions into three vector spaces. The value of each vector depends on the BEV (Boot Exception Vector) bit of the status register, which allows two alternate sets of vectors (and thus two different pieces of code) to be used. Typically, this is used to allow diagnostic tests to occur before the functionality of the cache is validated; processor reset forces the value of the BEV bit to a 1.

Exception Vectors When BEV = 0
Exception Virtual Address Physical Address
Reset 0xbfc0_0000 0x1fc0_0000
UTLB Miss 0x8000_0000 0x0000_0000
General 0x8000_0080 0x0000_0080

Exception Vectors When BEV = 1
Exception Virtual Address Physical Address
Reset 0xbfc0_0000 0x1fc0_0000
UTLB Miss 0xbfc0_0100 0x1fc0_0100
General 0xbfc0_0180 0x1fc0_0180
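
The two tables can be expressed as a lookup keyed on the BEV bit. This Python sketch takes the BEV position (bit 22) from the Status register layout; the names VECTORS and exception_vector are hypothetical.

```python
# Lookup of the exception vector from the tables above, keyed on the BEV bit
# (bit 22 of the Status register). Names are illustrative only.

VECTORS = {
    0: {"reset": 0xBFC0_0000, "utlb_miss": 0x8000_0000, "general": 0x8000_0080},
    1: {"reset": 0xBFC0_0000, "utlb_miss": 0xBFC0_0100, "general": 0xBFC0_0180},
}


def exception_vector(sr, kind):
    """Return the virtual address of the vector for the given exception kind."""
    bev = (sr >> 22) & 1
    return VECTORS[bev][kind]


print(hex(exception_vector(1 << 22, "general")))
print(hex(exception_vector(0, "general")))
```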

Exception Priority

The following is a priority list of exceptions:

  1. Reset At any time (highest)
  2. AdEL Memory (Load instruction)
  3. AdES Memory (Store instruction)
  4. DBE Memory (Load or store)
  5. MOD ALU (Data TLB)
  6. TLBL ALU (DTLB Miss)
  7. TLBS ALU (DTLB Miss)
  8. Ovf ALU
  9. Int ALU
  10. Sys RD (Instruction Decode)
  11. Bp RD (Instruction Decode)
  12. RI RD (Instruction Decode)
  13. CpU RD (Instruction Decode)
  14. TLBL I-Fetch (ITLB Miss)
  15. AdEL IVA (Instruction Virtual Address)
  16. IBE RD (end of I-Fetch, lowest)

Breakpoint Management

The following is a listing of the registers in Cop0 that are used for breakpoint management. These registers are very useful for low-level debugging.

BPC

Breakpoint on execute. Sets the breakpoint address to break on execute.

BDA

Breakpoint on data access. Sets the breakpoint address for load/store operations

DCIC

Breakpoint control. To use the Execution breakpoint, set PC. To use the Data access breakpoint, set DA and either R, W or both. Both breakpoints can be used simultaneously. When a breakpoint occurs, the PSX jumps to 0x0000_0040.

Bits     31   30   29   28   27   26   25   24   23   22-0
Field    1    1    1    0    W    R    DA   PC   1    0
Width    1    1    1    1    1    1    1    1    1    23

Description of DCIC register
Bit Default When switched
W 0 Break on write
R 0 Break on read
DA 0 Data access breakpoint enabled
PC 0 Execution breakpoint enabled

BDAM

Data Access breakpoint mask. Data fetch address is ANDed with this value and then compared to the value in BDA.

BPCM

Execute breakpoint mask. Program counter is ANDed with this value and then compared to the value in BPC.
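
The breakpoint registers can be tied together with a short Python sketch. It builds a DCIC value from the bit layout given for that register (treating the bits shown as fixed 1s as constants) and models the mask comparison exactly as worded above. The constant and function names are hypothetical, and the meaning of the fixed bits is not documented here.

```python
# Sketch of setting up DCIC and of the masked address comparison.
# Bit positions come from the DCIC layout; names are illustrative only.

DCIC_FIXED = 0xE080_0000           # bits 31, 30, 29 and 23 are shown as 1
DCIC_W = 1 << 27                   # break on write
DCIC_R = 1 << 26                   # break on read
DCIC_DA = 1 << 25                  # data access breakpoint enabled
DCIC_PC = 1 << 24                  # execution breakpoint enabled


def make_dcic(execute=False, read=False, write=False):
    """Build a DCIC value enabling the requested breakpoints."""
    value = DCIC_FIXED
    if execute:
        value |= DCIC_PC
    if read or write:
        value |= DCIC_DA
    if read:
        value |= DCIC_R
    if write:
        value |= DCIC_W
    return value


def breakpoint_hit(address, target, mask):
    """The address is ANDed with the mask, then compared to BPC or BDA."""
    return (address & mask) == target


print(hex(make_dcic(execute=True)))
print(breakpoint_hit(0x8001_0004, 0x8001_0000, 0xFFFF_FFF0))
```

With a mask of 0xffff_fff0, any access within the 16-byte block at the target address matches.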

DMA

From time to time the PSX will need to take the CPU off the main bus in order to give a device direct access to memory. The devices able to take control of the bus are the CD-ROM, MDEC, GPU, SPU, and the parallel port. There are 7 DMA channels in all (the GPU and MDEC use two each). The DMA registers reside between 0x1f80_1080 and 0x1f80_10f4, with the per-channel registers starting at 0x1f80_1080. The base address for each channel is as follows:

Base Address Channel Number Device
0x1f80_1080 DMA channel 0 MDECin
0x1f80_1090 DMA channel 1 MDECout
0x1f80_10a0 DMA channel 2 GPU (lists + image data)
0x1f80_10b0 DMA channel 3 CD-ROM
0x1f80_10c0 DMA channel 4 SPU
0x1f80_10d0 DMA channel 5 PIO
0x1f80_10e0 DMA channel 6 GPU OTC (reverse clear the Ordering Table)

Each channel has three 32-bit control registers at an offset from the base address for that particular channel. These registers are the DMA Memory Address Register (D_MADR) at the base address, the DMA Block Control Register (D_BCR) at base+4, and the DMA Channel Control Register (D_CHCR) at base+8.
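
Since the channel bases are spaced 0x10 apart, the register addresses can be computed rather than looked up. The Python sketch below does so; the function name dma_registers is hypothetical.

```python
# Compute the three register addresses for a DMA channel from the base table
# above (channels are 0x10 bytes apart). The function name is illustrative.

DMA_BASE = 0x1F80_1080


def dma_registers(channel):
    """Return the D_MADR, D_BCR and D_CHCR addresses for channel 0..6."""
    if not 0 <= channel <= 6:
        raise ValueError("DMA channel must be in the range 0..6")
    base = DMA_BASE + 0x10 * channel
    return {"D_MADR": base, "D_BCR": base + 4, "D_CHCR": base + 8}


for name, address in dma_registers(2).items():
    print(f"{name}: 0x{address:08x}")
```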

In order to use DMA the appropriate channel must be enabled. This is done using the DMA Primary Control Register (DPCR) located at 0x1f80_10f0.

DMA Primary Control Register (DPCR) 0x1f80_10f0
Bits     31-28   27-24   23-20   19-16   15-12   11-8   7-4    3-0
Field    -       DMA6    DMA5    DMA4    DMA3    DMA2   DMA1   DMA0
Width    4       4       4       4       4       4      4      4

Each channel has a 4-bit control block allocated in this register.

Bit 0
Unknown
Bit 1
Unknown
Bit 2
Unknown
Bit 3
DMA enabled when set to 1.

Bit 3 must be set for a channel to operate.
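
Enabling a channel therefore means setting bit 3 of that channel's 4-bit block. The Python sketch below assumes DMA0 occupies the least significant four bits, with each later channel four bits higher, which is how the DPCR layout reads; the function names are hypothetical.

```python
# Set or test the enable bit (bit 3 of each 4-bit block) in a DPCR value.
# ASSUMPTION: channel n occupies bits 4n+3 .. 4n. Names are illustrative.

def enable_dma_channel(dpcr, channel):
    """Return the DPCR value with the given channel's enable bit set."""
    if not 0 <= channel <= 6:
        raise ValueError("DMA channel must be in the range 0..6")
    return dpcr | (0x8 << (4 * channel))


def dma_channel_enabled(dpcr, channel):
    """Return True if the given channel's enable bit is set."""
    return bool(dpcr & (0x8 << (4 * channel)))


dpcr = enable_dma_channel(0, 2)   # enable the GPU channel
print(hex(dpcr), dma_channel_enabled(dpcr, 2), dma_channel_enabled(dpcr, 3))
```

On real hardware this would be a read-modify-write of the register at 0x1f80_10f0 so that the other channels' bits are preserved.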

As stated above, each device has three 32-bit control registers within its own DMA address space. The following describes their functions. The n represents 8,9,a,b,c,d,e for DMA channels 0,1,2,3,4,5,6 respectively.

DMA Memory Address Register (D_MADR) 0x1f80_10n0
Bits     31-0
Field    MADR
MADR
Pointer to the virtual address the DMA will start reading from/writing to.
DMA Block Control Register (D_BCR) 0x1f80_10n4
Bits     31-16   15-0
Field    BA      BS
Width    16      16
BA
Number of blocks
BS
Block size (words)

The channel will transfer BA blocks of BS words. Take care not to set the block size larger than the buffer of the corresponding unit can hold (the GPU and SPU both have a 0x10-word buffer). A larger block size means a faster transfer.
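
A D_BCR value can be assembled as follows. This Python sketch uses the 16/16 split given above; the helper names and the choice to reject transfers that are not a whole number of blocks are illustrative decisions, not documented behaviour.

```python
# Build a D_BCR value: block count in the upper 16 bits, block size (in
# words) in the lower 16 bits. Helper names are illustrative only.

def make_bcr(blocks, block_size):
    """Pack the number of blocks and the block size into one 32-bit value."""
    if not (0 <= blocks <= 0xFFFF and 0 <= block_size <= 0xFFFF):
        raise ValueError("both fields must fit in 16 bits")
    return (blocks << 16) | block_size


def bcr_for_transfer(total_words, block_size=0x10):
    """Split a transfer into equal blocks of block_size words."""
    blocks, remainder = divmod(total_words, block_size)
    if remainder:
        raise ValueError("total_words must be a multiple of block_size")
    return make_bcr(blocks, block_size)


print(hex(bcr_for_transfer(64)))
```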

DMA Channel Control Register (D_CHCR) 0x1f80_10n8
31 0
Bits     31-25   24   23-11   10   9    8-1   0
Field    0       TR   0       LI   CO   0     DR
Width    7       1    13      1    1    8     1
TR
If 0, no DMA transfer. If 1, start DMA transfer or DMA transfer busy.
LI
If 1, transfer linked list. (GPU only)
CO
If 1, transfer continuous stream of data.
DR
Controls direction from memory.
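
The control word can be put together from the bit positions implied by the field widths above (TR at bit 24, LI at bit 10, CO at bit 9, DR at bit 0). In this Python sketch the direction bit is passed through as a raw 0 or 1, since its exact meaning is not spelled out here; the names are hypothetical.

```python
# Build and inspect a D_CHCR value using the layout above.
# Constant and function names are illustrative only.

CHCR_TR = 1 << 24   # start transfer / transfer busy
CHCR_LI = 1 << 10   # linked list transfer (GPU only)
CHCR_CO = 1 << 9    # continuous stream of data
CHCR_DR = 1 << 0    # direction bit (meaning not detailed in this text)


def make_chcr(start=True, linked_list=False, continuous=False, direction=0):
    """Combine the documented control bits into one 32-bit value."""
    value = 0
    if start:
        value |= CHCR_TR
    if linked_list:
        value |= CHCR_LI
    if continuous:
        value |= CHCR_CO
    if direction:
        value |= CHCR_DR
    return value


def dma_busy(chcr):
    """TR stays set while the transfer is in progress."""
    return bool(chcr & CHCR_TR)


print(hex(make_chcr(linked_list=True, direction=1)))
```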

The last register in the DMA register range, at 0x1f80_10f4, is used to control DMA interrupts. Its usage is currently unknown.