Introduction to the VR4121
The VR4121 is a 64-bit microprocessor in the VR series manufactured by NEC. It is designed for high-performance handheld and portable computing devices, and is built around the MIPS RISC architecture developed by MIPS Technologies.
Block Diagram
[Figure: VR4121 block diagram. Blocks shown: VR4120 CPU core (131/168 MHz), PLL, bus control unit, and the on-chip units HSP, RTC, DSU, ICU, PMU, CMU, DCU, DMAU, GIU, KIU, LED, AIU, PIU, SIU, FIR, D/A, and A/D. External connections shown: OSB I/O, keyboard, RS232 driver, LED, touch panel, LCD module, PC card, ROM/Flash, and EDO/SDRAM.]
VR4120 CPU core
Components: CPU, Coprocessor 0, instruction cache, data cache, CPU bus interface, clock generator.
CPU registers
General-purpose registers (bits 63 to 0): R0 to R31, where R0 = 0 and R31 = link address.
Multiply/divide registers (bits 63 to 0): HI and LO.
Program counter (bits 63 to 0): PC.
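The register layout above can be illustrated with a toy model. This is only a sketch: the class name RegisterFile and its methods are hypothetical, and the only behaviour taken from the slide is that the registers are 64 bits wide, R0 is always 0, and R31 is used for the link address.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide (bits 63 to 0)
LINK_REGISTER = 31      # R31 holds the link address


class RegisterFile:
    """Toy model of the 32 general-purpose registers (hypothetical helper)."""

    def __init__(self):
        self.regs = [0] * 32

    def write(self, index, value):
        if index == 0:
            return  # R0 always reads as zero, so writes to it are discarded
        self.regs[index] = value & MASK64  # keep only the low 64 bits

    def read(self, index):
        return self.regs[index]


rf = RegisterFile()
rf.write(0, 123)                   # ignored
rf.write(LINK_REGISTER, 0x1008)    # illustrative return address
rf.write(2, (1 << 64) + 5)         # truncated to 64 bits
```

After these writes, rf.read(0) is still 0, rf.read(31) is 0x1008, and rf.read(2) is 5.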
Data format and addressing
Doubleword: 64 bits
Word: 32 bits
Halfword: 16 bits
Byte: 8 bits
The VR4121 supports little-endian byte order only.
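To make the little-endian ordering concrete, the sketch below packs values of the four data sizes with Python's standard struct module. The helper name to_little_endian is hypothetical; it only shows how the bytes would be laid out in memory, lowest address first.

```python
import struct

# struct format codes for the four data sizes, with "<" forcing little-endian
FORMATS = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}


def to_little_endian(value, size):
    """Return the bytes of `value` as stored in memory, lowest address first."""
    return struct.pack(FORMATS[size], value)


dword = to_little_endian(0x1122334455667788, 8)
print(dword.hex())  # 8877665544332211
```

The least significant byte (0x88) is stored at the lowest address, which is what little-endian order means.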
CPU instruction set
32-bit length instructions (MIPS III): load and store instructions, computational instructions, jump and branch instructions, Coprocessor 0 instructions, special instructions.
16-bit length instructions (MIPS16): load and store instructions, computational instructions, jump and branch instructions, special instructions.
Memory Management Unit (MMU)
Virtual addresses are translated into physical addresses using an on-chip TLB. The on-chip TLB is a fully associative memory that holds 32 entries. Pages can have five different sizes (1 K, 4 K, 16 K, 64 K, and 256 K bytes), and the page size can be specified in each entry.
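The translation described above can be sketched as a simplified fully associative lookup. This is an illustration only: the tuple layout (virtual page number, page size, physical frame number) and the function names are assumptions, and a real TLB entry carries more fields than are modelled here.

```python
PAGE_SIZES = [1 << 10, 1 << 12, 1 << 14, 1 << 16, 1 << 18]  # 1 K to 256 K bytes
TLB_ENTRIES = 32


def split_virtual_address(vaddr, page_size):
    """Split an address into (virtual page number, offset within the page)."""
    return vaddr // page_size, vaddr % page_size


def tlb_lookup(tlb, vaddr):
    """Check every entry (fully associative) and return the physical address.

    `tlb` is a list of hypothetical (vpn, page_size, pfn) tuples.
    Returns None on a TLB miss.
    """
    assert len(tlb) <= TLB_ENTRIES
    for vpn, page_size, pfn in tlb:
        page_number, offset = split_virtual_address(vaddr, page_size)
        if page_number == vpn:
            return pfn * page_size + offset
    return None


tlb = [(0x12, 4096, 0x80)]          # one 4 K page: virtual page 0x12 -> frame 0x80
print(hex(tlb_lookup(tlb, 0x12345)))  # 0x80345
```

Because each entry stores its own page size, the split between page number and offset is recomputed per entry during the lookup.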
Cache memory (1)
[Figure: cache organization. Blocks shown: VR4120 CPU core, cache controller, I-cache, D-cache, and main memory.]
Cache memory (2)
The instruction cache is 16 Kbytes and the data cache is 8 Kbytes. The line size for both the instruction cache and the data cache is 4 words (16 bytes). The VR4120 CPU core uses a write-back policy.
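The cache sizes and line size above determine how many lines each cache holds. The sketch below works that out and also splits an address into tag, index, and offset. The split assumes a direct-mapped organization, which the slide does not state, so treat split_address as an illustration under that assumption.

```python
LINE_BYTES = 16            # 4 words of 4 bytes each
ICACHE_BYTES = 16 * 1024
DCACHE_BYTES = 8 * 1024


def cache_lines(cache_bytes):
    """Number of lines in a cache of the given size."""
    return cache_bytes // LINE_BYTES


def split_address(addr, cache_bytes):
    """Return (tag, index, offset), ASSUMING a direct-mapped cache."""
    offset = addr % LINE_BYTES
    index = (addr // LINE_BYTES) % cache_lines(cache_bytes)
    tag = addr // cache_bytes
    return tag, index, offset


print(cache_lines(ICACHE_BYTES))             # 1024
print(cache_lines(DCACHE_BYTES))             # 512
print(split_address(0x12345, DCACHE_BYTES))  # (9, 52, 5)
```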
Conclusion: Features of VR4121
Employs a 64-bit RISC CPU core (VR4120 equivalent).
Internal 64-bit data processing.
Optimized 6-stage pipeline.
On-chip cache memory: instruction cache 16 Kbytes, data cache 8 Kbytes.
Address space: physical address space 32 bits, virtual address space 40 bits.
Conclusion: Features of VR4121
Memory controller (supports ROM, EDO-type DRAM, SDRAM, SROM, and flash memory).
Keyboard, touch panel, and audio interfaces.
DMA and interrupt controllers.
Serial interface.
General-purpose ports.
IrDA interface.
Conclusion: Features of VR4121
Effective power management features, including four operating modes: Full-speed mode, Standby mode, Suspend mode, and Hibernate mode.
External input clocks: 32.768 kHz, 18.432 MHz (for internal CPU core and peripheral unit operation), and 48 MHz (dedicated to the IrDA interface).
Pipeline Stages
–The pipeline is controlled by PClock.
Pipeline in MIPS III (32-bit length) instruction mode
–The execution of each instruction takes at least 5 PCycles.
[Figure: pipeline timing against PClock. Stages, one per PCycle: IF, RF, EX, DC, WB.]
VR4121 Pipeline
Pipeline in MIPS16 (16-bit length) instruction mode
–The execution of each instruction requires at least 6 PCycles.
[Figure: pipeline timing against PClock. Stages, one per PCycle: IF, IT, RF, EX, DC, WB.]
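The stage counts for the two instruction modes can be turned into a small timing sketch. It assumes the ideal case: one instruction enters the pipeline per PCycle and nothing stalls or slips. The function names are hypothetical.

```python
STAGES = {
    "MIPS III": ["IF", "RF", "EX", "DC", "WB"],        # 5 stages
    "MIPS16": ["IF", "IT", "RF", "EX", "DC", "WB"],    # 6 stages
}


def ideal_cycles(mode, instruction_count):
    """PCycles to finish `instruction_count` instructions with no stalls."""
    if instruction_count == 0:
        return 0
    return len(STAGES[mode]) + instruction_count - 1


def timeline(mode, instruction_count):
    """One text row per instruction, shifted right by one PCycle each."""
    row = " ".join(STAGES[mode])
    return ["   " * i + row for i in range(instruction_count)]


for line in timeline("MIPS16", 3):
    print(line)
print(ideal_cycles("MIPS III", 4))  # 8
print(ideal_cycles("MIPS16", 4))    # 9
```

A single instruction takes 5 PCycles in MIPS III mode and 6 in MIPS16 mode; each further instruction adds one PCycle in this ideal model.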
Delay in Pipeline
Branch delay: a one-cycle branch delay occurs when:
–The target address is calculated by a jump instruction.
–The branch condition of a branch instruction is met and the logical operation for the branch-destination comparison then starts.
Load delay:
–A load instruction whose result cannot be used by the instruction immediately following it is called a delayed load instruction.
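The effect of a one-cycle branch delay on instruction order can be sketched as follows. This is a simplified model with a single forward branch; the function name and the instruction labels are hypothetical, and it assumes the instruction in the delay slot always executes.

```python
def execution_order(program, branch_index, target_index, taken):
    """List the instructions executed for one branch with one delay slot.

    Assumes a single forward branch and that the delay-slot instruction
    (the one right after the branch) always executes.
    """
    order = list(range(branch_index + 1))    # everything up to the branch
    order.append(branch_index + 1)           # the branch delay slot
    if taken:
        order.extend(range(target_index, len(program)))
    else:
        order.extend(range(branch_index + 2, len(program)))
    return [program[i] for i in order]


program = ["add", "branch", "slot", "skipped", "target"]
print(execution_order(program, 1, 4, taken=True))
# ['add', 'branch', 'slot', 'target']
```

The instruction labelled "slot" runs even though the branch is taken, which is why the delay costs no wasted cycle when the slot holds useful work.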
Delay in Pipeline
[Figure: Branch Delay (in MIPS III Instruction Mode). Rows: Branch, (Branch delay slot), Target. Stages: IF, RF, EX, DC, WB. The branch delay is marked.]
Delay in Pipeline
[Figure: Branch Delay (in MIPS16 Instruction Mode). Rows: Branch, (Branch delay slot), Target. Stages: IF, IT, RF, EX, DC, WB. The branch delay is marked.]
Interlock and Exception Handling
Pipeline flow is interrupted when cache misses or exceptions occur, or when data dependencies are detected.
Faults are divided into exceptions and interlocks:
–Exceptions: abort
–Interlocks: stall, slip
Exception conditions
Examples: interrupt exception, ITLB exception.
When an exception condition occurs, the relevant instruction and all those that follow it in the pipeline are cancelled.
[Figure: exception handling in the pipeline. Stages: IF, RF, EX, DC, WB. Labels: 1, 2, Exception, Exception vector, Discard stage, Interpret.]
Stall conditions
Examples: instruction TLB miss and instruction cache miss.
When a stall occurs, the processor resolves the condition and the pipeline then continues.
Stall conditions
[Figure: Data Cache Miss Stall. Stages: IF, RF, EX, DC, WB, with stages repeated while the pipeline is stalled. Points 1, 2, and 3 are marked.]
1. Detect data cache miss.
2. Start moving data cache line to write buffer.
3. Get last word into cache and restart pipeline.
Stall conditions
[Figure: Cache Instruction Stall. Stages: IF, RF, EX, DC, WB, with stages repeated while the pipeline is stalled. Points 1 and 2 are marked.]
1. Cache instruction start.
2. Cache instruction complete.
Slip conditions
If all of the source operands are available (either from the register file or via the internal bypass logic) and all the hardware resources necessary to complete the instruction will be available whenever required, then the instruction "runs"; otherwise, the instruction "slips".
Examples: load data interlock and MD busy interlock.
Slip conditions
[Figure: Load Data Interlock. Rows: Load A, Load B, Add A,B. Stages: IF, RF, EX, DC, WB, with repeated stages while the Add slips. Labels: Bypass, 1, 2.]
1. Detect load interlock.
2. Get the target data.
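The load data interlock in this example (Load A, Load B, Add A,B) can be detected with a simple scan over the instruction sequence. This is a sketch, not the hardware logic: the tuple format (operation, destination, sources) and the function name are hypothetical, and it only flags an instruction that uses the result of the load immediately before it.

```python
def find_load_interlocks(program):
    """Return indices of instructions that would slip on a load data interlock.

    Each instruction is a hypothetical (op, dest, sources) tuple. An
    instruction is flagged when the one immediately before it is a load
    whose destination is among its source operands.
    """
    slipping = []
    for i in range(1, len(program)):
        prev_op, prev_dest, _ = program[i - 1]
        _, _, sources = program[i]
        if prev_op == "load" and prev_dest in sources:
            slipping.append(i)
    return slipping


program = [
    ("load", "A", ()),
    ("load", "B", ()),
    ("add", "C", ("A", "B")),   # uses B right after Load B
    ("add", "D", ("C", "A")),   # previous instruction is not a load
]
print(find_load_interlocks(program))  # [2]
```

Moving an independent instruction between Load B and the Add would remove the flagged dependency in this model.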
Slip conditions
[Figure: MD Busy Interlock. Rows: MUL/DIV (multiply/division operation), MFHI/MFLO. Stages: IF, RF, EX, DC, WB, with the EX stage repeated while the instruction slips. Labels: Bypass, 1, 2.]
1. Detect MD busy interlock.
2. Get target data.
Bypassing Operand bypass allows an instruction in the EX stage to continue without having to wait for data or conditions to be written to the register file at the end of the WB stage.
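The benefit of operand bypass can be estimated with a small model. It rests on several simplifying assumptions that are not from the slides: a 5-stage IF, RF, EX, DC, WB pipeline, one instruction issued per cycle, a producer whose result is ready at the end of EX, and, without bypass, a consumer that must read its operand in an RF stage falling after the producer's WB stage. The function name is hypothetical.

```python
STAGES = ["IF", "RF", "EX", "DC", "WB"]


def wait_cycles(distance, bypass):
    """Cycles a dependent instruction waits for its operand in this model.

    `distance` is how many instructions after the producer the consumer
    is issued (1 = immediately following).
    """
    if bypass:
        return 0  # the result is forwarded straight to the consumer's EX stage
    # Without bypass, the consumer's RF must come after the producer's WB.
    first_safe_distance = STAGES.index("WB") - STAGES.index("RF") + 1
    return max(0, first_safe_distance - distance)


print(wait_cycles(1, bypass=True))   # 0
print(wait_cycles(1, bypass=False))  # 3
```

Under these assumptions a back-to-back dependent pair loses three cycles without bypass and none with it. Load results are a separate case and can still cause a slip.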