The Philips TriMedia A VLIW Architecture By Jurjen Westra.


1 The Philips TriMedia A VLIW Architecture By Jurjen Westra

2 TM-1 Block Diagram: SDRAM Main Memory Interface; I2C Interface; Timers; PCI Interface; Sync Serial Interface; Audio Out; Audio In; Video In; Video Out; VLD Coprocessor; Image Coprocessor; VLIW CPU with 32K I$ and 16K D$. The TriMedia has 128 general-purpose 32-bit registers.

3 VLIW means relying on compiler techniques: only cache misses are handled at run time; compiler scheduling / instruction-level parallelism; operation guarding; speculation; profiling for recompiling; grafting (loop unrolling); alias analysis

4 [Diagram] Traditional Scheduling (ABCDABCD, BCADBCAD) versus VLIW Scheduling (ABCDABCD, CBADCBAD)

5 [Diagram] Instruction Cache -> Issue Slots 1 to 5 -> Execution Units 1, 2, ..., 27. But not all issue slots have access to all (types of) execution units!

6 Functional units: latency and issue slots
Unit        Latency   Issue slots marked (out of slots 1-5)
CONST                 5
ALU                   5
SHIFTER               2
FALU        3         2
DSPALU      2         2
DSPMUL      3         2
BRANCH      3         3
IFMUL       3         2
FCOMP                 1
DMEM        3         2
DMEMSPEC    3         1
FTOUGH      17/16     1
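The slot counts on slide 6 can be illustrated with a small greedy packer. This is only a sketch under stated assumptions: the per-unit limits are simply the number of marks listed for each unit, the function name `pack` and the sample operation list are hypothetical, and the operations are treated as independent (no latencies, dependences, or physical slot positions are modeled). It is not the TriMedia compiler's scheduler.

```python
# Sketch: pack independent operations into 5-slot VLIW instructions.
# SLOTS_PER_UNIT counts the marks per unit on slide 6. Which physical
# slots those marks belong to is not modeled (assumption), so this check
# is necessary but not sufficient for a legal TriMedia instruction.
from collections import Counter

ISSUE_SLOTS = 5
SLOTS_PER_UNIT = {
    "CONST": 5, "ALU": 5, "SHIFTER": 2, "FALU": 2, "DSPALU": 2,
    "DSPMUL": 2, "BRANCH": 3, "IFMUL": 2, "FCOMP": 1,
    "DMEM": 2, "DMEMSPEC": 1, "FTOUGH": 1,
}

def pack(ops):
    """Greedily place each operation in the first instruction with room."""
    instructions = []  # each entry: unit names issued in the same cycle
    for unit in ops:
        for instr in instructions:
            used = Counter(instr)
            if len(instr) < ISSUE_SLOTS and used[unit] < SLOTS_PER_UNIT[unit]:
                instr.append(unit)
                break
        else:
            # No existing instruction can take this operation: start a new one.
            instructions.append([unit])
    return instructions

if __name__ == "__main__":
    sample = ["ALU", "DMEM", "DMEM", "DMEM", "FCOMP", "FCOMP", "ALU"]
    for number, instr in enumerate(pack(sample)):
        print(number, instr)
```

With the sample list, the third DMEM and the second FCOMP spill into a second instruction because their units are reachable from only two slots and one slot respectively.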

7 Guarding. C code: if (R2 > R3) R4 = R4 + R5; else R4 = R4 + R6; Assembly: igtr R7 R2 R3 | add R4 R4 R6 …… IF R7 add R4 R4 R5 ………...
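The guarding example on slide 7 can be modeled in a few lines of Python. This is a sketch of the idea (if-conversion: both additions are issued and a guard register decides which one takes effect), not TriMedia assembly. The helper names, the dictionary register file, and the second guard register `R8` holding the inverted condition are assumptions made for illustration; the slide's assembly is elided and does not show how the else-branch is guarded.

```python
# Sketch of guarded (predicated) operations replacing a branch.

def guarded_add(regs, guard, dst, a, b):
    """Write regs[a] + regs[b] to regs[dst] only when the guard is true.

    The guard is treated as true when its least significant bit is 1
    (an assumption of this model).
    """
    if regs[guard] & 1:
        regs[dst] = regs[a] + regs[b]

def branch_version(regs):
    """Reference behaviour: the C code from the slide, using a real branch."""
    if regs["R2"] > regs["R3"]:
        regs["R4"] = regs["R4"] + regs["R5"]
    else:
        regs["R4"] = regs["R4"] + regs["R6"]

def guarded_version(regs):
    """Branch-free form: both adds are issued, guards select the effect."""
    regs["R7"] = int(regs["R2"] > regs["R3"])   # igtr R7 R2 R3
    regs["R8"] = 1 - regs["R7"]                 # inverted guard (assumed register)
    guarded_add(regs, "R7", "R4", "R4", "R5")   # IF R7 add R4 R4 R5
    guarded_add(regs, "R8", "R4", "R4", "R6")   # IF R8 add R4 R4 R6

if __name__ == "__main__":
    for r2, r3 in [(5, 3), (3, 5), (4, 4)]:
        a = {"R2": r2, "R3": r3, "R4": 10, "R5": 1, "R6": 100}
        b = dict(a)
        branch_version(a)
        guarded_version(b)
        print(r2, r3, a["R4"], b["R4"])
```

Because exactly one of the two guards is set, the second add never sees a value written by the first, so the guarded form matches the branching form for every input.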

8 Characteristics (1): custom ops => loss of VLIW character; big- or little-endian; R0 and R1 have the values 0 and 1 respectively; no integer status flags, but case-specific bit patterns; 32 interrupt vectors; interrupts are delayed

9 Characteristics (2): 11-cycle read-miss penalty; 3-cycle write-miss penalty; functional units require 1 cycle of recovery time; byte-addressable, with 8-, 16- and 32-bit loads and stores; register file supports up to 5 writes per cycle (latency); register file supports up to 15 reads per cycle; paging (64 bytes); instruction length: 2-23 bytes, compressed

10 Example: MPEG-2 decoder on the DVD Batman bitstream (4-9 Mbit/s): 7% instruction-cache misses; 27% data-cache misses; CPI (clock cycles per VLIW instruction): 1.37; total performance: 2.9 ops/clock
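The figures on slide 10 can be combined into an estimate of how full the VLIW instructions were. The sketch below assumes that the CPI and the ops/clock figure come from the same run and that operations per clock equals operations per instruction divided by CPI; the function names are hypothetical. Under those assumptions, 2.9 ops/clock at a CPI of 1.37 corresponds to about 3.97 operations per instruction, or roughly 79% of the five issue slots.

```python
# Sketch: derive instruction fill from the measured CPI and ops/clock.
ISSUE_SLOTS = 5  # issue slots per VLIW instruction (slide 5)

def ops_per_instruction(ops_per_clock, cpi):
    """ops/clock = (ops/instruction) / CPI, solved for ops/instruction."""
    return ops_per_clock * cpi

def slot_utilization(ops_per_clock, cpi, slots=ISSUE_SLOTS):
    """Fraction of issue slots holding a useful operation, on average."""
    return ops_per_instruction(ops_per_clock, cpi) / slots

if __name__ == "__main__":
    print(round(ops_per_instruction(2.9, 1.37), 3))
    print(round(slot_utilization(2.9, 1.37), 3))
```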

