Portable SystemC-on-a-Chip

1 Portable SystemC-on-a-Chip
Scott Sirowy, Bailey Miller, and Frank Vahid†
Department of Computer Science and Engineering, University of California, Riverside
{ssirowy,bmiller,
†Also with the Center for Embedded Computer Systems at UC Irvine
This work was supported in part by the National Science Foundation and the Office of Naval Research

2 Introduction: Prototyping Circuits and Systems
Task: Create a custom ASIC/FPGA circuit to detect edges in an image
[Diagram labels: Memory Controller; Edge Detector; signals address, data, go; pixel inputs s1-s4 and s6-s9; adders and subtractors; MIN with constant 255; Pixel Value]

3 Introduction: Prototyping Circuits and Systems
Capture in HDL
-- VHDL/Verilog File
Entity Edge_Detector is
  Port {
    clk : in std_logic;
    rst : in std_logic;
    data: in std_logic_vec
  };
[Diagram labels: Memory Controller; Edge Detector; signals address, data, go; pixel inputs s1-s4 and s6-s9; adders and subtractors; MIN with constant 255]

4 Introduction: Prototyping Circuits and Systems
SystemC
  C++ based
  Creation, instantiation, and connection of components
  Precisely timed communication and execution among concurrently executing components
  Supports both “software” and “hardware” constructs and semantics
Capture in HDL
class EDGE_DETECTOR : public sc_module {
  //signal declarations
  EDGE_DETECTOR() {
    SC_METHOD(mainComp);
    sensitive << dataReady;
    SC_METHOD(getPixel);
    sensitive << clock.pos();
[Diagram labels: Memory Controller; Edge Detector; signals address, data, go; pixel inputs s1-s4 and s6-s9; adders and subtractors; MIN with constant 255; Pixel Value]

5 Introduction: Prototyping Circuits and Systems
Simulation
  Requires environment modeling
  Sometimes hard!
  Does not interact with real I/O
[Diagram labels: Edge Detector datapath and Memory Controller; Capture in HDL (the EDGE_DETECTOR SystemC class shown on slide 4); Simulation on Desktop PC]

6 Introduction: Prototyping Circuits and Systems
Implementation
  Mapping to microprocessor / coprocessor system
  Interfacing Issues
  Synthesis Issues
  Size Constraints
[Diagram labels: Edge Detector datapath and Memory Controller; Capture in HDL (the EDGE_DETECTOR SystemC class shown on slide 4); Mapping & Synthesis]

7 Introduction: Prototyping Circuits and Systems
In-System Emulation
  Quickly-obtained simulation interaction with real I/O
  Prior to time-consuming mapping and synthesis
  But slower
[Diagram labels: Edge Detector datapath and Memory Controller; Capture in HDL (the EDGE_DETECTOR SystemC class shown on slide 4); Emulation]

8 In-System Emulation of SystemC
How?
Port publicly available SystemC libraries to target platforms
  SystemC executable has built-in event kernel
  Libraries are large and require OS support
[Diagram labels: SystemC Description; FPGA; Processor; Processor]

9 Bytecode
Modern portability approach
  Java, C#
Virtual Machine (VM): Program that executes bytecode
  May JIT compile to native architecture
[Diagram labels: Java, C#; Compiler; Bytecode; three VMs; Opteron; Pentium; Atom]

10 SystemC Bytecode?
[Diagram labels: SystemC; Compiler; SystemC Bytecode; three VMs; Pentium; Opteron + FPGA; FPGA]

11 Portable SystemC-on-a-Chip
Task: Create a custom circuit to detect edges in an image
SystemC bytecode can run on any platform that supports the SystemC emulation engine, without the need for recompilation or synthesis
[Diagram labels: SystemC Description; SystemC Bytecode Compiler; SystemC Bytecode; Processor; FPGA; Emulation Engine; Emulation Accelerators]

12 SystemC Bytecode Compiler
Pinapa Front End (Moy, EMSOFT’05)
  Extracts architectural features and behavior of each process
  Uses modified versions of GCC and the SystemC kernel
Bytecode Back End
  Flattens original SystemC circuit
  Generates SystemC bytecode that preserves architecture and behavioral information
  Output is a human-readable text file
Example input:
class EDGE_DETECTOR : public sc_module {
  //signal declarations
  EDGE_DETECTOR() {
    SC_METHOD(mainComp);
    sensitive << dataReady;
    SC_METHOD(getPixel);
    sensitive << clock.pos();
  }
[Diagram labels: SystemC Description; SystemC Bytecode Compiler; Pinapa Front End (ELAB, AST, Link); Bytecode Back End (Register Allocation, Code Generation); SystemC Bytecode]

13 SystemC Bytecode
Sequential Instructions
  Based on the RISC MIPS instruction set
  Efficient emulation (Davis 2003)
Spatial Instructions
  Includes meta instructions for defining architectural features, bit width specific computations, and reading and writing signals
SystemC Bytecode listing (callouts on the slide: Spatial Constructs; MIPS-like sequential instructions):
--header
signal clock : 1
signal reset : 1
signal memory_in : 32
signal fb_data : 32
signal leds : 4
process(clock)
  READ $1 memory_in
  ADD $2 $0 3
  ADD $3 $2 $1
  WRITE $3 s1
  ADDI $1 $0 1
  WRITE $1 dataReady
  END
process(dataReady)
  READ $5 val6
  SW $5 24($0)
  READ $5 val7
  ADDI $10 $0 0
  ADDI $7 $0 0
  ADDI $13 $0 8
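
The process(clock) listing on this slide can be traced with a very small interpreter. The following is a minimal sketch, not the authors' virtual machine: the Instr struct, the run_process function, and the clock_process helper are hypothetical names introduced here. It assumes that $0 always reads as zero (as in MIPS), treats the slide's "ADD $2 $0 3" as an add-immediate, and does not model the per-signal bit widths declared in the header.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory form of a few instructions from the listing.
enum class Op { READ, ADD, ADDI, WRITE, END };

struct Instr {
    Op op;
    int rd;               // destination register (source register for WRITE)
    int rs;               // first source register
    int rt;               // second source register (ADD only)
    std::int32_t imm;     // immediate operand (ADDI only)
    std::string signal;   // signal name (READ and WRITE only)
};

using Signals = std::map<std::string, std::uint32_t>;

// Executes one process body until END, reading and writing named signals.
inline void run_process(const std::vector<Instr>& body, Signals& signals) {
    std::array<std::uint32_t, 32> reg{};  // register file, all zero at start
    for (const Instr& in : body) {
        switch (in.op) {
            case Op::READ:  reg[in.rd] = signals[in.signal]; break;
            case Op::ADD:   reg[in.rd] = reg[in.rs] + reg[in.rt]; break;
            case Op::ADDI:  reg[in.rd] = reg[in.rs] + static_cast<std::uint32_t>(in.imm); break;
            case Op::WRITE: signals[in.signal] = reg[in.rd]; break;
            case Op::END:   return;
        }
        reg[0] = 0;  // assumption: $0 is hard-wired to zero
    }
}

// The process(clock) body from the slide, transcribed by hand.
inline std::vector<Instr> clock_process() {
    return {
        {Op::READ,  1, 0, 0, 0, "memory_in"},  // READ  $1 memory_in
        {Op::ADDI,  2, 0, 0, 3, ""},           // ADD   $2 $0 3 (immediate assumed)
        {Op::ADD,   3, 2, 1, 0, ""},           // ADD   $3 $2 $1
        {Op::WRITE, 3, 0, 0, 0, "s1"},         // WRITE $3 s1
        {Op::ADDI,  1, 0, 0, 1, ""},           // ADDI  $1 $0 1
        {Op::WRITE, 1, 0, 0, 0, "dataReady"},  // WRITE $1 dataReady
        {Op::END,   0, 0, 0, 0, ""},
    };
}
```

With memory_in set to 10, running clock_process() leaves s1 at 13 and dataReady at 1 under these assumptions.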

14 SystemC Emulation Engine
Must support a basic SystemC interface
  Clock
  Reset
  16 I/O pins
  8KB Input Memory
  8KB Output Memory
  UART
Platforms with more advanced I/O might support more features
  Increased Memory
  Extended General Purpose I/O
[Diagram labels: SystemC Circuit with Clock, Reset, Input I/O, Output I/O, UART Tx, UART Rx, Input Mem Addr, Input Mem Data, Input Mem Stream, Output Mem Addr, Output Mem Data]

15 SystemC Emulation Engine
Real I/O Peripherals
  Representative of many systems
Emulation Engine Kernel
  Virtual Machine
  Discrete Event Kernel
  Peripheral Access and Hooks
Optional USB Download Interface
[Diagram labels: Emulation Engine; Main Processor; Input Memory; Output Memory; Instruction Memory; Read Signal Memory; Write Signal Memory; UART; USB Interface; Buttons; LEDs; groupings "USB Download Interface", "I/O Peripherals", and "Emulation Engine Kernel and Support Peripherals"]
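
The discrete event kernel named on this slide is not described in detail. As a rough illustration of the general technique, the sketch below keeps a sensitivity list per signal and runs evaluate/update rounds until no signal changes. The Kernel class and its method names are hypothetical, and deferring writes until the end of a round is an assumption modeled on SystemC signal semantics rather than something the slides state.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event kernel: processes wake when a signal they watch changes.
class Kernel {
public:
    using Process = std::function<void(Kernel&)>;

    // Registers a process together with the signals it is sensitive to.
    void add_process(Process body, const std::vector<std::string>& sensitivity) {
        const std::size_t id = processes_.size();
        processes_.push_back(std::move(body));
        for (const std::string& name : sensitivity) watchers_[name].push_back(id);
    }

    // Returns the current value of a signal (0 if it was never written).
    std::uint32_t read(const std::string& name) const {
        const auto it = current_.find(name);
        return it == current_.end() ? 0u : it->second;
    }

    // Deferred write: the value becomes visible at the next update phase.
    void write(const std::string& name, std::uint32_t value) { pending_[name] = value; }

    // Runs update/evaluate rounds until no writes remain; returns the round count.
    int settle(int max_rounds = 1000) {
        int rounds = 0;
        while (!pending_.empty() && rounds < max_rounds) {
            std::set<std::size_t> runnable;
            for (const auto& kv : pending_) {
                if (read(kv.first) == kv.second) continue;  // unchanged: no event
                current_[kv.first] = kv.second;
                const auto it = watchers_.find(kv.first);
                if (it != watchers_.end())
                    runnable.insert(it->second.begin(), it->second.end());
            }
            pending_.clear();
            for (std::size_t id : runnable) processes_[id](*this);
            ++rounds;
        }
        return rounds;
    }

private:
    std::vector<Process> processes_;
    std::map<std::string, std::vector<std::size_t>> watchers_;
    std::map<std::string, std::uint32_t> current_;
    std::map<std::string, std::uint32_t> pending_;
};
```

A process registered as sensitive to clock that writes dataReady, and a second one sensitive to dataReady, would run in successive rounds after a single write to clock, which mirrors how process(clock) hands off to process(dataReady) in the bytecode listing.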

16 Emulation Engine Acceleration
For some SystemC applications, emulation can be slow
  An Edge Detection circuit required ~10 minutes to process a 320x240 image *
* on a 100 MHz/SRAM Microblaze SystemC Emulation Engine implementation
[Diagram labels: Emulation Engine running SystemC bytecode; Main Processor; Input Memory; Output Memory; Instruction Memory; Read Signal Memory; Write Signal Memory; UART; USB Interface; Buttons; LEDs]
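
To put the quoted figure in per-pixel terms, the arithmetic can be written out as below. This is only a back-of-the-envelope estimate: "~10 minutes" is taken as exactly 600 seconds, and the constant and function names are introduced here for illustration.

```cpp
#include <cstdint>

// Figures quoted on the slide; "~10 minutes" is rounded to 600 s here.
constexpr std::uint64_t kPixels  = 320ull * 240ull;  // 76,800 pixels per image
constexpr std::uint64_t kSeconds = 10ull * 60ull;    // approximate run time in seconds
constexpr std::uint64_t kClockHz = 100000000ull;     // 100 MHz Microblaze

// Average number of pixels emulated per second of wall-clock time.
constexpr std::uint64_t pixels_per_second() { return kPixels / kSeconds; }

// Average number of processor cycles spent per pixel.
constexpr std::uint64_t cycles_per_pixel() { return kClockHz * kSeconds / kPixels; }

static_assert(pixels_per_second() == 128, "76,800 pixels / 600 s");
static_assert(cycles_per_pixel() == 781250, "6e10 cycles / 76,800 pixels");
```

Under that rounding, the base emulator handles roughly 128 pixels per second, or on the order of 780,000 processor cycles per pixel.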

17 Emulation Engine Acceleration
For some SystemC applications, emulation can be slow
  An Edge Detection circuit required ~10 minutes to process a 320x240 image *
If available, use platform FPGA to create bytecode accelerators
  Execute SystemC bytecode natively
  Accelerators speed up emulation
* on a 100 MHz Microblaze SystemC Emulation Engine implementation
[Diagram labels: Emulation Engine running SystemC bytecode; Main Processor; Input Memory; Output Memory; Instruction Memory; Read Signal Memory; Write Signal Memory; UART; USB Interface; Buttons; LEDs; FPGA with Accelerator 1, Accelerator 2, Accelerator 3]

18 SystemC Bytecode Accelerators
MIPS-like multicycle RISC datapath
Communicates with core emulator via memory-mapped registers
# of accelerators limited to # of masters allowed on bus
[Diagram labels: Emulation Engine running SystemC bytecode; Main Processor; Input Memory; Output Memory; Instruction Memory; Read Signal Memory; Write Signal Memory; UART; USB Interface; Buttons; LEDs; Accelerator (RISC Datapath; Register File; Local Mem; Bus, start, load logic); FPGA with Accelerator 1, Accelerator 2, Accelerator 3]
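
The slides do not give the register map used between the core emulator and an accelerator. Purely as an illustration of memory-mapped control, the sketch below assumes a hypothetical three-register layout (control, status, process id) and a hypothetical run_on_accelerator helper that starts a process and polls for completion.

```cpp
#include <cstdint>

// Hypothetical register offsets in 32-bit words; the real layout is not given.
constexpr unsigned kRegControl = 0;  // bit 0: start
constexpr unsigned kRegStatus  = 1;  // bit 0: done
constexpr unsigned kRegProcess = 2;  // index of the bytecode process to run

// Starts process `id` on the accelerator mapped at `regs` and polls the status
// register up to `max_polls` times. Returns true if the done bit was observed.
inline bool run_on_accelerator(volatile std::uint32_t* regs,
                               std::uint32_t id,
                               unsigned max_polls) {
    regs[kRegProcess] = id;   // select which process to execute
    regs[kRegControl] = 1u;   // raise start
    for (unsigned i = 0; i < max_polls; ++i) {
        if (regs[kRegStatus] & 1u) {
            regs[kRegControl] = 0u;  // drop start once the run has finished
            return true;
        }
    }
    return false;  // poll budget exhausted; start is left asserted
}
```

On real hardware `regs` would point at the accelerator's bus address; in a host-side test it can simply point at a small array standing in for the device.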

19 SystemC-on-a-Chip Implementation
Platform: Xilinx Spartan 3E; Virtex4 ML403; Virtex5 VLX110T *
Main Processor: Microblaze (50 MHz); PowerPC (50 MHz); Microblaze (100 MHz)
Bus Platform: OPB; PLB; PLB
Main Memory: SRAM; SRAM+BRAM; BRAM (order as extracted; which platform each value belongs to is unclear)
# Emulation Accelerators: 0-1; 1-2; >3
* Currently building
* Demo

20 SystemC-on-a-Chip Implementation
SystemC Bytecode Compiler
  3,500 lines of code + Pinapa (20,000 lines)
SystemC Emulation Engine
  3,000 lines of C + 8,000 lines of VHDL
[Diagram labels: SystemC Bytecode Compiler (Pinapa: ELAB, AST, Link; Back End); Emulation Engine with Main Processor, Input Memory, Output Memory, Instruction Memory, Read Signal Memory, Write Signal Memory, UART, USB Interface, Buttons, LEDs; FPGA with Accelerator 1, Accelerator 2, Accelerator 3]

21 SystemC-on-a-Chip Implementation
SystemC Bytecode Accelerator
  2,000 lines of VHDL
  Area: ~3000 Slices
  Clock Frequency: MHz
[Diagram labels: Emulation Engine with Main Processor, Input Memory, Output Memory, Instruction Memory, Read Signal Memory, Write Signal Memory, UART, USB Interface, Buttons, LEDs; Accelerator (RISC Datapath; Register File; Local Mem; Bus, start, load logic); FPGA with Accelerator 1, Accelerator 2, Accelerator 3]

22 SystemC-on-a-Chip Experiments
Competitive with SystemC PC Simulation, but with the benefits of real I/O
[Chart: Execution Time, normalized to SystemC running on a 2.8 GHz Intel Xeon. Series: Base Emulation on Virtex 4; Base Emulation on Virtex 5; Emulation + Accelerators (Virtex 4); Emulation + Accelerators (Virtex 5)]

23 Conclusions
Introduced SystemC Bytecode as a means to emulate SystemC for prototyping
For platforms with FPGA resources, introduced bytecode accelerators to speed up SystemC performance
  Outperforms emulation by over 100X
As proof of concept, built 3 test platforms and tested multiple SystemC circuits without having to recompile or synthesize
Future Directions
  Emulation architecture improvements
  Synthesizing SystemC just-in-time?

