1
OpenSPARC-Xilinx Collaboration
Durgam Vahia (Durgam.Vahia@Sun.Com), OpenSPARC Engineering
Paul Hartke (paul.hartke@xilinx.com), Xilinx University Program (XUP)
RAMP Retreat, UC Berkeley, January 2007
2
Agenda
Goals
OpenSPARC T1 – Quick Recap
What we have been up to – T1 on FPGAs
Current Status and Results
Road-map
3
Big Goals
Proliferation of Sun OpenSPARC technology
Proliferation of Xilinx FPGA technology
– Make OpenSPARC FPGA-friendly
– Create reference design with complete system functionality and proven path to hardware
– Boot Solaris/Linux on the reference design
– Open it up
– Seed ideas in the community
Significant enabler for future research in multi-core
4
What is OpenSPARC T1
SPARC V9 implementation
Eight cores, four threads each: 32 simultaneous threads
All cores connect through a 134.4 GB/s crossbar switch
High-bandwidth, 12-way associative, 3 MB on-chip L2 cache
4 DDR2 channels (23 GB/s)
70 W power
~300M transistors
5
OpenSPARC T1: Design Choices
Simpler core architecture to maximize cores on die
Caches, DRAM channels shared across cores
Shared L2 decreases cost of coherence misses significantly
Crossbar good for bandwidth, latency and functional verification
6
OpenSPARC Core
Four threads per core
Single issue 6 stage pipeline
16KB I-cache, 8KB D-cache
Unique resources per thread
– Registers
– Portions of I-fetch datapath
– Store and Miss buffers
Resources shared by 4 threads
– Caches, TLBs, Execution units
– Pipeline registers and DP
[Figure labels: IFU, EXU, MUL, TRAP, MMU, LSU]
7
OpenSPARC Pipeline
http://opensparc-t1.sunsource.net/specs/OpenSPARCT1_Micro_Arch.pdf
All processor IO (including interrupts) via Crossbar interface
8
OpenSPARC T1 on FPGAs
Create single core, single thread implementation of T1 for FPGAs
Map it on Xilinx FPGA board and use board peripherals to build the working hardware system
Boot commercial OS on it
9
OpenSPARC FPGA Implementation
Single core, single thread implementation of T1
– Small, clean and modular FPGA implementation
About 39K 4-input LUTs, 123 BRAMs (Synplicity on Virtex{2/2Pro/4})
Synchronous, no latches or gated clocks
Better utilization of FPGA resources (BRAMs, Multiplier)
– Functionally equivalent to custom implementation, except:
8 entry Fully Associative TLB as opposed to 64 entry
Removed Crypto unit (modular arithmetic operations)
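As a rough illustration of what "8 entry Fully Associative TLB" means, here is a minimal Python model. It is a sketch under stated assumptions, not the T1 RTL: the class name, the 8 KB page size, and the evict-the-oldest replacement policy are all hypothetical choices made for the example.

```python
from collections import OrderedDict

PAGE_SHIFT = 13  # assumption for illustration: 8 KB pages


class FullyAssociativeTLB:
    """Toy model: any entry may hold any translation (fully associative)."""

    def __init__(self, entries=8):
        self.entries = entries
        self.map = OrderedDict()  # virtual page number -> physical page number

    def lookup(self, vaddr):
        """Return the translated physical address, or None on a TLB miss."""
        vpn = vaddr >> PAGE_SHIFT
        if vpn not in self.map:
            return None
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        return (self.map[vpn] << PAGE_SHIFT) | offset

    def insert(self, vpn, ppn):
        """Install a translation; when full, evict the oldest entry (assumed policy)."""
        if vpn in self.map:
            self.map.pop(vpn)
        elif len(self.map) >= self.entries:
            self.map.popitem(last=False)
        self.map[vpn] = ppn
```

With only 8 entries instead of 64, a working set of more than 8 pages starts evicting translations, which is one reason the software stack would need to be aware of the smaller TLB.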
10
Single Thread T1 on FPGAs
Functionally stable
– Passing mini and full regressions
Completely routed
– No timing violations
– Easily meets 20ns (50MHz) cycle time
Expandable to more threads
– Reasonable overhead for most blocks (~30% for 4 threads)
– Some bottlenecks exist (Multi-port register files)
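The cycle-time figure is a direct period-to-frequency conversion. The helper below is a hypothetical name introduced only to show the arithmetic.

```python
def period_ns_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz (f = 1 / T)."""
    return 1000.0 / period_ns


print(period_ns_to_mhz(20))  # 20 ns period, as quoted on the slide
```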
11
System Block Diagram
[Figure: system block diagram. Block labels: SPARC T1 Core; processor-to-crossbar interface (PCX); PCX-FSL Interposer; Fast Simplex Links interface (FSL); Microblaze Proc; Microblaze Debug UART; SPARC T1 UART; IBM CoreConnect OPB Bus; MCH-OPB MemCon; MultiPort Memory Controller; External DDR2 DIMM; 10/100 Ethernet; FPGA Boundary. Legend: Xilinx Embedded Development Kit (EDK) design; Block must be developed]
12
System Theory of Operation
OpenSPARC T1 core communicates exclusively via the processor-to-crossbar interface (PCX)
– PCX is a packet based interface
Microblaze softcore will sit in a polling loop and accept these packets, perform any protocol conversion, and forward them to the appropriate peripheral
– Could even implement floating point operations via the Microblaze FPU
Microblaze will also poll (or accept interrupts from) the peripherals, convert the info to a PCX packet, and forward it to the PCX interface
– Microblaze has its own UART for its own diagnostic input/output
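The polling-loop idea described above can be sketched in Python. This is an illustrative model only: the `Packet` fields, the packet kind names ("load", "store", and the reply kinds), the `Memory` stand-in peripheral, and `poll_once` are all hypothetical and do not reflect the real PCX packet format or the actual Microblaze firmware.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str       # hypothetical packet type label, e.g. "load" or "store"
    address: int
    data: int = 0


class Memory:
    """Stand-in peripheral: a sparse word-addressed memory."""

    def __init__(self):
        self.words = {}

    def handle(self, pkt):
        if pkt.kind == "store":
            self.words[pkt.address] = pkt.data
            return Packet("store_ack", pkt.address)
        return Packet("load_return", pkt.address, self.words.get(pkt.address, 0))


def poll_once(from_core, to_core, peripherals):
    """One pass of the loop: take a request, pick a peripheral, queue the reply."""
    if not from_core:
        return False                    # nothing pending this iteration
    pkt = from_core.popleft()
    target = peripherals[pkt.kind]      # "protocol conversion" reduced to a lookup
    to_core.append(target.handle(pkt))
    return True


# Usage: a store followed by a load of the same address.
mem = Memory()
peripherals = {"load": mem, "store": mem}
from_core = deque([Packet("store", 0x100, 42), Packet("load", 0x100)])
to_core = deque()
while poll_once(from_core, to_core, peripherals):
    pass
```

A lookup table keyed by packet kind keeps the dispatch step small; supporting a new packet type in this model means adding one entry rather than changing the loop.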
13
Implementation Results
XC4VFX100-11FF1152 FPGA
– 42,649/84,352 LUT4s (50%)
– 131/376 BRAM-16kbits (34%)
– 50MHz operation
Have not attempted any faster
– Synplicity Synthesis: 25 minutes
– Place and Route: 42 minutes (Microblaze & Related Logic)
14
Preliminary Virtex5 Results
Virtex5 xc5vlx110tff1136
– Same as Bee3 FPGA
30,508/69,120 LUT6s (44%)
119/148 BRAM-36kbits (80%)
– Working through mapping issues…
50MHz placed and routed design
– Have not attempted any faster
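The utilization percentages on the Virtex4 and Virtex5 results slides can be re-derived from the used/available counts. The short script below is only a cross-check; the dictionary labels are shorthand introduced here.

```python
def utilization(used, available):
    """Return resource utilization as a percentage."""
    return 100.0 * used / available


# (used, available) pairs exactly as quoted on the two results slides.
REPORTED = {
    "XC4VFX100 LUT4": (42_649, 84_352),
    "XC4VFX100 BRAM (16 kbit)": (131, 376),
    "xc5vlx110t LUT6": (30_508, 69_120),
    "xc5vlx110t BRAM (36 kbit)": (119, 148),
}

for name, (used, available) in REPORTED.items():
    print(f"{name}: {utilization(used, available):.1f}%")
```

The whole-number percentages on the slides (50%, 34%, 44%, 80%) match these values truncated rather than rounded.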
15
OpenSPARC FPGA HW Roadmap
Current reference design occupies about 45% of XC4VFX100 FPGA. This design includes
– Single core, single thread of OpenSPARC T1
– Microblaze to communicate with peripherals (DRAM, Ethernet)
– Glue logic to connect T1 core with Microblaze
More design paths exist, e.g.
1) Two single thread cores in single FPGA
2) Up to 4 threads per FPGA
16
OpenSPARC FPGA SW Roadmap
Boot Solaris and Linux on a single thread FPGA version of the design
– Include support for all packet types with Microblaze
– Hypervisor changes to support this variant of T1
Reduction in TLB size
– Device driver support for the system
– Emulation routines in OS for floating point operations
Mainly for ISA compliance
17
Reference Design
ml410 board with Virtex4-100 FPGA (aka ml411)
– Bit file and ELF are stored on CompactFlash card
Each design is a hardware implementation of one regression suite test
– Microblaze soft-core sends the test packets to the OpenSPARC core and verifies the return packets
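The send-and-verify flow of the reference design can be modelled in a few lines of Python. This is a sketch under assumptions: `run_test` and the stand-in core `echo_plus_one` are hypothetical, and packets are reduced to plain integers rather than real test vectors.

```python
def run_test(core, vectors):
    """Send each stimulus to `core` and compare the reply with the expected value.

    Returns the indices of mismatching vectors; an empty list means the test passed.
    """
    failures = []
    for index, (stimulus, expected) in enumerate(vectors):
        if core(stimulus) != expected:
            failures.append(index)
    return failures


def echo_plus_one(packet):
    """Hypothetical stand-in for the core: replies with the packet value plus one."""
    return packet + 1


print(run_test(echo_plus_one, [(1, 2), (5, 6)]))
```

Returning the failing indices instead of a single pass/fail flag makes it easier to see which packet in a regression test diverged.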
18
www.opensparc.net
All of this will be available under GPL(2) license
– Complete Verilog code of FPGA T1 and glue logic to Microblaze
– Synplicity scripts for synthesis
– The whole reference design
19
www.opensparc.net (2)
Verification Environment
– Very very important
– Change and VERIFY
– Scripts for running regression in three modes
Chip8 – Full-chip test suite
Core1 – Single core (four threads) test suite
Thread1 – Single core, single thread test suite for FPGA version
– Supports Synopsys VCS and Cadence NC-Verilog
Considering supporting Mentor ModelSim as well
Bring down as many barriers as possible
20
Development Team
Sun OpenSPARC Team
– Durgam Vahia
– Ismet Bayraktaroglu
– Thomas Thatcher
Xilinx University Program
– Paul Hartke
21
OpenSPARC & Xilinx FPGAs!!