1
SSS 4/9/99 CMU Reconfigurable Computing
The CMU Reconfigurable Computing Project
April 9, 1999
Mihai Budiu
mihaib@cs.cmu.edu
2
Current Project Members
ECE Department
– Herman Schmit
– Srihari Cadambi
– Matt Moe
– Robert Taylor
– Ronald Laufer
CS Department
– Seth Copen Goldstein
– Mihai Budiu
3
Why Study Reconfigurable Hardware?
It is a nice computation paradigm (wire your own computer)
4
Why Study Reconfigurable Hardware?
5
Commercial Players
Source: In-stat April 1998
*Does not include software, hardwire or support EPROMs
6
What Is “Reconfigurable Hardware”?
Universal gates and/or storage elements
Interconnection network
Switches
7
Basic Ingredient: RAM cell
Universal gate = RAM
[Figure: RAM cell with address inputs a0, a1 and a data output; stored contents 0001; output labeled a1 & a2]
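The "universal gate = RAM" idea can be sketched in a few lines: the inputs form an address, and the stored bits decide which function the gate computes. This is only an illustration. The name make_lut and the address ordering (a1 as the high bit) are assumptions of the sketch, not taken from the slides.

```python
def make_lut(truth_table: str):
    """Build a 2-input gate whose behaviour is defined only by RAM contents.

    truth_table[i] is the bit stored at address i, where the address is
    formed from the inputs as (a1 << 1) | a0 (an assumed ordering).
    """
    bits = [int(ch) for ch in truth_table]
    if len(bits) != 4 or any(b not in (0, 1) for b in bits):
        raise ValueError("a 2-input gate needs exactly four stored bits")

    def gate(a0: int, a1: int) -> int:
        # Reading the RAM at the address given by the inputs *is* the gate.
        return bits[(a1 << 1) | a0]

    return gate


# Reprogramming the same "hardware" is just a matter of storing other bits.
and_gate = make_lut("0001")
xor_gate = make_lut("0110")
```

With the contents 0001 the cell behaves as an AND gate; storing 0110 instead turns the same cell into an XOR gate.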
8
Basic Ingredients (ctd)
A switch is controlled by a 1-bit RAM cell
[Figure: switches with control bits 0, 1, 1, 1]
9
Outline
What is reconfigurable hardware
RH vs other computation paradigms
Challenges in RH research
PipeRench: the CMU project:
– the hardware
– the software
Conclusions
10
RH vs ASICs
Generally, Application-Specific Integrated Circuits will be faster than RH:
– RH wires are slow & big
– RH bit-slices are costly to interconnect
– RH devices must store configuration on the chip
but RH can be reprogrammed
– for new algorithms
– to fix bugs
RH is cheaper in small production runs
RH tolerates faults better
RH is sometimes faster with staged computation
11
RH vs Microprocessors
RH less flexible (like a VLIW with fixed instructions)
but RH provides more (customized) computation elements
RH can decrease memory traffic
RH can be tailored for specific algorithms and data types
RH will not replace µPs, but complement them
12
Types of RH
FPGAs: bit-level logic functionality (the basic processing elements compute on 1 bit)
word-based architectures: PipeRench (CMU) (basic PE operates on 8 bits) (basic PE is a small ALU)
coarse architectures: RAW (MIT) (basic PE is a MIPS 2000 core)
13
RH In A System
14
Challenges In RC
Software tools:
– Programming RC like software development
– Automatic compilation from HLL
– Automatic program partitioning
Mapping algorithms efficiently (no ISA)
System issues
– interfaces
– find “ideal” RC fabric
15
The CMU Reconfigurable Computing Project
16
Hardware Goals
To build a complete reconfigurable hardware device
To build the system integration hardware
To host the device in a PC
17
Our Device:
Word processing elements
Pipelined architecture
Virtualized hardware
Local interconnection network
Wide pipelined bus
18
[Figure labels: Configuration memory; Stripes; Data & Config controller; Processing elements]
19
Hardware Virtualization
[Figure labels: Program; Instructions currently in hardware; Instructions paged out; Actual available hardware]
20
Hardware Virtualization (2)
Overlap configuration with computation.
[Figure labels: Program in configuration memory; hardware; Page in; Page out; configure; compute]
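A toy model can make the paging picture concrete. It assumes (this is not stated on the slides) that exactly one physical stripe is reconfigured per cycle, in round-robin order, while all other resident stripes keep computing. The function name virtualization_schedule and the record layout are hypothetical.

```python
def virtualization_schedule(num_virtual: int, num_physical: int, cycles: int):
    """Toy schedule for a program with more stripes than the hardware has.

    Assumption of this sketch: one physical stripe is configured per cycle,
    round-robin, while every other resident stripe computes.
    """
    resident = [None] * num_physical  # virtual stripe held by each physical stripe
    schedule = []
    for t in range(cycles):
        slot = t % num_physical            # physical stripe being configured
        paged_out = resident[slot]         # what it held before (None if empty)
        resident[slot] = t % num_virtual   # virtual stripe paged in
        computing = [v for i, v in enumerate(resident)
                     if i != slot and v is not None]
        schedule.append({
            "cycle": t,
            "physical": slot,
            "page_in": resident[slot],
            "page_out": paged_out,
            "computing": computing,
        })
    return schedule


for step in virtualization_schedule(num_virtual=5, num_physical=3, cycles=6):
    print(step)
```

In every cycle after the first, at least one stripe is computing while another is being configured, which is the overlap the slide refers to.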
21
Processing Elements
Look-up table
Any 3-to-1 function
[Figure: look-up table with inputs a, b, Cin and output out; PE0, PE1, PE2]
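As an illustration of what a 3-input look-up table over (a, b, Cin) can do, the sketch below builds an adder out of two such tables, one for the sum bit and one for the carry. Modelling the carry as a second table is an assumption made for simplicity; the slides do not say how the real processing element produces its carry. The names lut3, SUM, CARRY and add_bits are hypothetical.

```python
def lut3(table):
    """Return a 3-input, 1-output function defined by an 8-entry table."""
    if len(table) != 8:
        raise ValueError("a 3-input look-up table needs 8 entries")

    def f(a: int, b: int, cin: int) -> int:
        # The three input bits form the table address.
        return table[(a << 2) | (b << 1) | cin]

    return f


# Table entries are listed for (a, b, cin) = 000, 001, 010, ..., 111.
SUM = lut3([0, 1, 1, 0, 1, 0, 0, 1])    # parity of the three inputs
CARRY = lut3([0, 0, 0, 1, 0, 1, 1, 1])  # majority of the three inputs


def add_bits(x: int, y: int, width: int = 8):
    """Add two width-bit numbers one bit-slice at a time, chaining the carry."""
    result, carry = 0, 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        result |= SUM(a, b, carry) << i
        carry = CARRY(a, b, carry)
    return result, carry
```

For example, add_bits(3, 5) gives a result of 8 with no carry out, and add_bits(200, 100) wraps around to 44 with a carry out of 1.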
22
The Interconnection Network
Word-level cross-bar
Pass Registers
[Figure labels: PE 0, PE 1, PE N; B bits; P*B bits; P*B*N bits]
23
The PCI Board
24
Our Target Applications
Pipelineable applications
– Stream processing (e.g. DSP, encryption)
– Multimedia processing
– Vector processing
– Limited data dependencies
Computational power stems from massive parallelism
[Figure: input data v1 to v9 streaming through HW to output data]
25
Software Goal
To program reconfigurable devices using the standard software development processes:
– Compile C or Java
– Do it quickly
[Figure labels: Java; Partitioner; CPU; DIL (Data-flow Intermediate Language); Configuration; Reconfigurable HW; Built]
26
Building Circuits From DIL
a = b + c * d; e = c - d;
variables → wires
operators → gates
[Figure: circuit with inputs b, c, d, gates *, +, -, and outputs a, e]
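The correspondence between variables and wires, and between operators and gates, can be sketched by turning the two assignments into a gate list. The sketch borrows Python's own parser only because these two statements happen to be valid Python; it does not implement DIL's real grammar. The name build_circuit and the temporary wire names t1, t2, ... are hypothetical.

```python
import ast

# Operators this sketch knows how to turn into gates.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def build_circuit(source: str):
    """Return a list of gates (op, left_wire, right_wire, output_wire)."""
    gates = []
    counter = 0

    def wire_for(node, out=None):
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id                      # a variable is just a wire
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = wire_for(node.left)
            right = wire_for(node.right)
            if out is None:                     # unnamed intermediate result
                counter += 1
                out = f"t{counter}"
            gates.append((OPS[type(node.op)], left, right, out))
            return out
        raise ValueError("unsupported expression in this sketch")

    for stmt in ast.parse(source).body:
        simple = (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                  and isinstance(stmt.targets[0], ast.Name))
        if not simple:
            raise ValueError("only simple assignments are supported")
        wire_for(stmt.value, out=stmt.targets[0].id)
    return gates


for gate in build_circuit("a = b + c * d; e = c - d;"):
    print(gate)
```

Each tuple is one gate, and any wire name that appears in more than one tuple (here c and d) is a wire that fans out to several gates.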
27
Mapping Circuits To
[Figure: four alternative layouts of a circuit with inputs a, b, c and operators - and +]
28
The DIL Compiler Front-End
[Figure labels: DIL input file; Parser; Evaluator; Loader; Circuit component library; Component circuits; Backend]
29
The DIL Compiler Backend
[Figure labels: Front-end; Circuit (expanded); Optimizer; Placer-Router; Circuit (placed); Code generator; outputs: Asm, C++, xfig]
The whole compilation process is very fast compared to classical CAD tools: we can compile two orders of magnitude faster.
30
Processing Element Size Tradeoffs
31
Stripe Width Tradeoffs
32
Bus Width Tradeoffs
33
Clock Speed Tradeoffs (run-time)
[Figure: adder diagram labeled 24 next to adders labeled 8, 8, 8]
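The figure text has not survived, so the following is only one plausible reading of it: a single 24-bit addition can be split into three chained 8-bit additions, trading a long carry chain (slower clock) for more, shorter stages. The sketch shows that decomposition; the names add8 and add24_in_stages are hypothetical, and nothing here is taken from the actual PipeRench design.

```python
def add8(a: int, b: int, cin: int = 0):
    """One 8-bit adder stage: returns (8-bit sum, carry out)."""
    total = a + b + cin
    return total & 0xFF, total >> 8


def add24_in_stages(x: int, y: int):
    """Add two 24-bit numbers using three 8-bit stages, low byte first."""
    result, carry = 0, 0
    for stage in range(3):
        shift = 8 * stage
        # Each stage only sees one byte of each operand plus the carry in.
        s, carry = add8((x >> shift) & 0xFF, (y >> shift) & 0xFF, carry)
        result |= s << shift
    return result, carry
```

Each stage has a carry chain only 8 bits long, so in hardware it could be clocked faster than a single 24-bit adder, at the cost of taking three stages to produce the full result.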
36
Project Status
Operational:
– Behavioral and structural models of PipeRench in Verilog
– Assembler, simulator
– Tools for visualization and debugging
– One tile fabricated and tested
– Very fast compiler from intermediate language
In progress:
– Prototype PipeRench to be taped out this summer
– PCI board to host PipeRench in a PC
37
Simulated Speed-up vs. UltraSPARC @ 300 MHz
38
Future Work
Build the PCI board
Build the OS device drivers
Start investigating HLL issues:
– automatic partitioning
– translation to DIL
– special code transformations
39
Conclusions
A set of important applications can benefit from RC devices
RC devices offer the potential for substantial performance improvement at a low cost
RC devices will soon be mainstream in the embedded computing world; perhaps in the future they will also permeate the desktop
[Figure label: Pentium V]