Fly – A Modifiable Hardware Compiler

Presentation transcript:

1 Fly – A Modifiable Hardware Compiler
C. H. Ho (1), P.H.W. Leong (1), K.H. Tsoi (1), R. Ludewig (2), P. Zipf (2), A.G. Oritz (2) and M. Glesner (2)
(1) Department of Computer Science and Engineering, The Chinese University of Hong Kong
(2) Institute of Microelectronic Systems, Darmstadt University of Technology, Germany

2 Overview
- Introduction
- Contribution
- Fly Programming Language
- GCD Example
- Floating Point Extension
- Solving Differential Equation Example
- Discussion
- Conclusion

3 Introduction
- RTL-based development compared with software development:
  - Hardware designs are parallel, while people think in von Neumann patterns
  - Decomposing a hardware design into datapath and control is complex
  - A hardware interface for the FPGA board must be developed
- RTL-based design → low productivity

4 Introduction
- HDLs address some of these issues
  - Higher level of abstraction
  - Translate the behavioral model into RTL
- Fly
  - Translates Perl-like code into VHDL
  - Released under an open source license; easily understood and modified by other users
  - Supports a simple memory-mapped interface between a host processor and the FPGA

5 Introduction
- Fly supports
  - While loops
  - If-else branches
  - Integer arithmetic
  - Parallel statements
  - Register assignment
- Fly is easily extensible
  - The simple FPGA-host interface is readily adapted to different FPGA vendors
  - The modular design makes it possible to add new arithmetic

6 Fly Programming Language
Six main elements:
- Assignment
- Parallel statement
- Arithmetic expression
- While loop
- If-else branching
- Condition evaluation

7 Fly Programming Language
Compilation technique
- Each statement has start and end signals
- A one-hot state machine is constructed by cascading these signals
- The compiler is written in Perl and uses the parser generator Parse::RecDescent to parse the program, which keeps the code simple and concise
- The compiler outputs VHDL code instead of a netlist
  - The output works with many FPGA and ASIC design tools
  - The synthesis tools can perform further logic optimization
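The cascading of start and end signals can be sketched in software. The following is a minimal cycle-level model in Python, not the compiler's actual VHDL output: the function name `simulate_chain` and the per-statement latencies are assumptions made for illustration. Each statement owns one "active" bit, and the end pulse of statement i becomes the start pulse of statement i+1, so exactly one bit is set in every cycle.

```python
def simulate_chain(latencies):
    """Simulate sequential statements wired as a one-hot chain.

    latencies[i] is the (hypothetical) number of clock cycles statement i
    needs; every entry must be >= 1 and the list must be non-empty.
    Returns one state vector per clock cycle.
    """
    n = len(latencies)
    active = [False] * n        # one flip-flop per statement
    countdown = [0] * n         # cycles left for each running statement
    start = [True] + [False] * (n - 1)  # external start pulse hits statement 0
    trace = []
    while True:
        # A start pulse activates the statement it is wired to.
        for i in range(n):
            if start[i]:
                active[i] = True
                countdown[i] = latencies[i]
        if not any(active):
            break               # last end pulse has fired, program finished
        trace.append([int(a) for a in active])
        # Running statements count down and raise "end" when finished.
        end = [False] * n
        for i in range(n):
            if active[i]:
                countdown[i] -= 1
                if countdown[i] == 0:
                    active[i] = False
                    end[i] = True
        # Cascade: end of statement i is the start of statement i + 1.
        start = [False] + end[:-1]
    return trace


if __name__ == "__main__":
    for cycle, state in enumerate(simulate_chain([1, 2, 1])):
        print(cycle, state)
```

Loops, branches and parallel statements would need extra wiring (for example, joining the end signals of parallel branches), which this sketch leaves out.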

8 Main elements

Construct          | Elements                                         | Example
Assignment         | var = expr;                                      | var1 = tempvar;
Parallel statement | [ {…} {…} … ]                                    | [{a=b;}{b=a*c;}]
Expression         | val op expr; valid ops: *, /, +, -               | a = b * c;
If-else            | if (condition) {…} else {…}                      | if (i<=j) {a=b;} else {a=c;}
Condition          | expr rel expr; valid rels: >, <, >=, <=, ==, !=  | i >= c

9 Development Environment
- Fly supports the Pilchard FPGA platform
  - Pilchard uses the DIMM memory bus interface instead of the PCI bus
  - Lower latency and higher bandwidth
  - The registers $din[x] are used for transferring data and for handshaking
  - Porting to other FPGA platforms is possible
- Shell scripts make the whole compilation and implementation process transparent to the user
- The host driver is written in Perl, with inlined C in critical sections

10 Fly Programming Language
GCD example:

{
  $s = $din[1];
  $l = $din[2];
  while ($s != $l) {
    $a = $l - $s;
    if ($a > 0) {
      $l = $a;
    } else {
      [{$s = $l;} {$l = $s;}]
    }
  }
  $dout[1] = $l;
}
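As a cross-check of the algorithm, the GCD program can be mirrored in ordinary software. This Python sketch is a reference model only; the function name `fly_gcd` is made up here, and it assumes both inputs are positive integers and ignores register width. The parallel statement is modelled as a simultaneous tuple assignment, on the assumption that both branches read the register values from before the statement, which is what lets the Fly code swap two registers.

```python
import math


def fly_gcd(din1, din2):
    """Software model of the Fly GCD program (inputs assumed positive)."""
    s, l = din1, din2
    while s != l:
        a = l - s
        if a > 0:
            l = a                # keep subtracting the smaller value
        else:
            s, l = l, s          # parallel statement: swap the registers
    return l                     # value written to $dout[1]


if __name__ == "__main__":
    for x, y in [(12, 18), (48, 36), (7, 7), (17, 5)]:
        # Compare against the standard library as a sanity check.
        print(x, y, fly_gcd(x, y), math.gcd(x, y))
```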

11 Result
- A GCD coprocessor was implemented using the Fly system
- Device: Xilinx XCV300E-8
- Maximum frequency: 126 MHz
- Slices used: 135 out of 3,072
- Computes a GCD every 1.63 µs (including the interface overhead)

12 Floating Point Extension
- Fly can be extended with other arithmetic, such as floating point arithmetic
  - A module library provides the floating point adder and multiplier
  - start and end signals are added to the floating point operators
- Block RAM is used as the interface to the host processor
  - New built-in functions read_host() and write_host()
  - Data passed between the host and the FPGA can be buffered
- Three new operators, ".+", ".-" and ".*", invoke floating point operations
  - The parser is changed to enforce operator precedence and to instantiate the floating point modules appropriately

13 Application to Solving Differential Equation
- The modified Fly compiler is used to solve an ordinary differential equation
- The solver is based on the Euler method, where h is the step size
- It involves floating point addition, subtraction and multiplication
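The equation itself is not present in the transcript. Reading the program on the following slide backwards, the update dy = h · 0.5 · (t − y), with initial values t = 0 and y = 1 and the loop condition t < 3, is consistent with the initial value problem and Euler step below. Treat this as an inference from the code rather than the authors' stated formulation.

```latex
\frac{dy}{dt} = \frac{t - y}{2}, \qquad y(0) = 1, \qquad 0 \le t < 3

y_{n+1} = y_n + h \cdot \frac{t_n - y_n}{2}, \qquad t_{n+1} = t_n + h
```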

14 Solving Differential Equation

{
  $h = &read_host(1);
  [{$t = 0.0;} {$y = 1.0;} {$dy = 0.0;} {$onehalf = 0.5;} {$index = 0;}]
  while ($t < 3.0) {
    [{$t1 = $h .* $onehalf;} {$t2 = $t .- $y;}]
    [{$dy = $t1 .* $t2;} {$t = $t .+ $h;}]
    [{$y = $y .+ $dy;} {$index = $index + 1;}]
    $void = &write_host($y, $index);
  }
}
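The same loop can be modelled in Python to see what the coprocessor is expected to produce. This is a sketch under assumptions: the names `fly_euler` and `exact` are invented here, Python's double precision stands in for the hardware floating point format, and the closed-form solution y(t) = 3e^(−t/2) + t − 2 is derived from the ODE inferred from the program, not taken from the slides. Each tuple assignment corresponds to one parallel statement, so all right-hand sides see the values from before that statement.

```python
import math


def fly_euler(h, t_end=3.0):
    """Software model of the Fly Euler loop; returns (index, y) pairs
    in the order the program would pass them to write_host()."""
    t, y, dy, onehalf, index = 0.0, 1.0, 0.0, 0.5, 0
    written = []
    while t < t_end:
        t1, t2 = h * onehalf, t - y         # first parallel statement
        dy, t = t1 * t2, t + h              # second parallel statement
        y, index = y + dy, index + 1        # third parallel statement
        written.append((index, y))          # stands in for write_host
    return written


def exact(t):
    """Closed-form solution of y' = (t - y) / 2 with y(0) = 1 (assumed ODE)."""
    return 3.0 * math.exp(-t / 2.0) + t - 2.0


if __name__ == "__main__":
    result = fly_euler(1.0 / 16.0)
    index, y = result[-1]
    print("steps:", index, "euler:", y, "exact:", exact(3.0))
```

With h = 1/16 the loop runs 48 times, and the Euler result at t = 3 differs from the closed-form value by roughly 0.016, as expected for a first-order method.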

15 Result
- Using parallel statements achieves a 1.43× speedup
- Implemented on a Xilinx XCV300E-8 device
- Maximum frequency: 53.9 MHz
- Slices used: 2,349 out of 3,072
- For h = 1/16, one execution takes 28.7 µs, including all interface overheads

16 Discussion
- Fly is a flexible design
  - It adapts to different arithmetic by adding more libraries
    - Bit-parallel arithmetic → digit-serial arithmetic
    - Fixed point arithmetic → floating point arithmetic
  - Fly is easy to modify for a different HDL
  - The grammar can be redefined
- Fly enhances productivity when building FPGA systems
  - Using a C-like language can significantly reduce the design time

17 Conclusion
- Using a programming language for FPGA design
  - Code can be reused
  - Optimization is done by the tools
  - Design time is reduced
  - The host/FPGA interface is defined
- Using floating point arithmetic on FPGAs
  - Makes use of high FPGA density
  - Opens up more applications for FPGAs beyond fixed point arithmetic

18 Q & A