Device Tradeoffs Greg Stitt ECE Department University of Florida.

Slides:



Advertisements
Similar presentations
EEL4720/5721 Reconfigurable Computing The state-of-the-art Reconfigurable Computing equipment available for this course is made possible by a generous.
Advertisements

EEL4930/5934 Reconfigurable Computing The state-of-the-art Reconfigurable Computing equipment available for this course is made possible by a generous.
A reconfigurable system featuring dynamically extensible embedded microprocessor, FPGA, and customizable I/O Borgatti, M. Lertora, F. Foret, B. Cali, L.
Floating-Point FPGA (FPFPGA) Architecture and Modeling (A paper review) Jason Luu ECE University of Toronto Oct 27, 2009.
Implementation methodology for Emerging Reconfigurable Systems With minimum optimization an appreciable speedup of 3x is achievable for this program with.
CompSci Today’s Topics Technology for Computers A Technology Driven Field Upcoming Computer Architecture (Chapter 8) Reading (not in text)
Reconfigurable Computing: What, Why, and Implications for Design Automation André DeHon and John Wawrzynek June 23, 1999 BRASS Project University of California.
Zheming CSCE715.  A wireless sensor network (WSN) ◦ Spatially distributed sensors to monitor physical or environmental conditions, and to cooperatively.
Extensible Processors. 2 ASIP Gain performance by:  Specialized hardware for the whole application (ASIC). −  Almost no flexibility. −High cost.  Use.
Lesson 12 Economics of ASICs.
Some Thoughts on Technology and Strategies for Petaflops.
Parallel Programming Henri Bal Vrije Universiteit Faculty of Sciences Amsterdam.
Spring 08, Jan 15 ELEC 7770: Advanced VLSI Design (Agrawal) 1 ELEC 7770 Advanced VLSI Design Spring 2007 Introduction Vishwani D. Agrawal James J. Danaher.
Chapter 1. Introduction This course is all about how computers work But what do we mean by a computer? –Different types: desktop, servers, embedded devices.
Institute of Digital and Computer Systems 1 Fabio Garzia / Finding Peak Performance in a Process23/06/2015 Chapter 5 Finding Peak Performance in a Process.
EE314 Basic EE II Silicon Technology [Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]
The Challenging (and Fun!) World of Computer Engineering Professor Dave Meyer School of Electrical & Computer Engineering Purdue University.
February 4, 2002 John Wawrzynek
A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications From J. Fowers, G. Brown, P. Cooke, and G. Stitt, University.
SSS 4/9/99CMU Reconfigurable Computing1 The CMU Reconfigurable Computing Project April 9, 1999 Mihai Budiu
Chapter 1 Sections 1.1 – 1.3 Dr. Iyad F. Jafar Introduction.
Digital Circuit Implementation. Wafers and Chips  Integrated circuit (IC) chips are manufactured on silicon wafers  Transistors are placed on the wafers.
Introduction to Reconfigurable Computing Greg Stitt ECE Department University of Florida.
Embedded Systems. 2 A “short list” of embedded systems And the list goes on and on Anti-lock brakes Auto-focus cameras Automatic teller machines Automatic.
Real time DSP Professors: Eng. Julian Bruno Eng. Mariano Llamedo Soria.
WHAT IS A COMPUTER? Computer is an electronic device designed to manipulate data so that useful information can be generated. Computer is multifunctional.
BR 1/001 Implementation Technologies We can implement a design with many different implementation technologies - different implementation technologies.
1 VLSI and Computer Architecture Trends ECE 25 Fall 2012.
Overview of Future Hardware Technologies The chairman’s notes from the plenary session appear in blue throughout this document. Subject to these notes.
February 12, 1998 Aman Sareen DPGA-Coupled Microprocessors Commodity IC’s for the Early 21st Century by Aman Sareen School of Electrical Engineering and.
1 3-General Purpose Processors: Altera Nios II 2 Altera Nios II processor A 32-bit soft core processor from Altera Comes in three cores: Fast, Standard,
Reconfigurable Devices Presentation for Advanced Digital Electronics (ECNG3011) by Calixte George.
Multi-core Programming Introduction Topics. Topics General Ideas Moore’s Law Amdahl's Law Processes and Threads Concurrency vs. Parallelism.
Sogang University Advanced Computing System Chap 1. Computer Architecture Hyuk-Jun Lee, PhD Dept. of Computer Science and Engineering Sogang University.
Publication: Ra Inta, David J. Bowman, and Susan M. Scott. Int. J. Reconfig. Comput. 2012, Article 2 (January 2012), 1 pages. DOI= /2012/ Naveen.
VLSI & ECAD LAB Introduction.
Embedded System Design 王佑中 Yu-Chung
Programming Concepts in GPU Computing Dušan Gajić, University of Niš Programming Concepts in GPU Computing Dušan B. Gajić CIITLab, Dept. of Computer Science.
J. Christiansen, CERN - EP/MIC
Reminder Lab 0 Xilinx ISE tutorial Research Send me an if interested Looking for those interested in RC with skills in compilers/languages/synthesis,
Introduction to Reconfigurable Computing Greg Stitt ECE Department University of Florida.
Dean Tullsen UCSD.  The parallelism crisis has the feel of a relatively new problem ◦ Results from a huge technology shift ◦ Has suddenly become pervasive.
EE3A1 Computer Hardware and Digital Design
EEL4720/5721 Reconfigurable Computing Greg Stitt Associate Professor.
© GCSE Computing Computing Hardware Starter. Creating a spreadsheet to demonstrate the size of memory. 1 byte = 1 character or about 1 pixel of information.
MULTICORE PROCESSOR TECHNOLOGY.  Introduction  history  Why multi-core ?  What do you mean by multicore?  Multi core architecture  Comparison of.
FPGA-Based System Design: Chapter 1 Copyright  2004 Prentice Hall PTR Moore’s Law n Gordon Moore: co-founder of Intel. n Predicted that number of transistors.
Moore’s Law and Its Future Mark Clements. 15/02/2007EADS 2 This Week – Moore’s Law History of Transistors and circuits The Integrated circuit manufacturing.
Exploiting Parallelism
Introduction to VLSI Design Amit Kumar Mishra ECE Department IIT Guwahati.
CS203 – Advanced Computer Architecture
Programmable Logic Devices
Conclusions on CS3014 David Gregg Department of Computer Science
EEL4720/5721 Reconfigurable Computing
ELEC 7770 Advanced VLSI Design Spring 2016 Introduction
EEL4720/5721 Reconfigurable Computing
Chapter 1: Introduction
Introduction to Reconfigurable Computing
Solving the SoC Design Dilemma for IoT Applications with Embedded FPGA
ELEC 7770 Advanced VLSI Design Spring 2014 Introduction
EEL4930/5934 Reconfigurable Computing
EEL4712 Digital Design.
Dynamically Reconfigurable Architectures: An Overview
EEL4712 Digital Design.
Chapter 1 Introduction.
HIGH LEVEL SYNTHESIS.
Greg Stitt ECE Department University of Florida
EEL4720/5721 Reconfigurable Computing
Unit -4 Introduction to Embedded Systems Tuesday.
EEL4930/5934 Reconfigurable Computing
Presentation transcript:

Device Tradeoffs Greg Stitt ECE Department University of Florida

When to use RC? MicroprocessorASICRC (FPGA,CPLD, etc.) Performance Why not use an ASIC for everything? Implementation Possibilities RC devices enable design of digital circuits without fabricating a device Therefore, RC can be used anytime a digital circuit is needed Examples: ASIC prototyping, ASIC replacement, replacing/accelerating microprocessors But, when should be RC be used instead of alternative technologies?

Moore’s Law Moore's Law is the empirical observation made in 1965 that the number of transistors on an integrated circuit doubles every 18 months [Wikipedia] 2007: >1 BILLION transistors!!!! 1993: 1 Million transistors Becoming extremely difficult to design this - ASICs are expensive!

Moore’s Law Solution: Make billions of transistors into a reconfigurable fabric - fabricate 1 big chip and use it for many things Area overhead: circuit in FPGA can require 20x more transistors But, that’s still equivalent to a > 50 million transistor ASIC Pentium IV ~ 42 million transistors Modern FPGAs reportedly support millions of logic gates! 2007: >1 BILLION transistors!!!! Solution: Make this reconfigurable

When should RC be used? 1) When it provides the cheapest solution Depends on: NRE Cost - Non-recurring engineering cost Cost involved with designing application Unit cost - cost of a manufacturing/purchasing a single system Volume - # of units Total cost = NRE + unit cost * volume RC is typically more cost effective for low volume applications RC: low NRE, high unit cost ASIC: very high NRE, low unit cost

What about microprocessors? Similar cost issues µP (microprocessor) trends low NRE cost (coding is cheap) Unit cost varies from several dollars to several thousand Wouldn’t cheapest microprocessor always be the cheapest solution? Yes, but …

What about microprocessors? Often, microprocessors cannot meet performance constraints e.g. video decoder must achieve minimum frame rate Common reason for using custom circuit implementation

Example FPGA: Unit cost = 5, NRE cost = 200,000 Microprocessor (µP): Unit cost = 8, NRE cost = 100,000 Problem: Find cheapest implementation for all possible volumes (assume both implementations meet constraints) Volume Cost FPGA µP 100k 200k 5v+200k = 8v+100k v = 33k 33k Answer: For volumes less than 33k, µP is cheapest solution. For all other volumes, FPGA is cheapest solution.

Example: Your Turn FPGA Unit cost: 6, NRE cost: 300,000 ASIC Unit cost: 2, NRE cost: 3,000,000 Microprocessor (µP) Unit cost: 10, NRE cost: 100,000 Problem: Find cheapest implementation for all possible volumes (assume that all possibilities meet performance constraints)

Another Example FPGA Unit cost: 7, NRE cost: 300,000 ASIC Unit cost: 4, NRE cost: 3,000,000 Microprocessor (µP) Unit cost: 1, NRE cost: 100,000 Volume Cost FPGA µP ASIC Answer: µP cheapest solution at any volume – not uncommon

When should RC be used? 2) When time to market is critical Huge effect on total revenue Time Revenue Growth Decline Total revenue = area of triangle Time to market Delayed time to market = less revenue RC has faster time to market than ASIC

When should RC be used? 3) When circuit may have to be modified Can’t change ASIC - hardware Can change circuit implemented in FPGA Uses When standards change Codec changes after devices fabricated Allows addition of new features to existing devices Fault tolerance/recovery “Partial reconfiguration” allows virtual device with arbitrary size - analogous to virtual memory Without RC Anything that may have to be reconfigured is implemented in software Performance loss

How to choose a device? 1. Determine architectures that meet performance requirements Not trivial, requires performance analysis/estimation - important problem Will study later in semester And, other constraints - power, size, etc. 2. Estimate volume of device 3. Determine cheapest solution The best architecture for an application is typically the cheapest one that meets all design constraints.

RC Markets Embedded Systems FPGAs appearing in set-top boxes, routers, audio equipment, etc. Advantages RC achieves performance close to ASIC, sometimes at much lower cost Many other embedded systems still use ASIC due to high volume Cell phones, iPod, game consoles, etc. Reconfigurable! If standards change, architecture is not fixed Can add new features after production

RC Markets High-performance embedded computing (HPEC) High-performance/super computing with special needs (low power, low size/weight, etc.) Satellite image processing Target recognition in a UAV RC Advantages Much smaller/lower power than a supercomputer Fault tolerance

RC Markets High-performance computing (HPC) Cray, SGI, DRC, GiDEL, Nallatech, XtremeData Combine high-performance microprocessors with FPGA accelerators Novo-G 192 Altera Stratix III FPGAs integrated with 24 quad-core microprocessors RC advantages HPC used for many scientific apps Low volume, ASIC rarely feasible, microprocessor too slow Lower power consumption Increasingly important Cooling and energy costs are dominant factor in total cost of ownership

RC Markets General-purpose computing??? Ideal situation: desktop machine/OS uses RC to speedup up all applications (similar to GPU trend) Problems RC can be very fast, but not for all applications Generally requires parallel algorithms Coding constructs used in many applications not appropriate for hardware Subject of tremendous amount of past and likely future research How to use extra transistors on general purpose CPUs? More cache More microprocessor cores FPGA GPU Something else?

Limitations of RC 1) Not all applications can be improved 2) Tools need serious improvement! 3) Design strategies are often ad hoc 4) Floating point? Requires a lot of area, but performance is becoming competitive with other devices Already superior in terms of energy Embedded Applications – Large Speedups Desktop Applications – No Speedup