The performance requirements for DSP applications continue to grow, and traditional solutions do not adequately address this new challenge.

Presentation transcript:

The performance requirements for DSP applications continue to grow, and traditional solutions do not adequately address this new challenge. Paradigm shift: we will show you a new way to think about high-performance DSP solutions. The benefits are so great that they are hard to believe. 1

What do you do when the fastest DSP processor is not fast enough? Add more DSP processors? Design a custom gate array? Traditionally these have been the only two options available. Multiple DSP processors have too many problems: they are too expensive, need too many components, draw too much power, and bring a long and expensive development cycle (complex real-time software). And the results are still too slow. Custom chips are appropriate for some applications if you have the time and money, you know exactly what you want, and the market will wait for you. Custom solutions yield a low unit cost if the production volumes materialize and if you don't make too many mistakes in the design process. Their drawbacks are high development costs, long time-to-market, and no flexibility. 2

Just Add a Xilinx FPGA. [Figure: A-to-D converter (fs = 20 MHz) feeding a channelizer (700 million MACs), followed by a DSP processor (demodulation). Spectrum plot at a 20 MHz sample rate: channels Ch. 0, Ch. 1 and Ch. 2 at 0/4 fs, 1/4 fs (5 MHz) and 2/4 fs (10 MHz), each with 0.8 MHz bandwidth, axis ticks at 0.4, 4.6, 5.4 and 9.6 MHz, and 60 dB attenuation within 1.5% of fs.] Yes, there is another solution: just add a Xilinx FPGA. Xilinx FPGAs are a programmable solution just like a DSP processor, but they have the performance of a custom device. Consider this design example, which requires a great deal of processing power to separate three narrow-band channels simultaneously. The data sample rate is 20 MHz. Multiple filters with many taps are required to achieve 60 dB of out-of-band attenuation within 1.5% of the sample frequency. This requires about 700 million MACs per second. This type of data path design is a perfect match for Xilinx DSP, and the design can be built with Xilinx DSP cores. 3
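
To put the 700 million MACs per second in perspective, the arithmetic can be sketched as follows. The sample rate and MAC requirement come from the slide; the per-processor MAC rate is a hypothetical figure chosen only for illustration and is not stated in the presentation.

```python
import math

def macs_per_input_sample(mac_rate_hz, sample_rate_hz):
    """MAC operations that must complete for every incoming sample."""
    return mac_rate_hz / sample_rate_hz

def processors_needed(mac_rate_hz, macs_per_processor_hz):
    """Whole processors required if each one sustains the given MAC rate."""
    return math.ceil(mac_rate_hz / macs_per_processor_hz)

SAMPLE_RATE_HZ = 20_000_000        # from the slide: fs = 20 MHz
REQUIRED_MAC_RATE = 700_000_000    # from the slide: about 700 million MACs/s
ASSUMED_PROCESSOR_MAC_RATE = 200_000_000  # hypothetical, not from the slides

print(macs_per_input_sample(REQUIRED_MAC_RATE, SAMPLE_RATE_HZ))        # 35.0
print(processors_needed(REQUIRED_MAC_RATE, ASSUMED_PROCESSOR_MAC_RATE))  # 4
```

In other words, every 50 ns sample period has to absorb about 35 multiply-accumulates, which is why a processor that time-shares one or two MAC units struggles with this workload.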

Using Xilinx DSP Cores. [Figure: inside a Spartan S40, a 1:4 demux feeds four 32-tap SDA FIR filter cores, whose outputs go to a 4-point FFT that produces Ch. 0, Ch. 1 and Ch. 2.] DSP cores automate the design process by directly implementing each functional block. Standard cores can be selected from a library and customized to the specific system requirements. The SDA FIR filter core is used to build a 4-to-1 decimating filter, and the outputs go directly to a 4-point FFT instead of through an adder tree. The FFT can be built from several adder cores. The bit-widths for each section of the data path can be independently set to optimal values. With DSP processors you have only one bit-width choice. This set of DSP cores fits in a single Xilinx Spartan device. Spartan is our new family of low-cost FPGAs, and many high-performance applications can fit in this new FPGA family. 4
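
The demux, filter and FFT arrangement described here resembles a standard polyphase DFT filter bank. The slides give neither the filter coefficients nor the exact wiring, so the following is only a minimal sketch under that assumption. The function names, the 8-tap moving-average prototype and the test tone are all illustrative choices, not taken from the presentation, and the real design uses four 32-tap filters.

```python
import cmath
import math

def polyphase_channelize(x, h, m=4):
    """Split x into m channels centred at k*fs/m, each decimated by m.

    Branch p filters every m-th input sample with every m-th coefficient of
    the prototype filter h; an m-point DFT then combines the branch outputs.
    """
    assert len(h) % m == 0, "prototype length must be a multiple of m"
    taps_per_branch = len(h) // m
    channels = [[] for _ in range(m)]
    for t in range(len(x) // m):
        branch = []
        for p in range(m):
            acc = 0.0
            for r in range(taps_per_branch):
                idx = (t - r) * m - p
                if idx >= 0:  # samples before the start are treated as zero
                    acc += h[r * m + p] * x[idx]
            branch.append(acc)
        # m-point DFT across the branches (positive-exponent convention)
        for k in range(m):
            channels[k].append(sum(
                branch[p] * cmath.exp(2j * math.pi * k * p / m)
                for p in range(m)))
    return channels

def direct_channelize(x, h, m=4):
    """Reference: mix each channel to baseband, filter, keep every m-th sample."""
    channels = []
    for k in range(m):
        out = []
        for t in range(len(x) // m):
            acc = 0j
            for n, coeff in enumerate(h):
                idx = t * m - n
                if idx >= 0:
                    acc += coeff * x[idx] * cmath.exp(-2j * math.pi * k * idx / m)
            out.append(acc)
        channels.append(out)
    return channels

# Toy demo: 8-tap moving average as prototype (2 taps per branch) and a real
# tone at fs/4, which is the centre of channel 1 (its image lands in channel 3).
h = [1.0] * 8
x = [math.cos(2 * math.pi * n / 4) for n in range(64)]
channels = polyphase_channelize(x, h)
print([round(abs(ch[-1]), 3) for ch in channels])
```

The polyphase form computes each filter tap once per output sample rather than once per channel, which is what lets the four filters run at a quarter of the input rate and share a single small transform.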

Design in an Integrated System-Level DSP Environment. [Logo: Elanix, Incorporated.] With the integration of DSP system-level tools it is now possible to automatically target Xilinx FPGAs and get an efficient FPGA design implementation. Specify the design as a block diagram, use the system modeling tools to verify that it is mathematically correct, and then optimize the bit-widths to the minimal values that still meet the system specification. These minimal bit-widths allow the design to fit in a smaller FPGA device, reducing cost. A list of cores with optimal core parameters is passed from the system-level tool to the Xilinx CORE Generator. The cores are then generated using Smart-IP Technology for an efficient implementation with predictable performance. The design is then downloaded to the Xilinx device on your board for verification. 5
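
The bit-width optimization step can be illustrated with a small search. This is not how the system-level tool works internally (the slides do not say); it is a sketch that assumes signed fixed-point coefficients and uses the sum of coefficient rounding errors as a worst-case bound on the frequency-response error. The coefficient values and the tolerance are hypothetical.

```python
def quantize(coeffs, bits):
    """Round coefficients to signed fixed point: 1 sign bit, bits-1 fraction bits."""
    scale = 1 << (bits - 1)
    lo, hi = -scale, scale - 1
    return [max(lo, min(hi, round(c * scale))) / scale for c in coeffs]

def worst_case_response_error(coeffs, bits):
    """Upper bound on |H(w) - Hq(w)| over all frequencies.

    The response error is a sum of coefficient errors times unit-magnitude
    phasors, so it can never exceed the sum of the absolute errors.
    """
    return sum(abs(c - q) for c, q in zip(coeffs, quantize(coeffs, bits)))

def minimal_bits(coeffs, max_error, max_bits=24):
    """Smallest word length whose worst-case error meets the tolerance."""
    for bits in range(2, max_bits + 1):
        if worst_case_response_error(coeffs, bits) <= max_error:
            return bits
    raise ValueError("tolerance not reachable within max_bits")

# Hypothetical 5-tap low-pass coefficients and a 1e-3 (60 dB) error budget.
coeffs = [0.05, 0.2, 0.5, 0.2, 0.05]
bits = minimal_bits(coeffs, max_error=1e-3)
print(bits, worst_case_response_error(coeffs, bits))
```

A real flow would check the full system specification in simulation rather than a single bound, but the principle is the same: stop at the narrowest width that still passes.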

Performance Through Parallel Processing. [Figure: a DSP processor (CPU and MAC(s), RAM, ROM, peripherals) that time-shares 1, 2 or 4 MACs, next to a Xilinx FPGA with as many MACs in parallel as you need.] FPGAs and cores deliver parallel processing performance that is not possible to achieve with a DSP processor. Processors can do one (or at most two) multiply-accumulates at a time. FPGAs can do many MACs in parallel. Most of the die area in a DSP processor is used to keep a single multiplier (ALU) busy. Wide, highly loaded buses move data and instructions through the chip, and this consumes extra power. The Xilinx FPGA architecture is scalable to take advantage of new process technology. If more multiply-accumulates are required, simply use a larger FPGA device. 6
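
The difference between time-sharing a few MAC units and running many in parallel can be captured in a simple throughput model. This is an idealized sketch: it assumes one MAC per unit per clock cycle, ignores memory and control overhead, and does not model how any particular core is implemented. The 100 MHz clock is a hypothetical value, not a figure from the slides.

```python
import math

def cycles_per_output(taps, mac_units):
    """Clock cycles needed to compute one FIR output sample."""
    return math.ceil(taps / mac_units)

def max_sample_rate(clock_hz, taps, mac_units):
    """Highest sustainable sample rate for a FIR filter of the given length."""
    return clock_hz / cycles_per_output(taps, mac_units)

CLOCK_HZ = 100_000_000  # hypothetical clock rate for both devices
TAPS = 32               # same length as the filters in the channelizer example

for units in (1, 2, 4, 32):
    rate = max_sample_rate(CLOCK_HZ, TAPS, units)
    print(f"{units:2d} MAC unit(s): {rate / 1e6:.3f} MHz")
```

With one MAC unit the 32-tap filter tops out at about 3 MHz in this model, while 32 parallel units sustain the full clock rate, which is the scaling argument the slide makes.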

Greater than 10x DSP uP Performance. [Figure: bar chart "16-bit FIR Filter Benchmark", vertical axis in Giga-MACs from 1 to 5, with bars for a high-performance DSP uP, S30, S40, 4036, 4062, 4085 and 40125. Caption: Xilinx has the best architecture for high-performance DSP.] When you put FPGAs and cores together, this is what you get. Using a standard 16-bit FIR filter as a benchmark, FPGAs can achieve at least 10 times the performance of the most advanced DSP processor at a fraction of the cost per performance unit. The new low-cost Spartan FPGAs can achieve comparable performance at one tenth the cost. The extra bonus is that this comes with a simpler design flow, so that your product gets to market ahead of the competition. 7

High Performance at a Fraction of the Cost. [Figure: chart with a Giga-MACs axis (ticks at 0.4, 0.8, 1.2 and 1.6) comparing the S30 (1 extra uP), the S40 (2 extra uPs) and the 4036 (3 extra uPs), with price labels $192* and $20*. *Prices based on 50,000 pcs.] Use Xilinx FPGAs instead of additional DSP processors. There is a dramatic cost saving when you compare the component cost of a Xilinx FPGA with the cost of a high-performance DSP processor. A Spartan S40 FPGA is only $20, but it can do the work of two DSP processors. The 4036 is a relatively small device in the XC4000 family, but in many applications it can do the work of three high-performance DSP processors. In addition, FPGAs do not require the additional components that processors need to complete a system (memory, I/O, FPGA glue). 8

… And with Faster Time-To-Market. [Figure: development time (1 to 6 months) plotted against the number of DSP processors. The "Multi-Processor: Code Required" curve is compared with "FPGA: No-Coding Required" for the Xilinx S40, 4036, 4044 and 4062.] The real-time software required for multiple-processor applications slows down development schedules. As more performance is needed, more processors are required, and the development time increases rapidly. With Xilinx, the development can be done with fewer engineers and in less time, because the design process is simpler and has fewer steps. You can deliver your product on time and at a lower development cost. 9

Xilinx has all the Pieces. Add a Xilinx FPGA, not more processors: a $20 Spartan programmable device (S40); the MAC rate of two high-performance DSP processors; a simple, fast design process; up to 80% less power dissipation with FPGAs. The end result is that the channelizer example fits in a single $20 Spartan FPGA and consumes only a small fraction of the power of the two high-end $100 DSP processors that it replaces. And you do not have to write the complex real-time software that is needed with multiple processors. Get to market faster with a better product at a lower cost. Add a Xilinx FPGA, not multiple processors. 10
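
The component-cost comparison quoted here reduces to a one-line calculation. The prices and processor count are the ones stated in the slides; the function name is only illustrative, and the result covers component cost alone, not memory, I/O or development effort.

```python
def component_cost_ratio(processor_price, processor_count, fpga_price):
    """How many times more the replaced processors cost than the FPGA."""
    return processor_price * processor_count / fpga_price

# From the slides: two $100 DSP processors replaced by one $20 Spartan S40.
print(component_cost_ratio(processor_price=100, processor_count=2, fpga_price=20))  # 10.0
```

This matches the "one tenth the cost" figure given for the Spartan family earlier in the presentation.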