Xilinx CPLD Fitter Advanced Optimization


CPLD Training Course
Optimizing for Speed and Density

Objectives
- Understand the capabilities of the Advanced Optimization tab in the Implementation Options dialog box
- Learn good strategies for optimizing designs for speed and density

Agenda
- Parallel and Series Logic
- Optimizing for Speed
- Optimizing for Density
- Examples
- Optimize Speed and Density Templates
- Summary

Parallel Logic
[Figure labels: Borrowed Pterms, Macrocell Pterms, D/T]
- Parallel logic occurs when the Fitter maximizes the use of the Product Term Allocator (also referred to as flattening)
- Parallel logic improves the performance of a design by reducing the number of levels of logic required
- However, this also effectively groups logic together in the same function block
- Grouping logic can decrease the utilization of any function block
- Parallel logic usually requires a large number of the function block inputs

Series Logic
[Figure labels: Feedback, Macrocell Pterms, D/T]
- Series logic borrows very few product terms from the Product Term Allocator (also referred to as deepening)
- Series logic is easy to fit inside an XC9500/XL device because it does not borrow much logic
- Series logic requires multiple levels of combinatorial logic
- Series implementations are slower than parallel logic because they use more feedback resources

Advanced Tab
Collapsing Product Term Limit
- Controls collapsing of multi-level logic. Multi-level logic is flattened until the product term limit is reached
- Raising the limit can increase speed at the expense of density
- Has no effect (set to 90) during density optimization
Collapsing Input Limit
- Controls collapsing of multi-level logic. Multi-level logic is collapsed until the input limit is reached
- Lowering the limit can improve density at the expense of speed
- Default limit is equal to the maximum number of function block inputs in the device
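The effect of the two collapsing limits can be illustrated with a toy model. The sketch below is not the Fitter's algorithm; it only assumes that collapsing means substituting one sum-of-products equation into another and accepting the result if it stays within both limits. The `Node` class, the function names, and the defaults (20 product terms, 36 inputs) are hypothetical choices for illustration, and complemented signals are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A sum-of-products equation; each product term is a frozenset of input names."""
    name: str
    pterms: tuple


def inputs(node):
    """Distinct signals feeding the equation."""
    return set().union(*node.pterms)


def collapse(parent, child):
    """Flatten: substitute child's product terms into every parent term that uses it."""
    out = []
    for term in parent.pterms:
        if child.name in term:
            rest = term - {child.name}
            out.extend(rest | c for c in child.pterms)
        else:
            out.append(term)
    # Drop duplicate product terms while keeping their order.
    return Node(parent.name, tuple(dict.fromkeys(out)))


def collapse_if_allowed(parent, child, pterm_limit=20, input_limit=36):
    """Keep the flattened form only if it respects both (assumed) limits."""
    flat = collapse(parent, child)
    if len(flat.pterms) <= pterm_limit and len(inputs(flat)) <= input_limit:
        return flat
    return parent  # stays as series logic: child keeps its own macrocell


# n = x + y + z ;  f = a*n + b
n = Node("n", (frozenset({"x"}), frozenset({"y"}), frozenset({"z"})))
f = Node("f", (frozenset({"a", "n"}), frozenset({"b"})))

flat = collapse_if_allowed(f, n)
print(len(flat.pterms), sorted(inputs(flat)))
```

Flattening `f` removes one level of logic but doubles its product term count from 2 to 4 and raises its input count from 3 to 5, which is the speed-versus-density trade-off the two limits control. With `pterm_limit=3` the same call leaves `f` unflattened.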

Creating Parallel Logic
- Higher speed is often desired
- Remember the trade-offs
- Flatter designs tend to be created when the following principles are adhered to:
  - Don’t over-constrain
  - Set limits to the proper level
  - Use timing optimization

Timing Constraints
- Timing constraints effectively communicate performance expectations to the compiler
- Use global constraints as a “quick and dirty” way of getting the speed necessary
  - Do not over-constrain the design
- Use signal-specific constraints to fine-tune the performance
  - Allows software to make informed product term allocation and logic collapsing decisions
- Assert the Use Timing Constraints option; otherwise constraints will be ignored
- Constraints permit the optimization of some paths and not others, which gives the tools more flexibility

Increase the Pterm Limit
- Increase the Product Term Collapsing Limit
  - Increases the flattening of multi-level logic by using the product term allocator feature
  - This uses the faster interconnect between product term allocators
  - Raise from 20 (default) to 45 or even 90 (FB pterm limit)
  - Check the Fitting Report to determine the extent to which product terms are being borrowed
- Assert the Use Timing Optimization option
  - Useful for designs that contain multi-level logic or speed-critical signals
  - This option tends to improve the slowest paths, whereas timing constraints specify which paths to improve

Creating Series Logic
- Higher density is often desired
- Remember the trade-offs
- Deep designs tend to be created when the following principles are adhered to:
  - Don’t over-constrain locations
  - Set limits to the proper level
  - Use the Advanced Fitting option

Reduce the Creation of Parallel Logic
- Creation of parallel logic occurs when the Product Term Allocator is used extensively
- Decrease the Collapsing Pterm Limit to map the logic into smaller chunks (force more feedback)
- Decrease the Collapsing Input Limit to reduce the amount of logic in some function blocks

Use the Advanced Fitting Option
- This is a different partitioning algorithm that places functions that share inputs into the same function block
- Use this option if the design becomes function block input limited
- The Advanced Fitting option will not impact performance
- This option is on by default

Use the KEEP Attribute
- Use this attribute on high-fanout product terms or input-intensive nodes
- Overrides the product term and function block input collapsing limits set in the GUI
- Boolean logic reduction is still performed
[Figure labels: KEEP, 12 Product Term Implementation, 6 Product Term Implementation]

Global Resources (The Highest Payoff for Speed and Density)
- Use global control signals
  - Using global clock, output enable, and set/reset nets saves function block inputs and local product terms
  - Assign high-fanout control signals generated in macrocells to global nets
[Figure labels: FF0 through FF8, BUFG=OE]

Function Block Input Limited Design
******** Resources Required By Unmapped Logic and Pins ***********
** Logic
Signal  Total  Signals  Pwr   Slew
Name    PT     Used     Mode  Rate
EXIT    12     19       STD
DPCS    10     19       STD
IO      10     14       STD
LBE1    10     13       STD
SP      10     14       STD
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       13      36         39       47       0/0   17
FB2       12      36         37       56       10/0  17
FB3       14      36         36       73       5/1   17
FB4       11      36         38       36       7/0   17
FB5       12      36         40       52       8/1   17
FB6       10      36         38       27       7/0   16
FB7       12      36         35       60       10/1  16
FB8       10      36         37       43       5/3   16
          94                          363      52/6  133

Solution
- First, use the Advanced Fitting option
- If no improvement is seen, gradually reduce the function block input collapse limit to reduce the creation of parallel logic
- Macrocell count increases when the collapse limit decreases
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       11      36         41       47       0/0   17
FB2       14      36         43       56       10/0  17
FB3       18      36         37       73       5/1   17
FB4       11      36         41       36       7/0   17
FB5       14      36         36       52       8/1   17
FB6       10      35         37       27       7/0   16
FB7       13      36         39       60       10/1  16
FB8       18      36         36       43       5/3   16
          109                         363      52/6  133
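Spotting an input-limited design means scanning the FB Inputs Used column for blocks sitting at the device maximum. A minimal sketch of that check follows, assuming the summary rows are whitespace-separated exactly as on this slide (a real fitting report may be laid out differently) and that 36 is the function block input maximum, as the example reports suggest. `parse_fb_summary` and `input_saturated` are hypothetical helper names, not part of any Xilinx tool.

```python
import re

# One summary row: block, macrocells, FB inputs used, signals used,
# total product terms, O/IO required, I/O available.
ROW = re.compile(
    r"^(FB\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/(\d+)\s+(\d+)$"
)


def parse_fb_summary(text):
    """Extract per-function-block figures from summary rows; skip other lines."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append({
                "block": m.group(1),
                "mcells": int(m.group(2)),
                "fb_inputs": int(m.group(3)),
                "signals": int(m.group(4)),
                "pterms": int(m.group(5)),
            })
    return rows


def input_saturated(rows, fb_input_max=36):
    """Names of function blocks that use every available input (assumed max: 36)."""
    return [r["block"] for r in rows if r["fb_inputs"] >= fb_input_max]


# Rows copied from the summary after enabling the Advanced Fitting option.
REPORT = """\
FB1 11 36 41 47 0/0 17
FB2 14 36 43 56 10/0 17
FB3 18 36 37 73 5/1 17
FB4 11 36 41 36 7/0 17
FB5 14 36 36 52 8/1 17
FB6 10 35 37 27 7/0 16
FB7 13 36 39 60 10/1 16
FB8 18 36 36 43 5/3 16
"""

rows = parse_fb_summary(REPORT)
print(input_saturated(rows))
print(sum(r["mcells"] for r in rows))
```

On this data only FB6 has a spare input, and the macrocell total is 109, matching the totals row of the summary.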

Product Term Limited Design
******** Resources Required By Unmapped Logic and Pins ***********
** Logic
Signal  Total  Signals  Pwr   Slew
Name    PT     Used     Mode  Rate
EXIT    12     10       STD
DPCS    20     19       STD
IO      17     14       STD
LBE1    19     13       STD
SP      10     14       STD
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       13      34         39       74       0/0   17
FB2       12      30         37       87       10/0  17
FB3       14      26         26       73       5/1   17
FB4       11      17         27       75       7/0   17
FB5       12      27         27       84       8/1   17
FB6       10      30         30       84       7/0   16
FB7       12      29         29       83       10/1  16
FB8       10      20         20       83       5/3   16
          94                          643      52/6  133

Solution
- Gradually reduce the Product Term Collapse Limit
- The number of macrocells used will increase
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       11      36         41       28       0/0   17
FB2       14      36         43       45       10/0  17
FB3       18      36         37       46       5/1   17
FB4       11      36         41       47       7/0   17
FB5       14      36         36       54       8/1   17
FB6       10      35         37       46       7/0   16
FB7       13      36         39       45       10/1  16
FB8       18      36         36       78       5/3   16
          109                         389      52/6  133

Function Block and Product Term Limited
******** Resources Required By Unmapped Logic and Pins ***********
** Logic
Signal  Total  Signals  Pwr   Slew
Name    PT     Used     Mode  Rate
EXIT    12     19       STD
DPCS    10     19       STD
IO      10     14       STD
LBE1    10     13       STD
SP      10     14       STD
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       13      36         39       74       0/0   17
FB2       12      36         37       87       10/0  17
FB3       14      36         36       73       5/1   17
FB4       11      36         38       75       7/0   17
FB5       12      36         40       84       8/1   17
FB6       10      36         38       84       7/0   16
FB7       12      36         35       83       10/1  16
FB8       10      36         37       83       5/3   16
          94                          643      52/6  133

Solution
- First, use the Advanced Fitting option
- Second, fit the design with Timing Optimization OFF
- Third, reduce the FB Input Collapse Limit
  - Number of macrocells will increase
- Finally, reduce the Product Term Collapse Limit
**************** Function Block Resource Summary ****************
Function  # of    FB Inputs  Signals  Total    O/IO  IO
Block     MCells  Used       Used     PT Used  Req   Avail
FB1       11      36         41       28       0/0   17
FB2       14      36         43       45       10/0  17
FB3       18      36         37       46       5/1   17
FB4       11      36         41       47       7/0   17
FB5       14      36         36       54       8/1   17
FB6       10      35         37       46       7/0   16
FB7       13      36         39       45       10/1  16
FB8       18      36         36       78       5/3   16
          109                         389      52/6  133
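The three example cases and their remedies can be condensed into a small decision helper. This is a rough sketch under stated assumptions, not how the Fitter classifies a design: it treats a design as input limited when any function block uses all 36 inputs, and as product term limited when total product term usage reaches 80% of the device capacity (90 per function block). The 80% threshold and the function names are hypothetical; the remedy lists are taken from the Solution slides.

```python
def diagnose(rows, fb_input_max=36, fb_pterm_max=90, pterm_pressure=0.8):
    """rows: one (fb_inputs_used, total_pt_used) tuple per function block.

    Returns (input_limited, pterm_limited) using the assumed heuristics above.
    """
    input_limited = any(fb_inputs >= fb_input_max for fb_inputs, _ in rows)
    total_pterms = sum(pterms for _, pterms in rows)
    capacity = fb_pterm_max * len(rows)
    pterm_limited = bool(rows) and total_pterms >= pterm_pressure * capacity
    return input_limited, pterm_limited


def recommend(input_limited, pterm_limited):
    """Ordered remedies as listed on the Solution slides."""
    if input_limited and pterm_limited:
        return [
            "Use the Advanced Fitting option",
            "Fit the design with Timing Optimization OFF",
            "Reduce the FB Input Collapse Limit",
            "Reduce the Product Term Collapse Limit",
        ]
    if input_limited:
        return [
            "Use the Advanced Fitting option",
            "Gradually reduce the FB Input Collapse Limit",
        ]
    if pterm_limited:
        return ["Gradually reduce the Product Term Collapse Limit"]
    return []


# (FB inputs used, total PT used) per block, copied from the three example reports.
input_case = [(36, 47), (36, 56), (36, 73), (36, 36),
              (36, 52), (36, 27), (36, 60), (36, 43)]
pterm_case = [(34, 74), (30, 87), (26, 73), (17, 75),
              (27, 84), (30, 84), (29, 83), (20, 83)]
both_case = [(36, 74), (36, 87), (36, 73), (36, 75),
             (36, 84), (36, 84), (36, 83), (36, 83)]

for case in (input_case, pterm_case, both_case):
    print(recommend(*diagnose(case)))
```

A per-block product term threshold would not separate these examples, because FB3 uses 73 product terms in both the input-limited and the product-term-limited report; that is why the sketch compares the device-wide total instead.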

Choosing New Product Term Limit
******** Resources Used by Successfully Mapped Logic ************
Signal  Total  Signals  Loc    PWR   Slew  Pin
Name    PT     Used            Mode  Rate  #
Q0      1      8        FB7_5  STD   FAST  19
Q1      5      7        FB3_1  STD   FAST  35
Q2      3      10       FB6_5  STD   FAST  75
Q2      7      3        FB3_1  STD   FAST  160
Q4      15     7        FB5_8  STD   FAST  100
******** Resources Required By Unmapped Logic and Pins ***********
** Logic
Signal  Total  Signals  Pwr   Slew
Name    PT     Used     Mode  Rate
EXIT    8      19       STD
DPCS    7      19       STD
IO      2      14       STD
LBE1    4      13       STD
SP      5      14       STD
- Reduce limits below requirements of largest implemented equations
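As a worked example of that rule, the next candidate limit can be read off the mapped-logic table. The sketch below assumes "below the requirements of the largest implemented equation" means one less than the largest value currently in use, which is the smallest possible step and in keeping with the advice to reduce limits gradually; the slides do not say how far below to go. `next_lower_limit` is a hypothetical helper name.

```python
def next_lower_limit(requirements, floor=1):
    """Largest limit that is still below the biggest implemented equation."""
    return max(floor, max(requirements) - 1)


# (signal name, total PT, signals used) copied from the mapped-logic table.
mapped = [
    ("Q0", 1, 8),
    ("Q1", 5, 7),
    ("Q2", 3, 10),
    ("Q2", 7, 3),
    ("Q4", 15, 7),
]

pterm_limit = next_lower_limit([pt for _, pt, _ in mapped])
input_limit = next_lower_limit([sig for _, _, sig in mapped])
print(pterm_limit, input_limit)
```

Here the widest mapped equation (Q4) uses 15 product terms, so the next Product Term Collapse Limit to try is 14; the equation with the most inputs uses 10 signals, so the corresponding input limit to try is 9.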

Summary
- Timing constraints are the most effective way to obtain good performance
- Raising the Product Term Collapsing Limit increases the creation of parallel logic, which improves the performance of some designs
- Use of global resources frees product terms for other uses, which pays off in both speed and density
- Reducing the Product Term Collapsing Limit will reduce the performance of some modules but will improve density
- Use the KEEP attribute to save product terms on high-fanout nets