ESE535: Electronic Design Automation

Slides:



Advertisements
Similar presentations
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 19: November 21, 2005 Scheduling Introduction.
Advertisements

ECE 667 Synthesis and Verification of Digital Circuits
FPGA Technology Mapping Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 15: March 12, 2007 Interconnect 3: Richness.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 23, 2008 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 21: April 15, 2009 Routing 1.
EDA (CS286.5b) Day 6 Partitioning: Spectral + MinCut.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 7: February 11, 2008 Static Timing Analysis and Multi-Level Speedup.
EDA (CS286.5b) Day 11 Scheduling (List, Force, Approximation) N.B. no class Thursday (FPGA) …
EDA (CS286.5b) Day 3 Clustering (LUT Map and Delay) N.B. no lecture Thursday.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 16: March 23, 2009 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 12: March 2, 2009 Sequential Optimization (FSM Encoding)
CS294-6 Reconfigurable Computing Day 15 October 13, 1998 LUT Mapping.
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 18, 2009 Partitioning 2 (spectral, network flow)
EDA (CS286.5b) Day 19 Covering and Retiming. “Final” Like Assignment #1 –longer –more breadth –focus since assignment #2 –…but ideas are cummulative –open.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 3: January 27, 2008 Clustering (LUT Mapping, Delay) Please work preclass example.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 17: March 30, 2009 Clustering (LUT Mapping, Delay)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 18, 2009 Static Timing Analysis and Multi-Level Speedup.
FPGA Technology Mapping. 2 Technology mapping:  Implements the optimized nodes of the Boolean network to the target device library.  For FPGA, library.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 13, 2008 Retiming.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 26, 2015 Covering Work preclass exercise.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 9: February 16, 2015 Scheduling Variants and Approaches.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 18, 2011 Covering and Retiming.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 10: February 18, 2015 Architecture Synthesis (Provisioning, Allocation)
Penn ESE525 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: February 4, 2014 Partitioning 2 (spectral, network flow)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 23: April 20, 2015 Static Timing Analysis and Multi-Level Speedup.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 3: January 12, 2004 Clustering (LUT Mapping, Delay)
Lecture 6: Mapping to Embedded Memory and PLAs September 27, 2004 ECE 697F Reconfigurable Computing Lecture 6 Mapping to Embedded Memory and PLAs.
Technology Mapping. 2 Technology mapping is the phase of logic synthesis when gates are selected from a technology library to implement the circuit. Technology.
CALTECH CS137 Spring DeHon 1 CS137: Electronic Design Automation Day 5: April 12, 2004 Covering and Retiming.
Give qualifications of instructors: DAP
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 20: April 4, 2011 Static Timing Analysis and Multi-Level Speedup.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 13, 2013 High Level Synthesis II Dataflow Graph Sharing.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2015 Clustering (LUT Mapping, Delay)
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 2: September 28, 2005 Covering.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: January 31, 2011 Scheduling Variants and Approaches.
CS137: Electronic Design Automation
VLSI Physical Design Automation
CS137: Electronic Design Automation
CS137: Electronic Design Automation
CS137: Electronic Design Automation
Lectures on Network Flows
ESE535: Electronic Design Automation
CS137: Electronic Design Automation
ESE535: Electronic Design Automation
Greedy Algorithms / Interval Scheduling Yin Tat Lee
Reconfigurable Computing
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE534: Computer Organization
ESE534: Computer Organization
ESE535: Electronic Design Automation
SAT-Based Optimization with Don’t-Cares Revisited
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
Topics Logic synthesis. Placement and routing..
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
CS184a: Computer Architecture (Structures and Organization)
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
Technology Mapping I based on tree covering
VLSI CAD Flow: Logic Synthesis, Placement and Routing Lecture 5
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
CS137: Electronic Design Automation
Fast Min-Register Retiming Through Binary Max-Flow
CprE / ComS 583 Reconfigurable Computing
CS137: Electronic Design Automation
Presentation transcript:

ESE535: Electronic Design Automation Day 7: February 7, 2011 Clustering (LUT Mapping, Delay) Penn ESE535 Spring 2011 -- DeHon

Today How do we map to LUTs? What happens when Lessons… IO dominates Behavioral (C, MATLAB, …) RTL Gate Netlist Layout Masks Arch. Select Schedule FSM assign Two-level, Multilevel opt. Covering Retiming Placement Routing Today How do we map to LUTs? What happens when IO dominates Delay dominates? Lessons… for non-LUTs for delay-oriented partitioning Penn ESE535 Spring 2011 -- DeHon

LUT Mapping Problem: Map logic netlist to LUTs Old problem? minimizing area minimizing delay Old problem? Technology mapping? (Day 2) How big is the library? 22K gates in library Penn ESE535 Spring 2011 -- DeHon

Simplifying Structure K-LUT can implement any K-input function Penn ESE535 Spring 2011 -- DeHon

Cost Function Delay: number of LUTs in critical path doesn’t say delay in LUTs or in wires does assume uniform interconnect delay Area: number of LUTs Assumes adequate interconnect to use LUTs Penn ESE535 Spring 2011 -- DeHon

LUT Mapping NP-Hard in general Fanout-free -- can solve optimally given decomposition (but which one?) Delay optimal mapping achievable in Polynomial time Area w/ fanout NP-complete Penn ESE535 Spring 2011 -- DeHon

Preliminaries What matters/makes this interesting? Area / Delay target Decomposition Fanout replication reconvergent Penn ESE535 Spring 2011 -- DeHon

Costs: Area vs. Delay Penn ESE535 Spring 2011 -- DeHon

Decomposition Penn ESE535 Spring 2011 -- DeHon

Decomposition Penn ESE535 Spring 2011 -- DeHon

Fanout: Replication Penn ESE535 Spring 2011 -- DeHon

Fanout: Replication Penn ESE535 Spring 2011 -- DeHon

Fanout: Reconvergence Penn ESE535 Spring 2011 -- DeHon

Fanout: Reconvergence Penn ESE535 Spring 2011 -- DeHon

Monotone Property Does cost function increase monotonicly as more of the graph is included? (do all subsets have property) gate count? I/o? Important? How far back do we need to search? Penn ESE535 Spring 2011 -- DeHon

Definition Cone: set of nodes in the recursive fanin of a node Penn ESE535 Spring 2011 -- DeHon

Example Cones Penn ESE535 Spring 2011 -- DeHon

Delay Penn ESE535 Spring 2011 -- DeHon

Dynamic Programming Optimal covering of a logic cone is: Minimum cost (all possible coverings) Evaluate costs of each node based on: cover node cones covering each fanin to node cover Evaluate node costs in topological order Key: are calculating optimal solutions to subproblems only have to evaluate covering options at each node Penn ESE535 Spring 2011 -- DeHon

Flowmap Key Idea: LUT holds anything with K inputs Use network flow to find cuts  logic can pack into LUT including reconvergence …allows replication Optimal depth arise from optimal depth solution to subproblems Penn ESE535 Spring 2011 -- DeHon

MaxFlow Set all edge flows to zero While there is a path from s,t F[u,v]=0 While there is a path from s,t (breadth-first-search) for each edge in path f[u,v]=f[u,v]+1 f[v,u]=-f[u,v] When c[v,u]=f[v,u] remove edge from search O(|E|*cutsize) Penn ESE535 Spring 2011 -- DeHon

Flowmap Delay objective: Height of node will be: Check shorter first minimum height, K-feasible cut I.e. cut no more than K edges start by bounding fanin  K Height of node will be: height of predecessors or one greater than height of predecessors Check shorter first 1 1 1 1 2 Penn ESE535 Spring 2011 -- DeHon

Flowmap Construct flow problem sink  target node being mapped source  start set (primary inputs) flow infinite into start set flow of one on each link to see if height same as predecessors collapse all predecessors of maximum height into sink (single node, cut must be above) height +1 case is trivially true Penn ESE535 Spring 2011 -- DeHon

Example Subgraph Target: K=4 1 1 1 1 2 2 Penn ESE535 Spring 2011 -- DeHon

Trivial: Height +1 1 1 1 1 2 2 3 Penn ESE535 Spring 2011 -- DeHon

Collapse at max height 1 1 1 1 2 2 Penn ESE535 Spring 2011 -- DeHon

Collapse at max height 1 1 1 1 2 2 Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Augmenting Flows Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Augmenting Flows Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Augmenting Flows Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Augmenting Flows Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Augmenting Flows Collapsed Node Collapsed Node Penn ESE535 Spring 2011 -- DeHon

Collapse at max height 2 Collapsed Node 1 1 1 1 2 2 Penn ESE535 Spring 2011 -- DeHon

(different/larger graph) Collapse not work (different/larger graph) 1 1 1 1 2 2 Forced to label height+1 2 Penn ESE535 Spring 2011 -- DeHon

Reconvergent fanout (yet different graph) Can label at height 1 1 1 1 2 Can label at height 2 Penn ESE535 Spring 2011 -- DeHon

Flowmap Max-flow Min-cut algorithm to find cut Use augmenting paths until discover max flow > K O(K|e|) time to discover K-feasible cut (or that does not exist) Depth identification: O(KN|e|) Penn ESE535 Spring 2011 -- DeHon

Mincut may not be unique 2 Collapsed Node 1 1 1 1 2 2 Penn ESE535 Spring 2011 -- DeHon

Flowmap Min-cut may not be unique To minimize area achieving delay optimum find max volume min-cut Compute max flow  find min cut remove edges consumed by max flow DFS from source Compliment set is max volume set Penn ESE535 Spring 2011 -- DeHon

Graph Penn ESE535 Spring 2011 -- DeHon

Graph: maxflow (K=4) Penn ESE535 Spring 2011 -- DeHon

Graph: BFS source Reachable Penn ESE535 Spring 2011 -- DeHon

Max Volume min-cut Penn ESE535 Spring 2011 -- DeHon

Flowmap Covering from labeling is straightforward Notes: process in reverse topological order allocate identified K-feasible cut to LUT remove node postprocess to minimize LUT count Notes: replication implicit (covered multiple places) nodes purely internal to one or more covers may not get their own LUTs Penn ESE535 Spring 2011 -- DeHon

Flowmap Roundup Label Cover Work from inputs to outputs Find max label of predecessors Collapse new node with all predecessors at this label Can find flow cut  K? Yes: mark with label (find max-volume cut extent) No: mark with label+1 Cover Work from outputs to inputs Allocate LUT for identified cluster/cover Recurse covering selection on inputs to identified LUT Penn ESE535 Spring 2011 -- DeHon

Area Penn ESE535 Spring 2011 -- DeHon

DF-Map Duplication Free Mapping can find optimal area under this constraint (but optimal area may not be duplication free) [Cong+Ding, IEEE TR VLSI Sys. V2n2p137] Penn ESE535 Spring 2011 -- DeHon

Maximum Fanout Free Cones MFFC: bit more general than trees Penn ESE535 Spring 2011 -- DeHon

MFFC Follow cone backward end at node that fans out (has output) outside the code Penn ESE535 Spring 2011 -- DeHon

MFFC example Penn ESE535 Spring 2011 -- DeHon

MFFC example Penn ESE535 Spring 2011 -- DeHon

DF-Map Partition into graph into MFFCs Optimally map each MFFC In dynamic programming for each node examine each K-feasible cut note: this is very different than flowmap where only had to examine a single cut pick cut to minimize cost 1 +  MFFCs for fanins Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Cones? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Start mapping cone Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 ? 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 ? 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 ? 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 ? 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example Similar to previous 1 1 1 1 1 1 1 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 2 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 3 1 1 1 1 1 1 1 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 3 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 3 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 1 1 1 1 1 3 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 1 1 1 1 1 3 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 3 1 1 3 1 1 1 1 1 3 2 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 3 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 3 1 1 1 1 1 4 2 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 3 1 1 1 1 1 4 2 3 ? 4 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 3 1 1 1 1 1 4 2 3 ? 4 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 3 3 1 1 1 1 1 4 2 3 ? 3 4 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 9 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 9 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 7 2 3 3 9 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 7 2 3 3 9 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 7 5 2 3 3 9 ? Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 5 Penn ESE535 Spring 2011 -- DeHon

DF-Map Example 1 1 1 1 1 1 1 2 3 3 5 Penn ESE535 Spring 2011 -- DeHon

Lecture Ended Here Penn ESE535 Spring 2011 -- DeHon

Composing Don’t need minimum delay off the critical path Don’t always want/need minimum delay Composite: map with flowmap Greedy decomposition of “most promising” non-critical nodes DF-map these nodes Penn ESE535 Spring 2011 -- DeHon

Variations on a Theme Penn ESE535 Spring 2011 -- DeHon

Applicability to Non-LUTs? E.g. LUT Cascade can handle some functions of K inputs How apply? Penn ESE535 Spring 2011 -- DeHon

Adaptable to Non-LUTs Sketch: Initial decomposition to nodes that will fit Find max volume, min-height K-feasible cut ask if logic block will cover yes  done no  exclude one (or more) nodes from block and repeat exclude == collapse into start set nodes this makes heuristic Penn ESE535 Spring 2011 -- DeHon

Partitioning? Effectively partitioning logic into clusters LUT cluster unlimited internal “gate” capacity limited I/O (K) simple delay cost model 1 cross between clusters 0 inside cluster Penn ESE535 Spring 2011 -- DeHon

Partitioning Clustering if strongly I/O limited, same basic idea works for partitioning to components typically: partitioning onto multiple FPGAs assumption: inter-FPGA delay >> intra-FPGA delay w/ area constraints similar to non-LUT case make min-cut will it fit? Exclude some LUTs and repeat Penn ESE535 Spring 2011 -- DeHon

Clustering for Delay W/ no IO constraint area is monotone property DP-label forward with delays grab up largest labels (greatest delays) until fill cluster size Work backward from outputs creating clusters as needed Penn ESE535 Spring 2011 -- DeHon

Area and IO? Real problem: Doing both optimally is NP-hard FPGA/chip partitioning Doing both optimally is NP-hard Heuristic around IO cut first should do well (e.g. non-LUT slide) [Yang and Wong, FPGA’94] Penn ESE535 Spring 2011 -- DeHon

Partitioning To date: Open/promising primarily used for 2-level hierarchy I.e. intra-FPGA, inter-FPGA Open/promising adapt to multi-level for delay-optimized partitioning/placement on fixed-wire schedule localize critical paths to smallest subtree possible? Penn ESE535 Spring 2011 -- DeHon

Summary Optimal LUT mapping NP-hard in general fanout, replication, …. K-LUTs makes delay optimal feasible single constraint: IO capacity technique: max-flow/min-cut Heuristic adaptations of basic idea to capacity constrained problem promising area for interconnect delay optimization Penn ESE535 Spring 2011 -- DeHon

Admin Assignment 1 (Graded) Assignment 2 Reading Wed. on web Hope you are into programming Office hour T4:30pm I hope to give 2a quick look soon Reading Wed. on web Penn ESE535 Spring 2011 -- DeHon

Today’s Big Ideas: IO may be a dominant cost Exploit structure: K-LUTs limiting capacity, delay Exploit structure: K-LUTs Mixing dominant modes multiple objectives Define optimally solvable subproblem duplication free mapping Penn ESE535 Spring 2011 -- DeHon