CS244-Introduction to Embedded Systems and Ubiquitous Computing Instructor: Eli Bozorgzadeh Computer Science Department UC Irvine Winter 2010.

Presentation transcript:


CS244 – Lecture 5: Hardware/Software Co-design

Review: Design Objectives
[Figure: performance, cost, and quality plotted against their thresholds, with the "better" direction marked]
- Improving quality beyond its threshold is desired
- Improving performance beyond its threshold is a waste
- Improving cost is desired

Co-design Flow
[Flow diagram: Informal Specification, Algorithmic Design, System Model, System Simulation, Hardware/Software Partitioning, Partitioned Model, Schedule, Partitioned Model & Schedule, HW/SW Co-simulation; feedback edges labeled "Refine"]

Co-design Flow (continued)
[Flow diagram: Partitioned Model + Schedule, Communication Synthesis, Software Model and Hardware Model, HW/SW Co-simulation, Compilation and Synthesis, Binary Executable Model and Gate-level Model, HW/SW Co-simulation; feedback edges labeled "Refine"]

Co-design Flow (continued)
[Flow diagram: Gate-level Model and Binary Executable Model, Emulate or Prototype, Fabrication; feedback edge labeled "Refine"]

Informal Specification & System-Level Model
- The informal specification loosely defines the high-level behavior, constraints, and optimization objectives of the system
  - Algorithmic and implementation details are absent
  - Performance estimates are not present
- The system-level model formally captures behavior, constraints, and optimization objectives
  - Can be simulated to obtain early performance estimates
    - Feedback is used to refine the system specification
  - Can serve as a golden model for validation of intermediate or final stages
[Diagram label: Algorithmic design]

Hardware/Software Partitioning
- Decompose (i.e., partition) the function F of the system into N sub-functions F_1, F_2, F_3, …, F_N
- Decompose the constraints and design objectives of the system into sub-constraints and design sub-objectives
- Cluster F_1, F_2, F_3, …, F_N into M partitions to run on M processors
[Figure: F is split into {F_1, F_2, F_3, …, F_N}, which are mapped onto processors P_1, P_2, P_3, …, P_M]

Scheduling
- Scheduling obtains an execution sequence such that dependencies are obeyed
- Static
  - The schedule is fixed at design time (the common case)
- Dynamic
  - The schedule is determined at execution time (reconfigurable computing)
[Figure: task graph over F1 … F8]
- Example schedule
  - P1: F1 → F2 → F8
  - P2: F4 → F5
  - P3: F3 → F6
  - P4: F7
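A minimal sketch of checking a static schedule against its dependencies. The processor assignment is the one listed on the slide; the dependency edges and the unit execution times are assumptions, because the task-graph figure did not survive in the transcript. The function name `start_times` is illustrative only.

```python
# Hypothetical dependency graph (predecessor lists). The slide's figure is
# lost, so these edges are assumed for illustration.
preds = {
    "F1": [], "F2": ["F1"], "F3": ["F1"], "F4": ["F1"],
    "F5": ["F4"], "F6": ["F3"], "F7": ["F2"], "F8": ["F5", "F6", "F7"],
}
# Static schedule taken from the slide: execution order per processor.
schedule = {
    "P1": ["F1", "F2", "F8"],
    "P2": ["F4", "F5"],
    "P3": ["F3", "F6"],
    "P4": ["F7"],
}
exec_time = {f: 1 for f in preds}  # assumed: every function takes one time unit


def start_times(preds, schedule, exec_time):
    """Return the start time of each function under a static schedule.

    Raises ValueError if the per-processor order contradicts the dependencies.
    """
    # A function also waits for the function scheduled before it on its processor.
    all_preds = {f: list(p) for f, p in preds.items()}
    for order in schedule.values():
        for prev, nxt in zip(order, order[1:]):
            all_preds[nxt].append(prev)

    start = {}
    remaining = set(all_preds)
    while remaining:
        ready = [f for f in remaining if all(p in start for p in all_preds[f])]
        if not ready:
            raise ValueError("schedule violates the dependencies")
        for f in ready:
            start[f] = max((start[p] + exec_time[p] for p in all_preds[f]), default=0)
            remaining.discard(f)
    return start


start = start_times(preds, schedule, exec_time)
makespan = max(start[f] + exec_time[f] for f in start)
print(start, makespan)
```

Because the schedule is fixed at design time, a check like this can run once, offline. A dynamic scheduler would have to make the same decisions while the system executes.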

Scheduling
- A deadline D for the entire schedule
- An execution time T_i for each F_i
- ASAP (as soon as possible)
- ALAP (as late as possible)
[Figure: task graph over F1 … F8]
- Example schedule
  - P1: F1 → F2 → F8
  - P2: F4 → F5
  - P3: F3 → F6
  - P4: F
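ASAP and ALAP start times can be sketched as two passes over the task graph. The dependency edges, the execution times T_i, and the deadline D below are hypothetical values chosen for illustration, and processor contention is ignored, so this shows only the dependency-driven bounds.

```python
# Hypothetical task graph (predecessor lists), execution times T_i, and deadline D.
preds = {
    "F1": [], "F2": ["F1"], "F3": ["F1"], "F4": ["F1"],
    "F5": ["F4"], "F6": ["F3"], "F7": ["F2"], "F8": ["F5", "F6", "F7"],
}
T = {"F1": 1, "F2": 1, "F3": 1, "F4": 2, "F5": 2, "F6": 1, "F7": 1, "F8": 1}
D = 6


def asap(preds, T):
    """Earliest start: as soon as every predecessor has finished."""
    start = {}

    def visit(f):
        if f not in start:
            start[f] = max((visit(p) + T[p] for p in preds[f]), default=0)
        return start[f]

    for f in preds:
        visit(f)
    return start


def alap(preds, T, deadline):
    """Latest start that still lets every successor meet the deadline."""
    succs = {f: [] for f in preds}
    for f, ps in preds.items():
        for p in ps:
            succs[p].append(f)
    start = {}

    def visit(f):
        if f not in start:
            start[f] = min((visit(s) for s in succs[f]), default=deadline) - T[f]
        return start[f]

    for f in preds:
        visit(f)
    return start


early = asap(preds, T)
late = alap(preds, T, D)
slack = {f: late[f] - early[f] for f in preds}  # 0 means no scheduling freedom
print(early, late, slack)
```

A function whose ASAP and ALAP start times coincide has no freedom under the deadline. The remaining functions can be shifted within their slack, for example to share a processor.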

Partitioning (Clustering)
- Given:
  - F = { F_1, F_2, F_3, …, F_N }
  - P = { P_1, P_2, P_3, …, P_M }
- Find a lowest-cost partition (clustering), as computed by an objective function
- Exhaustive approach: O(M^N)
- Heuristics
  - Constructive partitioning (based on a closeness function)
    - Random (good for seeding iterative approaches)
    - Cluster growth
    - Hierarchical clustering
  - Iterative partitioning
    - Start with a partition and improve it
    - Gradient search
    - Controlled random search
    - Modified Kernighan/Lin and FM algorithm
      - Partitions a set of nodes (functions) into two bins (processors)
      - Minimizes edges between bins (communication cost, wires, etc.)
      - Uses a cost function for moving a node from one partition to another
    - ILP
    - Genetic evolution
    - Simulated annealing
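The exhaustive approach can be sketched by enumerating every assignment of the N functions to the M processors, which is where the M^N term comes from. The objective function below (cut communication weight plus a load-imbalance penalty) and the edge weights are assumptions made for illustration; the slides do not fix a particular objective.

```python
from itertools import product

functions = ["F1", "F2", "F3", "F4"]
processors = ["P1", "P2"]
# Hypothetical communication weights between pairs of functions.
edges = {("F1", "F2"): 3, ("F2", "F3"): 1, ("F3", "F4"): 3}


def cost(mapping):
    """Assumed objective: cut communication weight plus a load-imbalance penalty."""
    cut = sum(w for (a, b), w in edges.items() if mapping[a] != mapping[b])
    loads = [sum(1 for p in mapping.values() if p == q) for q in processors]
    return cut + 10 * (max(loads) - min(loads))


def exhaustive_partition(functions, processors, cost):
    """Try all M**N assignments and keep the cheapest one."""
    best, best_cost = None, float("inf")
    for assignment in product(processors, repeat=len(functions)):
        mapping = dict(zip(functions, assignment))
        c = cost(mapping)
        if c < best_cost:
            best, best_cost = mapping, c
    return best, best_cost


best, best_cost = exhaustive_partition(functions, processors, cost)
print(best, best_cost)
```

With N = 4 and M = 2 there are only 16 candidates, but the count grows exponentially in N, which is why the heuristics listed above are used in practice.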


Iterative Partitioning Algorithms
- The computation time in an iterative algorithm is spent evaluating large numbers of partitions
- Iterative algorithms differ from one another primarily in the ways in which they modify the partition and in which they accept or reject bad modifications
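Simulated annealing is one concrete instance of this pattern: the modification is a random reassignment of a single function, and a bad modification is accepted with a probability that shrinks as the temperature drops. The sketch below is illustrative only. The objective function, the edge weights, and the parameters `steps`, `t0`, and `alpha` are assumptions, not values from the lecture.

```python
import math
import random

functions = ["F1", "F2", "F3", "F4"]
processors = ["P1", "P2"]
edges = {("F1", "F2"): 3, ("F2", "F3"): 1, ("F3", "F4"): 3}  # hypothetical weights


def cost(mapping):
    """Assumed objective: cut communication weight plus a load-imbalance penalty."""
    cut = sum(w for (a, b), w in edges.items() if mapping[a] != mapping[b])
    loads = [sum(1 for p in mapping.values() if p == q) for q in processors]
    return cut + 10 * (max(loads) - min(loads))


def anneal(functions, processors, cost, steps=2000, t0=5.0, alpha=0.995, seed=0):
    """Iterative partitioning by simulated annealing; returns the best mapping seen."""
    rng = random.Random(seed)
    current = {f: rng.choice(processors) for f in functions}  # random seed partition
    current_cost = cost(current)
    best, best_cost = dict(current), current_cost
    temp = t0
    for _ in range(steps):
        f = rng.choice(functions)
        old = current[f]
        current[f] = rng.choice(processors)  # modify the partition
        new_cost = cost(current)
        delta = new_cost - current_cost
        # Always accept improvements; accept a bad move with probability exp(-delta/temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        else:
            current[f] = old  # reject: undo the modification
        temp *= alpha  # cool down, so bad moves become less likely over time
    return best, best_cost


best, best_cost = anneal(functions, processors, cost)
print(best, best_cost)
```

Replacing the acceptance test with `delta <= 0` alone would turn this into a plain greedy improvement search, which never accepts a bad modification and can therefore get stuck in a local minimum.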

Kernighan-Lin (Min-Cut) Algorithms
- Two-way partitioning example
  - Start with 2 equal subgraphs
  - Exchange k pairs in each iteration
  - Continue until no further improvement
- Gain function
  - A function of the internal and external costs: f(internal – external)
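A compact sketch of the two-way procedure. It uses the textbook Kernighan-Lin formulation, in which each node has a value D = external cost minus internal cost and swapping nodes x and y yields a gain of D(x) + D(y) - 2·w(x, y). The sign convention may differ from the gain function on the slide. The graph and its weights are hypothetical.

```python
# Hypothetical weighted graph: frozenset({u, v}) -> edge weight.
w = {
    frozenset(("F1", "F2")): 3,
    frozenset(("F2", "F3")): 1,
    frozenset(("F3", "F4")): 3,
}


def weight(x, y):
    return w.get(frozenset((x, y)), 0)


def cut_size(a, b):
    """Total weight of edges crossing between the two bins."""
    return sum(weight(x, y) for x in a for y in b)


def d_value(x, own, other):
    """External cost minus internal cost of node x."""
    return sum(weight(x, y) for y in other) - sum(weight(x, y) for y in own if y != x)


def kl_pass(a, b):
    """One pass: tentatively swap every pair, then keep the best prefix of k swaps."""
    a, b = set(a), set(b)
    cur_a, cur_b = set(a), set(b)    # tentative partition
    free_a, free_b = set(a), set(b)  # nodes not yet locked
    swaps, gains = [], []
    while free_a and free_b:
        gain, x, y = max(
            ((d_value(x, cur_a, cur_b) + d_value(y, cur_b, cur_a) - 2 * weight(x, y), x, y)
             for x in sorted(free_a) for y in sorted(free_b)),
            key=lambda t: t[0],
        )
        cur_a.remove(x); cur_a.add(y)
        cur_b.remove(y); cur_b.add(x)
        free_a.remove(x); free_b.remove(y)  # lock the swapped pair
        swaps.append((x, y)); gains.append(gain)
    best_k, best_gain, total = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        total += g
        if total > best_gain:
            best_k, best_gain = k, total
    for x, y in swaps[:best_k]:  # commit only the first k exchanges
        a.remove(x); a.add(y)
        b.remove(y); b.add(x)
    return a, b, best_gain


def kernighan_lin(a, b):
    """Repeat passes until a pass yields no further improvement."""
    while True:
        a, b, gain = kl_pass(a, b)
        if gain <= 0:
            return a, b


a, b = kernighan_lin({"F1", "F3"}, {"F2", "F4"})
print(sorted(a), sorted(b), cut_size(a, b))
```

Because a pass may commit a sequence of exchanges whose individual gains are not all positive, the algorithm can climb out of some local minima that a one-swap-at-a-time greedy search would not escape.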

Hierarchical Clustering – Example

Clustering with several criteria

Alternate Partitioning Techniques
- Software-oriented partitioning: start with all functionality in software and move into hardware those portions that are time-critical and cannot be allocated to software
- Hardware-oriented partitioning: start with all functionality in hardware and move portions into a software implementation
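Software-oriented partitioning can be sketched as a greedy loop. Everything starts in software, and functions move to hardware one at a time until a deadline is met. The selection rule (largest time saved per unit of hardware area), the sequential-execution timing model, and all numbers below are assumptions for illustration; the slides do not prescribe a specific rule.

```python
# Hypothetical estimates per function: (software time, hardware time, hardware area).
# Areas are assumed to be positive.
funcs = {
    "F1": (10, 2, 4),
    "F2": (6, 3, 1),
    "F3": (4, 3, 5),
    "F4": (8, 1, 7),
}


def software_oriented(funcs, deadline):
    """Start all-software and move functions to hardware until the deadline is met.

    Returns (set of functions in hardware, total time, total hardware area).
    Assumes the functions execute sequentially, so total time is a plain sum.
    """
    hw = set()

    def total_time():
        return sum(h if name in hw else s for name, (s, h, _) in funcs.items())

    while total_time() > deadline:
        candidates = [n for n in funcs if n not in hw and funcs[n][0] > funcs[n][1]]
        if not candidates:
            break  # the deadline cannot be met even with the remaining moves
        # Greedy choice: most time saved per unit of hardware area.
        best = max(candidates, key=lambda n: (funcs[n][0] - funcs[n][1]) / funcs[n][2])
        hw.add(best)
    area = sum(funcs[n][2] for n in hw)
    return hw, total_time(), area


hw, time, area = software_oriented(funcs, deadline=18)
print(sorted(hw), time, area)
```

The hardware-oriented variant would run the same loop in the opposite direction: start with every function in hardware and move functions into software while the deadline still holds, to reduce area.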

More Partitioning Issues
- Partitioning into hardware and software affects overall system cost and performance
- Hardware implementation
  - Provides higher performance via hardware speeds and parallel execution of operations
  - Incurs additional design expense
- Software implementation
  - Lower performance
  - Incurs the high cost of developing and maintaining (complex) software

Functional Co-simulation
- Some of the M processors are single-purpose (e.g., those with a single function mapped onto them); others are general-purpose
- Functions mapped onto the general-purpose processors are implemented in software and simulated on virtual machines with performance models
- Functions mapped onto the single-purpose processors are simulated at the behavioral level with performance models
- Communication is done via abstract channels
- Feedback is used to refine the partitioning and scheduling tasks

Communication Synthesis & Bus-accurate Co-simulation
- Abstract channels A_1, A_2, …, A_n are mapped onto a set of communication channels C_1, C_2, …, C_m
  - Similar to functional partitioning
  - Similar to hardware/software scheduling
- Channels correspond to physical artifacts of the architecture
- Hardware and software models are annotated with detailed communication constructs
- A hardware model and a software model are obtained and co-simulated
- Communication synthesis (or possibly a higher level of design) is refined

Compilation & Synthesis & Cycle-accurate Co-simulation
- Compiler used to generate binary executables for general-purpose processors
- Synthesis used to generate gate-level models of single-purpose processors
- Synthesis used to generate gate-level models of general-purpose processors
- Cycle-accurate co-simulation of the entire system
  - Note: mixed-level co-simulation is common

Emulate/Prototype and Fabrication
- Use hardware (e.g., FPGAs) to emulate a system as fast as possible (relative to real time)
- Fabrication
  - Place & route
  - Mask design
  - Chip testing
    - Manufacturing fault models
    - Test vector generation
  - Packaging

Conclusion
- Satisfying the performance, cost, and quality metrics of a system entails hardware and software codesign
- Partitioning is at the heart of codesign
  - Functional
  - Communication
  - Scheduling
- Partitioning techniques
  - Constructive
  - Iterative
- Heuristics are often used to bound the running time