PRESENTED BY: MOHAMAD HAMMAM ALSAFRJALANI UFL ECE Dept. 3/19/2010 UFL ECE Dept 1 SYSTEM LEVEL HARDWARE/SOFTWARE PARTITIONING BASED ON SIMULATED ANNEALING.

Presentation transcript:

PRESENTED BY: MOHAMAD HAMMAM ALSAFRJALANI UFL ECE Dept. 3/19/2010 UFL ECE Dept 1 SYSTEM LEVEL HARDWARE/SOFTWARE PARTITIONING BASED ON SIMULATED ANNEALING AND TABU SEARCH

Outline 3/19/2010 UFL ECE Dept 2
 15-minute break
 Introduction of the challenge
 Overview of heuristics
 Implementation and modification
 Comparison of the two approaches
 Conclusion

Introduction 2/26/2010 UFL ECE Dept 3  Our goal is not to

Introduction 3/19/2010 UFL ECE Dept 4
 Many embedded systems have strict performance requirements
 Solution 1: application-specific systems, such as
   Application-specific integrated circuits (ASICs)
   Application-specific instruction-set processors (ASIPs)
   Problem: very expensive
 Solution 2: FPGAs
   Problem: still not the optimal solution (an FPGA for I/O operations?)

Today’s challenge 2/26/2010 UFL ECE Dept 5
 Solution 3: hybrid systems (SW/HW)
   Ex: Supercomputing: a CPU controls multiple FPGA platforms
   Ex: Embedded systems: software radios
   Problem: huge exploration space, long time to market (SW and HW developed separately), lower reliability
 The challenge: how can we partition the system into HW and SW regions to gain the best speedup at minimum overhead?
 Areas of challenge (what factors into the cost function)
   Area, power, monetary cost ($$), and code overhead
   Minimize communication between the HW and SW domains
   Increase parallelism

HW/SW partitioning: co-design challenges 3/19/2010 UFL ECE Dept 6
 System specification and modeling
 Co-simulation
 Partitioning
 Synthesis
 Verification
 Performance and cost estimation

Partitioning 3/19/2010 UFL ECE Dept 7
 Determining which modules run in SW and which in HW
 Has a crucial impact on system performance
   Matrix multiply can take 1 cycle in HW*
 Critical cost factor
   Silicon, SW/HW development and engineering costs
   Power and energy costs
 But, as mentioned, a huge exploration space

Partitioning – Challenges 2/26/2010 UFL ECE Dept 8
 Granularity
 Evaluation
 Alternative region implementations
 Implementation models
 Exploration

Granularity 2/26/2010 UFL ECE Dept 9
 How big or small each region is
 Coarse grained:
   Simpler partitioning, less inter-partition communication, more accurate estimation
 Fine grained:
   More complex, more communication, harder to estimate
   Can provide a better solution

Coarse Grained 3/19/2010 UFL ECE Dept 10
 Example
Main () {
  Function 1
    Function 1-a
    Function 1-b
    Function 1-c
  Function 2
    Function 1-a
    Function 1-b
    Function 1-c
  …
}
[Figure: the code is split into two regions, labeled HW and SW]

Fine Grained 3/19/2010 UFL ECE Dept 11
 Example
Main () {
  Function 1
    Function 1-a
    Function 1-b
    Function 1-c
  Function 2
    Function 1-a
    Function 1-b
    Function 1-c
  …
}
[Figure: the code is split into five smaller regions, labeled HW, SW, HW, SW, HW]

Evaluation, Alternative Region Implementations & Models 2/26/2010 UFL ECE Dept 12
 Evaluation: how good is a given partition?
   Based on the cost function: power consumption, heat dissipation, speedup, etc.
 Alternative region implementations
   There can be more than one way to implement a given region in SW or HW, e.g. column-major vs. row-major ordering in loops
 Implementation models
   How do we implement our system? Execution, trace, communication
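A minimal sketch of what "evaluation based on the cost function" could look like in code. The Node fields, the weights, and the numbers below are illustrative assumptions, not values from the presentation, and execution times are simply summed, so the sketch ignores parallelism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical per-module estimates (not from the slides).
    name: str
    sw_time: float  # execution time if the module runs in software
    hw_time: float  # execution time if the module runs in hardware
    hw_area: float  # area needed for a hardware implementation


def partition_cost(nodes, edges, in_hw, w_time=1.0, w_area=0.5, w_comm=2.0):
    """Weighted cost of one partition; lower is better."""
    time = sum(n.hw_time if n.name in in_hw else n.sw_time for n in nodes)
    area = sum(n.hw_area for n in nodes if n.name in in_hw)
    # Communication is charged only for edges crossing the HW/SW boundary.
    comm = sum(c for (a, b, c) in edges if (a in in_hw) != (b in in_hw))
    return w_time * time + w_area * area + w_comm * comm


nodes = [
    Node("fir", sw_time=10.0, hw_time=1.0, hw_area=4.0),
    Node("ctrl", sw_time=2.0, hw_time=1.5, hw_area=3.0),
    Node("fft", sw_time=20.0, hw_time=2.0, hw_area=8.0),
]
edges = [("fir", "fft", 1.0), ("ctrl", "fir", 0.5)]

all_sw = partition_cost(nodes, edges, in_hw=set())
mixed = partition_cost(nodes, edges, in_hw={"fir", "fft"})
print(all_sw, mixed)  # 32.0 12.0
```

Changing the weights changes which partition wins, which is why the choice of cost function is itself one of the partitioning challenges.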

Exploration: very big area to explore 2/26/2010 UFL ECE Dept 13
 If a problem can be solved in polynomial time, e.g. O(n), O(n^2), O(n^3), then it is in class P
 NP stands for nondeterministic polynomial time; it does not mean "not polynomial". A problem is in NP if a proposed solution can be verified in polynomial time, even when no polynomial-time algorithm for finding one is known
 HW/SW partitioning is NP-hard in general: no polynomial-time algorithm is known that finds the optimal partition

Exploration: example 3/19/2010 UFL ECE Dept 14
 How huge is huge? Example: how many possible ways are there to realize 45 functional units in HW or SW?

Partitioning 3/19/2010 UFL ECE Dept 15
 Actually 2^45 ≈ 35 × 10^12 (each of the 45 units can go to either HW or SW)
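The figure follows from two choices (HW or SW) for each of the 45 units. A quick check, with an assumed evaluation speed of one partition per microsecond (that rate is an illustration, not a number from the presentation):

```python
n_units = 45
n_partitions = 2 ** n_units      # two choices (HW or SW) per functional unit
print(n_partitions)              # 35184372088832
print(f"{n_partitions:.2e}")     # 3.52e+13

# Assumption: one partition evaluated per microsecond.
seconds_per_eval = 1e-6
days = n_partitions * seconds_per_eval / 86400
print(round(days))               # 407
```

Even at that optimistic rate, exhaustive evaluation would take more than a year, which is the motivation for heuristics.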

Practical approach 3/19/2010 UFL ECE Dept 16
 Do we implement all possibilities to evaluate performance?
   No
 Do we accept a random partition?
   No
 Then?
   We use heuristics to get close to a good-enough partition

Possible Heuristics 3/19/2010 UFL ECE Dept 17
 The most common ones are those based on neighborhood search
   Hill climbing
   Simulated annealing
   Tabu search

Possible Heuristics 3/19/2010 UFL ECE Dept 18
 Use a heuristic to find a possibly good solution
 Hill climbing
   Keep searching until next value < current value
   (+) Very fast, (-) gets stuck at local peaks
 Simulated Annealing
   If next < current, keep trying, for some limit
   (+) Can find a near-optimal solution, (-) takes longer, very sensitive to initial state
 Tabu Search
   Very similar to SA but a more complicated algorithm
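The "stuck at local peaks" weakness of hill climbing can be seen in a few lines. This is a sketch on a made-up one-dimensional landscape; the function names and the score values are assumptions chosen for illustration.

```python
def hill_climb(score, neighbors, start):
    """Move to the best neighbor until no neighbor scores higher."""
    current = start
    while True:
        best = max(neighbors(current), key=score)
        if score(best) <= score(current):
            return current  # no better neighbor: a (possibly local) peak
        current = best


# Toy landscape: local peak at index 2 (score 5), global peak at index 8 (score 9).
landscape = [1, 3, 5, 4, 2, 3, 6, 8, 9, 7]


def score(x):
    return landscape[x]


def neighbors(x):
    return [i for i in (x - 1, x + 1) if 0 <= i < len(landscape)]


print(hill_climb(score, neighbors, start=0))  # 2 (stuck on the local peak)
print(hill_climb(score, neighbors, start=5))  # 8 (reaches the global peak)
```

The result depends entirely on the starting point, which is what simulated annealing and tabu search try to overcome by allowing moves to worse solutions.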

Simulated Annealing (SA) 3/19/2010 UFL ECE Dept 19
 Name inspiration: annealing in metallurgy
 Searches for a better state than the current state
 Very common. Why?
   Can be implemented quickly
   Widely applicable to many different problems
 Disadvantages
   Long execution time
   Many experiments are needed to tune the algorithm

SA – Basic Algorithm 3/19/2010 UFL ECE Dept 20
 Start with an initial ‘best’ state
 Select a neighboring solution at random
 Accept an improved solution
   Replace the current ‘best’ state with this ‘better’ one
 Accept a worse solution with a probability that depends on the deterioration of the cost function and on a control parameter called temperature
 Repeat until the temperature, and with it the acceptance probability, is very small (cold)
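The basic algorithm can be sketched as follows. This is a generic simulated annealing loop, not the exact implementation from the paper: the geometric cooling schedule, the exp(-delta/T) acceptance rule, all parameter values, and the toy cost function (Hamming distance to a hypothetical target assignment) are assumptions made for illustration.

```python
import math
import random


def simulated_annealing(cost, random_neighbor, start, t_start=10.0,
                        t_min=1e-3, alpha=0.95, moves_per_temp=50, seed=0):
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_min:                       # repeat until the system is "cold"
        for _ in range(moves_per_temp):
            cand = random_neighbor(current, rng)
            delta = cost(cand) - current_cost
            # Always accept improvements; accept a worse solution with a
            # probability that shrinks with the deterioration and with t.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = cand, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha                         # geometric cooling (assumed schedule)
    return best, best_cost


# Toy problem: 1 = node in HW, 0 = node in SW; the target is hypothetical.
target = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1)


def cost(x):
    return sum(a != b for a, b in zip(x, target))


def flip_one(x, rng):
    i = rng.randrange(len(x))              # move one random node to the other side
    return x[:i] + (1 - x[i],) + x[i + 1:]


best, best_cost = simulated_annealing(cost, flip_one, start=(0,) * 12)
print(best, best_cost)
```

At high temperature almost every move is accepted, so the search wanders freely; as the temperature drops the loop behaves more and more like plain descent.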

SA – Improved Algorithm 3/19/2010 UFL ECE Dept 21
 Solution space: HW/SW areas, modules, functions
 Two ways to generate a neighboring solution:
   Simple move: move one node from one domain into the other
   Improved move: move the node and its direct neighbors at the same time
     Reduces the spectrum of visited solutions
 A move that violates constraints is rejected and another neighboring solution is tried
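One way to read the two move types, as a sketch: the partition is the set of nodes placed in HW, and the graph is an adjacency map. The decision to move only those neighbors that sit on the same side as the selected node is an assumption; the slide does not spell out this detail.

```python
def simple_move(in_hw, node):
    """Simple move: flip a single node between HW and SW."""
    return in_hw ^ {node}


def improved_move(in_hw, node, adjacency):
    """Improved move: flip the node together with its direct neighbors
    that are currently on the same side (assumed interpretation)."""
    node_in_hw = node in in_hw
    group = {node} | {n for n in adjacency[node] if (n in in_hw) == node_in_hw}
    return in_hw ^ group


# Hypothetical four-node graph; only "c" starts in HW.
adjacency = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
in_hw = frozenset({"c"})

print(sorted(simple_move(in_hw, "a")))               # ['a', 'c']
print(sorted(improved_move(in_hw, "a", adjacency)))  # ['a', 'b', 'c']
```

Moving a node with its neighbors keeps tightly connected nodes together, so fewer intermediate solutions with heavy HW/SW communication are visited.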

SM vs. IM – Experimental Results 3/19/2010 UFL ECE Dept 22
[Table: partitioning times with simple moves (SM) and improved moves (IM), and the speedup of IM over SM]
 Exploration with improved moves reaches the optimal partitioning faster

3/19/2010 UFL ECE Dept 23 Questions?

Tabu Search (TS) 3/19/2010 UFL ECE Dept 24
 Name inspiration: a ‘taboo’ (prohibited) list
 Uphill moves are not purely random
 Saves the search history
 Maintains a list of recent moves called the tabu list
 Doesn’t revisit explored areas or repeat their evaluations
 Provides a better diversity of solutions

TS – Memories 3/19/2010 UFL ECE Dept 25
 Short-term memory contains a tabu list of information about the most recent history of the search. It is used to avoid the cycling that could occur if a move returns to a recently visited solution.
 Long-term memory stores information on the global evolution of the algorithm.
 The long- and short-term memory lists are used for diversification, which is meant to improve exploration of the solution space by broadening the spectrum of visited solutions.

TS – Algorithm 3/19/2010 UFL ECE Dept 26
 1. Define an initial solution s
 2. While the stopping condition is not met:
   Identify the neighboring set N(s)
   Identify the tabu set T(s,k)
   Identify the aspirant set A(s,k)
   Choose the best solution s’ in N(s,k) = (N(s) \ T(s,k)) ∪ A(s,k)
   Memorize s’ if it improves on the best solution known so far
   s := s’, k := k+1
 3. End
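The steps above can be sketched as a short deterministic loop. This is a generic tabu search with short-term memory only, not the paper's implementation: the single-flip neighborhood, the tenure value, the fixed iteration budget as stopping condition, and the toy cost table are all assumptions for illustration.

```python
def tabu_search(cost, n_nodes, start, tenure=3, max_iters=100):
    current = tuple(start)
    best, best_cost = current, cost(current)
    tabu_until = {}  # node index -> first iteration at which it may move again
    for k in range(max_iters):                  # stopping condition: fixed budget
        candidates = []
        for i in range(n_nodes):                # N(s): flip one node
            neighbor = current[:i] + (1 - current[i],) + current[i + 1:]
            c = cost(neighbor)
            is_tabu = tabu_until.get(i, 0) > k  # membership in T(s,k)
            aspirated = c < best_cost           # A(s,k): beats the best known
            if not is_tabu or aspirated:
                candidates.append((c, i, neighbor))
        if not candidates:
            break
        # Best admissible move, even if it is worse than the current solution.
        c, i, current = min(candidates)
        tabu_until[i] = k + 1 + tenure
        if c < best_cost:                       # memorize an improved solution
            best, best_cost = current, c
    return best, best_cost


# Toy cost that depends only on how many nodes are in HW (1 = HW).
# One node in HW is a local minimum (2); three nodes in HW is the global one (1).
cost_by_count = [3, 2, 4, 1, 5]


def cost(x):
    return cost_by_count[sum(x)]


print(tabu_search(cost, 4, (0, 0, 0, 0)))  # ((1, 1, 1, 0), 1)
```

After the first move the search sits in the local minimum; because moving back is tabu, it is forced through a worse solution (cost 4) and then reaches the global minimum, which a plain descent would never find.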

TS – Diversification 3/19/2010 UFL ECE Dept 27
 The search strategy is improved by:
   Ordering node moves according to a penalized cost function that favors the transfer of nodes that have spent a long time in their current partition
   Considering a move tabu if the frequency of occurrences of the node in its current partition is smaller than a certain threshold
   Starting a new search, if the system is frozen, from an initial configuration different from those encountered previously

TS – Experimental Results 3/19/2010 UFL ECE Dept 28
[Table: tabu search parameters and CPU times per graph dimension]
 τ (tau): tabu tenure
 Nr_f_b: number of iterations without improvement of the solution after which the system is considered frozen
 Nr_r: number of restarts with a new initial configuration
 The table gives the minimal parameter values needed for an optimal partitioning of all graphs of the respective dimension, and the resulting CPU times. The times were computed as the average partitioning time over all graphs of the given dimension. Restarting tours were necessary only for the 400-node graphs.

SA vs. TS 3/19/2010 UFL ECE Dept 29
1) Near-optimal partitioning can be produced by both the SA-based and the TS-based algorithm
2) SA is based on a random exploration of the neighborhood, while TS is completely deterministic
   The deterministic nature of TS makes experimental tuning of the algorithm less laborious than for SA
3) Developing an SA strategy for a particular problem is relatively easy and can be done without a deep study of domain-specific aspects, although problem-specific improvements can yield large performance gains. Developing a TS algorithm is more complex and has to consider particular aspects of the given problem.
* Based on the paper

SA vs. TS 3/19/2010 UFL ECE Dept 30
4) TS performance is superior to that of SA (on average more than 20 times faster)
5) No TS-based hardware/software partitioning approach had yet been reported, while SA continues to be one of the most popular approaches for automatic partitioning
* Based on the paper

Conclusion 3/19/2010 UFL ECE Dept 31
 Embedded systems have strong performance requirements
 These can be met with ASICs, ASIPs, FPGAs, hybrid systems, etc.
 Hybrid systems pose a new challenge: HW/SW co-design (co-simulation, partitioning, etc.)
 Partitioning has its own challenges: granularity, evaluation, alternative region implementations, implementation models, and exploration
 The exploration problem is addressed with heuristics such as SA and TS
 TS and SA each have their own advantages and disadvantages

Questions? 3/19/2010 UFL ECE Dept 32

References 3/19/2010 UFL ECE Dept 33
 Mastrolilli, M., Tabu Search, Dalle Molle Institute for Artificial Intelligence
 Järvinen, Kimmo, DI, FPGAs, Helsinki University of Technology
 Stitt, G., HW/SW Partitioning, University of Florida
 Eles, Kuchcinski, Peng, Doboli, System Level Hardware/Software Partitioning Based on Simulated Annealing and Tabu Search