Design & Co-design of Embedded Systems Distributed System Co-synthesis (2) Maziar Goudarzi.


1 Design & Co-design of Embedded Systems Distributed System Co-synthesis (2) Maziar Goudarzi

2 Fall 2005, Design & Co-design of Embedded Systems
Today's Program
- Introduction
- Preliminaries
- Hardware/Software Partitioning
- Distributed System Co-Synthesis (part 2)
References:
- Wayne Wolf, “Hardware/Software Co-Synthesis Algorithms,” Chapter 2, Hardware/Software Co-Design: Principles and Practice, Eds: J. Staunstrup, W. Wolf, Kluwer Academic Publishers, 1997.
- W. Wolf, “An architectural co-synthesis algorithm for distributed, embedded computing systems,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 5, no. 2, pp. 218-229, 1997.

3 Topics
- Introduction
- An Integer Linear Programming Model
- A Heuristic Algorithm
  - On ordinary task graphs
  - On an Object-Oriented model

4 Co-Synthesis Algorithms: Distributed System Co-Synthesis
Wolf’s Heuristic Algorithm on Ordinary Task Graphs

5 Wolf’s Heuristic Algorithm
- As ever, the topics of importance:
  - System Specification Language/Model
  - Target Architecture
  - Functionality (Allocation/Scheduling) Quantum
  - Allocation Strategy
  - Scheduling Strategy
  - Cost Estimation
  - Performance Estimation
  - Algorithm Details

6 Wolf’s Heuristic Algorithm (cont’d)
- Wolf’s Heuristic Algorithm
  - System Specification Language/Model
    - Algorithm input: single-rate task graph
  - Target Architecture
    - Heterogeneous multiprocessor architecture
  - Allocation
    - Primal approach: performance is the major objective
  - Scheduling
    - ?
  - Functionality Quantum
    - Processes in a single-rate task graph

7 Wolf’s Heuristic Algorithm (cont’d)
- Wolf’s Heuristic Algorithm (cont’d)
  - Performance Estimation
    - Component Technology Library
    - The run-time of each process on each available PE is assumed to be known
  - Cost Estimation
    - Component Technology Library
    - $\text{Total Cost} = \sum_i \text{Cost}(\text{PE}_i) + \sum_j \text{Cost}(\text{Device}_j) + \sum_k \text{Cost}(\text{Comm. Channel}_k)$
  - Algorithm Details
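The cost estimate on this slide is a plain sum over the selected components. A minimal sketch, assuming the per-component costs have already been looked up in the component technology library; the `Architecture` class and its field names are hypothetical and not part of the slides:

```python
from dataclasses import dataclass, field


@dataclass
class Architecture:
    """Hypothetical container for the component costs of one candidate design."""
    pe_costs: list = field(default_factory=list)       # one entry per PE i
    device_costs: list = field(default_factory=list)   # one entry per device j
    channel_costs: list = field(default_factory=list)  # one entry per comm. channel k


def total_cost(arch):
    """Sum PE, device, and communication-channel costs, term by term as in the formula."""
    return sum(arch.pe_costs) + sum(arch.device_costs) + sum(arch.channel_costs)
```

For example, `total_cost(Architecture([5, 7], [2], [1, 1]))` adds two PEs, one device, and two channels.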

8 Wolf’s Heuristic Algorithm Details
- Four major steps in co-design
  - Partitioning: dividing the specification into smaller parts (e.g. processes)
  - Allocation: assigning each process to a multiprocessor node (PE)
  - Scheduling: serializing the processes assigned to each PE
  - Mapping: selecting a particular component for each PE
- Problem: these steps (especially allocation, scheduling, and mapping) have a circular relationship
- Solution: break the loop

9 Wolf’s Heuristic Algorithm Details (cont’d)
- Wolf:
  1. Give an initial allocation
  2. Refine it to reduce cost
- Order of satisfying design criteria:
  1. Satisfy all deadlines
  2. Minimize PE cost
  3. Minimize communication port cost
  4. Minimize device cost

10 Wolf’s Heuristic Algorithm Details (cont’d)
- First ignore communication costs; take them into account later
- Steps:
  1. Create an initial feasible solution and perform an initial scheduling on it. Initial feasible solution: assign each process to a separate PE
  2. Reallocate processes to PEs to minimize total PE cost, possibly eliminating PEs from the initial feasible solution
  3. Reallocate processes again to minimize the amount of communication required between PEs
  4. Allocate communication channels
  5. Allocate I/O devices (internal or external to PEs)
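Step 1 above can be sketched as follows. The slide only says that each process gets its own PE; picking the fastest PE type for each process is an assumption made here to favor meeting deadlines, and the `library` dictionary layout is hypothetical:

```python
def initial_allocation(processes, library):
    """Assign every process to its own PE instance.

    library maps a PE type name to {"cost": number, "runtime": {process: time}}.
    Assumption (not from the slides): each PE is the type that runs its
    process fastest.
    """
    allocation = {}
    for index, process in enumerate(processes):
        # Only PE types with a known run-time for this process are candidates.
        candidates = [t for t, spec in library.items() if process in spec["runtime"]]
        fastest = min(candidates, key=lambda t: library[t]["runtime"][process])
        allocation[process] = (f"PE{index}", fastest)
    return allocation
```

The result maps each process to a pair of PE instance name and PE type, which later steps can then shrink.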

11 Wolf’s Heuristic Algorithm Details (cont’d)
- The most important step: step 2, initial reallocation
  - Reason: PE cost is the dominant hardware cost
- Initial reallocation
  1. PE cost reduction:
     1.1 Scan the PEs, starting with the least-utilized PE
     1.2 Try to reallocate that PE’s processes to other existing PEs
     1.3 If no process is left on the PE, eliminate it; otherwise, replace the PE with a suitable lower-cost one
  2. Pair-wise merge: merge a pair of PEs into a single, more powerful one
  3. Load balancing
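The PE cost reduction phase (steps 1.1 to 1.3) can be illustrated with a small sketch. It is a simplification under stated assumptions: a PE is treated as feasible when the run-times of its processes sum to at most the period, which stands in for real scheduling, and processes are moved one at a time. The `library` layout and all function names are hypothetical:

```python
def fits(pe_type, processes, library, period):
    """Feasibility stand-in: all run-times known and their sum within the period."""
    runtimes = library[pe_type]["runtime"]
    return all(p in runtimes for p in processes) and \
        sum(runtimes[p] for p in processes) <= period


def utilization(pe, library, period):
    """Fraction of the period that the PE spends running its processes."""
    return sum(library[pe["type"]]["runtime"][p] for p in pe["processes"]) / period


def pe_cost_reduction(pes, library, period):
    """pes is a list of {"type": str, "processes": [str, ...]}; returns a reduced list."""
    pes = [{"type": pe["type"], "processes": list(pe["processes"])} for pe in pes]
    # 1.1 Scan the PEs, starting with the least-utilized one.
    for pe in sorted(pes, key=lambda p: utilization(p, library, period)):
        # 1.2 Try to move each of its processes to another existing PE.
        for process in list(pe["processes"]):
            for other in pes:
                if other is pe:
                    continue
                if fits(other["type"], other["processes"] + [process], library, period):
                    other["processes"].append(process)
                    pe["processes"].remove(process)
                    break
        if not pe["processes"]:
            # 1.3a Nothing left on this PE: eliminate it.
            pes = [p for p in pes if p is not pe]
        else:
            # 1.3b Otherwise replace it with the cheapest type that still fits.
            cheaper = [t for t in library
                       if library[t]["cost"] < library[pe["type"]]["cost"]
                       and fits(t, pe["processes"], library, period)]
            if cheaper:
                pe["type"] = min(cheaper, key=lambda t: library[t]["cost"])
    return pes
```

With three single-process PEs and a period long enough for one PE to run everything, the sketch collapses the design onto one PE and then downgrades that PE if a cheaper type still meets the period.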

12 Wolf’s Heuristic Algorithm Details (cont’d)
- Initial reallocation (cont’d)
  - The “PE cost reduction” phase tries to reallocate multiple processes at a time
  - The above three phases are repeated as far as possible

13 Wolf’s Heuristic Algorithm: Experimental Results

Example  #processes  Period  Impl. cost      CPU time (sec)
                             Wolf    P&P     Wolf    P&P
pp1      4           2.5     14              0.05    11
                     3       14      13      0.05    24
                     4       7       7       0.05    28
                     7       5       5       0.05    37
pp2      9           5       15              0.7     3732
                     6       12              1.1     26710
                     7       8       8       1.6     32320
                     8       8       7       1.0     4511
                     15      5       5       1.1     385012

14 Wolf’s Heuristic Algorithm: Experimental Results (cont’d)
- Finds optimal solutions to most of the ILP-solved examples
- Finds near-optimal solutions for the remaining examples
- Showed good results on larger examples
- Requires very little run-time
  - Due to the multiple-move strategy during the PE cost minimization phase

15 Co-Synthesis Algorithms: Distributed System Co-Synthesis
Wolf’s Heuristic Algorithm for Object-Oriented Models

16 Introduction
- Target
  - Co-synthesis of a distributed system from an object-oriented specification
- Significance
  - OO is a promising approach to designing embedded systems at ESL
Reference: W. Wolf, “Object-Oriented Co-Synthesis of Distributed Embedded Systems,” ACM Transactions on Design Automation of Electronic Systems, pp. 301-314, 1996.

17 OO Co-Synthesis Algorithm
- Again, our eight topics
  - System Specification Language/Model
  - Target Architecture
  - Functionality (Allocation/Scheduling) Quantum
  - Allocation Strategy
  - Scheduling Strategy
  - Cost Estimation
  - Performance Estimation
  - Algorithm Details

18 OO Co-Synthesis Algorithm (cont’d)
- System Specification Model/Language
  - An object-oriented specification as input
  - Method dataflow graph as model
[Figure: method dataflow graph]
  Object O1: method m1 (variables v1, v2); method m2 (variables v2, v3)
  Object O2: method m4 (variables v10, v20)
  Object O3: method m3 (variables v8, v9)
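The objects, methods, and variables named on this slide can be written down as a small data structure. This is only an illustrative sketch: the class names are hypothetical, the dataflow edges between methods are not recoverable from the transcript and are therefore omitted, and the `shared_variables` helper is an added illustration of why methods of one object may be worth keeping together:

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    variables: frozenset  # variables the method uses


@dataclass
class ObjectSpec:
    name: str
    methods: list


# The three objects listed on the slide.
SPEC = [
    ObjectSpec("O1", [Method("m1", frozenset({"v1", "v2"})),
                      Method("m2", frozenset({"v2", "v3"}))]),
    ObjectSpec("O2", [Method("m4", frozenset({"v10", "v20"}))]),
    ObjectSpec("O3", [Method("m3", frozenset({"v8", "v9"}))]),
]


def shared_variables(obj):
    """Variables used by more than one method of the same object."""
    seen, shared = set(), set()
    for method in obj.methods:
        shared |= seen & method.variables
        seen |= method.variables
    return shared
```

In this data, methods m1 and m2 of O1 both use v2, so placing them on different PEs would turn an access to v2 into inter-PE traffic.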

19 OO Co-Synthesis Algorithm (cont’d)
- Target Architecture
  - Distributed system
    - An arbitrary-topology network of PEs
- Functionality Quantum
  - Methods of objects in an OO specification
  - As far as possible, all methods of an object are kept together
  - Partitioning is done during algorithm execution

20 OO Co-Synthesis Algorithm (cont’d)
- Cost and Performance Estimation
  - Pre-specified
    - A technology description of the available components is input to the algorithm
- Allocation, Scheduling, and Algorithm Details
  - Much like Wolf’s previous heuristic algorithm
  - Includes modifications in order to:
    - handle large sets of methods
    - consider the effects of splitting objects across PEs

21 OO Co-Synthesis Algorithm (cont’d)
- Allocation, Scheduling, and Algorithm Details
  1. Initial allocation and scheduling. Allocate processes to PEs such that all tasks are placed on PEs fast enough to ensure that all deadlines are met, keeping objects together as much as possible.
  2. Minimize PE cost. Reallocate processes to PEs to minimize PE cost, splitting objects when necessary.
  3. Minimize communication. Reallocate processes again to minimize inter-PE communication, taking into account the traffic generated by splitting objects across PEs.

22 OO Co-Synthesis Algorithm (cont’d)
- Allocation, … Details (cont’d)
  4. Allocate channels. Allocate communication channels.
  5. Allocate devices, either as on-chip devices or as external devices on communication channels.

23 OO Co-synthesis Details
- Step 1 (initial allocation)
  - One PE per object
- Step 2 (minimize PE cost)
  - oo_balance_load()
    - Tries to redistribute methods to better balance the system load
  - PE_replacement()
    - Uses a cheaper PE without disturbing the allocation
  - oo_pairwise_merge()
    - Tries to eliminate a PE by moving its methods to other PEs
- Step 2 is done repeatedly
  - Methods are re-scheduled after each new allocation
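Step 1 and the bookkeeping needed once step 2 starts moving methods can be sketched as below. This is an illustration under assumptions, not the algorithm's actual code: the function names and the dictionary layout are hypothetical, and PE type selection is left out entirely:

```python
def initial_allocation(object_methods):
    """Step 1: give every object its own PE and place all of its methods there.

    object_methods maps an object name to the list of its method names.
    """
    return {
        method: f"PE{index}"
        for index, methods in enumerate(object_methods.values())
        for method in methods
    }


def split_objects(object_methods, method_to_pe):
    """Names of objects whose methods currently sit on more than one PE."""
    return sorted(
        obj for obj, methods in object_methods.items()
        if len({method_to_pe[m] for m in methods}) > 1
    )
```

Right after step 1 no object is split. If a later move places a method on a different PE from the rest of its object, `split_objects` reports that object, which is the situation the communication-minimization step has to account for.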

24 OO Co-synthesis Details (cont’d)
Note: This operation may cause “hidden communication”.

25 OO Co-synthesis Details (cont’d)

26 OO Co-Synthesis Algorithm (cont’d)
- Experimental Results
  - Algorithm implemented in C++
    - Using the NIH class library
    - 8600 lines of code
    - Executed on an SGI Indigo workstation
  - Algorithm applied to examples from software engineering books on OO design

    Example  #objects/methods  CPU time
    cfuge    2/3               0.05
    dye      3/15              2.0
    juice    3/4               0.05
    train    5/6               0.05

Reason for the highest CPU time: that example has the most methods, and scheduling is required in each inner loop of step 2. This implementation had a simple, inefficient scheduler.

27 OO Co-Synthesis Algorithm (cont’d)
- Main contribution
  - An OO specification is an important aid to automatic partitioning
    - The specification is naturally divided into two levels of granularity:
      - A system is composed of objects
      - Objects are composed of data members and methods
  - The heuristic:
    - Preserve the specification’s partitioning as much as possible

28 What we learned today
- Distributed System Co-Synthesis
  - A heuristic approach
    - Non-OO algorithm
    - Customization to OO specifications
    - Heuristic: first minimize the PE cost, since it is the dominant factor

