Design & Co-design of Embedded Systems Introduction to Co-synthesis Algorithms + HW/SW Partitioning Algorithms Maziar Goudarzi.


1 Design & Co-design of Embedded Systems Introduction to Co-synthesis Algorithms + HW/SW Partitioning Algorithms Maziar Goudarzi

2 Today’s Program (Fall 2005, Design & Co-design of Embedded Systems)
- Introduction
- Preliminaries
- Hardware/Software Partitioning
- Distributed System Co-Synthesis (next session)
Reference: Wayne Wolf, “Hardware/Software Co-Synthesis Algorithms,” Chapter 2, Hardware/Software Co-Design: Principles and Practice, Eds: J. Staunstrup, W. Wolf, Kluwer Academic Publishers, 1997.

3 Introduction to HW/SW Co-Synthesis Algorithms Introduction

4 Introduction
- Implementing a system? Why use a CPU?
  - Easier implementation
  - Easier (and cheaper) to change and debug
- Why use hardware modules?
  - Meeting other constraints
    - Performance, power consumption, etc.
- Found a CPU meeting all non-functional constraints?
  - Yes! What could be better? Use the CPU.
  - No! Design custom logic, or a combination of both.

5 Introduction (cont’d)
- Why more than one CPU or custom logic?
- Why not use the fastest available CPU?

6 Introduction (cont’d)
- Reason 1:
  - Exponential cost per CPU performance
  - [Figure: late-1996 retail prices of Pentium processors, price versus clock speed (MHz)]

7 Introduction (cont’d)
- Exponential price/performance implies
  - Paying for performance in a uniprocessor is very expensive
    - Using multiple small CPUs is cheaper
    - Communication overhead is added, but it is still an economic choice
    - Processors need not be CPUs; they can be special-function units
    - Special-purpose PEs can be even cheaper than a dedicated CPU!
      - Measured in system manufacturing cost, not necessarily in design cost

8 Introduction (cont’d)
- Reason 2:
  - Scheduling overhead
    - More than 31% overhead, under reasonable assumptions, when executing multiple processes
      - Reason: uncertainty in the times at which the processes will need to execute
      - Result: we have to reserve extra CPU horsepower, which comes at exponential cost
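
The 31% figure is consistent with the classic Liu and Layland utilization bound for rate-monotonic scheduling, which is presumably where it comes from (an assumption; the slide does not cite it). For $n$ periodic processes with computation times $C_i$ and periods $T_i$ (symbols introduced here for illustration), all deadlines are guaranteed if

```latex
U = \sum_{i=1}^{n} \frac{C_i}{T_i} \le n\left(2^{1/n} - 1\right),
\qquad
\lim_{n \to \infty} n\left(2^{1/n} - 1\right) = \ln 2 \approx 0.693
```

so for a large number of processes roughly $1 - 0.693 \approx 31\%$ of the CPU capacity has to be held in reserve.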

9 Introduction (cont’d)
- Definition
  - HW/SW co-synthesis: the process of simultaneously designing the SW architecture of an application and the HW architecture on which that SW is executed.

10 Introduction (cont’d)
[Figure: co-synthesis takes a problem specification and produces a SW (application) architecture and a HW engine made of PEs, memory, and communication channels]

11 Introduction (cont’d)
- Problem specification includes
  - Functionality
  - Non-functional requirements
    - Performance goals, physical constraints, etc.

12 Introduction (cont’d)
- Hardware Architecture
  - One or more Processing Elements (PEs)
- Software (Application) Architecture includes
  - Process structure
    - Each process executes sequentially
    - Determines
      - The amount of parallelism
      - The amount of communication
    - Proper process structure is crucial for cost-effective implementation
  - Allocation of the processes onto PEs in the HW engine
- Communication channels
  - Hardware elements
  - Software primitives

13 Introduction (cont’d)
- HW/SW Co-synthesis
  - Allows trade-offs between the SW architecture and the HW on which it executes
  - Where is such a trade-off important?
    - Everyday processing applications vs. embedded applications
    - Embedded computing: computing with limited resources
  - Different co-synthesis styles depending on
    - The specification
    - The system components
    - The system elements to synthesize

14 Introduction (cont’d)
- Two broad implementation styles
  - HW/SW partitioning
    - Target HW architecture: a CPU and multiple ASICs
  - Distributed System Co-synthesis
    - Target HW architecture: arbitrary hardware topologies

15 Introduction to HW/SW Co-Synthesis Algorithms Preliminaries

16 Preliminaries
- Rate (execution rate)
  - Maximum frequency at which a computation must be performed
- Single-rate vs. multi-rate
  - Example of a multi-rate system
    - Audio/video decoder

17 Preliminaries (cont’d)
- Latency
  - Required maximum time between starting and finishing a processing task

18 Behavior Models
- DFG: Data Flow Graph
  - Suitable for data-processing algorithms
- CFG: Control Flow Graph
  - Suitable for process control algorithms
- CDFG: Control Data Flow Graph
  - Combination of the two above

19 Behavior Models (cont’d)
- Single-rate systems
  - Standard model: Control-Data Flow Graph (CDFG)
    - Implies a program counter or system state
    - Not suitable for modeling multi-rate tasks
      - Due to the unified system state

20 Behavior Models (cont’d)
- Multi-rate systems
  - Common model: Task Graph
- Task Graph
  - Each node: a process
  - Each edge: a communication
  - Each set of connected nodes: a sub-task
[Figure: task graph with processes P1 to P6]
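
As a rough illustration of the task-graph model, the sketch below stores processes as nodes and communications as edges, then groups connected nodes into sub-tasks. Only the process names P1 to P6 come from the slide; the edge list is a hypothetical example, since the edges of the original figure are not recoverable, and `sub_tasks` is a name chosen here.

```python
from collections import defaultdict

def sub_tasks(processes, edges):
    """Group processes into sub-tasks, i.e. sets of connected nodes."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        # Connectivity ignores edge direction.
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, groups = set(), []
    for p in processes:
        if p in seen:
            continue
        group, stack = set(), [p]
        while stack:  # depth-first walk of one connected set
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

PROCESSES = ["P1", "P2", "P3", "P4", "P5", "P6"]
# Hypothetical communication edges (not taken from the original figure).
EDGES = [("P1", "P2"), ("P1", "P3"), ("P4", "P5"), ("P5", "P6")]

print(sub_tasks(PROCESSES, EDGES))
```

With these assumed edges the graph splits into two sub-tasks, each of which could run at its own rate.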

21 Behavior Models (cont’d)
- SDFG: Synchronous Data Flow Graph
  - Suitable for signal processing applications
  - A DFG that may also be cyclic
  - Lee and Messerschmitt:
    - Algorithm to check the feasibility of an SDFG and schedule it on a uniprocessor or multiprocessor
[Figure: SDFG with nodes a, b, c and numeric rates annotated on the edges]
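
One part of such a feasibility check is solving the balance equations: for every edge, tokens produced per iteration must equal tokens consumed. The sketch below is a simplified illustration of that step only, not Lee and Messerschmitt's full algorithm. It assumes a connected graph, the edge rates are hypothetical (the figure's rates are not recoverable), and `repetition_vector` is a name chosen here.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def repetition_vector(edges):
    """edges: (src, dst, tokens_produced, tokens_consumed) per firing.

    Returns the smallest integer firing counts that balance every edge,
    or None if the rates are inconsistent. Assumes a connected graph.
    """
    adj = {}
    for src, dst, prod, cons in edges:
        # Balance: q[src] * prod == q[dst] * cons
        adj.setdefault(src, []).append((dst, Fraction(prod, cons)))
        adj.setdefault(dst, []).append((src, Fraction(cons, prod)))
    if not adj:
        return {}
    start = next(iter(adj))
    rate = {start: Fraction(1)}
    stack = [start]
    while stack:
        node = stack.pop()
        for other, ratio in adj[node]:
            expected = rate[node] * ratio
            if other not in rate:
                rate[other] = expected
                stack.append(other)
            elif rate[other] != expected:
                return None  # two paths demand different firing ratios
    # Scale the fractional rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()))
    counts = {n: int(r * scale) for n, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {n: c // common for n, c in counts.items()}

# Hypothetical rates: a produces 2 tokens per firing for b, and so on.
print(repetition_vector([("a", "b", 2, 1), ("b", "c", 1, 2)]))
```

A consistent repetition vector is necessary but not sufficient: a cyclic graph can still deadlock if its cycles lack enough initial tokens, which a complete check would also have to examine.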

22 Behavior Models (cont’d)
- Co-design Finite-State Machine (CFSM)
  - POLIS project at UC Berkeley
  - Used for control-dominated systems
    - e.g., ECU (Engine Control Unit)
  - Event-driven FSM
    - Transitions are triggered by events (instead of a periodic clock signal)
[Figure: CFSM with states idle, test, and error; transitions labeled Go / start_timer, Done / stop_time, Timeout / alarm=ON, Reset / alarm=OFF]
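
The event-driven behavior can be sketched as a transition table keyed by (state, event). The state names and the event/action labels come from the figure; which state each transition leaves and enters is an assumption, because the figure's arrows are lost. Silently ignoring events that have no matching transition is also a simplification made here, not a statement about POLIS semantics.

```python
# (state, event) -> (next_state, action). Source/target states are assumed.
TRANSITIONS = {
    ("idle", "Go"): ("test", "start_timer"),
    ("test", "Done"): ("idle", "stop_time"),
    ("test", "Timeout"): ("error", "alarm=ON"),
    ("error", "Reset"): ("idle", "alarm=OFF"),
}

def run(events, state="idle"):
    """Feed events to the machine; return the final state and emitted actions."""
    actions = []
    for event in events:
        key = (state, event)
        if key in TRANSITIONS:  # events with no matching transition are ignored
            state, action = TRANSITIONS[key]
            actions.append(action)
    return state, actions

print(run(["Go", "Timeout", "Reset"]))
```

Note that nothing advances the machine between events: there is no clock tick, only event arrivals.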

23 Architectural Models
- The hardware engine also needs a description
- Here, only basic models for cost estimation

24 Architectural Models (cont’d)
- The HW engine is another graph
  - Generally:
    - Processing Elements (PEs) as nodes + communication channels as edges
    - Problem: how to model busses?
    - Solution: nodes are also used for channels
      - Edges represent nets connecting PEs and channels
      - Nodes are labeled with their type
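
A minimal sketch of this graph model, in which a bus is a node of type "channel" so that more than two PEs can attach to it. All node names and nets below are hypothetical.

```python
# Node labels: every node carries its type (names are hypothetical).
NODE_TYPE = {
    "cpu": "PE",
    "asic1": "PE",
    "asic2": "PE",
    "bus": "channel",   # shared bus, modeled as a node
    "link": "channel",  # point-to-point link, also a node
}

# Edges are nets connecting a PE to a channel.
NETS = [("cpu", "bus"), ("asic1", "bus"), ("asic2", "link"), ("cpu", "link")]

def pes_on_channel(channel):
    """Return the PEs attached to the given channel node."""
    attached = set()
    for a, b in NETS:
        if channel in (a, b):
            other = b if a == channel else a
            if NODE_TYPE[other] == "PE":
                attached.add(other)
    return sorted(attached)

print(pes_on_channel("bus"))
```

Modeling the bus as an edge would only allow two endpoints; as a node it can have any number of attached nets.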

25 Architectural Models (cont’d)
- Component Technology Library
  - Used when pre-designed components constitute the HW engine
  - Includes
    - General parameters
      - e.g., manufacturing cost, average power consumption, clock rate
    - Information regarding functional elements (behaviors)
      - A table giving the execution time of each behavior on that PE
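
Such a library can be pictured as a table keyed by PE type. The sketch below uses invented PE names, costs, and execution times purely for illustration, and `cheapest_pe` is a hypothetical helper showing one way the table might be queried during cost estimation.

```python
# Hypothetical component technology library: all values are made up.
LIBRARY = {
    "small_cpu": {"cost": 10, "exec_time": {"filter": 8.0, "control": 2.0}},
    "fast_cpu": {"cost": 40, "exec_time": {"filter": 3.0, "control": 1.0}},
    "filter_asic": {"cost": 25, "exec_time": {"filter": 0.5}},
}

def cheapest_pe(behavior, deadline):
    """Cheapest PE type that executes `behavior` within `deadline`, else None."""
    candidates = [
        (entry["cost"], name)
        for name, entry in LIBRARY.items()
        # A PE that cannot run the behavior at all is treated as infinitely slow.
        if entry["exec_time"].get(behavior, float("inf")) <= deadline
    ]
    return min(candidates)[1] if candidates else None

print(cheapest_pe("filter", 4.0))
```

Here the ASIC wins for the filter because it meets the deadline at a lower cost than the fast CPU, while the small CPU is too slow.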

26 Architectural Models (cont’d)
- CPU scheduling
  - Process vs. thread (light-weight process)
    - We use these terms interchangeably
  - Scheduling policies to run multiple processes on a single CPU
    - Non-preemptive vs. preemptive (prioritized)
    - Time-slicing is not normally used in embedded systems

27 Architectural Models (cont’d)
- Scheduling policies (cont’d)
  - Priority can be static or dynamic
    - A well-known static priority scheme: RMS (Rate-Monotonic Scheduling)
      - Best static-priority scheme
      - Guarantees all deadlines, provided enough CPU capacity is held in reserve
      - Needs about 31% extra CPU horsepower in the worst case
    - A well-known dynamic priority scheme: EDF (Earliest Deadline First)
      - Can reach 100% CPU utilization
      - May miss deadlines when the CPU is overloaded
  - More on this later
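
A minimal sketch of the standard utilization tests behind these numbers, assuming independent periodic tasks whose deadlines equal their periods. Tasks are hypothetical (execution time, period) pairs and the function names are chosen here. The RMS test is only a sufficient condition: a task set that fails it may still be schedulable.

```python
def utilization(tasks):
    """Total CPU utilization of (execution_time, period) tasks."""
    return sum(c / p for c, p in tasks)

def rms_bound(n):
    """Liu and Layland utilization bound for n tasks under RMS."""
    return n * (2 ** (1 / n) - 1)

def rms_guaranteed(tasks):
    """True if the RMS bound guarantees all deadlines (sufficient test only)."""
    return utilization(tasks) <= rms_bound(len(tasks))

def edf_feasible(tasks):
    """True if EDF meets all deadlines (utilization at most 100%)."""
    return utilization(tasks) <= 1.0

# Hypothetical task set with 95% utilization.
TASKS = [(1, 4), (2, 5), (3, 10)]
print(rms_guaranteed(TASKS), edf_feasible(TASKS))
```

For this task set the RMS bound for three tasks (about 0.78) is exceeded, so RMS gives no guarantee, while EDF still fits within 100% utilization.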

28 Topics
- Introduction
- Preliminaries
- Hardware/Software Partitioning
- Distributed System Co-Synthesis

29 Topics
- Introduction
- A Classification
- Examples
  - Vulcan
  - Cosyma

30 Introduction to HW/SW Partitioning
- The first variety of co-synthesis applications
- Definition
  - A HW/SW partitioning algorithm implements a specification on some sort of multiprocessor architecture
- Usually
  - Multiprocessor architecture = one CPU + some ASICs on the CPU bus

31 Introduction to HW/SW Partitioning (cont’d)
- Terminology
  - Allocation
    - Synthesis methods which design the multiprocessor topology along with the PEs and the SW architecture
  - Scheduling
    - The process of assigning PE (CPU and/or ASIC) time to processes so that they get executed

32 Introduction to HW/SW Partitioning (cont’d)
- In most partitioning algorithms
  - The type of CPU is fixed and given
  - ASICs must be synthesized
    - What function to implement on each ASIC?
    - What characteristics should the implementation have?
  - The problems addressed are single-rate synthesis problems
    - A CDFG is the starting model

33 HW/SW Partitioning (cont’d)
- Normal use of architectural components
  - The CPU performs less computationally intensive functions
  - ASICs are used to accelerate core functions
- Where to use?
  - High-performance applications
    - No CPU is fast enough for the operations
  - Low-cost applications
    - ASIC accelerators allow the use of a much smaller, cheaper CPU

34 A Classification
- Criterion: Optimization Strategy
  - Trade-off between performance and cost
  - Primal Approach
    - Performance is the primary goal
    - First, all functionality in ASICs. Progressively move more to the CPU to reduce cost.
  - Dual Approach
    - Cost is the primary goal
    - First, all functions in the CPU. Move operations to the ASIC to meet the performance goal.
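
The two strategies can be contrasted with a pair of toy greedy loops. This is a sketch under strong assumptions, not the algorithm of any particular tool: functions run one after another, so times simply add; communication cost is ignored; every `hw_cost` is positive; and all names and numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Func:
    name: str
    sw_time: float  # execution time on the CPU
    hw_time: float  # execution time on an ASIC
    hw_cost: float  # ASIC cost of implementing the function (assumed > 0)

def total_time(funcs, in_hw):
    """Sequential execution time for a given HW/SW assignment."""
    return sum(f.hw_time if f.name in in_hw else f.sw_time for f in funcs)

def dual_partition(funcs, deadline):
    """Start all-SW; move functions to HW until the deadline is met."""
    in_hw = set()
    while total_time(funcs, in_hw) > deadline:
        candidates = [f for f in funcs
                      if f.name not in in_hw and f.sw_time > f.hw_time]
        if not candidates:
            return None  # deadline cannot be met
        # Greedy choice: best time saved per unit of hardware cost.
        best = max(candidates, key=lambda f: (f.sw_time - f.hw_time) / f.hw_cost)
        in_hw.add(best.name)
    return in_hw

def primal_partition(funcs, deadline):
    """Start all-HW; move functions to SW while the deadline still holds."""
    in_hw = {f.name for f in funcs}
    if total_time(funcs, in_hw) > deadline:
        return None
    # Try the most expensive hardware first to cut cost quickly.
    for f in sorted(funcs, key=lambda f: f.hw_cost, reverse=True):
        if total_time(funcs, in_hw - {f.name}) <= deadline:
            in_hw.discard(f.name)
    return in_hw

FUNCS = [Func("fir", 10, 2, 5), Func("fft", 20, 4, 8), Func("ctrl", 3, 2, 6)]
print(dual_partition(FUNCS, 20), primal_partition(FUNCS, 20))
```

On this small example both directions end with only `fft` in hardware, but in general the two greedy searches can settle on different partitions because they approach the constraint from opposite sides.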

35 A Classification (cont’d)
- Classification due to optimization strategy (cont’d)
  - Example co-synthesis systems
    - Vulcan (Stanford): primal strategy
    - Cosyma (Braunschweig, Germany): dual strategy

36 Co-Synthesis Algorithms: HW/SW Partitioning HW/SW Partitioning Examples: Vulcan

37 Partitioning Examples: Vulcan
- Gupta, De Micheli, Stanford University
- Primal approach
  1. All-HW initial implementation.
  2. Iteratively move functionality to the CPU to reduce cost.
- System specification language
  - HardwareC
    - Is compiled into a flow graph

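38 Partitioning Examples: Vulcan (cont’d)
[Figure: two HardwareC fragments and their flow graphs]
- HardwareC: x=a; y=b; Flow graph: a nop node with edges (each labeled 1) to the nodes x=a and y=b
- HardwareC: if (c>d) x=e; else y=f; Flow graph: a cond node with an edge labeled c>d to x=e and an edge labeled c<=d to y=f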

39 Partitioning Examples: Vulcan (cont’d)
- Flow Graph Definition
  - A variation of a (single-rate) task graph
  - Nodes
    - Represent operations
    - Typically low-level operations: mult, add
  - Edges
    - Represent data dependencies
    - Each contains a Boolean condition under which the edge is traversed

40 Partitioning Examples: Vulcan (cont’d)
- Flow Graph
  - is executed repeatedly at some rate
  - can have initiation-time constraints for each node: $t(v_i) + l_{ij} \le t(v_j) \le t(v_i) + u_{ij}$
  - can have rate constraints on each node: $m_i \le R_i \le M_i$
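
Both constraint forms are simple interval checks. The sketch below, with hypothetical node names and numbers, reports which constraints a candidate schedule violates: `start[v]` plays the role of t(v), each timing tuple carries the lower and upper separation l_ij and u_ij, and each rate bound is the pair (m_i, M_i).

```python
def violated_timing(start, timing):
    """Return (i, j) pairs violating t(vi) + l_ij <= t(vj) <= t(vi) + u_ij."""
    return [
        (i, j)
        for i, j, lower, upper in timing
        if not start[i] + lower <= start[j] <= start[i] + upper
    ]

def violated_rates(rate, bounds):
    """Return nodes whose execution rate R_i falls outside [m_i, M_i]."""
    return [v for v, (m, M) in bounds.items() if not m <= rate[v] <= M]

# Hypothetical schedule and constraints.
START = {"v1": 0, "v2": 3, "v3": 10}
TIMING = [("v1", "v2", 2, 5), ("v2", "v3", 1, 4)]
RATE = {"v1": 10, "v2": 3}
BOUNDS = {"v1": (5, 20), "v2": (4, 8)}

print(violated_timing(START, TIMING), violated_rates(RATE, BOUNDS))
```

In this example v3 starts 7 time units after v2, beyond the allowed maximum separation of 4, and v2 runs below its minimum rate.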

