Introduction to Multiprocessor System-on-Chip

1 Introduction to Multiprocessor System-on-Chip
Prof. Jan Madsen, Informatics and Mathematical Modeling, Technical University of Denmark, Richard Petersens Plads, Building 321, DK-2800 Lyngby, Denmark

2 Embedded systems
[Figure: a function (func) written with if ... then ... else and for { ... } constructs, shown next to a bit-pattern and the components mem, CPU, rom and io.] (c) Jan Madsen

3 Embedded systems
Systems which use a computer to perform a specific function, but are neither used nor perceived as a computer.
They are embedded within larger electronic devices.
They repeatedly carry out a particular function.
They are often completely unrecognized by the device’s user.

4 Embedded systems design
Several design groups: hardware and software
Separate models: hardware model and software model
Separated validations: hardware prototype and software prototype
Prototype realization
Problems arise at a very late point in the design process

5 Principles of Codesign
[Figure: one specification feeds SW synthesis (CPU), HW synthesis (ASIC) and interface synthesis.]
The specification shown on the slide:

    void UnitControl() {
        up = down = 0; open = 1;
        while (1) {
            while (req == floor);
            open = 0;
            if (req > floor) { up = 1; }
            else { down = 1; }
            while (req != floor);
            open = 1;
            delay(10);
        }
    }

6 Overview
Technology: processors, IC fabric
Codesign for speed-up: component execution timing (SW and HW)
Building sub-system: hardware/software partitioning
Building system: system-level issues of codesign

7 Software
[Figure: func with if ... then ... else and for { ... } constructs, executing on a processing element (pe).]
Elements of computation:
Store data
Transform data
Move data

8 Processor
Architecture components:
Processing elements: transform data
Memories: store data
Interconnect: move data

9 Processor: General Purpose
[Figure: inst mem, controller (pc, ir, cu), datapath (reg, +/-, *), data mem.]
Availability
Low cost (mass production)
Simple design flow
High flexibility

10 Processor: General Purpose - example
[Figure: the general-purpose processor of the previous slide, with A[i] in data mem.]
x = x + A[i] * p1 takes 5 cycles

11 Processor: Custom (ASIC)
[Figure: controller (cu), datapath (+/-, *, +), mem.]
High performance
Low power
Complex design flow
No flexibility

12 Processor: Custom (ASIC) – example
[Figure: the custom processor of the previous slide, with A[i] in mem.]
x = x + A[i] * p1 takes 1 cycle

13 Processor: Semicustom (ASIP)
[Figure: inst mem, controller (pc, ir, cu), datapath (reg, +/-, +, *), data mem.]
Customized datapath: 16, 8 or 4 bit
Optimized for a particular class of programs: MACC
”Simple” design flow
High flexibility

14 Processor: Semicustom - example
[Figure: the semicustom processor of the previous slide, with A[i] in data mem.]
x = x + A[i] * p1 takes 2 cycles
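
The three example slides can be compared with a small cycle-count model. This is only a sketch: it assumes the statement x = x + A[i] * p1 is repeated n times and that loop overhead is ignored. Only the per-statement cycle counts (5, 1 and 2) come from the slides; the names `pe_kind` and `mac_cycles` are hypothetical.

```c
#include <stddef.h>

/* Processor styles from the general purpose, custom and semicustom slides. */
enum pe_kind { PE_GENERAL_PURPOSE, PE_ASIC, PE_ASIP };

/* Cycles for one x = x + A[i] * p1, as quoted on the example slides. */
static unsigned cycles_per_mac(enum pe_kind kind)
{
    switch (kind) {
    case PE_GENERAL_PURPOSE: return 5;
    case PE_ASIC:            return 1;
    case PE_ASIP:            return 2;
    }
    return 0; /* not reached for valid kinds */
}

/* Total cycles for n repetitions, ignoring loop overhead (assumption). */
unsigned long mac_cycles(enum pe_kind kind, size_t n)
{
    return (unsigned long)cycles_per_mac(kind) * n;
}
```

For example, `mac_cycles(PE_ASIP, 100)` evaluates to 200 cycles, against 500 for the general-purpose processor and 100 for the ASIC.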

15 IC fabrics
An IC is an interconnection of transistors following one of several possible styles: fabrics.
The fabric defines how and when transistors are composed: ”the material of processors”.
IC fabrics differ in terms of customizability and generality.

16 IC fabrics: Custom
Exact implementation of processor components
High NRE cost: mask set ~ 1M$

17 IC fabrics: Semicustom
Several semicustom fabrics:
Library of standard cells
Cell arrays (sea-of-gates)
Most processing steps are premanufactured (high volume)

18 IC fabrics: Programmable
Set of interconnected modules
Set of modules programmed to implement different components
FPGA: programmable logic modules, storage and interconnect

19 Chips: Implementing IC fabric

20 Hardware/software codesign?
[Figure: func with if ... then ... else and for { ... } constructs, to be mapped onto processors.]
Many possible mappings
Processor may not exist yet!
Exploring the design space
Need to estimate

21 Hardware/Software Codesign
Optimizing:
Timing (high performance, hard deadlines)
Area (cost)
Power consumption
Flexibility
Reliability
...
We will focus on timing.

22 Processing element timing
Execution path: control data dependent, input data dependent
Function implementation: component architecture, compiler or synthesis
[Figure: func with if ... then ... else and for { ... } constructs.]

23 Formal execution path timing analysis
b_i: basic block or program segment
t_pe(b_i, pe_j): execution time of b_i on processing element pe_j
c(b_i): execution frequency of b_i
Worst/best case timing bounds:
$$t_{pe}(F, pe_j) = \sum_i t_{pe}(b_i, pe_j) \cdot c(b_i)$$
[Figure: a function with if ... then ... else { ... } and for { ... } constructs, divided into basic blocks b1 to b4.]
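
The definitions above amount to a weighted sum: each block's execution time on the chosen processing element times how often the block runs. A minimal sketch follows, assuming the per-block best and worst times and the execution frequencies are already known. Using the same frequency for both bounds is a simplification, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* One basic block b_i as seen by the timing model. */
struct basic_block {
    double t_best;   /* best-case execution time of b_i on pe_j */
    double t_worst;  /* worst-case execution time of b_i on pe_j */
    unsigned count;  /* c(b_i): execution frequency of b_i */
};

struct time_bounds {
    double best;
    double worst;
};

/* Sum of t_pe(b_i, pe_j) * c(b_i) over all blocks, for both bounds. */
struct time_bounds function_time(const struct basic_block *b, size_t n)
{
    struct time_bounds t = { 0.0, 0.0 };
    for (size_t i = 0; i < n; i++) {
        t.best  += b[i].t_best  * b[i].count;
        t.worst += b[i].t_worst * b[i].count;
    }
    return t;
}
```

In a real flow the frequencies would come from profiling or path analysis, and the best and worst cases would generally use different frequency bounds.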

24 Formal execution path timing analysis
[Figure: the t_pe(b_i, pe_j) model for block b2 (then ...), with its +, - and * operations shown once for software and once for hardware.]

25 Memory models
Access time
Control overhead
Burst access (packets)
Cache hit/miss time overhead, based on execution history
[Figure: PE with D$ and I$, Flash, RAM, SDRAM.]

26 Advanced architectures
Modern high-performance processors include architectural features which complicate timing analysis:
Dynamic instruction scheduling
Speculative execution
Though fast, these features make the processor very power hungry, make tight bounds on timing very difficult, and make computation less predictable.
These are issues which are important for embedded systems.

27 Building sub-systems
[Figure: func with if ... then ... else and for { ... } constructs, mapped onto a processor and an ASIC.]
Initial codesign problem: hardware/software partitioning
The LYCOS cosynthesis tool:
Automatic partitioning from C (subset) and VHDL (single process)
Developed at DTU

28 Hardware/Software partitioning
[Figure: func divided into basic blocks b1 to b4 (nodes 1 to 4), with a mapping of the blocks onto CPU and ASIC.]

29 Architectural choices
Which processor should be selected and how fast should it be?
Which ASIC technology should be chosen and how fast should the ASIC be?
How large an ASIC can we afford and which functions should it execute?
How should the processor and ASIC communicate?

30 Partitioning Model
[Figure: specification, model, basic blocks (BB), SW and HW.]
The model determines granularity and simplifying assumptions w.r.t. communication, HW sharing, etc.

31 Estimation
[Figure: SW estimator with SW library, HW estimator with HW library, and communication (Com) estimator, producing time (t) and area (a) estimates labelled S, H and C.]

32 Process communication
s(b_i): sent data in b_i
r(b_i): received data in b_i
c(b_i): execution frequency of b_i
Communication time for s(b_i) and r(b_i) is determined by:
Data volume
Data encoding
Communication protocol
[Figure: basic blocks b1 to b4 of a function with if ... then ... else { send(...); receive(...); ... } and for { ... } constructs.]
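
To make the role of s, r and c concrete, here is a deliberately simple communication-time estimate. It assumes every data unit costs the same fixed time, so data encoding and protocol effects are folded into one hypothetical `time_per_unit` parameter. The slide does not give this formula, and the names are illustrative.

```c
#include <stddef.h>

struct comm_block {
    unsigned sent;      /* s(b_i): data units sent in b_i */
    unsigned received;  /* r(b_i): data units received in b_i */
    unsigned count;     /* c(b_i): execution frequency of b_i */
};

/* Assumed model: each data unit costs time_per_unit, whatever the protocol. */
double comm_time(const struct comm_block *b, size_t n, double time_per_unit)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        double units = (double)b[i].sent + (double)b[i].received;
        total += (double)b[i].count * units * time_per_unit;
    }
    return total;
}
```

A more faithful estimator would add a per-transfer protocol overhead and account for burst transfers.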

33 Solving the Partitioning Problem
[Figure: blocks 1 to 6, each to be placed in SW or HW.]
Just try all combinations...
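
"Just try all combinations" can be sketched as an exhaustive search over every SW/HW assignment. The sketch assumes additive hardware areas, no communication cost, a single area budget, and fewer than 32 blocks; the cost fields and the function name are hypothetical, not taken from LYCOS.

```c
#include <float.h>
#include <stddef.h>

struct block_cost {
    double sw_time;  /* execution time if the block stays in software */
    double hw_time;  /* execution time if the block moves to hardware */
    double hw_area;  /* hardware area the block would occupy */
};

/*
 * Try all 2^n assignments (n < 32 assumed). Bit i of the returned mask is 1
 * when block i is placed in hardware. Areas are additive and communication
 * is ignored; both are simplifying assumptions.
 */
unsigned long best_partition(const struct block_cost *b, size_t n,
                             double max_area, double *best_time)
{
    unsigned long best_mask = 0;
    *best_time = DBL_MAX;
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        double time = 0.0, area = 0.0;
        for (size_t i = 0; i < n; i++) {
            if (mask & (1UL << i)) {
                time += b[i].hw_time;
                area += b[i].hw_area;
            } else {
                time += b[i].sw_time;
            }
        }
        /* Keep the fastest assignment that fits the area budget. */
        if (area <= max_area && time < *best_time) {
            *best_time = time;
            best_mask = mask;
        }
    }
    return best_mask;
}
```

The outer loop runs 2^n times, so the approach is only practical for a handful of blocks.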

34 Solving the Partitioning Problem
Three partitioning models:
No communication, interleaved execution, additive areas
Interleaved communication, additive areas
Parallel execution, non-additive areas
[Figure: blocks 1 to 6 (and 1 to 7 in the parallel case) distributed over SW and HW for each model.]
Knapsack Stuffing
Large scale linear/nonlinear integer programming
Heuristics needed!
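
With no communication and additive areas, choosing which blocks to move to hardware can be cast as a 0/1 knapsack problem: each block has an area (its weight) and a time saving (its value). The sketch below is the textbook dynamic-programming solution, shown only to illustrate that formulation. It is not presented as the algorithm used in LYCOS, and it assumes areas and gains have been scaled to integers.

```c
#include <stddef.h>
#include <stdlib.h>

struct hw_candidate {
    unsigned area;  /* hardware area in integer units (assumption) */
    unsigned gain;  /* time saved by moving the block to hardware */
};

/*
 * Classic 0/1 knapsack by dynamic programming. Returns the largest total
 * gain that fits in max_area, or 0 if memory allocation fails.
 */
unsigned best_gain(const struct hw_candidate *c, size_t n, unsigned max_area)
{
    unsigned *dp = calloc((size_t)max_area + 1, sizeof *dp);
    if (dp == NULL)
        return 0;
    for (size_t i = 0; i < n; i++) {
        /* Walk areas downwards so each block is used at most once. */
        for (unsigned a = max_area; a >= c[i].area; a--) {
            unsigned with_block = dp[a - c[i].area] + c[i].gain;
            if (with_block > dp[a])
                dp[a] = with_block;
            if (a == 0)
                break; /* avoid unsigned wrap-around when area is 0 */
        }
    }
    unsigned result = dp[max_area];
    free(dp);
    return result;
}
```

The running time grows with the number of blocks times the area budget, which is manageable for this simplest model. The other two models on the slide do not fit this formulation directly.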

35 LYCOS Design Flow
[Figure: design flow with the stages Specification, Functional Require, Translate, CDFG, Analysis, SW/HW/Comm. estimation models, Partitioning, SW synthesis, Comm. synthesis, HW synthesis, Assembler, Netlist.]

36 Building Systems
Platform architectures are heterogeneous:
Different processing element types
Different interconnection networks and communication protocols
Different memory types
Different scheduling and synchronization strategies
[Figure: platform with M, CoP, P and DSP.]

37 Managing HW platform complexity
Development of APIs to hide complexity from the application programmer and improve portability
Specialized RTOS to control resource sharing and interfaces
Result: a complex multi-level HW/SW architecture

38 Software architecture
[Figure: HW/SW platform for pe1. Software: private and shared applications, RTOS-APIs, RTOS, private drivers. Hardware: CPU, mem, Cache, I/O, Int, Timer, Bus-CTRL, Periphery, Bus, ce1.]

39 Platform design challenges
Integration:
Design process integration
Heterogeneous component and language integration
Design space exploration and optimization
Verification

40 Complex run-time interdependencies
Run-time dependencies of independent components via communication
Influence on timing and power
Need to handle resource sharing:
Process/task scheduling
Communication scheduling
Scheduling strategies (static, dynamic, time or priority driven)
[Figure: platform including a CoP.]

41 Interdependency example
Complex non-functional interdependencies
Periodic task executing on PE
Task writes to bus at the end of each periodic execution
Short execution time gives high bus load; long execution time gives low bus load
A local decision on improving performance may impact the global system performance

42 System-on-Chip challenge
[Figure: chip composed of processor, memory, io and router blocks.]

43 Network-on-Chip
Multi-hop
Concurrency:
Segmented communication
Multiple simultaneous communications
[Figure: network with nodes a, b, c, d and M.]

44 Network-on-Chip
Multi-hop
Concurrency:
Segmented communication
Multiple simultaneous communications
Sharing:
Quasi-simultaneous resource usage
Multiple communication events occupying some or all resources in an interleaved fashion
[Figure: network with nodes a, b, c, d and M.]

45 System-on-Chip design
[Figure: tasks 1 to 4 and an os, with a mapping onto a, b and c; further labels L1 to L3 and R1 to R3.]

46 New design paradigm ...
Platform-based design
[Figure: specification, IP, platform design, platform, mapping, re-design, re-configure.]

47 Thank you!
