Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 These slides use Microsoft.

Slides:



Advertisements
Similar presentations
Fakultät für informatik informatik 12 technische universität dortmund Optimizations - Compilation for Embedded Processors - Peter Marwedel TU Dortmund.
Advertisements

Analysis of Computer Algorithms
fakultät für informatik informatik 12 technische universität dortmund Additional compiler optimizations Peter Marwedel TU Dortmund Informatik 12 Germany.
Evaluation and Validation
Fakultät für informatik informatik 12 technische universität dortmund Specifications and Modeling Peter Marwedel TU Dortmund, Informatik 12 Graphics: ©
Fakultät für informatik informatik 12 technische universität dortmund Standard Optimization Techniques Peter Marwedel Informatik 12 TU Dortmund Germany.
Technische universität dortmund fakultät für informatik informatik 12 Specifications and Modeling Peter Marwedel TU Dortmund, Informatik
fakultät für informatik informatik 12 technische universität dortmund Optimizations - Compilation for Embedded Processors - Peter Marwedel TU Dortmund.
Fakultät für informatik informatik 12 technische universität dortmund Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics:
Università di Genova Facoltà di Ingegneria Genova 28 settembre 2007 A benchmark for SoftPLC Technology in Windows Environment Valerio Malerba Università
Fakultät für informatik informatik 12 technische universität dortmund Classical scheduling algorithms for periodic systems Peter Marwedel TU Dortmund,
Technische universität dortmund fakultät für informatik informatik 12 Specifications, Modeling, and Model of Computation Jian-Jia Chen (slides are based.
Hardware/ Software Partitioning 2011 年 12 月 09 日 Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 These.
CSCI 4717/5717 Computer Architecture
1 Swiss Federal Institute of Technology Computer Engineering and Networks Laboratory Design Space Exploration of Embedded Systems © Lothar Thiele ETH Zurich.
ECE 720T5 Fall 2011 Cyber-Physical Systems Rodolfo Pellizzoni.
CA 714CA Midterm Review. C5 Cache Optimization Reduce miss penalty –Hardware and software Reduce miss rate –Hardware and software Reduce hit time –Hardware.
Problem 2: The In-Car Radio Navigation Case Study Marcel Verhoef (Chess Information Technology, NL) Ernesto Wandeler, Lothar Thiele (ETH Zürich, CH) Paul.
Modeling the ISOLA case study environment scenario (outside the system) application scenario (inside the system)
Lab Meeting Performance Analysis of Distributed Embedded Systems Lothar Thiele and Ernesto Wandeler Presented by Alex Cameron 17 th August, 2012.
1 Swiss Federal Institute of Technology Computer Engineering and Networks Laboratory Performance Analysis of Embedded Systems Lothar Thiele ETH Zurich.
What's inside a router? We have yet to consider the switching function of a router - the actual transfer of datagrams from a router's incoming links to.
Processor Technology and Architecture
Chapter XI Reduced Instruction Set Computing (RISC) CS 147 Li-Chuan Fang.
1 Lecture 11: Digital Design Today’s topics:  Evaluating a system  Intro to boolean functions.
Chapter 4 Processor Technology and Architecture. Chapter goals Describe CPU instruction and execution cycles Explain how primitive CPU instructions are.
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.
NSF Foundations of Hybrid and Embedded Software Systems UC Berkeley: Chess Vanderbilt University: ISIS University of Memphis: MSI A New System Science.
Fakultät für informatik informatik 12 technische universität dortmund Classical scheduling algorithms for periodic systems Peter Marwedel TU Dortmund,
CS294-6 Reconfigurable Computing Day 3 September 1, 1998 Requirements for Computing Devices.
Swiss Federal Institute of Technology Computer Engineering and Networks Laboratory Influence of different system abstractions on the performance analysis.
Software Issues Derived from Dr. Fawcett’s Slides Phil Pratt-Szeliga Fall 2009.
Copyright © , Software Engineering Research. All rights reserved. Creating Responsive Scalable Software Systems Dr. Lloyd G. Williams Software.
Chapter 1 Introduction. Computer Architecture selecting and interconnecting hardware components to create computers that meet functional, performance.
- 1 - Embedded systems: processing Embedded System Hardware Embedded system hardware is frequently used in a loop („hardware in a loop“): actuators.
1 Swiss Federal Institute of Technology Computer Engineering and Networks Laboratory Internal Design Representations for Embedded System Design Lothar.
HRTC Meeting 12 September 2002, Vienna Smart Sensors Thomas Losert.
Software Reliability SEG3202 N. El Kadri.
1 4.2 MARIE This is the MARIE architecture shown graphically.
1 Performance Evaluation of Computer Systems and Networks Introduction, Outlines, Class Policy Instructor: A. Ghasemi Many thanks to Dr. Behzad Akbari.
Introduction of Intel Processors
Digital Design and Computer Architecture Dr. Robert D. Kent LT Ext Lecture 1 Introduction.
Quality of Service Karrie Karahalios Spring 2007.
Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany 2013 年 12 月 02 日 These slides use Microsoft clip arts. Microsoft copyright.
Leiden Embedded Research Center Prof. Dr. Ed F. Deprettere, Dr. Bart Kienhuis, Dr. Todor Stefanov Leiden Embedded Research Center (LERC) Leiden Institute.
Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany 2012 年 12 月 05 日 These slides use Microsoft clip arts. Microsoft copyright.
1 - CPRE 583 (Reconfigurable Computing): Reconfigurable Computing Architectures Iowa State University (Ames) Reconfigurable Architectures Forces that drive.
Computer Architecture CPSC 350
Chapter 3 System Performance and Models Introduction A system is the part of the real world under study. Composed of a set of entities interacting.
1 Copyright  2001 Pao-Ann Hsiung SW HW Module Outline l Introduction l Unified HW/SW Representations l HW/SW Partitioning Techniques l Integrated HW/SW.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
Chapter 5: Computer Systems Design and Organization Dr Mohamed Menacer Taibah University
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
- 1 - EE898- Compiler Compilers for embedded systems: Why are compilers an issue?  Many reports about low efficiency of standard compilers  Special.
Classical scheduling algorithms for periodic systems Peter Marwedel TU Dortmund, Informatik 12 Germany 2012 年 12 月 19 日 These slides use Microsoft clip.
Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 These slides use Microsoft.
OPERATING SYSTEMS CS 3502 Fall 2017
ECE354 Embedded Systems Introduction C Andras Moritz.
Low-power Digital Signal Processing for Mobile Phone chipsets
Architecture & Organization 1
Intel Atom Architecture – Next Generation Computing
Computer Architecture CSCE 350
COMP4211 : Advance Computer Architecture
Standard Optimization Techniques
Lecture 14 Virtual Memory and the Alpha Memory Hierarchy
Architecture & Organization 1
Evaluation and Validation
Leonie Ahrendts, Sophie Quinton, Thomas Boroske, Rolf Ernst
Presentation transcript:

Evaluation and Validation Peter Marwedel TU Dortmund, Informatik 12 Germany Graphics: © Alexandra Nolte, Gesine Marwedel, 2003 These slides use Microsoft clip arts. Microsoft copyright restrictions apply 年 01 月 12 日

- 2 -  p. marwedel, informatik 12, 2011 Structure of this course 2: Specification 3: ES-hardware 4: system software (RTOS, middleware, …) 8: Test 5: Evaluation & Validation (energy, cost, performance, …) 7: Optimization 6: Application mapping Application Knowledge Design repository Design Numbers denote sequence of chapters

- 3 -  p. marwedel, informatik 12, 2011 Real-time calculus (RTC)/ Modular performance analysis (MPA)  p 2p2p 3p3p  p 2p2p 3p3p p-J p+J periodic event streamperiodic event stream with jitter Thiele et al. (ETHZ): Extended network calculus: Arrival curves describe the maximum and minimum number of events arriving in some time interval . Examples: p p p-J P+J

- 4 -  p. marwedel, informatik 12, 2011 RTC/MPA: Service curves Service curves  u resp.  ℓ describe the maximum and minimum service capacity available in some time interval  s p bandwidth b TDMA bus p s p-sp+s b sb s 2p Example:

- 5 -  p. marwedel, informatik 12, 2011 RTC/MPA: Workload characterization  u resp.  ℓ describe the maximum and minimum service capacity required as a function of the number e of events. Example: e WCET=4 BCET=3  u u  l l (Defined only for an integer number of events)

- 6 -  p. marwedel, informatik 12, 2011 RTC/MPA: Workload required for incoming stream Incoming workload Upper and lower bounds on the number of events

- 7 -  p. marwedel, informatik 12, 2011 RTC/MPA: System of real time components RTC RTC’ RTC” … Incoming event streams and available capacity are transformed by real-time components: Theoretical results allow the computation of properties of outgoing streams 

- 8 -  p. marwedel, informatik 12, 2011 RTC/MPA: Transformation of arrival and service curves Resulting arrival curves: Remaining service curves: Where:

- 9 -  p. marwedel, informatik 12, 2011 RTC/MPA: Remarks  Details of the proofs can be found in relevant references  Results also include bounds on buffer sizes and on maximum latency.  Theory has been extended into various directions, e.g. for computing remaining battery capacities

 p. marwedel, informatik 12, 2011 Application: In-Car Navigation System Car radio with navigation system User interface needs to be responsive Traffic messages (TMC) must be processed in a timely way Several applications may execute concurrently © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 System Overview NAVRAD MMI DB Communication © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 NAVRAD MMI DB Communication Use case 1: Change Audio Volume < 200 ms < 50 ms © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Use case 1: Change Audio Volume © Thiele, ETHZ Communication Resource Demand

 p. marwedel, informatik 12, 2011 NAVRAD MMI DB Communication < 200 ms Use case 2: Lookup Destination Address © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Use case 2: Lookup Destination Address © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 NAVRAD MMI DB Communication Use case 3: Receive TMC Messages < 1000 ms © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Use case 3: Receive TMC Messages © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Proposed Architecture Alternatives NAV RAD MMI 22 MIPS 11 MIPS 113 MIPS NAVRAD MMI 22 MIPS 11 MIPS113 MIPS RAD 260 MIPS NAV MMI 22 MIPS RAD 130 MIPS MMI NAV 113 MIPS MMI 260 MIPS RAD NAV 72 kbps 57 kbps 72 kbps (A) (E)(D)(C) (B) © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Step 1: Environment (Event Steams) Event Stream Model e.g. Address Lookup (1 event / sec) uu ℓℓ [s] [events] 1 1 © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Step 2: Architectural Elements Event Stream Model e.g. Address Lookup (1 event / sec) Resource Model e.g. unloaded RISC CPU (113 MIPS) l=ul=u uu ll [s] [MIPS] [events] © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Step 3: Mapping / Scheduling Rate Monotonic Scheduling (Pre-emptive fixed priority scheduling):  Priority 1:Change Volume (p=1/32 s)  Priority 2:Address Lookup (p=1 s)  Priority 3:Receive TMC (p=6 s) © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Step 4: Performance Model CPU1BUSCPU3 CPU2 Change Volume Receive TMC     NAVRAD MMI Address Lookup   MMI NAVRAD © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Step 5: Analysis CPU1BUSCPU3 CPU2 Change Volume Receive TMC     NAVRAD MMI Address Lookup   MMI NAVRAD © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Analysis – Design Question 1 How do the proposed system architectures compare in respect to end-to-end delays? © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 End-to-end delays: (A)(E)(D)(C)(B) Analysis – Design Question 1 © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Analysis – Design Question 2 How robust is architecture A? Where is the bottleneck of this architecture? NAVRAD MMI 22 MIPS 11 MIPS113 MIPS 72 kbps (A) © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 TMC delay vs. MMI processor speed: Analysis – Design Question 2 NAVRAD MMI 22 MIPS 11 MIPS113 MIPS 72 kbps (A) 26.4 MIPS © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Conclusions  Easy to construct models (~ half day)  Evaluation speed is fast and linear to model complexity (~ 1s per evaluation)  Needs little information to construct early models (Fits early design cycle very well)  Even though involved mathematics is very complex, the method is easy to use (Language of engineers) © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Evaluation of designs according to multiple objectives Different design objectives/criteria are relevant:  Average performance  Worst case performance  Energy/power consumption  Thermal behavior  Reliability  Electromagnetic compatibility  Numeric precision  Testability  Cost  Weight, robustness, usability, extendibility, security, safety, environmental friendliness

 p. marwedel, informatik 12, 2011 Energy- and power models Power/energy models becoming increasingly important  due to mobile computing,  since energy availability becomes more relevant due to increased performance, and  due to environmental issues.

 p. marwedel, informatik 12, 2011 Energy models  Tiwari (1994): Energy consumption within processors  Simunic (1999): Using values from data sheets. Allows modeling of all components, but not very precise.  Russell, Jacome (1998): Measurements for 2 fixed configurations  Steinke et al., UniDo (2001): mixed model using measurements and prediction  CACTI [Jouppi, 1996]: Predicted energy consumption of caches  Wattch [Brooks, 2000]: Power estimation at the architectural level, without circuit or layout

 p. marwedel, informatik 12, 2011 Steinke’s & Knauer’s model E.g.: ATMEL board with ARM7TDMI and ext. SRAM Data Memory Instruction Memory IAddr V DD ALU Multi- plier Barrel Shifter Register File Instr. Decoder & Control Logic Instr Imm Reg Value Reg# Opcode ARM7 DAddr mA Data Instr E total = E cpu_instr + E cpu_data + E mem_instr + E mem_data

 p. marwedel, informatik 12, 2011 Example: Instruction dependent costs in the CPU E cpu_instr =  MinCostCPU(Opcode i ) + FUCost(Instr i-1,Instr i )  1 *  w(Imm i,j ) + ß 1 *  h(Imm i-1,j, Imm i,j ) +  2 *  w(Reg i,k ) + ß 2 *  h(Reg i-1,k, Reg i,k ) +  3 *  w(RegVal i,k ) + ß 3 *  h(RegVal i-1,k, RegVal i,k ) +  4 *  w(IAddr i ) + ß 4 *  h(IAddr i-1, IAddr i ) Cost for a sequence of m instructions w : number of ones; h : Hamming distance; FUCost: cost of switching functional units , ß: determined through experiments

 p. marwedel, informatik 12, 2011 Hamming Distance between adjacent addresses is playing major role

 p. marwedel, informatik 12, 2011 Results  It is not important, which address bit is set to ‘1‘  The number of ‘1‘s in the address bus is irrelevant  The cost of flipping a bit on the address bus is independent of the bit position.  It is not important, which data bit is set to ‘1‘  The number of ‘1‘s on the data bus has a minor effect (3%)  The cost of flipping a bit on the data bus is independent of the bit position.

 p. marwedel, informatik 12, 2011 CACTI model Comparison with SPICE Cache model used

 p. marwedel, informatik 12, 2011 Access times and energy consumption increase with the size of the memory Example (CACTI Model): "Currently, the size of some applications is doubling every 10 months" [STMicroelectronics, Medea+ Workshop, Stuttgart, Nov. 2003]

 p. marwedel, informatik 12, 2011 Influence of the associativity Parameters different from previous slides [P. Marwedel et al., ASPDAC, 2004]

 p. marwedel, informatik 12, 2011 Evaluation of designs according to multiple objectives Different design objectives/criteria are relevant:  Average performance  Worst case performance  Energy/power consumption  Thermal behavior  Reliability  Electromagnetic compatibility  Numeric precision  Testability  Cost  Weight, robustness, usability, extendibility, security, safety, environmental friendliness

 p. marwedel, informatik 12, 2011 Thermal models Thermal models becoming increasingly important  since temperatures become more relevant due to increased performance, and  since temperatures affect usability and dependability.

 p. marwedel, informatik 12, 2011 Model components  Thermal conductance reflects the amount of heat transferred through a plate (made from that material) of area A and thickness L when the temperatures at the opposite sides differ by one Kelvin.  The reciprocal of thermal conductance is called thermal resistance.  Thermal resistances add up like electrical resistances  Masses storing heat correspond to capacitors  Thermal modeling typically uses equivalent electrical models and employs well-known techniques for solving electrical network equations

 p. marwedel, informatik 12, 2011 Results of simulations based on thermal models (1) Source: NL_JUN_2001/Campi/Jun_Campi_2001.html Encapsulated cryptographic coprocessor:

 p. marwedel, informatik 12, 2011 Results of simulations based on thermal models (2) Source: applications/app141/hot_chip.pdf Microprocessor

 p. marwedel, informatik 12, 2011 Summary  Thiele’s real-time calculus (RTC)/MPA Using bounds on the number of events in input streams Using bounds on available processing capability Derives bounds on the number of events in output streams Derives bound on remaining processing capability, buffer sizes, … Examples demonstrate design procedure based on RTC  Energy and power consumption  Thermal behavior

 p. marwedel, informatik 12, 2011 Some more details

 p. marwedel, informatik 12, 2011 Analysis – Design Question 3 Architecture D is chosen for further investigation. How should the processors be dimensioned? RAD 130 MIPS MMI NAV 113 MIPS 72 kbps (D) © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 RAD 130 MIPS MMI NAV 113 MIPS 72 kbps 33 MIPS 29 MIPS Analysis – Design Question 3 d max = 200 d max = 1000 d max = 50 d max = 200 © Thiele, ETHZ

 p. marwedel, informatik 12, 2011 Hamming Distance between adjacent values on the data bus is important

 p. marwedel, informatik 12, 2011 Other costs E cpu_data =   5 * w (DAddr i ) + ß 5 * h (DAddr i-1, DAddr i ) +  6 * w (Data i ) + ß 6 * h (Data i-1, Data i ) E mem_instr =  MinCostMem(InstrMem,Word_width i ) +  7 * w (IAddr i ) + ß 7 * h (IAddr i-1, IAddr i ) +  8 * w (IData i ) +ß 8 * h (IData i-1, IData i ) E mem_data =  MinCostMem (DataMem, Direction, Word_width i ) +  9 * w (DAddr i ) + ß 9 * h (DAddr i-1, DAddr i ) +  10 * w (Data i ) + ß 10 * h (Data i-1, Data i )