WCET Analysis for a Java Processor
Martin Schoeberl, TU Vienna, Austria
Rasmus Pedersen, CBS, Denmark



Martin Schoeberl – WCET Analysis for a Java Processor JTRES 2006

Outline
  Motivation
  WCET analysis
  The Java processor JOP
  Results
  Conclusion, future work
  Demo

Motivation
  Schedule analysis
  For (hard) real-time
  Execution time numbers needed
  Static WCET analysis
  No measurements!
  Cannot guarantee the worst case (WC)
  Still used

Issues with Static WCET Analysis
  Why is WCET analysis so seldom used?
  High-level part is easy
  Low-level is the hard part
  Instruction timing
  Caches
  Advanced, speculative processors

Static WCET Analysis
  Mature research
  High-level analysis is based on ILP (integer linear programming)
  Construct the CFG
  Add execution time to each basic block (BB)
  Build ILP equations:
  Sum of BB execution times (t_exe) is maximized
  Frequency of in-edges = frequency of out-edges
  Add loop constraints
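The ILP steps on this slide are commonly written in the form below. This is a sketch in the usual implicit path enumeration style, not a formulation taken from the slides: the symbols $x_i$ (execution count of basic block $i$), $t_i$ (its execution time), $f_e$ (execution frequency of CFG edge $e$), $n_h$ (bound of the loop with header $h$), and the virtual entry edge $e_0$ are all notation assumed here.

```latex
\begin{align*}
\text{maximize}\quad
  & \mathrm{WCET} = \sum_{i \in B} t_i \, x_i \\
\text{subject to}\quad
  & x_i = \sum_{e \in \mathrm{in}(i)} f_e = \sum_{e \in \mathrm{out}(i)} f_e
      && \text{for every basic block } i \in B, \\
  & f_{e_0} = 1
      && \text{for the virtual edge } e_0 \text{ into the entry block}, \\
  & \sum_{e \in \mathrm{back}(h)} f_e \;\le\; n_h \sum_{e \in \mathrm{enter}(h)} f_e
      && \text{for every loop header } h, \\
  & x_i,\, f_e \in \mathbb{Z}_{\ge 0}.
\end{align*}
```

Here $\mathrm{back}(h)$ denotes the back edges of the loop and $\mathrm{enter}(h)$ the edges entering it from outside, so $n_h$ bounds the back-edge traversals per loop entry. The exit block is assumed to have a matching virtual out-edge so that flow conservation holds there as well.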

Low-level WCET Analysis
  Execution time of basic blocks
  Sum of instruction timing?
  Not in modern CPUs
  What about memory access?
  Instruction cache
  Instruction prefetch
  Data cache
  Very hard for general-purpose CPUs

The Proposed Solution
  Build a processor to simplify WCET analysis
  Avoid non-analyzable features
  Find better solutions
  Java processor JOP
  Built from the ground up for WCET
  FPGA implementation (small)
  Not slow on average

JOP Features
  A RISC stack machine
  4-stage pipeline
  No dependencies
  No shared state (e.g. memory)
  Stack cache
  Method cache

Size
  Table columns: Resource (LC), Memory (KB), Fmax (MHz)
  Table rows: JOP min, JOP typ, Lightfoot, LEON

Performance

Low-level Timing
  Bytecode execution time known
  Analysis of microcode (DATE'06)
  No dependencies
  Documented
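With known per-bytecode timings and no dependencies between instructions, the execution time of a basic block reduces to a plain sum. The following is a minimal sketch of that idea; the class name, the method name, and the cycle counts in the table are hypothetical placeholders, not JOP's documented timings.

```java
import java.util.List;
import java.util.Map;

class BasicBlockTimingSketch {

    // Hypothetical per-bytecode cycle counts (placeholder values for illustration,
    // NOT the documented JOP timings).
    static final Map<String, Integer> CYCLES = Map.of(
            "iload_1", 1,
            "iload_2", 1,
            "iadd", 1,
            "istore_3", 1);

    // Sums the cycle counts of all bytecodes in one basic block.
    // This is only valid when instruction timings are independent of each other.
    static int basicBlockCycles(List<String> bytecodes) {
        int total = 0;
        for (String bytecode : bytecodes) {
            Integer cycles = CYCLES.get(bytecode);
            if (cycles == null) {
                throw new IllegalArgumentException("no timing for " + bytecode);
            }
            total += cycles;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> block = List.of("iload_1", "iload_2", "iadd", "istore_3");
        System.out.println("basic block cycles: " + basicBlockCycles(block));
    }
}
```

The per-block totals computed this way would serve as the basic-block execution times in the ILP formulation.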

Memory Access
  Assume SRAM
  Constant access time (r_ws and w_ws)
  Access time partially hidden:
  Method cache load:
  Partially hidden

Method Cache
  Full method loaded
  Misses only on invoke/return
  Cache contains several methods
  Simpler to analyze
  At call tree level
  Other instructions are a hit

Method Cache Analysis
  Only call tree leaves
  Return is always a hit
  Invoke in a loop: for (int i=0; i<10; ++i) { foo(); }
  One miss and 9 hits
  Miss times added to the CFG
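The loop example can be turned into a small calculation. The sketch below assumes that `foo()` is a call tree leaf that stays in the method cache together with its caller, so only the first invoke misses; the class name, method names, and cycle parameters are hypothetical and chosen only for illustration.

```java
class MethodCacheLoopSketch {

    // Number of method cache misses for invoking a cached leaf method in a loop:
    // only the first invoke misses (assumption: no eviction between iterations).
    static int misses(int iterations) {
        return iterations > 0 ? 1 : 0;
    }

    // All remaining invokes are hits.
    static int hits(int iterations) {
        return iterations - misses(iterations);
    }

    // Worst-case cycles spent on the invokes of the loop: every invoke pays the
    // hit cost, and the single miss additionally pays the cache load penalty.
    static long loopInvokeCycles(int iterations, long invokeHitCycles, long missPenaltyCycles) {
        return (long) iterations * invokeHitCycles
                + (long) misses(iterations) * missPenaltyCycles;
    }

    public static void main(String[] args) {
        int iterations = 10;            // loop bound from the slide example
        long invokeHitCycles = 5;       // hypothetical cost of an invoke that hits
        long missPenaltyCycles = 20;    // hypothetical extra cost of loading the method
        System.out.println("misses: " + misses(iterations) + ", hits: " + hits(iterations));
        System.out.println("invoke cycles: "
                + loopInvokeCycles(iterations, invokeHitCycles, missPenaltyCycles));
    }
}
```

Charging the miss penalty once, outside the loop body, corresponds to adding the miss time to the CFG rather than to every iteration.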

Miss Times in the CFG

Evaluation
  Table columns: Measured (cycles), Estimated (cycles), Pessimism (ratio)
  Table rows: Robot, Lift, Kfl, UdpIp

Conclusion
  We need static WCET analysis
  COTS processors don't work
  Design HW for WCET
  A RT computer architecture
  JOP is a first step to RT processors
  Analysis at bytecode level

Future Work
  Method cache cont.
  Detection of loop bounds
  Integration into Eclipse
  WC memory consumption

Demo Time

Thank You!
  Questions & Suggestions