Compilation Technology Oct. 16, 2006 © 2006 IBM Corporation Software Group Reducing Startup Costs of Java Applications with Shared Relocatable Code Derek.

Presentation transcript:

Compilation Technology Oct. 16, 2006 © 2006 IBM Corporation Software Group

Reducing Startup Costs of Java Applications with Shared Relocatable Code
Derek Inglis, Marius Lut, Kenneth Ma and Marius Pirvu
IBM Toronto Lab

Evolution of WAS (WebSphere Application Server) Startup Time
[Chart not captured in the transcript]

Agenda
• Background
• Shared Relocatable Code
• Heuristics
• Experiment Results
• Future Work and Conclusions

Shared Classes
• Why shared classes?
  – Memory footprint reduction
  – Startup time improvement
[Diagram labels: VM a and VM b, each with Class Loader a1 and Class Loader a2 and a RAM Class per loader; one ROM Class]

Shared Classes
• Shared Cache
  – Shared memory of fixed size
  – ROM Classes are stored inside the shared cache
• Characteristics
  – Concurrent JVM access
  – All system and application classes can be stored
  – Suitable for multi-JVM environments or when the JVM is regularly restarted

Our Solution – Shared Relocatable Code
• Performance improvement by reducing compilation time
  – Startup and response time
  – CPU utilization
• What is shared relocatable code?
  – Compiled code in “relocatable” form
  – Similar to code created by a static AOT compiler
  – Relocations performed to fix up address values
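The "relocations performed to fix up address values" step can be illustrated with a minimal sketch. It assumes a toy relocatable format, not the real J9 layout: each address in the code body is stored as a 64-bit offset, and a separate list records where those slots are. The names RelocationSketch, relocate and readSlot are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: a toy relocatable-code format, not the actual J9 layout.
final class RelocationSketch {
    /**
     * Returns a copy of the code in which every 8-byte slot listed in
     * relocationOffsets has the load-time base address added to the
     * position-independent offset stored in that slot.
     */
    static byte[] relocate(byte[] relocatableCode, int[] relocationOffsets, long loadBase) {
        byte[] patched = relocatableCode.clone(); // the cached copy stays untouched
        ByteBuffer buf = ByteBuffer.wrap(patched).order(ByteOrder.LITTLE_ENDIAN);
        for (int slot : relocationOffsets) {
            long storedOffset = buf.getLong(slot);      // value valid in any process
            buf.putLong(slot, loadBase + storedOffset); // value valid in this process
        }
        return patched;
    }

    /** Reads the 8-byte little-endian value at the given slot. */
    static long readSlot(byte[] code, int slot) {
        return ByteBuffer.wrap(code).order(ByteOrder.LITTLE_ENDIAN).getLong(slot);
    }
}
```

Patching a copy rather than the cached bytes reflects the sharing requirement: the stored form has to remain position-independent so that another JVM can relocate it against its own addresses.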

Shared Relocatable Code
• How does shared relocatable code work?
  – The compiler generates relocatable code during execution
  – The relocatable code is stored in the shared cache
  – A subsequent JVM loads the code from the shared cache and relocates it
• Shares other characteristics of shared classes
  – Concurrent JVM access
  – Methods of all system and application classes can be stored
• Reduces compilation time!

At a glance…
[Flowchart labels: Intermediate Code; Virtual Machine; Compiler; Compile request; decisions "Relocatable code exists?", "Class loading phase?" and "First compile?" (Yes/No); actions JIT Compile, Compile & Store, Store, Retrieve, Relocate; Shared Cache Repository; Shared relocatable code; Compiled code]
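One plausible reading of the flowchart is sketched below. The exact edges of the diagram did not survive in the transcript, so the ordering of the checks is an assumption, and the "First compile?" decision is left out. The class CompileFlowSketch, its Action enum, and the in-process map standing in for the shared cache are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: an assumed ordering of the decisions named in the flowchart.
final class CompileFlowSketch {
    enum Action { RETRIEVE_AND_RELOCATE, COMPILE_AND_STORE, JIT_COMPILE }

    // Stand-in for the shared cache; the real one is shared memory across JVMs.
    private final Map<String, byte[]> sharedCache = new HashMap<>();

    /** Decides how a compile request is served, updating the cache when code is stored. */
    Action handleCompileRequest(String methodSignature, boolean inClassLoadingPhase) {
        if (sharedCache.containsKey(methodSignature)) {
            return Action.RETRIEVE_AND_RELOCATE;           // relocatable code exists
        }
        if (inClassLoadingPhase) {
            sharedCache.put(methodSignature, new byte[0]); // placeholder for generated code
            return Action.COMPILE_AND_STORE;
        }
        return Action.JIT_COMPILE;                         // ordinary, non-shared compile
    }
}
```

In this reading, a method compiled outside a class loading phase is never stored, so a later request for it falls through to a normal JIT compile again.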

Performance Desiderata
• Reduce startup time
• Lower CPU utilization
• Maintain runtime performance (throughput)

Heuristics
• Decisions
  – When to generate shared relocatable code
  – When to use relocatable code from the shared cache
• Heuristic
  – Generate relocatable code during the “class loading phases”
  – Always use relocatable code if available
  – Alternative: use relocatable code, if available, during the class loading phases
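The two decisions can be written down as small predicates. The slides do not say how a "class loading phase" is detected, so the detector below (classes loaded in the last sampling interval compared against a threshold) is purely an assumption, as are the threshold value and all names in HeuristicSketch.

```java
// Illustrative only: the phase detector and its threshold are assumptions.
final class HeuristicSketch {
    enum UsePolicy { ALWAYS_IF_AVAILABLE, ONLY_DURING_CLASS_LOADING }

    // Hypothetical threshold: classes loaded per sampling interval.
    static final int CLASS_LOAD_THRESHOLD = 20;

    /** Assumed detector: many recent class loads indicate a class loading phase. */
    static boolean inClassLoadingPhase(int classesLoadedInLastInterval) {
        return classesLoadedInLastInterval >= CLASS_LOAD_THRESHOLD;
    }

    /** Generation heuristic from the slide: only during class loading phases. */
    static boolean shouldGenerate(boolean classLoadingPhase) {
        return classLoadingPhase;
    }

    /** Use heuristic: the main policy and the stated alternative. */
    static boolean shouldUse(UsePolicy policy, boolean available, boolean classLoadingPhase) {
        if (!available) {
            return false;
        }
        return policy == UsePolicy.ALWAYS_IF_AVAILABLE || classLoadingPhase;
    }
}
```

The two policies differ only once startup is over: ALWAYS_IF_AVAILABLE keeps loading cached code, while ONLY_DURING_CLASS_LOADING hands later requests back to the regular JIT.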

Heuristic Refinement
• Generate relocatable code only during the initial run
• Use relocatable code sooner – “scount” option
• Bump the priority of the relocatable code

WebSphere Performance, SMP (multiprocessor)
[Chart not captured in the transcript]

WebSphere Performance, UP (uniprocessor)
[Chart not captured in the transcript]

WebSphere – Compilation level statistics

                                          First Run   Second Run
Relocated methods (*)                          3632         3616
Level=1                                          33          543
Level=2                                        5577         5320
Level=3                                          62           56
Level=4                                          39           36
Level=5                                           2            3
Relocated methods that were recompiled          164          153

(*) First run: compiled, stored in cache, and relocated. Second run: taken from cache and relocated.
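As a quick check on how often relocated code is later replaced, the recompiled counts can be expressed as a share of the relocated methods in each run. The class and method names are illustrative; the four numbers come from the slide.

```java
// Worked arithmetic on the slide's figures; names are illustrative.
final class RecompileRateSketch {
    /** Returns part as a percentage of whole. */
    static double percent(int part, int whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        // First run: 164 of 3632 relocated methods were recompiled.
        System.out.printf("First run:  %.1f%% recompiled%n", percent(164, 3632));
        // Second run: 153 of 3616 relocated methods were recompiled.
        System.out.printf("Second run: %.1f%% recompiled%n", percent(153, 3616));
    }
}
```

The shares come to roughly 4.5% in the first run and 4.2% in the second, so the large majority of relocated methods are never recompiled in either run.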

WebSphere – Shared Cache Utilization
• After WebSphere startup
  – ROM classes: 61.7 MB
  – Relocatable code: 3.7 MB
• After Trade run
  – ROM classes: 65.4 MB
  – Relocatable code: 4.3 MB

Eclipse, Tomcat Startup Performance
[Chart not captured in the transcript]

Future Work
• Improve the quality of the relocatable code
• Tune the heuristics
• Develop heuristics for setting the “scount” value
• Recompile the relocatable code more aggressively to improve runtime performance

Conclusions
• Shared relocatable code
  – Builds on top of the shared classes framework
  – Generates code in relocatable form during the first run, places it in the shared cache, and reuses it during subsequent runs
  – Reduces compilation overhead, which in turn reduces startup time and CPU utilization
• Good performance improvements
  – Startup time reduced by 12% to 50%
  – CPU utilization reduced by more than 50%
  – Runtime performance barely affected