David F. Bacon Perry Cheng V.T. Rajan IBM T.J. Watson Research Center ControllingFragmentation and Space Consumption in the Metronome.

Slides:



Advertisements
Similar presentations
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science 1 MC 2 –Copying GC for Memory Constrained Environments Narendran Sachindran J. Eliot.
Advertisements

Steve Blackburn Department of Computer Science Australian National University Perry Cheng TJ Watson Research Center IBM Research Kathryn McKinley Department.
1 Write Barrier Elision for Concurrent Garbage Collectors Martin T. Vechev Cambridge University David F. Bacon IBM T.J.Watson Research Center.
Automatic Memory Management Noam Rinetzky Schreiber 123A /seminar/seminar1415a.html.
Reducing Pause Time of Conservative Collectors Toshio Endo (National Institute of Informatics) Kenjiro Taura (Univ. of Tokyo)
KERNEL MEMORY ALLOCATION Unix Internals, Uresh Vahalia Sowmya Ponugoti CMSC 691X.
Real-time Garbage Collection Presented by Boris Dogadov 20/01/15 1.
MC 2 : High Performance GC for Memory-Constrained Environments - Narendran Sachindran, J. Eliot B. Moss, Emery D. Berger Sowmiya Chocka Narayanan.
5. Memory Management From: Chapter 5, Modern Compiler Design, by Dick Grunt et al.
By Jacob SeligmannSteffen Grarup Presented By Leon Gendler Incremental Mature Garbage Collection Using the Train Algorithm.
MC 2 : High Performance GC for Memory-Constrained Environments N. Sachindran, E. Moss, E. Berger Ivan JibajaCS 395T *Some of the graphs are from presentation.
Increasing Memory Usage in Real-Time GC Tobias Ritzau and Peter Fritzson Department of Computer and Information Science Linköpings universitet
1 The Compressor: Concurrent, Incremental and Parallel Compaction. Haim Kermany and Erez Petrank Technion – Israel Institute of Technology.
On the limits of partial compaction Anna Bendersky & Erez Petrank Technion.
MOSTLY PARALLEL GARBAGE COLLECTION Authors : Hans J. Boehm Alan J. Demers Scott Shenker XEROX PARC Presented by:REVITAL SHABTAI.
0 Parallel and Concurrent Real-time Garbage Collection Part I: Overview and Memory Allocation Subsystem David F. Bacon T.J. Watson Research Center.
Chapter 9 – Real Memory Organization and Management
21 September 2005Rotor Capstone Workshop Parallel, Real-Time Garbage Collection Daniel Spoonhower Guy Blelloch, Robert Harper, David Swasey Carnegie Mellon.
Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms Martin T. Vechev Eran Yahav David F. Bacon University of Cambridge IBM T.J.
1 An Efficient On-the-Fly Cycle Collection Harel Paz, Erez Petrank - Technion, Israel David F. Bacon, V. T. Rajan - IBM T.J. Watson Research Center Elliot.
Jangwoo Shin Garbage Collection for Real-Time Java.
Uniprocessor Garbage Collection Techniques Paul R. Wilson.
 2004 Deitel & Associates, Inc. All rights reserved. Chapter 9 – Real Memory Organization and Management Outline 9.1 Introduction 9.2Memory Organization.
Memory Management n 1. Single contiguous allocation n 2. Partitioned organization: –Static, Dynamic n 3. (Pure) Paging.
UniProcessor Garbage Collection Techniques Paul R. Wilson University of Texas Presented By Naomi Sapir Tel-Aviv University.
Garbage Collection Memory Management Garbage Collection –Language requirement –VM service –Performance issue in time and space.
A Parallel, Real-Time Garbage Collector Author: Perry Cheng, Guy E. Blelloch Presenter: Jun Tao.
1 Overview Assignment 6: hints  Living with a garbage collector Assignment 5: solution  Garbage collection.
David F. Bacon Perry Cheng V.T. Rajan IBM T.J. Watson Research Center The Metronome: A Hard Real-time Garbage Collector.
Taking Off The Gloves With Reference Counting Immix
ISMM 2004 Mostly Concurrent Compaction for Mark-Sweep GC Yoav Ossia, Ori Ben-Yitzhak, Marc Segal IBM Haifa Research Lab. Israel.
380C Lecture 17 Where are we & where we are going –Managed languages Dynamic compilation Inlining Garbage collection –Why you need to care about workloads.
Ulterior Reference Counting: Fast Garbage Collection without a Long Wait Author: Stephen M Blackburn Kathryn S McKinley Presenter: Jun Tao.
A Mostly Non-Copying Real-Time Collector with Low Overhead and Consistent Utilization David Bacon Perry Cheng (presenting) V.T. Rajan IBM T.J. Watson Research.
Copyright (c) 2004 Borys Bradel Myths and Realities: The Performance Impact of Garbage Collection Paper: Stephen M. Blackburn, Perry Cheng, and Kathryn.
Message Analysis-Guided Allocation and Low-Pause Incremental Garbage Collection in a Concurrent Language Konstantinos Sagonas Jesper Wilhelmsson Uppsala.
1 Real-Time Replication Garbage Collection Scott Nettles and James O’Toole PLDI 93 Presented by: Roi Amir.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science 1 Automatic Heap Sizing: Taking Real Memory into Account Ting Yang, Emery Berger,
Computer Science Department Daniel Frampton, David F. Bacon, Perry Cheng, and David Grove Australian National University Canberra ACT, Australia
1 Advanced Memory Management Techniques  static vs. dynamic kernel memory allocation  resource map allocation  power-of-two free list allocation  buddy.
September 11, 2003 Beltway: Getting Around GC Gridlock Steve Blackburn, Kathryn McKinley Richard Jones, Eliot Moss Modified by: Weiming Zhao Oct
Fast Garbage Collection without a Long Wait Steve Blackburn – Kathryn McKinley Presented by: Na Meng Ulterior Reference Counting:
UniProcessor Garbage Collection Techniques Paul R. Wilson University of Texas Presented By Naomi Sapir Tel-Aviv University.
A REAL-TIME GARBAGE COLLECTOR WITH LOW OVERHEAD AND CONSISTENT UTILIZATION David F. Bacon, Perry Cheng, and V.T. Rajan IBM T.J. Watson Research Center.
Memory Management -Memory allocation -Garbage collection.
Consider Starting with 160 k of memory do: Starting with 160 k of memory do: Allocate p1 (50 k) Allocate p1 (50 k) Allocate p2 (30 k) Allocate p2 (30 k)
2/4/20161 GC16/3011 Functional Programming Lecture 20 Garbage Collection Techniques.
Real-time collection for multithreaded Java Microcontroller Garbage Collection. Garbage Collection. Application of Java in embedded real-time systems.
On the limits of partial compaction Nachshon Cohen and Erez Petrank Technion.
1 GC Advantage: Improving Program Locality Xianglong Huang, Zhenlin Wang, Stephen M Blackburn, Kathryn S McKinley, J Eliot B Moss, Perry Cheng.
CS 241 Discussion Section (12/1/2011). Tradeoffs When do you: – Expand Increase total memory usage – Split Make smaller chunks (avoid internal fragmentation)
The Metronome Washington University in St. Louis Tobias Mann October 2003.
Real-time Garbage Collection By Tim St. John Low Overhead and Consistent Utilization. Low Overhead and Consistent Utilization. Multithreaded Java Microcontroller.
Eliminating External Fragmentation in a Non-Moving Garbage Collector for Java Author: Fridtjof Siebert, CASES 2000 Michael Sallas Object-Oriented Languages.
Memory Management What if pgm mem > main mem ?. Memory Management What if pgm mem > main mem ? Overlays – program controlled.
Immix: A Mark-Region Garbage Collector Jennifer Sartor CS395T Presentation Mar 2, 2009 Thanks to Steve for his Immix presentation from
File System Implementation
Memory Management.
Dynamic Compilation Vijay Janapa Reddi
Ulterior Reference Counting Fast GC Without The Wait
David F. Bacon, Perry Cheng, and V.T. Rajan
Memory Management and Garbage Collection Hal Perkins Autumn 2011
Main Memory Background Swapping Contiguous Allocation Paging
Strategies for automatic memory management
Memory Management Kathryn McKinley.
Beltway: Getting Around Garbage Collection Gridlock
Chapter 12 Memory Management
Page Main Memory.
Presentation transcript:

David F. Bacon Perry Cheng V.T. Rajan IBM T.J. Watson Research Center ControllingFragmentation and Space Consumption in the Metronome

Problem Domain Hard real-time garbage collection – Implemented for Java Uniprocessor – Multiprocessors rare in real-time systems – Complication: collector must be finely interleaved – Simplification: memory model easier to program No truly concurrent operations Sequentially consistent

Metronome Project Goals Make GC feasible for hard real-time systems Provide simple application interface Develop technology that is efficient: – Throughput, Space comparable to stop-the- world BREAK THE MILLISECOND BARRIER – While providing even CPU utilization

Outline Overview of the Metronome Empirical Results What is Fragmentation? – Static and dynamic measures How do we control space consumption? Conclusions

The Metronome Collector

Real-time GC Approaches Mark-sweep (non-copying) – Fragmentation avoidance, coalescing – Subject to space explosion Semi-space Copying – Concurrently copies between spaces – Cost of copying, consistency, 2x space The Metronome – Mark-sweep, selective defragmentation – Best of both, adaptively adjusts

Components of the Metronome Incremental mark-sweep collector – Mark phase fixes stale pointers Selective incremental defragmentation – Moves < 2% of traced objects Time-based scheduling Segregated free list allocator – Geometric size progression limits internal fragmentation OldNew

Support Technologies Read barrier: to-space invariant [Brooks] – New techniques with only 4% overhead Write barrier: snapshot-at-the-beginning [Yuasa] Arraylets: bound fragmentation, large object ops OldNew

Empirical Results

Pause time distribution: javac 12 ms Time-based SchedulingWork-based Scheduling

Utilization vs. Time: javac Time (s) Utilization (%) Time-based SchedulingWork-based Scheduling 0.45

Space Usage: javac

Parameterization Mutator a * (ΔGC) m Collector R Tuner Δt s u Allocation Rate Maximum Live Memory Collection Rate Real Time Interval Maximum Used Memory CPU Utilization

Is it real-time? Yes Maximum pause time < 4 ms [currently] MMU > 50% ± 2% Memory requirement < 2 X max live

Static Fragmentation

Segregated Free List Allocator Heap divided into fixed-size pages Each page divided into fixed-size blocks Objects allocated in smallest block that fits

Fragmentation on a Page Internal: wasted space at end of object Page-internal: wasted space at end of page External: blocks needed for other size external internal page-internal

Fragmentation: ρ=1/8 vs. ρ=1/2

Dynamic Fragmentation

Locality of Size: λ Measures reuse Normalized: 0≤λ≤1 Segregated by size Bytes AllocatedFreedAllocatedFreedAllocatedFreed Perfect Reuse λ = Σ i min ( f i / f, a i / a) Freed Reused Allocated Reused

λ in Real-time GC Context Mark-sweep (non-copying) Semi-space Copying The Metronome Assumes λ=1 Assumes λ=0 Adapts as λ varies

λ in Practice

Controlling Space Consumption

Triggering Collection MS 1MS 2MS 3DF 1DF 2 TIME PAGES Free Defrag

Factors in Space Consumption MS 1MS 2MS 3DF 1DF 2 R λ

Reducing Space Consumption Collection Rate R – Higher rate: less memory reserve needed – Speed up collector – Specify rate more precisely (bound worst- case) Locality of Size λ – Higher locality: less free page reserve needed – Specify page requirement more precisely

Conclusions

Contributions – Time-based scheduling (Tunable) – Mostly non-copying collection (λ varies) – Efficient software read barrier – Precise definition of fragmentation – Precise specification of collection triggers The Metronome provides true real-time GC – First collector to do so without major sacrifice Short pauses (4 ms) High MMU during collection (50%) Low memory consumption (2x max live)