Chromed Metal
Safe and Fast C++
Andrei Alexandrescu
Copyright © 2005 Andrei Alexandrescu

Agenda
• Modularity and speed: a fundamental tension
• Example: memory allocation
    - Policies
    - Eager computation
    - Segregate functionality
    - Costless refinements
• Based on “Composing High-Performance Memory Allocators” by Berger et al.

Modularity: good
• Developing systems from small parts is good
    - Best known way to manage complexity
• Abstraction is good
    - Modularity and abstraction go hand in hand
• Separate development is good
• Separate testing is good
    - Confinement of bugs is good

Speed: good
• Getting work done is good (?)
• Libraries that don’t exact penalties are good
• Lossless growth is good
    - Compounded inefficiency: abstraction’s worst enemy

Modularity and Speed
• Fundamental tension:
    - Modularity asks for separation, hiding, abstraction, and uniform interfaces
    - Speed asks for coalescing, transparency, specialization, and non-uniformity
• How to resolve the tension?

Two Approaches
• Defer compilation/optimization
    - Develop subsystems separately, have the runtime optimize when it sees them all
    - Various JIT approaches
• Expedite computation/exposure
    - Develop subsystems separately, have the compiler see them all early
    - Various macro and compilation systems

Example: Memory Allocation
• Memory allocation:
    - Very hard to modularize/componentize
    - Highly competitive:
        - General-purpose allocators: 100 cycles/alloc
        - Specialized allocators: < 12 cycles/alloc
• Templates:
    - Compute things early
    - Expose modular code early

Idea #1: mixins/policies
• Create uncommitted, “for adoption” derived classes

    template <class Base>
    struct Heap : public Base {
        void* Alloc(size_t);
        void Dealloc(void*);
    };

• Exposes modular code early

Top Class
• Can’t defer forever, so without further ado…

    struct MallocHeap {
        void* Alloc(size_t s) { return malloc(s); }
        void Dealloc(void* p) { return free(p); }
    };
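To make the layering concrete, here is a minimal sketch of one mixin layer adopted by the top class. `MallocHeap` follows the slide; `CountingHeap`, `MyCountingHeap`, and `DemoCounting` are hypothetical names introduced only for illustration and do not come from the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Top class as on the slide: the layer that finally talks to the C runtime.
struct MallocHeap {
    void* Alloc(std::size_t s) { return std::malloc(s); }
    void Dealloc(void* p) { std::free(p); }
};

// Hypothetical mixin layer (not from the talk): counts live blocks and
// forwards the real work to whatever Base it is instantiated with.
template <class Base>
struct CountingHeap : public Base {
    void* Alloc(std::size_t s) {
        void* p = Base::Alloc(s);
        if (p) ++live_;
        return p;
    }
    void Dealloc(void* p) {
        if (p) --live_;
        Base::Dealloc(p);
    }
    std::size_t Live() const { return live_; }
private:
    std::size_t live_ = 0;
};

// The layer is "adopted" only here, when a concrete base is plugged in.
typedef CountingHeap<MallocHeap> MyCountingHeap;

// Returns the live-block count observed right after one allocation.
inline std::size_t DemoCounting() {
    MyCountingHeap h;
    void* p = h.Alloc(16);
    std::size_t live = h.Live();
    h.Dealloc(p);
    assert(h.Live() == 0);
    return live;
}
```

Because `Base::Alloc` is a direct, non-virtual call into a type the compiler can see, the forwarding can typically be inlined away.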

Idea #2: Eager Computation
• Avoid redundant and runtime computation safely!

    class TopHeap {
        void* Alloc(size_t) {... }
        void Dealloc(void*) {... }
        friend void* Alloc(Heap & h, size_t s) {
            return h.AllocImpl(
                (s + AlignBytes - 1) & ~(AlignBytes - 1));
        }
        friend void Dealloc(Heap & h, void* p) {
            return h.Dealloc(p);
        }
    };
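The rounding expression on this slide can be checked in isolation. The sketch below assumes `AlignBytes` is 8 (the slide does not give a value) and relies on it being a power of two; `AlignUp` and `DemoAlign` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed alignment for this sketch; the slide leaves AlignBytes unspecified.
constexpr std::size_t AlignBytes = 8;
static_assert((AlignBytes & (AlignBytes - 1)) == 0,
              "the mask trick only works for powers of two");

// Rounds s up to the next multiple of AlignBytes.
constexpr std::size_t AlignUp(std::size_t s) {
    return (s + AlignBytes - 1) & ~(AlignBytes - 1);
}

// With constant arguments the whole computation happens at compile time.
static_assert(AlignUp(1) == 8, "1 rounds up to 8");
static_assert(AlignUp(8) == 8, "multiples are unchanged");
static_assert(AlignUp(13) == 16, "13 rounds up to 16");

// The same function also works on sizes known only at run time.
inline std::size_t DemoAlign(std::size_t s) {
    std::size_t r = AlignUp(s);
    assert(r >= s && r % AlignBytes == 0);
    return r;
}
```

Doing the rounding once, in the outermost entry point, means the inner layers can assume aligned sizes and need not repeat the work.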

Idea #3: Segregate Representation

    template <class Base>
    class SzHeap : public Base {
        void* Alloc(size_t s) {
            size_t * pS = static_cast<size_t*>(
                Base::AllocImpl(s + sizeof(size_t)));
            return *pS = s, pS + 1;
        }
        void Dealloc(void* p) {
            Base::Dealloc(static_cast<size_t*>(p) - 1);
        }
        size_t SizeOf(void* p) {
            return (static_cast<size_t*>(p))[-1];
        }
    };
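A self-contained sketch of the size-header layer follows. It calls `Base::Alloc` where the slide writes `Base::AllocImpl`, so that it can sit directly on `MallocHeap`; the null checks and the `DemoSize` helper are additions for illustration, not part of the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct MallocHeap {
    void* Alloc(std::size_t s) { return std::malloc(s); }
    void Dealloc(void* p) { std::free(p); }
};

// Size-tracking layer after the slide: stores the requested size in a
// size_t header placed immediately before the block handed to the caller.
template <class Base>
struct SzHeap : public Base {
    void* Alloc(std::size_t s) {
        std::size_t* pS =
            static_cast<std::size_t*>(Base::Alloc(s + sizeof(std::size_t)));
        if (!pS) return nullptr;   // added check; the slide omits it
        *pS = s;
        return pS + 1;             // caller sees the memory after the header
    }
    void Dealloc(void* p) {
        if (!p) return;
        Base::Dealloc(static_cast<std::size_t*>(p) - 1);
    }
    std::size_t SizeOf(void* p) {
        return static_cast<std::size_t*>(p)[-1];
    }
};

// Allocates one block and reports the size recorded in its header.
inline std::size_t DemoSize(std::size_t request) {
    SzHeap<MallocHeap> h;
    void* p = h.Alloc(request);
    assert(p != nullptr);
    std::size_t recorded = h.SizeOf(p);
    h.Dealloc(p);
    return recorded;
}
```

One caveat of this simple layout: the returned pointer is offset by `sizeof(size_t)` from what `malloc` returned, so it may be less strictly aligned than the underlying block.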

Free Lists
• Unbeatable specialized allocation method
• Put deallocated blocks in a freelist
• Consult the freelist when allocating
• Disadvantages: fixed size, no coalescing, no reallocation

Free Lists Layer

    template <size_t S, class Base>
    class FLHeap : public Base {
        void* Alloc(size_t s) {
            if (s != S || !list_) {
                return Base::AllocImpl(s);
            }
            void * p = list_;
            list_ = list_->next_;
            return p;
        }
        ...

(continued)

        ...
        void Dealloc(void * p) {
            if (Base::SizeOf(p) != S) return Base::Dealloc(p);
            List * pL = static_cast<List*>(p);
            pL->next_ = list_;
            list_ = pL;
        }
        ~FLHeap() {... }
    private:
        struct List { List * next_; };
        List * list_;
    };
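The two free-list slides can be assembled into one compilable sketch. The destructor body, the deleted copy operations, the `static_assert` on `S`, and the `DemoReuse` helper are assumptions filled in for illustration; the slide leaves the destructor elided. `Base::Alloc` is used in place of the slide's `Base::AllocImpl`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct MallocHeap {
    void* Alloc(std::size_t s) { return std::malloc(s); }
    void Dealloc(void* p) { std::free(p); }
};

// Size-header layer: lets upper layers ask how big a block is.
template <class Base>
struct SzHeap : public Base {
    void* Alloc(std::size_t s) {
        std::size_t* pS =
            static_cast<std::size_t*>(Base::Alloc(s + sizeof(std::size_t)));
        if (!pS) return nullptr;
        *pS = s;
        return pS + 1;
    }
    void Dealloc(void* p) {
        if (p) Base::Dealloc(static_cast<std::size_t*>(p) - 1);
    }
    std::size_t SizeOf(void* p) { return static_cast<std::size_t*>(p)[-1]; }
};

// Free-list layer: caches freed blocks of exactly S bytes for reuse.
template <std::size_t S, class Base>
class FLHeap : public Base {
    struct List { List* next_; };
    List* list_ = nullptr;
    static_assert(S >= sizeof(List), "a free block must be able to hold a link");
public:
    FLHeap() = default;
    FLHeap(const FLHeap&) = delete;             // copying would double-free
    FLHeap& operator=(const FLHeap&) = delete;

    void* Alloc(std::size_t s) {
        if (s != S || !list_) return Base::Alloc(s);
        void* p = list_;                         // pop the head of the list
        list_ = list_->next_;
        return p;
    }
    void Dealloc(void* p) {
        if (!p) return;
        if (Base::SizeOf(p) != S) { Base::Dealloc(p); return; }
        List* pL = static_cast<List*>(p);        // push onto the list
        pL->next_ = list_;
        list_ = pL;
    }
    ~FLHeap() {                                  // assumed cleanup policy
        while (list_) {
            List* next = list_->next_;
            Base::Dealloc(list_);
            list_ = next;
        }
    }
};

// Frees a 32-byte block and checks that the next request reuses it.
inline bool DemoReuse() {
    FLHeap<32, SzHeap<MallocHeap> > h;
    void* a = h.Alloc(32);
    assert(a != nullptr);
    h.Dealloc(a);            // parked on the free list, not returned to malloc
    void* b = h.Alloc(32);   // served from the free list
    bool reused = (a == b);
    h.Dealloc(b);
    return reused;
}
```

Note that `FLHeap` never looks at how the size is stored; it only calls `SizeOf`, which is supplied by whichever layer sits below it.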

Remarks
• There is no source-level coupling between FLHeap and the way the size is maintained and computed
    - Combinatorial advantage
• There is coupling at the object-code level
    - Pro: optimization
    - Con: separate linking, dynamic loading…

Building a Layered Allocator

    typedef FLHeap<64, FLHeap<32, SzHeap > > MyHeap;

• Modular
• Easy to understand
• Easy to change
• Efficient

Idea #4: Costless Refinements

    template <class Heap>
    struct CanResize { enum { value = 0 }; };

    template <class Heap>
    bool Resize(Heap &, void*, size_t &) { return 0; }

• Refined implementations will “hide” the default and specialize CanResize
• Can test for resizing capability at compile time or runtime
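A small sketch of how a refinement might hide the default. `CanResize` and the generic `Resize` follow the slide; `SlackHeap`, `PlainHeap`, the in-place resizing rule, and `DemoResize` are hypothetical and exist only to exercise the mechanism.

```cpp
#include <cassert>
#include <cstddef>

// Defaults as on the slide: by default a heap cannot resize in place.
template <class Heap>
struct CanResize { enum { value = 0 }; };

template <class Heap>
bool Resize(Heap&, void*, std::size_t&) { return false; }

// Hypothetical heap whose blocks always have kBlock bytes available,
// so growth requests up to kBlock succeed in place.
struct SlackHeap {
    static constexpr std::size_t kBlock = 64;
};

// The refinement specializes the trait...
template <>
struct CanResize<SlackHeap> { enum { value = 1 }; };

// ...and provides a non-template overload, which overload resolution
// prefers over the generic template.
inline bool Resize(SlackHeap&, void*, std::size_t& s) {
    if (s > SlackHeap::kBlock) return false;
    s = SlackHeap::kBlock;   // report the size actually available
    return true;
}

// A heap with no refinement falls back to the defaults.
struct PlainHeap {};

// Compile-time test of the capability.
static_assert(CanResize<PlainHeap>::value == 0, "default: no resizing");
static_assert(CanResize<SlackHeap>::value == 1, "refined: resizing available");

// Run-time test of the capability: returns the size reported after Resize.
inline std::size_t DemoResize(std::size_t want) {
    SlackHeap h;
    std::size_t s = want;
    bool ok = Resize(h, nullptr, s);
    assert(ok == (want <= SlackHeap::kBlock));
    return s;
}
```

Heaps that never mention resizing pay nothing: the default `Resize` is an empty function that an optimizer can remove.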

Range Allocators

    template <size_t S1, size_t S2, class Base>
    class RHeap : public Base {
        void* Alloc(size_t s) {
            static_assert(S1 < S2);
            if (s >= S1 && s < S2) s = S2;
            return Base::AllocImpl(s);
        }
        ...
    };

• Improved speed at the cost of slack memory
• User-controlled tradeoff
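The rounding rule can be observed with a stand-in base that only records the size it is asked for. `RecordingHeap` and `DemoRange` are hypothetical; `RHeap` follows the slide, except that it calls `Base::Alloc` rather than `Base::AllocImpl`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the lower layers: it allocates nothing and
// just remembers the size of the last request.
struct RecordingHeap {
    std::size_t lastRequest = 0;
    void* Alloc(std::size_t s) { lastRequest = s; return nullptr; }
};

// Range layer after the slide: every size in [S1, S2) is rounded up to S2,
// so a fixed-size layer below (such as a free list for S2) can serve it.
template <std::size_t S1, std::size_t S2, class Base>
struct RHeap : public Base {
    static_assert(S1 < S2, "the range [S1, S2) must not be empty");
    void* Alloc(std::size_t s) {
        if (s >= S1 && s < S2) s = S2;
        return Base::Alloc(s);
    }
};

// Returns the size that actually reaches the base for a request of s bytes.
inline std::size_t DemoRange(std::size_t s) {
    RHeap<17, 32, RecordingHeap> h;
    h.Alloc(s);
    assert(h.lastRequest >= s);   // rounding never shrinks a request
    return h.lastRequest;
}
```

The slack is the difference between the rounded and the requested size, at most `S2 - S1` bytes per block, which is the tradeoff the user controls by choosing the bounds.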

Idea #2 again: Eager computation

    template <size_t S1, size_t S2, size_t S3, class B>
    void* RHeap >:: Alloc(size_t s) {
        static_assert(S1 < S2 && S2 < S3);
        if (s >= S1 && s < S3) {
            s = s < S2 ? S2 : S3;
        }
        return Base::AllocImpl(s);
    }
    ...
    };

Further Building Blocks
• Profiling and debug heaps
• MT heaps
    - Locked
    - Lock-free
• Region-based
    - Alloc bumps a pointer
    - Dealloc doesn’t do a thing
    - Destructor deallocates everything
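The region-based block is simple enough to sketch in full. The version below is a minimal illustration under stated assumptions: a single fixed-capacity chunk obtained from `malloc`, no growth, and sizes rounded to `alignof(std::max_align_t)`. `RegionHeap` and `DemoRegion` are hypothetical names, not taken from the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minimal region heap: one fixed chunk, no growth (an assumption of
// this sketch; a real region heap would typically chain chunks).
class RegionHeap {
    char* begin_;
    char* cur_;
    char* end_;
public:
    explicit RegionHeap(std::size_t capacity)
        : begin_(static_cast<char*>(std::malloc(capacity))),
          cur_(begin_),
          end_(begin_ ? begin_ + capacity : nullptr) {}
    RegionHeap(const RegionHeap&) = delete;
    RegionHeap& operator=(const RegionHeap&) = delete;

    // Alloc bumps a pointer.
    void* Alloc(std::size_t s) {
        const std::size_t a = alignof(std::max_align_t);
        s = (s + a - 1) & ~(a - 1);   // keep every block maximally aligned
        if (!begin_ || s > static_cast<std::size_t>(end_ - cur_)) return nullptr;
        void* p = cur_;
        cur_ += s;
        return p;
    }

    // Dealloc doesn't do a thing.
    void Dealloc(void*) {}

    // Destructor deallocates everything at once.
    ~RegionHeap() { std::free(begin_); }
};

// Checks that consecutive blocks are adjacent (after rounding) and that
// an oversized request is refused.
inline bool DemoRegion() {
    RegionHeap r(256);
    char* a = static_cast<char*>(r.Alloc(10));
    char* b = static_cast<char*>(r.Alloc(10));
    assert(a && b);
    r.Dealloc(a);   // no effect
    const std::size_t step = alignof(std::max_align_t);
    return b == a + ((10 + step - 1) & ~(step - 1))
        && r.Alloc(1024) == nullptr;
}
```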

Performance
• 1%-8% speed improvement over gcc’s ObStack
• 2%-3% speed loss relative to the Kingsley allocator
• From 2% faster to 20% slower than Lea’s allocator
    - Lea: monolithic general-purpose allocator
    - Optimized for 7 years
• Memory consumption similar, within 5%

Conclusions
• Modularity and efficiency are at odds
• Templates offer black-box source, white-box compilation
• A few idioms for efficient, safe code:
    - Policies
    - Eager computation
    - Segregate functionality
    - Costless refinements

Bibliography
• Emery Berger et al., “Composing High-Performance Memory Allocators”, PLDI 2001
• Yours Truly and Emery Berger, “Policy-Based Memory Allocation”, CUJ Dec 2005