Don’t Try This at Work: Low-Level Threading with C++11


Don’t Try This at Work: Low-Level Threading with C++11
Tony Van Eerd, BlackBerry, May 13, 2013
Copyright Tony Van Eerd (2009-2013) and BlackBerry, Inc. 2012-2013

Things to discuss:
The C++11 Memory Model
C++11 Atomics
What you can, but shouldn’t, do with them

The Memory Model?

? How I typically explain computers to non-computer people


← Cache Coherency
As of 2009 there were only 2 concurrent Spocks, but according to Moore’s Law there should be at least 16 in the next movie… (Prior to 2009 there was only a single Spock, save for a brief glimpse of a second (evil) Spock in 1968, and approximately 1½ Spocks in 1984.)
(chart: Predicted # of Spocks in Star Trek 2013, According to Moore’s Law; 2009: 2, 2013: 16; 1968: 1 good + 1 evil; 1984: ~1½ Spocks)


David Hilley - http://www.thegibson.org/blog/
The internet is amazing: search for “deadlock left turn intersection diagram” and not only find the right diagram, but a concurrency programming blog!


Want

Got

Want
Got

Want
Also Want

Sequential Consistency. Speed.
Who doesn’t want 16 Spocks? (besides McCoy, maybe?)
Have your cake... and eat it too.

A Bit of Precision…
Sequential Consistency: All memory reads and writes of all threads are executed as if interleaved in some global order, consistent with the program order of each thread.
(diagram: operations A B C, a b c, and α β γ of Threads A, a, and α interleaved into one global order)

A Bit of Precision… “Relax”

Sequential Consistency: All memory reads and writes of all threads are executed as if interleaved in some global order, consistent with the program order of each thread.

P1: W(x)1 . P2: R(x)? R(x)? . P3: R(x)? R(x)? . P4: W(x)2 .

Possible outcomes:
P2: R(x)1 R(x)2 . P3: R(x)2 R(x)2 .
P2: R(x)1 R(x)2 . P3: R(x)1 R(x)2 .
P2: R(x)1 R(x)2 . P3: R(x)2 R(x)1 .
P2: R(x)2 R(x)1 . P3: R(x)2 R(x)1 .
(The third outcome, where P2 and P3 disagree on the order of the two writes, is not sequentially consistent; it is allowed only under a relaxed model.)

Joe Pfeiffer, New Mexico State University - http://www.cs.nmsu.edu/~pfeiffer

Yeah, well, you know, that's just, like, your opinion, man.

A Bit of Precision…

Sequential Consistency: All memory reads and writes of all threads are executed as if interleaved in some global order, consistent with the program order of each thread.

“Relaxed” Memory Model: 16 egotistical and/or laid-back Spocks that each don’t care what the others think. (But are each individually, internally, consistent.)

(relaxing what we mean by “precision” here)

C++ Atomics: Have your cake and eat it too. But be careful, you baked it!

#include <atomic>

namespace std {
// 29.3, order and consistency
enum memory_order;
template <class T> T kill_dependency(T y) noexcept;

// 29.4, lock-free property
#define ATOMIC_BOOL_LOCK_FREE unspecified
#define ATOMIC_CHAR_LOCK_FREE unspecified
#define ATOMIC_CHAR16_T_LOCK_FREE unspecified
#define ATOMIC_CHAR32_T_LOCK_FREE unspecified
#define ATOMIC_WCHAR_T_LOCK_FREE unspecified
#define ATOMIC_SHORT_LOCK_FREE unspecified
#define ATOMIC_INT_LOCK_FREE unspecified
#define ATOMIC_LONG_LOCK_FREE unspecified
#define ATOMIC_LLONG_LOCK_FREE unspecified
#define ATOMIC_POINTER_LOCK_FREE unspecified

// 29.5, generic types
template<class T> struct atomic;
template<> struct atomic<integral>;
template<class T> struct atomic<T*>;

// 29.6.1, general operations on atomic types
// In the following declarations, atomic-type is either
// atomic<T> or a named base class for T from
// Table 145 or inferred from Table 146 or from bool.
// If it is atomic<T>, then the declaration is a template
// declaration prefixed with template <class T>.
bool atomic_is_lock_free(const volatile atomic-type *) noexcept;
bool atomic_is_lock_free(const atomic-type *) noexcept;
void atomic_init(volatile atomic-type *, T) noexcept;
void atomic_init(atomic-type *, T) noexcept;
void atomic_store(volatile atomic-type *, T) noexcept;
void atomic_store(atomic-type *, T) noexcept;
void atomic_store_explicit(volatile atomic-type *, T, memory_order) noexcept;
void atomic_store_explicit(atomic-type *, T, memory_order) noexcept;
T atomic_load(const volatile atomic-type *) noexcept;
T atomic_load(const atomic-type *) noexcept;
T atomic_load_explicit(const volatile atomic-type *, memory_order) noexcept;
T atomic_load_explicit(const atomic-type *, memory_order) noexcept;
T atomic_exchange(volatile atomic-type *, T) noexcept;
T atomic_exchange(atomic-type *, T) noexcept;
T atomic_exchange_explicit(volatile atomic-type *, T, memory_order) noexcept;
T atomic_exchange_explicit(atomic-type *, T, memory_order) noexcept;
bool atomic_compare_exchange_weak(volatile atomic-type *, T*, T) noexcept;
bool atomic_compare_exchange_weak(atomic-type *, T*, T) noexcept;
bool atomic_compare_exchange_strong(volatile atomic-type *, T*, T) noexcept;
bool atomic_compare_exchange_strong(atomic-type *, T*, T) noexcept;
bool atomic_compare_exchange_weak_explicit(volatile atomic-type *, T*, T, memory_order, memory_order) noexcept;
bool atomic_compare_exchange_weak_explicit(atomic-type *, T*, T, memory_order, memory_order) noexcept;
bool atomic_compare_exchange_strong_explicit(volatile atomic-type *, T*, T, memory_order, memory_order) noexcept;
bool atomic_compare_exchange_strong_explicit(atomic-type *, T*, T, memory_order, memory_order) noexcept;

// 29.6.2, templated operations on atomic types
template <class T> T atomic_fetch_add(volatile atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_add(atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_add_explicit(volatile atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_add_explicit(atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_sub(volatile atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_sub(atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_sub_explicit(volatile atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_sub_explicit(atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_and(volatile atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_and(atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_and_explicit(volatile atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_and_explicit(atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_or(volatile atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_or(atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_or_explicit(volatile atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_or_explicit(atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_xor(volatile atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_xor(atomic<T>*, T) noexcept;
template <class T> T atomic_fetch_xor_explicit(volatile atomic<T>*, T, memory_order) noexcept;
template <class T> T atomic_fetch_xor_explicit(atomic<T>*, T, memory_order) noexcept;

// 29.6.3, arithmetic operations on atomic types
// In the following declarations, atomic-integral is either
// atomic<T> or a named base class for T from
// Table 145 or inferred from Table 146.
// If it is atomic<T>, then the declaration is a template
// specialization declaration prefixed with template <>.
integral atomic_fetch_add(volatile atomic-integral *, integral) noexcept;
integral atomic_fetch_add(atomic-integral *, integral) noexcept;
integral atomic_fetch_add_explicit(volatile atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_add_explicit(atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_sub(volatile atomic-integral *, integral) noexcept;
integral atomic_fetch_sub(atomic-integral *, integral) noexcept;
integral atomic_fetch_sub_explicit(volatile atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_sub_explicit(atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_and(volatile atomic-integral *, integral) noexcept;
integral atomic_fetch_and(atomic-integral *, integral) noexcept;
integral atomic_fetch_and_explicit(volatile atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_and_explicit(atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_or(volatile atomic-integral *, integral) noexcept;
integral atomic_fetch_or(atomic-integral *, integral) noexcept;
integral atomic_fetch_or_explicit(volatile atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_or_explicit(atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_xor(volatile atomic-integral *, integral) noexcept;
integral atomic_fetch_xor(atomic-integral *, integral) noexcept;
integral atomic_fetch_xor_explicit(volatile atomic-integral *, integral, memory_order) noexcept;
integral atomic_fetch_xor_explicit(atomic-integral *, integral, memory_order) noexcept;

// 29.6.4, partial specializations for pointers
template <class T> T* atomic_fetch_add(volatile atomic<T*>*, ptrdiff_t) noexcept;
template <class T> T* atomic_fetch_add(atomic<T*>*, ptrdiff_t) noexcept;
template <class T> T* atomic_fetch_add_explicit(volatile atomic<T*>*, ptrdiff_t, memory_order) noexcept;
template <class T> T* atomic_fetch_add_explicit(atomic<T*>*, ptrdiff_t, memory_order) noexcept;
template <class T> T* atomic_fetch_sub(volatile atomic<T*>*, ptrdiff_t) noexcept;
template <class T> T* atomic_fetch_sub(atomic<T*>*, ptrdiff_t) noexcept;
template <class T> T* atomic_fetch_sub_explicit(volatile atomic<T*>*, ptrdiff_t, memory_order) noexcept;
template <class T> T* atomic_fetch_sub_explicit(atomic<T*>*, ptrdiff_t, memory_order) noexcept;

// 29.6.5, initialization
#define ATOMIC_VAR_INIT(value) see below

// 29.7, flag type and operations
struct atomic_flag;
bool atomic_flag_test_and_set(volatile atomic_flag*) noexcept;
bool atomic_flag_test_and_set(atomic_flag*) noexcept;
bool atomic_flag_test_and_set_explicit(volatile atomic_flag*, memory_order) noexcept;
bool atomic_flag_test_and_set_explicit(atomic_flag*, memory_order) noexcept;
void atomic_flag_clear(volatile atomic_flag*) noexcept;
void atomic_flag_clear(atomic_flag*) noexcept;
void atomic_flag_clear_explicit(volatile atomic_flag*, memory_order) noexcept;
void atomic_flag_clear_explicit(atomic_flag*, memory_order) noexcept;
#define ATOMIC_FLAG_INIT see below

// 29.8, fences
extern "C" void atomic_thread_fence(memory_order) noexcept;
extern "C" void atomic_signal_fence(memory_order) noexcept;

typedef enum memory_order {
    memory_order_relaxed, memory_order_consume, memory_order_acquire,
    memory_order_release, memory_order_acq_rel, memory_order_seq_cst
} memory_order;
}

template<class T> struct atomic;
template<> struct atomic< integral >;
template<class T> struct atomic<T*>;

template <class T> struct atomic {
    void store(T, memory_order = memory_order_seq_cst) volatile noexcept;
    void store(T, memory_order = memory_order_seq_cst) noexcept;
    T load(memory_order = memory_order_seq_cst) const volatile noexcept;
    T load(memory_order = memory_order_seq_cst) const noexcept;
    ...
};

VOLATILE? NOT FOR HERE. (C++ volatile gives neither atomicity nor ordering; it is not a threading primitive.)

template <class T> struct atomic {
    void store(T, memory_order = memory_order_seq_cst);
    T load(memory_order = memory_order_seq_cst);
    ...
};

// C-like:
T atomic_load(const A * object);
T atomic_load(const volatile A * object);
T atomic_load_explicit(const A * object, memory_order);
T atomic_load_explicit(const volatile A * object, memory_order);
bool atomic_compare_exchange_weak_explicit(
    volatile A * object, C * expected, C desired,
    memory_order success, memory_order failure);

template <class T> struct atomic // generic T, integral, pointer, bool
{
    atomic() = default;
    constexpr atomic(T);
    atomic(const atomic&) = delete;
    atomic& operator=(const atomic&) = delete;

    void store(T, memory_order = memory_order_seq_cst);
    T load(memory_order = memory_order_seq_cst);
    T operator=(T t) { store(t); return t; }
    operator T() { return load(); }

    T exchange(T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order, memory_order);
    bool compare_exchange_strong(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_strong(T&, T, memory_order, memory_order);
    bool is_lock_free();
};

Note from Standard: “Type arguments that are not also statically initializable may be difficult to use.”

“is_lock_free == true”

struct atomic_flag {
    atomic_flag() = default;
    atomic_flag(const atomic_flag&) = delete;
    atomic_flag& operator=(const atomic_flag&) = delete;
    bool test_and_set(memory_order = memory_order_seq_cst);
    void clear(memory_order = memory_order_seq_cst);
};

atomic_flag guard = ATOMIC_FLAG_INIT;

template <class T> struct atomic {  // pointers and integrals
    ...  // as above, plus:

    // both:
    T fetch_add(T, memory_order = memory_order_seq_cst);
    T fetch_sub(T, memory_order = memory_order_seq_cst);
    T operator++(int);
    T operator--(int);
    T operator++();   // atomic! not the same as: a = a + 1
    T operator--();
    T operator+=(T);
    T operator-=(T);

    // integrals only:
    T fetch_and(T, memory_order = memory_order_seq_cst);
    T fetch_or(T, memory_order = memory_order_seq_cst);
    T fetch_xor(T, memory_order = memory_order_seq_cst);
    T operator&=(T);
    T operator|=(T);
    T operator^=(T);
};

template <class T> struct atomic {
    void store(T, memory_order = memory_order_seq_cst);
    T load(memory_order = memory_order_seq_cst);
    T exchange(T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order, memory_order);
    bool compare_exchange_strong(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_strong(T&, T, memory_order, memory_order);
};

enum memory_order {
    memory_order_relaxed, memory_order_consume,
    memory_order_acquire, memory_order_release,
    memory_order_acq_rel, memory_order_seq_cst
};

Sequential Consistency vs Acquire/Release vs Relaxed

An operation A synchronizes-with an operation B if:
A is a store to some atomic variable m, with memory_order_release (or memory_order_seq_cst),
B is a load from the same variable m, with memory_order_acquire (or memory_order_seq_cst),
and B reads the value stored by A.

P.S. Locks use Acquire/Release (not S.C.)

It’s like Git: work locally, commit locally, push (release); someone else (who was also working locally) pulls (acquire)… If you only have push (release) without pull (acquire) it doesn’t work. Except when it does, because it is a leaky git that is always doing some automatic push/pull behind your back (but no guarantees as to exactly what and when).

(diagram: interleavings of relaxed / acquire / release / seq_cst operations; the mismatched ordering goes boom)

Sequential Consistency vs Acquire/Release vs Relaxed

BONUS QUESTION!
(diagram: which write each thread observes first: x 1st / y 1st, x 2nd / y 2nd; y == 0 implies x 1st, x == 0 implies y 1st, ∴ z != 0)


template <class T> struct atomic {
    T exchange(T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order, memory_order);
    bool compare_exchange_strong(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_strong(T&, T, memory_order, memory_order);
};

static SharedData data;
static atomic<bool> locked;
if (!locked.exchange(true)) {   // re-entrancy check
    do_exclusive(data);
    locked.store(false);
}

static SharedData data;
static atomic<bool> locked;
if (!locked.exchange(true, memory_order_acquire)) {
    do_exclusive(data);
    locked.store(false, memory_order_release);
}

static SharedData data;
static atomic_flag locked;
if (!locked.test_and_set()) {
    do_exclusive(data);
    locked.clear();
}

template <class T> struct atomic {
    bool compare_exchange_weak(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order, memory_order);
    bool compare_exchange_strong(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_strong(T&, T, memory_order, memory_order);
};

static atomic<int> count;
int next;
int was = count.load();
do {
    next = was + 1;
} while (!count.compare_exchange_weak(was, next));

OR...

static atomic<int> count;
int next;
int was = count.load();
do {
    next = was + 1;
} while (!count.compare_exchange_weak(was, next, acq_rel, relaxed));

// compare_exchange, atomically:
// if (count untouched) count = next; else was = count;
// if (count == was)    count = next; else was = count;

template <class T> struct atomic {
    T fetch_add(T, memory_order = memory_order_seq_cst);
    T operator++(int);
};

static atomic<int> count;
count++;  // or count.fetch_add(1, memory_order_acq_rel);

// but when the update isn't a plain add, use the CAS loop:
do {
    next = (was + 1) % length;
} while (!count.compare_exchange_weak(was, next));

template <class T> struct atomic {
    bool compare_exchange_weak(T&, T, memory_order = memory_order_seq_cst);
    bool compare_exchange_weak(T&, T, memory_order, memory_order);
};

// Lock Free Stack...
void push(T val) { }
T pop();

void push(Val val)
{
    //...?
}

void push(Val val)
{
    Node * newhead = new Node(val);
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head;
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head;
    newhead->next = oldhead;
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head;
    newhead->next = oldhead;
    stack.head = newhead;
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head;
    do {
        next = was + 1;   // recall the counter's CAS loop...
    } while (!count.compare_exchange_weak(was, next));
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head;
    do {
        newhead->next = oldhead;
    } while (!stack.head.compare_exchange_weak(oldhead, newhead));
}

void push(Val val)
{
    Node * newhead = new Node(val);
    Node * oldhead = stack.head.load(relaxed);
    do {
        newhead->next = oldhead;
    } while (!stack.head.compare_exchange_weak(oldhead, newhead, release));
}

Val pop()
{
    //...?
}

Val pop()
{
    Node * oldhead = stack.head;
}

Val pop()
{
    Node * oldhead = stack.head;
    Node * newhead = oldhead->next;
}

Val pop()
{
    Node * oldhead = stack.head;
    if (oldhead == NULL)
        throw StackEmpty();
    Node * newhead = oldhead->next;
}


Val pop()
{
    Node * oldhead = stack.head.load(acquire);
    do {
        if (!oldhead)
            throw StackEmpty();
        newhead = oldhead->next;
    } while (!head.compare_exchange_weak(oldhead, newhead, acq_rel));
    Val val = oldhead->val;
    recycle(oldhead);
    return val;
}


Val pop()
{
    Node * oldhead = stack.head.load(acquire);
    do {
        if (!oldhead)
            throw StackEmpty();
        newhead = oldhead->next;
    } while (!head.compare_exchange_weak(oldhead, newhead, acq_rel));
    Val val = oldhead->val;
    recycle(oldhead);
    return val;
}

someone’s been eating my porridge


ABA

Val pop()
{
    Node * oldhead = stack.head.load(acquire);
    do {
        if (!oldhead)
            throw StackEmpty();
        newhead = oldhead->next;
    } while (!head.compare_exchange_weak(oldhead, newhead, acq_rel));
    Val val = oldhead->val;
    recycle(oldhead);
    return val;
}

// compare_exchange: if (count untouched) count = next; else was = count;
// compare_exchange: if (count == was)    count = next; else was = count;
// "untouched" and "==" are not the same thing: that is the ABA problem.

More Scary Things
42
memory_order_consume
False Sharing
Bonus Question…

More Scary Things: 42

// Thread 1:
r1 = y.load(relaxed);
x.store(r1, relaxed);
assert(r1 == 42);

// Thread 2:
r2 = x.load(relaxed);
y.store(42, relaxed);
assert(r2 == 42);

// relaxed ordering (in principle) permits r1 == r2 == 42: a value out of thin air

More Scary Things: memory_order_consume

// writer:
foo = 42;   // ‘publish’
p = &foo;

// reader:
int y = bar + 17;
if (p != NULL)
    int x = *p;
assert(x == 42);

(diagram: memory cells | foo | … | bar | … | p |)

More Scary Things: False Sharing

// Thread 1:
next.load();
// Thread 2:
prev.compare_exchange(...);

struct Node {
    atomic<Node*> next;
    atomic<Node*> prev;
    //…
};

(diagram: | next | prev | on the same cache line)

More Scary Things: Bonus Question…

atomic<T> is implemented with locks if/when T is too large to be natively atomic.
Locks use acquire/release semantics.
Atomics offer sequential consistency.
How do you implement sequential consistency given only acquire/release?
(Note that acq + rel != seq_cst; for example, recall…)

Sequential Consistency vs Acquire/Release vs Relaxed

BONUS QUESTION!
(diagram: which write each thread observes first: x 1st / y 1st, x 2nd / y 2nd; y == 0 implies x 1st, x == 0 implies y 1st, ∴ z != 0)


Thanks to…
Michael Wong, IBM Toronto Lab, michaelw@ca.ibm.com
Hans Boehm, Hewlett-Packard, http://www.hpl.hp.com/personal/Hans_Boehm
Joe Pfeiffer, New Mexico State University, http://www.cs.nmsu.edu/~pfeiffer
Bartosz Milewski, http://bartoszmilewski.com
Anthony Williams, http://www.justsoftwaresolutions.co.uk/
Dmitriy V’jukov, http://www.1024cores.net/
David Hilley, http://www.thegibson.org/blog/
Jeremy Manson, http://jeremymanson.blogspot.com

Use Locks!

^ atomics (from Abstrusegoose.com, licensed under CC BY-NC 3.0)