Yaser Zhian, Dead Mage. IGDI, Workshop 10, May 30th-31st, 2013.

Today:
    auto, decltype, range-based for, etc.
    Lambdas
    Rvalue references and moving
    Variadic templates
Tomorrow:
    Threads, atomics and the memory model
    Other features: initializer lists, constexpr, etc.
    Library updates: new containers, smart pointers, etc.
    General Q&A
http://yaserzt.com/

Ways to write code that is:
    Cleaner and less error-prone
    Faster
    Richer and can do more (occasionally)
Know thy language.
You can never have too many tools.
Elegance in interface; complexity (if any) in implementation.
Take everything here with a grain of salt!

We will use Visual Studio 2012 (with the Nov 2012 CTP compiler update). Go ahead: open it up, make a project, add a file, and set the toolset in the project options. Write a simple "hello, world" and run it. Please do try and write code; the sound of keyboards does not disrupt the workshop.

We will also use the Ideone online compiler at http://ideone.com/. You might want to register an account there. Do so while I talk about unimportant stuff and answer any questions… Remember to select the C++11 compiler. Write and run a simple program there as well.

What is the type of a + b? Is it the type of a? int? double?
It depends on operator +, on a and b, and on a whole lot of name lookup, type deduction and overload resolution rules.
Even if you don't know, the compiler always does:
    decltype(a + b) c;
    c = a + b;
(Instead of e.g. double c;)

What's the return type of this function?
    template <typename T, typename U>
    ??? Add (T const & a, U const & b) { return a + b; }
One answer is decltype(T() + U()). Not entirely correct. (Why?)
The correct answer is decltype(a + b). But that won't compile.

What is wrong with this?
    template <typename T, typename U>
    decltype(a + b) Add (T const & a, U const & b) { return a + b; }
The names a and b are used in the return type before they are declared. This is basically the motivation behind the new function declaration syntax in C++11.

The new function declaration syntax:
    auto Fun (type1 p1) -> returntype;
The previous function template then becomes:
    template <typename T, typename U>
    auto Add (T const & a, U const & b) -> decltype(a + b) { return a + b; }
This works for ordinary functions too:
    auto Sqr (float x) -> float { return x * x; }
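A minimal sketch of the trailing return type in use. Add is the slide's own template; the two calls mentioned afterwards are illustrative assumptions, chosen so that the deduced return type differs from both argument types.

```cpp
#include <string>
#include <type_traits>

// The return type is whatever `a + b` yields. It can only be spelled after
// the parameter list, because that is where a and b come into scope.
template <typename T, typename U>
auto Add (T const & a, U const & b) -> decltype(a + b)
{
    return a + b;
}

// int + double is double, which is neither T nor U's "obvious" choice.
static_assert(std::is_same<decltype(Add(1, 2.5)), double>::value,
              "Add(int, double) returns double");
```

Add(1, 2.5) yields a double, and Add(std::string("ab"), "cd") yields a std::string, without either return type being written out by hand.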

Putting auto where a type name is expected instructs the compiler to infer the type from the initializing expression, e.g.
    auto foo = a * b + c * d;
    auto bar = new std::map<K, V>;
    auto baz = new std::map<std::pair<K1, K2>, std::vector<V> >::const_iterator;

Some more examples:
    auto x = 0;
    auto y = do_stuff (x);
    auto const & z = do_stuff (x);
    auto f = std::bind (foo, _1, 42);
    for (auto i = c.begin(), e = c.end(); i != e; ++i) {…}

Sometimes, you have to be very careful with auto and decltype:
    std::vector<int> const v (1);
    auto a = v[0];          // int
    decltype(v[1]) b = 1;   // int const &
    auto c = 0;             // int
    auto d = c;             // int
    decltype(c) e = 1;      // int
    decltype((c)) f = c;    // int &
    decltype(0) g;          // int
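The deductions listed on this slide can be checked by the compiler itself. The following is a sketch using static_assert and std::is_same; the function name AutoVersusDecltype and the final assignment are illustrative assumptions, added only to show that f really aliases c.

```cpp
#include <type_traits>
#include <vector>

int AutoVersusDecltype ()
{
    std::vector<int> const v (1);   // one element, value 0
    auto a = v[0];                  // auto drops const and the reference: int
    decltype(v[0]) b = v[0];        // decltype keeps both: int const &
    int c = 0;
    decltype(c) e = 1;              // unparenthesized name: declared type, int
    decltype((c)) f = c;            // parenthesized lvalue expression: int &

    static_assert(std::is_same<decltype(a), int>::value, "a is int");
    static_assert(std::is_same<decltype(b), int const &>::value, "b is int const &");
    static_assert(std::is_same<decltype(e), int>::value, "e is int");
    static_assert(std::is_same<decltype(f), int &>::value, "f is int &");

    f = a + b + e + 4;              // f aliases c, so this assigns to c
    return c;
}
```

If any of the four deductions were different, the translation unit would fail to compile rather than misbehave at run time.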

How common is this code snippet?
    vector<string> v;
    for (vector<string>::iterator i = v.begin(); i != u.end(); i++)
        cout << *i << endl;
How many problems can you see?
Here's a better version:
    for (auto i = v.cbegin(), e = v.cend(); i != e; ++i)
        cout << *i << endl;
This is the best version:
    for (auto const & s : v)
        cout << s << endl;

This loop:
    for (for-range-declaration : expression) statement
will get expanded to something like this:
    {
        auto && __range = range-init;
        for (auto __begin = begin-expr, __end = end-expr; __begin != __end; ++__begin)
        {
            for-range-declaration = *__begin;
            statement
        }
    }
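To make the expansion concrete, here is a sketch that writes the same loop twice: once as a range-based for and once expanded by hand. The function names and the local names range, first and last are assumptions; the real expansion uses reserved names such as __range that user code may not declare.

```cpp
#include <vector>

// The range-based form.
int SumRangeFor (std::vector<int> const & v)
{
    int sum = 0;
    for (int x : v)
        sum += x;
    return sum;
}

// Roughly what the compiler generates for the loop above.
int SumExpanded (std::vector<int> const & v)
{
    int sum = 0;
    {
        auto && range = v;                       // binds to lvalues and rvalues alike
        for (auto first = range.begin(), last = range.end(); first != last; ++first)
        {
            int x = *first;                      // the for-range-declaration
            sum += x;                            // the statement
        }
    }
    return sum;
}
```

Note that end() is evaluated once, before the loop starts, which is one reason the range-based form is never slower than the hand-written iterator loop.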

Introducing more functionality into C++

Lambdas are unnamed functions that you can write almost anywhere in your code (anywhere you can write an expression). For example:
    [] (int x) -> int {return x * x;}
    [] (int x, int y) {return x < y ? y : x;}
What does this do?
    [] (double v) {cout << v;} (4.2);

Storing lambdas:
    auto sqr = [] (int x) -> int {return x * x;};
    auto a = sqr(42);
    std::function<int (int, int)> g = [] (int a, int b) {return a + b;};
    int d = g(43, -1);
    auto h = std::bind ([] (int x, int y) {return x < y ? y : x;}, _1, 0);
    auto n = h (-7);

Consider these functions:
    template <typename C, typename F>
    void Apply (C & c, F const & f) { for (auto & v : c) f(v); }
    template <typename C, typename T>
    void Apply2 (C & c, function<void (T)> const & f) { for (auto & v : c) f(v); }
Used like this:
    int a [] = {10, 3, 17, -1};
    Apply (a, [] (int & x) {x += 2;});

    Apply (a, [] (int x) {cout << x << endl;});
    int y = 2;
    Apply (a, [y] (int & x) {x += y;});
    int s = 0;
    Apply (a, [&s] (int x) {s += x;});
    Apply (a, [y, &s] (int x) {s += x + y;});

Capture by value:
    int y = 2;
    auto f = [y] (int & x) {x += y;};
    y = 10;
    Apply (a, f);
Capture by reference:
    int y = 2;
    auto f = [&y] (int & x) {x += y;};
    y = 10;
    Apply (a, f);
By the way, you can capture everything by value ([=]) or by reference ([&]).
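The difference between the two snippets is when y is read. The sketch below runs both and returns the first array element; Apply is the slide's template, while the two wrapper function names are assumptions made for this example.

```cpp
template <typename C, typename F>
void Apply (C & c, F const & f)
{
    for (auto & v : c)
        f(v);
}

// [y] copies y when the lambda is created, so the later y = 10 is not seen.
int FirstAfterByValueCapture ()
{
    int a [] = {10, 3, 17, -1};
    int y = 2;
    auto f = [y] (int & x) {x += y;};
    y = 10;
    Apply (a, f);
    return a[0];    // 10 + 2
}

// [&y] stores a reference, so the lambda reads y at call time.
int FirstAfterByReferenceCapture ()
{
    int a [] = {10, 3, 17, -1};
    int y = 2;
    auto f = [&y] (int & x) {x += y;};
    y = 10;
    Apply (a, f);
    return a[0];    // 10 + 10
}
```

A by-reference capture must not outlive the captured variable; a by-value capture has no such restriction, at the cost of a copy.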

C++ used to have a tendency to copy stuff around if you weren't paying attention! What happens when we call this function?
    vector<string> GenerateNames ()
    {
        return vector<string> (50, string(100, '*'));
    }
A whole lot of useless stuff is created and copied around. There are all sorts of techniques and tricks to avoid those copies.

    string s = string("Hello") + " " + "world.";
1. string (char const *)
2. string operator + (string const &, char const *)
3. string operator + (string const &, char const *)
4. This ultimately calls the copy c'tor string (string const &).
In total, there can be as many as 5 (or even 7) temporary strings here.
(Unrelated note) Some allocations can be avoided with "Expression Templates".

When dealing with anonymous temporary objects, the compiler can elide their (copy-)construction, which is called "copy elision". This is a unique kind of optimization, as the compiler is allowed to remove code that has side effects! Return Value Optimization is one kind of copy elision.

C++11 introduces "rvalue references" to let you work with (kinda) temporary objects. Rvalue references are denoted with &&, e.g.
    int && p = 3;
or
    void foo (std::string && s);
or
    Matrix::Matrix (Matrix && that) {…}

In situations where you used to copy the data from an object into another object, if your first object is an rvalue (i.e. temporary), now you can "move" the data from that to this. Two important usages of rvalue references are "move construction" and "move assignment", e.g.
    string (string && that);                // move c'tor
and
    string & operator = (string && that);   // move assignment

template <typename T>
class Matrix
{
private:
    T * m_data;
    unsigned m_rows, m_columns;
public:
    Matrix (unsigned rows, unsigned columns);
    ~Matrix ();
    Matrix (Matrix const & that);
    template <typename U> Matrix (Matrix<U> const & that);
    Matrix & operator = (Matrix const & that);
    Matrix (Matrix && that);
    Matrix & operator = (Matrix && that);
    ...
};

template <typename T>
class Matrix
{
    ...
    unsigned rows () const;
    unsigned columns () const;
    unsigned size () const;
    T & operator () (unsigned row, unsigned col);   // m(5, 7) = 0;
    T const & operator () (unsigned row, unsigned col) const;
    template <typename U>
    auto operator + (Matrix<U> const & rhs) const -> Matrix<decltype(T() + U())>;
    template <typename U>
    auto operator * (Matrix<U> const & rhs) const -> Matrix<decltype(T() * U())>;
};

Matrix (unsigned rows, unsigned columns)
    : m_rows (rows), m_columns (columns), m_data (new T [rows * columns])
{
}

~Matrix ()
{
    delete[] m_data;
}

Matrix (Matrix const & that)
    : m_rows (that.m_rows), m_columns (that.m_columns)
    , m_data (new T [that.m_rows * that.m_columns])
{
    std::copy (that.m_data, that.m_data + (m_rows * m_columns), m_data);
}

Matrix & operator = (Matrix const & that)
{
    if (this != &that)
    {
        // Allocate and copy first, so *this is untouched if new throws.
        T * new_data = new T [that.m_rows * that.m_columns];
        // The element count must come from that; m_rows and m_columns still hold the old size here.
        std::copy (that.m_data, that.m_data + (that.m_rows * that.m_columns), new_data);
        delete[] m_data;
        m_data = new_data;
        m_rows = that.m_rows;
        m_columns = that.m_columns;
    }
    return *this;
}

Matrix (Matrix && that)
    : m_rows (that.m_rows), m_columns (that.m_columns), m_data (that.m_data)
{
    that.m_rows = that.m_columns = 0;
    that.m_data = nullptr;
}

Matrix & operator = (Matrix && that)
{
    if (this != &that)
    {
        delete[] m_data;
        m_rows = that.m_rows;
        m_columns = that.m_columns;
        m_data = that.m_data;
        that.m_rows = that.m_columns = 0;
        that.m_data = nullptr;
    }
    return *this;
}

struct SomeClass
{
    string s;
    vector<int> v;
public:
    // WRONG! WRONG! WRONG!
    // Doesn't move, just copies.
    SomeClass (SomeClass && that) : s (that.s), v (that.v) {}

    // RIGHT: that.s and that.v are lvalues here, so they must be cast back to rvalues.
    SomeClass (SomeClass && that) : s (std::move(that.s)), v (std::move(that.v)) {}
};
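One way to observe that the std::move version really moves is to check that the vector's buffer changes owner instead of being reallocated. This is a sketch: the default constructor, the element type int and the function MoveStealsBuffer are assumptions made for the demonstration.

```cpp
#include <string>
#include <utility>
#include <vector>

struct SomeClass
{
    std::string s;
    std::vector<int> v;

    SomeClass () : s (1000, '*'), v (1000, 42) {}

    // Members of a named rvalue reference are lvalues; std::move restores
    // their rvalue-ness so the members' own move constructors are chosen.
    SomeClass (SomeClass && that)
        : s (std::move(that.s)), v (std::move(that.v)) {}
};

// True if the new object took over the old vector's heap buffer.
bool MoveStealsBuffer ()
{
    SomeClass a;
    int const * before = a.v.data();
    SomeClass b (std::move(a));
    return b.v.data() == before && b.v.size() == 1000;
}
```

With the "WRONG" constructor from the slide, b.v would own a freshly allocated copy and the pointer comparison would fail.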

In principle, std::move should look like this:
    template <typename T>
    ??? move (??? something) { return something; }
What should the argument type be? T&&? T&? Both? Neither?
We need to be able to pass in both lvalues and rvalues.

We can overload move() like this:
    move (T && something)
    move (T & something)
But that will lead to an exponential explosion of overloads if the function has more arguments.
"Reference collapse" rule in C++98:
    int& & is collapsed to int&.
In C++11, the rules are (in addition to the above):
    int& && is collapsed to int&.
    int&& & is collapsed to int&.
    int&& && is collapsed to int&&.

Therefore, only the T&& version should be enough. If you pass in an lvalue to our move, the actual argument type will collapse into T&, which is what we want (probably). So, move looks like this thus far:
    template <typename T>
    ??? move (T && something) { return something; }

Now, what is the return type? T&&? It should be T&& in the end. But if we declare it so, and move() is called on an lvalue,
    then T will be SomeType&,
    then T&& will be SomeType& &&,
    then it will collapse into SomeType&,
    then we will be returning an lvalue reference from move(), which will prevent any moving at all.
We need a way to remove the & if T already has one.

We need a mechanism to map one type to another; in this case, to map T& and T&& to T, and T to T. There is no simple way to describe the process, but this is how it's done:
    template <typename T>
    struct RemoveReference { typedef T type; };
With that, RemoveReference<int>::type will be equivalent to int. But we are not done.

Now we specialize:
    template <typename T>
    struct RemoveReference<T &> { typedef T type; };
    template <typename T>
    struct RemoveReference<T &&> { typedef T type; };
Now, RemoveReference<int &>::type will be int too.

Our move now has the correct signature:
    template <typename T>
    typename RemoveReference<T>::type && move (T && something) { return something; }
But it's not correct. That "something" in there is an lvalue, remember?

…so we cast it to an rvalue reference:
    template <typename T>
    typename RemoveReference<T>::type && move (T && something)
    {
        return static_cast<typename RemoveReference<T>::type &&> (something);
    }
Hopefully, this is correct now!
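Putting the last few slides together gives a version that compiles and can be checked against its stated goals. This is a sketch of the slides' own move, placed in a namespace named sketch (an assumption) so that it cannot be confused with std::move; the function MoveYieldsRvalue is likewise illustrative.

```cpp
#include <string>
#include <type_traits>

namespace sketch
{
    // Maps T, T& and T&& all to T.
    template <typename T> struct RemoveReference       { typedef T type; };
    template <typename T> struct RemoveReference<T &>  { typedef T type; };
    template <typename T> struct RemoveReference<T &&> { typedef T type; };

    // Accepts lvalues and rvalues; always returns an rvalue reference.
    template <typename T>
    typename RemoveReference<T>::type && move (T && something)
    {
        return static_cast<typename RemoveReference<T>::type &&> (something);
    }
}

static_assert(std::is_same<sketch::RemoveReference<int>::type, int>::value, "T");
static_assert(std::is_same<sketch::RemoveReference<int &>::type, int>::value, "T&");
static_assert(std::is_same<sketch::RemoveReference<int &&>::type, int>::value, "T&&");

bool MoveYieldsRvalue ()
{
    std::string s ("hello");
    // Called on an lvalue, T is std::string&, yet the result is std::string&&.
    static_assert(std::is_same<decltype(sketch::move(s)), std::string &&>::value,
                  "lvalue in, rvalue reference out");
    std::string t (sketch::move(s));    // selects string's move constructor
    return t == "hello";
}
```

In real code, use std::move and std::remove_reference from the standard library; this version exists only to show that nothing magical is involved.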

There is no such thing as universal references! But, due to the C++11 reference collapsing, sometimes when you write T && v, you can get anything: both lvalues and rvalues. These can be thought of as "universal references". Two preconditions:
    There must be T&&,
    and there must be type deduction.
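The two preconditions can be seen by asking the compiler what T was deduced to. In the sketch below, BoundToLvalue and TakesOnlyRvalues are hypothetical names introduced for illustration.

```cpp
#include <type_traits>
#include <utility>

// T&& with deduction: T becomes X& for an lvalue argument and X for an rvalue.
template <typename T>
bool BoundToLvalue (T &&)
{
    return std::is_lvalue_reference<T>::value;
}

// && without deduction: an ordinary rvalue reference.
// Calling TakesOnlyRvalues(x) with an int variable x would not compile.
bool TakesOnlyRvalues (int &&)
{
    return true;
}
```

For an int variable x, BoundToLvalue(x) is true, while BoundToLvalue(42) and BoundToLvalue(std::move(x)) are both false.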

Any questions?

A Simple Method to Do RAII and Transactions

This is an extremely common pattern in programming:
    if (<acquire>)
    {
        if (!<action>)
            <rollback>
        <release>
    }
For example:
    if (OpenDatabase())
    {
        if (!WriteNameAndAge())
            UnwriteNameAndAge();
        CloseDatabase ();
    }

The object-oriented way might be:
    class RAII
    {
        RAII () { <acquire> }
        ~RAII () { <release> }
    };
    …
    RAII raii;
    try { <action> }
    catch (...) { <rollback> throw; }

What happens if you need to compose actions?
    if (<action1>)
    {
        if (<action2>)
        {
            if (!<action3>)
            {
                <rollback2>
                <rollback1>
            }
            <cleanup2>
        }
        else
            <rollback1>
        <cleanup1>
    }

What if we could write this:
    SCOPE_EXIT { <cleanup> };
    SCOPE_FAIL { <rollback> };

Extremely easy to compose:
    SCOPE_EXIT { <cleanup1> };
    SCOPE_FAIL { <rollback1> };
    SCOPE_EXIT { <cleanup2> };
    SCOPE_FAIL { <rollback2> };

To start, we want some way to execute code when the execution is exiting the current scope. The key idea here is to write a class that accepts a lambda at construction and calls it at destruction. But how do we store a lambda for later use? We can use std::function, but should we?

Let's start like this:
    template <typename F>
    class ScopeGuard
    {
    public:
        ScopeGuard (F f) : m_f (std::move(f)) {}
        ~ScopeGuard () {m_f();}
    private:
        F m_f;
    };
And a helper function:
    template <typename F>
    ScopeGuard<F> MakeScopeGuard (F f)
    {
        return ScopeGuard<F> (std::move(f));
    }

This is used like this:
    int * p = new int [1000];
    auto g = MakeScopeGuard ([&]{delete[] p;});
    //…
Without MakeScopeGuard(), we can't construct ScopeGuard instances that use lambdas, because they don't have type names.
But we don't have a way to tell the scope guard not to execute its clean-up code (in case we don't want to roll back).

So we add a flag and a method to dismiss the scope guard when needed:
    template <typename F>
    class ScopeGuard
    {
    public:
        ScopeGuard (F f) : m_f (std::move(f)), m_dismissed (false) {}
        ~ScopeGuard () {if (!m_dismissed) m_f();}
        void dismiss () {m_dismissed = true;}
    private:
        F m_f;
        bool m_dismissed;
    };

A very important part is missing though… A move constructor:
    ScopeGuard (ScopeGuard && that)
        : m_f (std::move(that.m_f))
        , m_dismissed (std::move(that.m_dismissed))
    {
        that.dismiss ();
    }
And we should disallow copying, etc.:
    private:
        ScopeGuard (ScopeGuard const &);
        ScopeGuard & operator = (ScopeGuard const &);
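Assembled from the preceding slides, the whole guard fits in a few lines. The class and helper below follow the slides; the explicit keyword, the function RunGuards and its log counter are assumptions added to show one guard firing and one being dismissed.

```cpp
#include <utility>

template <typename F>
class ScopeGuard
{
public:
    explicit ScopeGuard (F f) : m_f (std::move(f)), m_dismissed (false) {}

    // Moving transfers the duty to run m_f; the source is dismissed so the
    // clean-up code runs exactly once.
    ScopeGuard (ScopeGuard && that)
        : m_f (std::move(that.m_f)), m_dismissed (that.m_dismissed)
    {
        that.dismiss ();
    }

    ~ScopeGuard () { if (!m_dismissed) m_f(); }

    void dismiss () { m_dismissed = true; }

private:
    ScopeGuard (ScopeGuard const &);                // not copyable
    ScopeGuard & operator = (ScopeGuard const &);   // not assignable

    F m_f;
    bool m_dismissed;
};

template <typename F>
ScopeGuard<F> MakeScopeGuard (F f)
{
    return ScopeGuard<F> (std::move(f));
}

// One guard runs at scope exit, the other is dismissed and stays silent.
int RunGuards ()
{
    int log = 0;
    {
        auto cleanup  = MakeScopeGuard ([&]{ log += 1; });
        auto rollback = MakeScopeGuard ([&]{ log += 10; });
        rollback.dismiss ();    // the "transaction" succeeded
    }
    return log;
}
```

Storing the lambda as a template parameter F rather than in a std::function avoids a possible heap allocation and an indirect call, which is one answer to the "but should we?" question above.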

Our motivating example becomes:
    auto g1 = MakeScopeGuard ([&]{ <cleanup1> });
    auto g2 = MakeScopeGuard ([&]{ <rollback1> });
    auto g3 = MakeScopeGuard ([&]{ <cleanup2> });
    auto g4 = MakeScopeGuard ([&]{ <rollback2> });
    g2.dismiss ();
    g4.dismiss ();

Do you feel lucky?!

Templates with a variable number of arguments. For example:
    template <typename... Ts>
    size_t log (int severity, char const * msg, Ts&&... vs);
Remember the old way?
    size_t log (int severity, char const * msg, ...);
using va_list, va_start, va_arg and va_end in <cstdarg>. Or:
    #define LOG_ERROR(msg, ...) \
        log (SevError, msg, __VA_ARGS__)

Almost the same for classes:
    template <typename... Ts>
    class ManyParents : Ts...
    {
        ManyParents () : Ts ()... {}
    };
Now these are valid:
    ManyParents<A> a;
    ManyParents<A, B> b;

    template <typename T, typename... PTs>
    T * Create (T * parent, PTs&&... ps)
    {
        T * ret = new T;
        ret->create (parent, std::forward<PTs> (ps)...);
        return ret;
    }
PTs and ps are not types, values, arrays, tuples or initializer lists. They are new "things".

Rules of expansion are very interesting:
    Ts...           →  T1, T2, …, Tn
    Ts&&...         →  T1&&, …, Tn&&
    A<Ts>...        →  A<T1>, …, A<Tn>
    f(42, vs...)    →  f(42, v1, …, vn)
    f(42, vs)...    →  f(42, v1), …, f(42, vn)
One more operation you can do:
    size_t items = sizeof...(Ts);   // or vs
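The difference between expanding a pack inside a call and expanding the call itself is easiest to see with numbers. In this sketch all function names (Sum, Twice, SumWith42, SumOfDoubles, Count) are assumptions made for illustration.

```cpp
#include <cstddef>

// Recursion terminator and a simple variadic sum of ints.
int Sum () { return 0; }

template <typename T, typename... Ts>
int Sum (T first, Ts... rest)
{
    return first + Sum (rest...);
}

int Twice (int x) { return 2 * x; }

// Pattern `vs...`: the pack is expanded inside a single call,
// giving Sum(42, v1, ..., vn).
template <typename... Ts>
int SumWith42 (Ts... vs)
{
    return Sum (42, vs...);
}

// Pattern `Twice(vs)...`: the whole call expression is repeated per element,
// giving Sum(Twice(v1), ..., Twice(vn)).
template <typename... Ts>
int SumOfDoubles (Ts... vs)
{
    return Sum (Twice (vs)...);
}

// sizeof... yields the number of elements in the pack.
template <typename... Ts>
std::size_t Count (Ts...)
{
    return sizeof...(Ts);
}
```

The position of the ellipsis decides what gets repeated: everything to its left, back to the start of the enclosing pattern.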

Let's implement the sizeof... operator as an example.
    template <typename... Ts>
    struct CountOf;
    template <>
    struct CountOf<> { enum { value = 0 }; };
    template <typename T, typename... Ts>
    struct CountOf<T, Ts...> { enum { value = CountOf<Ts...>::value + 1 }; };
Use CountOf like this:
    size_t items = CountOf<Ts...>::value;

Let's implement a function named IsOneOf() that can be used like this:
    IsOneOf (42, 3, -1, 3.1416, 42.0f, 0)
which should return true, or
    IsOneOf (0, "hello")
which should fail to compile.
How do we start the implementation? Remember, think recursively!

    template <typename A, typename T>
    bool IsOneOf (A && a, T && t0)
    {
        return a == t0;
    }
    template <typename A, typename T0, typename... Ts>
    bool IsOneOf (A && a, T0 && t0, Ts&&... ts)
    {
        return a == t0 || IsOneOf (a, std::forward<Ts> (ts)...);
    }
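For reference, here is the same pair of overloads as a compilable unit, with comments on how the recursion unwinds. The code follows the slide; only the comments are added.

```cpp
#include <utility>

// Base case: one candidate left.
template <typename A, typename T>
bool IsOneOf (A && a, T && t0)
{
    return a == t0;
}

// Recursive case: test the first candidate, then recurse on the rest.
// When exactly one candidate remains, both overloads match and the
// non-variadic one is preferred as the more specialized template.
template <typename A, typename T0, typename... Ts>
bool IsOneOf (A && a, T0 && t0, Ts&&... ts)
{
    return a == t0 || IsOneOf (a, std::forward<Ts> (ts)...);
}
```

IsOneOf(42, 3, -1, 3.1416, 42.0f, 0) is true because 42 == 42.0f, and the || short-circuits so later candidates are not compared. A call such as IsOneOf(0, "hello") is rejected at compile time because an int cannot be compared with a character array.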

Finally!

The machine we code for (or want to code for):
    Each statement in your high-level program gets translated into several machine instructions.
    The (one) CPU runs the instructions in the program one by one.
    All interactions with memory finish before the next instruction starts.
This is absolutely not true, even in a single-threaded program running on a single-CPU machine.
    It hasn't been true for about 2-3 decades now.
    CPU technology, cache and memory systems, and compiler optimizations make it not true.

Even in a multi-core world, we assume that:
    Each CPU runs the instructions one by one.
    All interactions of each CPU with memory finish before the next instruction starts on that CPU.
    Memory ops from different CPUs are serialized by the memory system and effected one before the other.
    The whole system behaves as if we were executing some interleaving of all threads as a single stream of operations on a single CPU.
This is even less true (if that's at all possible!)

YOUR COMPUTER DOES NOT EXECUTE THE PROGRAMS YOU WRITE.
If it did, your programs would have been 10s or 100s of times slower.
It makes it appear as though your program is being executed.

The memory model: the expected behavior of hardware with respect to shared data among threads of execution.
    Obviously important for correctness.
    Also important for optimization.
    Important if you want to have the slightest chance to know what the heck is going on!

We have sequential consistency if:
    "the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program."
E.g., if A, B, C are threads in a program,
    This is SC:    A0, A1, B0, C0, C1, A2, C2, C3, C4, B1, A3
    This is not:   A0, A1, B0, C0, C2, A2, C1, C3, C4, B1, A3

You have a race condition if:
    A memory location can be simultaneously accessed by two threads, and at least one thread is a writer.
"Memory location" is defined as:
    either a non-bitfield variable,
    or a sequence of non-zero-length bitfields.
"Simultaneously" is defined as: you can't prove that one happens before the other.
Remember that in case of a race condition in your code, anything can happen. Anything.

Transformations (reorder, change, add, remove) to your code:
    Compiler: eliminate/combine subexpressions, move code around, etc.
    Processor: execute your code out-of-order or speculatively, etc.
    Caches: delay your writes, poison or share data with each other, etc.
But you don't care about all this. What you care about are:
    The code that you wrote
    The code that gets finally executed
You don't (usually) care who did what; you only care that your correctly-synchronized program behaves as if some sequentially-consistent interleaving of the instructions (especially memory ops) of your threads is being executed. Also, all writes are visible atomically, globally, simultaneously.

Consider Peterson's algorithm (both flags are atomic and initially zero):
Thread 1:
    flag1 = 1;          // (1)
    if (flag2 != 0)     // (2)
        <resolve contention>
    else
        <enter critical section>
Thread 2:
    flag2 = 1;          // (3)
    if (flag1 != 0)     // (4)
        <resolve contention>
    else
        <enter critical section>
Does this actually work?

The system (compiler, processor, memory) gives you sequentially-consistent execution, as long as your program is data-race free. This is the memory model that C++11 (and C11) expect compilers and hardware to provide for the programmer.
The memory model is a contract between the programmer and the system:
    The programmer promises to correctly synchronize her program (no race conditions).
    The system promises to provide the illusion that it is executing the program you wrote.

Transaction: a logical op on related data that maintains an invariant.
    Atomic: all or nothing
    Consistent: takes the system from one valid state to another
    Independent: correct in the presence of other transactions on the same data
Example (we have two bank accounts, A and B):
    Begin transaction (we acquire exclusivity)
        1. Add X units to account B
        2. Subtract X units from account A
    End transaction (we release exclusivity)

Critical Region (or Critical Section): code that must be executed in isolation from the rest of the program. It is a tool that is used to implement transactions.
E.g., you'd implement a CR using a mutex like this:
    mutex MX;   // MX is a mutex protecting X
    …
    {
        lock_guard<mutex> lock (MX);    // Acquire
        <read/write X>
    }                                   // Release
The same principle applies when using atomic variables, etc.

Important rule: code can't move out of a CR. E.g., if you have:
    MX.lock ();     // Acquire
    x = 42;
    MX.unlock ();   // Release
The system can't transform it to:
    x = 42;
    MX.lock ();     // Acquire
    MX.unlock ();   // Release
or to:
    MX.lock ();     // Acquire
    MX.unlock ();   // Release
    x = 42;

If we have:
    x = 7; M.lock(); y = 42; M.unlock(); z = 0;
Which of these can/can't be done?
    (a) M.lock(); x = 7; y = 42; z = 0; M.unlock();
    (b) M.lock(); z = 0; y = 42; x = 7; M.unlock();
    (c) z = 0; M.lock(); y = 42; M.unlock(); x = 7;

A pattern emerges! For SC acquire/release:
    You can't move things up across an acquire.
    You can't move things down across a release.
    You can't move an acquire up across a release.
Acquire and release are also called one-way barriers (or one-way fences).
A release store makes its prior accesses visible to an acquire load that sees (pairs with) that store.
Important: a release pairs with an acquire in another thread.
    A mutex lock or loading from an atomic variable is an acquire.
    A mutex unlock or storing to an atomic variable is a release.
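The pairing of a release store with an acquire load can be sketched as a small "publish" handshake between two threads. The names PublishAndRead, payload and ready are assumptions; the atomic operations use the default sequentially-consistent ordering, which includes release semantics for the store and acquire semantics for the load.

```cpp
#include <atomic>
#include <thread>

int PublishAndRead ()
{
    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready (false);

    std::thread producer ([&]{
        payload = 42;                   // cannot sink below the store that follows
        ready.store (true);             // release: publishes the write above
    });

    while (!ready.load ())              // acquire: pairs with the store
        std::this_thread::yield ();
    int seen = payload;                 // cannot be hoisted above the load

    producer.join ();
    return seen;
}
```

Because the read of payload happens after an acquire load that saw the release store, there is no data race on payload even though it is a plain int.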

Weapons of Mass Destruction

Defined in header <atomic>. Use like std::atomic<T> x; e.g.,
    std::atomic<int> ai;
or
    std::atomic_int ai;
or
    std::atomic<S> as;   // S: a trivially copyable struct
Might use locks (spinlocks) under the hood. Check with x.is_lock_free().
No operation works on two atomics at once or returns an atomic.
Available ops are =, conversion to T, ++, --, +=, -=, &=, |=, ^=
There is also:
    T exchange (T desired, …)
    bool compare_exchange_strong (T& expected, T desired, …)
    bool compare_exchange_weak (T& expected, T desired, …)
You can also use std::atomic_flag, which has test_and_set(…) and clear(…). (And don't forget ATOMIC_FLAG_INIT.)
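Operations that are not in the built-in list are usually built from compare_exchange in a retry loop. As a sketch, here is an atomic "store the maximum"; the function name AtomicMax is an assumption and not part of the standard library.

```cpp
#include <atomic>

// Atomically replaces target with max(target, candidate).
void AtomicMax (std::atomic<int> & target, int candidate)
{
    int current = target.load ();
    // compare_exchange_weak stores candidate only if target still equals
    // current. On failure (including spurious failure) it reloads current
    // with the latest value, so the loop condition is re-checked against
    // fresh data.
    while (current < candidate &&
           !target.compare_exchange_weak (current, candidate))
    {
    }
}
```

The weak form is preferred inside loops like this one, since it may be cheaper on some hardware and the loop already tolerates spurious failures.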

Represented by class std::thread (in header <thread>):
    default-constructible and movable (not copyable)
    template <typename F, typename... Args> explicit thread (F&& f, Args&&... args);
Should always call join() or detach():
    t.join() waits for thread t to finish its execution
    t.detach() detaches t from the actual running thread
    otherwise the destructor will terminate the program
Get information about a thread object using:
    std::thread::id get_id ()
    bool joinable ()

The static function unsigned std::thread::hardware_concurrency() returns the number of threads that the hardware can run concurrently.
There is also a namespace std::this_thread with these members:
    std::thread::id get_id ()
    void yield ()
    void sleep_for (<duration>)
    void sleep_until (<time point>)

There are four types of mutexes in C++ (in header <mutex>):
    mutex: basic mutual exclusion device
    timed_mutex: provides locking with a timeout
    recursive_mutex: can be acquired more than once by the same thread
    recursive_timed_mutex
They all provide lock(), unlock() and bool try_lock().
The timed versions provide bool try_lock_for (<duration>) and bool try_lock_until (<time point>).
Generally, you want to use a std::lock_guard<> to lock/unlock the mutex: it locks the mutex on construction and unlocks it on destruction.
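A short sketch of std::mutex with std::lock_guard protecting a shared counter. The function name CountWithMutex and its parameters are assumptions made for this example.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Starts `threads` workers that each increment a shared counter
// `perThread` times, and returns the final count.
int CountWithMutex (int threads, int perThread)
{
    int counter = 0;
    std::mutex m;                           // protects counter
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t)
        pool.emplace_back ([&]{
            for (int i = 0; i < perThread; ++i)
            {
                std::lock_guard<std::mutex> lock (m);   // acquire
                ++counter;
            }                                           // release
        });

    for (auto & th : pool)
        th.join ();                         // every thread must be joined
    return counter;
}
```

Without the lock_guard, the increments would race and the result would be unpredictable; with it, the result is always threads * perThread.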

It is not uncommon to need to do something once and exactly once, e.g., initialization of some state, setting up of some resource, etc. Multiple threads might attempt this, because they need the result of the initialization, setup, etc.
You can use (from header <mutex>):
    template <typename F, typename... Args>
    void call_once (std::once_flag & flag, F && f, Args&&... args);
Like this (remember that it also acts as a barrier):
    std::once_flag init_done;
    void ThreadProc ()
    {
        std::call_once (init_done, []{InitSystem();});
    }

async() can be used to run functions asynchronously (from header <future>):
    template <typename F, typename... Args>
    std::future<R> async (F && f, Args&&... args);   // R: the return type of f
It returns immediately, but runs f(args...) asynchronously (possibly on another thread), e.g.
    future<int> t0 = async (FindMin, v);
or
    future<int> t1 = async ([&]{return FindMin(v);});
An object of type std::future<T> basically means that someone has promised to put a T in there in the future. Incidentally, the other half of future<T> is called promise<T>.
The key operation is T get (), which waits for the promised value.

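#include <algorithm>
#include <future>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

string flip (string s)
{
    reverse (s.begin(), s.end());
    return s;
}

int main ()
{
    vector<future<string>> v;
    v.push_back (async ([] {return flip(   ",olleH");}));
    v.push_back (async ([] {return flip(" weN evarB");}));
    v.push_back (async ([] {return flip(    "!dlroW");}));
    // get() blocks until each result is ready, so the output order is fixed.
    for (auto & i : v)
        cout << i.get();
    cout << endl;
    return 0;
}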

Really Getting Rid of NULL

What do you do when you have
    char const * get_object_name (int id), and the object ID does not exist in your objects?
    unsigned get_file_size (char const * path), and the file does not exist?
    double sqrt (double x), and x is negative?
You might use NULL, or special error values, or even exceptions, but the fact remains that sometimes, you don't want to return (or pass around) anything. You want some values to be optional.
Aha! Let's write a class that allows us to work with such values…
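As a starting point, a deliberately simplified sketch of such a class might look like the following. It is not the workshop's implementation: the names Optional, has_value, get and GetObjectName are assumptions, and this version requires T to be default-constructible, a restriction a fuller implementation would remove by managing raw storage.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

template <typename T>
class Optional
{
public:
    Optional () : m_value (), m_has_value (false) {}                    // "nothing"
    Optional (T value) : m_value (std::move(value)), m_has_value (true) {}

    bool has_value () const { return m_has_value; }

    // Access is checked: reading an empty Optional is a programming error.
    T const & get () const
    {
        if (!m_has_value)
            throw std::logic_error ("empty Optional");
        return m_value;
    }

private:
    T m_value;
    bool m_has_value;
};

// Hypothetical lookup: only ID 1 exists.
Optional<std::string> GetObjectName (int id)
{
    if (id == 1)
        return std::string ("player");
    return Optional<std::string> ();
}
```

The caller can no longer forget that the value may be missing: the return type says so, and no special value such as NULL or -1 has to be reserved.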

Any questions?

Implementation of Optional
Discussion of wrapping objects with locking
General wrapping of asynchronous transactions
Initializer lists and uniform initialization
constexpr
std::unordered_ containers
Smart pointers:
    std::unique_ptr
    std::shared_ptr
    Implementing shared pointer

"If you write C-style code, you'll end up with C-style bugs." -- Bjarne Stroustrup
"If you write Java-style code, you'll have Java-level performance."

Contact us at http://deadmage.com/ And me at yaserzt@gmail.com
