Multicore programming

1 Multicore programming
Unordered sets: concurrent hash tables
Week 2 - Wednesday
Trevor Brown

2 Last time
Queues
  Strict ordering is not so helpful, and not scalable
Relaxed data structures
  Somewhat ordered, for better scalability
  Multi-Queue: scalable, and ordering can be "close to" FIFO w.h.p.
Ordered vs. unordered sets
  Fixed-size, probing, insert-only hash tables (more this class)

3 Simplest hash table possible: locking, insert-only, probing, fixed capacity
Array of Bucket: data[]
Bucket: (Lock, Key)
Simplest possible locking protocol: lock a bucket before any access (read or write) to its key
[Figure: probing example. Insert(7) hashes to 2; Insert(9) also hashes to 2 and probes to cell 3; Insert(3) hashes to 5]

4 Implementation and correctness
Reasoning about correctness: important lemmas
  Buckets change only once, from NULL to non-NULL (not sketched here)
  If we probe a sequence of cells at times t1 < t2 < ... < tn and see non-NULL values, then we would see the same values if we did all of the probing atomically at time tn or later
Linearization points (LPs):
  return false: the last read of data[index]
  return true: the write to data[index]
  return FULL: the last read of data[index]

int insert(int key)
 1   int h = hash(key);
 2   for (int i = 0; i < capacity; ++i) {
 3     int index = (h + i) % capacity;
 4     lock(index);
 5     int found = data[index];
 6     if (found == key) {
 7       unlock(index);
 8       return false;
 9     } else if (found == NULL) {
10       data[index] = key;
11       unlock(index);
12       return true;
13     }
14     unlock(index);
15   }
16   return FULL;
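
For concreteness, here is one way the slide's pseudocode could be fleshed out into compilable C++. This is a minimal sketch, assuming integer keys with 0 reserved as the EMPTY (NULL) value and -1 as the FULL sentinel; CAPACITY and the multiplicative hash are illustrative, not from the slides.

  #include <mutex>

  constexpr int CAPACITY = 1024;  // fixed table size (illustrative)
  constexpr int EMPTY    = 0;     // assumption: key 0 is never inserted (plays the role of NULL)
  constexpr int FULL     = -1;    // sentinel return value, as on the slide

  int data[CAPACITY];             // global, so all slots start EMPTY
  std::mutex locks[CAPACITY];     // one lock per bucket

  unsigned hash(int key) { return (unsigned) key * 2654435761u; }  // placeholder multiplicative hash

  int insert(int key) {
      unsigned h = hash(key);
      for (int i = 0; i < CAPACITY; ++i) {
          int index = (h + i) % CAPACITY;
          std::lock_guard<std::mutex> g(locks[index]);  // lock the bucket before reading its key
          int found = data[index];
          if (found == key) {
              return false;       // key already present (LP: this read)
          } else if (found == EMPTY) {
              data[index] = key;  // LP: the write to data[index]
              return true;
          }
          // bucket holds a different key: the guard unlocks, and we probe the next cell
      }
      return FULL;                // probed every bucket; the table is full
  }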

5 Linearizability proof sketch
Case: return FULL (LP: last read of data[index]), the easy case
  We returned at line 16 after seeing data[index] != NULL for every index
  Non-NULL buckets don't change
  So, when we did the last read of data[index], all buckets were full!
  So, our insert would also return FULL if performed atomically at the LP
(The insert code from slide 4 is shown alongside.)

6 Linearizability proof sketch
Case: return false (LP: last read of data[index])
  Suppose we return false in the k-th iteration
  In the first k-1 iterations, we see data[index] != NULL and != key
  Claim: if our insert(key) ran atomically at the LP, it would perform k iterations and see the same values as it sees in the concurrent execution; the buckets read in earlier iterations are non-NULL and do not change, so the values read in previous iterations are unchanged at the LP
  In its k-th iteration, the atomic insert(key) reads data[index] at the same time as the concurrent insert(key), so it sees data[index] == key
  So, it returns false
(The insert code from slide 4 is shown alongside, with a timeline of the reads in iterations 0 through k.)

7 Linearizability proof sketch
Case: return true (LP: write to data[index])
The argument is similar to the previous case:
  Suppose we return true in the k-th iteration
  In the first k-1 iterations, we see data[index] != NULL and != key
  Claim: if our insert(key) ran atomically at the LP, it would perform k iterations and see the same values as it sees in the concurrent execution
  In its k-th iteration, the atomic insert(key) reads data[index] at the same time as the concurrent insert(key) writes, so it sees data[index] == NULL
  So, it performs the write and returns true
(The insert code from slide 4 is shown alongside.)

8 Using lock-free techniques
Instead of locking data[index] and then, if it is NULL, writing to it:
  Use CAS to atomically change data[index] from NULL to the new value
  If the CAS fails, someone else inserted there
    If it's our value, we return false (because the value is now already present)
    Otherwise, we go to the next cell and retry

int insert(int key)
  int h = hash(key);
  for (int i = 0; i < capacity; ++i) {
    int index = (h + i) % capacity;
    int found = data[index];
    if (found == key) {
      return false;
    } else if (found == NULL) {
      if (CAS(&data[index], NULL, key)) {
        return true;
      } else if (data[index] == key) {
        return false;
      }
    }
  }
  return FULL;

Correctness is fairly similar to the lock-based algorithm (with some additional reasoning, especially for the case where the CAS fails)
Progress argument?
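
A compilable sketch of the CAS version using std::atomic, under the same assumptions as the lock-based sketch (0 reserved as EMPTY, -1 as FULL). One detail worth noting: compare_exchange_strong writes the value it actually saw into its expected argument on failure, which replaces the slide's explicit re-read of data[index].

  #include <atomic>

  constexpr int CAPACITY = 1024;
  constexpr int EMPTY    = 0;    // assumption: key 0 is never inserted
  constexpr int FULL     = -1;

  std::atomic<int> data[CAPACITY];  // global, so all slots start EMPTY

  unsigned hash(int key) { return (unsigned) key * 2654435761u; }  // placeholder hash

  int insert(int key) {
      unsigned h = hash(key);
      for (int i = 0; i < CAPACITY; ++i) {
          int index = (h + i) % CAPACITY;
          int found = data[index].load();
          if (found == key) {
              return false;                    // key already present
          } else if (found == EMPTY) {
              int expected = EMPTY;
              // CAS: atomically change EMPTY -> key, or learn what beat us to the slot
              if (data[index].compare_exchange_strong(expected, key)) {
                  return true;                 // LP: the successful CAS
              } else if (expected == key) {
                  return false;                // another thread inserted our key first
              }
              // another key landed here first; probe the next cell
          }
      }
      return FULL;
  }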

9 An alternative design: chaining
Array of concurrent linked lists (any concurrent linked list works)
[Figure: Insert(7) hashes to 2; Insert(9) hashes to 2 and joins bucket 2's list; Insert(3) hashes to 5]
Advantages?
  Reduced need for table expansion
  Deletion is easy (handled by the list)
Disadvantages?
  Lists add overhead (vs. ~one CAS)
  Cache misses!
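
A minimal sketch of chaining, assuming a fixed bucket count and the simplest possible concurrent list: one lock per chain. The slides note that any concurrent linked list works, so a lock-free list could be substituted for the mutex-protected one used here; NBUCKETS and the hash are illustrative.

  #include <forward_list>
  #include <mutex>

  constexpr int NBUCKETS = 1024;   // illustrative bucket count

  unsigned hash(int key) { return (unsigned) key * 2654435761u; }  // placeholder hash

  struct Bucket {
      std::mutex lock;             // protects this bucket's chain
      std::forward_list<int> keys; // the chain itself
  };

  Bucket table[NBUCKETS];

  bool insert(int key) {
      Bucket& b = table[hash(key) % NBUCKETS];
      std::lock_guard<std::mutex> g(b.lock);
      for (int k : b.keys)
          if (k == key) return false;    // already present
      b.keys.push_front(key);
      return true;
  }

  bool erase(int key) {
      Bucket& b = table[hash(key) % NBUCKETS];
      std::lock_guard<std::mutex> g(b.lock);
      auto prev = b.keys.before_begin();
      for (auto it = b.keys.begin(); it != b.keys.end(); ++it, ++prev)
          if (*it == key) {
              b.keys.erase_after(prev);  // deletion is handled entirely by the list
              return true;
          }
      return false;
  }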

10 Hashing numeric strings with bad hash functions: SDBM
[Figure: distribution of SDBM hash values over numeric string keys]

11 Hashing numeric strings with bad hash functions: DJB2A
[Figure: distribution of DJB2A hash values over numeric string keys]

12 Hashing numeric strings with bad hash functions: FNV1
[Figure: distribution of FNV1 hash values over numeric string keys]

13 Hashing numeric strings with good hash functions: FNV1-A
[Figure: distribution of FNV1-A hash values over numeric string keys]

14 Hashing numeric strings with good hash functions: Murmur2
[Figure: distribution of Murmur2 hash values over numeric string keys]
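
To make the FNV1 vs. FNV1-A contrast above concrete, here are the standard 32-bit variants. The only difference is whether each byte is xored in before or after the multiply; reducing the result modulo the table capacity (not shown on the slides) is the usual final step.

  #include <cstdint>

  // 32-bit FNV-1: multiply by the FNV prime, then xor in the next byte
  uint32_t fnv1(const char* s) {
      uint32_t h = 2166136261u;   // FNV offset basis
      for (; *s; ++s) {
          h *= 16777619u;         // FNV prime
          h ^= (uint8_t) *s;
      }
      return h;
  }

  // 32-bit FNV-1a: xor in the next byte, then multiply (the "good" variant above)
  uint32_t fnv1a(const char* s) {
      uint32_t h = 2166136261u;
      for (; *s; ++s) {
          h ^= (uint8_t) *s;
          h *= 16777619u;
      }
      return h;
  }

  // e.g., a table would start probing at fnv1a("12345") % capacity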

15 What about deletion?
For example, in our lock-based, fixed-size, probing hash table:
  Insert(7) hashes to 2
  Insert(9) hashes to 2, and probes to cell 3
  Insert(3) hashes to 5
  Delete(7) hashes to 2, and sets data[2] = NULL
  Delete(9) hashes to 2, sees NULL at cell 2, and concludes 9 is absent: incorrect!
Simply writing NULL breaks probing: any key (like 9) that probed past the deleted cell when it was inserted can no longer be found.

16 A workaround: tombstones
Introduce a special key value called TOMBSTONE
To delete a key: instead of setting the key to NULL, set it to TOMBSTONE
TOMBSTONE essentially acts like a regular key (one that never matches a key you are trying to find/insert/delete)
  Insertions must still probe past it to find NULL
  Deletions must still probe past it to find the desired key

17 How tombstones fix our problem
  Insert(7) hashes to 2
  Insert(9) hashes to 2, and probes to cell 3
  Insert(3) hashes to 5
  Delete(7) hashes to 2, and sets data[2] = TOMBSTONE
  Delete(9) hashes to 2, probes past the TOMBSTONE at cell 2, and correctly finds and deletes 9 at cell 3

18 Lock-based delete with tombstones
bool erase(int key)
  int h = hash(key);
  for (int i = 0; i < capacity; ++i) {
    int index = (h + i) % capacity;
    lock(index);
    int found = data[index];
    if (found == NULL) {
      unlock(index);
      return false;
    } else if (found == key) {
      data[index] = TOMBSTONE;
      unlock(index);
      return true;
    }
    unlock(index);
  }
  return false;

Insert and contains remain the same:
  They already skip past keys that do not match the argument key, and TOMBSTONE acts like such a key
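
A sketch of the (unchanged) contains under tombstones, reusing the declarations from the lock-based sketch after slide 4 and adding an assumed TOMBSTONE value that is never a real key.

  constexpr int TOMBSTONE = -2;   // assumption: -2 is never inserted as a real key

  bool contains(int key) {
      unsigned h = hash(key);
      for (int i = 0; i < CAPACITY; ++i) {
          int index = (h + i) % CAPACITY;
          std::lock_guard<std::mutex> g(locks[index]);
          int found = data[index];
          if (found == EMPTY) return false;  // only a truly empty slot ends the probe
          if (found == key)   return true;   // TOMBSTONE never equals key, so we probe past it
      }
      return false;                          // probed the whole table
  }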

19 A lock-free version
bool erase(int key)
  int h = hash(key);
  for (int i = 0; i < capacity; ++i) {
    int index = (h + i) % capacity;
    int found = data[index];
    if (found == NULL) {
      return false;
    } else if (found == key) {
      return CAS(&data[index], key, TOMBSTONE);
    }
  }
  return false;

If the CAS succeeds, we deleted key
If the CAS fails, another thread concurrently deleted key (we return false, since we did not delete it)

20 Downsides of tombstones?
TOMBSTONEs are never cleaned up, and cleaning them up seems hard:
  We cannot remove a TOMBSTONE until no key remaining in the table probed past that TOMBSTONE when it was inserted
Thought: if we do table expansion, there's no need to copy over TOMBSTONEs...
[Figure: table holding 9 and 3, highlighting the cells 9 probed when it was inserted]
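
To make the expansion thought concrete: a single-threaded sketch that rebuilds into a larger array and simply skips TOMBSTONEs, reusing EMPTY, TOMBSTONE, and hash from the sketches above. A real concurrent expansion must also stop or help in-flight operations, which is not shown here.

  // Rebuild into a larger table; TOMBSTONEs are dropped, live keys are re-probed.
  int* expand(const int* old_data, int old_cap, int new_cap) {
      int* fresh = new int[new_cap]();                 // zero-initialized: all EMPTY
      for (int i = 0; i < old_cap; ++i) {
          int k = old_data[i];
          if (k == EMPTY || k == TOMBSTONE) continue;  // no need to copy tombstones
          unsigned h = hash(k);
          for (int j = 0; j < new_cap; ++j) {          // probe for a free slot in the new table
              int idx = (h + j) % new_cap;
              if (fresh[idx] == EMPTY) { fresh[idx] = k; break; }
          }
      }
      return fresh;
  }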

21 Summary
Unordered sets
  Fixed-size, probing, insert-only hash tables
  Lock-based (with a linearizability proof sketch) and lock-free
  Probing vs. chaining
Deletion in concurrent hash tables

