Review: threads and processes Landon Cox January 20, 2016.

Intro to processes Remember, for any area of OS, ask What interface does the hardware provide? What interface does the OS provide? Physical reality? Single computer (CPUs + memory) Execute instructions from many programs What an application sees? Each app thinks it has its own CPU + memory

Hardware, OS interfaces [Diagram: the hardware provides memory and CPUs; the OS sits between the hardware and the applications; each application (Job 1, Job 2, Job 3) sees its own CPU and memory]

What is a process? Informal A program in execution Running code + things it can read/write Process ≠ program Formal ≥ 1 threads in their own address space (soon threads will share an address space)

Parts of a process Thread Sequence of executing instructions Active: does things Address space Data the process uses as it runs Passive: acted upon by threads

Play analogy Process is like a play performance Program is like the play’s script Threads Address space What are the threads? What is the address space?

What is in the address space? Program code Instructions, also called “text” Data segment Global variables, static variables Heap (where “new” memory comes from) Stack Where local variables are stored

Review of the stack Each stack frame contains a function’s Local variables Parameters Return address Saved values of calling function’s registers The stack enables recursion

Example stack
Code (the address next to each function is the return address that follows its call):
    void C () { A (0); }                    // 0x8048347
    void B () { C (); }                     // 0x8048354
    void A (int tmp){ if (tmp) B (); }      // 0x8048361
    int main () { A (1); return 0; }        // 0x804838c
Stack (memory runs from 0xfffffff down to 0x0; SP points at the newest frame):
    main: const1=1, const2=0
    A:    tmp=1, RA=0x804838c
    B:    RA=0x8048361
    C:    const=0, RA=0x8048354
    A:    tmp=0, RA=0x8048347   <- SP

The stack and recursion
Code:
    void A (int bnd){ if (bnd) A (bnd-1); }   // 0x8048361
    int main () { A (3); return 0; }          // 0x804838c
Stack (memory runs from 0xfffffff down to 0x0; SP points at the newest frame):
    main: const1=3, const2=0
    A:    bnd=3, RA=0x804838c
    A:    bnd=2, RA=0x8048361
    A:    bnd=1, RA=0x8048361
    A:    bnd=0, RA=0x8048361   <- SP
How can recursion go wrong? Can overflow the stack: keep adding frame after frame …
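The frame-per-call behavior can be sketched in a few lines. This is a minimal illustration, not code from the slides: `count_frames` is a hypothetical variant of `A` that returns how many of its own frames were live at the deepest point of the recursion.

```cpp
// Hypothetical variant of the slide's A(): each call pushes one frame,
// so the return value counts the frames of count_frames on the stack
// at the moment the recursion bottoms out.
int count_frames(int bnd) {
    if (bnd) {
        return 1 + count_frames(bnd - 1);  // this frame plus all deeper ones
    }
    return 1;  // deepest frame: bnd == 0
}
```

Calling `count_frames(3)` corresponds to the four `A` frames (bnd = 3, 2, 1, 0) in the diagram. A very large argument would keep adding frames until the stack overflows.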

What is missing? What state isn’t in the address space? Registers Program counter (PC) General purpose registers Review architecture for more details

Multiple threads in an addr space Several actors on a single set Sometimes they interact (speak, dance) Sometimes they are apart (different scenes)

Private vs global thread state What state is private to each thread? PC (where actor is in his/her script) Stack, SP (actor’s mindset) What state is shared? Global variables, heap (props on set) Code (like lines of a play)
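A small sketch of the private/shared split, assuming C++11 `std::thread`. The names `shared_total`, `total_lock`, `worker`, and `run_private_vs_shared` are hypothetical and chosen only for illustration.

```cpp
#include <mutex>
#include <thread>

static int shared_total = 0;    // global: shared by every thread
static std::mutex total_lock;   // protects shared_total

// Each thread runs worker on its own stack, so `local` is private.
void worker(int n) {
    int local = 0;              // lives in this thread's stack frame
    for (int i = 1; i <= n; ++i) {
        local += i;             // no lock needed: nobody else can see it
    }
    std::lock_guard<std::mutex> guard(total_lock);
    shared_total += local;      // shared state: touch only under the lock
}

int run_private_vs_shared() {
    shared_total = 0;
    std::thread a(worker, 10);   // contributes 55
    std::thread b(worker, 100);  // contributes 5050
    a.join();
    b.join();
    return shared_total;
}
```

Both threads execute the same code, but each has its own PC, stack, and copy of `local`; only the global needs synchronization.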

Concurrency Having multiple threads active at one time Thread is the unit of concurrency Primary topics How threads cooperate on a single task How multiple threads can share the CPU

Address spaces Address space Unit of “state partitioning” Primary topics Many addr spaces sharing physical memory Efficiency Safety (protection)

Cooperating threads Assume each thread has its own CPU We will relax this assumption later CPUs run at unpredictable speeds Can be stalled for any number of reasons Source of non-determinism Memory CPU Thread A CPU Thread B CPU Thread C

Non-determinism and ordering Time Thread A Thread B Thread C Global ordering Why do we care about the global ordering? Might have dependencies between events Different orderings can produce different results Why is this ordering unpredictable? Can’t predict how fast processors will run

Non-determinism example 1 Thread A: cout << “ABC”; Thread B: cout << “123”; Possible outputs? “A1BC23”, “ABC123”, … Impossible outputs? Why? “321CBA”, “B12C3A”, … What is shared between threads? Screen, maybe the output buffer

Non-determinism example 2 y=10; Thread A: int x = y+1; Thread B: y = y*2; Possible results? A goes first: x = 11 and y = 20 B goes first: y = 20 and x = 21 What is shared between threads? Variable y
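The two outcomes can be checked by replaying each ordering sequentially. This is only a sketch: `a_then_b` and `b_then_a` are hypothetical helper names, and each returns the pair {x, y}.

```cpp
#include <utility>

// Thread A's statement runs before Thread B's.
std::pair<int, int> a_then_b() {
    int y = 10;
    int x = y + 1;  // Thread A reads y == 10
    y = y * 2;      // Thread B
    return {x, y};  // {11, 20}
}

// Thread B's statement runs before Thread A's.
std::pair<int, int> b_then_a() {
    int y = 10;
    y = y * 2;      // Thread B
    int x = y + 1;  // Thread A reads y == 20
    return {x, y};  // {21, 20}
}
```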

Non-determinism example 3 x=0; Thread A: x = 1; Thread B: x = 2; Possible results? B goes first: x = 1 A goes first: x = 2 Is x = 3 possible?

Example 3, continued What if “x = <value>;” is implemented as two steps: x := x & 0, then x := x | <value>? Consider this schedule: Thread A: x := x & 0; Thread B: x := x & 0; Thread B: x := x | 2; Thread A: x := x | 1. Result: x = 3, a value neither thread wrote.
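The schedule can be replayed deterministically in one thread to see where the 3 comes from. This is a sketch under the slide's assumption that an assignment is split into a clear step and an OR step; `replay_schedule` is a hypothetical name.

```cpp
// Replays the interleaving from the slide: both writers clear x before
// either one ORs in its value, so the two values end up merged.
int replay_schedule(int first_value, int second_value) {
    int x = 0;
    x = x & 0;             // first writer: clear
    x = x & 0;             // second writer: clear
    x = x | second_value;  // second writer: OR in its value
    x = x | first_value;   // first writer: OR in its value
    return x;
}
```

With the values 1 and 2 the result is 3 regardless of which thread is called the first writer.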

Atomic operations Must know what operations are atomic before we can reason about cooperation Atomic Indivisible Happens without interruption Between start and end of atomic action No events from other threads can occur

Review of examples Print example (ABC, 123) What did we assume was atomic? What if “print” is atomic? What if printing a char was not atomic? Arithmetic example ( x=y+1, y=y*2 ) What did we assume was atomic?

Atomicity in practice On most machines Memory assignment/reference is atomic E.g.: a=1, a=b Many other instructions are not atomic E.g.: double-precision floating point store (often involves two memory operations)

Virtual/physical interfaces Hardware OS Applications SW atomic operations HW atomic operations If you don’t have atomic operations, you can’t make one.

Constraining concurrency Synchronization Controlling thread interleavings Some events are independent No shared state Relative order of these events doesn’t matter Other events are dependent Output of one can be input to another Their order can affect program results

Goals of synchronization 1. All interleavings must give correct result Correct concurrent program Works no matter how fast threads run Important for your projects! 2. Constrain program as little as possible Why? Constraints slow program down Constraints create complexity

Raising the level of abstraction Locks Also called mutexes Provide mutual exclusion Prevent threads from entering a critical section Lock operations Lock (aka Lock::acquire ) Unlock (aka Lock::release )

Lock operations
Lock: wait until lock is free, then acquire it
    do {
        if (lock is free) {
            lock = 1
            break
        }
    } while (1)
The test and the set inside the loop must be atomic with respect to other threads calling this code
This is a busy-waiting implementation; we’ll fix this in a few lectures
Unlock: atomic lock = 0

Elements of locking 1. The lock is initially free 2. Threads acquire lock before an action 3. Threads release lock when action completes 4. Lock() must wait if someone else has lock Key idea All synchronization involves waiting Threads are either running or blocked
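The four elements map onto a standard mutex roughly as follows. This is a minimal sketch using C++11 `std::mutex` rather than the course's own lock class; `counter`, `counter_lock`, `add_many`, and `run_counter_demo` are hypothetical names.

```cpp
#include <mutex>
#include <thread>

static int counter = 0;
static std::mutex counter_lock;  // 1. a std::mutex starts out free

void add_many(int n) {
    for (int i = 0; i < n; ++i) {
        // 2. acquire before the action; 4. blocks if another thread holds it
        std::lock_guard<std::mutex> guard(counter_lock);
        ++counter;  // the protected action
    }               // 3. guard's destructor releases the lock each iteration
}

int run_counter_demo() {
    counter = 0;
    std::thread a(add_many, 100000);
    std::thread b(add_many, 100000);
    a.join();
    b.join();
    return counter;  // every increment is preserved
}
```

Without the lock, the two read-modify-write sequences could interleave and lose updates.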

Example: thread-safe queue dequeue () { lock (qLock); element=NULL; if (head != NULL) { // if queue non-empty if (head->next!=0) { // remove head element=head->next; head->next= head->next->next; } else { element = head; head = NULL; } } unlock (qLock); return element; } enqueue () { lock (qLock) // ptr is private // head is shared new_element = new node(); if (head == NULL) { head = new_element; } else { node *ptr; // find queue tail for (ptr=head; ptr->next!=NULL; ptr=ptr->next){} ptr->next=new_element; } new_element->next=0; unlock(qLock); } What can go wrong?

Thread-safe queue Can enqueue unlock anywhere? No Must leave shared data In a consistent/sane state Data invariant “consistent/sane state” “always” true enqueue () { lock (qLock) // ptr is private // head is shared new_element = new node(); if (head == NULL) { head = new_element; } else { node *ptr; // find queue tail for (ptr=head; ptr->next!=NULL; ptr=ptr->next){} ptr->next=new_element; } unlock(qLock); // safe? new_element->next=0; }

Invariants What are the queue invariants? Each node appears once (from head to null) Enqueue results in prior list + new element Dequeue removes exactly one element Can invariants ever be false? Must be Otherwise you could never change states

More on invariants So when is the invariant broken? Can only be broken while lock is held And only by thread holding the lock Really a “public” invariant The data’s state when the lock is free Like having your house tidy before guests arrive Hold lock whenever accessing shared data

More on invariants What about reading shared data? Still must hold lock Else another thread could break invariant (Thread A prints Q as Thread B enqueues)

Intro to ordering constraints Say you want dequeue to wait while the queue is empty Can we just busy-wait? No! Still holding lock dequeue () { lock (qLock); element=NULL; while (head==NULL) {} // remove head element=head->next; head->next=NULL; unlock (qLock); return element; }

Release lock before spinning? dequeue () { lock (qLock); element=NULL; unlock (qLock); while (head==NULL) {} lock (qLock); // remove head element=head->next; head->next=NULL; unlock (qLock); return element; } What can go wrong? Head might be NULL when we try to remove entry

One more try Does it work? Seems ok Why? Shared state is protected Downside? Busy-waiting Wasteful dequeue () { lock (qLock); element=NULL; while (head==NULL) { unlock (qLock); lock (qLock); } // remove head element=head->next; head->next=NULL; unlock (qLock); return element; }

Ideal solution Would like dequeueing thread to “sleep” Add self to “waiting list” Enqueuer can wake up when Q is non-empty Problem: what to do with the lock? Why can’t dequeueing thread sleep with lock? Enqueuer would never be able to add

Release the lock before sleep? dequeue () { acquire lock … if (queue empty) { release lock add self to wait list sleep acquire lock } … release lock } enqueue () { acquire lock find tail of queue add new element if (dequeuer waiting){ remove from wait list wake up dequeuer } release lock } Does this work?

Release the lock before sleep? dequeue () { acquire lock … if (queue empty) { release lock add self to wait list sleep acquire lock } … release lock } enqueue () { acquire lock find tail of queue add new element if (dequeuer waiting){ remove from wait list wake up dequeuer } release lock } 2 1 3 Thread can sleep forever

Release the lock before sleep? dequeue () { acquire lock … if (queue empty) { add self to wait list release lock sleep acquire lock } … release lock } enqueue () { acquire lock find tail of queue add new element if (dequeuer waiting){ remove from wait list wake up dequeuer } release lock }

Release the lock before sleep? dequeue () { acquire lock … if (queue empty) { add self to wait list release lock sleep acquire lock } … release lock } enqueue () { acquire lock find tail of queue add new element if (dequeuer waiting){ remove from wait list wake up dequeuer } release lock } 2 1 3 Problem: missed wake-up Note: this can be fixed, but it’s messy

Two types of synchronization As before we need to raise the level of abstraction 1. Mutual exclusion One thread doing something at a time Use locks 2. Ordering constraints Describe “before-after” relationships One thread waits for another Use monitors: a lock + its condition variable

Locks and condition variables Condition variables Let threads sleep inside a critical section Internal atomic actions (for now, by definition) CV State = queue of waiting threads + one lock // begin atomic release lock put thread on wait queue go to sleep // end atomic

Condition variable operations wait (lock){ release lock put thread on wait queue go to sleep // after wake up acquire lock } signal (){ wakeup one waiter (if any) } broadcast (){ wakeup all waiters (if any) } Atomic Lock always held Lock usually held Lock always held

CVs and invariants Ok to leave invariants violated before wait? No: wait can release the lock Larger rule about returning from wait Lock may have changed hands State can change between wait entry and return Don’t make assumptions about shared state

Multi-threaded queue dequeue () { acquire lock if (queue empty) { wait (lock, CV) } remove item from queue release lock return removed item } enqueue () { acquire lock find tail of queue add new element signal (lock, CV) release lock } What if “queue empty” takes more than one instruction? Any problems with the “if” statement in dequeue?

Multi-threaded queue dequeue () { acquire lock if (queue empty) { // begin atomic wait release lock add wait list, sleep // end atomic wait re-acquire lock } remove item from queue release lock return removed item } enqueue () { acquire lock find tail of queue add new element signal (lock, CV) release lock }

Multi-threaded queue dequeue () { acquire lock if (queue empty) { // begin atomic wait release lock add wait list, sleep // end atomic wait re-acquire lock } remove item from queue release lock return removed item } enqueue () { acquire lock find tail of queue add new element signal (lock, CV) release lock } 2 1 dequeue () { acquire lock … return removed item } 3 4

Multi-threaded queue dequeue () { acquire lock if (queue empty) { // begin atomic wait release lock add wait list, sleep // end atomic wait re-acquire lock } remove item from queue release lock return removed item } enqueue () { acquire lock find tail of queue add new element signal (lock, CV) release lock } How to solve?

Multi-threaded queue dequeue () { acquire lock while (queue empty) { wait (lock, CV) } remove item from queue release lock return removed item } enqueue () { acquire lock find tail of queue add new element signal (lock, CV) release lock } Solve with a while loop (“loop before you leap”) You can now do first programming project The “condition” in condition variable
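The while-loop version translates fairly directly to C++11 primitives. This is a sketch, not the project's thread library: `BlockingQueue`, `not_empty`, and `run_queue_demo` are hypothetical names, and `std::condition_variable` stands in for the slides' CV.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

class BlockingQueue {
    std::queue<int> items;              // shared state
    std::mutex lock;                    // protects items
    std::condition_variable not_empty;  // waited on while the queue is empty
public:
    void enqueue(int v) {
        std::lock_guard<std::mutex> guard(lock);
        items.push(v);
        not_empty.notify_one();         // signal: wake one waiter, if any
    }
    int dequeue() {
        std::unique_lock<std::mutex> guard(lock);
        while (items.empty()) {         // re-check after every wake-up
            not_empty.wait(guard);      // atomically release lock and sleep
        }
        int v = items.front();
        items.pop();
        return v;
    }
};

int run_queue_demo() {
    BlockingQueue q;
    int sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < 3; ++i) sum += q.dequeue();
    });
    std::thread producer([&] {
        for (int v = 1; v <= 3; ++v) q.enqueue(v);
    });
    consumer.join();
    producer.join();
    return sum;
}
```

The `while` matters because another dequeuer may empty the queue between the signal and the moment the woken thread re-acquires the lock.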

Recap and looking ahead Hardware OS Applications Threads, synchronization primitives Atomic Load-Store, Interrupt enable-disable, Atomic Test-Set

Course administration Next lecture Memory and address spaces Paging, page tables, TLB, etc. Next week Start reading papers Look at early operating systems First programming project out

Threads that aren’t running What is a non-running thread? thread=“sequence of executing instructions” non-running thread=“paused execution” Must save thread’s private state To re-run, re-load private state Want thread to start where it left off

Private vs global thread state What state is private to each thread? Code (like lines of a play) PC (where actor is in his/her script) Stack, SP (actor’s mindset) What state is shared? Global variables, heap (props on set)

Thread control block (TCB) What needs to access threads’ private data? The CPU This info is stored in the PC, SP, other registers The OS needs pointers to non-running threads’ data Thread control block (TCB) Container for non-running threads’ private data Values of PC, code, SP, stack, registers

Thread control block CPU Address Space TCB1 PC SP registers TCB2 PC SP registers TCB3 PC SP registers Code Stack Code Stack Code Stack PC SP registers Thread 1 running Ready queue

Thread control block CPU Address Space TCB2 PC SP registers TCB3 PC SP registers Code Stack PC SP registers Thread 1 running Ready queue

Thread states Running Currently using the CPU Ready Ready to run, but waiting for the CPU Blocked Stuck in lock (), wait (), or down ()

Switching threads What needs to happen to switch threads? 1. Thread returns control to OS For example, via the “yield” call 2. OS chooses next thread to run 3. OS saves state of current thread To its thread control block 4. OS loads context of next thread From its thread control block 5. Run the next thread (on Linux: swapcontext)
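Steps 3 to 5 can be sketched with the `ucontext` calls from glibc. This assumes a Linux system where `getcontext`, `makecontext`, and `swapcontext` are available; the names `main_ctx`, `worker_ctx`, `trace`, and `run_switch_demo` are hypothetical.

```cpp
#include <string>
#include <ucontext.h>
#include <vector>

static ucontext_t main_ctx;    // saved state of the original thread
static ucontext_t worker_ctx;  // saved state of the second thread
static std::vector<std::string> trace;

static void worker() {
    trace.push_back("worker: start");
    swapcontext(&worker_ctx, &main_ctx);  // save self, resume main (a yield)
    trace.push_back("worker: resumed");
}   // returning resumes uc_link, which is main_ctx

std::vector<std::string> run_switch_demo() {
    static char stack[64 * 1024];         // the worker's private stack
    trace.clear();
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = stack;
    worker_ctx.uc_stack.ss_size = sizeof stack;
    worker_ctx.uc_link = &main_ctx;       // where to go when worker returns
    makecontext(&worker_ctx, worker, 0);

    trace.push_back("main: switch to worker");
    swapcontext(&main_ctx, &worker_ctx);  // save main, load worker
    trace.push_back("main: back, switch again");
    swapcontext(&main_ctx, &worker_ctx);  // worker continues after its yield
    trace.push_back("main: worker finished");
    return trace;
}
```

Each `swapcontext` call stores the caller's registers, PC, and SP in its first argument and loads them from its second, so every thread resumes exactly where it left off.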

1. Thread returns control to OS How does the thread system get control? Voluntary internal events Thread might block inside lock or wait Thread might call into kernel for service (system call) Thread might call yield Are internal events enough?

1. Thread returns control to OS Involuntary external events (events not initiated by the thread) Hardware interrupts Transfer control directly to OS interrupt handlers From your architecture course –CPU checks for interrupts while executing –Jumps to OS code with interrupt mask set Interrupts lead to pre-emption (a forced yield) Common interrupt: timer interrupt

2. Choosing the next thread If no ready threads, just spin Modern CPUs execute a “halt” instruction Loop switches to thread if one is ready Many ways to prioritize ready threads Huge literature on scheduling algorithms

3. Saving state of current thread What needs to be saved? Registers, PC, SP What makes this tricky? Self-referential sequence of actions Need registers to save state But you’re trying to save all the registers Saving the PC is particularly tricky

Saving the PC
    Instruction address
    100   store PC in TCB
    101   switch to next thread
Why won’t this work? Returning thread will execute instruction at 100 and just re-execute the switch. Really want to save address 102.

4. OS loads the next thread Where is the next thread’s state/context? Thread control block (in memory) How to load the registers? Use load instructions to grab from memory How to load the stack? Stack is already in memory, load SP

5. OS runs the next thread How to resume thread’s execution? Jump to the saved PC On whose stack are these steps running? or Who jumps to the saved PC? The thread that called yield (or was interrupted or called lock/wait) How does this thread run again? Some other thread must switch to it

Why use locks? If we have disable-enable, why do we need locks? Program could bracket critical sections with disable-enable Might not be able to give control back to thread library (e.g., disable interrupts; while (1){}) Can’t have multiple locks (over-constrains concurrency)

Why use locks? How do we know if disabling interrupts is safe? Need hardware support CPU has to know if running code is trusted (i.e., is the OS) Example of why we need the kernel Other things that user programs shouldn’t do? Manipulate page tables Reboot machine Communicate directly with hardware Will cover in upcoming memory review

Kernel implementation Disable interrupts + busy-waiting Lock implementation #1 lock () { disable interrupts while (value != FREE) { enable interrupts disable interrupts } value = BUSY enable interrupts } unlock () { disable interrupts value = FREE enable interrupts } Why is it ok for lock code to disable interrupts? It’s in the trusted kernel (we have to trust something).

Kernel implementation Disable interrupts + busy-waiting Lock implementation #1 lock () { disable interrupts while (value != FREE) { enable interrupts disable interrupts } value = BUSY enable interrupts } unlock () { disable interrupts value = FREE enable interrupts } Do we need to disable interrupts in unlock? Only if “value = FREE” is multiple instructions (safer)

Kernel implementation Disable interrupts + busy-waiting Lock implementation #1 lock () { disable interrupts while (value != FREE) { enable interrupts disable interrupts } value = BUSY enable interrupts } unlock () { disable interrupts value = FREE enable interrupts } Why enable-disable in lock loop body? Otherwise, no one else will run (including unlockers)

Using read-modify-write instructions Disabling interrupts Ok for uni-processor, breaks on multi-processor Why? Could use atomic load-store to make a lock Inefficient, lots of busy-waiting Hardware people to the rescue!

Using read-modify-write instructions Most modern processor architectures Provide an atomic read-modify-write instruction Atomically Read value from memory into register Write new value to memory Implementation details Lock memory location at the memory controller

Test&set on most architectures test&set (X) { tmp = X; X = 1; return (tmp) } Set: sets location to 1 Test: returns old value Slightly different on x86 (Exchange) Atomically swaps value between register and memory
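In portable C++ the same behavior can be approximated with an atomic exchange. This is a sketch, not a statement about any particular instruction set: `test_and_set` and `probe_twice` are hypothetical helpers built on `std::atomic<int>::exchange`.

```cpp
#include <atomic>
#include <utility>

// Atomically writes 1 to x and returns the value that was there before.
int test_and_set(std::atomic<int>& x) {
    return x.exchange(1);
}

// Hypothetical helper: calls test_and_set twice on a fresh location.
std::pair<int, int> probe_twice() {
    std::atomic<int> x{0};
    int first = test_and_set(x);   // old value 0: location was clear
    int second = test_and_set(x);  // old value 1: location already set
    return {first, second};
}
```

The first caller sees 0 and knows it was the one that set the location; every later caller sees 1.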

Use test&set Initially, value = 0 Lock implementation #2 lock () { while (test&set(value) == 1) { } } unlock () { value = 0 } What happens if value = 1? What happens if value = 0?
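Lock implementation #2 looks roughly like this with C++11's `std::atomic_flag`, whose `test_and_set` member plays the role of test&set. `SpinLock`, `spin_worker`, and `run_spin_demo` are hypothetical names for this sketch.

```cpp
#include <atomic>
#include <thread>

class SpinLock {
    std::atomic_flag value = ATOMIC_FLAG_INIT;  // clear means free
public:
    void lock() {
        // Returns the old state: true means someone else holds the lock.
        while (value.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() {
        value.clear(std::memory_order_release);  // value = 0
    }
};

static SpinLock spin;
static int spin_counter = 0;

void spin_worker(int n) {
    for (int i = 0; i < n; ++i) {
        spin.lock();
        ++spin_counter;  // critical section
        spin.unlock();
    }
}

int run_spin_demo() {
    spin_counter = 0;
    std::thread a(spin_worker, 50000);
    std::thread b(spin_worker, 50000);
    a.join();
    b.join();
    return spin_counter;
}
```

If the flag is already set, the loop keeps spinning; if it is clear, the call sets it and falls through with the lock held.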

Locks and busy-waiting All implementations have used busy-waiting Wastes CPU cycles To reduce busy-waiting, integrate Lock implementation Thread dispatcher data structures

Interrupt disable, no busy-waiting Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { add thread to queue of threads waiting for lock switch to next ready thread // don’t add to ready queue } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts }

Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { add thread to queue of threads waiting for lock switch to next ready thread // don’t add to ready queue } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts } Who gets the lock after someone calls unlock? This is called a “hand-off” lock.

Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { add thread to queue of threads waiting for lock switch to next ready thread // don’t add to ready queue } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts } Who might get the lock if it weren’t handed-off directly? (i.e., if value weren’t set BUSY in unlock) This is called a “hand-off” lock.

Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { add thread to queue of threads waiting for lock switch to next ready thread // don’t add to ready queue } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts } What kind of ordering of lock acquisition guarantees does the hand-off lock provide? Fumble lock? This is called a “hand-off” lock.

Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { add thread to queue of threads waiting for lock switch to next ready thread // don’t add to ready queue } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts } What does this mean? Are we saving the PC? This is called a “hand-off” lock.

Lock implementation #3 lock () { disable interrupts if (value == FREE) { value = BUSY // lock acquire } else { lockqueue.push(&current_thread->ucontext); swapcontext(&current_thread->ucontext, &new_thread->ucontext); } enable interrupts } unlock () { disable interrupts value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } enable interrupts } No, just adding a pointer to the TCB/context. This is called a “hand-off” lock.

Thread A, Thread B (B holds lock)
    Thread B: yield () { disable interrupts … switch (B->A)
    Thread A: enable interrupts } // exit thread library
    Thread A: lock () { disable interrupts … switch (A->B)
    Thread B: back from switch (B->A) … enable interrupts } // exit yield
    Thread B: unlock () // moves A to ready queue
    Thread B: yield () { disable interrupts … switch (B->A)
    Thread A: back from switch (A->B) … enable interrupts } // exit lock

Test&set, minimal busy-waiting Lock implementation #4 lock () { while (test&set (guard)) {} // like interrupt disable if (value == FREE) { value = BUSY } else { put on queue of threads waiting for lock switch to another thread // don’t add to ready queue } guard = 0 // like interrupt enable } unlock () { while (test&set (guard)) {} // like interrupt disable value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } guard = 0 // like interrupt enable }

Lock implementation #4 lock () { while (test&set (guard)) {} // like interrupt disable if (value == FREE) { value = BUSY } else { put on queue of threads waiting for lock switch to another thread // don’t add to ready queue } guard = 0 // like interrupt enable } unlock () { while (test&set (guard)) {} // like interrupt disable value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } guard = 0 // like interrupt enable } Why is this better than t&s-only lock implementation? Only busy-wait while another thread is in lock or unlock Before, we busy-waited while lock was held

Lock implementation #4 lock () { while (test&set (guard)) {} // like interrupt disable if (value == FREE) { value = BUSY } else { put on queue of threads waiting for lock switch to another thread // don’t add to ready queue } guard = 0 // like interrupt enable } unlock () { while (test&set (guard)) {} // like interrupt disable value = FREE if anyone on queue of threads waiting for lock { take waiting thread off queue, put on ready queue value = BUSY } guard = 0 // like interrupt enable } What is the switch invariant? Threads promise to call switch with guard set to 1.

Summary of implementing locks Synchronization code needs atomicity Three options Atomic load-store Lots of busy-waiting Interrupt disable-enable No busy-waiting Breaks on a multi-processor machine Atomic test-set Minimal busy-waiting Works on multi-processor machines

Semaphores First defined by Dijkstra in mid 60s Two operations: up and down // aka “P” (“proberen”) down () { do { // begin atomic if (value > 0) { value-- break } // end atomic } while (1) } // aka “V” (“verhogen”) up () { // begin atomic value++ // end atomic } What is going on here? Can value ever be < 0?

More semaphores Key state of a semaphore is its value Initial value determines semaphore’s behavior Value cannot be accessed outside semaphore (i.e., there is no semaphore.getValue() call) Semaphores can be both synchronization types Mutual exclusion (like locks) Ordering constraints (like monitors)

Semaphore mutual exclusion Ensure that at most 1 (or at most N) threads are in critical section How do we make a semaphore act like a lock? Set initial value to 1 (or N) Like lock/unlock, but more general (could allow 2 threads in critical section if initial value = 2) Lock is equivalent to a binary semaphore s.down (); // critical section s.up ();

Semaphore ordering constraints Thread A waits for B before proceeding How to make a semaphore behave like a monitor? Set initial value of semaphore to 0 A is guaranteed to wait for B to finish Doesn’t matter if A or B runs first Like a CV in which condition is “sem.value==0” Can think of as a “prefab” condition variable // Thread A s.down (); // continue // Thread B // do task s.up ();
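A semaphore with these semantics can be sketched on top of a mutex and a condition variable. Note that this swaps the slides' busy-waiting `down` for a sleeping one; `Semaphore` and `run_ordering_demo` are hypothetical names, and the class is an illustration rather than the course's implementation.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

class Semaphore {
    int value;                   // never negative
    std::mutex m;                // makes down/up atomic
    std::condition_variable cv;  // sleepers waiting for value > 0
public:
    explicit Semaphore(int initial) : value(initial) {}
    void down() {                // aka P
        std::unique_lock<std::mutex> guard(m);
        while (value == 0) {
            cv.wait(guard);      // sleep instead of spinning
        }
        --value;
    }
    void up() {                  // aka V
        std::lock_guard<std::mutex> guard(m);
        ++value;
        cv.notify_one();
    }
};

int run_ordering_demo() {
    Semaphore s(0);              // initial value 0: A must wait for B
    int result = 0;
    std::thread a([&] { s.down(); result *= 2; });  // Thread A: continue
    std::thread b([&] { result = 21; s.up(); });    // Thread B: do task
    a.join();
    b.join();
    return result;
}
```

Whichever thread the scheduler starts first, A's doubling happens only after B has stored 21, so the demo yields 42.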

Upcoming Friday lecture Review of memory and address spaces Next week We’ll start reading papers Start programming Project 1 Other questions?
