Thread-Safe Programming: Living With Linux
Tommy Reynolds, Fedora Documentation Project Steering Committee
What Are Threads?
Threads are multiple execution paths within a single program. They share a common address space, credentials, and resource limits. The traditional POSIX threading model does not require threads to be visible to the O/S kernel. Linux doesn’t work this way.
Native POSIX Thread Library
Traditional POSIX threads require an application-side manager thread:
1. The O/S schedules the application based on its overall behavior.
2. The application’s manager thread chooses the appropriate thread to run.
On Linux, threads are known to the kernel. This lets the Linux scheduler make a better-informed choice among them.
Mindset Warping
Plan for multi-threaded behavior from the beginning. “Multi-threaded” is all or nothing; even one extra thread brings ALL the thread synchronization problems to the forefront. Always asking “is it safe for my other application threads to run while I’m executing this statement?” is appropriate paranoia.
Compiling A Threaded Program
The technique is compiler dependent; there is no standard one. Often, special defines and libraries are needed. Preferred GCC method:
    $ gcc -pthread -o foo foo.c
Application Considerations
A threaded application is a different animal from a traditional application. Any thread could start running at any time. Resources shared between threads must be carefully guarded. Synchronization tools:
1. Semaphore (aka MUTEX)
2. Condition Variable (aka CV)
Your First Thread Is Free...
In every program, a default thread is created that runs the main() function. Traditional POSIX implementations also create a manager thread, but Linux doesn't. All other threads are created by the application. Unlike the fork / wait model, there is no parent / child relationship among threads; knowing when a sibling thread terminates takes special effort.
Keeping The Balance
Traditional applications can use fork(2) and wait(2) to manage processes. Threads use pthread_create(P) and pthread_join(P) for a similar purpose. A thread is expected to return an exit status, which pthread_join(P) collects. Threads may be detached if the exit status is of no interest. A terminated thread that is neither joined nor detached keeps its resources, much like a zombie process; avoid zombie threads!
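The create / join / detach cycle described above can be sketched as follows. This is a minimal illustration, not a definitive pattern: the names square_worker, run_and_join, and run_detached are hypothetical, and the exit status is assumed to be a small integer that fits in a pointer.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker: squares the integer it is handed and returns the
 * result as its exit status, carried in the void * return value. */
static void *square_worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Create one joinable thread, wait for it, and collect its exit status.
 * Returns 0 on success, or the pthread error number on failure. */
int run_and_join(long input, long *result)
{
    pthread_t tid;
    void *status;
    int rc;

    rc = pthread_create(&tid, NULL, square_worker, (void *)(intptr_t)input);
    if (rc != 0)
        return rc;

    rc = pthread_join(tid, &status);   /* plays the role of wait(2) */
    if (rc != 0)
        return rc;

    *result = (long)(intptr_t)status;
    return 0;
}

/* Fire and forget: a detached thread releases its resources when it
 * terminates and can never be joined, so no exit status is collected. */
int run_detached(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, square_worker, (void *)(intptr_t)3);

    if (rc != 0)
        return rc;
    return pthread_detach(tid);
}
```

Calling run_and_join( 7, &r ) from main() should leave 49 in r. Every thread created should end up either joined or detached, never neither.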
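Shape Of A Thread
    #include <pthread.h>
    #include <stdlib.h>

    void * thread( void *arg )
    {
        myarg_t * myarg = arg;
        /* Heap storage: the exit status must outlive this thread's stack.
         * Whoever calls pthread_join() is expected to free() it. */
        int *     results = malloc( sizeof( *results ) );
        ...
        pthread_exit( results );
    }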
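Cleaning Up Cleanly
    #include <pthread.h>
    ...
    /* Called if the thread is cancelled or exits while the handler is
     * pushed, or when it is popped with a non-zero argument. */
    void unwind( void * arg )
    {
        /* Whatever */
    }

    void * thread( void * arg )
    {
        pthread_cleanup_push( unwind, arg );
        /* Funky stuff */
        pthread_cleanup_pop( 1 );   /* non-zero: run unwind() now */
        pthread_exit( NULL );
    }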
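To see a cleanup handler fire without the thread ever reaching its pop, here is a small sketch that cancels a sleeping thread. It assumes the default deferred cancellation mode; the names sleeper and cancel_demo are hypothetical.

```c
#include <pthread.h>
#include <unistd.h>

static int cleanup_ran;   /* set by the handler, read only after the join */

/* Cleanup handler: records that it was called. */
static void unwind(void *arg)
{
    int *flag = arg;
    *flag = 1;
}

/* Sleeps forever; it can only leave through cancellation. */
static void *sleeper(void *arg)
{
    pthread_cleanup_push(unwind, arg);
    for (;;)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(0);     /* never reached, but must pair with push */
    return NULL;
}

/* Returns 0 if the thread was cancelled and its handler ran, -1 otherwise. */
int cancel_demo(void)
{
    pthread_t tid;
    void *status;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, sleeper, &cleanup_ran) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &status) != 0)
        return -1;

    return (status == PTHREAD_CANCELED && cleanup_ran == 1) ? 0 : -1;
}
```

The push and pop calls must appear in the same lexical scope, which is why the unreachable pop stays in place.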
Signal Handling
Threads all share signal handlers, but each thread has its own signal mask. Asynchronous signals (SIGHUP, SIGTERM, SIGUSR1) are sent to the process as a whole and delivered to any one thread that is not blocking them. Synchronous signals (SIGFPE, SIGBUS, SIGSEGV) are delivered to the offending thread. Use pthread_kill(P) to send a signal to a specific thread.
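One common way to tame asynchronous signals in a threaded program is to block them and let a single thread collect them synchronously with sigwait(). The sketch below is one possible arrangement, not the only one; signal_waiter and signal_demo are hypothetical names, and SIGUSR1 is chosen purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdint.h>

/* Waits for one signal from the given set and returns its number as the
 * thread's exit status (or -1 if sigwait() fails). */
static void *signal_waiter(void *arg)
{
    sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) != 0)
        return (void *)(intptr_t)-1;
    return (void *)(intptr_t)signo;
}

/* Returns the signal number the waiter thread received, or -1 on error. */
int signal_demo(void)
{
    sigset_t set;
    pthread_t tid;
    void *status;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block SIGUSR1 in this thread; threads created afterwards inherit
     * the mask, so no handler is ever run asynchronously for it. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_waiter, &set) != 0)
        return -1;

    /* Direct the signal at one specific thread. */
    if (pthread_kill(tid, SIGUSR1) != 0)
        return -1;
    if (pthread_join(tid, &status) != 0)
        return -1;

    return (int)(intptr_t)status;
}
```

Because the signal is consumed by sigwait() in ordinary thread context, the waiter may safely use MUTEXes and other calls that are off limits inside a signal handler.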
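What Are Shared Resources?
Global variables accessed (read or write) by multiple threads. Shared data areas (read or write), often passed via pointer. Standard library routines that keep hidden static state, such as strtok(3) and localtime(3); many have re-entrant versions. POSIX requires malloc(3) and free(3) themselves to be thread safe, but the memory they hand out is shared like any other data area.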
Critical Regions Considered Harmful
A critical region puts guards around the lines of code that access shared resources:
    disable_interrupts();
    hw->reg &= ~(1 << 10);
    enable_interrupts();
This is the wrong focus: protect shared resources, not code. Any other code that touches hw->reg without the same guards remains unprotected.
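One way to put the focus on the resource is to keep the lock inside the data structure it guards and to allow access only through functions that take that lock. The following is a sketch under that assumption; struct counter and its functions are hypothetical names, not part of any library.

```c
#include <pthread.h>

#define NTHREADS   4
#define INCREMENTS 10000

/* The MUTEX travels with the data it guards. */
struct counter {
    pthread_mutex_t lock;
    long            value;   /* only touched while lock is held */
};

int counter_init(struct counter *c)
{
    c->value = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

void counter_add(struct counter *c, long delta)
{
    pthread_mutex_lock(&c->lock);
    c->value += delta;
    pthread_mutex_unlock(&c->lock);
}

long counter_get(struct counter *c)
{
    long v;

    pthread_mutex_lock(&c->lock);
    v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

/* Each worker bumps the shared counter INCREMENTS times. */
static void *adder(void *arg)
{
    for (int i = 0; i < INCREMENTS; ++i)
        counter_add(arg, 1);
    return NULL;
}

/* Runs NTHREADS workers against one counter and returns the final total,
 * or -1 if the counter could not be initialized. */
long counter_demo(void)
{
    struct counter c;
    pthread_t tid[NTHREADS];
    int started = 0;
    long total;

    if (counter_init(&c) != 0)
        return -1;
    while (started < NTHREADS &&
           pthread_create(&tid[started], NULL, adder, &c) == 0)
        ++started;
    for (int i = 0; i < started; ++i)
        pthread_join(tid[i], NULL);

    total = counter_get(&c);
    pthread_mutex_destroy(&c.lock);
    return total;
}
```

With every access funneled through counter_add() and counter_get(), there is no line of code elsewhere that can forget the guard.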
Application Toolkit
Needed for the same reason you can’t call printf() from a signal handler: the interrupted code may be halfway through updating shared state. Use semaphores (MUTEX) and Condition Variables. Avoid race conditions, where the result depends on who finishes first. Non-repeatable behavior is very hard to debug.
Semaphores (Spelled “MUTEX”)
POSIX version of Edsger Dijkstra’s P/V semaphores, restricted to yes / no decisions. Defined in <pthread.h>. Always in either the locked or the unlocked state. pthread_mutex_lock() blocks until the MUTEX is unlocked, locks it, and then returns. pthread_mutex_unlock() unlocks the MUTEX and lets one of the blocked threads, if any, acquire it.
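The two states can be observed without blocking by using pthread_mutex_trylock(), which returns EBUSY instead of waiting when the MUTEX is already locked. A minimal sketch, assuming a default (non-recursive) MUTEX; mutex_states_demo is a hypothetical name.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 0 if the MUTEX reported "locked" while held and "unlocked"
 * after release, -1 otherwise. */
int mutex_states_demo(void)
{
    int while_locked;
    int after_unlock;

    pthread_mutex_lock(&demo_lock);
    /* Locked state: trylock refuses immediately instead of blocking. */
    while_locked = pthread_mutex_trylock(&demo_lock);
    pthread_mutex_unlock(&demo_lock);

    /* Unlocked state: trylock succeeds and now holds the MUTEX. */
    after_unlock = pthread_mutex_trylock(&demo_lock);
    if (after_unlock == 0)
        pthread_mutex_unlock(&demo_lock);

    return (while_locked == EBUSY && after_unlock == 0) ? 0 : -1;
}
```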
Example Using MUTEX
    #include <pthread.h>
    #include <stdio.h>

    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    static int k;   /* shared; guarded by m */

    void * producer( void * arg )
    {
        pthread_mutex_lock( &m );
        ++k;
        pthread_mutex_unlock( &m );
        return NULL;
    }

    void * consumer( void * arg )
    {
        pthread_mutex_lock( &m );
        printf( "k = %d\n", k );
        pthread_mutex_unlock( &m );
        return NULL;
    }
Condition Variables
Semaphores are for yes or no decisions. Condition Variables permit complicated decisions to be made atomically by guarding the test (which may be a function call!) with a MUTEX. The waiting thread grabs the MUTEX and checks the condition; if it is not true, the wait releases the MUTEX, blocks, and re-acquires the MUTEX before returning, all without missing anything. The condition must be checked again after every wakeup.
Condition Variable Example
    #include <pthread.h>

    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

    void * consumer( void * arg )
    {
        pthread_mutex_lock( &m );
        /* Loop: re-check the condition after every wakeup */
        while( check_for_messages() == 0 ) {
            pthread_cond_wait( &cv, &m );
        }
        consume_messages();
        pthread_mutex_unlock( &m );
        return NULL;
    }

    void * producer( void * arg )
    {
        pthread_mutex_lock( &m );
        accept_another_message();
        pthread_cond_broadcast( &cv );
        pthread_mutex_unlock( &m );
        return NULL;
    }
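For a version that can be compiled and run as is, the sketch below replaces the message functions with two plain counters guarded by the MUTEX. The counters pending and consumed and the function cv_demo are illustrative assumptions, not part of the original example.

```c
#include <pthread.h>

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int pending;    /* messages waiting to be consumed; guarded by m */
static int consumed;   /* messages consumed so far; guarded by m */

/* Consumes messages until it has seen the requested number of them. */
static void *consumer(void *arg)
{
    int want = *(int *)arg;

    pthread_mutex_lock(&m);
    while (consumed < want) {
        while (pending == 0)              /* re-check after every wakeup */
            pthread_cond_wait(&cv, &m);   /* releases m while blocked */
        consumed += pending;
        pending = 0;
    }
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Posts the requested number of messages, one at a time. */
static void *producer(void *arg)
{
    int count = *(int *)arg;

    for (int i = 0; i < count; ++i) {
        pthread_mutex_lock(&m);
        ++pending;
        pthread_cond_broadcast(&cv);
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

/* Returns how many messages the consumer saw, or -1 on error. */
int cv_demo(int messages)
{
    pthread_t prod, cons;

    pending = 0;
    consumed = 0;
    if (pthread_create(&prod, NULL, producer, &messages) != 0)
        return -1;
    if (pthread_create(&cons, NULL, consumer, &messages) != 0) {
        pthread_join(prod, NULL);
        return -1;
    }
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    return consumed;
}
```

It does not matter whether the producer finishes before the consumer starts: the consumer tests pending under the MUTEX before it ever waits, so no message is missed.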
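Using Standard Library With Threads
The thread-safe standard library version of foo() is usually named foo_r(). Where no such version exists, protect the call using a MUTEX that every caller of foo() shares:
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    …
    pthread_mutex_lock( &m );
    ptr = foo( 10 );
    pthread_mutex_unlock( &m );
    …
malloc(3) itself needs no such wrapper; POSIX requires it to be thread safe.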
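As an illustration of the foo_r() convention, strtok_r(3) is the re-entrant form of strtok(3): the parsing position lives in a variable supplied by the caller rather than in hidden static storage. The helper count_words below is a hypothetical example, and the 256-byte buffer is an arbitrary limit chosen for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Counts whitespace-separated words in text. All tokenizer state is kept
 * in the local variable save, so concurrent callers cannot interfere. */
int count_words(const char *text)
{
    char buf[256];       /* strtok_r() modifies its input, so work on a copy */
    char *save = NULL;   /* per-call parser state */
    char *tok;
    int n = 0;

    strncpy(buf, text, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (tok = strtok_r(buf, " \t\n", &save);
         tok != NULL;
         tok = strtok_r(NULL, " \t\n", &save))
        ++n;

    return n;
}
```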
Accessing One File From Two Threads
Use O_CREAT | O_EXCL with the open(2) system call to create a guard file atomically, although this is advisory only: it works only if every program checks for the guard. Threads within the same process can wrap the access with a MUTEX. Separate applications often link(2) a known filename, do the access, then unlink(2) the guard file.
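A guard file built on exclusive creation might look like the sketch below. It uses the O_CREAT | O_EXCL form of open(2) rather than link(2), and the function names and the guard path passed in are hypothetical; the scheme only works if every cooperating program follows it.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to take the guard by creating the file exclusively.
 * Returns 0 on success, or the errno value on failure (EEXIST means
 * somebody else currently holds the guard). */
int guard_acquire(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return errno;
    close(fd);   /* the file's existence is the lock; no data is needed */
    return 0;
}

/* Releases the guard by removing the file. Returns 0 on success. */
int guard_release(const char *path)
{
    return unlink(path);
}
```

A caller would acquire the guard, access the shared file, and then release the guard. If a program dies while holding the guard, the stale file has to be removed by some other means.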
In Summary, En Passant
The POSIX-standard thread API permits portable programs with predictable behavior. It requires “three-dimensional” thinking by the application programmer. Seemingly innocuous practices can be fraught with peril; programmers must be aware of ALL threats. There is no cook-book solution.
Protect Your Thread Territory -or- This Is The End