Presentation is loading. Please wait.

Presentation is loading. Please wait.

Concurrent Programming with Threads

Similar presentations


Presentation on theme: "Concurrent Programming with Threads"— Presentation transcript:

1 Concurrent Programming with Threads
Rajkumar Buyya School of Computer Science and Software Engineering Monash Technology Melbourne, Australia URL:

2 Objectives Explain the parallel computing right from architecture, OS, programming paradigm, and applications Explain the multithreading paradigm, and all aspects of how to use it in an application Cover all basic MT concepts Explore issues related to MT Contrast Solaris, POSIX, Java threads Look at the APIs in detail Examine some Solaris, POSIX, and Java code examples Debate on: MPP and Cluster Computing

3 Agenda Overview of Computing Operating Systems Issues Threads Basics
Multithreading with Solaris and POSIX threads Multithreading in Java Distributed Computing Grand Challenges Solaris, POSIX, and Java example code

4 Multi-Processor Computing System
Computing Elements Applications Programming paradigms P Microkernel Multi-Processor Computing System Threads Interface Operating System Hardware Process Processor Thread P

5 Two Eras of Computing Architectures Compilers Applications P.S.Es
Sequential Era Parallel Era Commercialization R & D Commodity

6 History of Parallel Processing
PP can be traced to a tablet dated around 100 BC. Tablet has 3 calculating positions. Infer that multiple positions: Reliability/ Speed

7 We modeled PP after those of biological species.
Motivating Factors d Just as we learned to fly, not by constructing a machine that flaps its wings like birds, but by applying aerodynamics principles demonstrated by nature... We modeled PP after those of biological species.

8 Motivating Factors Aggregated speed with which complex calculations carried out by individual neurons response is slow (ms) - demonstrate feasibility of PP

9 Why Parallel Processing?
Computation requirements are ever increasing -- visualization, distributed databases, simulations, scientific prediction (earthquake), etc.. Sequential architectures reaching physical limitation (speed of light, thermodynamics)

10 Technical Computing Solving technology problems using
computer modeling, simulation and analysis Geographic Information Systems Life Sciences Aerospace Mechanical Design & Analysis (CAD/CAM)

11 Computational Power Improvement
Multiprocessor C.P.I. Uniprocessor No. of Processors

12 Computational Power Improvement
Vertical Horizontal Growth Age

13 Why Parallel Processing?
The Tech. of PP is mature and can be exploited commercially; significant R & D work on development of tools & environment. Significant development in Networking technology is paving a way for heterogeneous computing.

14 Why Parallel Processing?
Hardware improvements like Pipelining, Superscalar, etc., are non-scalable and requires sophisticated Compiler Technology. Vector Processing works well for certain kind of problems.

15 Parallel Program has & needs ...
Multiple “processes” active simultaneously solving a given problem, general multiple processors. Communication and synchronization of its processes (forms the core of parallel programming efforts).

16 Processing Elements Architecture

17 Processing Elements Simple classification by Flynn:
(No. of instruction and data streams) SISD - conventional SIMD - data parallel, vector computing MISD - systolic arrays MIMD - very general, multiple approaches. Current focus is on MIMD model, using general purpose processors. (No shared memory)

18 SISD : A Conventional Computer
Processor Data Input Data Output Instructions Speed is limited by the rate at which computer can transfer information internally. Ex:PC, Macintosh, Workstations

19 The MISD Architecture Data Input Stream Output Processor A B C Instruction Stream A Stream B Instruction Stream C More of an intellectual exercise than a practical configuration. Few built, but commercially not available

20 Ex: CRAY machine vector processing, Thinking machine cm*
SIMD Architecture Instruction Stream Processor A B C Data Input stream A stream B stream C Data Output Ci<= Ai * Bi Ex: CRAY machine vector processing, Thinking machine cm*

21 MIMD Architecture Instruction Stream A Instruction Stream B Instruction Stream C Data Output stream A Data Input stream A Processor A Data Output stream B Data Input stream B Processor B Data Output stream C Processor C Data Input stream C Unlike SISD, MISD, MIMD computer works asynchronously. Shared memory (tightly coupled) MIMD Distributed memory (loosely coupled) MIMD

22 Shared Memory MIMD machine
Processor A Processor B Processor C MEMORY BUS MEMORY BUS MEMORY BUS Global Memory System Comm: Source PE writes data to GM & destination retrieves it Easy to build, conventional OSes of SISD can be easily be ported Limitation : reliability & expandability. A memory component or any processor failure affects the whole system. Increase of processors leads to memory contention. Ex. : Silicon graphics supercomputers....

23 Distributed Memory MIMD
IPC channel IPC channel Processor A Processor B Processor C MEMORY BUS MEMORY BUS MEMORY BUS Memory System A System B System C Communication : IPC on High Speed Network. Network can be configured to ... Tree, Mesh, Cube, etc. Unlike Shared MIMD easily/ readily expandable Highly reliable (any CPU failure does not affect the whole system)

24 Laws of caution..... Speed of computers is proportional to the square of their cost. i.e.. cost = Speed Speedup by a parallel computer increases as the logarithm of the number of processors. C S (speed = cost2) S P log2P Speedup = log2(no. of processors)

25 Caution.... Very fast development in PP and related area have blurred concept boundaries, causing lot of terminological confusion : concurrent computing/ programming, parallel computing/ processing, multiprocessing, distributed computing, etc..

26 It’s hard to imagine a field that changes as rapidly as computing.

27 Computer Science is an Immature Science.
Caution.... Computer Science is an Immature Science. (lack of standard taxonomy, terminologies)

28 Caution.... There is no strict delimiters for contributors to the area of parallel processing : CA, OS, HLLs, databases, computer networks, all have a role to play. This makes it a Hot Topic of Research

29 Parallel Programming Paradigms
Multithreading Task level parallelism

30 Serial Vs. Parallel COUNTER 2 COUNTER COUNTER 1 Q Please

31 High Performance Computing
function1( ) { //......function stuff } t1 function2( ) { //......function stuff } t2 Serial Machine function1 ( ): function2 ( ): Single CPU Time : add (t1, t2) Parallel Machine : MPP function1( ) || function2 ( ) massively parallel system containing thousands of CPUs Time : max (t1, t2)

32 Single and Multithreaded Processes
Single-threaded Process Multiplethreaded Process Threads of Execution Multiple instruction stream Single instruction stream Common Address Space

33 Threaded Libraries, Multi-threaded I/O
OS: Multi-Processing, Multi-Threaded Threaded Libraries, Multi-threaded I/O Application Application Application Application CPU CPU CPU CPU CPU CPU Better Response Times in Multiple Application Environments Higher Throughput for Parallelizeable Applications

34 Multi-threading, continued
Multi-threading, continued... Multi-threaded OS enables parallel, scalable I/O Application Application Application Multiple, independent I/O requests can be satisfied simultaneously because all the major disk, tape, and network drivers have been multi-threaded, allowing any given driver to run on multiple CPUs simultaneously. OS Kernel CPU CPU CPU

35 Shared memory segments, pipes, open files or mmap’d files
Basic Process Model STACK STACK Shared memory segments, pipes, open files or mmap’d files DATA DATA TEXT TEXT Shared Memory maintained by kernel processes processes

36 What are Threads? Local state Global/ shared state PC
Thread is a piece of code that can execute in concurrence with other threads. It is a schedule entity on a processor Registers Hardware Context Status Word Program Counter Running Local state Global/ shared state PC Hard/Software Context Thread Object

37 Threaded Process Model
THREAD STACK SHARED MEMORY THREAD DATA Threads within a process THREAD TEXT Independent executables All threads are parts of a process hence communication easier and simpler.

38 Levels of Parallelism Task i-l Task i Task i+1 + x Task Control Data
Code-Granularity Code Item Large grain (task level) Program Medium grain (control level) Function (thread) Fine grain (data level) Loop Very fine grain (multiple issue) With hardware Task i-l Task i Task i+1 func1 ( ) { .... } func2 ( ) { .... } func3 ( ) { .... } Task Control Data Multiple Issue a ( 0 ) =.. b ( 0 ) =.. a ( 1 )=.. b ( 1 )=.. a ( 2 )=.. b ( 2 )=.. + x Load

39 Simple Thread Example void *func ( ) { /* define local data */
/* function code */ thr_exit(exit_value); } main ( ) thread_t tid; int exit_value; thread_create (0, 0, func (), NULL, &tid); thread_join (tid, 0, &exit_value);

40 Few Popular Thread Models
POSIX, ISO/IEEE standard Mach C threads, CMU Sun OS LWP threads, Sun Microsystems PARAS CORE threads, C-DAC Java-Threads, Sun Microsystems Chorus threads, Paris OS/2 threads, IBM Windows NT/95 threads, Microsoft

41 Multithreading - Uniprocessors
Concurrency Vs Parallelism Concurrency P1 CPU P2 P3 time Number of Simulatneous execution units > no of CPUs

42 Multithreading - Multiprocessors
Concurrency Vs Parallelism CPU P1 CPU P2 CPU P3 time No of execution process = no of CPUs

43 Computational Model Parallel Execution due to : User Level Threads
Virtual Processors Physical Processors User-Level Schedule (User) Kernel-Level Schedule (Kernel) Parallel Execution due to : Concurrency of threads on Virtual Processors Concurrency of threads on Physical Processor True Parallelism : threads : processor map = 1:1

44 General Architecture of Thread Model
Hides the details of machine architecture Maps User Threads to kernel threads Process VM is shared, state change in VM by one thread visible to other.

45 Process Parallelism Data a b r1 c d r2 MISD and MIMD Processing
int add (int a, int b, int & result) // function stuff int sub(int a, int b, int & result) Data Processor a b r1 c d r2 IS1 add pthread t1, t2; pthread-create(&t1, add, a,b, & r1); pthread-create(&t2, sub, c,d, & r2); pthread-par (2, t1, t2); Processor IS2 sub MISD and MIMD Processing

46 Data Parallelism SIMD Processing do “ dn/2 dn2/+1 dn Data
sort( int *array, int count) //...... Processor do dn/2 dn2/+1 dn Sort pthread-t, thread1, thread2; pthread-create(& thread1, sort, array, N/2); pthread-create(& thread2, sort, array, N/2); pthread-par(2, thread1, thread2); IS Processor Sort SIMD Processing

47 Process and Threaded models
Purpose Process Model Threads Model Creation of a new thread fork ( ) thr_create( ) Start execution of a new thread exec( ) [ thr_create() builds the new thread and starts the execution Wait for completion of thread wait( ) thr_join() Exit and destroy the thread exit( ) thr_exit()

48 Code Comparison Segment (Process) main ( ) { fork ( ); }
Segment(Thread) main() { thread_create(0,0,func(),0,0); }

49 Printing Thread Editing Thread

50 Independent Threads printing() { - - - - - - - - - - - - } editing()
main() id1 = thread_create(printing); id2 = thread_create(editing); thread_run(id1, id2);

51 Cooperative threads - File Copy
reader() { lock(buff[i]); read(src,buff[i]); unlock(buff[i]); } writer() { lock(buff[i]); write(src,buff[i]); unlock(buff[i]); } buff[0] buff[1] Cooperative Parallel Synchronized Threads

52 RPC Call Client Network Server RPC(func) func() { /* Body */ }

53 Multithreaded Server User Mode Kernel Mode Server Process
Client Process Server Threads Client Process User Mode Kernel Mode Message Passing Facility

54 Multithreaded Compiler
Source Code Object Code Preprocessor Thread Compiler Thread

55 Thread Programming models
1. The boss/worker model 2. The peer model 3. A thread pipeline

56 The boss/worker model Resources Program Workers Files Databases Boss
taskX Files Databases Boss taskY main ( ) Input (Stream) Disks taskZ Special Devices

57 Example main() /* the boss */ { forever { get a request;
switch( request ) case X: pthread_create(....,taskX); .... } taskX() /* worker */ perform the task, sync if accessing shared resources taskY() /* worker */ --Above runtime overhead of creating thread can be solved by thread pool * the boss thread creates all worker thread at program initialization and each worker thread suspends itself immediately for a wakeup call from boss

58 The peer model Input Resources (static) Program Workers Files
Databases Disks Special Devices Workers Input (static) taskX taskY taskZ

59 Example main() { pthread_create(....,thread1...task1);
signal all workers to start wait for all workers to finish do any cleanup } task1() /* worker */ wait for start perform the task, sync if accessing shared resources task2() /* worker */

60 A thread pipeline Program Filter Threads Resources Files Databases
Stage 1 Stage 2 Stage 3 Input (Stream) Files Databases Disks Special Devices Files Databases Disks Special Devices Files Databases Disks Special Devices Resources

61 Example main() { pthread_create(....,stage1);
wait for all pipeline threads to finish do any cleanup } stage1() { get next input for the program do stage 1 processing of the input pass result to next thread in pipeline stage2(){ get input from previous thread in pipeline do stage 2 processing of the input stageN() do stage N processing of the input pass result to program output.

62 Multithreaded Matrix Multiply...
= A B C C[1,1] = A[1,1]*B[1,1]+A[1,2]*B[2,1].. …. C[m,n]=sum of product of corresponding elements in row of A and column of B. Each resultant element can be computed independently.

63 Multithreaded Matrix Multiply
typedef struct { int id; int size; int row, column; matrix *MA, *MB, *MC; } matrix_work_order_t; main() { int size = ARRAY_SIZE, row, column; matrix_t MA, MB,MC; matrix_work_order *work_orderp; pthread_t peer[size*zize]; ... /* process matrix, by row, column */ for( row = 0; row < size; row++ ) for( column = 0; column < size; column++) id = column + row * ARRAY_SIZE; work_orderp = malloc( sizeof(matrix_work_order_t)); /* initialize all members if wirk_orderp */ pthread_create(peer[id], NULL, peer_mult, work_orderp); } } /* wait for all peers to exist*/ for( i =0; i < size*size;i++) pthread_join( peer[i], NULL ); }

64 Multithreaded Server... void main( int argc, char *argv[] ) {
int server_socket, client_socket, clilen; struct sockaddr_in serv_addr, cli_addr; int one, port_id; #ifdef _POSIX_THREADS pthread_t service_thr; #endif port_id = 4000; /* default port_id */ if( (server_socket = socket( AF_INET, SOCK_STREAM, 0 )) < 0 ) printf("Error: Unable to open socket in parmon server.\n"); exit( 1 ); } memset( (char*) &serv_addr, 0, sizeof(serv_addr)); serv_addr.sin_family = AF_INET; serv_addr.sin_addr.s_addr = htonl(INADDR_ANY); serv_addr.sin_port = htons( port_id ); setsockopt(server_socket, SOL_SOCKET, SO_REUSEADDR, (char *)&one, sizeof(one));

65 Multithreaded Server... if( bind( server_socket, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0 ) { printf( "Error: Unable to bind socket in parmon server->%d\n",errno ); exit( 1 ); } listen( server_socket, 5); while( 1 ) clilen = sizeof(cli_addr); client_socket = accept( server_socket, (struct sockaddr *)&serv_addr, &clilen ); if( client_socket < 0 ) { printf( "connection to client failed in server.\n" ); continue; #ifdef POSIX_THREADS pthread_create( &service_thr, NULL, service_dispatch, client_socket); #else thr_create( NULL, 0, service_dispatch, client_socket, THR_DETACHED, &service_thr); #endif

66 Multithreaded Server // Service function -- Thread Funtion
void *service_dispatch(int client_socket) { …Get USER Request if( readline( client_socket, command, 100 ) > 0 ) …IDENTI|FY USER REQUEST ….Do NECESSARY Processing …..Send Results to Server } …CLOSE Connect and Terminate THREAD close( client_socket ); #ifdef POSIX_THREADS pthread_exit( (void *)0); #endif

67 The Value of MT Program structure Parallelism Throughput
Responsiveness System resource usage Distributed objects Single source across platforms (POSIX) Single binary for any number of CPUs

68 To thread or not to thread
Improve efficiency on uniprocessor systems Use multiprocessor Hardware Improve Throughput Simple to implement Asynchronous I/O Leverage special features of the OS

69 To thread or not to thread
If all operations are CPU intensive do not go far on multithreading Thread creation is very cheap, it is not free thread that has only five lines of code would not be useful

70 DOS - The Minimal OS Stack & Stack Pointer Program Counter User Code
Global Data DOS User Space Kernel DOS Data DOS Hardware

71 (UNIX, VMS, MVS, NT, OS/2 etc.)
Multitasking OSs Process User Space Kernel Process Structure UNIX Hardware (UNIX, VMS, MVS, NT, OS/2 etc.)

72 (Each process is completely independent)
Multitasking Systems Processes P1 P2 P3 P4 The Kernel Hardware (Each process is completely independent)

73 Multithreaded Process
T1’s SP T3’sPC T1’sPC T2’sPC T1’s SP User Code Global Data T2’s SP Process Structure The Kernel (Kernel state and address space are shared)

74 Kernel Structures Traditional UNIX Process Structure
Solaris 2 Process Structure Process ID UID GID EUID EGID CWD. Priority Signal Mask Registers Kernel Stack CPU State File Descriptors Signal Dispatch Table Memory Map Process ID UID GID EUID EGID CWD. Signal Dispatch Table Memory Map File Descriptors LWP 2 LWP 1

75 Scheduling Design Options
M:1 HP-UNIX 1:1 DEC, NT, OS/1, AIX. IRIX M:M 2-level

76 SunOS Two-Level Thread Model
Traditional process Proc 1 Proc 2 Proc 3 Proc 4 Proc 5 User LWPs Kernel threads Kernel Hardware Processors

77 Thread Life Cycle T1 T2 main() main() { ... {
pthread_create(...func...) pthread_exit() T2 main() main() { { pthread_create( func, arg); thr_create( ..func..,arg..); } } void * func() { .... } POSIX Solaris

78 Waiting for a Thread to Exit
pthread_join() pthread_exit() T2 main() main() { { pthread_join(T2); thr_join( T2,&val_ptr); } } void * func() { .... } POSIX Solaris

79 Scheduling States: Simplified View of Thread State Transitions
RUNNABLE Wakeup Stop Continue STOPPED Preempt Stop SLEEPING ACTIVE Stop Sleep

80 Preemption ! = Time slicing All of the libraries are preemptive
The process of rudely interrupting a thread and forcing it to relinquish its LWP (or CPU) to another. CPU2 cannot change CPU3’s registers directly. It can only issue a hardware interrupt to CPU3. It is up to CPU3’s interrupt handler to look at CPU2’s request and decide what to do. Higher priority threads always preempt lower priority threads. Preemption ! = Time slicing All of the libraries are preemptive

81 EXIT Vs. THREAD_EXIT The normal C function exit() always causes the process to exit. That means all of the process -- All the threads. The thread exit functions: UI : thr_exit() POSIX : pthread_exit() OS/2 : DosExitThread() and _endthread() NT : ExitThread() and endthread() all cause only the calling thread to exit, leaving the process intact and all of the other threads running. (If no other threads are running, then exit() will be called.)

82 Cancellation Cancellation is the means by which a thread can tell another thread that it should exit. main() main() main() {... {... {... pthread_cancel (T1); DosKillThread(T1); TerminateThread(T1) } } } There is no special relation between the killer of a thread and the victim. (UI threads must “roll their own” using signals) (pthread exit) T1 (pthread cancel() T2 POSIX OS/2 Windows NT

83 Cancellation State and Type
PTHREAD_CANCEL_DISABLE (Cannot be cancelled) PTHREAD_CANCEL_ENABLE (Can be cancelled, must consider type) Type PTHREAD_CANCEL_ASYNCHRONOUS (any time what-so-ever) (not generally used) PTHREAD_CANCEL_DEFERRED (Only at cancellation points) (Only POSIX has state and type) (OS/2 is effectively always “enabled asynchronous”) (NT is effectively always “enabled asynchronous”)

84 Cancellation is Always Complex!
It is very easy to forget a lock that’s being held or a resource that should be freed. Use this only when you absolutely require it. Be extremely meticulous in analyzing the possible thread states. Document, document, document!

85 Returning Status POSIX and UI OS/2 NT
A detached thread cannot be “joined”. It cannot return status. An undetached thread must be “joined”, and can return a status. OS/2 Any thread can be waited for No thread can return status No thread needs to be waited for. NT No threads can be waited for Any thread can return status

86 Suspending a Thread T1 T2 Solaris: main() { ... thr_suspend(T1);
continue() T2 Solaris: main() { ... thr_suspend(T1); thr_continue(T1); } * POSIX does not support thread suspension

87 Proposed Uses of Suspend/Continue
Garbage Collectors Debuggers Performance Analysers Other Tools? These all must go below the API, so they don’t count. Isolation of VM system “spooling” (?!) NT Services specify that a service should b suspendable (Questionable requirement?) Be Careful

88 Do NOT Think about Scheduling!
Think about Resource Availability Think about Synchronization Think about Priorities Ideally, if you’re using suspend/ continue, you’re making a mistake!

89 Synchronization Websters: “To represent or arrange events to indicate coincidence or coexistence.” Lewis : “To arrange events so that they occur in a specified order.” * Serialized access to controlled resources. Synchronization is not just an MP issue. It is not even strictly an MT issue!

90 Threads Synchronization :
On shared memory : shared variables - semaphores On distributed memory : within a task : semaphores Across the tasks : By passing messages

91 Unsynchronized Shared Data is a Formula for Disaster
Thread1 Thread2 temp = Your - > BankBalance; dividend = temp * InterestRate; newbalance = dividend + temp; Your->Dividend += dividend; Your->BankBalance+= deposit; Your->BankBalance = newbalance;

92 Atomic Actions An action which must be started and completed with no possibility of interruption. A machine instruction could need to be atomic. (not all are!) A line of C code could need to be atomic. (not all are) An entire database transaction could need to be atomic. All MP machines provide at least one complex atomic instruction, from which you can build anything. A section of code which you have forced to be atomic is a Critical Section.

93 Critical Section (Good Programmer!)
reader() { lock(DISK); unlock(DISK); } writer() { lock(DISK); unlock(DISK); } Shared Data

94 Critical Section (Bad Programmer!)
reader() { lock(DISK); unlock(DISK); } writer() { } Shared Data

95 Lock Shared Data! Globals Shared data structures Static variables
(really just lexically scoped global variables)

96 Mutexes Thread 1 Thread2 item = create_and_fill_item(); mutex_lock( &m ); item->next = list; list = item; mutex_unlock(&m); mutex_lock( &m ); this_item = list; list = list_next; mutex_unlock(&m); .....func(this-item); POSIX and UI : Owner not recorded, block in priority order. OS/2 and NT. Owner recorded, block in FIFO order.

97 Synchronization Variables in Shared Memory (Cross Process)
Thread

98 Synchronization Problems

99 Deadlocks Thread 2 Thread 1
lock( M1 ); lock( M2 ); lock( M2 ); lock( M1 ); Thread1 is waiting for the resource(M2) locked by Thread2 and Thread2 is waiting for the resource (M1) locked by Thread1

100 Avoiding Deadlocks Establish a hierarchy : Always lock Mutex_1 before Mutex_2, etc..,. Use the trylock primitives if you must violate the hierarchy. { while (1) { pthread_mutex_lock (&m2); if( EBUSY |= pthread mutex_trylock (&m1)) break; else { pthread _mutex_unlock (&m1); wait_around_or_do_something_else(); } do_real work(); /* Got `em both! */ Use lockllint or some similar static analysis program to scan your code for hierarchy violations.

101 Race Conditions A race condition is where the results of a program are different depending upon the timing of the events within the program. Some race conditions result in different answers and are clearly bugs. Thread 1 Thread 2 mutex_lock (&m) mutex_lock (&m) v = v - 1; v = v * 2; mutex_unlock (&m) mutex_unlock (&m) --> if v = 1, the result can be 0 or 1based on which thread gets chance to enter CR first

102 Operating System Issues

103 Library Goals Make it fast! Make it MT safe! Retain UNIX semantics!

104 Are Libraries Safe ? getc() OLD implementation:
extern int get( FILE * p ) { /* code to read data */ } getc() NEW implementation: pthread_mutex_lock(&m); pthread_mutex_unlock(&m);

105 ERRNO In UNIX, the distinguished variable errno is used to hold the error code for any system calls that fail. Clearly, should two threads both be issuing system calls around the same time, it would not be possible to figure out which one set the value for errno. Therefore errno is defined in the header file to be a call to thread-specific data. This is done only when the flag_REENTRANT (UI) _POSIX_C_SOURCE=199506L (POSIX) is passed to the compiler, allowing older, non-MT programs to continue to run. There is the potential for problems if you use some libraries which are not reentrant. (This is often a problem when using third party libraries.)

106 Are Libraries Safe? MT-Safe This function is safe
MT-Hot This function is safe and fast MT-Unsafe This function is not MT-safe, but was compiled with _REENTRANT Alternative Call This function is not safe, but there is a similar function (e.g. getctime_r()) MT-Illegal This function wasn’t even compiled with _REENTRANT and therefore can only be called from the main thread.

107 Threads Debugging Interface
Debuggers Data inspectors Performance monitors Garbage collectors Coverage analyzers Not a standard interface!

108 The APIs

109 Different Thread Specifications
Functionality UI Threads POSIX Thteads NT Threads OS/2 Threads Design Philosophy Base Near-Base Complex Complex Primitives Primitives Primitives Primitives Scheduling Classes Local/ Global Local/Global Global Global Mutexes Simple Simple Complex Complex Counting Semaphores Simple Simple Buildable Buildable R/W Locks Simple Buildable Buildable Buildable Condition Variables Simple Simple Buildable Buildable Multiple-Object Buildable Buildable Complex Complex Synchronization Thread Suspension Yes Impossible Yes Yes Cancellation Buildable Yes Yes Yes Thread-Specific Data Yes Yes Yes Yes Signal-Handling Primitives Yes Yes n/a n/a Compiler Changes Required No No Yes No Vendor Libraries MT-safe? Moat Most All? All? ISV Libraries MT-safe? Some Some Some Some

110 POSIX and Solaris API Differences
POSIX API Solaris API continue suspend semaphore vars concurrency setting reader/ writer vars daemon threads thread cancellation scheduling policies sync attributes thread attributes join exit key creation priorities sigmask create thread specific data mutex vars kill condition vars

111 Error Return Values Many threads functions return an error value which can be looked up in errno.h. Very few threads functions set errno(check man pages). The “lack of resources” errors usually mean that you’ve used up all your virtual memory, and your program is likely to crash very soon.

112 Attribute Objects UI, OS/2, and NT all use flags and direct arguments to indicate what the special details of the objects being created should be. POSIX requires the use of “Attribute objects”: thr_create(NULL, NULL, foo, NULL, THR_DETACHED); Vs: pthread_attr_t attr; pthread_attr_init(&attr); pthread_attr_setdetachstate(&attr,PTHREAD_CREATE_DETACHED); pthread_create(NULL, &attr, foo, NULL);

113 Attribute Objects Although a bit of pain in the *** compared to passing all the arguments directly, attribute objects allow the designers of the threads library more latitude to add functionality without changing the old interfaces. (If they decide they really want to, say, pass the signal mask at creation time, they just add a function pthread_attr_set_signal_mask() instead of adding a new argument to pthread_create().) There are attribute objects for: Threads stack size, stack base, scheduling policy, scheduling class, scheduling scope, scheduling inheritance, detach state. Mutexes Cross process, priority inheritance Condition Variables Cross process

114 Attribute Objects Attribute objects must be: Allocated Initialized
Values set (presumably) Used Destroyed (if they are to be free’d) pthread_attr_t attr; pthread_attr_init (&attr); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED)’ pthread_create(NULL, &attr, foo, NULL); pthread_attr_destroy (&attr);

115 Thread Attribute Objects
pthread_attr_t; Thread attribute object type: pthread_attr_init (pthread_mutexattr_t *attr) pthread_attr_destroy (pthread_attr_t *attr) pthread_attr_getdetachstate (pthread_attr_t *attr, in *state) pthread_attr_setdetachstate (pthread_attr_t *attr, int state) Can the thread be joined?: pthread_attr_getscope(pthread_attr_t *attr, in *scope) pthread_attr_setscope(pthread_attr_t *attr, int scope)

116 Thread Attribute Objects
pthread_attr_getinheritpolicy(pthread_attr_t *attr, int *policy) pthread_attr_setinheritpolicy(pthread_attr_t *attr, int policy) Will the policy in the attribute object be used? pthread_attr_getschedpolicy(pthread_attr_t *attr, int *policy) pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy) Will the scheduling be RR, FIFO, or OTHER? pthread_attr_getschedparam(pthread_attr_t *attr, struct sched param *param) pthread_attr_setschedparam(pthread attr_t *attr, struct sched param *param); What will the priority be?

117 Thread Attribute Objects
pthread_attr_getinheritsched(pthread_attr_t *attr, int *inheritsched) pthread_attr_setinheritsched(pthread_attr_t *attr, int inheritsched) Will the policy in the attribute object be used? pthread_attr_getstacksize(pthread_attr_t *attr, int *size) pthread_attr_setstacksize(pthread_attr_t *attr, int size) How big will the stack be? pthread_attr_getstackaddr (pthread_attr_t *attr, size_t *base) pthread_attr_setstackaddr(pthread_attr_t *attr, size_t base) What will the stack’s base address be?

118 Mutex Attribute Objects
pthread_mutexattr_t; mutex attribute object type pthread_mutexattr_init(pthread_mutexattr_t *attr) pthread_mutexattr_destroy(pthread_mutexattr_t *attr) pthread_mutexattr_getshared(pthread_mutexattr_t*attr, int shared) pthread_mutexattr_setpshared (pthread_mutex attr_t *attr, int shared) Will the mutex be shared across processes?

119 Mutex Attribute Objects
pthread_mutexattr_getprioceiling(pthread_mutexattr_t *attr, int *ceiling) pthread_mutexattr_setprioceiling(pthread_mutexattr_t What is the highest priority the thread owning this mutex can acquire? pthread_mutexattr_getprotocol (pthread_mutexattr_t *attr, int *protocol) pthread_mutexattr_setprotocol (pthread_mutexattr_t *attr, int protocol) Shall the thread owning this mutex inherit priorities from waiting threads?

120 Condition Variable Attribute Objects
pthread_condattr_t; CV attribute object type pthread_condattr_init(pthread_condattr_t * attr) pthread_condattr_destroy(pthread_condattr_t *attr) pthread_condattr_getpshared (pthread_condattr_t *attr, int *shared) pthread_condattr_setpshared (pthread_condattr_t *attr, int shared) Will the mutex be shared across processes?

121 Creation and Destruction (UI & POSIX)
int thr_create(void *stack_base, size_t stacksize, void *(*start_routine) (void *), void * arg, long flags, thread_t thread); void thr_exit (void *value_ptr); int thr_join (thread_t thread, void **value_ptr); int pthread_create (pthread_t *thread, const pthread_attr_t *attr, void * (*start_routine) (void *), void *arg); void pthread_exit (void *value_ptr); int pthread_join (pthread_t thread, void **value_ptr); int pthread_cancel (pthread_t thread);

122 Suspension (UI & POSIX)
int thr_suspend (thread_t target) int thr_continue (thread_t target)

123 Changing Priority (UI & POSIX)
int thr_setpriority(thread_t thread, int priority) int thr_getpriority(thread_t thread, int *priority) int pthread_getschedparam(pthread_t thread, int *policy, struct sched param *param) int pthread_setschedparam(pthread_t thread, int policy, struct sched param *param)

124 Readers / Writer Locks (UI)
int rwlock_init (rwlock_t *rwlock, int type, void *arg); int rw_rdlock (rwlock_t *rwlock); int rw_wrlock (rwlock_t *rwlock); int rw_tryrdlock (rwlock_t *rwlock); int rw_trywrlock (rwlock_t *rwlock); int rw_unlock (rwlock_t *rwlock); int rw_destroy (rwlock_t *rwlock);

125 (Counting) Semaphores (UI & POSIX)
int sema_init (sema_t *sema, unsigned int sema_count, int type, void *arg) int sema_wait (sema_t *sema) int sema_post (sema_t *sema) int sema_trywait (sema_t *sema) int sema_destroy (sema_t *sema) int sem_init (sem_t *sema, int pshared, unsigned int count) int sem_post (sem_t *sema) int sem_trywait (sem_t *sema) int sem_destroy (sem_t *sema) (POSIX semaphores are not part of pthread. Use the libposix4.so and posix4.h)

126 Condition Variables (UI & POSIX)
int cond_init(contd_t *cond, int type, void *arg) int cond_wait(cond_t *cond, mutex_t *mutex); int cond_signal(cond_t *cond) int cond_broadcast(cond_t *cond) int cond_timedwait(cond_t *cond, mutex_t *mutex, timestruc_t *abstime) int cond_destroy (cond_t *cond) int pthread_cond_init(pthread_cond_t *cond,pthread_condattr_t *attr) int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex) int pthread_cond_signal (pthread_cond_t *cond) int pthread_cond_broadcast(pthread_cond_t *cond, pthread_mutex_t *mutex, struct timespec *abstime) int pthread_cond_destroy(pthread_cond_t *cond)

127 Signals (UI & POSIX) int thr_sigsetmask(int how, const sigset_t *set, sigset_t *oset); int thr_kill(thread_t target thread, int sig) int sigwait(sigset_t *set) int pthread_sigmask(int how, const sigset_t *set, sigset_t *oset); int pthread_kill(thread_t target_thread, int sig) int sigwait(sigset_t *set, int *sig)

128 Cancellation (POSIX) int pthread_cancel (pthread_thread_t thread)
int pthread cleanup_pop (int execute) int pthread_cleanup_push (void (*funtion) (void *), void *arg) int pthread_setcancelstate (int state, int *old_state) int pthread_testcancel (void)

129 Other APIs thr_self(void) thr_yield() int pthread_atfork (void (*prepare) (void), void (*parent) (void), void (*child) (void) pthread_equal (pthread_thread_t tl, pthread_thread_t t2) pthread_once (pthread_once_t *once_control, void (*init_routine) (void)) pthread_self (void) pthread_yield() (Thread IDs in Solaris recycle every 2^32 threads, or about once a month if you do create/exit as fast as possible.)

130 Compiling

131 Solaris Libraries Solaris has three libraries: libthread.so, libpthread.so, libposix4.so Corresponding new include files: synch.h, thread.h, pthread.h, posix4.h Bundled with all O/S releases Running an MT program requires no extra effort Compiling an MT program requires only a compiler (any compiler!) Writing an MT program requires only a compiler (but a few MT tools will come in very handy)

132 Compiling UI under Solaris
Compiling is no different than for non-MT programs libthread is just another system library in /usr/lib Example: %cc -o sema sema.c -lthread -D_REENTRANT %cc -o sema sema.c -mt All multithreaded programs should be compiled using the _REENTRANT flag Applies for every module in a new application If omitted, the old definitions for errno, stdio would be used, which you don’t want All MT-safe libraries should be compiled using the _REENTRANT flag, even though they may be used single in a threaded program.

133 Compiling POSIX under Solaris
Compiling is no different than for non-MT programs libpthread is just another system library in /usr/lib Example : %cc-o sema sema.c -lpthread -lposix D_POSIX_C_SOURCE=19956L All multithreaded programs should be compiled using the _POSIX_C_SOURCE=199506L flag Applies for every module in a new application If omitted, the old definitions for errno, stdio would be used, which you don’t want All MT-safe libraries should be compiled using the _POSIX_C_SOURCE=199506L flag, even though they may be used single in a threaded program

134 Compiling mixed UI/POSIX under Solaris
If you just want to use the UI thread functions (e.g., thr_setconcurrency()) %cc-o sema sema.c -1thread -1pthread -1posix4 D_REENTRANT - _POSIX_PTHREAD_SEMANTICS If you also want to use the UI semantics for fork(), alarms, timers, sigwait(), etc.,.

135 Summary Threads provide a more natural programming paradigm
Improve efficiency on uniprocessor systems Allows to take full advantage of multiprocessor Hardware Improve Throughput: simple to implement asynchronous I/O Leverage special features of the OS Many applications are already multithreaded MT is not a silver bullet for all programming problems. Threre is already standard for multithreading--POSIX Multithreading support already available in the form of language syntax--Java Threads allows to model the real world object (ex: in Java)

136 Multithreading in Java

137 Java - An Introduction Java - The new programming language from Sun Microsystems Java -Allows anyone to publish a web page with Java code in it Java - CPU Independent language Created for consumer electronics Java - James , Arthur Van , and others Java -The name that survived a patent search Oak -The predecessor of Java Java is “C “

138 Object Oriented Languages -A comparison

139 Sun defines Java as: Simple and Powerful Safe Object Oriented Robust
Architecture Neutral and Portable Interpreted and High Performance Threaded Dynamic

140 Power of Compiled Languages Flexibility of Interpreted Languages
Java Integrates Power of Compiled Languages and Flexibility of Interpreted Languages

141 Classes and Objects Classes and Objects Method Overloading
Method Overriding Abstract Classes Visibility modifiers default public protected private protected , private

142 Threads Java has built in thread support for Multithreading
Synchronization Thread Scheduling Inter-Thread Communication: currentThread start setPriority yield run getPriority sleep stop suspend resume Java Garbage Collector is a low-priority thread

143 Ways of Multithreading in Java
Create a class that extends the Thread class Create a class that implements the Runnable interface 1st Method: Extending the Thread class class MyThread extends Thread { public void run() // thread body of execution } Creating thread: MyThread thr1 = new MyThread(); Start Execution: thr1.start();

144 2nd method: Threads by implementing Runnable interface
class ClassName implements Runnable { ..... public void run() // thread body of execution } Creating Object: ClassName myObject = new ClassName(); Creating Thread Object: Thread thr1 = new Thread( myObject ); Start Execution: thr1.start();

145 Thread Class Members... public class java.lang.Thread extends java.lang.Object implements java.lang.Runnable { // Fields public final static int MAX_PRIORITY; public final static int MIN_PRIORITY; public final static int NORM_PRIORITY; // Constructors public Thread(); public Thread(Runnable target); public Thread(Runnable target, String name); public Thread(String name); public Thread(ThreadGroup group, Runnable target); public Thread(ThreadGroup group, Runnable target, String name); public Thread(ThreadGroup group, String name); // Methods public static int activeCount(); public void checkAccess(); public int countStackFrames(); public static Thread currentThread(); public void destroy(); public static void dumpStack(); public static int enumerate(Thread tarray[]); public final String getName();

146 ...Thread Class Members. public final int getPriority(); // 1 to 10 priority-pre-emption at mid. public final ThreadGroup getThreadGroup(); public void interrupt(); public static boolean interrupted(); public final boolean isAlive(); public final boolean isDaemon(); public boolean isInterrupted(); public final void join(); public final void join(long millis); public final void join(long millis, int nanos); public final void resume(); public void run(); public final void setDaemon(boolean on); public final void setName(String name); public final void setPriority(int newPriority); public static void sleep(long millis); public static void sleep(long millis, int nanos); public void start(); public final void stop(); public final void stop(Throwable obj); public final void suspend(); public String toString(); public static void yield(); }

147 Manipulation of Current Thread
// CurrentThreadDemo.java class CurrentThreadDemo { public static void main(String arg[]) { Thread ct = Thread.currentThread(); ct.setName( "My Thread" ); System.out.println("Current Thread : "+ct); try { for(int i=5; i>0; i--) { System.out.println(" " + i); Thread.sleep(1000); } catch(InterruptedException e) { System.out.println("Interrupted."); } Run: Current Thread : Thread[My Thread,5,main] 5 4 3 2 1

148 Creating new Thread... // ThreadDemo.java
class ThreadDemo implements Runnable { ThreadDemo() Thread ct = Thread.currentThread(); System.out.println("Current Thread : "+ct); Thread t = new Thread(this,"Demo Thread"); t.start(); try Thread.sleep(3000); } catch(InterruptedException e) System.out.println("Interrupted."); System.out.println("Exiting main thread.");

149 ...Creating new Thread. public void run() { try {
for(int i=5; i>0; i--) { System.out.println(" " + i); Thread.sleep(1000); } } catch(InterruptedException e) { System.out.println("Child interrupted."); } System.out.println("Exiting child thread."); public static void main(String args[]) { new ThreadDemo(); Run: Current Thread : Thread[main,5,main] 5 4 3 Exiting main thread. 2 1 Exiting child thread.

150 Thread Priority... // HiLoPri.java class Clicker implements Runnable {
int click = 0; private Thread t; private boolean running = true; public Clicker(int p) { t = new Thread(this); t.setPriority(p); } public void run() while(running) click++; public void start() t.start(); public void stop() running = false;

151 ...Thread Priority class HiLoPri {
public static void main(String args[]) Thread.currentThread().setPriority(Thread.MAX_PRIORITY); Clicker Hi = new Clicker(Thread.NORM_PRIORITY+2); Clicker Lo = new Clicker(Thread.NORM_PRIORITY-2); Lo.start(); Hi.start(); try { Thread.sleep(10000); } catch (Exception e) { } Lo.stop(); Hi.stop(); System.out.println(Lo.click + " vs. " + Hi.click); Run1: (on Solaris) 0 vs Run2: (Window 95) vs

152 The Java monitor model Method 1 Method 2 Key Block 1 Threads
Monitor (synchronised) solves race-condition problem

153 Threads Synchronisation...
// Synch.java: race-condition without synchronisation class Callme { // Check synchronized and unsynchronized methods /* synchronized */ void call(String msg) { System.out.print("["+msg); try { Thread.sleep(1000); } catch(Exception e) { } System.out.println("]"); class Caller implements Runnable String msg; Callme Target; public Caller(Callme t, String s) Target = t; msg = s; new Thread(this).start();

154 ...Threads Synchronisation.
public void run() { Target.call(msg); } class Synch { public static void main(String args[]) { Callme Target = new Callme(); new Caller(Target, "Hello"); new Caller(Target, "Synchronized"); new Caller(Target, "World"); Run 1: With unsynchronized call method (race condition) [Hello[Synchronized[World] ] Run 2: With synchronized call method [Hello] [Synchronized] [World] Run3: With Synchronized object synchronized(Target) { Target.call(msg); } The output is the same as Run2

155 Queue (no inter-threaded communication)...
// pc.java: produce and consumer class Queue { int n; synchronized int get() System.out.println("Got : "+n); return n; } synchronized void put(int n) this.n = n; System.out.println("Put : "+n); class Producer implements Runnable Queue Q; Producer(Queue q) Q = q; new Thread( this, "Producer").start();

156 Queue (no inter-threaded communication)...
public void run() { int i = 0; while(true) Q.put(i++); } class Consumer implements Runnable Queue Q; Consumer(Queue q) Q = q; new Thread( this, "Consumer").start(); Q.get();

157 ...Queue (no inter-threaded communication).
class PC { public static void main(String[] args) Queue Q = new Queue(); new Producer(Q); new Consumer(Q); } Run: Put: 1 Got: 1 Put: 2 Put: 3 Got: 3 ^C

158 Queue (interthread communication)...
// PCnew.java: produce-consumenr with interthread communication class Queue { int n; boolean ValueSet = false; synchronized int get() try if(!ValueSet) wait(); } catch(InterruptedException e) System.out.println("Got : "+n); ValueSet = false; notify(); return n;

159 Queue (interthread communication)...
synchronized void put(int n) { try { if(ValueSet) wait(); } catch(InterruptedException e) { } this.n = n; System.out.println("Put : "+n); ValueSet = true; notify(); class Producer implements Runnable Queue Q; Producer(Queue q) Q = q; new Thread( this, "Producer").start();

160 Queue (interthread communication)...
public void run() { int i = 0; while(true) Q.put(i++); } class Consumer implements Runnable Queue Q; Consumer(Queue q) Q = q; new Thread( this, "Consumer").start(); Q.get();

161 ...Queue (no interthread communication).
class PCnew { public static void main(String[] args) Queue Q = new Queue(); new Producer(Q); new Consumer(Q); } Run: Put : 0 Got : 0 Put : 1 Got : 1 Put : 2 Got : 2 Put : 3 Got : 3 Put : 4 Got : 4 ^C

162 Deadlock... // DeadLock.java class A { synchronized void foo(B b)
String name = Thread.currentThread().getName(); System.out.println(name + " entered A.foo"); try Thread.sleep(1000); } catch(Exception e) System.out.println(name + " trying to call B.last()"); b.last(); synchronized void last() System.out.println("Inside A.last");

163 Deadlock... class B { synchronized void bar(A a)
String name = Thread.currentThread().getName(); System.out.println(name + " entered B.bar"); try Thread.sleep(1000); } catch(Exception e) System.out.println(name + " trying to call A.last()"); a.last(); synchronized void last() System.out.println("Inside B.last");

164 ...Deadlock. class DeadLock implements Runnable { A a = new A();
B b = new B(); DeadLock() { Thread.currentThread().setName("Main Thread"); new Thread(this).start(); a.foo(b); System.out.println("Back in the main thread."); } public void run() { Thread.currentThread().setName("Racing Thread"); b.bar(a); System.out.println("Back in the other thread"); public static void main(String args[]) { new DeadLock(); Run: Main Thread entered A.foo Racing Thread entered B.bar Main Thread trying to call B.last() Racing Thread trying to call A.last() ^C

165 Grand Challenges (Is PP Practical?)
Need OS and Compiler support to use multiprocessor machines. Ideal would be for the user to be unaware if the problem is running on sequential or parallel hardware - a long way to go. With Highspeed Networks and improved microprocessor performance, multiple stand-alone machines can also be used as a parallel machine - a Popular Trend. (appealing vehicle for parallel computing) Language standards have to evolve. (Portability). Re-orientation of thinking Sequential Parallel

166 Grand Challenges (Is PP Practical?)
Language standards have to evolve. (Portability). Re-orientation of thinking Sequential Parallel

167 Breaking High Performance Computing Barriers
2100 G F L O P S 2100 Single Processor Shared Memory Local Parallel Cluster Global Parallel Cluster

168 Thank You ...


Download ppt "Concurrent Programming with Threads"

Similar presentations


Ads by Google