Concurrency
Readings
- Tanenbaum and van Steen
- Coulouris: Chapter 6
- cs402 web page links
- UNIX Network Programming by W. Richard Stevens
Introduction
- Consider a request for a file transfer.
  - Client A requests a huge file from disk.
  - This request means that the process is I/O bound.
  - In the server program presented earlier, all other client requests have to wait until client A's request is processed.
  - Waste!
- In writing distributed applications, it is often necessary that a process be constructed so that communication and local processing can overlap.
Different Types of Servers
- Iterative server: the server cannot process a pending client until it has completely serviced the current client.
- Concurrent server: there are two basic versions:
  - The fork system call is used to spawn a child process for every client.
  - The server creates a thread for each client.
Iterative Servers
- Iterative servers process one request at a time.
- [Timeline figure: client 1 calls connect; the server accepts, reads, writes, and closes before it accepts client 2's connection, so client 2 is serviced only after client 1 is finished.]
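A minimal sketch of the iterative pattern (assuming the doit() request-handling helper used in the later examples):

    #include <sys/socket.h>
    #include <unistd.h>

    void doit(int connfd);                      /* assumed: reads the request and writes the reply */

    void serve_iteratively(int listenfd)
    {
        int connfd;

        for ( ; ; ) {
            connfd = accept(listenfd, NULL, NULL);   /* blocks until a client connects */
            if (connfd < 0)
                continue;
            doit(connfd);                       /* every other client waits while this runs */
            close(connfd);
        }
    }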
Iterative Servers
- Solution: use concurrent servers instead.
  - Concurrent servers use multiple concurrent flows to serve multiple clients at the same time.
- [Timeline figure: the user of client 1 goes out to lunch, so client 1 blocks waiting for the user to type in data, the server blocks waiting for data from client 1, and client 2 blocks waiting to complete its connection request.]
Concurrent Servers: Multiple Processes
- Concurrent servers handle multiple requests concurrently.
- [Timeline figure: the server forks child 1 to handle client 1 and, while client 1 blocks waiting for the user to type in data, accepts client 2's connection and forks child 2; each child reads from and writes to its own client independently.]
Using the Fork System Call
- Traditionally, a concurrent TCP server calls fork to spawn a child to handle each client request.
- The only limit on the number of clients is the operating system limit on the number of child processes for the user identifier under which the server is running.
Using the Fork System Call
- The fork system call creates a UNIX process by copying the parent's memory image.
- Both processes continue execution at the instruction after the fork.
- Creating two completely identical processes is not useful on its own: we need a way to distinguish the parent from the child.
  - The fork call returns 0 to the child process and returns the child's process identifier to the parent.
Using the Fork System Call
- The child inherits most of the parent's environment and context, including scheduling parameters and the file descriptor table.
- The parent and children share the files that were opened by the parent prior to the fork.
- When a process closes a file, the entry in its file descriptor table is freed.
- To be safe, the parent and children should close the file descriptors they do not need immediately after fork() returns.
Using the Fork System Call

    pid_t pid;
    int listenfd, connfd;

    listenfd = socket(...);
    ...
    listen(listenfd, queuesize);

    for ( ; ; ) {
        connfd = accept(listenfd, ...);
        if ( (pid = fork()) == 0) {
            close(listenfd);   /* child closes listening socket */
            doit(connfd);      /* process the request */
            close(connfd);     /* child closes connected socket */
            exit(0);
        }
        close(connfd);         /* parent closes connected socket */
        ...
    }
Zombies
- When a program forks and the child finishes before the parent, the kernel still keeps some information about the child in case the parent needs it (e.g., the child's exit status).
- The parent can get this information using wait().
- Between the child terminating and the parent calling wait(), the child process is referred to as a zombie.
- Zombies stay in the system until they are waited for.
Zombies
- Even though the child process is no longer running, it still occupies an entry in the process table.
- This is not good: eventually the system runs out of process table entries.
- If a parent terminates without calling wait(), the child process (now an orphan) is adopted by init (the first process started on a UNIX system).
- The init process periodically waits for its children, so orphaned zombies are eventually removed.
Zombies
- You should make sure that the parent process calls wait() (or waitpid()) for each child process that terminates.
- When a child process terminates, the child enters the zombie state and the SIGCHLD signal is sent to the parent process.
  - A signal is an asynchronous notification to a process that an event has occurred.
- A signal handler associated with the SIGCHLD signal executes the wait() function.
Signal Handling
- A signal is a notification to a process that an event has occurred.
- Signals usually occur asynchronously, i.e., the process does not know ahead of time exactly when a signal will occur.
- Signals can be sent by one process to another process (or to itself) or by the kernel to a process.
- Every signal has a disposition (action) associated with it.
- There are about 32 signals in UNIX.
- Example signals: a child process terminates (SIGCHLD), keyboard-generated signals (e.g., ctrl-C sends SIGINT, ctrl-Z sends SIGTSTP).
Signal Handling
- A function (called the signal handler) can be provided that is called whenever a specific signal occurs.
- This function is called with a single integer argument, the signal number, and does not return anything.
- Most signals can be ignored by setting their disposition to SIG_IGN. Two exceptions: SIGKILL and SIGSTOP cannot be caught or ignored.
- The default disposition is SIG_DFL. For most signals the default action is to terminate the process.
Signal Handling
- The signal function can be used to set the disposition for a signal.
  - The first argument is the signal name and the second argument is either a pointer to a function or one of the constants SIG_IGN or SIG_DFL.
- The signal function has different semantics depending on the OS.
- One solution is to define your own signal function that calls the Posix sigaction function, which has well-defined semantics (see the sketch below).
  - Why not use sigaction to begin with? It is more awkward to use than signal.
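A sketch of such a wrapper, modelled on the Stevens-style Signal() function (the choice to restart interrupted system calls for everything except SIGALRM is a convention, not a requirement):

    #include <signal.h>

    typedef void Sigfunc(int);

    Sigfunc *Signal(int signo, Sigfunc *func)
    {
        struct sigaction act, oact;

        act.sa_handler = func;
        sigemptyset(&act.sa_mask);          /* do not block any extra signals in the handler */
        act.sa_flags = 0;
        if (signo != SIGALRM)
            act.sa_flags |= SA_RESTART;     /* restart slow system calls where supported */
        if (sigaction(signo, &act, &oact) < 0)
            return (SIG_ERR);
        return (oact.sa_handler);
    }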
Signal Handling

    #include <signal.h>
    #include <sys/wait.h>

    void reapchild(int s);          /* handler declared before use */

    int main()
    {
        signal(SIGCHLD, reapchild);
        ....
    }

    void reapchild(int s)
    {
        pid_t pid;
        int stat;

        pid = wait(&stat);
        return;
    }
Signal Handling
- wait returns two values:
  - The return value of the function is the process identifier of the terminated child.
  - The termination status of the child (an integer) is returned through stat, unless the caller passes NULL, in which case the status is discarded.
- Can also use waitpid().
  - Gives better control, since an input parameter of this function specifies the process id of the child process to be waited for.
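A common sketch of a SIGCHLD handler uses waitpid() with the WNOHANG option in a loop, so that every terminated child is reaped even if several exit while the signal is blocked (UNIX signals are not queued, as noted on the following slide):

    #include <sys/types.h>
    #include <sys/wait.h>

    void reap_children(int signo)
    {
        pid_t pid;
        int stat;

        /* One SIGCHLD may stand for several terminated children, so loop. */
        while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
            ;                       /* could record pid and WEXITSTATUS(stat) here */
    }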
Signal Handling
- A few points about signal handling on a Posix-compliant system:
  - Once a signal handler is installed, it remains installed.
  - While a signal handler is executing, the signal being delivered is blocked.
  - If a signal is generated one or more times while it is blocked, it is normally delivered only once after the signal is unblocked, i.e., UNIX signals are not queued.
Signal Handling
- A slow system call is any system call that can block forever (e.g., accept(), read()/write() on sockets).
- System calls may be interrupted by signals, and the interrupted call may have to be restarted.
- When a process is blocked in a slow system call, catches a signal, and the signal handler returns:
  - The system call can return the error EINTR (interrupted system call).
  - Some kernels automatically restart some system calls.
- For portability, when we write a program that catches signals, we must be prepared for slow system calls to return EINTR.
Signal Handling

    ...
    int main()
    {
        ...
        socklen_t len;

        for ( ; ; ) {
            len = sizeof(sin);
            if ( (connfd = accept(listenfd, (struct sockaddr *) &sin, &len)) < 0) {
                if (errno == EINTR)
                    continue;           /* restart the interrupted accept */
                perror("accept");
                exit(1);
            }
            ...
        }
    }
TCP Preforked Server
- Instead of doing one fork per client, the server preforks some number of children when it starts, and the children are then ready to service clients as each client connection arrives.
- Advantage: new clients can be handled without the cost of a fork by the parent.
- Disadvantage: when it starts, the parent process must guess how many children to prefork.
  - If the number of clients at any time is greater than the number of children, additional client requests are queued.
TCP Preforked Server

    main()
    {
        ...
        for (i = 0; i < nchildren; i++)
            pids[i] = child_make(i, listenfd, addrlen);
        ...
    }
TCP Preforked Server
- The child_make() function is called by the main program to create each child.
- As will be seen on the next slide, a call to fork in child_make() is used to create a child; only the parent returns from child_make().
TCP Preforked Server

    pid_t child_make(int i, int listenfd, int addrlen)
    {
        pid_t pid;
        void child_main(int, int, int);

        if ( (pid = fork()) > 0)
            return (pid);                    /* parent returns the child's pid */
        child_main(i, listenfd, addrlen);    /* child: never returns */
    }
TCP Preforked Server

    void child_main(int i, int listenfd, int addrlen)
    {
        ...
        for ( ; ; ) {
            connfd = accept(listenfd, ...);
            doit(connfd);       /* service the client */
            close(connfd);
        }
    }
TCP Preforked Server
- Note that in this example there are multiple processes calling accept on the same socket descriptor. This is allowed in Berkeley-derived kernels.
  - If there are N children blocked in accept, the next incoming client connection is handed to the next child process that runs.
- There are other approaches. For example:
  - The application places a lock of some form around the call to accept, so that only one process at a time is blocked in the call to accept; the remaining children block trying to obtain the lock. There are various ways to provide the lock (see the sketch below).
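A sketch of the locking approach using fcntl record locking on a temporary file (the helper names my_lock_init()/my_lock_wait()/my_lock_release() follow the Stevens-style implementation; treat the details as illustrative):

    #include <fcntl.h>
    #include <unistd.h>

    static struct flock lock_it, unlock_it;
    static int lock_fd = -1;

    void my_lock_init(const char *pathname)     /* called once by the parent before forking */
    {
        lock_fd = open(pathname, O_CREAT | O_WRONLY, 0666);
        unlink(pathname);                       /* the name vanishes; the open descriptor remains */

        lock_it.l_type = F_WRLCK;   lock_it.l_whence = SEEK_SET;
        lock_it.l_start = 0;        lock_it.l_len = 0;
        unlock_it.l_type = F_UNLCK; unlock_it.l_whence = SEEK_SET;
        unlock_it.l_start = 0;      unlock_it.l_len = 0;
    }

    void my_lock_wait(void)    { fcntl(lock_fd, F_SETLKW, &lock_it); }     /* block until the lock is held */
    void my_lock_release(void) { fcntl(lock_fd, F_SETLKW, &unlock_it); }

    /* In child_main():  my_lock_wait();  connfd = accept(listenfd, ...);  my_lock_release(); */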
TCP Preforked Server
- In this type of server it is desirable to catch the SIGINT signal generated when we type the terminal interrupt key.
  - The signal handler can kill all the child processes.
TCP Preforked Server

    void sig_int(int signo)
    {
        int i;

        for (i = 0; i < nchildren; i++)
            kill(pids[i], SIGTERM);
        while (wait(NULL) > 0)      /* wait for all children */
            ;
        ....
    }
Problem with Forking Processes
- Process creation requires that the operating system create an (almost) completely independent address space.
- Allocation can mean initializing memory segments by zeroing a data segment, copying the associated program into a text segment, and setting up a stack for temporary data.
- Switching the CPU between two processes may be relatively expensive as well, since the CPU context (register values, program counter, stack pointer, etc.) must be saved and restored.
- Context switching also requires invalidating address translation caches.
Problem with Forking Processes
- The problem with using the fork system call is the amount of CPU time it takes to fork a child for each client.
- In the 1980s, when a busy server handled hundreds or perhaps a few thousand clients per day, this wasn't an issue.
- The WWW has changed this: some web servers measure the number of TCP connections per day in the millions.
Using the Fork System Call
- Advantages
  - Handles multiple connections concurrently.
  - Simple and straightforward (compared to other approaches).
  - Clean sharing model, except for descriptors.
- Disadvantages
  - Overhead.
  - It is non-trivial to share data between processes.
Threads
- An alternative to forking processes is to use threads.
- A thread (sometimes called a lightweight process) does not require lots of memory or startup time.
- A process may include many threads, all sharing:
  - Memory (program code and global data)
  - Open file/socket descriptors
  - Working environment (current directory, user ID, etc.)
- The information needed per thread is its CPU context plus thread-management information, such as keeping track of the fact that a thread is currently blocked on a mutex variable (described later).
Threads
- In contrast to processes, no attempt is made to achieve a high degree of concurrency transparency.
- Implications:
  - In many cases, multithreading leads to a performance gain.
  - Multithreaded applications are more difficult to design and implement than single-threaded applications.
[Figure: fork() copies process A into a new process B, each with its own code, global variables, and stack; pthread_create() adds a second thread of control inside process A, sharing the code and global variables but with its own stack.]
Posix Threads
- We will focus on Posix threads, the most widely supported threads programming API.
- You need to link with "-lpthread".
Thread Creation
- Thread identifiers: each thread has a unique identifier (ID); a thread can find out its own ID by calling pthread_self().
- Thread IDs are of type pthread_t, which is usually an unsigned int.
- When debugging, it's often useful to do something like this (see also the portability note below):
      printf("Thread %u: blah\n", (unsigned) pthread_self());
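Because Posix does not actually guarantee that pthread_t is an integer type, thread IDs should be compared with pthread_equal() rather than ==; a small sketch:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_t main_tid;              /* recorded by main() before other threads start */

    int in_main_thread(void)
    {
        return pthread_equal(pthread_self(), main_tid);   /* nonzero if the IDs match */
    }

    int main()
    {
        main_tid = pthread_self();
        printf("in main thread? %d\n", in_main_thread() != 0);   /* prints 1 */
        return 0;
    }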
Thread Creation

    int pthread_create(pthread_t *tid, const pthread_attr_t *attr,
                       void *(*func)(void *), void *arg);

- The ID of the new thread is returned through tid.
- The return value is 0 on success.
- pthread_create does not set errno; error codes are returned directly.
- Thread attributes can be set using attr; specifying NULL gives the default attributes.
Thread Creation
- func is the function to be called; when func() returns, the thread is terminated. The function takes a single parameter, specified by arg.
- Thread arguments:
  - Complex parameters can be passed by creating a structure and passing the address of the structure.
  - The structure cannot be a local variable that may go out of scope while the thread is still using it.
Thread Creation
- How can the main thread know when a child thread has terminated, so that it can now service a new client?
- pthread_join (which is sort of like waitpid()) requires that we specify a thread ID: we can wait for a specific thread, but we can't wait for "the next thread to exit".
Thread Example

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* pairint and blah() are defined on the next slide */
    int main()
    {
        pairint *foobar = malloc(sizeof(pairint));   /* heap-allocated, so not a problematic local */
        pthread_t threadid;

        foobar->x = 8;
        foobar->y = 10;
        pthread_create(&threadid, NULL, blah, (void *) foobar);
        pthread_join(threadid, NULL);
        printf("Parent is continuing....\n");
        return 0;
    }
Thread Example

    typedef struct { int x, y; } pairint;

    void *blah(void *arg)
    {
        printf("now in thread blah\n");
        pairint *foo = (pairint *) arg;
        printf("thread id %u : sum of %d and %d is %d\n",
               (unsigned) pthread_self(), foo->x, foo->y, foo->x + foo->y);
        return (NULL);
    }
Thread Lifespan
- Once a thread is created, it starts executing the function func() specified in the call to pthread_create().
- If func() returns, the thread is terminated. A thread can also be terminated by calling pthread_exit().
- If main() returns or any thread calls exit(), all threads are terminated.
Detached State
- Each thread is either joinable or detached.
- Joinable: on thread termination, the thread ID and exit status are saved by the OS. One thread can "join" another by calling pthread_join, which waits (blocks) until the specified thread exits.
- Detached: on termination, all of the thread's resources are released by the OS. A detached thread cannot be joined (see the sketch below).
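A sketch of creating a thread that is detached from the start, using a thread attribute object instead of calling pthread_detach() inside the thread (the helper name start_detached() is just for illustration):

    #include <pthread.h>

    void *handler(void *arg);               /* assumed thread function */

    int start_detached(void *arg)
    {
        pthread_t tid;
        pthread_attr_t attr;
        int rc;

        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        rc = pthread_create(&tid, &attr, handler, arg);   /* returns 0 on success */
        pthread_attr_destroy(&attr);
        return rc;
    }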
Shared Global Variables
- All threads within a process can access global variables.

    int counter = 0;

    void *blah(void *arg)
    {
        counter++;
        printf("Thread %u is number %d\n", (unsigned) pthread_self(), counter);
        return NULL;
    }

    int main()
    {
        int i;
        pthread_t threadid;

        for (i = 0; i < 10; i++)
            pthread_create(&threadid, NULL, blah, NULL);
        pthread_exit(NULL);   /* let the other threads run; returning from main would terminate them */
    }
Problem
- Sharing global variables is dangerous: two threads may attempt to modify the same variable at the same time.
- pthreads includes support for mutual exclusion primitives that can be used to protect against this problem.
- The general idea is to lock before accessing a global variable and to unlock as soon as you are done.
Support for Mutual Exclusion
- A global variable of type pthread_mutex_t is required:
      pthread_mutex_t counter_mtx = PTHREAD_MUTEX_INITIALIZER;
- To lock, use:
      int pthread_mutex_lock(pthread_mutex_t *mutex);
- To unlock, use:
      int pthread_mutex_unlock(pthread_mutex_t *mutex);
- Both calls take the address of the mutex, e.g., pthread_mutex_lock(&counter_mtx). An example follows below.
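For example, a sketch of the earlier counter example with the increment and the print protected by the mutex:

    #include <pthread.h>
    #include <stdio.h>

    int counter = 0;
    pthread_mutex_t counter_mtx = PTHREAD_MUTEX_INITIALIZER;

    void *blah(void *arg)
    {
        pthread_mutex_lock(&counter_mtx);       /* only one thread at a time past this point */
        counter++;
        printf("Thread %u is number %d\n", (unsigned) pthread_self(), counter);
        pthread_mutex_unlock(&counter_mtx);
        return NULL;
    }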
Threaded Server

    ...
    for ( ; ; ) {
        printf("top of for loop\n");
        connfd = accept(listenfd, 0, 0);
        if (connfd == -1)
            error("accept");
        printf("accepted request\n");
        /* pass the descriptor by value, cast into the pointer argument */
        pthread_create(&tid, NULL, serverthread, (void *) connfd);
        printf("waiting for more requests\n");
    }
    ...
Threaded Server

    void *serverthread(void *parm)
    {
        ...
        fd = (int) parm;                    /* recover the descriptor from the pointer argument */
        pthread_detach(pthread_self());     /* no one will join this thread */
        /* code to process the client request placed here */
        ...
    }
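An alternative sketch for passing the descriptor (this variation is not part of the original example): copy the descriptor into heap memory so the value is not squeezed through a pointer cast:

    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    void *serverthread2(void *parm)           /* illustrative name */
    {
        int fd = *(int *) parm;               /* copy the descriptor, then free the holder */
        free(parm);
        pthread_detach(pthread_self());
        /* ... process the client request on fd ... */
        close(fd);
        return NULL;
    }

    /* In the accept loop:
           int *fdp = malloc(sizeof(int));
           *fdp = connfd;
           pthread_create(&tid, NULL, serverthread2, fdp);
    */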
Summary
- We have looked at different ways to implement concurrent servers.
- If the server is not heavily used, the traditional concurrent-server model with one fork per client is fine.
- Creating a pool of child processes or threads can reduce CPU usage compared to the traditional concurrent-server model.
- Using threads is normally faster than using processes.
- Having the child processes/threads call accept is usually considered simpler and faster than having the parent process/main thread call accept and pass the descriptor to a child or thread.