
1 Synchronization Comp6231 © Qingzhe Huang

2 What is synchronization? Generally speaking, synchronization simply means placing a bunch of jobs in a certain order. This is not an official definition, but it is sufficient for understanding. The word “order” here has a rather broad meaning; for example, “one by one” is also a kind of order. The following scenarios make this concrete.

3 Synchronization scenario 1 1. You have created a demo program that simulates multiple client programs accessing your server simultaneously by creating a number of threads, say 30. However, creating the threads is a sequential job, and you want them all to start at the same moment, i.e. you want the thread that is created first to wait for the thread that is created last.

4 Synchronization scenario 2 2. You have a global file pointer “fPtr” that multiple threads use to access a file. The first thread moves the pointer to location N and is ready to write data. Then a context switch happens and the second thread gets a chance to run, relocating the file pointer to location M. Then a context switch happens again, and the first thread writes its data at location M instead of N.

5 Synchronization scenario 3 3. You have a table of active customers with 20 entries. Whenever a customer logs in, you add his data to this table until the table is full. Whenever a customer logs out, you remove his data from the table. And you have a simple “if” statement like: if (cust_count<20){ cust_count++; add_cust(cust_data);}

6 Synchronization scenario 3 cont. Have you spotted the problem in the code? What if cust_count is 19 and a context switch happens right after the “if” test, and then another customer logs in? The new customer executes the same code, writes his data, and after cust_count++ cust_count is 20. When the first customer resumes, he has already passed the test, so he increments the count to 21 and overflows the table. If you are lucky, you get a crash. If your luck is bad, the program runs for a while and gives some strange results.

7 How to solve? These three scenarios call for different synchronization techniques, and each has its own name. 1. The first is a barrier problem, which requires every job to wait until a certain number of jobs have arrived. In Java, you can use the class “CyclicBarrier” to hold the running threads until all of them are ready to go.

8 How to solve? The second scenario is a typical mutual exclusion problem, which requires exclusive use of a certain resource such as a file or a socket. In Java, you can use a counting semaphore with one available permit to achieve a “mutex”, i.e. Semaphore sem=new Semaphore(1, true);

9 How to solve? The third one can be called a counting semaphore problem. You have a limited number of resources to allocate and a potentially large number of demands. It can be regarded as a generalization of mutual exclusion, because a “mutex” has exactly one resource to allocate. Similarly, you write: Semaphore sem=new Semaphore(Number_Job, true);
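As a rough sketch of the counting-semaphore idea applied to the customer table of scenario 3: one permit stands for one free table entry. The class name CustomerTable, its methods and the stress helper are hypothetical names chosen for this illustration, and the sketch uses tryAcquire() so that a login is refused when the table is full instead of waiting.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical illustration, not code from the slides.
class CustomerTable {
    private final Semaphore slots;                                    // one permit per free table entry
    private final Set<String> active = ConcurrentHashMap.newKeySet(); // thread-safe set of customers

    CustomerTable(int capacity) {
        slots = new Semaphore(capacity, true);
    }

    /** Returns true if the customer was added, false if the table is full or he is already in it. */
    boolean login(String customer) {
        if (!slots.tryAcquire()) {
            return false;                 // no free entry left
        }
        if (!active.add(customer)) {
            slots.release();              // already logged in: give the permit back
            return false;
        }
        return true;
    }

    void logout(String customer) {
        if (active.remove(customer)) {
            slots.release();              // the entry is free again
        }
    }

    int size() {
        return active.size();
    }

    /** Lets many threads log in at once and returns how many customers ended up in the table. */
    static int stress(int capacity, int attempts) {
        CustomerTable table = new CustomerTable(capacity);
        Thread[] threads = new Thread[attempts];
        for (int i = 0; i < attempts; i++) {
            final String name = "cust-" + i;
            threads[i] = new Thread(() -> table.login(name));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(stress(20, 100)); // never more than 20
    }
}
```

The semaphore only limits how many customers are admitted; the set itself still has to be thread-safe, which is why a concurrent set is assumed here.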

10 Show me how to use The usage may be easier than the understanding.
sem.acquire();
//do your update work here, e.g. operate on the file pointer or add a new customer to the table
sem.release();
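A minimal sketch of the acquire/release pattern with a one-permit semaphore used as a mutex. The class MutexCounter and its run helper are hypothetical; the sketch calls acquireUninterruptibly() instead of acquire() only to avoid the checked InterruptedException, and it releases the permit in a finally block so that an exception inside the critical section cannot leave the semaphore locked.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration, not code from the slides.
class MutexCounter {
    private final Semaphore sem = new Semaphore(1, true); // one permit: behaves like a mutex
    private int count = 0;                                 // shared data guarded by sem

    void increment() {
        sem.acquireUninterruptibly();   // blocks until the permit is free
        try {
            count++;                    // critical section
        } finally {
            sem.release();              // always give the permit back
        }
    }

    int get() {
        sem.acquireUninterruptibly();
        try {
            return count;
        } finally {
            sem.release();
        }
    }

    /** Runs several threads that each increment the counter and returns the final value. */
    static int run(int threads, int perThread) {
        MutexCounter counter = new MutexCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(8, 1000));
    }
}
```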

11 Show me how to use cont. CyclicBarrier barrier=new CyclicBarrier(RunningThreadNumber); Within the “run” method of each thread, add “barrier.await();” before your job starts. See the documentation at “sun.com” to learn more.
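The barrier usage can be sketched as follows. BarrierDemo and runDemo are hypothetical names; each thread registers its arrival, waits at the barrier, and then checks that every thread had arrived before it was let through.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not code from the slides.
class BarrierDemo {
    /** Starts n threads that meet at a barrier; returns how many saw all n arrivals after passing it. */
    static int runDemo(int n) {
        CyclicBarrier barrier = new CyclicBarrier(n);
        AtomicInteger arrived = new AtomicInteger();
        AtomicInteger passed = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                try {
                    arrived.incrementAndGet();
                    barrier.await();                 // wait here until all n threads have arrived
                    if (arrived.get() == n) {        // the simulated client job would start here
                        passed.incrementAndGet();
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();                      // creation is still sequential
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return passed.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(30));
    }
}
```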

12 Is that simple? This seems easy, but how do you know when you need synchronization? For example, if you have a server class, which of its methods need to be synchronized to make it “thread-safe”?

13 Is that simple? Cont.
class Foo {
    public int global=100; //global data
    public void fun1(){ if (global>100) …. }
    public void fun2(){
        int local=100; //a local temporary variable
        if (local>100) …
    }
}
Question: which method needs synchronization?

14 Is that simple? Cont. Obviously “fun1” does, because it accesses a “global” variable that may be shared with other threads. However, before you give this answer, ask yourself whether you really DO have multiple threads running in your class. (Does it sound absurd?)

15 The multi-thread model at server In Java, if you extend UnicastRemoteObject, you basically create one single instance of your class on the server, and multiple requests do indeed call methods of that instance at the same time. But if you are writing a web service in the model considered here, a servlet is simply created whenever there is an incoming request. In other words, your servlet runs with a single thread. Then why would you have to worry about “thread-safety”?

16 The multi-thread model at server
UnicastRemoteObject (in RMI, CORBA): three client requests are invoking server object methods at the same time.
Servlet inside servlet container (Web Service): three client requests create three servlets, one for each request.

17 The multi-thread model at server In the “web service” model, then, each servlet has exactly one request invoking one method. In other words, it is a purely single-threaded model inside the servlet. Good thing: it seems we don’t have a “thread-safety” issue here, do we? (Really?) Bad thing: the constructor of the servlet is called each time a request comes in, which is very inefficient.

18 The multi-thread model at server Can we then say that we don’t have to worry about thread-safety or synchronization in a servlet? The answer is no if you are using static variables. Modify the previous example a little by adding “static” before the “global” variable; then you still need synchronization for “global”, because it is “class-wise” global even when you have multiple instances, i.e. all instances share the “static” variable.
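To illustrate why a static variable still needs protection when every request gets its own object, here is a small sketch. PerRequestHandler, handle and simulate are hypothetical names, and the per-request object merely stands in for a servlet. Because each instance has its own monitor, the sketch synchronizes on one shared static lock object rather than on “this”.

```java
// Hypothetical illustration, not code from the slides.
class PerRequestHandler {
    private static int hits = 0;                       // shared by ALL instances
    private static final Object LOCK = new Object();   // one lock for the whole class

    void handle() {
        synchronized (LOCK) {   // locking "this" would not help: every instance has its own monitor
            hits++;
        }
    }

    /** Simulates one new handler object per request; returns how many hits were recorded. */
    static int simulate(int requests, int callsPerRequest) {
        int before;
        synchronized (LOCK) {
            before = hits;
        }
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(() -> {
                PerRequestHandler handler = new PerRequestHandler(); // fresh instance per request
                for (int j = 0; j < callsPerRequest; j++) {
                    handler.handle();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (LOCK) {
            return hits - before;
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate(10, 1000));
    }
}
```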

19 Why do we bother with this? Have you asked yourself this question? We bother because synchronization is expensive. 1. In the first scenario, all 30 threads must wait until all are created. 2. In the second scenario, the jobs can only be done one by one. 3. In the third scenario, you have an upper limit (even though this comes from the nature of the application).

20 Why do we bother with this? Cont. Apart from restricting program execution (limiting the number of concurrent jobs), the synchronization tool itself is non-trivial and may consume considerable resources. The general principle is to use synchronization ONLY when necessary. Look at the following example and ask yourself whether you really need synchronization.

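21 Do we need it?
sem.acquire(); //assume sem is a semaphore
flag=true; //flag is a global variable
sem.release();
Here we don’t really need a semaphore for protection, because we simply change the value of flag, which is an atomic operation. (In assembly language it is a single instruction.) Note that in Java atomicity alone does not guarantee that other threads see the new value promptly; declaring flag “volatile” provides that guarantee without a semaphore.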
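A small sketch of setting a flag without any semaphore. It assumes the flag is declared volatile, which is what makes the single write visible to a thread that keeps reading it; StopSignal and workerStops are hypothetical names.

```java
// Hypothetical illustration, not code from the slides.
class StopSignal {
    private volatile boolean flag = false;   // volatile: a write by one thread is seen by the others

    void set() {
        flag = true;                         // a single atomic write, no semaphore needed
    }

    boolean isSet() {
        return flag;
    }

    /** Returns true if a worker that spins on the flag stops within timeoutMillis after it is set. */
    static boolean workerStops(long timeoutMillis) {
        StopSignal signal = new StopSignal();
        Thread worker = new Thread(() -> {
            while (!signal.isSet()) {
                Thread.yield();              // keep polling until the flag changes
            }
        });
        worker.setDaemon(true);              // do not keep the JVM alive if something goes wrong
        worker.start();
        signal.set();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(workerStops(5000));
    }
}
```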

22 Do it smartly
sem.acquire(); //assume sem is a semaphore
if (flag) do_a_really_long_job();
sem.release();
The above code segment shows another typical issue in thread programming; even some professional programmers write code like this. Assume “do_a_really_long_job()” really takes some time to finish. (Of course, it will modify “flag”; otherwise why would we include it within the “critical section”? Think about it.) Is it wise to put such a long job inside the “critical section”? All other threads will be blocked at “sem.acquire()” until it finishes.

23 Do it smartly cont. One possible solution is to assign each thread a unique ID, for example by passing an index to the constructor of the thread class, and then to use an array of boolean variables to differentiate the signals. Inside the critical section, each thread checks only one boolean variable, “working”, sets its own flag if “working” is false, and then quickly releases the semaphore. So most threads are not blocked by the semaphore for long. (Of course, you can also solve this by using the non-blocking version of the semaphore. This will be discussed later.)

24 Do it smartly cont.
MyThread threads[]=new MyThread[10]; //init array
boolean available[]=new boolean[10];
boolean working=false;
for (int i=0; i<10; i++){
    threads[i]=new MyThread(i);
    available[i]=false;
    threads[i].start();
}
…
//inside the run() method of MyThread; myID is the index passed to the constructor
sem.acquire();
if (!working){
    working=true;
    available[myID]=true;
}
sem.release();
if (available[myID]) {
    do_a_really_long_job();
    available[myID]=false;
    working=false;
}
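The non-blocking alternative mentioned earlier can be sketched with Semaphore.tryAcquire(): a thread that finds the semaphore taken simply moves on instead of waiting. LongJobRunner, tryRun and demo are hypothetical names; the latches exist only to make the demonstration deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

// Hypothetical illustration, not code from the slides.
class LongJobRunner {
    private final Semaphore sem = new Semaphore(1);

    /** Runs the job only if nobody else is running one; never blocks. */
    boolean tryRun(Runnable longJob) {
        if (!sem.tryAcquire()) {
            return false;            // somebody else is working: do not wait
        }
        try {
            longJob.run();
            return true;
        } finally {
            sem.release();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns { result of tryRun while a long job is running, result after it has finished }. */
    static boolean[] demo() {
        LongJobRunner runner = new LongJobRunner();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finish = new CountDownLatch(1);
        Thread worker = new Thread(() -> runner.tryRun(() -> {
            started.countDown();     // the long job has begun
            awaitQuietly(finish);    // simulate a job that takes a while
        }));
        worker.start();
        awaitQuietly(started);
        boolean whileBusy = runner.tryRun(() -> { });   // returns at once instead of blocking
        finish.countDown();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        boolean afterwards = runner.tryRun(() -> { });
        return new boolean[] { whileBusy, afterwards };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1]);
    }
}
```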

25 Any shortcut? One of the good things about Java is that it supplies many built-in synchronization mechanisms. One of the bad things is that you may be trapped if you don’t pay attention. There is a “synchronized” keyword that is meant to make a method “thread-safe”. However, if you don’t really understand it, “thread-safe” can become “thread-dead”.

26 Any shortcut? Cont. The exact meaning of “synchronized” is to make methods mutually exclusive on an object’s lock. i.e. whenever a thread is executing a “synchronized” method, it holds the lock (the “token”) of that object, and any other thread that tries to enter a “synchronized” method of the same object is blocked until the lock is released. Context switches still happen, but the waiting threads cannot proceed. For instance methods the lock belongs to the object; only for “static synchronized” methods is it class-wide. In other words, if one thread is running within ONE of the “synchronized” member methods of an object, no other thread can execute ANY “synchronized” method of that object.
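The locking behaviour of “synchronized” instance methods can be observed with a small experiment. MonitorDemo and its methods are hypothetical names. One thread stays inside a synchronized method of an object, and a second thread that calls a different synchronized method of the same object ends up in the BLOCKED state, while a call on another instance goes through immediately.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration, not code from the slides.
class MonitorDemo {
    synchronized void hold(CountDownLatch entered, CountDownLatch release) {
        entered.countDown();                 // tell the caller we are inside
        try {
            release.await();                 // stay inside, still holding this object's lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized void quick() {
        // nothing to do; entering the method is the point
    }

    /** Returns true if a second thread was blocked on the SAME object's lock. */
    static boolean sameObjectBlocks() {
        MonitorDemo shared = new MonitorDemo();
        MonitorDemo other = new MonitorDemo();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> shared.hold(entered, release));
        Thread waiter = new Thread(shared::quick);
        try {
            holder.start();
            entered.await();
            other.quick();                   // different object, different lock: returns at once
            waiter.start();
            boolean blocked = false;
            for (int i = 0; i < 500 && !blocked; i++) {   // poll for at most about 5 seconds
                blocked = waiter.getState() == Thread.State.BLOCKED;
                if (!blocked) {
                    Thread.sleep(10);
                }
            }
            release.countDown();             // let the holder leave its method
            holder.join();
            waiter.join();
            return blocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameObjectBlocks());
    }
}
```

If hold() looped forever waiting for a signal that only quick() could deliver, the two threads would never finish.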

27 Isn’t it good? It seems good as long as you don’t run a “while” loop within one of these “synchronized” methods. Please refer to the example on my website: http://www.cs.concordia.ca/~qingz_hu/MyClass.java Just imagine a TCP client thread “while”-looping in one “synchronized” method, waiting for a “signal” from a working thread that it has finished its file operation. The working thread happens to need a “synchronized” file operation method of the same object. However, as long as the TCP client thread is inside its “synchronized” method, the working thread cannot enter the other “synchronized” method. A deadlock is created here.

28 Isn’t it good? Cont. What lesson do we learn here? Don’t overuse the “synchronized” keyword. The TCP client thread will probably be the only user of that TCP socket access method. Then why worry about synchronization? So let’s remove “synchronized” from the TCP client method.

29 Is this all? Is this all there is to synchronization? Definitely not. There is another approach that is specifically for file access: the “lock”, which acts as a token to control access. There are various kinds of file locks, and they have some advantages that semaphores don’t have. Most importantly, they seem more natural to most programmers.

30 Blocking vs. Non-blocking What else should I know about synchronization? First of all, you should understand that most synchronization tools offer both blocking and non-blocking methods. Let’s take the file lock as an example to introduce this concept. Let’s create a file pointer. RandomAccessFile myFile=new RandomAccessFile("myFile.txt", "rw");

31 Blocking vs. Non-blocking cont.
//the blocking method means the program may stop there
FileLock myLock= myFile.getChannel().lock();
//alternatively, the non-blocking method means it will return immediately
FileLock myLock= myFile.getChannel().tryLock();
However, with “tryLock” you need to check for a null return value (another program holds the lock) and handle the possible exception. The advantage is obvious: you don’t want your program stuck there indefinitely.
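A hedged sketch of the non-blocking call in context. FileLockDemo, tryAppend and demo are hypothetical names and the temporary file is only for demonstration. FileChannel.tryLock() returns null when another program holds the lock and throws OverlappingFileLockException when the lock is already held inside the same Java virtual machine, so the sketch treats both cases as “could not get the lock”.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not code from the slides.
class FileLockDemo {
    /** Appends data if the lock can be taken at once; returns false instead of waiting. */
    static boolean tryAppend(Path path, byte[] data) {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw");
             FileChannel channel = file.getChannel()) {
            FileLock lock;
            try {
                lock = channel.tryLock();          // non-blocking: returns immediately
            } catch (OverlappingFileLockException e) {
                return false;                      // already locked inside this JVM
            }
            if (lock == null) {
                return false;                      // locked by another program
            }
            try {
                file.seek(file.length());          // write through the same handle that owns the lock
                file.write(data);
                return true;
            } finally {
                lock.release();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes five bytes to a temporary file under a lock; returns the file size, or -1 if locking failed. */
    static long demo() {
        try {
            Path path = Files.createTempFile("lock-demo", ".txt");
            try {
                boolean written = tryAppend(path, "hello".getBytes(StandardCharsets.US_ASCII));
                return written ? Files.size(path) : -1;
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```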

32 What is special about Java? One thing that goes against a programmer’s intuition is that the file lock in Java can be mandatory (whether it is depends on the platform), which means even the owner needs to release the lock before he can modify the file. So the code may look absurd, like the following:
FileLock myLock= myFile.getChannel().lock();
//If you reach here, you have the lock. But why should I release what I just got?
myLock.release();
//And here you have a synchronization flaw
Note the synchronization flaw above: you release the lock before you can write to the file. But what if another thread locks the file right after a context switch? It seems that we need a semaphore to protect our file lock. But then, why don’t we just use a semaphore instead of the file lock?

33 Semaphore or file lock? In principle you can always use a semaphore instead of touching the file lock. However, the file lock is a built-in mechanism and may be implemented by the OS, which can give it performance and security advantages. A semaphore is a “cooperative” synchronization tool: what if another programmer simply ignores your semaphore? What if you also want to protect your data against other processes? Can you use a semaphore? No, they are not even aware of such a semaphore! In these cases, the file lock is your only choice because it is “mandatory”.

34 Deadlock Talking about deadlock, we need to return to the blocking versus non-blocking issue. Generally speaking, you won’t get a deadlock if you use non-blocking calls carefully. See the following example. Customer A at stock broker M is trying to transfer stock S to the account of customer B at stock broker N. At the same time, customer B at stock broker N is trying to transfer stock S to the account of customer A. Since a transfer is a transaction, it has the all-or-nothing property. Before the transfer, both A and B lock their own accounts and send a transfer request to each other. The transfer managers on both sides then try to lock the destination accounts. However, those are already locked. If we are using the blocking call of the “file lock”, we are deadlocked.

35 Deadlock cont. How do we solve this deadlock problem? What if we have more complex situations, like a triangle? For example, three customers each try to transfer stock to the next one in a circular manner. The general scheme for dealing with deadlock is either to detect it before it happens or to break it after it happens.

36 Deadlock cont. Is deadlock detection easy? Imagine we have hundreds of active customers who are doing transfers continuously. Do we run our deadlock detection algorithm for every transfer request? What about a deadlock-breaking mechanism?

37 Deadlock cont. However, even if we cannot detect deadlock in advance, we can still avoid it by never waiting. Let’s use “tryLock” instead of “lock”, which is a blocking call. If “tryLock” fails (it returns null or throws an exception), we know someone else already holds the file lock and there is no way for us to finish the transfer operation. The simple choice is to abort the transaction gracefully. By doing this, we sacrifice a little throughput to save ourselves from the risk of deadlock. Of course, there are many other schemes to choose from; this just seems to be an intuitive one.
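The try-then-abort idea can be sketched inside a single program. Note the substitution: this sketch uses java.util.concurrent.locks.ReentrantLock.tryLock() as an in-process stand-in for the file lock, and StockAccount, transfer and demo are hypothetical names. Two threads transfer in opposite directions; whenever either lock is busy the transfer is aborted, so neither thread ever waits for the other.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration, not code from the slides.
class StockAccount {
    private final ReentrantLock lock = new ReentrantLock();
    private int shares;

    StockAccount(int shares) {
        this.shares = shares;
    }

    /** Moves n shares, or aborts (returns false) if either account is locked or short of shares. */
    static boolean transfer(StockAccount from, StockAccount to, int n) {
        if (!from.lock.tryLock()) {
            return false;                        // source busy: abort instead of waiting
        }
        try {
            if (!to.lock.tryLock()) {
                return false;                    // destination busy: abort, no deadlock possible
            }
            try {
                if (from.shares < n) {
                    return false;
                }
                from.shares -= n;                // all or nothing: both updates under both locks
                to.shares += n;
                return true;
            } finally {
                to.lock.unlock();
            }
        } finally {
            from.lock.unlock();
        }
    }

    int shares() {
        lock.lock();
        try {
            return shares;
        } finally {
            lock.unlock();
        }
    }

    /** Two threads transfer in opposite directions; returns the total number of shares afterwards. */
    static int demo(int attempts) {
        StockAccount a = new StockAccount(100);
        StockAccount b = new StockAccount(100);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < attempts; i++) {
                transfer(a, b, 1);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < attempts; i++) {
                transfer(b, a, 1);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.shares() + b.shares();
    }

    public static void main(String[] args) {
        System.out.println(demo(10000));
    }
}
```

Some transfers are dropped under contention, which is the throughput sacrificed in exchange for never deadlocking; a real system would typically retry or report the abort.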

38 Synchronized vs. Asynchronous Is being synchronized the goal we pursue? No, because it generally gives low throughput, and usually we prefer asynchronous mode. However, sometimes we have no choice, because correctness is always the most important issue. The less synchronization your program has, the fewer restrictions there are and the more efficient it will be.

