Presentation on theme: "CSC 580 - Multiprocessor Programming, Spring, 2011 Outline for Chapters 15, 16 & Appendix A, Week 3, Dr. Dale E. Parson."— Presentation transcript:
CSC Multiprocessor Programming, Spring, 2011 Outline for Chapters 15, 16 & Appendix A, Week 3, Dr. Dale E. Parson
Atomic Variables and Nonblocking Synchronization Unnecessary serialization of threads, and competition for shared resources by threads, lead to diminished concurrency. “Better safe than sorry” can be taken to extremes, but... “Skating on thin ice” can lead to hard-to-detect races. Ratio of scheduling overhead to useful work can be high when a lock is frequently contended (p320). Volatile variables do not support atomic compound operations. Most machine instruction sets provide test-and-set, fetch-and-increment, swap, compare-and-swap or load-linked/store-conditional instructions.
Atomic Compound Machine Instructions Operating systems and JVMs have utilized these instructions to implement locks, but have exposed them in library classes only since Java 5.0. Test-and-set retrieves the old value of a memory location, stores a new value, returns the old value. Compare-and-swap takes an expected old value and a new value as arguments. If the memory location contains the old value, it stores the new value. It always returns the old value.
Test-and-set usage for spin locking or OS / runtime locking Use memory value 0 as unlocked, 1 as locked. Test-and-set value of 1 to attempt locking. If return value is 0 Lock was acquired, use it, then set value back to 0. Notify OS or runtime if needed to unblock contenders. Else Spin on the memory location repeatedly if acquisition is typically a very short time. OR, request OS or runtime to block this thread.
Compare-and-swap usage pattern Read value A from V, derive new value B from A, then use CAS to atomically change V from A to B so long as no other thread has changed V to another value in the meantime. (p. 322) See Simulated CAS on 322. CAS can avoid locking by safely detecting interference from other threads. The question of what to do after interference – block, retry (spin), or do other work depends on the anticipated degree of contention.
CAS benefits and costs CAS is significantly more efficient than locking in zero-contention and low-contention cases. Locking entails traversing a relatively complicated code path in the JVM and may entail OS-level locking, thread suspension, and context switches. (p. 323) Primary disadvantage is that the caller must deal with contention by retrying, backing off or giving up.
Atomic Java variables java.util.concurrent.atomic, two of four groups: Scalars AtomicInteger, AtomicBoolean, AtomicLong, AtomicReference Float.floatToIntBits, Float.intBitsToFloat, Double.doubleToLongBits, Double.longBitsToDouble for floating point types. Reference uses == to compare object references, not equals(). Arrays AtomicIntegerArray, AtomicLongArray, AtomicReferenceArray Volatile access semantics for array elements, which are not supported for regular volatile arrays.
Atomic Java variables cont. java.util.concurrent.atomic, two of four groups: Field updaters AtomicIntegerFieldUpdater, AtomicLongFieldUpdater, AtomicReferenceFieldUpdater -- abstract classes Uses Java reflection and newUpdater factory method to manufacture and updater object for use by client. Fields must be volatile and independent. Compound variables AtomicMarkableReference – atomic object reference and a mark bit AtomicStampedReference – atomic object reference and an int
Locks versus atomic variables – see graphs on pages 328 and 329 At high contention levels locking tends to outperform atomic variables. Free-running, contending threads repeatedly compete for atomic resource access in tight loops. At more realistic contention levels atomic variables outperform locks. See Graphs 5 & 7 of Summer Locks have higher intrinsic overhead than atomics. In practice, atomics tend to scale better than locks because atomics deal more effectively with typical contention levels. (p 328) Relevant when there are many accesses to cross-thread resources.
Nonblocking algorithms An algorithm is called nonblocking if failure or suspension of any thread cannot cause failure or suspension of another thread. (p 329) An algorithm is called lock-free if, at each step, some thread can make progress. Algorithms that use CAS exclusively for coordination between threads can, if constructed correctly, be both nonblocking and lock-free. Nonblocking algorithms are immune to deadlock or priority inversion; starvation and livelock are possible because they can involve unbounded retries.
Nonblocking algorithms Examine nonblocking stack on page 330. Examine nonblocking linked list. In nonblocking algorithms some work is done speculatively and may have to be redone. The trick to building nonblocking algorithms is to limit the scope of atomic changes to a single variable. Examine the atomic field updaters.
Barriers and the Java Memory Model Compilers and the runtime environment may reorder instruction execution sequences. Registers and caches delay memory updates. In a single-threaded process these transformations are transparent except for speedup. Although they can impede interactive debugging. Java has within-thread as-if-serial semantics (p 337) Classic languages such as C use intrinsics & library functions that serve as barriers, causing cache flushes & inhibiting reordering across barriers. Accesses to volatile variables are not reordered. Flushing of memory updates tends to be conservative.
Rules for JVM happens-before (p. 341) Happens-before is a partial ordering. Of two events, one may happen before another, or their temporal relationship may be nondeterministic. 1.Program order rule: Each action in a thread happens-before every action in that thread that comes later in the program order. 2.Monitor lock rule: An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock. (intrinsic & explicit locks)
Rules for JVM happens-before (p. 341) 3.Volatile variable rule: A write to a volatile field happens-before every subsequent read of that same field. (atomic variables as well) 4.Thread start rule: A call to Thread.start happens- before every action in the started thread. 5.Thread termination rule: Any action in a thread happens-before any other thread detects that thread has terminated, either by successfully return from Thread.join or by Thread.isAlive returning false.
Rules for JVM happens-before (p. 341) 6.Interruption rule: A thread calling interrupt on another thread happens-before the interrupted thread detects the interrupt (either by having InterruptedException thrown, or invoking isInterrupted or interrupted). 7.Finalizer rule: The end of a constructor for an object happens-before the start of the finalizer for that object. 8.Transitivity: If A happens-before B, and B happens- before C, then A happens-before C.
Unsafe object publication The possibility of reordering in the absence of a happens- before relationship explains why publishing an object without adequate synchronization can allow another thread to see a partially constructed object. (p. 344) Unsafe publication can happen as the result of an incorrect lazy initialization. With the exception of immutable objects, it is not safe to use an object that has been initialized by another thread unless the publication happens-before the consuming thread uses it. See examples pages
Initialization safety (p. 349) Initialization safety guarantees that for properly constructed objects, all threads will see the correct values of final fields that were set by the constructor, regardless of how the object is published. Further, any variables that can be reached (only) through a final field of a properly constructed object are also guaranteed to be visible to other threads.
Nonfinal fields Initialization safety makes visibility guarantees only for the values that are reachable through final fields as of the time the constructor finishes. For values reachable through nonfinal fields, or values that may change after construction, you must use synchronization to ensure visibility. (p 350) The explicit Java Memory Model helps jut-in-time compilation find opportunities for optimization. The alternative is overly conservative optimization because of insufficient guarantees in an under-specified memory model.
Annotations for Concurrency Use these to describe a class’s intended thread-safety means that the class is immutable, means that objects of the class are adequately synchronized for multithreaded access without additional client synchronization of individual objects of the class. is the optional default.
@GuardedBy (p. documents that a field or method should be accessed only with a specific lock held, where lock may be: “this” uses the object’s intrinsic lock. “fieldname” names a Lock field or another object’s intrinsic lock “Classname.fieldname” uses a static field of a class. “methodname()” means the lock object returned by the named method. “Classname.class” means the class literal object for the named class (intrinsic lock on the Class object).