Detecting Atomicity Violations via Access Interleaving Invariants

Presentation transcript:

AVIO: Detecting Atomicity Violations via Access Interleaving Invariants. S. Lu, J. Tucek, F. Qin and Y. Zhou, UIUC

Outline: Introduction, Algorithm, Implementation, Evaluation, Conclusion

Motivation: Concurrency bugs are hard to detect. They appear only under specific interleavings, so they are hard to catch during development, or even to reproduce for post-mortem analysis. CMP architectures are only going to make matters worse.

Problem: Tools like LockSet and Happens-Before are not enough. Some buggy programs are actually race-free. They report too many false positives: data races are NOT always bugs (benign races). They require knowledge of the program's synchronization semantics; barriers etc. cause headaches.

Problem - Example: A race-free but buggy program. Data-race detection is not enough. What programmers want is not a race-free program, but atomicity! [Figure: what actually happened vs. what the programmer wanted]
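The slide's figure is not in the transcript, so here is a minimal sketch of the same kind of bug. The `Account` class, the `deposit_steps` helper and the amounts are all hypothetical. Every access to the balance holds the lock, so a race detector has nothing to report, yet the read-then-write pair is not atomic. Generators stand in for threads so that the bad interleaving is reproduced deterministically.

```python
import threading


class Account:
    """Hypothetical shared object: every access to _balance holds the lock."""

    def __init__(self, balance):
        self._lock = threading.Lock()
        self._balance = balance

    def read(self):
        with self._lock:
            return self._balance

    def write(self, value):
        with self._lock:
            self._balance = value


def deposit_steps(account, amount):
    """One 'thread' doing a deposit; the yield marks where another thread can run."""
    current = account.read()          # locked read
    yield                             # lock is released here: atomicity is broken
    account.write(current + amount)   # locked write, based on a stale value


account = Account(100)
t1 = deposit_steps(account, 10)
t2 = deposit_steps(account, 20)

next(t1)            # thread 1 reads 100
next(t2)            # thread 2 reads 100
next(t1, None)      # thread 1 writes 110
next(t2, None)      # thread 2 writes 120, overwriting thread 1's update

final_balance = account.read()
print(final_balance)  # the programmer wanted 130
```

Holding the lock across the whole read-modify-write would restore the atomicity the programmer intended.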

AVIO idea - Atomicity: ATOMICITY means the data manipulation of concurrently executed actions is equivalent to that of some serial execution of them. By using locks, the programmer is actually trying to enforce atomicity. Bugs come from violating the programmer's unconscious atomicity intention.

Access-Interleaving (AI) Invariance: If an access to a variable and the previous access to the same variable from the same thread are NEVER interleaved unserializably (example a), then the pair holds an access-interleaving invariant. Example b cannot be interleaved serializably.

Access-Invariant Violation: A violation occurs when the programmer wanted atomic access but failed to ensure it with locks or other mechanisms. [Table: atomicity violation types; surviving labels: P, I, 1: Read X, 1: Write X, 2: Write X, 2: Read X]
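Since the table of violation types did not survive, the cases can be re-derived from the atomicity definition given earlier. The sketch below is an assumption-laden model, not the authors' code: it takes a preceding access (P) and a current access (I) from one thread, places one remote access between them, and calls the interleaving serializable if it behaves the same as running the remote access entirely before or entirely after the local pair. The names `run` and `is_serializable` are made up for illustration.

```python
from itertools import product


def run(ops):
    """Execute (name, kind) accesses on one variable.

    Returns what each read observed plus the final value. Each write
    stores its own name, so different writes are distinguishable.
    """
    value = "init"
    seen = {}
    for name, kind in ops:
        if kind == "R":
            seen[name] = value
        else:
            value = name
    return seen, value


def is_serializable(p_kind, r_kind, i_kind):
    """True if P, remote, I is equivalent to some serial order."""
    p, r, i = ("P", p_kind), ("remote", r_kind), ("I", i_kind)
    interleaved = run([p, r, i])
    return interleaved in (run([r, p, i]), run([p, i, r]))


# Enumerate all 8 (P, remote, I) combinations of reads and writes.
unserializable = [c for c in product("RW", repeat=3) if not is_serializable(*c)]
for case in unserializable:
    print(case)
```

Under this model four of the eight combinations come out unserializable: read/write/read, read/write/write, write/read/write and write/write/read (local, remote, local).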

Access-Invariant Violation (cont)

Capturing Atomicity Intention: AVIO automatically extracts AI invariants. Run the program N times; N different interleavings will probably be produced. If the atomicity of two consecutive accesses to the same variable from the same thread has been violated more than T times, assume that the programmer does NOT want atomic access. A threshold T greater than zero keeps a bug that shows up during training from being accepted as intended behavior.
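The training step can be sketched as a simple counting pass. This is an illustration under stated assumptions, not AVIO's implementation: access pairs are identified by hypothetical (P-instruction address, I-instruction address) tuples, each training run is reduced to the list of pairs that were interleaved unserializably in that run, and the threshold is read as "more than T violations", which is the reading under which T = 0 is meaningful.

```python
from collections import Counter


def extract_ai_invariants(candidates, training_runs, threshold=0):
    """Keep the access pairs whose atomicity was violated at most `threshold` times."""
    violations = Counter()
    for run in training_runs:
        violations.update(run)
    return {pair for pair in candidates if violations[pair] <= threshold}


# Hypothetical instruction-address pairs (P-instruction, I-instruction).
candidates = {("0x10", "0x18"), ("0x20", "0x2c"), ("0x30", "0x34")}

# Each run lists the pairs that were interleaved unserializably in that run.
training_runs = [
    [],
    [("0x20", "0x2c")],
    [("0x20", "0x2c"), ("0x30", "0x34")],
]

strict = extract_ai_invariants(candidates, training_runs, threshold=0)
lenient = extract_ai_invariants(candidates, training_runs, threshold=1)
print(sorted(strict))
print(sorted(lenient))
```

With `threshold=0`, a single violation during training drops the pair, so a real bug that fires in training is silently accepted. With `threshold=1`, the pair violated only once is still treated as an invariant.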

Capturing Atomicity Intention

AVIO Implementation: Two implementations, hardware and software. The hardware implementation is fast, but coarse-grained. The software implementation is fine-grained, but very slow.

AVIO-H(ardware): CMP with physically indexed (?) private L1 caches and a unified L2. Invalidation-based cache coherence protocol. Minimal changes to the L1 (0.4% space increase) and to the coherence protocol. A bit is added to I-instructions to identify them and notify the L1.

AVIO-H cont.: INV=1: someone recently wrote to this line and invalidated my copy. DG=1: someone downgraded my exclusive rights. PI: type of the previous instruction of the same thread accessing the same line. Note: an unserializable access must contact the L2 anyway, either to get a copy or to upgrade its rights, so the latency of the check can be hidden.
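The slide does not spell out how the three pieces of per-line state combine, so the following is only one plausible reading. It assumes INV=1 means a remote write happened since the previous local access, DG=1 means a remote read happened after a local write, and a violation is any of the four unserializable read/write patterns. The function name `avio_h_flags_violation` is hypothetical.

```python
def avio_h_flags_violation(prev_kind, curr_kind, inv, dg):
    """Decide whether the current access (I-instruction) completes an
    unserializable interleaving.

    prev_kind: PI, type of the previous same-thread access ("R" or "W")
    curr_kind: type of the current access ("R" or "W")
    inv:       INV bit, a remote write invalidated the line in between
    dg:        DG bit, a remote read downgraded exclusive rights in between
    """
    both_writes = prev_kind == "W" and curr_kind == "W"
    if inv:
        # Remote write in between: only write/write/write is serializable.
        return not both_writes
    if dg:
        # Remote read in between: only write/read/write is unserializable.
        return both_writes
    # No remote access observed on this line: nothing to report.
    return False


print(avio_h_flags_violation("R", "R", inv=True, dg=False))   # read, remote write, read
print(avio_h_flags_violation("W", "W", inv=False, dg=True))   # write, remote read, write
print(avio_h_flags_violation("W", "W", inv=True, dg=False))   # write, remote write, write
```

Because the bits are kept per cache line rather than per variable, this check works at cache-line granularity.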

AVIO-H drawbacks: False sharing: use padding, or increase granularity (at the expense of space). Cache evictions, context switches, the load-store queue, write coalescing: these happen too rarely, and are improbable between two accesses that were supposed to be atomic, so they are swept under the rug. It will catch violations caused by OoO issues; wait until retirement to raise the exception. PREFETCH? What if someone prefetches X (and downgrades my rights) and does not use it?

AVIO-S(oftware): Built on the Pin binary instrumentation tool. More precise than AVIO-H, and deals with the problems of the latter. Obviously, far slower.

Evaluation - Platform: AVIO-S runs on 4 Intel processors (which ones?). AVIO-H runs on Simics/SimFlex: 4-core CMP, in-order x86, cycle-accurate; 0.4% slowdown due to increased L1 size. T = 0, with 100 training runs (with different inputs).

Evaluation – Applications tested

Evaluation – Bug Detection: Detects more bugs than the other methods. Note that AVIO was trained on different inputs.

Evaluation – False Positives: Fewer false positives than all other methods. AVIO-S is better than AVIO-H. T = 0 can mask bugs, if the bugs occurred during training runs.

Evaluation - Overhead: AVIO-H has minimal overhead (0.5%). AVIO-S does better than the other methods (25x).

Evaluation – Training Sensitivity: 100 training runs seem to be enough.

AVIO Drawbacks: Mainly the same problems as profilers: if the initial input is not chosen well, or not all paths are exercised, AVIO will fail. Unable to detect atomicity violations that involve multiple variables (like most tools). Training overhead (software) could be problematic.

Conclusion: AVIO is a fast tool for detecting atomicity violations. AVIO-H may be worth its hardware cost, as long as the training overhead (not reported) remains reasonable. Questions?