MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu


1 MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs
Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou
University of Illinois

2 Bugs are bad!
Software bugs are costly!
  Account for 40% of system failures [Marcus2000]
  Cost the US economy $59.5 billion annually [NIST]
Techniques to improve program correctness are desired

3 Software bug categories
Memory bugs
  Improper memory accesses and usage
  Many studies and effective detection tools
Semantic bugs
  Violations of the design requirements or programmer intentions
  Biggest part (~80%*) of software bugs
  No silver bullet
Concurrency bugs
  Wrong synchronization in concurrent execution
  Increasingly important as concurrent programs become pervasive
  Hard to detect
* Have Things Changed Now? An Empirical Study of Bug Characteristics in Modern Open Source Software [ASID'06]

4 An important type of semantic information
Software programs contain many variables
Variables are NOT isolated
  Semantic bonds exist among variables
  Correct programs consistently access correlated variables
[Figure: variables x, y, z, s, t, u, v, w linked by variable access correlations]

5 Variable correlation in programs
Semantic correlation widely exists among variables
Different aspects (Linux):
  struct fb_var_screeninfo { … int red_msb; int blue_msb; int green_msb; int transp_msb; }
Different representation (Linux):
  struct net_device_stats { … long rv_packets; long rv_bytes; }
Implementation demand (MySQL):
  struct st_test_file * cur_file; struct st_test_file * file_stack;
Constraint specification (MySQL):
  class THD { … char* db; int db_length; }

6 Variable access correlation (constraint)
Maintaining correlation usually needs consistent access
Variable access correlation: A1 ( x ) ⇒ A2 ( y ), where A1 and A2 are each read, write, or access*
*access: read or write
Correlated variables shown: db and db_length; red/…/transp; rv_packets and rv_bytes; file_stack and cur_file
Correlation forms shown: write ( ) ⇒ write ( ); write ( ) ⇒ access* ( ); write ( ) ⇒ write ( ); access ( ) ⇒ access ( )

7 Violating the correlations leads to bugs: inconsistent update bugs
Programmers may forget to access correlated variables
A type of semantic bug not handled by previous tools
[Figure: code example, confirmed by Linux developers. Correlated variables: mostly consistent access is correct; inconsistent access is a BUG!]
More examples of inconsistent update bugs are in our paper.

8 Violating the correlations leads to bugs (ii): multi-variable concurrency bugs
Programmers may forget to synchronize concurrent accesses to correlated variables
This is NOT a traditional data race bug
  Bug occurs even if accesses to each single variable are well synchronized
Example (Mozilla):
  struct JSCache { … JSEntry table[SIZE]; bool empty; }
  Thread 1: js_FlushPropertyCache ( … ) { memset ( cache->table, 0, SIZE ); … cache->empty = TRUE; }
  Thread 2: js_PropertyCacheFill ( … ) { cache->table[indx] = obj; … cache->empty = FALSE; }
  Each access to table is wrapped in lock ( T ) / unlock ( T ); each access to empty is wrapped in lock ( E ) / unlock ( E )
  BUG: no lock is held across the accesses to table and empty together

9 Our contribution
A technique to automatically infer variable access correlation
Bug detection based on variable access correlation
  Inconsistent-update semantic bugs
  Multi-variable concurrency bugs
Disclose correlations and new bugs from real-world applications (Linux device drivers, Mozilla, MySQL, PostgreSQL)
  > 6000 variable correlations
  39 new inconsistent-update semantic bugs
  4 new multi-variable concurrency bugs from Mozilla

10 Outline
Motivation
What is variable access correlation
MUVI variable access correlation inference
MUVI bug detection
  Inconsistent-update semantic bug detection
  Multi-variable concurrency bug detection
Evaluation
Conclusions

11 Basic idea of correlation inference
Our target: access correlation A1 ( x ) ⇒ A2 ( y )
  x and y appear together many times
  x and y seldom appear separately
Our inference method: statistically infer access correlation based on variable access patterns in source code
Assumption: mature program, mostly correct
How to judge "together"?
  Our metric: static code distance within a function scope
  Our paper talks about other potential metrics
How to do this efficiently?

12 Frequent itemset mining
A common data mining technique
Itemset: a set of items (no order), e.g. (v, w, x, y, z)
Sub-itemset: e.g. (w, y)
Goal: find frequent sub-itemsets in an itemset database
Support: number of appearances, e.g. the support of (w, y) is 3
Frequent: support > threshold
Itemset database:
  (v, x, m, n)
  (v, w, y, t)
  (v, w, y, z, s)
  (v, w, x, y, z)
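
As a quick check of the support definition, the following minimal Python sketch counts the support of every two-item sub-itemset in the four-itemset database from the slide. It is a brute-force pair count written for illustration, not the mining algorithm MUVI uses; the function names pair_supports and frequent_pairs and the threshold value 2 are assumptions.

```python
from collections import Counter
from itertools import combinations

# The four itemsets shown on the slide; order inside an itemset does not matter.
database = [
    {"v", "x", "m", "n"},
    {"v", "w", "y", "t"},
    {"v", "w", "y", "z", "s"},
    {"v", "w", "x", "y", "z"},
]

def pair_supports(itemsets):
    """Count, for every 2-item sub-itemset, how many itemsets contain it."""
    support = Counter()
    for itemset in itemsets:
        # Sorting gives each pair one canonical key, e.g. ("w", "y").
        for pair in combinations(sorted(itemset), 2):
            support[pair] += 1
    return support

def frequent_pairs(itemsets, threshold):
    """Keep the pairs whose support is strictly greater than the threshold."""
    return {p: s for p, s in pair_supports(itemsets).items() if s > threshold}

supports = pair_supports(database)
print(supports[("w", "y")])         # support of (w, y)
print(frequent_pairs(database, 2))  # pairs contained in more than 2 itemsets
```

Enumerating all pairs grows quadratically with itemset size, which is why real miners prune the search space instead of counting exhaustively.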

13 Flowchart of variable correlation inference
Source files → Pre-processing → Itemset database → Mining → Frequent variable sets → Post-processing → Variable access correlation
How?

14 MUVI inference algorithm (pre-process)
Program source code → ? → Itemset database
What is an item? A variable
What is an itemset? A function
What to put into an itemset?
  Accessed variables
  Access type (read/write)

15 MUVI inference algorithm (pre-process)
Input: program
Output: an itemset database
Flow-insensitive, inter-procedural analysis
  Consider global variables and structure-typed variables
  Also consider variables accessed in callee functions
Example program:
  int x; f1 ( ) { read x; }
  f2 ( ) { S t; write t.y; }
  int z; f3 ( ) { read z; f1 ( ); f2 ( ); }
Database:
  f1: {read, x}
  f2: {write, S::y}
  f3: {read, z}, plus the items of its callees f1 ({read, x}) and f2 ({write, S::y})
  …
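
The f1/f2/f3 example can be reproduced with a small Python sketch. It assumes the per-function accesses and the call graph have already been extracted from source code (that extraction is not shown), and it folds in only the direct accesses of direct callees; whether MUVI follows calls more deeply is not stated on the slide. The names direct_accesses, calls, and build_itemset_database are hypothetical.

```python
# Direct accesses per function, as (access type, variable) items.
direct_accesses = {
    "f1": {("read", "x")},
    "f2": {("write", "S::y")},   # write t.y, where t has struct type S
    "f3": {("read", "z")},
}

# Call graph: f3 calls f1 and f2.
calls = {"f1": [], "f2": [], "f3": ["f1", "f2"]}

def build_itemset_database(direct_accesses, calls):
    """Return one itemset per function: its own items plus its direct callees' items."""
    database = {}
    for func, own_items in direct_accesses.items():
        items = set(own_items)
        for callee in calls.get(func, ()):
            # Assumption: one level of callees only, so recursion cannot loop.
            items |= direct_accesses.get(callee, set())
        database[func] = items
    return database

database = build_itemset_database(direct_accesses, calls)
for func in sorted(database):
    print(func, sorted(database[func]))
```

Keeping the access type inside each item is what later lets the post-processing step separate write/write from write/read correlations.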

16 MUVI inference algorithm (post-process)
Input: frequent variable sets (x, y), which appear together in many functions
Pruning
  What if x and y appear separately many times? Prune out low-confidence (conditional probability) pairs
  What if x is too popular, e.g. stderr, stdout?
Categorize based on access type
  write ( x ) ⇒ write ( y )? Or write ( x ) ⇒ read ( y )? etc.
Output: variable correlation A1 ( x ) ⇒ A2 ( y )
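
A minimal Python sketch of the post-processing idea follows. It treats confidence as the conditional probability support(A1(x), A2(y)) / support(A1(x)) and drops variables that appear in too large a fraction of functions. The thresholds, the popularity rule, the function name infer_correlations, and the toy database (function names g1 to g5) are all assumptions made for illustration; the slide only poses the popularity question without giving a rule.

```python
from itertools import permutations

def infer_correlations(database, min_support, min_confidence, max_popularity):
    """Return {(a1, x, a2, y): confidence} for candidate rules A1(x) => A2(y)."""
    n_funcs = len(database)
    item_count = {}  # functions containing each (access type, variable) item
    var_count = {}   # functions touching each variable at all
    pair_count = {}  # functions containing both items of an ordered pair
    for items in database.values():
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for var in {v for _, v in items}:
            var_count[var] = var_count.get(var, 0) + 1
        for left, right in permutations(items, 2):
            if left[1] != right[1]:  # only pairs of different variables
                pair_count[(left, right)] = pair_count.get((left, right), 0) + 1

    rules = {}
    for (left, right), support in pair_count.items():
        if support < min_support:
            continue
        # Assumed popularity filter: skip variables seen in almost every function.
        if max(var_count[left[1]], var_count[right[1]]) / n_funcs > max_popularity:
            continue
        confidence = support / item_count[left]  # P(right item | left item)
        if confidence >= min_confidence:
            rules[(left[0], left[1], right[0], right[1])] = confidence
    return rules

# Toy itemset database with made-up function names.
database = {
    "g1": {("write", "db"), ("write", "db_length"), ("write", "stderr")},
    "g2": {("write", "db"), ("write", "db_length"), ("write", "stderr")},
    "g3": {("write", "db"), ("write", "db_length"), ("write", "stderr")},
    "g4": {("read", "db"), ("write", "stderr")},
    "g5": {("write", "stderr")},
}
rules = infer_correlations(database, min_support=3, min_confidence=0.8, max_popularity=0.9)
print(rules)
```

In this toy input stderr is written by every function, so all rules involving it are pruned even though their support is high.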

17 Outline
Motivation
MUVI variable access correlation inference
MUVI bug detection
  Inconsistent-update semantic bug detection
  Multi-variable concurrency bug detection
Evaluation
Conclusions

18 Inconsistent-update bug detection
Step 1: get all write ( x ) ⇒ acc ( y ) correlations
Step 2: get all violations of the above correlations
Step 3: prune out unlikely bugs
  Code analysis to check caller and callee functions
Example: write ( fb_var_screeninfo::blue_msb ) ⇒ access ( fb_var_screeninfo::transp_msb )
  #support = 11
  #violation = 1 (function neofb_check_var) ⇒ inconsistent-update bug
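
Steps 1 and 2 can be sketched in Python as a scan over the itemset database for functions that write x without accessing y. The database below is fabricated to match the counts on the slide: the eleven supporting function names are hypothetical, and only neofb_check_var comes from the slide. Step 3 (caller and callee analysis) is not covered.

```python
BLUE = "fb_var_screeninfo::blue_msb"
TRANSP = "fb_var_screeninfo::transp_msb"

# Hypothetical database: 11 made-up functions that write blue_msb and also
# touch transp_msb, plus neofb_check_var, which the slide reports as the violation.
database = {
    f"hypothetical_check_var_{i}": {("write", BLUE), ("write", TRANSP)}
    for i in range(11)
}
database["neofb_check_var"] = {("write", BLUE)}

def check_write_access_correlation(database, x, y):
    """Split the functions that write x into supporters and violators of write(x) => access(y)."""
    supporters, violators = [], []
    for func, items in sorted(database.items()):
        if ("write", x) not in items:
            continue  # the correlation says nothing about this function
        accesses_y = any(var == y for _, var in items)  # read or write both count
        (supporters if accesses_y else violators).append(func)
    return supporters, violators

supporters, violators = check_write_access_correlation(database, BLUE, TRANSP)
print(len(supporters), violators)
```

A violator list that is short relative to the supporter count is what makes a report worth inspecting; a correlation with many violators would more likely have been pruned as low confidence.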

19 Multi-variable concurrency bug detection: MUVI lock-set algorithm
Original algorithm
  Look for common locks among conflicting accesses to each shared variable
MV lock-set algorithm
  Look for common locks among conflicting accesses to each shared variable and their correlated accesses
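
The difference between the two lock-set checks can be illustrated with a simplified Python sketch applied to the JSCache example (table protected by lock T, empty protected by lock E). It only intersects the sets of locks held at each access and omits the state machine and read/write distinctions of a real lock-set detector; the trace format and the function name common_locks are assumptions.

```python
def common_locks(accesses, variables):
    """Intersect the locks held at every access to any of the given variables."""
    common = None
    for var, locks in accesses:
        if var in variables:
            common = set(locks) if common is None else common & set(locks)
    return common if common is not None else set()

# Simplified trace of both threads: (variable accessed, locks held at that access).
buggy_trace = [
    ("table", {"T"}), ("empty", {"E"}),  # js_FlushPropertyCache
    ("table", {"T"}), ("empty", {"E"}),  # js_PropertyCacheFill
]

# Original check: each variable on its own has a consistent lock, so no report.
print(common_locks(buggy_trace, {"table"}))  # {'T'}
print(common_locks(buggy_trace, {"empty"}))  # {'E'}

# MV check: table and empty are correlated, so they are checked together.
print(common_locks(buggy_trace, {"table", "empty"}))  # set(): report a bug

# A hypothetical fix that holds one lock L across both accesses passes the MV check.
fixed_trace = [
    ("table", {"L"}), ("empty", {"L"}),
    ("table", {"L"}), ("empty", {"L"}),
]
print(common_locks(fixed_trace, {"table", "empty"}))  # {'L'}
```

An empty result for the correlated set, despite non-empty results for each variable alone, is the signature of the multi-variable bug described on the Mozilla slide.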

20 Multi-variable concurrency bug detection: other MUVI extension algorithms
MUVI happens-before algorithm
  Original: check the happens-before relation among conflicting accesses to each single variable
  Extended: check the happens-before relation among conflicting accesses to each single variable and correlated accesses
Other extensions
  Extending hybrid race detection
  Extending atomicity violation bug detection

21 Outline
Motivation
MUVI variable access correlation inference
MUVI bug detection
  Inconsistent-update semantic bug detection
  Multi-variable concurrency bug detection
Evaluation
Conclusions

22 Methodology
For variable correlation and inconsistent-update bug detection:
  Linux (device drivers), Mozilla, MySQL, PostgreSQL
For multi-variable concurrency bug detection:
  Five existing real bugs from Mozilla and MySQL
  All latest versions
  Found four new multi-variable concurrency bugs during the detection process

23 Results on correlation inference
App.         #Access-Correlations   #Involved Variables   %False Positives   Analysis Time
Mozilla      …                      …                     …%                 157m
MySQL        …                      …                     …%                 19m
Linux        …                      …                     …%                 175m
PostgreSQL   …                      …                     …%                 98m
Callouts: macros and inline functions; coincidence

24 Inconsistent-update bug detection results
App.         # of MUVI bug reports   # of new bugs found   # of bad programming   # of false positives
Linux        40                      22 (12)               5                      13
Mozilla      30                      7 (0)                 8                      15
MySQL        20                      9 (5)                 3                      8
PostgreSQL   10                      1 (0)                 4                      5
Callouts: semantic exceptions; wrong correlations; no future read access

25 Multi-variable concurrency bug detection results
MV-Lockset:
Bug          Detect Bug?   False Positives
Moz-js1      Y             1
Moz-js2      Y             2
Moz-imap     Y             0
MySQL-log    Y             3
MySQL-blog   N             0
MV-Happens-Before has similar results
Callouts: variables are conditionally correlated; the correlation is missed by MUVI

26 Multi-variable concurrency bug detection results
4 new multi-variable concurrency bugs detected!
[Figure annotation: Wrong result!]

27 Conclusion
Variable access correlations can be inferred
Variable access correlation is important
  Helps detect two types of bugs
  Other usage
    Provide specifications to ease programming
    Provide hints for assigning locks or TMs, e.g. AtomicSet, AutoLocker, Colorama

28 Related work
Program specification inference: [ErnstICSE00], [EnglerSOSP01], [KremenekOSDI06], [LiblitPLDI03], [WhaleyISSTA02], [YangICSE06], etc.
Code pattern mining: [LiOSDI04], [LiFSE05], [LivshitsFSE05], etc.
Concurrency bug detection: [ChoiPLDI02], [EnglerSOSP03], [FlanaganPOPL04], [SavageTOCS97], [Praun01], [XuPLDI05], [YuSOSP05], etc.
Techniques for easing concurrent programming: [Harris03], [HerlihyISCA93], [McCloskeyPOPL06], [Rajwar02], [Hammond04], [Moore06], [Rossbach07], etc.

29 Acknowledgement
Prof. Stefan Savage (shepherd)
Anonymous reviewers
Prof. Liviu Iftode
Google student travel grant
NSF, DOE, Intel research grants

30 Thanks!

