Code Learning and Transfer for Automatic Patch Generation

Fan Long MIT EECS & CSAIL

Generate And Validate Patching
A standard way to deal with this situation has emerged: generate-and-validate patching.

Generate Candidate Patches
Suspicious Statements:
- line 340 at foo.c
- line 338 at foo.c
- line 337 at foo.c
- ...
Apply Modifications -> Search Space of Candidate Patches

Anatomy of a Modification
Here is one example modification that Prophet applies:
Statement in Original Unpatched Program: if (C) {…} else {…}
Statement in Patched Program: if (C && E) {…} else {…}
E is a clause of the form “exp == c” or “exp != c”, where exp is a variable or field access and c is a constant.
Possibilities for E in control flow modifications:
1) E is a local variable, global variable, or structure access path. A local variable must occur in the same function. A global must be accessed somewhere in the file. For a structure access path, the path before the last member access should occur in the same basic block (for a.b->c, a.b occurs in the same basic block). Synthesized conditions check E == K or E != K for some value K that E takes during a negative test case.
2) For tighten and loosen, also try clauses CL that appear in C, both CL and !CL. Split C on && and ||, and try each alternative A and !A. For machine learning ranking, put these in a bin with 3) below; a feature says there is an abstract expression.
3) Also try 0 for tighten and insert guard. For loosen, try 1.
Goto L: L is in the same function.
Return K: K is a constant (+0) that appears in the same function.
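The clause synthesis in item 1) can be sketched in C. This is only an illustration under stated assumptions: the `Target` record, the `synthesize_clauses` name and the fixed clause length are hypothetical, and the observed values are assumed to have been recorded beforehand by some instrumentation that is not shown.

```c
#include <stdio.h>

#define CLAUSE_LEN 64

/* Hypothetical record: one expression plus the values it took on a negative test run. */
typedef struct {
    const char *expr;      /* a variable or access path, e.g. "p->len" */
    const long *observed;  /* values K recorded for expr */
    int n_observed;
} Target;

/* Writes "expr == K" and "expr != K" for each observed K; returns the number of clauses. */
int synthesize_clauses(const Target *t, char out[][CLAUSE_LEN], int max_out)
{
    int n = 0;
    for (int i = 0; i < t->n_observed && n + 2 <= max_out; i++) {
        snprintf(out[n++], CLAUSE_LEN, "%s == %ld", t->expr, t->observed[i]);
        snprintf(out[n++], CLAUSE_LEN, "%s != %ld", t->expr, t->observed[i]);
    }
    return n;
}
```

Each emitted clause is one possible E for the tighten, loosen and guard modifications; the sketch does not decide which expressions are in scope.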

Other Modifications
Here are the remaining modifications that Prophet applies to a statement S:
Loosen: if (C) {…} else {…} -> if (C || E) {…} else {…}
Insert guard: S -> if (E) { S }
Guarded return: S -> if (E) return c; S
Replace: S -> S[replace v1 with v2]
Copy & Replace: S -> Q[replace v1 with v2]; S
Initialize: S -> memset(&e, 0, sizeof(e)); S
This is an empirical set of modification transforms. Each modification transform is a hypothesis about an error the developer may have made when they wrote the code. The goal of the transform is to correct that error. Modifications manipulate the program at the granularity of expressions.
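To make the shape of the control flow modifications concrete, here is a minimal C sketch that renders patched statement text from the pieces C, E and S. The `Transform` enum and `render_patch` function are hypothetical names, the returned constant is fixed to 0 for simplicity, and a real system would rewrite an AST rather than format strings.

```c
#include <stdio.h>

typedef enum { TIGHTEN, LOOSEN, INSERT_GUARD, GUARDED_RETURN } Transform;

/*
 * Renders the patched statement text into buf.
 * c: existing condition C (ignored by the two guard transforms),
 * e: new clause E, s: the statement or branch body S.
 * Returns snprintf's result, or -1 for an unknown transform.
 */
int render_patch(Transform t, const char *c, const char *e, const char *s,
                 char *buf, size_t cap)
{
    switch (t) {
    case TIGHTEN:        return snprintf(buf, cap, "if ((%s) && (%s)) %s", c, e, s);
    case LOOSEN:         return snprintf(buf, cap, "if ((%s) || (%s)) %s", c, e, s);
    case INSERT_GUARD:   return snprintf(buf, cap, "if (%s) { %s }", e, s);
    case GUARDED_RETURN: return snprintf(buf, cap, "if (%s) return 0; %s", e, s);
    }
    return -1;
}
```

The extra parentheses around C and E keep operator precedence intact when either side itself contains && or ||.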

Validate Candidate Patches
… p->f1 = y;
[Figure: Positive Inputs and Negative Inputs, each marked = or ≠]
Generate a search space of candidate patches

… p->f1 = z;   (candidate: y replaced with z)
[Figure: Positive Inputs and Negative Inputs, each marked = or ≠]
Validate each candidate patch against the test suite

… p->f2 = y;   (candidate: f1 replaced with f2)
[Figure: Positive Inputs and Negative Inputs, each marked = or ≠]
Validate each candidate patch against the test suite

… if (p == 0) return; p->f1 = y;
[Figure: Positive Inputs and Negative Inputs, all marked =]
Collect all of the patches that validate
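The validation step can be sketched as a small C harness. Everything here is an assumption made for illustration: candidates are modeled as functions that replace the statement `p->f1 = y;`, the `TestCase` layout is hypothetical, and candidates run in-process, whereas a real system would isolate each run so that a crashing candidate cannot take the harness down.

```c
#include <stddef.h>

typedef struct { int f1; } Node;

/* A candidate patch, modeled as a replacement for the statement "p->f1 = y;". */
typedef void (*Candidate)(Node *p, int y);

typedef struct {
    int use_null;     /* 1: pass p == NULL, the input that crashed the original */
    int y;
    int expected_f1;  /* compared only when use_null == 0 */
} TestCase;

/* Guarded candidate: skips the store when p is null. */
void cand_guard(Node *p, int y) { if (p == 0) return; p->f1 = y; }

/* Candidate that drops the store entirely. */
void cand_skip(Node *p, int y) { (void)p; (void)y; }

/* Returns 1 if the candidate produces the expected result on every test case. */
int validates(Candidate c, const TestCase *tests, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        Node node = { -1 };
        c(tests[i].use_null ? NULL : &node, tests[i].y);
        if (!tests[i].use_null && node.f1 != tests[i].expected_f1)
            return 0;
    }
    return 1;
}
```

With a suite that holds both a positive case and the null case, only `cand_guard` validates. With a suite that holds only the null case, `cand_skip` validates as well, which is how an incomplete test suite lets an incorrect patch through.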

Challenges for Automatic Patch Generation
Only specification is the test suite
- Many validated but incorrect patches
- How to prioritize correct patches ahead of them?
We are going to be able to patch systems with millions of lines of code. But you may generate patches that validate yet have negative effects. How do we make patch generation systems work well in the presence of these patches?

A Validated but Incorrect Patch
… return; p->f1 = y;
[Figure: Positive Inputs and Negative Inputs, all marked =]
Because the test suite is incomplete!

Negative Effects of Validated but Incorrect Patches
- Remove functionality
- Generate incomplete bug fix
- Introduce vulnerability (CVE)
Validated but incorrect patches could have negative effects.
Code shown (tif_dirread.c):
    tsize_t offset = dir->tdir_offset + cc;
    if ((tsize_t)dir->tdir_offset != offset - cc || offset > (tsize_t)tif->tif_size) goto bad;

    if (dir->tdir_offset + cc > tif->tif_size) goto bad;
Labels on the slide: Original code: tif_dirread.c; Developer patch

Negative Effects of Validated but Incorrect Patches
- Remove functionality
- Generate incomplete bug fix
- Introduce vulnerability (CVE)
Same tif_dirread.c code as on the previous slide.
Labels on the slide: A validated but incorrect patch; Developer patch
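Why a bounds check of the form `offset + cc > size` is fragile can be shown with a small C sketch. It uses `uint32_t` so that wraparound is well defined. The function names are hypothetical and the overflow-aware variant is one common formulation, not the libtiff code from the slide.

```c
#include <stdint.h>

/* Unguarded bounds check: offset + cc can wrap around in 32-bit arithmetic. */
int naive_out_of_bounds(uint32_t offset, uint32_t cc, uint32_t size)
{
    return (uint32_t)(offset + cc) > size;
}

/* Overflow-aware variant: never forms a sum that could wrap. */
int checked_out_of_bounds(uint32_t offset, uint32_t cc, uint32_t size)
{
    return cc > size || offset > size - cc;
}
```

For an offset near the top of the 32-bit range the naive sum wraps to a small number and the read is accepted, while the overflow-aware variant rejects it. A test suite with no such input cannot tell the two checks apart.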

Challenges for Automatic Patch Generation
Only specification is the test suite
- Many validated but incorrect patches
- How to prioritize correct patches ahead of them?
Search space explosion
- Tradeoff between coverage and tractability
- What mutation transforms should we use?

Learning-based Patch Generation
Training Set of Past Human Patches -> Code Learning Systems -> Learned Human Knowledge
A Program With Bug + Learned Human Knowledge -> Generate-and-validate Patch Generation -> Result Patches

The Growing Number of Existing Programs
GitHub Open Source Projects
It is not surprising that we now have more software programs than ever before. There are 12 million projects hosted on GitHub, and this number is growing rapidly. And this is just one of the repository hosting sites. If you count all software programs in the world, the number is much bigger. There is an enormous amount of software in the world, with more coming every day. While this software is very valuable and does great things for us, I’m here to tell you that this software is of variable quality. It often contains bugs that cause the software to give the wrong result or crash. And security vulnerabilities are always a prominent problem.

How Can Learned Human Knowledge Help?
Better patch prioritization:
- Prophet [POPL’16, Long and Rinard]
- HDRepair [SANER’16, Le et al.]
Better search spaces:
- Genesis [MIT-TR , Long et al.]

Prophet Patch Generation System
Test Suite (Inputs, Correct Outputs) -> Prophet -> Ranked list of patches, all of which pass the test suite -> Developer chooses correct patch
Key issue here: for many bugs, there can be tens or even hundreds of patches that pass the test suite but are nevertheless incorrect.
Goal: rank correct patch first

Prophet: Key Insights
- Correct patches share universal features that hold across applications
- These features capture interactions between the patch and the surrounding code
Hypothesis: these features can be learned across applications. Many people do not believe this; we tested this hypothesis in our experiments.

Example Features
Let's go back to our example.
    PHP_METHOD(DatePeriod, __construct)
    {
        char* str = NULL;
        …
        if (parse(&str, &str_len)==FAIL) return;
        if (str_len || str != 0) {
            initialize(str, str_len, …);
            …
        } else {
            …
        }
        …
    }

Example Features
(Same PHP_METHOD(DatePeriod, __construct) code as on the previous slide; the variable in question is str.)
Atomic characteristic (patch): A variable is checked by a condition.
Atomic characteristic (original code): The variable is also a call parameter at the current statement.
Co-occurrence pairs: <checked, call para/C>

Atomic characteristic (original code): The variable is also address-taken before in the original code.
Co-occurrence pairs: <checked, call para/C> <checked, addr taken/B>

Atomic characteristic (original code): The variable is also a local variable.
Co-occurrence pairs: <checked, call para/C> <checked, addr taken/B> <checked, local var>
The patch and the original code interact in a meaningful way.

Atomic characteristic (original code): The variable is also a pointer.
Co-occurrence pairs: <checked, call para/C> <checked, addr taken/B> <checked, local var> <checked, pointer>
Prophet extracts all of these co-occurrence pairs of atomic characteristics as features. Each pair indicates an interaction between the patch and the original code.
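The co-occurrence idea can be sketched as a cross product of two small sets of atomic characteristics, yielding a binary feature vector. This is a simplified illustration: the enum names and bitmask encoding are hypothetical, only one patch characteristic is modeled, and the positional tags seen in the pairs above (/C, /B) are left out.

```c
#include <string.h>

/* Atomic characteristics of the patch (hypothetical subset). */
enum { P_CHECKED, P_COUNT };

/* Atomic characteristics of the original code (hypothetical subset). */
enum { O_CALL_PARAM, O_ADDR_TAKEN, O_LOCAL_VAR, O_POINTER, O_GLOBAL_VAR, O_COUNT };

#define N_FEATURES (P_COUNT * O_COUNT)

/*
 * Sets features[p * O_COUNT + o] to 1 for every pair where the variable has
 * patch characteristic p and original-code characteristic o.
 * Masks hold one bit per characteristic. Returns the number of features set.
 */
int cooccurrence_features(unsigned patch_mask, unsigned orig_mask,
                          int features[N_FEATURES])
{
    int set = 0;
    memset(features, 0, sizeof(int) * N_FEATURES);
    for (int p = 0; p < P_COUNT; p++) {
        for (int o = 0; o < O_COUNT; o++) {
            if (((patch_mask >> p) & 1u) && ((orig_mask >> o) & 1u)) {
                features[p * O_COUNT + o] = 1;
                set++;
            }
        }
    }
    return set;
}
```

For the variable str in the example, the patch mask would carry the checked bit and the original-code mask the call parameter, address-taken, local variable and pointer bits, giving four features.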

Example Features
Two candidate patches for the PHP_METHOD(DatePeriod, __construct) code shown earlier, differing only in the patched condition:
    if (str_len || str != 0) { initialize(str, str_len, …); … } else { … }
    if (str_len || ht == 1) { initialize(str, str_len, …); … } else { … }
Co-occurrence pairs:
<checked, call para/C> <checked, addr taken/B> <checked, local var> <checked, pointer>
<checked, global var>
How do we formalize this intuitive idea? Features -> Apply the model
By learning from the corpus, Prophet identifies:
- Positive Features
- Negative Features

Using Program Analysis + Machine Learning To Prioritize Correct Patches
- Use program analysis to extract features
- Obtain corpus of patches from open source software development efforts
- Learn a probabilistic model to prioritize correct patches
Learn properties of successful patches from one set of applications. Use those properties to recognize correct patches for a completely different set of applications. These are universal properties that characterize correct patches.

Setup for Model
Goal: estimate $P(m, \ell \mid p, \theta)$, given program $p$ and model parameters $\theta$
Program $p$
Modification $m$: S -> if (E) { S }
Location $\ell$ (statement in $p$)
A patch is a modification $m$ applied to a location $\ell$ ($\ell$ identifies a statement in program $p$).
The goal of the model is to give us an estimate of the probability that the patch is correct. The model estimates this for m, l given the program and the model parameters. We use the estimate to rank the patches.

Probabilistic Model
$P(m, \ell \mid p, \theta)$: probability that modification $m$ applied at location $\ell$ in program $p$, given model parameters $\theta$, produces a correct patch
- Log-linear distribution based on extracted features
- Geometric distribution that encodes error localization
We are going to use a standard log-linear model. But we will also encode the error localization information, which gives us an error localization rank r(p,l) for every location l, using a geometric distribution. We choose this mechanism simply because it works well in practice.
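One form consistent with this description is $P(m, \ell \mid p, \theta) \propto (1-\beta)^{r(p,\ell)} \exp(\phi(p,m,\ell) \cdot \theta)$, where $\phi$ is the feature vector and $\beta$ the geometric parameter. The symbols $\beta$ and $\phi$ and the exact shape of the formula are assumptions here, not taken from the slide. Because the normalizing constant is the same for every candidate, ranking only needs the logarithm of the unnormalized score, which the following C sketch computes. The `Candidate` layout, `NUM_FEATURES` and `rank_candidates` are hypothetical, and $\log(1-\beta)$ is passed in precomputed.

```c
#include <stdlib.h>

#define NUM_FEATURES 3  /* hypothetical feature count */

typedef struct {
    int id;                         /* candidate identifier */
    int loc_rank;                   /* r(p, l): 0 = most suspicious statement */
    double features[NUM_FEATURES];  /* phi(p, m, l) */
    double log_score;               /* filled in by rank_candidates */
} Candidate;

/* qsort comparator: higher log_score first. */
static int by_score_desc(const void *x, const void *y)
{
    const Candidate *a = (const Candidate *)x;
    const Candidate *b = (const Candidate *)y;
    if (a->log_score < b->log_score) return 1;
    if (a->log_score > b->log_score) return -1;
    return 0;
}

/*
 * log score = theta . phi + r * log(1 - beta).
 * The normalizer is omitted because it does not change the ordering.
 */
void rank_candidates(Candidate *c, size_t n, const double *theta,
                     double log_one_minus_beta)
{
    for (size_t i = 0; i < n; i++) {
        double dot = 0.0;
        for (int k = 0; k < NUM_FEATURES; k++)
            dot += theta[k] * c[i].features[k];
        c[i].log_score = dot + c[i].loc_rank * log_one_minus_beta;
    }
    qsort(c, n, sizeof *c, by_score_desc);
}
```

A candidate at a less suspicious location can still rank first if its features carry enough positive weight, which is the point of combining the two terms.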

Application   Lines of Code   Defects
libtiff       77 K            8
lighttpd      62 K            7
php           (not given)     31
gmp           145 K           2
gzip          491 K           4
python        407 K           9
wireshark     2,814 K         6
fbc           97 K            (not given)
GenProg Benchmark set by Claire Le Goues, Michael Dewey-Vogt, Stephanie Forrest, Westley Weimer [ICSE 2012]

Prophet Results
[Chart: Number of Bugs by System; systems shown include Angelix]

History Driven Program Repair
Mutates buggy program to create repair candidates
Knowledge base: mined from past bug fix behaviors
Test Cases
Candidates:
- frequently occur in the knowledge base
- pass negative test cases
The idea of using existing bug fixes to better prioritize patches has been applied to Java as well. HDRepair is such a system for Java.

Bug Fix History Mining
Collection of Bug Fixes -> Graph Representation -> Collection of Graphs -> Closed Graph Mining -> Collection of Graph Patterns
Graph Representation: Bug Fix (Pre-fix AST, Post-Fix AST) -> GumTree -> Graph. Convert pairs of ASTs into AST diff graphs.
Closed Graph Mining: gSpan. Count the frequencies of different AST diff graph patterns.

Selection
Fix Patterns + candidates (E1, E2) -> Matching -> Average Score -> Select -> A single candidate patch
Candidates with higher scores: prioritize patches with higher pattern frequencies
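The matching, averaging and selection steps can be sketched in C as follows. This is an assumption-laden illustration: the pattern matcher is not shown, pattern frequencies are given as a plain array, and the function names are hypothetical rather than HDRepair's own.

```c
#include <stddef.h>

/* Average mined frequency of the fix patterns a candidate matches; 0 if none match. */
double candidate_score(const int *matched, size_t n_matched,
                       const double *pattern_freq)
{
    double sum = 0.0;
    if (n_matched == 0)
        return 0.0;
    for (size_t i = 0; i < n_matched; i++)
        sum += pattern_freq[matched[i]];
    return sum / (double)n_matched;
}

/* Index of the highest-scoring candidate (first one wins ties); n must be > 0. */
size_t select_best(const double *scores, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (scores[i] > scores[best])
            best = i;
    return best;
}
```

A candidate that matches only frequent fix patterns scores higher than one that matches a single rare pattern, so it is selected first.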

HDRepair Results
HDRepair: 18 bugs
PAR: 4 bugs
GenProg: 1 bug
For 90 selected Defects4J cases, with a perfect localization oracle

All previous systems operate with a set of manually defined transforms
Manual Transforms -> Search Space of Candidate Patches
But a manually defined set leaves questions to answer: language change, coding style change, etc.

Can we learn how developers write patches in the first place?

?: Learn how developers write patches in the first place
Training Human Patches -> Inferred Transforms -> Search Space of Candidate Patches

Genesis: Learn how developers write patches in the first place
Training Human Patches -> Inferred Transforms -> Search Space of Candidate Patches

Genesis Results # of cases with correct patches
TODO: put 13/20 rather than 13, etc. Genesis outperforms humans

Code Transfer

Motivation
If a bug in one program can be fixed by code logic that already exists in another program, can we extract and transfer that code for patch generation?
Systems:
- CodePhage [PLDI’15, Sidiroglou-Douskos et al.]
- QACrashFix [ASE’15, Gao et al.]

Display Cat

Cat Crash

ViewNior Cat

ViewNior Protection

CodePhage Overview
Donor: viewnior 1.4 (stripped binary), e.g. 8B45FC 4863F0 8B45FC 4863D0
Recipient: display 6.5.2 (source code)
1. Locate Check
2. Extract Check (Application-Independent Representation of Check)
3. Identify Patch Insertion Point
4. Translate and Insert (source code patch)
5. Verify Patch

Patch Display

Patched Display protects Cat

QACrashFix ……

QACrashFix Overview
1. Query a search engine with a stack trace snippet to find relevant Q&A pages.
2. Scrape the Q&A pages and extract edit scripts.
3. Search for a variable name mapping and apply the edit scripts accordingly at suspicious locations.
4. Filter out invalid patches.
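Applying a variable name mapping to a snippet taken from a Q&A page can be sketched as whole-identifier replacement. This C sketch is a simplification under stated assumptions: the `Rename` type and `apply_renames` function are hypothetical, the mapping is assumed to be given, and the tokenizer is naive (it does not skip string literals or comments).

```c
#include <ctype.h>
#include <string.h>

typedef struct { const char *from; const char *to; } Rename;

static int is_ident_char(char ch)
{
    return isalnum((unsigned char)ch) || ch == '_';
}

/*
 * Copies src into dst, replacing whole identifiers according to map.
 * Returns 0 on success, -1 if dst (capacity cap) is too small.
 */
int apply_renames(const char *src, const Rename *map, size_t n_map,
                  char *dst, size_t cap)
{
    size_t out = 0;
    if (cap == 0)
        return -1;
    while (*src) {
        const char *start = src;
        const char *rep;
        size_t len, rep_len;

        if (is_ident_char(*src)) {
            while (is_ident_char(*src)) src++;  /* consume one whole token */
        } else {
            src++;                              /* single punctuation or space */
        }
        len = (size_t)(src - start);
        rep = start;
        rep_len = len;
        /* Only tokens that start like an identifier may be renamed. */
        if (isalpha((unsigned char)*start) || *start == '_') {
            for (size_t i = 0; i < n_map; i++) {
                if (strlen(map[i].from) == len &&
                    strncmp(map[i].from, start, len) == 0) {
                    rep = map[i].to;
                    rep_len = strlen(rep);
                    break;
                }
            }
        }
        if (out + rep_len >= cap)
            return -1;
        memcpy(dst + out, rep, rep_len);
        out += rep_len;
    }
    dst[out] = '\0';
    return 0;
}
```

Matching whole tokens rather than substrings keeps a mapping for `i` from rewriting the middle of a longer name such as `ai`.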

Summary
Code Learning: Extract useful human knowledge to enable
- Better patch prioritization
- Better transforms and search spaces
Code Transfer: Transfer program logic of existing code
- From a donor program to a recipient program
- From a Q&A website example to a program
Code transfer techniques make stronger assumptions and have a narrower scope, but offer better precision.

Looking into the Future
Human Developers:
- Domain specific knowledge
- Software engineering training
Patch Generation Systems:
- Computation power
Ultimate Goal: Build systems that combine human developer knowledge and machine computation power.

Looking into the Future
The growing volume of existing programs is not just a challenge but also a great opportunity. Exploiting this opportunity is key to solving future software engineering problems.
