Presentation is loading. Please wait.

Presentation is loading. Please wait.

Analysis of Low-Level Code Using Cooperating Decompilers Bor-Yuh Evan Chang Matthew Harren George C. Necula University of California, Berkeley SAS 2006.

Similar presentations


Presentation on theme: "Analysis of Low-Level Code Using Cooperating Decompilers Bor-Yuh Evan Chang Matthew Harren George C. Necula University of California, Berkeley SAS 2006."— Presentation transcript:

1 Analysis of Low-Level Code Using Cooperating Decompilers Bor-Yuh Evan Chang Matthew Harren George C. Necula University of California, Berkeley SAS 2006

2 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Why Analyze Low-Level Code? Analyze what is executed –Avoids issues with compiler bugs and underspecified source-language semantics Analyze when source is unavailable –Applications in mobile code safety assurance asm code asm code source code source code compiler Code Consumer Code Producer analyzer

3 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Motivation class C extends P { void m() { … } } P p = new P(); P c = … ? new C() : p; … c.m(); rmr r c := m[r sp ] r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] rr r sp := r sp - 4 mrmr m[r sp ] := m[r sp +4] r icall [r 1 ] Analyzers for low-level code are more difficult and tedious to build Example: Java Type Analysis r h r c : P, … i r h r c : nonnull P, … i r h r 1 : disp(P), … i r h r 1 : meth(P,28), … i mr h m[r sp ] : nonnull P, … i Type analysis intermixed with low-level reasoning e.g., args on stack Unsound: Dependencies must be carefully tracked Equal

4 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Motivation class C extends P { void m() { … } } P p = new P(); P c = … ? new C() : p; … c.m(); rmr r c := m[r sp ] r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] rr r sp := r sp - 4 mrr p m[r sp ] := r p r icall [r 1 ] Analyzers for low-level code are more difficult and tedious to build Example: Java Type Analysis h r c : P, … i h r c : nonnull P, … i h r 1 : disp(P), … i h r 1 : meth(P,28), … i h m[r sp ] : nonnull P, … iunsound

5 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Observations Handling low-level implementation details is common to many low-level analyzers –call stack (provides local variables) –register allocation –dependencies across instructions Ad-hoc modularization attempts failed Goal: Goal:Design a modular framework that makes it easy to write high-level analyses –for different architectures –for the output of different compilers

6 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Observations Intermediate languages abstract varying levels of detail –E.g., source language hides compiler details –Provides well-specified interface between (de)compiler phases Proposal: Proposal:Structure low-level analyses as small, incremental decompilation phases

7 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Basic Idea static void f(C c) { c.m(); } f: … rmr r c := m[r sp +12] r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] rr r sp := r sp - 4 mr m[r sp ] := mr m[r sp +16] r icall [r 1 ] … f: … rmr r c := m[r sp +12] r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] rr r sp := r sp - 4 mr m[r sp ] := mr m[r sp +16] r icall [r 1 ] … t f(t c ): t r c := t c r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] tt t 1 := t c rt icall [r 1 ](t 1 ) t f(t c ): t r c := t c r if (r c = 0) L exc rmr r 1 := m[r c ] rmr r 1 := m[r 1 +28] tt t 1 := t c rt icall [r 1 ](t 1 ) f(c): if (c = 0) L exc icall mm [m[m[c]+28]] (c) f(c): if (c = 0) L exc icall mm [m[m[c]+28]] (c) f(obj c): if (c = 0) L exc invokevirtual [c, 28] () f(obj c): if (c = 0) L exc invokevirtual [c, 28] () f(C c): if (c = 0) L exc c.m() LocalsSymEval OO JavaTypes your analyzer Symbolic Evaluation Dynamic Dispatch Local Variables

8 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Decompilers Intermediate languages define precise interfaces between modules/phases Modules communicate via decompiled instructions But … –Compilation is not easily invertible –E.g., many source-level operations map to a particular low-level instruction kind indirect jump function return exception raise method dispatch ? Need additional information

9 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Example Decompilation Pipeline Some information is obtained via analysis Stack depthStatically-scoped local variables Global value numbering Normalized expr. trees, SSA Object layout dependencies Java bytecode-like Local type inferenceJava-like LocalsSymEval OO JavaTypes Analysis: Decompiles To:

10 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Difficulties Unidirectional communication + analysis is insufficient rt icall [r 1 ](t 1 ) icall mm [m[m[c]+28]] (c) invokevirtual [c, 28] () c.m() r icall [r 1 ] LocalsSymEval OO JavaTypes Needs to know Needs to know call with 1 arg Can answer knows needs Can answer method call to C.m() because knows about the class table but needs previous instrs for analysis

11 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Overview of the Framework To enable bidirectional communication –cannot decompile in stages –decompile at all levels simultaneously Each decompiler analyzes the preceding instructions before decompiling the next Pipeline Reduced product

12 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers f: … r c := m[r sp +12] if (r c = 0) L exc r 1 := m[r c ] r 1 := m[r 1 +28] r sp := r sp - 4 m[r sp ] := m[r sp +16] r icall [r 1 ] … f(t c ): r c := t c if (r c = 0) L exc r 1 := m[r c ] r 1 := m[r 1 +28] t 1 := t c rt icall [r 1 ](t 1 ) f(c): if (c = 0) L exc icall mm [m[m[c]+28]] (c) f(obj c): if (c = 0) L exc invokevirtual [c, 28] () f(C c): if (c = 0) L exc c.m() Queries static void f(C c) { c.m(); } LocalsSymEval OO JavaTypes r sp : sp(-12) r 1 = m[m[c]+28]] t 1 = c c : nonnull objc : C isFunc(r 1 ) ? isFunc(m[m[c]+28]]) ? isMethod(c,28)? Yes, 0 args Yes, 1 arg Yes, 1 arg

13 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers … call C.m jump L exit L catch : … L exit : … … call C.m() jump L exit L catch : … L exit : … … call C.m() jump L exit L catch : … L exit : … … call C.m() jump L exit L catch : … L exit : … … C.m() jump L exit L catch (Exc e): … L exit : … Reinterpretations try { C.m(); } catch (Exc e) { … } LocalsSymEval OO JavaTypes r sp : sp(-12) ……… r sp : sp(-8) ……… … ? …… At L catch : havoc … rr r sp := r sp + 4 At L catch : havoc … rr r sp := r sp + 4 At L catch : havoc … At L catch : havoc …

14 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Communication Summary Low-to-High (Primary) –Decompilation Stream High-to-Low –Queries Initiated by lower-level Questions decompiled –Reinterpretations Initiated by higher-level Answers decompiled

15 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Soundness of Decompiler Pipelines Operational semantics for each language Safety encoded as not getting stuck ILIL IHIH l l0l0 h h0h0

16 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Soundness of Decompiler Pipelines Safe at high-level implies safe at low-level a a0a0 ILIL IHIH l l0l0 h h0h0 l 2 (a)

17 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Experiments Evaluate flexibility Evaluate modularity Evaluate applicability of existing source- level tools

18 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Flexibility For the output of gcc, gcj, and compilers for Cool (a mini-Java) To implement JavaTypes (no exns, interfaces) –3-4 hours, 500 lines LocalsSymEval JavaTypes CoolTypes OO CTypes MIPS x86 C Cool Java 40-50% 55-60%

19 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Modularity: Decompilers vs. Monolithic Compared on 10,339 tests –49 tests, 211 compilers –182 disagreements (pass/fail) Decompilers (new) –1 incompletenesses –0 soundness bugs Monolithic (used heavily) –5 incompletenesses –3 soundness bugs –mishandling of run-time library functions Calls uniform - Locals SSA - SymEval Type Checking Compiled Cool Modules approx. same size

20 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Applicability of Source-Level Tools Experimental Setup –Compiled 3 previously reported benchmarks for BLAST (B) and Cqual (Q) –Verified presence (or absence) of bugs as in the original Test Case Code SizeDecompVerification C (kloc) x86 (kloc)(s) Orig (s) Decomp (s) qpmouse.c (B) tlan.c (B) gamma_dma.c (Q) Limitations –Source-level tools needed types –Recovered types from debugging information On optimized code ( qpmouse.c ) –gcc –O2 except merge constants –reads byte in middle of word-sized field

21 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers Conclusion: Lessons Learned Need types for existing source analyses –E.g., BLAST on untyped C does not work Very useful low-level modules –Locals: recover statically-scoped local variables –SymEval: recover normalized expr. trees, SSA Defining output IL guides analysis impl. –IL specifies what analysis should guarantee –Leads to small and well-isolated modules

22 Thank You!

23 Chang, Harren, Necula - Analysis of Low-Level Code Using Cooperating Decompilers High-to-Low Communication Examples Queries –Does e point to a function/method? –Does e point to an object? Reinterpretations –Exceptional successor for try-catch blocks


Download ppt "Analysis of Low-Level Code Using Cooperating Decompilers Bor-Yuh Evan Chang Matthew Harren George C. Necula University of California, Berkeley SAS 2006."

Similar presentations


Ads by Google