Static Analysis of Executable Assembly Code to Ensure QA and Reuse Ramakrishnan Venkitaraman Graduate Student, Research Track Computer Science, UT-Dallas.

1 Static Analysis of Executable Assembly Code to Ensure QA and Reuse Ramakrishnan Venkitaraman Graduate Student, Research Track Computer Science, UT-Dallas Advisor: Dr. Gupta

2 Software Reuse & System Integration But, the Integrated System does not work Cost of Project Companies

3 Outline Need for reusable software binaries Our framework for reuse of software binaries Automated tool to enforce standard compliance.

4 Need for reusable software binaries Most third-party software is proprietary. COTS market place No recompiling, only linking Reduced development time Fewer bugs Time to Market

5 Motivation Problems faced by embedded chip manufacturers. System integration is very difficult.

6 Scope of the Framework Gives the sufficient conditions for software binary code reusability. Usability vs. Reusability Usability is a precondition for reusability E.g. Array index out of bound reference.

7 Framework for reusable software Binaries Code should not be hard coded Binaries should not be assumed to be located at a fixed virtual memory location Code should be reentrant No self-modifying code Should not make symbol resolution invalid

8 Problem and Solution Problem: Detection of hard coded addresses in programs without accessing source code. Solution: “Static Program Analysis”

9 Interest in Static Analysis “We actually went out and bought for 30 million dollars, a company that was in the business of building static analysis tools and now we want to focus on applying these tools to large-scale software systems” Remarks by Bill Gates, 17th Annual ACM Conference on Object-Oriented Programming, Systems, Languages and Applications, November 2002.

10 Static Analysis Defined as any analysis of a program carried out without completely executing the program. Undecidability: It is impossible to build a tool that precisely detects hard coding in every program.

11 Hard Coded Addresses Bad programming practice. Results in non-relocatable code. Results in non-reusable code.

12 Some examples showing hard-coding

Example 1: Directly hard coded

    void main() {
        int *p = 0x8800;
        // Some code
        *p = …;
    }

Example 2: Indirectly hard coded

    void main() {
        int *p = 0x80;
        int *q = p;
        // Some code
        *q = …;
    }

Example 3: Conditional hard coding

    void main() {
        int *p, val;
        p = ….;
        val = …;
        if (val)
            p = 0x900;
        else
            p = malloc(…);
        *p;
    }

NOTE: We don’t care if a pointer is hard coded and is never dereferenced.

13 Overview Of Our Approach Input: Object Code of the Software Output: Compliant or Not Compliant status Activity Diagram for our Static Analyzer Disassemble Object Code Split Into Functions Obtain Basic Blocks Obtain Flow Graph Static Analysis Output the Result

14 Basic Aim Of Analysis Find a path to trace pointer origin. Problem: Exponential Complexity Static Analysis approximation makes it linear

15 Analyzing Source Code – Easy [Figure: source example annotated with the sets {{ q }} and {{ p }}] p is hard coded, so the program is not compliant with the standard.

16 Analyzing Assembly Code is Hard Problem No type information is available Instruction level pipeline and parallelism Solution Backward analysis Use Abstract Interpretation

17 Analyzing Assembly – Hard

    000007A0            main:
    000007A0 07BD09C2   SUB.D2    SP,0x8,SP
    000007A4 020FA02A   MVK.S2    0x1f40,B4
    000007A8 023C22F6   STW.D2T2  B4,*+SP[0x1]
    000007AC 00002000   NOP       2
    000007B0 023C42F6   STW.D2T2  B4,*+SP[0x2]
    000007B4 00002000   NOP       2
    000007B8 0280A042   MVK.D2    5,B5
    000007BC 029002F6   STW.D2T2  B5,*+B4[0x0]
    000007C0 00002000   NOP       2
    000007C4 008C8362   BNOP.S2   B3,4
    000007C8 07BD0942   ADD.D2    SP,0x8,SP
    000007CC 00000000   NOP
    000007D0 00000000   NOP

{{ }}  {{ B4 }}  B4 = 0x1f40, so B4 is HARD CODED. The code is NOT compliant.
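The conclusion drawn from the slide 17 listing can be reproduced with a small backward walk. The sketch below is not the author's tool: the instruction encoding (`insn`, `kind`), the register numbering, and the function name `find_hard_coded` are all assumptions made for illustration, and only the effects relevant to pointer origin (constant moves and store bases) are modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical encoding of a disassembled instruction:
 * only what matters for tracking pointer origin is kept. */
typedef enum { K_MVK, K_STW, K_OTHER } kind;

typedef struct {
    kind k;
    int  def;   /* register written by a constant move (K_MVK), else -1 */
    int  base;  /* register used as the store address base (K_STW), else -1 */
} insn;

enum { SP = 0, B4 = 1, B5 = 2 };  /* register numbering is an assumption */

/* Walk the block backward, last instruction first. Each store base joins
 * the unsafe set; a tracked register defined by MVK (move constant) is
 * reported as hard coded. Returns a bitmask of hard-coded registers. */
uint32_t find_hard_coded(const insn *code, int n)
{
    assert(n >= 0);
    uint32_t unsafe = 0, hard = 0;
    for (int i = n - 1; i >= 0; i--) {
        if (code[i].k == K_STW)
            unsafe |= 1u << code[i].base;
        if (code[i].k == K_MVK && (unsafe & (1u << code[i].def))) {
            hard   |= 1u << code[i].def;     /* origin is a literal */
            unsafe &= ~(1u << code[i].def);  /* origin found, stop tracking */
        }
    }
    return hard;
}
```

Encoding the listing as SUB (other), MVK into B4, two STW through SP, MVK into B5, and STW through B4, the walk flags only B4: B5 receives a constant but is never used as an address, which matches the note that an undereferenced hard-coded value is not a violation.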

18 Abstract Interpretation Based Analysis Domains from which variables draw their values are approximated by abstract domains. The original domains are called concrete domains.

19 Lattice Abstraction Lattice-based abstraction is used to determine pointer hard-codedness.
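The slides do not enumerate the lattice elements, so the following is only a minimal sketch under an assumed four-point lattice. The names `hc_value`, `hc_join`, and the four elements are hypothetical; the point is to show how a least upper bound combines what is known about a pointer on two paths.

```c
#include <assert.h>

/* Hypothetical four-point lattice for pointer hard-codedness.
 * The elements are assumptions; the slides do not list them. */
typedef enum {
    HC_BOTTOM = 0,      /* no information yet */
    HC_NOT_HARD_CODED,  /* derived from a relocatable source */
    HC_HARD_CODED,      /* derived from a constant address */
    HC_TOP              /* may be either: precision was lost */
} hc_value;

/* Least upper bound of two abstract values. */
hc_value hc_join(hc_value a, hc_value b)
{
    assert(a <= HC_TOP && b <= HC_TOP);
    if (a == HC_BOTTOM) return b;
    if (b == HC_BOTTOM) return a;
    if (a == b) return a;
    return HC_TOP;  /* the two paths disagree */
}
```

Under this assumed lattice, the conditional hard-coding case (a constant on one branch, `malloc` on the other) joins to `HC_TOP`, which a conservative checker would report as possibly hard coded.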

20 Contexts Contexts to Abstract Contexts Abstract Context to Context

21 Phases In Analysis Phase 1: Find the set of dereferenced pointers. Phase 2: Check the safety of dereferenced pointers.

22 Building Unsafe Sets (Phase 1) The first element is added to the unsafe set during pointer dereferencing. E.g. If “*Reg” appears in the disassembled code, the unsafe set is initialized to {Reg}. ‘N’ pointers dereferenced → ‘N’ unsafe sets, maintained as SOUS (Set Of Unsafe Sets)

23 Populating Unsafe Sets (Phase 2) For example, if Reg = reg1 + reg2, the element “Reg” is deleted from the unsafe set, and the elements “reg1” and “reg2” are inserted into it. The contents of the unsafe set then become {reg1, reg2}.
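The replacement rule described on slide 23 can be sketched as a backward transfer function. This is an illustration, not the author's implementation: the bitmask encoding of the unsafe set, the two-opcode mini-IR, and the names `regset`, `insn`, and `step_backward` are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Unsafe set as a bitmask over at most 32 registers (assumed encoding). */
typedef uint32_t regset;

typedef enum { OP_ADD, OP_MOVE_CONST } opcode;  /* hypothetical mini-IR */

typedef struct {
    opcode op;
    int dst;         /* destination register index */
    int src1, src2;  /* source registers; ignored for OP_MOVE_CONST */
} insn;

static regset reg_bit(int r)
{
    assert(r >= 0 && r < 32);
    return (regset)1u << r;
}

/* Step one instruction backward. Returns the updated unsafe set and sets
 * *hard_coded to 1 when a tracked register is defined by a constant. */
regset step_backward(regset unsafe, insn in, int *hard_coded)
{
    if (!(unsafe & reg_bit(in.dst)))
        return unsafe;               /* does not define a tracked register */
    unsafe &= ~reg_bit(in.dst);      /* delete "Reg" */
    if (in.op == OP_ADD)
        unsafe |= reg_bit(in.src1) | reg_bit(in.src2);  /* insert reg1, reg2 */
    else
        *hard_coded = 1;             /* Reg = constant: origin is a literal */
    return unsafe;
}
```

Starting from {r4} and stepping backward over `r4 = r1 + r2` yields {r1, r2}; stepping further over a constant move into r1 removes r1 and raises the hard-coded flag.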

24 Pointer Arithmetic All pointer operations are abstracted during analysis.

25 Handling Loops Complex: # iterations of loop may not be known until runtime. Cycle the loop until the unsafe set reaches a “fixed point”. No new information is added to the unsafe set during successive iterations.
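A fixed-point iteration of the kind described on slide 25 might look like the sketch below. It assumes a loop body made only of `dst = src1 + src2` instructions and a 32-register bitmask for the unsafe set; `add_insn`, `pass_backward`, and `loop_fixed_point` are hypothetical names. Results are accumulated with union, so the set can only grow and must stabilize.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t regset;  /* bit r set => register r is in the unsafe set */

typedef struct { int dst, src1, src2; } add_insn;  /* dst = src1 + src2 */

/* One backward pass over the loop body, last instruction first. */
static regset pass_backward(regset unsafe, const add_insn *body, int n)
{
    for (int i = n - 1; i >= 0; i--) {
        regset d = (regset)1u << body[i].dst;
        if (unsafe & d)
            unsafe = (unsafe & ~d)
                   | ((regset)1u << body[i].src1)
                   | ((regset)1u << body[i].src2);
    }
    return unsafe;
}

/* Repeat the pass until the set at the loop head stops changing.
 * *iterations reports how many passes were needed. */
regset loop_fixed_point(regset at_exit, const add_insn *body, int n,
                        int *iterations)
{
    assert(n >= 0);
    regset head = at_exit;  /* covers the case where the body never runs */
    *iterations = 0;
    for (;;) {
        regset next = head | pass_backward(head, body, n);
        (*iterations)++;
        if (next == head)
            return head;    /* no new information: fixed point reached */
        head = next;
    }
}
```

For a body containing the single instruction `r1 = r2 + r3` and an exit set of {r1}, the first pass adds r2 and r3, and the second pass adds nothing, so the iteration stops after two passes regardless of how many times the loop runs at run time.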

26 Merging Information Without merging, the analysis has exponential complexity. Merging is mandatory when the code contains loops, but it causes information loss. [Figure: flow graph with Block A, If (Cond) Then Block B Else Block C, Block D, Block E]
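To make the trade-off on slide 26 concrete, here is a minimal sketch assuming unsafe sets are bitmasks and that merging is a set union; the slides do not state the merge operator, so `merge_paths` is an assumption, as is the helper `paths_without_merging`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t regset;  /* bit r set => register r is in the unsafe set */

/* Merge the sets arriving from the two arms of an if/else.
 * Union is conservative: a register unsafe on either arm stays unsafe,
 * but the analysis no longer knows which arm contributed it. */
regset merge_paths(regset from_then, regset from_else)
{
    return from_then | from_else;
}

/* Number of distinct paths through `diamonds` consecutive if/else
 * constructs when every path is tracked separately: 2^diamonds. */
uint64_t paths_without_merging(unsigned diamonds)
{
    assert(diamonds < 64);
    return (uint64_t)1 << diamonds;
}
```

Ten consecutive if/else constructs already give 1024 separate paths, whereas a merging analysis carries a single set past each join.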

27 Proof – Analysis is Sound Consistency of the α and γ functions is established by showing the existence of a Galois connection. That is, x = α(γ(x)) and y ∈ γ(α(y)).

28 Extensive Compliance Checking Handle all cases occurring in programs. Single pointer, double pointer, triple pointer… Global pointer variables. Static and Dynamic arrays.

29 Extensive Compliance Checking Loops – all forms (e.g. for, while…) Function calls. Pipelining and Parallelism. Merging information from multiple paths.

30 Analysis Stops when… Compliance of all the pointers is established. Errors and warnings are reported. A log file containing statistics of the analysis is created.

31 Sample Code

32 Fig. Flow Graph

33 Analysis Results
Program | # Lines | # * Ptrs | # Hard Coded | Chain Length | Running Time (ms)
t_read803 001280
timer112617 611441
mcbsp11960 001270
figtest29219 1021521
m_hdrv3456 212262
dat94910 8122512
gui_codec113910928 13063
codec118810928 13043
stress1203105 014505
demo135082 4794716

34 Related Work UNO Project – Bell Labs Analyze at source level TI XDAIS Standard Contains 35 rules and 15 guidelines. SIX General Programming Rules. No tool currently exists to check for compliance.

35 Current Status and Future Work Prototype Implementation done But, context insensitive, intra-procedural Extend to context sensitive, inter-procedural. Extend compliance check for other rules.

36 So… Reuse of software binaries is essential. Hard Coding and non-reentrancy are bad programming practices. Non relocatable/reusable code. A Static Analysis based technique is useful and practical.

37 Software Reuse & System Integration WOW!!!! It works… Select ONLY Compliant Software

38 Questions…

39 Click to continue Extra slides

40 TI XDAIS Standard: Six General Programming Rules
1) All programs should follow the runtime conventions of TI’s C programming language.
2) Algorithms must be re-entrant.
3) No hard coded data memory locations.
4) No hard coded program memory locations.
5) Algorithms must characterize their ROM-ability.
6) No peripheral device accesses.
No tool exists to check for compliance.

