Static Program Analyses of DSP Software Systems Ramakrishnan Venkitaraman and Gopal Gupta.

Software cost always on the rise

Why do we need a software standard?
- Software reuse is limited because software standards are lacking.
- A rich set of COTS components is not available.
- Time to market for new products is measured in years rather than months.
- Incompatibilities make it impossible to integrate software from multiple vendors.

TI TMS320 DSP Algorithm Standard
The standard contains 35 rules and 15 guidelines. Its advantages include:
- Easy integration of compliant algorithms.
- Reduced time to market.
- A rich COTS marketplace.
- Increased software reuse.

General Programming Rules
No tool currently exists to check for compliance with the six general programming rules:
1) All programs should follow the runtime conventions of TI’s C programming language.
2) Algorithms must be re-entrant.
3) No hard coded data memory locations.
4) No hard coded program memory locations.
5) Algorithms must characterize their ROM-ability.
6) No peripheral device accesses.

The Problem and Our Approach
Problem: Detecting hard coded addresses in programs.
Our approach: We use static program analysis to detect the presence of hard coded addresses.

Hard Coded Addresses
- Generally a bad programming practice, unless you are writing device drivers.
- Results in non-relocatable code.
- Results in non-reusable code.

Static Program Analysis
Static program analysis is any analysis of a program that is carried out without completely executing the program. The traditional data-flow analysis found in compiler back ends is one example. Another is abstract interpretation, in which a program's data and operations are approximated and the program is executed abstractly.

Why Static Analysis?
In general, most interesting questions about program behavior are undecidable, and so is this one. We cannot simply look at the code, without analysis, and say whether it contains hard coded addresses. We therefore need an approximate analysis. With static analysis, we symbolically execute (by abstract interpretation) the interesting parts of the code.

Overview of Our Approach
Input: object code of the algorithm.
Output: compliant / not compliant status.
[Figure: activity diagram for our static analyzer]

Approach (Contd…)
The basic aim of the analysis is to find a path from the point at which a pointer is dereferenced back to the point at which an address is assigned to that pointer, and then to check whether the source of the pointer is legitimate.

Some examples showing hardcoding
Example 1: Directly hard coded
    void main() {
        int *p = 0x8800;
        // Some code
        *p = …;
    }
Example 2: Indirectly hard coded
    void main() {
        int *p = 0x80;
        int *q = p;
        // Some code
        *q = …;
    }
Example 3: Conditional hard coding
    void main() {
        int *p, val;
        val = …;
        if (val)
            p = 0x900;
        else
            p = malloc(…);
        *p;
    }
NOTE: We do not care about a pointer that is hard coded but never dereferenced.

Basic Blocks and Flow Graphs
A “basic block” is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end, without halting or the possibility of branching except at the end. The basic blocks form the nodes of a directed graph called the “control flow graph”.
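The definition above can be illustrated with the usual "leader" rule for splitting code into basic blocks. This is only a sketch under stated assumptions: the Insn record and the find_leaders function are hypothetical names introduced here, not part of the analyzer described in these slides, and real disassembled DSP code carries far more detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: only control-flow facts are kept. */
typedef struct {
    bool is_branch;  /* true if the instruction may transfer control */
    int  target;     /* index of the branch target, or -1 if none or unknown */
} Insn;

/* Mark every instruction that starts a basic block (a "leader"). */
void find_leaders(const Insn *code, size_t n, bool *leader)
{
    assert(n == 0 || (code != NULL && leader != NULL));
    for (size_t i = 0; i < n; i++)
        leader[i] = false;
    if (n == 0)
        return;
    leader[0] = true;                       /* program entry */
    for (size_t i = 0; i < n; i++) {
        if (!code[i].is_branch)
            continue;
        if (code[i].target >= 0 && (size_t)code[i].target < n)
            leader[code[i].target] = true;  /* a branch target starts a block */
        if (i + 1 < n)
            leader[i + 1] = true;           /* so does the instruction after a branch */
    }
}
```

Each basic block then runs from one leader up to, but not including, the next. Edges of the flow graph would follow from the branch targets and fall-through paths.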

Phases in Static Analysis of the Flow Graph
- Phase 1: The analyzer scans downwards in the flow graph to detect statements in the disassembled code that dereference pointer variables.
- Phase 2: The analyzer scans upwards in the flow graph to check whether each dereference detected in phase 1 is safe.

Phase 1 - Detecting Dereferencing
- Start from the first node in the basic block.
- Keep scanning down until the dereferencing of a register other than the stack pointer is detected.
- Initialize the unsafe set with the register that was dereferenced.

Phase 2 - Checking Whether Dereferencing Is Safe
- The check is made by scanning backward in the control flow graph from the point of dereferencing.
- Look for statements in which an element of the unsafe set (in this case “Reg”) is used as the destination register.

Building Unsafe Sets
- The first element is added to the unsafe set when phase 1 detects dereferencing.
- Example: If we find “*Reg” in the disassembled code, the unsafe set is initialized to {Reg}.
    int main() {
        int *Reg = malloc(sizeof(int));
        // Some code
        *Reg = 5;
    }

Building Unsafe Sets (continued)
Phase 2 populates the unsafe set. For example, if we find Reg = reg1 + reg2, the element “Reg” is deleted from the unsafe set and the elements “reg1” and “reg2” are inserted. The unsafe set then becomes {reg1, reg2}.
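The update rule in this example can be sketched as a single backward step. The representation is an assumption made for illustration: registers are numbered 0 to 31 and the unsafe set is a bit mask, and the names RegSet, Insn, reg_bit and step_back are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t RegSet;   /* bit r set: register r is in the unsafe set */
#define NO_REG (-1)        /* marks an unused operand slot */

/* Hypothetical three-address form: dst = src1 (op) src2. */
typedef struct { int dst; int src1; int src2; } Insn;

RegSet reg_bit(int r)
{
    assert(r >= 0 && r < 32);
    return (RegSet)1 << r;
}

/* One backward step of phase 2: if the instruction writes a register that
 * is in the unsafe set, that register is replaced by the instruction's
 * source registers. Otherwise the set is unchanged. */
RegSet step_back(RegSet unsafe, Insn in)
{
    if (in.dst == NO_REG || !(unsafe & reg_bit(in.dst)))
        return unsafe;
    unsafe &= ~reg_bit(in.dst);
    if (in.src1 != NO_REG) unsafe |= reg_bit(in.src1);
    if (in.src2 != NO_REG) unsafe |= reg_bit(in.src2);
    return unsafe;
}
```

With Reg as register 0, reg1 as register 1 and reg2 as register 2, stepping back over Reg = reg1 + reg2 turns {Reg} into {reg1, reg2}, as on the slide.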

Merging Information
- Assume that we are scanning up from block E.
- At the top of block D, we need to scan blocks B and C separately.
- We merge the information from B and C before scanning block A.
- Without merging, scanning has exponential complexity, especially with looping constructs such as “while” and “for” loops.
- Merging results in information loss.
[Figure: flow graph in which block A branches on Cond to block B (then) or block C (else); both join at block D, which is followed by block E]

Handling Loops
Loops are complex to handle because the number of iterations may not be known until runtime. We scan and cycle through the loop until the unsafe set reaches a “fixed point”. A fixed point is reached when:
- The unsafe set repeats itself at the same point in the loop during successive iterations.
- No new information is added to the unsafe set during successive iterations.
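A fixed-point iteration of this kind can be sketched as follows. The sketch makes simplifying assumptions that the slides do not state: the unsafe set is a 32-bit register mask, the loop body is straight-line code, and the sets from successive iterations are combined by plain union (the slides merge into a set of sets instead). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t RegSet;   /* bit r set: register r is in the unsafe set */
#define NO_REG (-1)

typedef struct { int dst; int src1; int src2; } Insn;  /* dst = src1 (op) src2 */

RegSet reg_bit(int r)
{
    assert(r >= 0 && r < 32);
    return (RegSet)1 << r;
}

/* Backward step: a tracked destination is replaced by its sources. */
RegSet step_back(RegSet s, Insn in)
{
    if (in.dst == NO_REG || !(s & reg_bit(in.dst)))
        return s;
    s &= ~reg_bit(in.dst);
    if (in.src1 != NO_REG) s |= reg_bit(in.src1);
    if (in.src2 != NO_REG) s |= reg_bit(in.src2);
    return s;
}

/* Scan the loop body from its last instruction up to its first. */
RegSet scan_body_back(RegSet s, const Insn *body, size_t n)
{
    for (size_t i = n; i > 0; i--)
        s = step_back(s, body[i - 1]);
    return s;
}

/* Cycle through the loop until no new register is added to the set. */
RegSet loop_fixed_point(RegSet at_exit, const Insn *body, size_t n)
{
    RegSet cur = at_exit;
    for (;;) {
        RegSet next = cur | scan_body_back(cur, body, n);
        if (next == cur)
            return cur;    /* fixed point: the set repeats itself */
        cur = next;
    }
}
```

Because the set only grows and there are at most 32 registers, the iteration in this sketch always terminates, whatever the real iteration count of the loop.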

Handling Parallelism
The || characters signify that an instruction executes in parallel with the previous instruction. In the example, instructions A, B, and C are executed in parallel.
Example:
       Instruction A
    || Instruction B
    || Instruction C
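The slides do not say how the analyzer treats parallel instructions. One plausible first step, sketched here purely as an assumption, is to group disassembly lines so that every line marked with || joins the group of the line before it. The function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True if the line, after leading whitespace, begins with "||". */
bool is_parallel(const char *line)
{
    assert(line != NULL);
    while (isspace((unsigned char)*line))
        line++;
    return line[0] == '|' && line[1] == '|';
}

/* Give each line a group number. A line marked "||" shares the group of
 * the previous line. Returns the number of groups found. */
size_t group_parallel(const char *const *lines, size_t n, size_t *group)
{
    size_t groups = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || !is_parallel(lines[i]))
            groups++;                /* this line starts a new group */
        group[i] = groups - 1;
    }
    return groups;
}
```

For the three lines of the example, all three instructions would land in group 0.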

Analysis Stops When…
- All pointer dereferences in the program are declared “safe” (not hard coded), or
- At least one pointer dereference in the program is declared “unsafe” (hard coded).

Sample Code

Fig. Flow Graph

Conclusion
Hardcoding is a bad programming practice and results in non-relocatable, non-reusable code. Our work so far can be regarded as an attempt to demonstrate the efficacy of static analysis in performing these checks and aiding software reuse.

References
Ramakrishnan Venkitaraman and Gopal Gupta, “Static Program Analysis to Detect Hard Coded Addresses and its Application to TI's DSP Processor”, CS department technical report UTD CS
For more information, visit:

Questions…

Additional Slides

Handling Function Calls
- A function call is handled similarly to a branch statement.
- It marks the beginning and end of basic blocks.
- Recursive function calls are handled as if they were looping constructs.

Interpretations Using Static Analysis

Hard Coded Pointer?
The analyzer adopts a set of criteria to decide whether a dereferenced pointer is hard coded. A register corresponding to a pointer is hard coded if it is assigned a constant value, as in the statement “Reg = 0x8800”. An address can be legitimate in several ways:
- It is derived from a call to a memory allocation routine such as “malloc” or “calloc”.
- It is derived as a function of the stack pointer.
- It is derived from another pointer that is legitimate.

Our Algorithm for Static Analysis
1) Get the disassembled code from the input object code.
2) From the disassembled code, get the basic blocks.
3) From the basic blocks, construct the flow graph.
4) Analyze the flow graph and check for the dereferencing of pointer variables.
5) For each such dereferencing, scan back and find out where the pointer got its value (this involves forming the unsafe sets).
   - If the original source of the pointer is hard coded, declare the algorithm not compliant (“unsafe”).
   - If the original source of the pointer is legitimate, declare the dereferencing safe.
6) The algorithm is declared safe if and only if all such pointer dereferences are safe.
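Steps 4 and 5 can be sketched for the simplest case, a single basic block of straight-line code. Everything about the encoding is an assumption made for illustration: the Op kinds, the register numbering with SP as register 31, the treatment of a dereference as a store through src1, and the extra UNKNOWN verdict for a pointer whose origin lies outside the scanned code. The sketch also applies the unsafe-set rule literally, so an addition such as pointer plus constant offset would be flagged, which a real analysis would need to refine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t RegSet;   /* bit r set: register r is in the unsafe set */
enum { SP = 31 };          /* hypothetical stack pointer register number */

typedef enum { OP_CONST, OP_ALLOC, OP_MOVE, OP_ADD, OP_DEREF } Op;
typedef enum { SAFE, UNSAFE, UNKNOWN } Verdict;

typedef struct {
    Op  op;
    int dst;    /* register written (unused for OP_DEREF) */
    int src1;   /* OP_MOVE/OP_ADD source; for OP_DEREF the register dereferenced */
    int src2;   /* second source of OP_ADD */
} Insn;

RegSet reg_bit(int r)
{
    assert(r >= 0 && r < 32);
    return (RegSet)1 << r;
}

/* Decide whether the dereference at code[deref_idx] is hard coded. */
Verdict check_deref(const Insn *code, size_t deref_idx)
{
    assert(code[deref_idx].op == OP_DEREF);
    /* Phase 1: start the unsafe set with the dereferenced register,
     * ignoring the stack pointer. */
    RegSet unsafe = reg_bit(code[deref_idx].src1) & ~reg_bit(SP);

    /* Phase 2: scan backward for writes to registers in the unsafe set. */
    size_t i = deref_idx;
    while (i > 0 && unsafe != 0) {
        const Insn *in = &code[--i];
        if (in->op == OP_DEREF || !(unsafe & reg_bit(in->dst)))
            continue;
        unsafe &= ~reg_bit(in->dst);
        switch (in->op) {
        case OP_CONST: return UNSAFE;                 /* constant address */
        case OP_ALLOC: break;                         /* malloc/calloc result */
        case OP_MOVE:  unsafe |= reg_bit(in->src1); break;
        case OP_ADD:   unsafe |= reg_bit(in->src1) | reg_bit(in->src2); break;
        case OP_DEREF: break;                         /* filtered out above */
        }
        unsafe &= ~reg_bit(SP);                       /* stack-derived is legitimate */
    }
    return unsafe == 0 ? SAFE : UNKNOWN;
}
```

Under these assumptions, the "indirectly hard coded" pattern (a constant moved through a second register and then dereferenced) yields UNSAFE, while the same chain starting from an allocation yields SAFE. Step 6 would then require every dereference in the program to come out safe.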

Merging Information (Contd)
If S1 and S2 are the unsafe sets for the “then” and “else” cases, merging them yields a new set S3, which is {S1 union S2}. S3 is a set of sets, and the hard coding check is made for each set in it. By merging, we lose some information: we can no longer say which particular path an unsafe set corresponds to.
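One way to read "set of sets" is as a collection that holds the unsafe sets of both branches, with duplicates dropped and no record of which branch contributed which set. The sketch below follows that reading, which is an interpretation rather than something the slides spell out. The names SetFamily, family_add and family_merge, the bit-mask representation, and the fixed capacity MAX_SETS are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t RegSet;   /* one unsafe set, as a register bit mask */
#define MAX_SETS 16        /* arbitrary capacity chosen for the sketch */

/* A set of unsafe sets. Path information is deliberately not stored. */
typedef struct {
    RegSet sets[MAX_SETS];
    size_t count;
} SetFamily;

/* Add one unsafe set unless an identical one is already present.
 * Returns false only if the family is full. */
bool family_add(SetFamily *f, RegSet s)
{
    assert(f != NULL);
    for (size_t i = 0; i < f->count; i++)
        if (f->sets[i] == s)
            return true;
    if (f->count == MAX_SETS)
        return false;
    f->sets[f->count++] = s;
    return true;
}

/* Build S3 from S1 and S2. The output must not alias either input. */
bool family_merge(SetFamily *out, const SetFamily *s1, const SetFamily *s2)
{
    assert(out != NULL && out != s1 && out != s2);
    out->count = 0;
    for (size_t i = 0; i < s1->count; i++)
        if (!family_add(out, s1->sets[i]))
            return false;
    for (size_t i = 0; i < s2->count; i++)
        if (!family_add(out, s2->sets[i]))
            return false;
    return true;
}
```

The hard coding check would then run once per element of the merged family, which keeps the work proportional to the number of distinct unsafe sets rather than the number of paths.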

Related Work
Compared with dynamic analysis, static analysis can give correct results for a larger set of cases, because it reasons about all possible executions rather than only the runs that are actually observed.

Current Work
Current work includes fine-tuning the handling of loops and extending our system to the remaining rules. Development and testing of the tool are in progress.