Eliminating Stack Overflow by Abstract Interpretation
John Regehr, Alastair Reid, Kirk Webb
University of Utah

Contribution Preview
- Evaluation of tradeoffs in stack depth analysis
- Automatic reduction of stack requirements

Focus
- Whole program analysis…
- For legacy embedded systems in C and assembly…
- Running on microcontrollers
  - AVR, PIC, 68HC11, Z80, etc.
  - Typically limited to on-chip memory

Goal: Bounded Stack Depth
- There are lots of stacks out there
- Stack region too big → wasted $$, power
- Stack region too small → OOM exceptions or data corruption
- Stack region just right → safe, efficient system

Using Stack Bounds
(Figure: two memory-layout diagrams with STACK and DATA, BSS regions and a MAX DEPTH marker, labeled SAFE and UNSAFE)

Analyze Source or Binary?
- Hard to predict compiler behavior
- Source for RTOS and libraries commonly not available
- Executable systems have no dangling references

Analysis Overview
- Take a compiled and linked system
1. Compute approximate control flow graph
   - Tricky: indirect branches
2. Find “longest” path through the control flow graph
   - Tricky: recursion, loads into stack pointer, interrupts
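The "longest path" step can be illustrated at the granularity of whole functions. This is only a sketch: the tool described in the slides works on the instructions of a linked binary, whereas the function names, frame sizes, and call graph below are hypothetical, and recursion is simply rejected rather than handled.

```python
# Hypothetical per-function stack usage in bytes (locals, saved registers,
# return address). These numbers are illustrative, not measured.
FRAME_BYTES = {"main": 10, "a": 6, "b": 4, "c": 12}

# Hypothetical call graph: caller -> callees. Assumed acyclic (no recursion).
CALLS = {"main": ["a"], "a": ["b", "c"], "b": ["c"], "c": []}


def max_stack_depth(entry, frame_bytes, calls):
    """Return the worst-case stack depth in bytes reachable from `entry`."""
    memo = {}        # function -> worst-case depth starting at that function
    on_path = set()  # functions on the current call chain, to detect cycles

    def depth(fn):
        if fn in memo:
            return memo[fn]
        if fn in on_path:
            raise ValueError(f"recursion through {fn!r}: depth is unbounded here")
        on_path.add(fn)
        deepest_callee = max((depth(c) for c in calls.get(fn, [])), default=0)
        on_path.discard(fn)
        memo[fn] = frame_bytes[fn] + deepest_callee
        return memo[fn]

    return depth(entry)


print(max_stack_depth("main", FRAME_BYTES, CALLS))
```

Here the deepest chain is main, a, b, c, so the bound is the sum of those four frames. Indirect branches, writes to the stack pointer, and interrupts are exactly the cases this simplified version leaves out.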

Interrupts
- Asynchronous events that may arrive when
  - Enable bit is set AND
  - Global enable bit is set
- Problem: Stack depth strongly affected by interrupt preemption relations
- Solution: Compute static estimate of interrupt mask at each program point
  - Then, use algorithm from Brylow et al. (ICSE 2001)

Motivating Code
    in   r24, 0x3f   ; r24 <- CPU status register
    cli              ; disable interrupts
    adc  r24, r24    ; carry bit <- prev interrupt status
    eor  r24, r24    ; r24 <- 0
    adc  r24, r24    ; r24 <- carry bit
    mov  r18, r24    ; r18 <- r24
    ... critical section ...
    and  r18, r18    ; test r18 for zero
    breq .+2         ; if zero, skip next instruction
    sei              ; enable interrupts
    ret              ; return from function

Abstract Machine Model
- Storage locations contain tri-valued bits: 0, 1, ⊥
  - {0,0,1,1} | {0,⊥,1,⊥} = {0,⊥,1,1}
- Only model registers
  - General-purpose + special purpose
- Modeling main memory unnecessary and difficult
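The bitwise OR example on this slide can be reproduced with a small model of tri-valued bits. This is a minimal sketch, not the tool's implementation: `UNKNOWN` stands in for the slide's unknown-bit symbol, and the function names are made up for illustration.

```python
UNKNOWN = None  # stands for a bit whose value is not statically known


def bit_or(a, b):
    """OR of two tri-valued bits: a known 1 on either side decides the result."""
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return UNKNOWN


def bit_and(a, b):
    """AND of two tri-valued bits: a known 0 on either side decides the result."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return UNKNOWN


def word_or(xs, ys):
    """Bitwise OR of two equal-length abstract words."""
    return [bit_or(a, b) for a, b in zip(xs, ys)]


def word_and(xs, ys):
    """Bitwise AND of two equal-length abstract words."""
    return [bit_and(a, b) for a, b in zip(xs, ys)]


# The slide's example: only the second bit stays unknown.
print(word_or([0, 0, 1, 1], [0, UNKNOWN, 1, UNKNOWN]))
```

The point of the example is that an unknown operand does not always make the result unknown: `1 | unknown` is still 1, which is what keeps the analysis from losing all precision as soon as one bit is unknown.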

Analysis
- Overview:
  - Initialize worklist with entry points
    - Reset, interrupts
  - Process worklist items until fixpoint is reached
  - Has context insensitive and sensitive modes
- Implementation:
  - 9000 lines of C (3000 generated)
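The worklist loop can be sketched for a single abstract value, the global interrupt-enable flag. Everything below is an assumption made for illustration: the five-node program, the node names, and the reduction of each node to one of `cli`, `sei`, or `nop`. The real analysis tracks whole register files over actual AVR instructions.

```python
from collections import deque

UNREACHED = "unreached"  # no abstract state has flowed into this node yet
UNKNOWN = "unknown"      # the flag may be 0 or 1 at this node


def join(a, b):
    """Merge the flag values arriving along two control-flow edges."""
    if a == UNREACHED:
        return b
    if b == UNREACHED:
        return a
    return a if a == b else UNKNOWN


def transfer(effect, flag_in):
    """Effect of one node on the interrupt-enable flag."""
    if effect == "cli":
        return 0
    if effect == "sei":
        return 1
    return flag_in  # "nop": flag unchanged


# Hypothetical control flow graph: node -> (effect, successors).
PROGRAM = {
    "entry": ("nop", ["loop"]),
    "loop": ("cli", ["critical"]),
    "critical": ("nop", ["exit_cs"]),
    "exit_cs": ("sei", ["loop", "done"]),
    "done": ("nop", []),
}


def analyze(program, entry, entry_flag):
    """Return the flag value on entry to each node once a fixpoint is reached."""
    state_in = {node: UNREACHED for node in program}
    state_in[entry] = entry_flag
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        effect, successors = program[node]
        flag_out = transfer(effect, state_in[node])
        for succ in successors:
            merged = join(state_in[succ], flag_out)
            if merged != state_in[succ]:  # state changed: revisit successor
                state_in[succ] = merged
                worklist.append(succ)
    return state_in


print(analyze(PROGRAM, "entry", 0))
```

Termination follows from the small lattice: each node's state can only move from unreached to a known value to unknown, so each node is re-queued a bounded number of times. In this toy program the flag is known to be 0 inside the critical section even though it is unknown at the loop head.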

Context (In)Sensitivity: context insensitive
    b() { c(); }
    a() { b(); c(); }
(Figure: call graph with nodes a, b, c)

Context (In)Sensitivity: context sensitive
    b() { c(); }
    a() { b(); c(); }
(Figure: call graph with nodes a, b, c, c)
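The difference between the two diagrams can be made concrete by counting the nodes each mode analyzes for the slide's `a`, `b`, `c` example. This is a sketch under the assumption of a recursion-free call graph; the helper names are hypothetical.

```python
# Call graph from the slide: a() calls b() and c(); b() calls c().
CALLS = {"a": ["b", "c"], "b": ["c"], "c": []}


def insensitive_nodes(calls, entry):
    """One analysis node per reachable function, shared by all callers."""
    seen = set()
    pending = [entry]
    while pending:
        fn = pending.pop()
        if fn not in seen:
            seen.add(fn)
            pending.extend(calls[fn])
    return sorted(seen)


def sensitive_nodes(calls, entry):
    """One analysis node per call path (assumes no recursion)."""
    paths = []

    def visit(path):
        paths.append(path)
        for callee in calls[path[-1]]:
            visit(path + (callee,))

    visit((entry,))
    return paths


print(insensitive_nodes(CALLS, "a"))  # three nodes: a, b, c
print(sensitive_nodes(CALLS, "a"))    # four nodes: c appears once per call path
```

In the context-sensitive mode `c` is analyzed separately for the path through `b` and for the direct call from `a`, so facts from one caller do not pollute the other. The price is that the number of call paths can grow exponentially with call depth.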

First Approach
- No abstract interpretation
- Just add up requirements of individual interrupts
  - Assumption: at most one outstanding instance of each interrupt
- Cheap and easy
  - Can be written in ~400 lines of Perl!
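The arithmetic of this first approach is a plain sum. The sketch below uses hypothetical depths in bytes. The second function is not from the slides: it shows, under the labeled assumption that no handler ever re-enables interrupts, why knowing the interrupt mask can tighten the bound.

```python
# Hypothetical worst-case depths in bytes, not measurements from the talk.
MAIN_DEPTH = 40
HANDLER_DEPTHS = {"timer": 12, "uart": 20, "adc": 8}


def sum_of_handlers_bound(main_depth, handler_depths):
    """First approach: every handler may be stacked on top of main at once,
    with at most one outstanding instance of each interrupt."""
    return main_depth + sum(handler_depths.values())


def non_nesting_bound(main_depth, handler_depths):
    """Assumption for illustration: handlers run with interrupts disabled,
    so at most one handler frame sits on top of main at any time."""
    return main_depth + max(handler_depths.values(), default=0)


print(sum_of_handlers_bound(MAIN_DEPTH, HANDLER_DEPTHS))
print(non_nesting_bound(MAIN_DEPTH, HANDLER_DEPTHS))
```

The gap between the two numbers is the kind of slack that a static estimate of the interrupt mask is meant to recover, when the program really does keep interrupts disabled inside its handlers.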

Second Approach
- Context insensitive analysis
- Large implementation effort (relative to first approach)
- Fast and memory-efficient

Third Approach
- Context sensitive analysis
- Small implementation effort (relative to second approach)
- Relatively slow and memory-intensive
  - Requires exponential space / time in the worst case
  - In practice analysis took at most 4 seconds and 140 MB

Test Programs
- 64 programs from TinyOS and 1.0
- “flybywire” from the Autopilot project
- Up to 30,000 lines of C
- All run on Atmel ATmega103
  - 8-bit architecture
  - 4 KB RAM, 128 KB flash

Results

Validation
1. Are machine states from actual runs “within” the static analysis?
   - Yes
2. What fraction of instructions have a statically known interrupt mask?
   - Context insensitive: 59%
   - Context sensitive: 98%
3. Can we observe the system using worst-case stack depth?
   - Usually, for simple examples
   - No, for complex programs

Reducing Stack Size
- Observation: Function inlining often decreases stack requirements
  - Avoids pushing registers, frame pointer, return address
  - Called code can be specialized
- Strategy: Use stack tool output as input to global inlining tool

Challenges
1. Inlining causes code bloat
   - Solution: Minimize user-defined cost function that balances stack memory and code size
2. Inlining sometimes increases stack depth!
   - Solution: Trial compilations
3. Search space is exponential in number of static function calls
   - Solution: Heuristic search
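The three solutions fit together as a search loop: pick a set of call sites to inline, run a trial compilation, score the result with the cost function, and keep the change only if the score improves. The slides do not say which heuristic was used, so the greedy search below is a stand-in. The cost weights, the call-site names, and `toy_measure` (which replaces a real compile plus stack analysis) are all hypothetical.

```python
def cost(stack_bytes, code_bytes, stack_weight=4.0, code_weight=1.0):
    """Hypothetical user-defined cost: RAM is weighted more heavily than flash."""
    return stack_weight * stack_bytes + code_weight * code_bytes


def greedy_inline(call_sites, measure, cost_fn=cost):
    """Greedily add inlining decisions while each one lowers the cost.

    `measure(inlined)` stands in for a trial compilation followed by stack
    analysis; it returns (stack_bytes, code_bytes) for that set of decisions.
    """
    chosen = frozenset()
    best = cost_fn(*measure(chosen))
    improved = True
    while improved:
        improved = False
        for site in call_sites:
            if site in chosen:
                continue
            trial = chosen | {site}
            trial_cost = cost_fn(*measure(trial))
            if trial_cost < best:
                chosen, best, improved = trial, trial_cost, True
    return chosen, best


def toy_measure(inlined):
    """Made-up stand-in for compiling and analyzing one configuration."""
    stack, code = 100, 1000
    if "a->b" in inlined:
        stack, code = stack - 10, code + 8
    if "a->c" in inlined:
        stack, code = stack - 2, code + 50
    if "b->c" in inlined:
        stack, code = stack + 6, code + 20  # inlining that increases stack depth
    return stack, code


print(greedy_inline(["a->b", "a->c", "b->c"], toy_measure))
```

With these made-up numbers only the `a->b` call site is inlined: `a->c` saves too little stack for its code growth, and `b->c` makes the stack deeper, which is why each candidate is measured rather than assumed to help.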

Results

Related Projects
- Bounding stack depth:
  - Brylow et al., ICSE 01
  - Chatterjee et al., SAS 03
  - StackAnalyzer from AbsInt
- Automatically reducing stack depth:
  - No related work that we know of!
- Observation: Bounding and minimizing memory use is a pretty open research area

Future Work
- Multi-objective optimization
  - E.g. stack depth + code size + code speed
- More uses for abstract interpretation
  - WCET estimation
  - Bug finding
- Support more chips
  - Currently only Atmel AVR family
  - Working on automating this

Summary of Contributions
- Evaluated tradeoffs in analysis complexity
  - 35% tighter bounds for context sensitive
- Effective automatic stack depth reduction
  - 32% lower stack requirement compared to a smart non-stack-oriented inlining policy

More info and papers here: Stack tool code available soon!