HOIST: A System for Automatically Deriving Static Analyzers for Embedded Systems
John Regehr, Alastair Reid
School of Computing, University of Utah

Presentation transcript:

1 HOIST: A System for Automatically Deriving Static Analyzers for Embedded Systems John Regehr Alastair Reid School of Computing, University of Utah

2 Hoist makes it significantly easier to do static analysis of embedded software
  – E.g. TinyOS
Automatically derives transfer functions for analyzing object code
  – This is new
  – Hoisted transfer functions are maximally precise
  – Brute-force approach that works well for small architectures

3 [Figure: two weekly calendars (S M T W T F S), labeled "Before Hoist" and "After"]

4 Use Static Analysis to Eliminate...
Concurrency errors
Deadline misses
Stack overflow
Language-level errors
  – Array bound violations
  – Null pointer dereferences
  – Numerical problems
Everything else Jim Larus talked about!

5 Transfer Function
Instruction: and r0, r1
Before: r0 = ??11?001, r1 = ?00110??
After: r0 = ??11?001, r1 = ?001?00?

6 Abstract Transfer Functions
Bitwise:
??11?001 & ?00110?? = ?001?00?
??11?001 + ?00110?? = …
Interval:
[ 3.. 6] + [36..60] = [39..66]
[ 3.. 6] & [36..60] = …
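The two completed rows on this slide can be reproduced with a few lines of Python. This is a minimal hand-written sketch, not Hoist output: the string encoding of bitwise values ('0', '1', '?', most significant bit first), the tuple encoding of intervals, and the function names are assumptions made for illustration.

```python
def and_bit(a, b):
    """Abstract AND of two ternary bits ('0', '1', or '?' for unknown)."""
    if a == "0" or b == "0":
        return "0"          # a known zero forces the result to zero
    if a == "1" and b == "1":
        return "1"
    return "?"              # otherwise the result is unknown


def abstract_and(x, y):
    """Bitwise-domain transfer function for AND, applied bit by bit."""
    return "".join(and_bit(a, b) for a, b in zip(x, y))


def interval_add(x, y, width=8):
    """Interval-domain transfer function for unsigned ADD.

    If the sum may exceed the register width it can wrap around, so this
    sketch falls back to the full range, which is sound but imprecise.
    """
    lo, hi = x[0] + y[0], x[1] + y[1]
    top = (1 << width) - 1
    if hi > top:
        return (0, top)
    return (lo, hi)


print(abstract_and("??11?001", "?00110??"))  # ?001?00?
print(interval_add((3, 6), (36, 60)))        # (39, 66)
```

The two rows left open on the slide (bitwise addition, interval AND) are the mismatched domain/operation pairs, where a precise transfer function is much harder to write by hand.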

7 Transfer Functions can be Hard
Domain / operation mismatch
Condition codes – input and output
Hard to know where precision matters
Lots of transfer functions: # domains * # instructions * # architectures
Result: Wasted time, bugs, imprecision

8 Hoist Contributions
Derive transfer functions with
  – Near-zero developer effort
  – Maximal precision
  – Sufficient performance
  – High confidence in correctness

9 Extract complete result table for instruction
  – Dest register + cond codes
Ideas:
  – No high-level model of instruction
  – Brute force
[Pipeline diagram: Extract results → Hoist into abstract domain → Encode as BDD → Generate code → Test]
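The extraction step can be pictured as a loop over every pair of 8-bit inputs. The sketch below is only an illustration: `execute_add` is a hypothetical stand-in for running the instruction and reading back the destination register and flags, and the flag names C and Z are assumptions. The slides say Hoist uses no high-level model of the instruction, so a Python function is used here purely to keep the example self-contained.

```python
def execute_add(a, b):
    """Hypothetical stand-in for executing an 8-bit add.

    Returns the destination register value plus two condition codes.
    """
    total = a + b
    result = total & 0xFF
    flags = {"C": total > 0xFF, "Z": result == 0}
    return result, flags


def extract_table(execute):
    """Brute force: record the result for all 256 * 256 input pairs."""
    return {(a, b): execute(a, b) for a in range(256) for b in range(256)}


table = extract_table(execute_add)
print(len(table))          # 65536
print(table[(200, 100)])   # (44, {'C': True, 'Z': False})
```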

10 Generate complete abstract transfer function
Ideas:
  – Recursive decomposition of abstract domain
  – Speedup through dynamic programming
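One way to read "recursive decomposition" for the bitwise domain: split an abstract value on its first unknown bit, solve both halves, and join the answers; fully known inputs are looked up in the concrete result table. The sketch below follows that reading under stated assumptions. The string encoding, the names `join` and `hoisted`, and the use of Python's `&` as the concrete table are illustrative choices, and memoization via `lru_cache` stands in for the dynamic programming mentioned on the slide. It is not the Hoist implementation.

```python
from functools import lru_cache


def concrete_op(a, b):
    """Stand-in for a lookup in the extracted result table (here: AND)."""
    return a & b


def join(x, y):
    """Least upper bound in the bitwise domain: disagreeing bits become '?'."""
    return "".join(a if a == b else "?" for a, b in zip(x, y))


@lru_cache(maxsize=None)   # memoize sub-results so repeated queries are cheap
def hoisted(x, y):
    """Most precise bitwise result for abstract inputs x and y."""
    for index, operand in enumerate((x, y)):
        i = operand.find("?")
        if i >= 0:
            # Split the first unknown bit into its 0 and 1 cases.
            zero = operand[:i] + "0" + operand[i + 1:]
            one = operand[:i] + "1" + operand[i + 1:]
            if index == 0:
                return join(hoisted(zero, y), hoisted(one, y))
            return join(hoisted(x, zero), hoisted(x, one))
    # Both operands fully known: consult the concrete operation.
    return format(concrete_op(int(x, 2), int(y, 2)) & 0xFF, "08b")


print(hoisted("??11?001", "?00110??"))  # ?001?00?
```

Because the result is the join over every concrete input pair, it is as precise as the bitwise domain allows for whatever concrete operation is plugged in.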

11 Binary decision diagrams can compactly represent many functions
Encode transfer function as vector of BDDs
Ideas:
  – Variable ordering matters
  – Operation ordering matters
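The effect of variable ordering can be seen on a small textbook function. The sketch below counts the internal nodes of a reduced ordered BDD by brute-force Shannon expansion; it is an illustration only, not the BDD library used by Hoist, and the function `f` and the name `robdd_size` are assumptions chosen for the example.

```python
def robdd_size(f, order):
    """Count internal nodes of the reduced ordered BDD of f under `order`.

    f takes a list of 0/1 values indexed by variable number; order is a
    permutation of the variable numbers, top of the diagram first.
    """
    n = len(order)
    unique = {}  # (variable, low child, high child) -> node id

    def build(level, bits):
        if level == n:
            env = [0] * n
            for var, bit in zip(order, bits):
                env[var] = bit
            return 1 if f(env) else 0      # terminal nodes are ids 0 and 1
        low = build(level + 1, bits + (0,))
        high = build(level + 1, bits + (1,))
        if low == high:
            return low                      # redundant test: skip the node
        key = (order[level], low, high)
        if key not in unique:
            unique[key] = len(unique) + 2   # share isomorphic subgraphs
        return unique[key]

    build(0, ())
    return len(unique)


def f(x):
    # (x0 AND x1) OR (x2 AND x3) OR (x4 AND x5)
    return (x[0] and x[1]) or (x[2] and x[3]) or (x[4] and x[5])


print(robdd_size(f, [0, 1, 2, 3, 4, 5]))  # 6
print(robdd_size(f, [0, 2, 4, 1, 3, 5]))  # 14
```

Keeping each pair of related variables adjacent gives 6 internal nodes; separating them gives 14, and the gap grows exponentially with the number of pairs.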

12 Turn BDD into code implementing the transfer function

13 Probabilistically or exhaustively verify
  – Correctness
  – Maximal precision
Original result table is ground-truth
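A probabilistic check of this kind can be sketched as follows, assuming the string encoding of bitwise values used for illustration here ('0', '1', '?'). For each random pair of abstract inputs, the ground truth is computed by enumerating every concrete input pair; a candidate is correct (sound) if it never claims a bit the ground truth does not support, and maximally precise if it matches the ground truth exactly. All function names are hypothetical.

```python
import itertools
import random

WIDTH = 8


def concretize(pattern):
    """Yield every integer matching a pattern of '0', '1', and '?'."""
    unknown = [i for i, c in enumerate(pattern) if c == "?"]
    for bits in itertools.product("01", repeat=len(unknown)):
        chars = list(pattern)
        for i, b in zip(unknown, bits):
            chars[i] = b
        yield int("".join(chars), 2)


def best_abstraction(values):
    """Most precise bitwise value covering all the given integers."""
    values = list(values)
    out = []
    for bit in reversed(range(WIDTH)):
        seen = {(v >> bit) & 1 for v in values}
        out.append(str(seen.pop()) if len(seen) == 1 else "?")
    return "".join(out)


def check(candidate, concrete, x, y):
    """Return (sound, maximally_precise) for one pair of abstract inputs."""
    truth = best_abstraction(
        concrete(a, b) & 0xFF for a in concretize(x) for b in concretize(y))
    got = candidate(x, y)
    sound = all(g == "?" or g == t for g, t in zip(got, truth))
    return sound, sound and got == truth


def bitwise_and(x, y):
    """Candidate under test: per-bit abstract AND."""
    def bit(a, b):
        if a == "0" or b == "0":
            return "0"
        return "1" if a == "1" and b == "1" else "?"
    return "".join(bit(a, b) for a, b in zip(x, y))


def random_pattern(rng):
    return "".join(rng.choice("01?") for _ in range(WIDTH))


rng = random.Random(0)
samples = [(random_pattern(rng), random_pattern(rng)) for _ in range(50)]
print(all(check(bitwise_and, lambda a, b: a & b, x, y) == (True, True)
          for x, y in samples))  # True
```

A candidate that always answers "????????" would pass the soundness check but fail the precision check on most inputs, which is the kind of imprecision this test is meant to expose.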

14 Hoisting Atmel AVR Architecture
Up to 45 minutes to Hoist a bitwise operation
Up to 34 hours to Hoist an interval operation
Dominated by BDD library
Parallelizes trivially across operations

15 Performance at Analysis Time
Analyze programs that ship with TinyOS for worst-case stack depth
  – Analysis time increases from 8.3s to 8.9s for the program that takes longest to analyze

16 Precision in Bitwise Domain
Fed random bitwise values to Hoisted and hand-written operations
  – 59% more known bits in result register
  – 130% more known bits in condition codes
Analyzed 26 TinyOS programs
  – 8% more known bits in result register
  – 40% more known bits in condition codes
Hand-written operations had been tuned for months

17 Twist #1: Pseudo-Unary Ops
Problem:
  – xor 0?10??11, 0?10??11 == 0?00??00
However:
  – xor r3, r3 == 00000000
  – Oops! Maximal precision doesn’t help here
Solution: Create a pseudo-unary version of each binary operation
  – E.g. xor 1, sub 1, and 1, or 1
  – Without these, analysis fails miserably
  – Not fun to implement these by hand
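The gap between the binary and the pseudo-unary operation is easy to reproduce. In the sketch below (string encoding and function names are assumptions for illustration), the binary abstract xor treats its two operands as independent, while the pseudo-unary version enumerates the values of a single register and applies the operation to that value and itself.

```python
import itertools

WIDTH = 8


def concretize(pattern):
    """Yield every integer matching a pattern of '0', '1', and '?'."""
    unknown = [i for i, c in enumerate(pattern) if c == "?"]
    for bits in itertools.product("01", repeat=len(unknown)):
        chars = list(pattern)
        for i, b in zip(unknown, bits):
            chars[i] = b
        yield int("".join(chars), 2)


def abstract(values):
    """Most precise bitwise value covering all the given integers."""
    values = list(values)
    out = []
    for bit in reversed(range(WIDTH)):
        seen = {(v >> bit) & 1 for v in values}
        out.append(str(seen.pop()) if len(seen) == 1 else "?")
    return "".join(out)


def abstract_xor(x, y):
    """Binary abstract xor: the operands are assumed to be independent."""
    return "".join("?" if "?" in (a, b) else str(int(a) ^ int(b))
                   for a, b in zip(x, y))


def pseudo_unary_xor(x):
    """xor of a register with itself: both operands are the same value."""
    return abstract(v ^ v for v in concretize(x))


r3 = "0?10??11"
print(abstract_xor(r3, r3))   # 0?00??00
print(pseudo_unary_xor(r3))   # 00000000
```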

18 Twist #2: Interacting Domains
If a register contains [ ] and ???11011
We can show that it actually contains [ ] and
In general: Use Hoist to create a reduced product of the interval and bitwise domains
  – [Cousot & Cousot 79] says this is impossible
  – For finite domains we can brute-force it
  – Maximally precise
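For 8-bit values, the brute-force reduction can be sketched by listing every value that satisfies both descriptions and re-abstracting that set into each domain. The interval values from the slide did not survive in this transcript, so the interval [0..60] below is a hypothetical input chosen for illustration; only the bit pattern ???11011 comes from the slide. Function names and encodings are likewise assumptions.

```python
WIDTH = 8


def matches(value, pattern):
    """True if the integer agrees with every known bit of the pattern."""
    bits = format(value, "08b")
    return all(p == "?" or p == b for p, b in zip(pattern, bits))


def reduce_product(interval, pattern):
    """Tighten an (interval, bit pattern) pair by brute-force enumeration.

    Returns None when no value satisfies both (the register is unreachable).
    """
    lo, hi = interval
    values = [v for v in range(lo, hi + 1) if matches(v, pattern)]
    if not values:
        return None
    out = []
    for bit in reversed(range(WIDTH)):
        seen = {(v >> bit) & 1 for v in values}
        out.append(str(seen.pop()) if len(seen) == 1 else "?")
    return (min(values), max(values)), "".join(out)


# Hypothetical interval [0..60]; the pattern is the one shown on the slide.
print(reduce_product((0, 60), "???11011"))  # ((27, 59), '00?11011')
```

Only 27 and 59 lie in [0..60] and end in 11011, so each domain sharpens the other: the interval shrinks to [27..59] and two of the three unknown bits become known.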

19 Elephant in the Closet
Hoist does not scale to machines bigger than 8 bits
  – 8 bit is important: Many architectures, huge sales volume, used in critical systems
Current work
  – Replace BDDs with high-level symbolic representation
  – Gain scalability but lose many other advantages of Hoist

20 Conclusions
Reduce barriers to entry for analyzing embedded software
Hoist generates transfer functions for interval and bitwise domains
  – Near-zero specification effort, maximal precision
We use Hoisted operations in day-to-day development / use of our static analyzer
  – Biggest benefit is never wondering if the transfer functions are the problem