Secure Compiler Seminar 11/7 Survey: Modular Development of Certified Program Verifiers with a Proof Assistant Toshihiro YOSHINO (D1, Yonezawa Lab.)



Today’s Paper
- A. Chlipala (UC Berkeley). Modular Development of Certified Program Verifiers with a Proof Assistant. ICFP ’06.
  - Implementation can be downloaded from web site below: ⇒

Overview of the Paper
- Case study: developing a certified program verifier with Coq
  - Verifies memory safety of x86 machine code
  - Its soundness is machine-checked
  - Modular development with reusable functors
    - A new verifier based on another type system can be created at low cost

Constructing Certified Verifiers
- Design and implement with Coq
  - Use the “extraction” feature of Coq to obtain a working verifier
- A verifier can be formalized as:
  - load: program -> state loads a program
    - The type program represents the binary file format
  - safe: state -> Prop is the safety property we wish to verify for programs
  - [[P]] is notation for poption P
    - The counterpart of option (O’Caml) or Maybe (Haskell) for the domain Prop
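
The formula on this slide did not survive in the transcript. One reading consistent with the bullets above is that a verifier maps every program to an optional proof that its loaded state is safe. The sketch below spells that out in Coq. The names verifier and rejectAll are illustrative assumptions, and poption is defined locally so that the snippet checks on its own.

```coq
(* poption P: either a proof of P, or no information at all. *)
Inductive poption (P : Prop) : Set :=
  | PSome : P -> poption P
  | PNone : poption P.

Section VerifierSketch.
  Variable program : Set.            (* binary file format *)
  Variable state : Set.              (* machine states *)
  Variable load : program -> state.  (* loads a program *)
  Variable safe : state -> Prop.     (* safety property *)

  (* Assumed shape of a certified verifier: for each program,
     either a proof that the loaded state is safe, or PNone. *)
  Definition verifier : Set :=
    forall p : program, poption (safe (load p)).

  (* A trivially sound (and useless) verifier: it never claims safety. *)
  Definition rejectAll : verifier :=
    fun p => PNone (safe (load p)).
End VerifierSketch.
```

Soundness comes from the type alone: a verifier can only answer PSome by actually producing a proof, so a wrong "yes" does not type-check.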

Constructing Certified Verifiers
- Abstraction refinement in multiple stages
  - Each stage (component) is a functor that transforms target states into source states
    - Later components reason at higher levels of abstraction
  - Coq’s module system is used to implement this modular design

Formalization of x86 Instruction Set
- PCC-style formalization
  - Subset of x86 instruction set + ERROR instruction
    - mov, jcc, …
- Safety ≡ ERROR is unreachable
  - In combination with assertions, many properties can be proven
  - Can be formalized coinductively
    - Copes with infinite derivations
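
A minimal sketch of what "safety, formalized coinductively" can look like in Coq. The relation step and the predicate isError are placeholders assumed for illustration; the paper's actual x86 semantics is far richer.

```coq
Section SafetySketch.
  Variable state : Set.
  Variable step : state -> state -> Prop.  (* one execution step (assumed) *)
  Variable isError : state -> Prop.        (* state is at ERROR (assumed) *)

  (* A state is safe if it is not an error state and every successor
     is safe. CoInductive admits infinite derivations, which is what
     a non-terminating but safe program needs. *)
  CoInductive safe : state -> Prop :=
    | Safe : forall s,
        ~ isError s ->
        (forall s', step s s' -> safe s') ->
        safe s.

  (* Sanity check: a non-error state with no successors is safe. *)
  Lemma halted_safe : forall s,
    ~ isError s -> (forall s', ~ step s s') -> safe s.
  Proof.
    intros s Hne Hhalt. constructor.
    - exact Hne.
    - intros s' Hs. exfalso. exact (Hhalt s' Hs).
  Qed.
End SafetySketch.
```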

Types and Extraction in Coq
- Basically, Coq manipulates terms of a dependently typed lambda calculus
  - A proposition is represented as a type, and its proof as a term of that type
    - Well known as the Curry-Howard isomorphism
  - Proving corresponds to constructing a term of the given type, and checking a proof to type checking
    - Given a goal, refine it interactively into subgoals and eliminate the holes
    - Rules used for these steps are called tactics

Types and Extraction in Coq
- Program extraction from Coq code
  - In short, extraction erases terms of sorts other than Set
  - Brief example: isEven

Coq source:

    Definition isEven : forall (n:nat), poption (even n).
      refine (fix isEven (n:nat) : poption (even n) :=
        match n return … with
        | O => PSome _ _
        | S (S n) => …
        | _ => PNone _
        end); auto.
    Qed.

Extracted O’Caml code:

    let rec isEven (n:nat) =
      match n with
      | O -> true
      | S (S n) -> isEven n
      | _ -> false
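
The slide elides parts of the definition. For readers who want something they can load into Coq, here is one self-contained way to write a function of the same type, with explicit proof terms instead of refine and auto. The inductive even and its constructor names are assumptions made for this sketch, not the paper's definitions.

```coq
(* Evenness as an inductive proposition (assumed definition). *)
Inductive even : nat -> Prop :=
  | even_O : even 0
  | even_SS : forall n, even n -> even (S (S n)).

Inductive poption (P : Prop) : Set :=
  | PSome : P -> poption P
  | PNone : poption P.

(* Returns a proof of (even n) when it can build one, PNone otherwise. *)
Fixpoint isEven (n : nat) : poption (even n) :=
  match n as k return poption (even k) with
  | O => PSome (even 0) even_O
  | S n' =>
      match n' as k return poption (even (S k)) with
      | O => PNone (even 1)
      | S m =>
          match isEven m with
          | PSome _ pf => PSome (even (S (S m))) (even_SS m pf)
          | PNone _ => PNone (even (S (S m)))
          end
      end
  end.
```

The proof arguments even_O and even_SS m pf live in Prop, so they are the parts that extraction drops, leaving only the yes/no skeleton shown in the extracted code above.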

poption: “option” for Domain “Prop”
- Two constructors: PNone and PSome
  - PSome is given a proof of P
    - Literally, PSome means “P holds and I have a proof of it” and PNone means “I am not sure”
  - Can be used as a failure monad

        PNone >>= _ = PNone
        PSome p >>= f = f p

  - In extraction, PSome corresponds to true and PNone to false
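
The two monad equations can be written down directly. The sketch below uses the hypothetical names pbind and pand (they are not taken from the paper) and shows how two partial checks combine into a check of a conjunction.

```coq
Inductive poption (P : Prop) : Set :=
  | PSome : P -> poption P
  | PNone : poption P.

(* Bind of the failure monad: PNone propagates, PSome feeds its proof on. *)
Definition pbind (P Q : Prop) (x : poption P) (f : P -> poption Q)
  : poption Q :=
  match x with
  | PSome _ p => f p
  | PNone _ => PNone Q
  end.

(* The two equations from the slide hold by computation. *)
Lemma pbind_none : forall (P Q : Prop) (f : P -> poption Q),
  pbind P Q (PNone P) f = PNone Q.
Proof. intros. reflexivity. Qed.

Lemma pbind_some : forall (P Q : Prop) (p : P) (f : P -> poption Q),
  pbind P Q (PSome P p) f = f p.
Proof. intros. reflexivity. Qed.

(* Example use: succeed only if both checks succeed. *)
Definition pand (P Q : Prop) (x : poption P) (y : poption Q)
  : poption (P /\ Q) :=
  pbind P (P /\ Q) x (fun p =>
  pbind Q (P /\ Q) y (fun q =>
  PSome (P /\ Q) (conj p q))).
```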

soption
- soption extends poption with a parameter
  - Proposition about a term of domain T (of sort Set)
  - soption, too, can be used as a failure monad
- In the paper’s theoretical part, it is written as {{ x : T | P }}
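
One plausible Coq rendering of soption, assuming it pairs a computational value with a proof about that value and adds a failure case. The constructor names SSome and SNone and the helpers sbind and pred_strict are illustrative guesses.

```coq
(* Either a value x : T together with a proof of P x, or failure. *)
Inductive soption (T : Set) (P : T -> Prop) : Set :=
  | SSome : forall x : T, P x -> soption T P
  | SNone : soption T P.

(* Failure-monad bind: the continuation receives the value and its proof. *)
Definition sbind (T U : Set) (P : T -> Prop) (Q : U -> Prop)
  (x : soption T P) (f : forall t : T, P t -> soption U Q)
  : soption U Q :=
  match x with
  | SSome _ _ t pf => f t pf
  | SNone _ _ => SNone U Q
  end.

(* Example: a predecessor that fails on 0 and otherwise certifies
   that its result m satisfies n = S m. *)
Definition pred_strict (n : nat) : soption nat (fun m => n = S m) :=
  match n as k return soption nat (fun m => k = S m) with
  | O => SNone nat (fun m => 0 = S m)
  | S m => SSome nat (fun m' => S m = S m') m (eq_refl (S m))
  end.
```

Unlike poption, the value component x is in Set, so it survives extraction; only the proof is erased.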

Coq’s Module System
- Used to build reusable verification components
  - Frequent pattern:

    Module Type MACHINE.
      Parameter mstate : Set.
      Parameter minitState : mstate -> Prop.
      …
    End MACHINE.

    Record state : Set := { stRegs32 : regs32; … }.
    Inductive instr : Set := Arith : … | ….
    Inductive exec : … := ….

    Module M86 <: MACHINE.
      Definition mstate := state.
      Definition minstr := instr.
      …
    End M86.
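
To see the pattern end to end, here is a small self-contained example: a cut-down MACHINE signature, a functor over it, and a toy instance. Only mstate and minitState appear on the slide; mstep, Reach, and Counter are hypothetical names added for the sketch.

```coq
Module Type MACHINE.
  Parameter mstate : Set.
  Parameter minitState : mstate -> Prop.
  Parameter mstep : mstate -> mstate -> Prop.  (* assumed extra field *)
End MACHINE.

(* A functor: works for any machine that matches the signature. *)
Module Reach (M : MACHINE).
  Inductive reachable : M.mstate -> Prop :=
    | R_init : forall s, M.minitState s -> reachable s
    | R_step : forall s s', reachable s -> M.mstep s s' -> reachable s'.
End Reach.

(* A toy machine: a counter that starts at 0 and increments. *)
Module Counter <: MACHINE.
  Definition mstate := nat.
  Definition minitState (s : nat) := s = 0.
  Definition mstep (s s' : nat) := s' = S s.
End Counter.

(* Instantiating the functor reuses its definitions for the counter. *)
Module CounterReach := Reach Counter.

Lemma one_reachable : CounterReach.reachable 1.
Proof.
  apply CounterReach.R_step with (s := 0).
  - apply CounterReach.R_init. exact (eq_refl 0).
  - exact (eq_refl 1).
Qed.
```

The `<:` ascription checks Counter against MACHINE while keeping its definitions transparent, which is why the lemma can compute with concrete numbers.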

Module ModelCheck
- Provides fundamental methods of model checking
  - Methods to prove theorems about infinite-state systems through exhaustive exploration
- The model is refined in each of the following stages
[Figure: stages ordered from abstract to concrete]

Module ModelCheck: Introduced Elements
- absState: a set of abstract states
  - An abstract state is managed together with “hypotheses”, states that are known to be safe
    - A hypothesis is used, for example, to formalize the return pointer of a function
- A relation describes the correspondence between machine states and abstract states
  - The context (Γ) is erased when a verifier is extracted
- init is a set (actually a list) of initial states
  - It must be a set because one real machine state may correspond to multiple abstract states
  - There must be some element of init that has no hypotheses

Module ModelCheck: Introduced Elements
- step describes an execution step
  - Executes an instruction from the specified state
    - soption is used because the execution may get stuck
  - Progress and Preservation must hold
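
The exact Progress and Preservation statements are not preserved in this transcript. As a reminder of why the two properties matter, the sketch below uses a generic textbook formulation, not the paper's, over a hypothetical correspondence relation rel, and shows that together they rule out stuck states along any execution.

```coq
Section SoundnessSketch.
  Variables mstate absState : Set.
  Variable mstep : mstate -> mstate -> Prop.    (* concrete step *)
  Variable rel : mstate -> absState -> Prop.    (* correspondence *)

  (* Generic formulations, assumed for this sketch. *)
  Hypothesis progress :
    forall s a, rel s a -> exists s', mstep s s'.
  Hypothesis preservation :
    forall s s' a, rel s a -> mstep s s' -> exists a', rel s' a'.

  (* Zero or more concrete steps. *)
  Inductive steps : mstate -> mstate -> Prop :=
    | steps_refl : forall s, steps s s
    | steps_trans : forall s1 s2 s3,
        steps s1 s2 -> mstep s2 s3 -> steps s1 s3.

  (* Preservation lifts to whole executions. *)
  Lemma rel_invariant : forall s s', steps s s' ->
    forall a, rel s a -> exists a', rel s' a'.
  Proof.
    intros s s' Hsteps.
    induction Hsteps as [s0 | s1 s2 s3 Hs IH Hstep]; intros a Hrel.
    - exists a. exact Hrel.
    - destruct (IH a Hrel) as [a2 Ha2].
      exact (preservation s2 s3 a2 Ha2 Hstep).
  Qed.

  (* Hence no state reachable from an abstracted state is stuck. *)
  Theorem never_stuck : forall s s' a,
    rel s a -> steps s s' -> exists s'', mstep s' s''.
  Proof.
    intros s s' a Hrel Hsteps.
    destruct (rel_invariant s s' Hsteps a Hrel) as [a' Ha'].
    exact (progress s' a' Ha').
  Qed.
End SoundnessSketch.
```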

Module ModelCheck: The Concept Illustrated
[Figure: MACHINE (the input to the module) drawn as the state space of a real machine, with absState, step, and the initial states marked]

Module Reduction
- Translates x86 machine language into a simpler RISC-style instruction set (SAL)
  - x86 machine language is too complex to be suitable for verification
    - One instruction may perform several basic operations
    - The same basic operations show up in many instructions
- The Reduction module also provides a model checking layer for SAL programs

Module Reduction: SAL (Simplified Assembly Language)
- Named after the language used in Proof-Carrying Code [Necula 1997]
- RISC-style instruction set
  - Arithmetic is extended to allow expressions with parentheses and infix operators
- Additional temporary registers TMP_i

Module FixedCode
- Ensures that the code region is not overwritten by the code itself
  - This simplifies the verification framework
- Its definition takes the form of a ModelCheck
  - The additional check is performed only on stores to memory

Module TypeSystem
- Supports a standard approach to type systems
  - A set of types is introduced and typing rules for values are described
  - A subtype relation is also introduced
    - The definition in the figure suffices because Coq takes care of that part
  - Each register is associated with a type

Module TypeSystem
- viewShift represents a shift in the view of types
  - Occurs where a program crosses an abstraction boundary
    - For example, at function calls, when the stack frame changes
    - Introducing an existential is also a kind of view shift

Module WeakUpdate
- Introduces a weak-update type system
  - Each memory cell has an associated type, and this type does not change during a run
    - A cell can be overwritten only with a value of its type
- Dynamic memory management is out of scope
  - In a real setting, memory is frequently reclaimed and reused
    - Garbage collector or malloc/free
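
The weak-update discipline can be illustrated with a deliberately tiny model: memory as a function from addresses to values, a fixed typing of cells, and a lemma that a type-respecting store keeps memory well typed. All names here (addr, value, ty, hasType, store, and so on) are assumptions for the sketch and do not come from the paper's library.

```coq
Section WeakUpdateSketch.
  Variables addr value ty : Set.
  Variable hasType : value -> ty -> Prop.
  Variable addr_eq_dec : forall a b : addr, {a = b} + {a <> b}.

  Definition memory := addr -> value.
  Definition memTyping := addr -> ty.  (* fixed for the whole run *)

  (* Every cell holds a value of the type assigned to that cell. *)
  Definition wellTyped (T : memTyping) (m : memory) : Prop :=
    forall a, hasType (m a) (T a).

  (* Overwrite cell a with v; all other cells are unchanged. *)
  Definition store (m : memory) (a : addr) (v : value) : memory :=
    fun b => if addr_eq_dec a b then v else m b.

  (* Weak update: storing a value of the cell's own type
     preserves well-typedness under the same typing T. *)
  Lemma store_preserves :
    forall (T : memTyping) (m : memory) (a : addr) (v : value),
      wellTyped T m -> hasType v (T a) -> wellTyped T (store m a v).
  Proof.
    intros T m a v Hm Hv b.
    unfold store.
    destruct (addr_eq_dec a b) as [Heq | Hneq].
    - subst b. exact Hv.
    - exact (Hm b).
  Qed.
End WeakUpdateSketch.
```

Because T never changes, the model has no way to express freeing a cell and reusing it at another type, which matches the slide's remark that dynamic memory management is out of scope.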

The Rest of the Modules
- Module StackTypes
  - Keeps track of the types of stack slots
- Module SimpleFlags
  - Keeps track of flag values
    - In x86 (too), no single instruction performs a conditional test and a jump at once
  - Crucial for ensuring that a pointer is valid (not null) or for checking array bounds

Case Study: A Verifier for Algebraic Datatypes
- Implemented the library and a sample verifier with Coq
  - Approx. 20K (+α) LoC
    - Main implementation consists of only 600 LoC
    - 7,000 LoC for implementing library components
    - 10,000 for generic utilities
    - 1,000 for bitvectors and fixed-precision arithmetic
    - 1,000 for a subset of x86 machine code
  - Auxiliary library implemented in O’Caml (not counted here)
    - x86 binary parsing, etc.

Related Work
- Foundational PCC [Appel 2001]
  - Reduces the TCB and also improves the flexibility of PCC by constructing the system on a logical framework
  - However, efficiency is sacrificed for generality
    - Theoretical issues seem to take priority over pragmatics
- Epigram [McBride, McKinna 2004], ATS [Chen, Xi 2005], RSP [Westbrook et al. 2005] and GADTs [Sheard 2004]
  - Incorporate dependent types into programming languages
  - But the foundations of Coq’s implementation and metatheory are simpler than theirs

Summary (of the Paper)
- Designed a structure for modular certified verifiers
  - Components are reusable functors
  - Pipeline-style design
- Implemented library components with Coq
  - As a case study, a memory safety verifier for x86 machine code is constructed

Relevance to My Research
- I have been studying a framework for building verifiers for low-level languages
  - First, formalize the common language ADL
  - Verification is done on the translated program (in ADL)
- Trying to prove the correctness of the translation
  - Currently ongoing with Coq

Relevance to My Research
- The two approaches are very similar
  - ADL and SAL are both designed on minimalist criteria
  - The verification logic is built on top of the common language’s semantics
    - To achieve high portability and flexibility
  - From this viewpoint, my project is covered by his… (x_x)
    - Correctness of the translation is also proven with Coq in proofos
  - Thinking positively, my direction was not so wrong

Relevance to My Research
- Comparison of the two projects

                     proofos [Chlipala 06]    L3Cover [Yoshino 06]
  Common language    SAL                      ADL
  Implementation     Coq                      Java
  Parametrization    ML-style module          OO-style (inheritance)