Lectures on Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 1 (of 3) June 21-22, 2003 University of Oregon 2004 Summer School on Software.

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Comparing Semantic and Syntactic Methods in Mechanized Proof Frameworks C.J. Bell, Robert Dockins, Aquinas Hobor, Andrew W. Appel, David Walker 1.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Foundational Certified Code in a Metalogical Framework Karl Crary and Susmit Sarkar Carnegie Mellon University.
David Evans CS655: Programming Languages University of Virginia Computer Science Lecture 20: Total Correctness; Proof-
Mobile Code Security Aviel D. Rubin, Daniel E. Geer, Jr. MOBILE CODE SECURITY, IEEE Internet Computing, 1998 Minkyu Lee
Dynamic Typing COS 441 Princeton University Fall 2004.
Design & Analysis of Algoritms
Nicholas Moore Bianca Curutan Pooya Samizadeh McMaster University March 30, 2012.
Types, Proofs, and Safe Mobile Code The unusual effectiveness of logic in programming language research Peter Lee Carnegie Mellon University January 22,
VIDE als voortzetting van Cocktail SET Seminar 11 september 2008 Dr. ir. Michael Franssen.
ISBN Chapter 3 Describing Syntax and Semantics.
Copyright © 2006 Addison-Wesley. All rights reserved. 3.5 Dynamic Semantics Meanings of expressions, statements, and program units Static semantics – type.
Lectures on Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 2 (of 3) June 21-22, 2003 University of Oregon 2004 Summer School on Software.
1 Semantic Description of Programming languages. 2 Static versus Dynamic Semantics n Static Semantics represents legal forms of programs that cannot be.
An Introduction to Proof-Carrying Code David Walker Princeton University (slides kindly donated by George Necula; modified by David Walker)
The Design and Implementation of a Certifying Compiler [Necula, Lee] A Certifying Compiler for Java [Necula, Lee et al] David W. Hill CSCI
Code-Carrying Proofs Aytekin Vargun Rensselaer Polytechnic Institute.
CSE331: Introduction to Networks and Security Lecture 28 Fall 2002.
Proof-system search ( ` ) Interpretation search ( ² ) Main search strategy DPLL Backtracking Incremental SAT Natural deduction Sequents Resolution Main.
1 Enforcing Confidentiality in Low-level Programs Andrew Myers Cornell University.
Programmability with Proof-Carrying Code George C. Necula University of California Berkeley Peter Lee Carnegie Mellon University.
CS 330 Programming Languages 09 / 18 / 2007 Instructor: Michael Eckmann.
Language-Based Security Proof-Carrying Code Greg Morrisett Cornell University Thanks to G.Necula & P.Lee.
A Type System for Expressive Security Policies David Walker Cornell University.
1 Programming & Programming Languages Overview l Machine operations and machine language. l Example of machine language. l Different types of processor.
1 CSE 380 Computer Operating Systems Instructor: Insup Lee and Dianna Xu University of Pennsylvania Fall 2003 Lecture Note: Protection Mechanisms.
CS 330 Programming Languages 09 / 16 / 2008 Instructor: Michael Eckmann.
Describing Syntax and Semantics
School of Computer ScienceG53FSP Formal Specification1 Dr. Rong Qu Introduction to Formal Specification
Extensible Untrusted Code Verification Robert Schneck with George Necula and Bor-Yuh Evan Chang May 14, 2003 OSQ Retreat.
1 The Problem o Fluid software cannot be trusted to behave as advertised unknown origin (must be assumed to be malicious) known origin (can be erroneous.
20 February Detailed Design Implementation. Software Engineering Elaborated Steps Concept Requirements Architecture Design Implementation Unit test Integration.
Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 1 Course Overview July 10, 2001 Lipari School on Foundations of.
GENERAL CONCEPTS OF OOPS INTRODUCTION With rapidly changing world and highly competitive and versatile nature of industry, the operations are becoming.
High level & Low level language High level programming languages are more structured, are closer to spoken language and are more intuitive than low level.
ECE 720T5 Winter 2014 Cyber-Physical Systems Rodolfo Pellizzoni.
SWE 316: Software Design and Architecture – Dr. Khalid Aljasser Objectives Lecture 11 : Frameworks SWE 316: Software Design and Architecture  To understand.
CS 430/530 Formal Semantics Paul Hudak Yale University Department of Computer Science Lecture 1 Course Overview September 6, 2007.
Containment and Integrity for Mobile Code Security policies as types Andrew Myers Fred Schneider Department of Computer Science Cornell University.
Proof Carrying Code Zhiwei Lin. Outline Proof-Carrying Code The Design and Implementation of a Certifying Compiler A Proof – Carrying Code Architecture.
Proof-Carrying Code & Proof-Carrying Authentication Stuart Pickard CSCI 297 June 2, 2005.
Dichotomies: Software Research vs Practice Peter Lee Carnegie Mellon University HCMDSS Workshop, June 2005 Peter Lee Carnegie Mellon University HCMDSS.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
© Andrew IrelandDependable Systems Group On the Scalability of Proof Carrying Code for Software Certification Andrew Ireland School of Mathematical & Computer.
Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 2 July 11, 2001 Overview of PCC and Safety Policies Lipari School.
The TAOS Authentication System: Reasoning Formally About Security Brad Karp UCL Computer Science CS GZ03 / M th November, 2008.
Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 1 Course Overview July 10, 2001 Lipari School on Foundations of.
3.2 Semantics. 2 Semantics Attribute Grammars The Meanings of Programs: Semantics Sebesta Chapter 3.
Semantics In Text: Chapter 3.
An Axiomatic Basis for Computer Programming Robert Stewart.
Automated tactics for separation logic VeriML Reconstruct Z3 Proof Safe incremental type checker Certifying code transformation Proof carrying hardware.
Introduction Program File Authorization Security Theorem Active Code Authorization Authorization Logic Implementation considerations Conclusion.
CSCI1600: Embedded and Real Time Software Lecture 28: Verification I Steven Reiss, Fall 2015.
Lecture 4 Page 1 CS 111 Online Modularity and Virtualization CS 111 On-Line MS Program Operating Systems Peter Reiher.
Computer Science at Carnegie Mellon Freshman IC Peter Lee Professor and Associate Dean.
CSE 311 Foundations of Computing I Lecture 28 Computability: Other Undecidable Problems Autumn 2011 CSE 3111.
SAFE KERNEL EXTENSIONS WITHOUT RUN-TIME CHECKING George C. Necula Peter Lee Carnegie Mellon U.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
This Week Lecture on relational semantics Exercises on logic and relations Labs on using Isabelle to do proofs.
CSE 60641: Operating Systems George C. Necula and Peter Lee, Safe Kernel Extensions Without Run-Time Checking, OSDI ‘96 –SIGOPS Hall of fame citation:
Agenda  Quick Review  Finish Introduction  Java Threads.
Secure Programming Dr. X
Types for Programs and Proofs
Secure Programming Dr. X
Certified Code Peter Lee Carnegie Mellon University
State your reasons or how to keep proofs while optimizing code
Tonga Institute of Higher Education IT 141: Information Systems
Lecture 19: Proof-Carrying Code Background just got here last week
Tonga Institute of Higher Education IT 141: Information Systems
Presentation transcript:

Lectures on Proof-Carrying Code Peter Lee Carnegie Mellon University Lecture 1 (of 3) June 21-22, 2003 University of Oregon 2004 Summer School on Software Security

“After a crew member mistakenly entered a zero into the data field of an application, the computer system proceeded to divide another quantity by that zero. The operation caused a buffer overflow, in which data leaked from a temporary storage space in memory, and the error eventually brought down the ship's propulsion system. The result: the USS Yorktown was dead in the water for more than two hours.”

According to CERT, buffer overflow attacks are the #1 exploit for network security attacks.

buffer overflow! +=

Automotive analogy “If the automobile had followed the same development as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and...

Automotive analogy “If the automobile had followed the same development as the computer, a Rolls-Royce would today cost $100, get a million miles per gallon, and explode once a year killing everyone inside." - Robert Cringely

Cars in the Real World Problems at Mercedes: Forced to buy back 2000 copies of the latest E-Class sedan, due to problems with computing and telecommunications systems J.D.Power initial quality rankings have dropped to 15 th (even below Chevrolet!) in 2003  board member Jurgen Hubbert says this is directly related to the effort to be a “technology leader”

Observations Many failures are due to simple problems “in the details” Code reuse is necessary but perilous Updateable/mobile code is essential Performance matters a lot

Opportunities Progress depends fundamentally on our ability to reason about programs. The opportunities are great. Who will provide the technology for systems that work?

About these lectures The main topic is proof-carrying code, an example of certified code It won’t be possible to go through all aspects in complete detail But I hope to provide background to make it easier to get started with research

The Code Safety Problem

Please install and execute this.

“Applets, Not Craplets” - Luca Cardelli, 1996

“If you have ‘process’ without ‘inspiration,’ all you end up with is well- documented crap.” — Dr. John C. Sommerer, CTO, Johns Hopkins Advanced Physics Lab

Code Safety CPU Code Trusted Host Is this safe to execute?

Approach 1 Trust the code producer CPU Code Trusted Host sig Trusted 3rd Party PK1 PK2 Trust is based on personal authority, not program properties Scaling problems?

Approach 2 Baby-sit the program CPU Code Trusted Host Execution monitor Expensive E.g., Software Fault Isolation [Wahbe & Lucco], Inline Reference Monitors [Schneider]

Approach 3 Java CPU Code Trusted Host Interp/ JIT Expensive and/or big Limited in expressive power Verifier

Theorem Prover Approach 4 Formal verification CPU Code Flexible and powerful. Trusted Host But really really really hard and must be correct.

A key idea: Checkable certificates Certifying Prover CPU Proof Checker Code Proof Trusted Host

A key idea: Checkable certificates Certifying Prover CPU Code Proof Checker

Proof-Carrying Code [Necula & Lee, OSDI’96] A B

Five Frequently Asked Questions

Question 1 How are the proofs represented and checked?

Formal proofs Write “x is a proof of predicate P” as x:P. What do proofs look like?

Example inference rule If we have a proof x of P and a proof y of Q, then x and y together constitute a proof of P  Q. Or, in ASCII:  Given x:P, y:Q then (x,y):P*Q.

More inference rules Assume we have a proof x of P. If we can then obtain a proof b of Q, then we have a proof of P  Q. Given [x:P] b:Q then fn (x:P) => b : P  Q. More rules:  Given x:P*Q then fst(x):P  Given y:P*Q then snd(y):Q

Types and proofs So, for example: fn (x:P*Q) => (snd(x), fst(x)) : P*Q  Q*P This is an ML program! Also, typechecking provides a “smart” blackboard!

Curry-Howard Isomorphism In a logical framework language, predicates can be represented as types and proofs as programs (i.e., expression terms). Furthermore, under certain conditions typechecking is sufficient to ensure the validity of the proofs.

“Proofs as Programs” “Propositions as Types”

LF The Edinburgh Logical Framework language, or LF, provides an expressive language for proofs-as- programs. Furthermore, its use of dependent types allows, among other things, the axioms and rules of inference to be specified as well

Oracle strings A B rlrrllrrllrlrlrllrlrrllrrll…

Question 2 How well does this work in practice?

The Necula-Lee experiments Certifying Prover CPU Code Proof Simple, small (<52KB), and fast. No longer need to trust this component. Proof Checker Reasonable in size (0-10%).

Crypto test suite results sec

Question 3 Aren’t the properties we’re trying to prove undecideable? How on earth can we hope to generate the proofs?

How to generate the proofs? Proving theorems about real programs is indeed hard Most useful safety properties of low-level programs are undecidable Theorem-proving systems are unfamiliar to programmers and hard to use even for experts

The role of programming languages Civilized programming languages can provide “safety for free” Well-formed/well-typed  safe Idea: Arrange for the compiler to “explain” why the target code it generates preserves the safety properties of the source program

Certifying Compilers [Necula & Lee, PLDI’98] Intuition: Compiler “knows” why each translation step is semantics-preserving So, have it generate a proof that safety is preserved This is the planned topic for tomorrow’s lecture

Certifying compilation Certifying Compiler CPU Looks and smells like a compiler. % spjc foo.java bar.class baz.c -ljdk1.2.2 Source code Proof Object code Certifying Prover Proof Checker

Java Java is a worthwhile subject of research. However, it contains many outrageous and mostly inexcusable design errors. As researchers, we should not forget that we have already done much better, and must continue to do better in the future.

Question 4 Just what, exactly, are we proving? What are the limits? And isn’t static checking inherently less powerful than dynamic checking?

Semantics Define the states of the target machine S = (, , pc) and a transition function Step(S). Define also the safe machine states via the safety policy SP(S). program register state program counter

Semantics, cont’d Then we have the following predicate for safe execution: Safe(S) = n:Nat. SP(Step n (S)) and proof-carrying code: PCC = (S 0 :State, P:Safe(S 0 ))

Reference Interpreters A reference interpreter (RI) is a standard interpreter extended with instrumentation to check the safety of each instruction before it is executed, and abort execution if anything unsafe is about to happen. In other words, an RI is capable only of safe execution.

Reference Interpreters cont’d The reference interpreter is never actually implemented. The point will be to prove that execution of the code on the RI never aborts, and thus execution on the real hardware will be identical to execution on the RI.

Question for you Suppose that we require the code to execute no more than N instructions. Is such a safety property enforceable by an RI?

Question for you Suppose we require the code to terminate eventually. Is such a safety property enforceable by an RI?

What can’t be enforced? Informally: Safety properties  Yes “No bad thing will happen” Liveness properties  Not yet “A good thing will eventually happen”

Static vs dynamic checking PCC provides a basis for static enforcement of safety conditions However, PCC is not just for static checking PCC can be used, for example, to verify that necessary dynamic checks are carried out properly

Question 5 Even if the proof is valid, how do we know that it is a safety proof of the given program?

Please install and execute this. OK, but let me quickly look over the instructions first. Code producerHost

Code producerHost

This store instruction is dangerous! Code producerHost

Can you prove that it is always safe? Code producerHost

Can you prove that it is always safe? Yes! Here’s the proof I got from my certifying Java compiler! Code producerHost

Your proof checks out. I believe you because I believe in logic. Code producerHost

The safety policy We need a method for identifying the dangerous instructions, and generating logical predicates whose validity implies that the instruction is safe to execute In practice, we will also need specifications (pre/post-conditions) for each required entry point in the code, as well as the trusted API.

High-level architecture Explanation Code analyzer Checker Safety policy Agent Host

High-level architecture Proof Code Verification condition generator Proof checker Proof rules Agent Host

VCgen The job of identifying dangerous instructions and generating predicates for them is performed via an old method: verification-condition generation

A Case Study

A case study As a case study, let us consider the problem of verifying that programs do not use more than a specified amount of some resource. s ::= skip | i := e | if e then s else s | while e do s | s ; s | use e e ::= n | i | read() | e + e | e – e | … Denotes the use of n pieces of the resource, where e evaluates to n

Case study, cont’d Under normal circumstances, one would implement the statement: use e; in such a way that every time it is executed, a run-time check is performed in order to determine whether n pieces of the resource are available (assuming e evaluates to n).

Case study, cont’d However, this stinks because many times we should be able to infer that there are definitely available resources. … if … then use 4; else use 5; use 4; … If somehow we know that there are 9 available here… …then certainly there is no need to check any of these uses!

An easy (well, probably) case Program Static i := 0 while i < use 1 i := i + 1 We ought to be able to prove statically whether the uses are safe

A hopeless case Program Dynamic while read() != 0 use 1

An interesting case Program Interesting N := read() i := 0 while i < N use 1 i := i + 1 In principle, with just a single dynamic check, static proof ought to be possible

Also interesting Program AlsoInteresting while read() != 0 i := 0 while i < 100 use 1 i := i + 1

A core principle of PCC In the code, the implementation of a safety- critical operation should be separated from the implementation of its safety checks

Separating use from check So, what we would like to do is to separate the safety check from the use. We do this by introducing a new construct, acquire acquire requests n amount of resource; use no longer does any checking s ::= skip | i := e | if e then s else s | while e do s | s ; s | use e | acquire e

Separation permits optimization The point of acquire is to allow the programmer (or compiler) to hoist and coalesce the checks … acquire 9; if … then use 4; else use 5; use 4; … acquire n; i := 0; while (i++ < n) do { … use 1; … } It will be up to PCC to verify that each use is definitely safe to execute

High-level architecture Proof Code Verification condition generator Proof checker Proof rules Agent Host