A Trustworthy Proof Checker

Andrew W. Appel, Neophytos G. Michael, Roberto Virga (Princeton University)
Aaron Stump (Stanford University)
FCS & VERIFY, July 2002
A trustworthy proof checker for proofs of properties of machine-code programs.
2/5/2019

Trusted Computing Base
[Diagram: two stacks, side by side]
Operating system: gcc, emacs, netscape, rogomatic, and make rest on the kernel.
Theorem: $a^n + b^n \neq c^n$ rests on its proof, which rests on the axioms.
Trusted base: the kernel and the axioms.

The problem: Mobile Code Security
[Diagram] Code producer: Source Program -> Compiler -> Code. Code consumer: Code -> Execute (?).
    load r3, 4(r2)
    add r2,r4,r1
    store 1, 0(r7)
    store r1, 4(r7)
    add r7,0,r3
    add r7,8,r7
    beq r3, .-20
Resources at stake on the consumer side: private files, network access, launch control, etc.

Existing Practice: Hardware VM protection
[Diagram] Code producer: Source Program -> Compiler -> Machine Code. Code consumer: Machine Code -> Execute, with the operating system's virtual memory guarding the protected resources.
Machine code: load r3, 4(r2); add r2,r4,r1; store 1, 0(r7); store r1, 4(r7); add r7,0,r3; add r7,8,r7; beq r3, .-20
Disadvantages:
- Large trusted code base of the O.S.
- Clumsy, slow interfaces between trusted & untrusted code

Existing Practice: Bytecode Verification
[Diagram] Code producer: Java Program -> Compiler -> ByteCode. Code consumer: ByteCode -> Bytecode Verifier -> OK -> Just-in-time Compiler -> Native code -> Execute. The consumer-side components are labeled Trusted Computing Base.
Native code: load r3, 4(r2); add r2,r4,r1; store 1, 0(r7); store r1, 4(r7); add r7,0,r3; add r7,8,r7; beq r3, .-20
Advantage: clean, fast, O-O interface between trusted & untrusted code
Disadvantage: huge trusted computing base (JIT)

Foundational Proof-Carrying Code
[Diagram] Code producer: Source Program -> Compiler -> Native Code + Hints; Hints and Machine Spec + Policy -> Prover -> Safety Proof. Code consumer: Native Code, Safety Proof, and Machine Spec + Policy -> Checker -> OK -> Execute.
Native code: load r3, 4(r2); add r2,r4,r1; store 1, 0(r7); store r1, 4(r7); add r7,0,r3; add r7,8,r7; beq r3, .-20
Trusted Computing Base: Machine Spec + Policy, and the Checker

Trusted Computing Base
- The TCB is the minimal set of code that must be trusted
- Our goal: make the TCB as small as possible
- The TCB consists of two pieces:
  - The safety policy (a predicate in Higher-Order Logic that characterizes whether a program is safe to execute)
  - The proof checker (a small C program that checks safety proofs)

Trusted Computing Base (cont.)
Safety policy
- Choose a logical framework (programming language for logic): we choose LF
- Choose an object logic (axioms, inference rules): we choose Higher-Order Logic
- Represent our theorem in the object logic: we will explain...
Proof checking
- Build a proof checker for the logical framework: we use Twelf to prove theorems, but for checking we want something smaller and simpler . . .

LF, Twelf, and Higher Order Logic
What is LF? (Harper et al. 1993)
- A Logical Framework for defining and presenting logics
- Based on a general treatment of syntax, rules, and proofs by means of a typed first-order $\lambda$-calculus
- Its type system has three levels of terms:
  - Objects
  - Types that classify objects
  - Kinds that classify families of types
- Equality is taken as $\beta\eta$-conversion
- The judgments-as-types principle
We use the Twelf implementation of LF (Pfenning et al. 99)
We implement a standard HOL with arithmetic

Programming in Twelf
Define formula constructors (an LF signature):
    num  : type.
    form : type.
    imp  : form -> form -> form.
    %infix right 10 imp.    % lets us write A imp B
Define proof constructors (axioms):
    pf    : form -> type.
    imp_i : (pf A -> pf B) -> pf (A imp B).
    imp_e : pf (A imp B) -> pf A -> pf B.

Theorems, proof checking in HOL
Proof of logical transitivity:
    imp_trans : pf (A imp B) -> pf (B imp C) -> pf (A imp C) =
      [p1 : pf (A imp B)] [p2 : pf (B imp C)]
        imp_i [p3 : pf A] imp_e p2 (imp_e p1 p3).
This shows the general form of a Twelf definition: name : type = exp.
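To make the idea of "checking a proof is type-checking a term" concrete outside Twelf, here is a minimal Python sketch that checks the imp_trans proof term. The tuple encoding, the explicit hypothesis context, and the names `imp`, `check`, and `imp_trans` are assumptions made for illustration; this is not the authors' checker, and it does not use LF's higher-order abstract syntax.

```python
# Formulas: an atom is a string; an implication is ("imp", A, B).
def imp(a, b):
    return ("imp", a, b)

# Proof terms (hypothetical encoding, for illustration only):
#   ("hyp", name)             use a hypothesis from the context
#   ("imp_i", name, A, body)  assume A under `name`, prove B with body; yields A imp B
#   ("imp_e", p, q)           from p : A imp B and q : A, conclude B
def check(proof, ctx):
    """Return the formula proved by `proof` under hypotheses `ctx`, or raise TypeError."""
    tag = proof[0]
    if tag == "hyp":
        return ctx[proof[1]]
    if tag == "imp_i":
        _, name, a, body = proof
        b = check(body, {**ctx, name: a})
        return imp(a, b)
    if tag == "imp_e":
        f = check(proof[1], ctx)
        a = check(proof[2], ctx)
        if not isinstance(f, tuple) or f[0] != "imp" or f[1] != a:
            raise TypeError("imp_e: argument does not match the antecedent")
        return f[2]
    raise TypeError(f"unknown proof constructor: {tag}")

# imp_trans: from p1 : A imp B and p2 : B imp C, build a proof of A imp C.
def imp_trans(a, p1, p2):
    return ("imp_i", "p3", a, ("imp_e", p2, ("imp_e", p1, ("hyp", "p3"))))

ctx = {"p1": imp("A", "B"), "p2": imp("B", "C")}
proof = imp_trans("A", ("hyp", "p1"), ("hyp", "p2"))
result = check(proof, ctx)  # the formula A imp C, as a tuple
```

Swapping the two arguments of an `imp_e` makes `check` raise, which is the analogue of a Twelf type error on an ill-formed proof.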

The safety policy
“This program accesses memory only in range 0-1000.”
“This program never executes an illegal instruction.”
Step I: define access predicates
    $\mathrm{readable}(x) = 0 \le x \le 1000$
    $\mathrm{writable}(x) = 0 \le x \le 1000$
Step II: define legal instructions . . .

Machine states, step relation
Machine state = register bank + memory
$(r,m) \mapsto (r',m')$: the step relation is a map between machine states
[Diagram: register bank r (registers 1, 2, 3, psr, pc) and memory m before the step; register bank r' and memory m' after it.]

Machine instruction = step relation
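$\mathtt{add}\ r_1 := r_2 + r_3 \;\equiv\; m' = m \,\wedge\, r'(1) = r(2) + r(3) \,\wedge\, r'(\mathrm{pc}) = 1 + r(\mathrm{pc}) \,\wedge\, \forall i.\ (i \neq 1 \wedge i \neq \mathrm{pc}) \rightarrow r'(i) = r(i)$
[Diagram: register banks r and r' (registers 1, 2, 3, psr, pc) and memories m and m' before and after the add.]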
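The add instruction is specified as a relation between two machine states rather than as a procedure. A minimal Python sketch of that relation follows; representing register banks as dictionaries, the key `"pc"`, and the function name `add_step` are assumptions made for illustration, and the register values are illustrative.

```python
PC = "pc"  # hypothetical key for the program counter

def add_step(r, m, r2, m2):
    """True iff (r, m) |-> (r2, m2) is a step of `add r1 := r2 + r3`."""
    if r2.keys() != r.keys():
        return False
    # Every register other than r1 and pc must be unchanged.
    unchanged = all(r2[i] == r[i] for i in r if i not in (1, PC))
    return (m2 == m
            and r2[1] == r[2] + r[3]
            and r2[PC] == 1 + r[PC]
            and unchanged)

before = {1: 7, 2: 2, 3: 6, "psr": 0, PC: 40}
after = {1: 8, 2: 2, 3: 6, "psr": 0, PC: 41}
clobbered = {1: 8, 2: 2, 3: 0, "psr": 0, PC: 41}  # r3 changed: not an add step
mem = {}
```

The relation only says which pairs of states are related; it never computes the successor, which keeps it close to the logical specification.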

Instruction decoding; memory policy
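$(r,m) \mapsto (r',m') \;\equiv\; \exists w,i,j,k.\ m(r(\mathrm{pc})) = w \,\wedge\, w = 3 \cdot 2^{12} + i \cdot 2^{8} + j \cdot 2^{4} + k \,\wedge\, m' = m \,\wedge\, \mathrm{readable}(r(j)+k) \,\wedge\, r'(i) = m(r(j)+k) \,\wedge\, r'(\mathrm{pc}) = 1 + r(\mathrm{pc}) \,\wedge\, \forall x.\ (x \neq i \wedge x \neq \mathrm{pc}) \rightarrow r'(x) = r(x)$
This disjunct specifies load $r_i := m(r_j + k)$. The full step relation is a disjunction $(\ldots) \vee (\ldots) \vee \ldots$ with one disjunct per instruction.
Instruction word layout: w = [ op | d | s1 | s2 ], here [ 3 | i | j | k ]
[Diagram: the word w is fetched from memory m at address r(pc).]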
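The load clause combines instruction decoding with the memory policy. The following Python sketch decodes a word according to the slide's arithmetic and takes a load step only when the address is readable. The 4-bit field widths are an assumption read off the powers of two in the formula, and the dictionary representation and the names `decode` and `load_step` are hypothetical.

```python
LOAD = 3  # opcode value taken from the slide's formula

def readable(x):
    return 0 <= x <= 1000

def decode(w):
    """Split w = op*2**12 + i*2**8 + j*2**4 + k, assuming four 4-bit fields."""
    return (w >> 12) & 0xF, (w >> 8) & 0xF, (w >> 4) & 0xF, w & 0xF

def load_step(r, m):
    """Return (r', m') for a legal `load ri := m(rj + k)`, or None if no step exists."""
    w = m.get(r["pc"])
    if w is None:
        return None
    op, i, j, k = decode(w)
    if op != LOAD:
        return None
    addr = r[j] + k
    if not readable(addr) or addr not in m:
        return None               # the policy forbids this access: no step
    r2 = dict(r)                  # all other registers stay unchanged
    r2[i] = m[addr]
    r2["pc"] = 1 + r["pc"]
    return r2, m                  # memory is unchanged by a load

word = 3 * 2**12 + 1 * 2**8 + 2 * 2**4 + 4     # load r1 := m(r2 + 4)
regs = {n: 0 for n in range(16)}
regs[2] = 100
regs["pc"] = 7
mem = {7: word, 104: 42}
```

An access outside the readable range yields no successor state at all, which is how the policy turns an unsafe access into an "illegal instruction".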

Making the specification concise & trustworthy
Described in [Michael & Appel 2000]
- Separate syntax from semantics
- Factor the semantics
- Use the “New Jersey Machine-Code Toolkit” (NJMCT) to describe syntax
- Automatically translate NJMCT descriptions into concise and readable higher-order logic

Specifying safe execution
- The $\mapsto$ relation includes only the legal instructions
- Safety means, “no matter how many instructions you execute, the next instruction is legal”
- The program is meant to be loaded at some start address:
    $\mathrm{loaded}(m, \mathit{start}, \mathit{prog}) = \forall i \in \mathrm{dom}(\mathit{prog}).\ m(\mathit{start}+i) = \mathit{prog}(i)$
Example: loaded(m, 100, (9017; 4214; 8099; 4010; 6231; 1008))
[Diagram: memory holding 9017 4214 8099 4010 6231 1008 starting at address 100.]
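The loaded predicate translates almost directly into code. A minimal Python sketch, assuming memory is a dictionary from addresses to words and that dom(prog) is the index range of a tuple (both representation choices are assumptions):

```python
def loaded(m, start, prog):
    """forall i in dom(prog). m(start + i) = prog(i)"""
    return all(m.get(start + i) == word for i, word in enumerate(prog))

prog = (9017, 4214, 8099, 4010, 6231, 1008)
m = {100 + i: word for i, word in enumerate(prog)}
```

With this memory, `loaded(m, 100, prog)` holds, while `loaded(m, 101, prog)` does not.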

Safety theorem
$\mathrm{safe}(\mathit{prog}) = \forall r,m,\mathit{start}.\ \mathrm{loaded}(m,\mathit{start},\mathit{prog}) \wedge r(\mathrm{pc}) = \mathit{start} \rightarrow \forall r',m'.\ (r,m) \mapsto^{*} (r',m') \rightarrow \exists r'',m''.\ (r',m') \mapsto (r'',m'')$
Theorem to be proved: safe(9017; 4214; 8099; 4010; 6231; 1008)
[Diagram, labeled Trusted Computing Base: register bank r with pc = start, and memory m holding 9017 4214 8099 4010 6231 1008 at start.]
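The safety theorem quantifies over all initial register banks and memories, so it can only be established by proof. Still, a bounded test of one initial state conveys what "the next instruction is always legal" means. The sketch below assumes a deterministic step function that returns None when no legal step exists; the toy machine (word 1 = no-op, word 2 = jump to address 0) and the names `stays_safe` and `toy_step` are invented for illustration.

```python
def stays_safe(step, state, fuel):
    """Check that `state` and each of its next fuel-1 successors can take a step."""
    for _ in range(fuel):
        nxt = step(state)
        if nxt is None:       # stuck: the next instruction is not legal
            return False
        state = nxt
    return True

def toy_step(state):
    """Hypothetical machine: state is (pc, memory)."""
    pc, mem = state
    w = mem.get(pc)
    if w == 1:                # no-op: fall through to the next word
        return (pc + 1, mem)
    if w == 2:                # jump back to address 0
        return (0, mem)
    return None               # anything else is an illegal instruction

looping = {0: 1, 1: 1, 2: 2}  # never gets stuck
falls_off = {0: 1, 1: 1}      # runs past the end of the program
```

A passing bounded run is evidence for one start state only; the point of proof-carrying code is to replace such testing with a checked proof of the universally quantified statement.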

Size of Safety Specification (Sparc)
[Table: size of the safety specification for Sparc]

Representation Issues in the Specification
- Eliminating redundancy in LF terms
- Dealing with arithmetic
- Representation of axioms and trusted definitions:
  - Encoding Higher-Order Logic in LF
  - Polymorphic programming in Twelf
  - Explicit versus implicit programming in Twelf: avoiding term reconstruction

Eliminating Redundancy
- LF signatures contain lots of redundant information:
    imp_i : {A: form}{B: form} (pf A -> pf B) -> pf (A imp B).
- Twelf’s answer: parameters can be “declared” implicit:
    imp_i : (pf A -> pf B) -> pf (A imp B).
- Implicit parameters in the TCB mean type reconstruction in the checker
  - The algorithm is large and complex
  - It relies on higher-order unification, which is undecidable (some valid proofs may fail)

Eliminating Redundancy (cont.)
- On the TCB side: we write axioms & trusted definitions in fully explicit style
- On the proving side: implicit versus explicit LF term sizes
- Other approaches to this problem: Necula’s LFi, oracle-based checking
- We represent proofs as DAGs with structure sharing of common sub-expressions
  - Proof-size blowup is avoided
  - The checker does not need to parse proofs
  - But the constant factor is not so good
- A tradeoff: TCB size versus proof size
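Structure sharing is what keeps fully explicit terms from blowing up. A small Python sketch of the idea, using hash-consing so that each distinct sub-expression is stored once; the `Dag` class and its methods are hypothetical, not the representation used by the authors' tools.

```python
class Dag:
    """Stores each distinct (op, arg1, arg2) node exactly once."""
    def __init__(self):
        self.nodes = []   # node id -> (op, arg1, arg2)
        self.index = {}   # (op, arg1, arg2) -> node id

    def mk(self, op, arg1=None, arg2=None):
        key = (op, arg1, arg2)
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

def tree_size(dag, n):
    """Number of nodes if the term rooted at n were written out as a tree."""
    _, a, b = dag.nodes[n]
    return 1 + sum(tree_size(dag, c) for c in (a, b) if c is not None)

dag = Dag()
t = dag.mk("A")
for _ in range(10):
    t = dag.mk("imp", t, t)   # each level mentions the previous term twice

shared = len(dag.nodes)       # nodes actually stored
unshared = tree_size(dag, t)  # nodes in the equivalent tree
```

Here the DAG stores 11 nodes while the equivalent tree has 2047, which illustrates why explicit terms are affordable once common sub-expressions are shared.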

Term Reconstruction in the Prover
- Twelf’s term reconstruction algorithm (a.k.a. “type inference”) is extremely useful in writing proofs
- Outside the TCB, write “compatibility lemmas” to interface with proofs that are written in implicit style

The Proof Checker
- A small C program (~ 803 lines, 1/3 of the TCB)
- Type checks explicit LF proofs, and loads and executes only safe programs
- Makes no use of libraries except read and _exit

Why do we need a parser?
- Not for proofs: they are transmitted to the checker in DAG form
- For axioms! Humans can’t read axioms and trusted definitions in DAG form, and therefore can’t trust them (see Pollack ‘98, “How to believe a machine-checked proof”)

DAG representation of proofs & types
- Each DAG node is 5 words
- The entire DAG is transmitted as a single block
Node layout: [ op | arg1 | arg2 | type | match ]
- op: opcode
- arg1: left child
- arg2: right child
- type: computed type
- match: weak head normal form
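Because every node has the same five-word shape, a checker can index into the transmitted block directly instead of parsing it. A minimal Python sketch of such a layout follows; the 32-bit little-endian word size, the use of node indices for children, and the example opcode values are all assumptions, since the slide gives only the field names.

```python
import struct

# One node = five unsigned 32-bit words: op, arg1, arg2, type, match.
NODE = struct.Struct("<5I")

def pack_dag(nodes):
    """Concatenate all nodes into a single contiguous block of bytes."""
    return b"".join(NODE.pack(*node) for node in nodes)

def node_at(block, i):
    """Read node i straight out of the block; no parsing step is needed."""
    return NODE.unpack_from(block, i * NODE.size)

nodes = [
    (1, 0, 0, 0, 0),   # hypothetical leaf
    (2, 0, 0, 0, 0),   # hypothetical leaf
    (3, 0, 1, 0, 0),   # hypothetical binary node with children 0 and 1
]
block = pack_dag(nodes)
```

In this sketch the type and match fields are sent as zeros, on the assumption that the checker fills them in as it works.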

Proof-checking measurements
- In the paper, we report a time of 74 seconds to check a benchmark proof (~ 6,000 lines)
- We have improved this to 0.48 seconds:
  - The checker marks closed terms
  - It avoids traversing closed terms during substitutions
  - This adds 20 lines to the proof checker
Node layout with the closed-term mark: [ op | cl | arg1 | arg2 | type | match ]
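Skipping closed terms pays off because substitution cannot change a term that has no free variables. The Python sketch below shows the effect on a term with one free variable next to a large closed subterm. It uses de Bruijn indices, assumes the substituted term is itself closed (so no index shifting is needed), and counts visited nodes; the `Term` class and `subst` function are illustrative, not the checker's C code.

```python
class Term:
    """kind is "var" (de Bruijn index), "lam", "app", or "const"."""
    def __init__(self, kind, *args):
        self.kind, self.args = kind, args
        if kind == "var":
            need = args[0] + 1                  # binders needed to close the term
        elif kind == "lam":
            need = max(args[0].need - 1, 0)
        elif kind == "app":
            need = max(args[0].need, args[1].need)
        else:
            need = 0
        self.need = need
        self.closed = need == 0                 # the "cl" mark

def subst(t, k, s, stats, skip_closed=True):
    """Replace variable k in t by the closed term s, counting visited nodes."""
    if skip_closed and t.closed:
        return t                                # nothing to substitute inside
    stats["visited"] += 1
    if t.kind == "var":
        return s if t.args[0] == k else t
    if t.kind == "lam":
        return Term("lam", subst(t.args[0], k + 1, s, stats, skip_closed))
    if t.kind == "app":
        return Term("app",
                    subst(t.args[0], k, s, stats, skip_closed),
                    subst(t.args[1], k, s, stats, skip_closed))
    return t

c = Term("const", "c")
big = c
for _ in range(200):
    big = Term("app", big, c)                   # a large closed term
term = Term("app", Term("var", 0), big)
d = Term("const", "d")

fast, slow = {"visited": 0}, {"visited": 0}
out_fast = subst(term, 0, d, fast, skip_closed=True)
out_slow = subst(term, 0, d, slow, skip_closed=False)
```

With the mark, only the two nodes above the closed subterm are visited; without it, every node of the large subterm is traversed as well.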

Smallest possible TCB
[Chart comparing TCB sizes; the values did not survive. Systems compared: open-source JVM with non-optimizing JIT; highly optimizing Java compiler; optimizing compiler; PCC system; Foundational PCC (our system).]

Future Work
- Machine descriptions for other CPUs (Mips, Sparc so far)
- The TCB is really small but proof sizes are large; work on finding the right tradeoff between TCB size and proof size:
  - Compress the DAG in some way
  - Use another compressed form of the LF syntactic notation
  - Add a simple Prolog interpreter to the TCB that “rediscovers” the proof based on the sequence of TAL instructions given to the checker; the TCB is then no longer minimal, but proof sizes are greatly reduced
