Secure Compiler Seminar 4/11 Visions toward a Secure Compiler Toshihiro YOSHINO (D1, Yonezawa Lab.)



Talk Agenda
- Brief Introduction about TAL and PCC
- Introduction of my Master Thesis
- Visions toward a Secure Compiler

Brief Introduction about TAL and PCC

Background
- Program verification = mathematically assuring that a program has certain properties
  - Useful for security: memory access safety, information flow analysis, …
- Verifying low-level code directly reduces the TCB
  - TCB: Trusted Computing Base
  - High-level code must be compiled after it is verified ⇒ we must trust the compiler
  - Assemblers are much simpler than compilers

Current Techniques and Problems
- Code signing
  - Based on public key cryptography
  - Can prove the genuineness of code
  - Cannot prove its safety by itself
- Signature matching
  - Uses a dictionary of malicious patterns and matches target programs against it
  - Employed in many antivirus systems
  - A pass does NOT mean the code is safe
  - Often unable to detect very new viruses

Proof-Carrying Code [Necula et al. 1997]
- Technique for safe execution of untrusted code
  - Code consumer does not need to trust the producer
- Code is distributed with a proof of its safety
  - Producer creates a proof
  - Consumer verifies the proof against his security policy

Proof-Carrying Code [Necula et al. 1997]
- Low cost for the consumer
  - Consumer only has to verify the proof
  - For example, by typechecking
- Tamper-proof
  - Code that passes the check does NO harm, even if it has been modified
  - If a modification makes the code fail the check, the code will not run, so it is safe
  - Otherwise the code still obeys the consumer’s security policy

Typed Assembly Language [Morrisett et al. 1999]
- Extends a conventional assembly language with static type checking
  - An instance of Proof-Carrying Code
- By type checking, it can guarantee
  - Memory access safety: the program never accesses memory outside the area allocated for it
  - Interface consistency: type agreement of arguments / return values of functions, etc.

TAL System Illustrated
[Diagram. Labels: Code with type information; TAL System (Type Checker, Assembler, Linker); Code Consumer]

A Brief Example of TAL Program

    fact:                      {eax: B4}
        movl %eax, %ecx
        movl $1, %eax
    loop:                      {eax: B4, ecx: B4}
        mull %ecx
        decl %ecx
        cmpl $0, %ecx
        jg   loop
    end:                       {eax: B4}

Left column: program code (same as conventional assembly languages)
Right column: type information (used to typecheck the program)
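To make the behaviour of the `fact` listing concrete, here is a minimal Python sketch that mirrors it one instruction at a time. It is not part of the talk: the 32-bit register model, the `to_signed` helper, and the decision to ignore `%edx` and the flags are simplifying assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF  # B4 in the listing: registers hold 4-byte values


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a signed integer (jg is a signed comparison)."""
    return value - (1 << 32) if value & 0x80000000 else value


def fact(n: int) -> int:
    """Mirror the listing; the argument arrives in eax and the result leaves in eax."""
    eax = n & MASK32
    ecx = eax                            # movl %eax, %ecx
    eax = 1                              # movl $1, %eax
    while True:                          # loop:
        eax = (eax * ecx) & MASK32       # mull %ecx (low 32 bits only; edx is ignored here)
        ecx = (ecx - 1) & MASK32         # decl %ecx
        if not to_signed(ecx) > 0:       # cmpl $0, %ecx ; jg loop
            break
    return eax                           # end:
```

For example, `fact(5)` yields 120. Because the loop body runs before the first comparison, this model returns 0 for an input of 0, and results wrap modulo 2^32 for larger inputs.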

Related Work: TALK, TOS [Maeda, 2005]
- TALK: TAL for Kernel
  - Morrisett et al. use a garbage collector for memory management in TAL
  - For an OS, GC cannot be assumed
  - Memory management (malloc/free) must be implemented
- TOS: Typed Operating System
  - An experimental OS written in TALK

Introduction of My Master Thesis

My Work for Master Thesis
- “A Framework Using a Common Language to Build Program Verifiers for Low-Level Languages”
- To help developers of program verifiers
- To be a common basis for verification of low-level programs
  - Such as assembly and machine languages

Motivation: Verifiers are Hard to Develop
- Especially for low-level languages…
- Complex semantics
  - The semantics of each instruction is complex
  - There are many instructions in a language
- Low portability
  - Low-level languages depend heavily on the underlying architecture
  - Accordingly, the entire verifier also depends on the underlying architecture

Our Idea
- Split a verifier into three parts
  1. Design a common language,
  2. Translate the target program into that language, and
  3. Verify the translated program
- These parts are explicitly independent of each other
  - Thus we can replace them easily
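The three-part split can be sketched in a few lines of Python. Everything here is hypothetical and chosen only for illustration: programs are plain lists of strings, the two "architectures" are toy mnemonic tables rather than real instruction sets, and `never_errors` is a stand-in for a real verification logic. The framework described in the talk was implemented in Java, and its actual interfaces are not shown in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Program = List[str]                              # toy representation of a program
Translator = Callable[[Program], Program]        # part 2: target program -> common language
VerificationLogic = Callable[[Program], bool]    # part 3: works on the common language only


@dataclass
class Verifier:
    """A verifier is a translator plugged into a verification logic."""
    translator: Translator
    logic: VerificationLogic

    def verify(self, target_program: Program) -> bool:
        return self.logic(self.translator(target_program))


def table_translator(table: Dict[str, str]) -> Translator:
    """Build a toy translator from a mnemonic table (hypothetical helper)."""
    def translate(program: Program) -> Program:
        return [table[instruction] for instruction in program]
    return translate


# Two made-up source languages that map onto the same common-language commands.
translate_a = table_translator({"nop": "nop;", "halt": "error;"})
translate_b = table_translator({"NOP": "nop;", "TRAP": "error;"})


def never_errors(program: Program) -> bool:
    """Toy verification logic: accept programs that contain no error command."""
    return "error;" not in program


# The same logic is reused; only the translator is swapped.
verifier_a = Verifier(translate_a, never_errors)
verifier_b = Verifier(translate_b, never_errors)
```

Here `verifier_a.verify(["nop", "nop"])` succeeds and `verifier_b.verify(["TRAP"])` fails, although `never_errors` was written once and knows nothing about either source language.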

Our Idea
[Diagram: Target Program → Translator (2) → Translated Program → Verifier (3) → Result: Success / Fail. The verifier is built from the Semantics of the Common Language (1) and the Verification Logic.]

How Do We Solve the Problems?
- Coping with complex semantics
  - Only translators need to care about the semantics of the source language
  - A translator is reusable: once the description is written, we can reuse it
- Improving portability
  - The verification logic is also reusable
  - Once implemented, it can be used for other architectures simply by replacing translators

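How Do We Solve the Problems?
[Diagram: the same verifier architecture with a second Translator added, so that either the Target Program or a Program in Another Language is translated into the Translated Program and checked by the same Verifier (Semantics of Common Language, Verification Logic) → Result: Success / Fail]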

Overview of the Work
- Designed a framework to build program verifiers
  - Designed a common language, ADL
- Discussed the correctness of translators
  - Proved that the assured properties are preserved throughout translation
- Implemented the framework in Java

ADL: A Common Language
[Diagram: verifier architecture. Labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verification Logic, Verifier, Result (Success / Fail)]

ADL: A Common Language
Design Concept
- ADL: Architecture Description Language
- From observation of many architectures
  - Data is stored in registers and memory, and is manipulated according to the program
  - Jumps alone are sufficient for control flow
- Expressiveness
  - Arithmetic, logical operations, …
  - C-like expressions
- Conservative semantics
  - No need to describe ill-behaved programs
  - To simplify the semantics

ADL: A Common Language
Overview of the Language
- Imperative language which manipulates registers and memory
- 5 kinds of commands: nop, error, assignment, goto, if-then-else
- More like C than assembly
  - Infix operators, parenthesized formulae
  - Conditional execution on an arbitrary condition using the if command
- Only goto modifies control flow
  - Unconditional branch

ADL: A Common Language
A Brief Example

ADL:

    data: ...
    main:
        %ebx = &data;
        %eax = 0;
        goto &lp;
    lp:
        %eax = %eax + *[4](%ebx);
        %ebx = *[4](%ebx + 4);
        if %ebx == &null then goto &end else goto &lp;
    end:
        goto &end;

x86:

    data: ...
    main:
        movl $data, %ebx
        movl $0, %eax
    lp:
        addl 0(%ebx), %eax
        movl 4(%ebx), %ebx
        cmpl $0, %ebx
        je   end
        jmp  lp
    end:
        jmp  end
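The example above walks a linked list and sums its values. A minimal Python sketch of an interpreter for the five command kinds shows how such a program executes. The encoding is an assumption made for illustration and is not the framework's data structures: commands are tuples, expressions are Python callables over the state, memory is a flat dictionary of 4-byte words at integer addresses, falling off the end of a block is treated as an error, and reaching the `end` label counts as termination.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class AdlError(Exception):
    """Raised when execution reaches an error command or leaves the program."""


@dataclass
class State:
    regs: Dict[str, int] = field(default_factory=dict)
    mem: Dict[int, int] = field(default_factory=dict)   # address -> 4-byte word


def execute(cmd: tuple, state: State) -> Optional[str]:
    """Run one command; return a label to jump to, or None to fall through."""
    kind = cmd[0]
    if kind == "nop":
        return None
    if kind == "error":
        raise AdlError("error command reached")
    if kind == "assign":                 # ("assign", register, expression)
        _, reg, expr = cmd
        state.regs[reg] = expr(state)
        return None
    if kind == "goto":                   # ("goto", label)
        return cmd[1]
    if kind == "if":                     # ("if", condition, then_cmd, else_cmd)
        _, cond, then_cmd, else_cmd = cmd
        return execute(then_cmd if cond(state) else else_cmd, state)
    raise ValueError(f"unknown command kind: {kind!r}")


def run(program: Dict[str, List[tuple]], entry: str, state: State,
        halt: str = "end", max_steps: int = 10_000) -> State:
    label, index = entry, 0
    for _ in range(max_steps):
        if label == halt:
            return state
        block = program[label]
        if index >= len(block):
            raise AdlError(f"fell off the end of block {label!r}")
        target = execute(block[index], state)
        if target is None:
            index += 1
        else:
            label, index = target, 0
    raise AdlError("step limit exceeded")


DATA, NULL = 100, 0   # hypothetical addresses for &data and &null
program = {
    "main": [("assign", "ebx", lambda s: DATA),
             ("assign", "eax", lambda s: 0),
             ("goto", "lp")],
    "lp": [("assign", "eax", lambda s: s.regs["eax"] + s.mem[s.regs["ebx"]]),
           ("assign", "ebx", lambda s: s.mem[s.regs["ebx"] + 4]),
           ("if", lambda s: s.regs["ebx"] == NULL, ("goto", "end"), ("goto", "lp"))],
}
# Three list nodes laid out as (value, next): 10 -> 20 -> 12 -> null
memory = {100: 10, 104: 108, 108: 20, 112: 116, 116: 12, 120: NULL}
final = run(program, "main", State(mem=dict(memory)))
```

After the run, `final.regs["eax"]` holds 42, the sum of the three node values.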

ADL: A Common Language
Restrictions
- ADL has a few restrictions by design
  - Code and data are completely separated
  - We assume NOTHING about the memory layout of a program
- To simplify the semantics
  - Some programs cannot be expressed
  - However, most decent programs can be written even under these restrictions
  - To be discussed in the next slides

ADL: A Common Language > Restrictions
Separation of Code and Data
- Do not treat code as data
  - ADL programs cannot read / write code
- We cannot express programs which use dynamic code generation
  - But the patterns of the generated code are fixed in many cases ⇒ another solution is possible
  - For example, prepare a function for each pattern of code

ADL: A Common Language > Restrictions
No Assumptions about Memory Layout
- Casting is prohibited
  - ADL distinguishes integers and pointers
  - In real architectures, pointers are not distinguished from integers
- Pointer arithmetic is restricted
  - Only pointer+integer and pointer-pointer are defined
  - Other operations return ‘undetermined’
  - Sufficient for array/structure operations and offset calculation
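The restricted pointer arithmetic can be illustrated with a small value domain in Python. This is a sketch under stated assumptions, not ADL's actual definition: pointers are modelled as a symbolic base plus an offset, integer+pointer is accepted in either order, and subtracting pointers with different bases is treated as undetermined because nothing is assumed about how separate objects are laid out in memory.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Ptr:
    base: str      # symbolic name of the object pointed into (assumed representation)
    offset: int


@dataclass(frozen=True)
class Undetermined:
    pass


UNDETERMINED = Undetermined()
Value = Union[Int, Ptr, Undetermined]


def add(a: Value, b: Value) -> Value:
    if isinstance(a, Int) and isinstance(b, Int):
        return Int(a.value + b.value)
    if isinstance(a, Ptr) and isinstance(b, Int):
        return Ptr(a.base, a.offset + b.value)        # pointer + integer
    if isinstance(a, Int) and isinstance(b, Ptr):
        return Ptr(b.base, b.offset + a.value)        # assumed to be symmetric
    return UNDETERMINED                               # e.g. pointer + pointer


def sub(a: Value, b: Value) -> Value:
    if isinstance(a, Int) and isinstance(b, Int):
        return Int(a.value - b.value)
    if isinstance(a, Ptr) and isinstance(b, Ptr) and a.base == b.base:
        return Int(a.offset - b.offset)               # pointer - pointer, same object
    return UNDETERMINED
```

With this domain, `add(Ptr("data", 0), Int(4))` gives `Ptr("data", 4)`, which covers field and array offsets, while `add(Ptr("data", 0), Ptr("data", 4))` gives `UNDETERMINED`.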

Program Translator
[Diagram: verifier architecture. Labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verification Logic, Verifier, Result (Success / Fail)]

Program Translator
- Translates low-level programs into ADL
- We must assure that program translators are correct
  - Otherwise, we cannot trust the entire verifier
  - Correctness is defined in the following discussion

Program Translator
What Is Correctness of Program Translation?
- Instruction = function over machine states
- Correctness = the correspondence between the states of the two machines is preserved by translation
[Diagram: the Original Program takes State to State′, the Translated Program takes State to State′, and corresponding states are linked]

Program Translator
How to Confirm Correctness of Translation
- Every program results in corresponding states for every input ⇒ correctness
  - Total inspection is NOT realistic
  - A theorem prover would be useful
  - Automatic proving is future work
- But how do we confirm the correctness of the description of the source language?
  - At this time, we take an empirical approach
  - Test several cases using an interpreter
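The empirical approach amounts to differential testing: run the original instruction and its translation from corresponding states and compare the results. The Python sketch below illustrates the idea on a single toy instruction. The state model (one 32-bit register, flags ignored), the identity correspondence, and both "translations" are assumptions made up for this example and are not taken from the framework.

```python
import random
from typing import Callable, Dict, Iterable, Optional

MASK32 = 0xFFFFFFFF
MachineState = Dict[str, int]
Step = Callable[[MachineState], MachineState]


def original_step(state: MachineState) -> MachineState:
    """Toy model of `incl %eax`: add 1 with 32-bit wraparound (flags ignored)."""
    return {**state, "eax": (state["eax"] + 1) & MASK32}


def good_translation(state: MachineState) -> MachineState:
    """Hypothetical translation that keeps the wraparound."""
    return {**state, "eax": (state["eax"] + 1) & MASK32}


def bad_translation(state: MachineState) -> MachineState:
    """Hypothetical buggy translation that forgets the wraparound."""
    return {**state, "eax": state["eax"] + 1}


def corresponds(original: MachineState, translated: MachineState) -> bool:
    """State correspondence; here simply equality of register contents."""
    return original == translated


def find_counterexample(original: Step, translated: Step,
                        samples: Iterable[MachineState]) -> Optional[MachineState]:
    """Return the first start state whose results do not correspond, else None."""
    for state in samples:
        if not corresponds(original(dict(state)), translated(dict(state))):
            return state
    return None


rng = random.Random(0)                                  # fixed seed for repeatability
samples = [{"eax": 0}, {"eax": MASK32}]                 # boundary cases first
samples += [{"eax": rng.randrange(MASK32 + 1)} for _ in range(100)]
```

`find_counterexample(original_step, good_translation, samples)` returns `None`, while the buggy translation is caught at the boundary state `{"eax": 0xFFFFFFFF}`. Passing such tests raises confidence but, as the slide notes, it is not a proof.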

Verification Logic
[Diagram: verifier architecture. Labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verification Logic, Verifier, Result (Success / Fail)]

Verification Logic
- Verifies the properties of translated programs
  - A function that takes a program and returns success or fail
- Soundness must be assured
  - This is the task of the creator of a verification logic
  - Here we do not discuss it any further
- Definition: Soundness of a verification logic
  - Verification logic V: State → Bool
  - The set {S | V(S)} is closed under step execution
  - If V(S), execution never falls into an error state, and
  - If V(S) and S→T (→ means step execution), then V(T)
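On a finite state space, the soundness definition above can be checked by brute force. The Python sketch below assumes a deterministic step function that returns `None` when a state has no successor, and it uses a made-up toy machine. It checks two local conditions: no accepted state is an error state, and every successor of an accepted state is accepted. By induction on the number of steps, these two conditions together imply that execution from an accepted state never reaches an error state.

```python
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")


def find_soundness_violation(states: Iterable[S],
                             step: Callable[[S], Optional[S]],
                             is_error: Callable[[S], bool],
                             accepts: Callable[[S], bool]) -> Optional[S]:
    """Return an accepted state that witnesses unsoundness, or None if there is none."""
    for s in states:
        if not accepts(s):
            continue
        if is_error(s):
            return s                      # an accepted state is itself an error state
        t = step(s)
        if t is not None and not accepts(t):
            return s                      # the accepted set is not closed under stepping
    return None


# Toy machine (hypothetical): the state is a counter, each step subtracts 2,
# execution stops at 0 or below, and -1 is the error state.
STATES = range(-1, 6)


def step(n: int) -> Optional[int]:
    return n - 2 if n >= 1 else None


def is_error(n: int) -> bool:
    return n == -1


def accepts_even(n: int) -> bool:         # sound: even counters stay even and end at 0
    return n >= 0 and n % 2 == 0


def accepts_nonnegative(n: int) -> bool:  # unsound: 1 is accepted but steps to -1
    return n >= 0
```

`find_soundness_violation(STATES, step, is_error, accepts_even)` returns `None`, whereas the check with `accepts_nonnegative` returns the state `1`, whose successor `-1` is the error state.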

Verification Logic
Soundness of Verification Logic
[Diagram: within the set of machine states, the subset of states S such that V(S)]
Soundness = if V(S) ∧ S→T, then V(T)

Verification Logic
Program Translation and Verification
- We proved the following theorem
  - If the program translator is correct, and
  - the verification logic is sound, then
  - ⇒ verification of the original program and verification of the translated program are equivalent
- A closed subset can be defined on the states of the translation source language

Implementation
- Framework
  - ADL data structures
  - ADL interpreter: used to confirm the correctness of translators
  - Translator and verification logic interfaces
  - Translation rule compiler: compiles translation rules into a Java implementation of a translator
- And for proof of concept,
  - Translators from Intel x86 and SPARC
  - A simple type checker

Related Works
- Foundational TAL [Crary, 2003]
  - The TAL type checker is still large
    - The TALx86 type checker consists of approx. 23k LoC in O’Caml (!)
  - TCB is reduced by using a logical framework
    - Designed a language called TALT on the Twelf logical framework [Pfenning et al., 1999]
    - Proved GC safety of TALT by machine
  - Correspondence between TALT and realistic architectures is not discussed
  - The TALT type system is fixed
    - Our work allows replacement of verification logics

Future Work
- Automatically confirm the correctness of translation
  - Automatic testing: cooperating with emulators or debuggers
  - Or, build a model and use a theorem prover
- Support dynamic memory allocation
  - Currently all memory must be allocated statically
- Support concurrent programs
  - Concurrency is not yet taken into consideration
  - For application to OSes, etc., concurrency plays an important role

Visions toward a Secure Compiler

What Is a Secure Compiler?
- A compiler which produces certified code
  - For example, TAL code as output
  - Like the Popcorn compiler in TALx86: safe dialect of C → TALx86
- A compiler which assures correct compilation (optionally)
  - Like the credible compiler [Rinard, 1999]
  - Reduces the TCB

Motivation
- Infrastructure has been built
  - TALK, TOS [Maeda, 2005]
  - Verifier framework [Yoshino, 2006]
- Next we have to build a house on it!
  - Most people do not want to write low-level code directly ⇒ Secure Compiler

Toward Secure World
- If we build a secure compiler…
- Memory-error-free systems
  - Prevent memory-error-based attacks
  - OS kernel, core libraries, network servers…
- Writing secure code
  - Vulnerable code will result in verification failure
  - So code security will be improved
- The rest is yet to be discovered…

Tasks to Do
- Determine what properties to assure
  - Memory access safety? Information flow?
  - Must be mechanically checkable
- Design the verification logic
  - Use the verifier framework?
- Design the language
  - Target: TAL-based? ADL?
    - ADL can be used as a certified language
    - Register allocation is already done, so a simple mapping will be possible…
  - Source: ???