Provably Secure Program Protection


1 Provably Secure Program Protection
Lt Col Todd McDonald AFIT/ENG x4639

2 Research Interests
Program encryption
Program protection / secure coding
Obfuscation / tamperproofing
Mobile agent security / mobile code
Information / database security
Multi-agent architectures
Trust-based computing

3 Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators

4 Program Scenario

5 Program Protection
Adversarial Observation: Black Box Analysis
If the adversary cannot determine the function/intent of the device by input/output analysis, we say it is black-box protected
Adversarial Observation: White Box Analysis
If the adversary cannot determine the function/intent of the device by analyzing the structure of the code, we say it is white-box protected
Intent Protected: Combined black-box and white-box protection does not reveal the function/intent of the program

6 Definitions
“The goal of program obfuscation is to make a program unintelligible while preserving its functionality”
Virtual black box (VBB): anything one can compute from the obfuscated program could also be computed from the input-output behavior of the original program

7 Formally… (yuck)
An obfuscator is an efficient compiler O that takes an input program P and produces a semantically equivalent program P’ = O(P) with three properties:
Functionality: ∀x, P(x) = P’(x), where P’ = O(P)
Polynomial slowdown: O(P) is at most polynomially slower than P (for circuits, the requirement is that the size of O(P) is at most polynomially greater than the size of P)
Virtual black box (VBB) property: you should not be able to learn more from the obfuscated version of a program, O(M), than a simulator S^M with oracle access to the original program could learn. It is formulated as follows:
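The formula itself is not part of this transcript. As a hedged reference point, the usual statement of the VBB property from Barak et al. can be written as below; the names A (adversary), S (simulator), and α (a negligible function) are notation assumed here, not taken from the slide.

```latex
\forall \text{ PPT } A \;\; \exists \text{ PPT } S \text{ and a negligible function } \alpha
\text{ such that for all machines } M:
\qquad
\Bigl|\, \Pr\bigl[A(O(M)) = 1\bigr] \;-\; \Pr\bigl[S^{M}(1^{|M|}) = 1\bigr] \,\Bigr| \;\le\; \alpha(|M|)
```

Read informally: whatever single bit the adversary can extract from the obfuscated code O(M), a simulator with only oracle access to M can predict almost as well.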

8 Totally Unobfuscatable Functions under VBB
Family of functions F (this family is constructed from any one-way function)
Property π : F → {0,1}
1) Given any program that computes f ∈ F, the value π(f) can be efficiently computed
2) Given oracle access to a (randomly selected) function f ∈ F, no efficient algorithm can compute π(f) much better than random guessing
This family of functions is UNOBFUSCATABLE if (1) and (2) are true

9 VBB Proof Methodology
Because a family of (contrived) functions can be shown to be unobfuscatable
Therefore, general, efficient, secure obfuscators do not exist

10 Where are we at?
Known methods of obfuscation are the reverse of good software engineering
None guarantee the impossibility of retrieving sensitive information or algorithms
A determined specialist given enough time and resources is able to deobfuscate any obfuscated program
In spite of VBB: the result does not imply that there is no method for making programs “unintelligible” in some meaningful and precise way

11 How to Define Security
Explicitly
Define the adversary's task and require that it is computationally difficult
Disadvantage: there are many threats, and some are difficult to formulate as computational problems
Implicitly
Define an ideal security model and require that our case is nearly as good as the ideal one
Disadvantage: the Barak et al. result shows this is impossible under VBB

12 Going against the Gold Standard
“Anything that can be efficiently computed from O(P) can be efficiently computed given oracle access to P”
Who died and made them the boss (Barak et al.)?
The intent: Can you eliminate any advantage to seeing the obfuscated source code beyond getting black box access to the original program?
Our contention and intuition: Given (obfuscated) source code, you will always have an advantage in learning something about the original program beyond what you could learn from black box access alone

13 Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators

14 Properties of Random Program Obfuscators
Black Box Protection
Y = O_RAND(A, K)
Y and A are semantically different
A has input/output consistent with the function of the program
Y has input/output consistent with a family of one-way function circuits
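One possible reading of Y = O_RAND(A, K), suggested by the later "P + E" slides, is that Y feeds the output of A through a keyed cipher, so Y(x) = E_K(A(x)). The Python sketch below illustrates only that reading. The toy 32-bit Feistel cipher, its HMAC-SHA256 round function, `program_a`, and `KEY` are all assumptions made for illustration, not the construction from the talk.

```python
import hashlib
import hmac


def round_fn(key: bytes, rnd: int, half: int) -> int:
    """Keyed round function: 16 bits of HMAC-SHA256 (a stand-in, not from the slides)."""
    msg = bytes([rnd]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")


def encrypt(key: bytes, value: int, rounds: int = 4) -> int:
    """Toy 32-bit Feistel cipher playing the role of E_K."""
    left, right = value >> 16, value & 0xFFFF
    for rnd in range(rounds):
        left, right = right, left ^ round_fn(key, rnd, right)
    return (left << 16) | right


def decrypt(key: bytes, value: int, rounds: int = 4) -> int:
    """Inverse of encrypt; only the key holder can recover A's output."""
    left, right = value >> 16, value & 0xFFFF
    for rnd in reversed(range(rounds)):
        left, right = right ^ round_fn(key, rnd, left), left
    return (left << 16) | right


def program_a(x: int) -> int:
    """Hypothetical original program A with a 32-bit output."""
    return (3 * x + 7) & 0xFFFFFFFF


def obfuscate(program, key: bytes):
    """Y = O_RAND(A, K), read here as Y(x) = E_K(A(x))."""
    return lambda x: encrypt(key, program(x))


KEY = b"hypothetical key K"
program_y = obfuscate(program_a, KEY)
```

Under this reading Y and A compute different functions, while the originator, who holds K, can still recover A(x) with `decrypt(KEY, program_y(x))`.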

15 Properties of Random Program Obfuscators
Black Box Protection

16 Semantic Encryption Transformation

17 Program/Circuit P Py1 Px1 P Pxn Pyn

18 Strongly Pseudorandom Data Ciphers
K 232 ~256 Truth Tables

19 Semantically Secure Black Box Protection

20 Semantically Secure Black Box Protection
P’ = O(P)

21 Things to Be Done: P + E
Living under Kerckhoffs's Principle
Program encryption generation engine
Unique encryption ciphers / key-based
Security characterizations
Number of E’s
Input sizes
Practical implementation issues

22 White Box Protection ?? Circuit P’

23 Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators

24 IEEE International Symposium on Circuits and Systems (ISCAS) Format
# Comment: Inputs
INPUT(WIRE)
# Comment: Outputs
OUTPUT(WIRE)
# Comment: Gate Specifications
GATE = FUNCTION (OPERAND, OPERAND)
OPERAND = {GATE} ∪ {WIRE}
FUNCTION = {AND, OR, NOT, XOR, NXOR, NAND, NOR}
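The grammar above is simple enough to read with two regular expressions. The following Python sketch is a minimal reader for netlists written in this style; the function name `parse_bench` and the returned tuple layout are assumptions made here, and it is not the `bench` tool mentioned later in the deck.

```python
import re

IO_RE = re.compile(r"^(INPUT|OUTPUT)\s*\(\s*(\w+)\s*\)$")
GATE_RE = re.compile(r"^(\w+)\s*=\s*(\w+)\s*\(([^)]*)\)$")


def parse_bench(text):
    """Return (inputs, outputs, gates) for a netlist in the format above.

    gates maps each gate name to (FUNCTION, [operand names]).
    """
    inputs, outputs, gates = [], [], {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # strip comments and whitespace
        if not line:
            continue
        match = IO_RE.match(line)
        if match:
            target = inputs if match.group(1) == "INPUT" else outputs
            target.append(match.group(2))
            continue
        match = GATE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized line: {raw!r}")
        name, func, operands = match.groups()
        gates[name] = (func.upper(), [op.strip() for op in operands.split(",")])
    return inputs, outputs, gates


# Example netlist: the circuit P used later in the deck.
EXAMPLE = """
# Inputs
INPUT(3)
INPUT(2)
INPUT(1)
# Outputs
OUTPUT(7)
OUTPUT(6)
# Gates
4 = AND(3,2)
5 = OR(4,1)
6 = XOR(4,3)
7 = NAND(5,6)
"""

inputs, outputs, gates = parse_bench(EXAMPLE)
```

Keeping operands as names rather than resolving them immediately lets gates appear in any order in the file.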

25 BENCH and BED Formats C17.gif C17.bench
Binary Expression Diagram (BED)

26 BENCH Workflow
[Workflow diagram labels: c1000.bench (ISCAS format); bench; C1000.dot (DOT format); graphviz; C1000.gif (graphics format); BED; C1000.c (C program); gcc; c1000 (executable); WINDOWS; LINUX]

27 We Need a Better Security Model… and Provable Security Under that Model

28 Obfuscation under RPM ? CAL’ CL Y = ORAND(A,K) Circuit Y Circuit X
Circuit A

29 Random Programs/Circuits

30 Random Programs

31 White Box Understandable Based on Random Program Oracles

32 White Box Understandable Based on Random Program Oracles

33 ?
Gulf of Mexico: 486,800,000,000,000,000,000 teaspoons
Atlantic Ocean: 70,940,000,000,000,000,000,000 teaspoons
Ratio: 0.69%

34 Correlating Program and Data Encryption
Randomizing Obfuscators

35 Generating a Circuit Library
1) # of INPUTS
2) # of OUTPUTS
3) CIRCUIT SIZE
+ AND, OR, NAND, NOR, XOR, NXOR
All Possible Combinations
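A brute-force version of "all possible combinations" can be sketched in a few lines of Python. The encoding below is an assumption made for illustration: each gate picks one of the six listed functions and two operands from the inputs or earlier gates, and the circuit's outputs are taken to be its last gates.

```python
from itertools import product

GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "NXOR": lambda a, b: 1 - (a ^ b),
}


def enumerate_circuits(n_inputs, size):
    """Yield every circuit with `size` gates as a tuple of (function, a, b).

    Wires 0..n_inputs-1 are inputs; gate g drives wire n_inputs + g and may
    read any input or any earlier gate.
    """
    choices = []
    for g in range(size):
        wires = range(n_inputs + g)
        choices.append(list(product(GATES, wires, wires)))
    return product(*choices)


def truth_table(circuit, n_inputs, n_outputs):
    """Evaluate a circuit on every input; the last n_outputs wires are outputs."""
    rows = []
    for bits in product((0, 1), repeat=n_inputs):
        wires = list(bits)
        for name, a, b in circuit:
            wires.append(GATES[name](wires[a], wires[b]))
        rows.append(tuple(wires[-n_outputs:]))
    return tuple(rows)


library = list(enumerate_circuits(2, 1))
distinct_functions = {truth_table(c, 2, 1) for c in library}
```

Even for two inputs and one gate, the library holds many structurally different circuits per function, and the count grows quickly with circuit size.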

36 Correlating Program and Data Encryption
Diagram labels: CIRCUIT REPRESENTATION; HLL or ASM PROGRAM; HLL or ASM PROGRAM; SUB-CIRCUIT SELECTION; SUB-CIRCUIT REPLACEMENT
Linear cryptanalysis was first openly published as a means for attacking DES by Mitsuru Matsui at EUROCRYPT ’93 [6]. His method attempts to find a linear relation among the plaintext, ciphertext, and keys as they pass through the s-boxes. With enough known plaintext/ciphertext pairs as data, a relation with a high enough probability can be used to find the key. Matsui generated linear approximation tables for the 8 DES s-boxes and found the strongest linearity in S5 (the fifth s-box). The tables were created by analyzing all the combinations of the input and output bits of the s-boxes. Since there are 6 input bits and 4 output bits, there are 1024 (= 2^6 · 2^4) entries in his tables for every s-box. A linear approximation is stronger if it is significantly greater or less …
Eli Biham took this one step further to help define restrictions on s-boxes to make them more resistant to linear cryptanalysis [8]. He found that increasing the number of output bits of an s-box can significantly expose the s-box to linear cryptanalysis. More precisely, he found that in an m × n s-box, where m is the number of input bits and n is the number of output bits, if n ≥ 2^m − m, the s-box must have a linear property of the input and output bits.
With the primary modes of attack on DES-like algorithms defined, rules can be established for how s-boxes are designed and used. Researchers can also examine how others’ s-box designs match up against those cryptanalysis techniques. There are several ways of making better s-boxes than the ones specified in DES; however, Schneier states that “… blindingly choosing new sboxes isn’t a good idea.” [15] Among the common and well-documented features of s-boxes that are considered viable are those that permit the algorithm to follow the Strict Avalanche Criterion (SAC).
The avalanche effect was first published in the cryptography world by Horst Feistel [16]. In that study, it was determined that when an input bit goes through the system, an equal number of 1’s and 0’s on average are the resultant output. This was taken one step further by Webster and Tavares [17], requiring exactly half of the output bits to change when one input bit changes.
Another consideration is the size of the s-box. From the above discussions on cryptanalysis, a large box would be better than a small one. A large number of output bits is needed to protect against differential attacks; however, a correspondingly large number of input bits is also needed to protect against linear cryptanalysis. Obviously, a balance of the two is needed.
Finally, there are three requirements regarding the values in the s-box. First, the distribution of outputs must be checked for uniformity to protect against Davies’ attack. Second, the outputs must have no linearity in their function to the input. Third, there must be unique values in every row of the s-box. There are several other requirements; however, they are beyond the scope of this paper.
Diagram labels: S-Box Selection; Iterative Rounds

37 Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators

38 Perfect White Box Protection
#include <cstdlib>
#include <iostream>
int main(int argc, char *argv[]) {
    int x, y;
    /* Get input from the user */
    x = std::atoi(argv[1]);
    /* Super secret algorithm */
    ……..
    /* Output the result */
    std::cout << y;
}

39 Perfect White Box Protection
What is the best we can hope for to protect the “structure” of the code that performs the secret algorithm?
We want the program to act just like an oracle would
We want the program to be a “black-box” implementation

40 Perfect White Box Protection = Black Box Implementation
#include <cstdlib>
#include <iostream>
int main(int argc, char *argv[]) {
    int x, y;
    /* Get input from the user */
    x = std::atoi(argv[1]);
    /* Super secret algorithm */
    if (x == 1) y = …;
    else if (x == 2) y = 23;
    else if (x == 3) y = …;
    ….
    /* Output the result */
    std::cout << y;
}
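The if/else chain on this slide is a lookup table in disguise. A minimal Python sketch of the idea follows; `secret_algorithm` and the 4-bit input width are hypothetical stand-ins chosen here, since the slides do not give the real algorithm.

```python
def secret_algorithm(x: int) -> int:
    """Hypothetical stand-in for the slide's 'super secret algorithm'."""
    return (x * x + 3) % 16


def tabulate(func, n_bits: int):
    """Run func once on every n_bits-wide input and keep only the results."""
    return [func(x) for x in range(2 ** n_bits)]


# The published program ships only this table, never the algorithm's code.
TABLE = tabulate(secret_algorithm, 4)


def black_box(x: int) -> int:
    """Answer exactly as an oracle would: by looking the input up."""
    return TABLE[x]
```

The table reveals nothing beyond input/output behavior, but it has 2^n entries for n input bits, which is the cost the following slides discuss.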

41 Perfect White Box Protection
Problems with this approach:
You have to know all inputs/outputs
Therefore, the algorithm could never be efficient for all input sizes n
Therefore, the algorithm could never be general for all programs
Which lends support to what Barak was saying…

42 Perfect White Box Protection
But:
Mobile code is targeted at small programs
Input size might be limited
You may not care about the full range of possible inputs, only some…

43 Perfect White Box Protection
Regardless of efficiency:
We can define a methodology for perfect white box protection
We could apply that method to programs of small input size n (where “small” is defined only by the amount of time or resources you want to spend to get the result)
Those programs would be perfectly white box protected

44 Circuits
Consider circuit P
3 representations:
Algebraically (Boolean function)
Structurally (circuit diagram)
Truth table (input/output behavior)
Structural view of P:
INPUT(3)
INPUT(2)
INPUT(1)
OUTPUT(7)
OUTPUT(6)
4 = AND(3,2)
5 = OR(4,1)
6 = XOR(4,3)
7 = NAND(5,6)

45 Circuits Behavioral view of P:
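The behavioral view is the truth table of P. As a sketch, it can be tabulated directly from the netlist on the previous slide; the Python below assumes wires 1, 2, 3 carry x1, x2, x3 and reports the outputs as (y6, y7). The names `GATE_LIST`, `evaluate`, and `TRUTH_TABLE` are illustrative choices made here.

```python
from itertools import product

OPS = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

# (output wire, function, operand, operand), in the order given on the slide.
GATE_LIST = [
    (4, "AND", 3, 2),
    (5, "OR", 4, 1),
    (6, "XOR", 4, 3),
    (7, "NAND", 5, 6),
]


def evaluate(x1, x2, x3):
    """Propagate one input assignment through P and return (y6, y7)."""
    wire = {1: x1, 2: x2, 3: x3}
    for out, op, a, b in GATE_LIST:
        wire[out] = OPS[op](wire[a], wire[b])
    return wire[6], wire[7]


# Behavioral view: every input (x1, x2, x3) mapped to its outputs (y6, y7).
TRUTH_TABLE = {bits: evaluate(*bits) for bits in product((0, 1), repeat=3)}

for bits, outs in TRUTH_TABLE.items():
    print(bits, "->", outs)
```

The table has eight rows; y7 is 0 for exactly one of them, the input x1 = 1, x2 = 0, x3 = 1.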

46 Circuits
Functional view of P: fP
Derive it from structure
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
Derive it from truth table
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
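The truth-table derivation can be reproduced mechanically: emit one product term (minterm) for every input row where the output is 1. In the Python sketch below, `y6` and `y7` are written from the gate structure of P, `complete_sop` is a helper name assumed here, and an ASCII apostrophe stands in for the slide's ’ complement mark.

```python
from itertools import product


def y6(x1, x2, x3):
    """Wire 6 of P: XOR(AND(x3, x2), x3)."""
    return (x3 & x2) ^ x3


def y7(x1, x2, x3):
    """Wire 7 of P: NAND(OR(AND(x3, x2), x1), wire 6)."""
    return 1 - (((x3 & x2) | x1) & y6(x1, x2, x3))


def complete_sop(func, names=("x1", "x2", "x3")):
    """Build the complete sum-of-products: one minterm per row where func is 1."""
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if func(*bits):
            terms.append("".join(n if b else n + "'" for n, b in zip(names, bits)))
    return " + ".join(terms)


print("y6 =", complete_sop(y6))
print("y7 =", complete_sop(y7))
```

Only input/output behavior enters `complete_sop`, so any circuit with the same truth table yields the same two expressions.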

47 Circuits
There are many different but equivalent realizations of the same function (including minterm and maxterm realizations)
There is no “right” realization given any function
If a Boolean expression is written in a certain form, it will always be obvious, given two expressions, whether we are dealing with the same function or different functions

48 Circuits
Such forms are termed “canonical forms”
Canonical forms are “official” forms for writing the algebraic expression of a given type (such as Boolean algebraic expressions)

49 Circuits
There is one and only one canonical realization for each function
It is (should be) impossible to have different canonical realizations of the same function, with the only exceptions based on commutativity: abc’ + b’c ≡ cb’ + c’ba
There is only 1 minterm realization of any function

50 Circuits
Take these 2 functions for example:
b’c’ + bc + a’b
b’c’ + bc + a’c’
These two functions are equivalent, yet neither can be simplified any further

51 Circuits
Blake canonical form (BCF)
Produced by taking a Boolean function in SOP form and performing a sequence of simplification steps
The result is a unique and compact representation of the original circuit
b’c’ + bc + a’b
b’c’ + bc + a’c’
The BCF form of the 2 above equivalent circuits is given by: b’c’ + bc + a’b + a’c’
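One textbook way to reach the Blake canonical form is iterated consensus with absorption: keep adding the consensus of every pair of terms and dropping absorbed terms until nothing changes. The Python sketch below applies that procedure to the two expressions above. It is not necessarily the exact sequence of steps used in the talk, and the helper names and the single-letter variable parser are assumptions made here.

```python
def parse(expr):
    """Parse an SOP string such as "b'c' + bc" into a set of product terms.

    Each term is a frozenset of (variable, is_positive) literals; variables
    are single letters and a trailing ' marks a complement.
    """
    terms = set()
    for chunk in expr.split("+"):
        chunk = chunk.strip()
        literals, i = set(), 0
        while i < len(chunk):
            negated = i + 1 < len(chunk) and chunk[i + 1] == "'"
            literals.add((chunk[i], not negated))
            i += 2 if negated else 1
        terms.add(frozenset(literals))
    return terms


def consensus(t1, t2):
    """Consensus of two terms: defined only if exactly one variable is opposed."""
    opposed = [(v, p) for (v, p) in t1 if (v, not p) in t2]
    if len(opposed) != 1:
        return None
    v, p = opposed[0]
    return (t1 - {(v, p)}) | (t2 - {(v, not p)})


def absorb(terms):
    """Drop every term that strictly contains another term (x + xy = x)."""
    return {t for t in terms if not any(other < t for other in terms)}


def blake_canonical_form(terms):
    """Iterate consensus and absorption until the set of terms is stable."""
    terms = absorb(set(terms))
    while True:
        new = {c for a in terms for b in terms
               if (c := consensus(a, b)) is not None}
        merged = absorb(terms | new)
        if merged == terms:
            return terms
        terms = merged


bcf_1 = blake_canonical_form(parse("b'c' + bc + a'b"))
bcf_2 = blake_canonical_form(parse("b'c' + bc + a'c'"))
```

Both inputs converge to the same four terms, which is what makes the form usable as an equality test between realizations.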

52 Circuits
Functional view of P: fP
Derive it from structure
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
I reduced it by hand:
y6 = x3x2’
y7 = x1’ + x3’ + x2

53 Circuits No really I did:

54 Circuits
Functional view of P: fP
Derive it from truth table
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
This is in complete-SOP form already
I applied Blake’s method to get:
y6 = x2’x3
y7 = x1’ + x3’ + x2

55 So what does canonical minimization do?
All you need is the truth table or behavioral view to get an SOP form

56 So what does canonical minimization do for us?
This is what an oracle for P would “use” when asked questions about P …
Any circuit that implements this truth table would then be a “black box implementation” of P

57 The “Logic” of Canonical P
if ((x1 == 0) && (x2 == 0) && (x3 == 0)) { y6 = 0; y7 = 1; }
else if ((x1 == 0) && (x2 == 0) && (x3 == 1)) { y6 = 1; y7 = 1; }
…

58 Can I ever recover the structure of the original P from canonical P?

59 Can I ever recover the structure of the original P from canonical P?
Minimal form:
y6 = x3x2’
y7 = x1’ + x3’ + x2
BOTH are forward derivations
From the truth table:
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
From the structure:
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
This would reveal the gate structure
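A small experiment makes the one-way nature of the derivation concrete: two circuits with different gate structures collapse to the same canonical description, so the canonical form cannot say which one it came from. In the Python sketch below, `P` is the netlist from the structural view, while `P_ALT` is a hypothetical alternative built here purely for illustration.

```python
from itertools import product

OPS = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

# Original P: wires 1..3 are x1..x3, outputs are wires 6 and 7.
P = [(4, "AND", 3, 2), (5, "OR", 4, 1), (6, "XOR", 4, 3), (7, "NAND", 5, 6)]

# Hypothetical alternative structure (an assumption, not from the slides):
# 4 = x2', 6 = x3 x2', 5 = x1 x3 x2', 7 = (x1 x3 x2')'.
P_ALT = [(4, "NAND", 2, 2), (6, "AND", 3, 4), (5, "AND", 1, 6), (7, "NAND", 5, 5)]


def canonical(gates):
    """Return, per output wire, the set of inputs (x1, x2, x3) that produce 1."""
    ones = {6: set(), 7: set()}
    for x1, x2, x3 in product((0, 1), repeat=3):
        wire = {1: x1, 2: x2, 3: x3}
        for out, op, a, b in gates:
            wire[out] = OPS[op](wire[a], wire[b])
        for y in ones:
            if wire[y]:
                ones[y].add((x1, x2, x3))
    return ones


same_canonical_form = canonical(P) == canonical(P_ALT)
```

Because `canonical` keeps only minterms, every structurally different realization of the same function maps to the same result, and nothing in that result distinguishes `P` from `P_ALT`.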

60 Perfect White Box Obfuscators
Algorithm O
Truth Table TP
Algorithm A: Complete Sum-of-Products
Algorithm B: Minimal Sum-of-Products
Circuit P’

61 For Designing Catenation-Based Obfuscators : P’ = P + E
HIGH LOW Efficiency HIGH LOW Security Perfect White Box P + E Randomization Circuit Splicing Subcircuit-Canonical Minimization

62 End-to-End Program Protection Architecture

63 Questions ???

