1
Provably Secure Program Protection
Lt Col Todd McDonald AFIT/ENG x4639
2
Research Interests
Program encryption
Program protection / secure coding
Obfuscation / tamperproofing
Mobile agent security / mobile code
Information / database security
Multi-agent architectures
Trust-based computing
3
Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators
4
Program Scenario
5
Program Protection
Adversarial Observation: Black Box Analysis. If the adversary cannot determine the function/intent of the device by input/output analysis, we say it is black-box protected.
Adversarial Observation: White Box Analysis. If the adversary cannot determine the function/intent of the device by analyzing the structure of the code, we say it is white-box protected.
Intent Protected: Combined black-box and white-box protection does not reveal the function/intent of the program.
6
Definitions
“The goal of program obfuscation is to make a program unintelligible while preserving its functionality”
Virtual black box (VBB): anything one can compute from the obfuscated program could also be computed from the input-output behavior of the original program
7
Formally… (yuck)
An obfuscator is an efficient compiler O that takes an input program P and produces a semantically equivalent program P’ = O(P) with three properties:
Functionality: ∀x, P(x) = P’(x), where P’ = O(P).
Polynomial slowdown: O(P) is at most polynomially slower than P (for circuits, the requirement is that the size of O(P) is at most polynomially greater than the size of P).
Virtual black box (VBB) property: the generalized VBB property states that you should not be able to learn more from the obfuscated version of a program, O(M), than a simulator S<M> with oracle access to the original program can learn. It is formulated as follows:
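The formula itself did not survive in this transcript. As a sketch, the usual statement of the VBB property from Barak et al. reads as below; here A is any probabilistic polynomial-time adversary, S is a probabilistic polynomial-time simulator with oracle access to M (written S<M> on the slide), and α is a negligible function. The exact notation used on the original slide is an assumption.

```latex
\forall A \;\; \exists S,\ \alpha \ \text{negligible} \;\; \forall M:
\qquad
\Bigl|\, \Pr\bigl[A(O(M)) = 1\bigr] - \Pr\bigl[S^{M}(1^{|M|}) = 1\bigr] \,\Bigr| \;\le\; \alpha(|M|)
```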
8
Totally Unobfuscatable Functions under VBB
Family of functions F, with a property π : F → {0,1}
1) Given any program that computes f ∈ F, the value π(f) can be efficiently computed
2) Given oracle access to a (randomly selected) function f ∈ F, no efficient algorithm can compute π(f) much better than random guessing
This family is constructed from any one-way function
This family of functions is UNOBFUSCATABLE if (1) and (2) are true
9
VBB Proof Methodology
Because a family of (contrived) functions can be shown to be unobfuscatable, general, efficient, secure obfuscators do not exist.
10
Where are we at?
Known methods of obfuscation are the reverse of good software engineering.
None guarantees that retrieving sensitive information or algorithms is impossible.
A determined specialist, given enough time and resources, is able to deobfuscate any obfuscated program.
Even so, the VBB result does not imply that there is no method for making programs “unintelligible” in some meaningful and precise way.
11
How to Define Security
Explicitly: Define the adversary's task and require that it is computationally difficult. Disadvantage: there are many threats, and some are difficult to formulate as computational problems.
Implicitly: Define an ideal security model and require that our case is nearly as good as the ideal one. Disadvantage: the Barak et al. result shows this is impossible under VBB.
12
Going against the Gold Standard
“Anything that can be efficiently computed from O(P) can be efficiently computed given oracle access to P”
Who died and made them the boss (Barak et al.)?
The intent: Can you eliminate any advantage to seeing the obfuscated source code beyond getting black box access to the original program?
Our contention and intuition: Given the (obfuscated) source code, you will always have an advantage in learning something about the original program over what you could learn from black box access alone.
13
Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators
14
Properties of Random Program Obfuscators
Black Box Protection
Y = O_RAND(A, K)
Y and A are semantically different
A has input/output consistent with the function of the program
Y has input/output consistent with a family of one-way function circuits
15
Properties of Random Program Obfuscators
Black Box Protection
16
Semantic Encryption Transformation
17
Program/Circuit P Py1 Px1 P Pxn Pyn
18
Strongly Pseudorandom Data Ciphers
K 232 ~256 Truth Tables
19
Semantically Secure Black Box Protection
20
Semantically Secure Black Box Protection
P’ = O(P)
21
Things to Be Done: P + E
Living under Kerckhoffs's principle
Program encryption generation engine
Unique encryption ciphers / key-based
Security characterizations
Number of E’s
Input sizes
Practical implementation issues
22
White Box Protection ?? Circuit P’
23
Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators
24
IEEE International Symposium on Circuits and Systems (ISCAS) Format
# Comment: Inputs
INPUT(WIRE)
# Comment: Outputs
OUTPUT(WIRE)
# Comment: Gate Specifications
GATE = FUNCTION(OPERAND, OPERAND)
OPERAND = {GATE} U {WIRE}
FUNCTION = {AND, OR, NOT, XOR, NXOR, NAND, NOR}
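A netlist in this format is simple enough to read with a few lines of code. The sketch below is one possible reader, not the tool used in the talk; the function name parse_bench and the tiny EXAMPLE netlist are made up for illustration.

```python
import re

# Gate functions named on the slide; NXOR is the slide's spelling of XNOR.
FUNCTIONS = {"AND", "OR", "NOT", "XOR", "NXOR", "NAND", "NOR"}

def parse_bench(text):
    """Parse BENCH-style text into (inputs, outputs, gates).

    gates maps a gate name to (function, [operand names]).
    """
    inputs, outputs, gates = [], [], {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        m = re.fullmatch(r"(INPUT|OUTPUT)\((\w+)\)", line)
        if m:
            (inputs if m.group(1) == "INPUT" else outputs).append(m.group(2))
            continue
        m = re.fullmatch(r"(\w+)\s*=\s*(\w+)\s*\(([^)]*)\)", line)
        if not m:
            raise ValueError(f"unrecognised line: {raw!r}")
        name, func, args = m.group(1), m.group(2).upper(), m.group(3)
        if func not in FUNCTIONS:
            raise ValueError(f"unknown function: {func}")
        gates[name] = (func, [a.strip() for a in args.split(",")])
    return inputs, outputs, gates

# Hypothetical two-input example, not one of the ISCAS benchmark circuits.
EXAMPLE = """
# Inputs
INPUT(1)
INPUT(2)
# Outputs
OUTPUT(3)
# Gate specifications
3 = NAND(1, 2)
"""
inputs, outputs, gates = parse_bench(EXAMPLE)
print(inputs, outputs, gates)
```

Keeping gates in a dictionary keyed by the driven wire makes later steps (evaluation, sub-circuit selection) a matter of simple lookups.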
25
BENCH and BED Formats C17.gif C17.bench
Binary Expression Diagram (BED)
26
BENCH Workflow
[Workflow diagram. Labels: c1000.bench (ISCAS format); bench (Windows); Graphviz DOT format (C1000.dot); graphics format (C1000.gif); BED C program (C1000.c); gcc (Linux); executable (c1000).]
27
We Need a Better Security Model… and Provable Security Under that Model
28
Obfuscation under RPM ? CAL’ CL Y = ORAND(A,K) Circuit Y Circuit X
Circuit A
29
Random Programs/Circuits
30
Random Programs
31
White Box Understandable based on Random Program Oracles
32
White Box Understandable based on Random Program Oracles
33
Gulf of Mexico: 486,800,000,000,000,000,000 teaspoons
Atlantic Ocean: 70,940,000,000,000,000,000,000 teaspoons
Gulf of Mexico / Atlantic Ocean ≈ 0.69%
34
Correlating Program and Data Encryption
Randomizing Obfuscators
35
Generating a Circuit Library
1) # of INPUTS
2) # of OUTPUTS
3) CIRCUIT SIZE
+ AND, OR, NAND, NOR, XOR, NXOR
All Possible Combinations
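To make "all possible combinations" concrete, the sketch below enumerates every circuit for a given input count, output count and size over the six listed gate types. Several choices are assumptions made here and not taken from the slides: gates have exactly two operands, a gate may read any earlier wire, and the last n_outputs gates serve as the outputs.

```python
from itertools import product

GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "NXOR": lambda a, b: 1 - (a ^ b),
}

def enumerate_circuits(n_inputs, n_outputs, size):
    """Yield every circuit as a tuple of (function, left, right) gates.

    Wires 0..n_inputs-1 are the inputs; gate i drives wire n_inputs + i and
    may read any earlier wire. Assumption: the last n_outputs gates are the
    circuit outputs.
    """
    assert 1 <= n_outputs <= size
    choices = []
    for i in range(size):
        wires = range(n_inputs + i)
        choices.append([(f, l, r) for f in GATES for l in wires for r in wires])
    yield from product(*choices)

def truth_table(circuit, n_inputs, n_outputs):
    """Return the circuit's output tuple for every input assignment."""
    rows = []
    for bits in product((0, 1), repeat=n_inputs):
        wires = list(bits)
        for func, left, right in circuit:
            wires.append(GATES[func](wires[left], wires[right]))
        rows.append(tuple(wires[-n_outputs:]))
    return tuple(rows)

library = list(enumerate_circuits(n_inputs=2, n_outputs=1, size=2))
functions = {truth_table(c, 2, 1) for c in library}
print(len(library), "circuits realise", len(functions), "distinct functions")
```

Even this tiny setting yields 24 × 54 = 1296 circuits, which gives a feel for how quickly a full library grows with circuit size.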
36
Correlating Program and Data Encryption
[Diagram labels: HLL or ASM program; circuit representation; sub-circuit selection; sub-circuit replacement; HLL or ASM program; S-box selection; iterative rounds.]

Linear cryptanalysis was first openly published as a means for attacking DES by Mitsuru Matsui at EUROCRYPT ’93 [6]. His method attempts to find a linear relation among the plaintext, ciphertext, and keys as they pass through the s-boxes. With enough known plaintext/ciphertext pairs as data, a relation with a high enough probability can be used to find the key. Matsui generated linear approximation tables for the 8 DES s-boxes and found the strongest linearity in S5 (the fifth s-box). The tables were created by analyzing all the combinations of the input and output bits of the s-boxes. Since there are 6 input bits and 4 output bits, there are 1024 (= 2^6 · 2^4) entries in his tables for every s-box. A linear approximation is stronger if it is significantly greater or less …

Eli Biham took this one step further to help define restrictions on s-boxes that make them more resistant to linear cryptanalysis [8]. He found that increasing the number of output bits of an s-box can significantly weaken it against linear cryptanalysis. More precisely, he found that in an m × n s-box, where m is the number of input bits and n is the number of output bits, if n ≥ 2^m − m, the s-box must have a linear property of the input and output bits.

With the primary modes of attack on DES-like algorithms defined, rules can be established for how s-boxes are designed and used. Researchers can also examine how others’ s-box designs match up against those cryptanalysis techniques. There are several ways of making better s-boxes than the ones specified in DES; however, Schneier states that “… blindingly choosing new sboxes isn’t a good idea.” [15] Among the common and well-documented features of s-boxes that are considered viable are those that permit the algorithm to follow the Strict Avalanche Criterion (SAC).

The avalanche effect was first published in the cryptography world by Horst Feistel [16]. In that study, it was determined that when an input bit goes through the system, the resulting output contains, on average, an equal number of 1’s and 0’s. This was taken one step further by Webster and Tavares [17], who required each output bit to change with probability one half whenever a single input bit changes.

Another consideration is the size of the s-box. From the above discussions on cryptanalysis, a large box would be better than a small one. A large number of output bits is needed to protect against differential attacks; however, a correspondingly large number of input bits is also needed to protect against linear cryptanalysis. A balance of the two is needed.

Finally, there are three requirements regarding the values in the s-box. First, the distribution of outputs must be checked for uniformity to protect against the Davies’ Attack. Second, the outputs must have no linearity in their function to the input. Third, there must be unique values in every row of the s-box. There are several other requirements; however, they are beyond the scope of this paper.
37
Three Focus Areas
Semantic Transformation
Random Program Security Model / Randomizing Obfuscators
Perfectly Secure White Box Obfuscators
38
Perfect White Box Protection
#include <cstdlib>
#include <iostream>

int main(int argc, char *argv[]) {
    int x, y;
    /* Get input from the user (argv[1] is text, so convert it) */
    x = std::atoi(argv[1]);
    /* Super secret algorithm */
    ……..
    /* Output the result */
    std::cout << y;
}
39
Perfect White Box Protection
What is the best we can hope for to protect the “structure” of the code that performs the secret algorithm?
We want the program to act just like an oracle would.
We want the program to be a “black-box” implementation.
40
Perfect White Box Protection = Black Box Implementation
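#include <cstdlib>
#include <iostream>

int main(int argc, char *argv[]) {
    int x, y;
    /* Get input from the user (argv[1] is text, so convert it) */
    x = std::atoi(argv[1]);
    /* Super secret algorithm, replaced by one branch per input */
    if (x == 1) y = …;
    else if (x == 2) y = 23;
    else if (x == 3) y = …;
    ….
    /* Output the result */
    std::cout << y;
}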
41
Perfect White Box Protection
Problems with this approach:
You have to know all inputs/outputs.
Therefore, the algorithm could never be efficient for all input sizes n.
Therefore, the algorithm could never be general for all programs.
Which lends support to what Barak was saying…
42
Perfect White Box Protection
But:
Mobile code is typically made up of small programs.
Input size might be limited.
You may not care about the full range of possible inputs, only some…
43
Perfect White Box Protection
Regardless of efficiency:
We can define a methodology for perfect white box protection.
We could apply that method to programs of small input size n (which is limited only by the amount of time or resources you want to apply to get the result).
Those programs would be perfectly white box protected.
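The methodology amounts to tabulating the program on every input and shipping only the table. A minimal sketch follows; the names make_black_box and secret are made up for illustration, and secret is only a stand-in for the "super secret algorithm". The table has 2^n entries for n input bits, which is why the approach is limited to small n.

```python
from itertools import product

def make_black_box(program, n_bits):
    """Tabulate `program` on every n-bit input and return a lookup-only version."""
    table = {bits: program(*bits) for bits in product((0, 1), repeat=n_bits)}

    def protected(*bits):
        # Pure table lookup: nothing here shows how `program` computed the value.
        return table[bits]

    return protected, table

# Hypothetical stand-in for the secret algorithm.
def secret(a, b, c):
    return (a & b) ^ c

protected, table = make_black_box(secret, 3)
print(len(table))          # 8 entries for 3 input bits
print(protected(1, 1, 0))  # 1
```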
44
Circuits
Consider circuit P. It has 3 representations:
Algebraic (Boolean function)
Structural (circuit diagram)
Truth table (input/output behavior)
Structural view of P:
INPUT(3)
INPUT(2)
INPUT(1)
OUTPUT(7)
OUTPUT(6)
4 = AND(3,2)
5 = OR(4,1)
6 = XOR(4,3)
7 = NAND(5,6)
45
Circuits Behavioral view of P:
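The behavioral view is the truth table of P. As a sketch, it can be regenerated by simulating the gate list from the structural view (4 = AND(3,2), 5 = OR(4,1), 6 = XOR(4,3), 7 = NAND(5,6)); the function name circuit_p and the column layout are choices made here.

```python
from itertools import product

def circuit_p(x1, x2, x3):
    """Evaluate circuit P gate by gate; wire numbers follow the slide."""
    w4 = x3 & x2            # 4 = AND(3,2)
    w5 = w4 | x1            # 5 = OR(4,1)
    w6 = w4 ^ x3            # 6 = XOR(4,3)
    w7 = 1 - (w5 & w6)      # 7 = NAND(5,6)
    return w6, w7

truth_table = {bits: circuit_p(*bits) for bits in product((0, 1), repeat=3)}

print("x1 x2 x3 | y6 y7")
for (x1, x2, x3), (y6, y7) in truth_table.items():
    print(f" {x1}  {x2}  {x3} |  {y6}  {y7}")
```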
46
Circuits
Functional view of P: fP
Derive it from structure:
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
Derive it from truth table:
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
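Deriving the functional view from the truth table means writing one minterm per input row where the output is 1. The sketch below does that for the two outputs of P; the helper name complete_sop is made up, and y6 and y7 are written directly from the gate list (6 = XOR(4,3), 7 = NAND(5,6)).

```python
from itertools import product

def complete_sop(f, names):
    """Return the sum of minterms of Boolean function f as a string."""
    terms = []
    for bits in product((0, 1), repeat=len(names)):
        if f(*bits):
            # A variable appears plain for a 1 and complemented (') for a 0.
            terms.append("".join(n if b else n + "'" for n, b in zip(names, bits)))
    return " + ".join(terms) if terms else "0"

names = ("x1", "x2", "x3")
y6 = lambda x1, x2, x3: (x3 & x2) ^ x3
y7 = lambda x1, x2, x3: 1 - (((x3 & x2) | x1) & ((x3 & x2) ^ x3))

print("y6 =", complete_sop(y6, names))
print("y7 =", complete_sop(y7, names))
```

The y6 line comes out as x1'x2'x3 + x1x2'x3, and y7 has seven minterms, every row except x1x2'x3.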
47
Circuits
There are many different but equivalent realizations of the same function (including minterm and maxterm realizations).
There is no “right” realization given any function.
If a Boolean expression is written in a certain form, it will always be obvious, given two expressions, whether we are dealing with the same function or different functions.
48
Circuits Such forms are termed “canonical forms”
Canonical forms are “official” forms for writing the algebraic expression of a given type (such as Boolean algebraic expressions)
49
Circuits
There is one and only one canonical realization for each function.
It is (should be) impossible to have different canonical realizations of the same function, except for differences based on commutativity: abc’ + b’c = cb’ + c’ba
There is only 1 minterm realization of any function.
50
Circuits
Take these 2 functions for example:
b’c’ + bc + a’b
b’c’ + bc + a’c’
These two expressions are equivalent (each adds only the minterm a’bc’ to b’c’ + bc), yet neither can be simplified any further.
51
Circuits
Blake canonical form (BCF):
Take a Boolean function in SOP form.
Perform a sequence of simplification steps.
The result is a unique and compact representation of the original circuit.
b’c’ + bc + a’b
b’c’ + bc + a’c’
The BCF of the 2 equivalent circuits above is: b’c’ + bc + a’b + a’c’
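The slide does not spell out the simplification steps. One standard way to reach the Blake canonical form is iterated consensus with absorption, sketched below for single-letter variables; this technique is swapped in here and is not necessarily the procedure used in the talk. All function names are made up for illustration.

```python
from itertools import combinations

def parse(sop):
    """Parse "b'c' + bc + a'b" into a set of terms. A term is a frozenset of
    (variable, is_positive) literals; variables are single letters."""
    terms = set()
    for chunk in sop.split("+"):
        chunk = chunk.strip()
        lits, i = set(), 0
        while i < len(chunk):
            neg = i + 1 < len(chunk) and chunk[i + 1] == "'"
            lits.add((chunk[i], not neg))
            i += 2 if neg else 1
        terms.add(frozenset(lits))
    return terms

def consensus(s, t):
    """Return the consensus of two terms, or None if it does not exist."""
    clash = [v for v, p in s if (v, not p) in t]
    if len(clash) != 1:        # consensus needs exactly one opposed variable
        return None
    return frozenset(l for l in s | t if l[0] != clash[0])

def absorb(terms):
    """Drop every term that is covered by a strictly smaller term."""
    return {t for t in terms if not any(u < t for u in terms)}

def blake(sop):
    """Blake canonical form (the set of all prime implicants)."""
    terms = absorb(parse(sop))
    while True:
        new = set()
        for s, t in combinations(terms, 2):
            c = consensus(s, t)
            if c is not None and not any(u <= c for u in terms):
                new.add(c)
        if not new:
            return terms
        terms = absorb(terms | new)

def show(terms):
    """Render a set of terms in a fixed (sorted) order."""
    fmt = lambda t: "".join(v if p else v + "'" for v, p in sorted(t))
    return " + ".join(sorted(fmt(t) for t in terms))

f1 = blake("b'c' + bc + a'b")
f2 = blake("b'c' + bc + a'c'")
print(show(f1))
print(f1 == f2)   # True
```

Both inputs end up with the same four terms (b'c', bc, a'b, a'c'), which is the sense in which the form is canonical.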
52
Circuits
Functional view of P: fP
Derive it from structure:
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
I reduced it by hand:
y6 = x3x2’
y7 = x1’ + x3’ + x2
53
Circuits No really I did:
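The reduction can be retraced in a few lines of Boolean algebra. This is a sketch starting from the gate list of P (6 = XOR(4,3) with 4 = AND(3,2), and 7 = NAND(5,6) with 5 = OR(4,1)), not a copy of the working shown on the slide; a prime denotes complement.

```latex
\begin{aligned}
y_6 &= x_3x_2 \oplus x_3
     = (x_3x_2)\,x_3' + (x_3x_2)'\,x_3
     = 0 + (x_3' + x_2')\,x_3
     = x_3x_2' \\
y_7 &= \bigl((x_3x_2 + x_1)\,y_6\bigr)'
     = \bigl(x_3x_2\,x_3x_2' + x_1x_3x_2'\bigr)'
     = (x_1x_2'x_3)'
     = x_1' + x_2 + x_3'
\end{aligned}
```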
54
Circuits
Functional view of P: fP
Derive it from truth table:
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
This is in complete-SOP form already
I applied Blake’s method to get:
y6 = x2’x3
y7 = x1’ + x3’ + x2
55
So what does canonical minimization do?
All you need is the truth table or behavioral view to get an SOP form
56
So what does canonical minimization do for us?
This is what an oracle for P would “use” when asked questions about P … Any circuit that implements this truth table would then be a “black box implementation” of P
57
The “Logic” of Canonical P
if ((x1 == 0) && (x2 == 0) && (x3 == 0)) { y6 = 0; y7 = 1; }
else if ((x1 == 0) && (x2 == 0) && (x3 == 1)) { y6 = 1; y7 = 1; }
…
58
Can I ever recover the structure of the original P from canonical P?
59
Can I ever recover the structure of the original P from canonical P?
Canonical P:
y6 = x3x2’
y7 = x1’ + x3’ + x2
BOTH are forward derivations.
From the truth table:
y6 = x1’x2’x3 + x1x2’x3
y7 = x1’x2’x3’ + x1’x2’x3 + x1’x2x3’ + x1’x2x3 + x1x2’x3’ + x1x2x3’ + x1x2x3
From the structure:
y6 = ((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’
y7 = ((x3x2 + x1)((x3x2(x3x2x3)’)’(x3(x3x2x3)’)’)’)’
This would reveal the gate structure
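One way to see why the derivation only runs forward: structurally different circuits collapse to the same truth table, so the canonical form cannot say which one it came from. In the sketch below, original_p follows the gate list of P, while other_p is a hypothetical alternative structure invented for the comparison.

```python
from itertools import product

def original_p(x1, x2, x3):
    """Gate structure from the slides: AND, OR, XOR, NAND."""
    w4 = x3 & x2
    w5 = w4 | x1
    w6 = w4 ^ x3
    return w6, 1 - (w5 & w6)

def other_p(x1, x2, x3):
    """Hypothetical alternative structure built from NOT, AND and OR only."""
    n2 = 1 - x2
    y6 = x3 & n2
    y7 = (1 - x1) | (1 - x3) | x2
    return y6, y7

def table(circuit):
    """Truth table of a 3-input circuit as a tuple of output pairs."""
    return tuple(circuit(*bits) for bits in product((0, 1), repeat=3))

print(table(original_p) == table(other_p))   # True
```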
60
Perfect White Box Obfuscators
Algorithm O: Truth Table TP → Circuit P’
Algorithm A: Complete Sum-of-Products
Algorithm B: Minimal Sum-of-Products
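As a rough illustration of the complete sum-of-products route, the sketch below turns a single-output truth table into BENCH-like netlist text. It is not the talk's Algorithm A: the wire names (x1.., n1.., m0..), the use of gates with more than two operands, and the restriction to functions with at least two minterms are all assumptions made to keep the example short.

```python
from itertools import product

def complete_sop_bench(table, n_inputs, out="y"):
    """Truth table -> complete-SOP netlist as BENCH-like text.

    `table` maps each input tuple to 0 or 1. Assumptions: made-up wire names,
    n-ary AND/OR gates, n_inputs >= 2 and at least two minterms.
    """
    ones = [bits for bits in product((0, 1), repeat=n_inputs) if table[bits]]
    if len(ones) < 2 or n_inputs < 2:
        raise ValueError("sketch needs n_inputs >= 2 and at least two minterms")
    lines = [f"INPUT(x{i})" for i in range(1, n_inputs + 1)]
    lines.append(f"OUTPUT({out})")
    # One inverter per input, shared by all minterms.
    lines += [f"n{i} = NOT(x{i})" for i in range(1, n_inputs + 1)]
    # One AND gate per row of the table whose output is 1.
    for k, bits in enumerate(ones):
        lits = ", ".join(f"x{i}" if b else f"n{i}" for i, b in enumerate(bits, 1))
        lines.append(f"m{k} = AND({lits})")
    ors = ", ".join(f"m{k}" for k in range(len(ones)))
    lines.append(f"{out} = OR({ors})")
    return "\n".join(lines)

# Truth table of output y6 of circuit P: 1 exactly when x2 = 0 and x3 = 1.
table_y6 = {bits: int(bits[1] == 0 and bits[2] == 1)
            for bits in product((0, 1), repeat=3)}
netlist = complete_sop_bench(table_y6, 3, out="y6")
print(netlist)
```

The emitted netlist depends only on the truth table, so any two circuits with the same behavior produce the same text.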
61
For Designing Catenation-Based Obfuscators: P’ = P + E
[Chart: Efficiency axis (HIGH to LOW) against Security axis (HIGH to LOW). Labels: Perfect White Box; P + E Randomization; Circuit Splicing; Subcircuit-Canonical Minimization.]
62
End-to-End Program Protection Architecture
63
Questions ???