
A Graph Game Model for Software Tamper Protection Information Hiding ‘07 June 11-13, 2007 Mariusz Jakubowski Ramarathnam Venkatesan Microsoft Research.


1 A Graph Game Model for Software Tamper Protection
Information Hiding ’07, June 11–13, 2007
Mariusz Jakubowski, Ramarathnam Venkatesan (Microsoft Research)
Nenad Dedić (Boston University)

2 Overview: Modeling of software tamper-resistance
– Introduction
– Past work on software protection
– Definitions of tamper-resistance
– Anti-tampering transformations
– Security analysis
– Conclusion

3 Software Protection
– Obfuscation: making programs “hard to understand”
– Tamper-resistance: making programs “hard to modify”
– Obfuscation ⇒ tamper-resistance. Tamper-resistance ⇒ obfuscation?

4 Formal Obfuscation
Impossible in general:
– Black-box model (Barak et al.): “source code” doesn’t help an adversary who can examine input-output behavior.
– Worst-case programs and poly-time attackers.
Possible in specific limited scenarios:
– Secret hiding by hashing (Lynn et al.)
– Point functions (Wee; Kalai et al.)
Results are difficult to use in practice.

5 Tamper-resistance
Many techniques are used in practice, e.g.:
– code-integrity checksums
– anti-debugging and anti-disassembly methods
– virtual machines and interpreters
– polymorphic and metamorphic code
A never-ending battle in a very active field:
– Targets: DRM, CD/DVD protection, games, dongles, licensing, etc.
– Defenses: binary packers and “cryptors,” special compilers, transformation tools, programming strategies, etc.
Current techniques tend to be “ad hoc”:
– no provable security
– no analysis of the time required to crack protected instances

6 Problem Definition
We would like an algorithm Protect with roughly the following properties:
– For any program P, Protect(P) outputs a new program Q that uses almost the same resources as P.
– For any attacker A, if A(Q) outputs Q’, then either:
  o for any input x, Q’(x) = Q(x), or
  o Q’ “crashes.”
Informally, tamper-protected P either works exactly like P or fails.

7 Problems with the Definition
Recall the requirements: for any program P, Protect(P) outputs a program Q using almost the same resources, and for any attacker A, if A(Q) outputs Q’, then either Q’(x) = Q(x) for every input x, or Q’ “crashes.”
The definition is imprecise, but there is a bigger problem: it is unattainable.
Example “attack”: A(Q) = “run Q; append 0 to the output.” This attack is harmless, yet it breaks the definition. There is no easy way out!
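The definitional gap can be made concrete with a toy sketch (hypothetical programs Q and Q’, not from the talk itself):

```python
def Q(x):
    # stand-in for a tamper-protected program
    return x * 2

def Q_prime(x):
    # The "attack" A(Q) = "run Q; append 0 to the output": harmless
    # in practice, yet Q' neither matches Q on all inputs nor
    # crashes, so it violates the strict definition above.
    return str(Q(x)) + "0"
```

Q’ behaves differently from Q on every input without crashing, so a definition demanding "exact equality or crash" rules out even this benign transformation.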

8 Towards a Realistic Model
– Give up on complete protection of P:
  o protect mainly some critical code portion L;
  o protect other parts to deflect attention away from L.
– Model restricted (but realistic) attackers.
– Make engineering assumptions about security:
  o code transformations
  o tamper detection
  o dataflow
  o control flow

9 Known Techniques and Attacks
Main scenario: program P contains some security-critical code L. For example:
– L verifies that P is licensed software.
– L verifies that P has a license for rendering content.
– L contains important data (e.g., keys and credentials).
Next: a survey of known techniques and attacks, to motivate the model and analysis.

10 Single-Point Check
L is called from P:
  if (L returns 1) then proceed; else terminate;
Attack: control-flow analysis can help identify L; calls to L can then be patched.
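A minimal sketch of this scheme and of the patch attack against it (all names hypothetical; the patch is modeled by swapping in a stub for the check):

```python
def L(licensed):
    # security-critical check, e.g. license verification
    return 1 if licensed else 0

def P(licensed, check=L):
    # Single call site: once control-flow analysis locates it,
    # the attacker patches the call so the check always passes.
    if check(licensed) == 1:
        return "proceed"
    return "terminate"

patch = lambda _licensed: 1   # attacker's stub: always reports "licensed"
```

With the real check, an unlicensed run terminates; with the one-line patch, it proceeds, which is why a single-point check is fragile.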

11 Distributed Check
L is called from P: if (L returns 1) then proceed; else terminate;
L is broken up into pieces, and/or individualized copies are replicated.
Attacks, based on the flow graph:
1. L is typically weakly connected to the rest of P.
2. Guess the position of one copy of L; use graph-diffing (subgraph matching) to find the other copies.

12 Code Checksums
To prevent tampering, compute checksums C1,…,Ck of code segments.
During execution, compare checksums of loaded code segments with pre-computed values.
Attack: reading the code segment can be trapped (some hardware or VM support may be needed); the correct code segment can then be supplied by a cracked program or VM.
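A minimal Python sketch of the idea, using a function's bytecode as a stand-in for a code segment (illustrative names, not from the talk):

```python
import hashlib

def critical():
    return 42

# Protect time: precompute the checksum of the code bytes.
EXPECTED = hashlib.sha256(critical.__code__.co_code).hexdigest()

def guarded_call():
    # Run time: re-hash the loaded code and compare with the
    # precomputed value; a mismatch triggers the tamper response.
    actual = hashlib.sha256(critical.__code__.co_code).hexdigest()
    if actual != EXPECTED:
        raise RuntimeError("tamper response: code segment modified")
    return critical()
```

The attack on the slide corresponds to intercepting the read of `co_code` and returning the original bytes while modified code actually runs.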

13 Oblivious Hashing
Main idea of OH: compute hashes H1,…,Hk of execution traces.
– Update hashes with values of assigned variables and with identifiers based on control flow.
– Correct hashes can be precomputed and used to encrypt some data.
– Individualized code replicas can be created; OH values from each replica should be equal.
Attacks:
– Precomputed hash values could be discovered.
– The code-replica scheme could be attacked using program analysis (addressed in this work).
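A toy sketch of the OH idea (illustrative only; the update function and constants are my choices, not the paper's): the running hash absorbs assigned values and branch tags, so the final value fingerprints the execution trace.

```python
def oh_update(h, value):
    # Fold an assigned value (or a branch tag) into the trace hash.
    return (h * 31 + value) & 0xFFFFFFFF

def compute(x):
    h = 0
    y = x * x
    h = oh_update(h, y)        # hash the assignment to y
    if y > 10:
        h = oh_update(h, 1)    # hash which branch was taken
        z = y - 10
    else:
        h = oh_update(h, 0)
        z = y
    h = oh_update(h, z)        # hash the assignment to z
    return z, h
```

An individualized replica of `compute` must reproduce the same final hash, which is what the equality check between replicas exploits.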

14 Anti-disassembly
Disassembling can be made difficult by virtualization and individualization.
Idea: convert P into instances I = (V_I, P_I), where V_I is a virtual machine and P_I is an implementation of P for V_I.
Different instances I, J can have V_I ≠ V_J, so disassembling I is of little help in disassembling J.
Attack: vulnerable to attacks that do not use low-level details. E.g. the “copy attack”: to find out if branch B is causing a crash, save state before B and try multiple paths.
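The instance idea can be sketched as a tiny stack VM whose opcode encoding is individualized per instance (all names and the instruction set are hypothetical):

```python
import random

def make_instance(seed):
    # Each instance draws its own opcode encoding, so bytecode (and a
    # disassembler) for instance I is of little help against instance J.
    rng = random.Random(seed)
    names = ["PUSH", "ADD", "MUL"]
    enc = dict(zip(names, rng.sample(range(256), len(names))))
    dec = {c: n for n, c in enc.items()}

    def compile_prog(prog):
        out = []
        for op, *args in prog:
            out.append(enc[op])
            out.extend(args)
        return out

    def run(bytecode):
        stack, i = [], 0
        while i < len(bytecode):
            op = dec[bytecode[i]]
            if op == "PUSH":
                stack.append(bytecode[i + 1])
                i += 2
            else:
                b, a = stack.pop(), stack.pop()
                stack.append(a + b if op == "ADD" else a * b)
                i += 1
        return stack[-1]

    return compile_prog, run
```

Two instances built from different seeds run the same source program but emit differently encoded bytecode.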

15 Defense Against Copy Attack
1. Crash only after multiple tampering changes are detected.
2. Have many possible crash locations.
3. Delay the crash.
4. Randomize execution paths.
Somewhat achievable using known techniques, e.g.:
– Use redundant copies of encrypted data.
– Make many code fragments dependent on checks.
– Overlap code sections for context-dependent semantics.
– Create multiple individualized copies of code.

16 Defense Against Program Analysis
Basic notion: “local indistinguishability.” Ideally, local observations of code/data/execution should give no useful information to the attacker; in practice, try to satisfy this as much as possible.
1. Small code fragments all look alike (e.g., use semantics-preserving peephole transformations).
2. The control-flow graph looks like a complete graph (e.g., use computed jumps and opaque predicates).
3. The dataflow graph looks like a complete graph (e.g., use lightweight encryption and temporary-variable corruption).
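For point 2, an opaque predicate is a branch condition whose outcome is fixed at run time but not obvious statically; a standard textbook example (not specific to this paper) uses the parity of x(x+1):

```python
def opaque_true(x):
    # x*x + x = x*(x+1) is a product of consecutive integers, hence
    # always even -- so the branch below is always taken at run time,
    # while a naive static view sees a data-dependent two-way branch.
    return (x * x + x) % 2 == 0

def f(x):
    if opaque_true(x):
        return x + 1
    return x - 1   # never-executed edge that pads the control-flow graph
```

Sprinkling such predicates adds spurious edges, pushing the apparent control-flow graph toward a complete graph.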

17 Detecting Unusual Data/Code
Security-related data/code can look unusual and rare (e.g., XOR used for encryption and random data used for crypto keys both stand out and can be detected).
To mitigate, one can use peephole transformations, near-clear encryption, data scattering, etc.

18 An assortment of tools is available.
– How to combine them effectively?
– How much security can we get?
– How to quantify security?

19 Basic Model
An abstraction of software tamper-resistance:
– Program: a graph G.
– Execution: a “random” walk on G.
– Integrity check: a group of nodes in G responsible for (probabilistically) monitoring a set of code fragments.
– Check failure: tampering flagged on all code fragments in a check’s set.
– Tamper response: an action taken when a “sufficient” number of checks have failed.

20 Elements of the Model
Local tamper check: C = InsertCheck(F1,…,Fs)
– A check C of size s is specified by s code fragments F1,…,Fs.
– Each Fi detects tampering with probability p.
– The check fails if each Fi detects tampering.
– (There can be many checks C1,…,Ck.)
Tamper response: InsertResponse(P, (C1,…,Ck), f)
– P “crashes” if at least f checks fail (f is the threshold).
– (The “crash” could be any other form of response: slowdown, graceful degradation, loss of features, self-correction, etc.)
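These two elements can be sketched directly in the graph model, with nodes as integers and p = 1 for simplicity (illustrative code, not the paper's implementation):

```python
import random

def insert_checks(n, k, s, rng):
    # k checks, each specified by s node indices ("code fragments").
    return [tuple(rng.sample(range(n), s)) for _ in range(k)]

def tamper_response(checks, tampered, f):
    # With p = 1, a fragment detects tampering iff its node was
    # tampered; a check fails only when ALL its fragments detect it.
    failed = sum(all(v in tampered for v in c) for c in checks)
    return failed >= f   # True => the program "crashes"
```

Tampering with every node trips every check, while leaving all nodes alone trips none, matching the failure and threshold semantics above.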

21 Elements of the Model
Graph transformations: (V,E) = GraphTransform(P, n)
– P is transformed into an equivalent program Q with flow graph G = (V,E) containing n nodes.
– G is random-looking (rapid mixing of random walks); execution of Q induces a random-looking walk on G.
Critical-code embedding: F’ = CodeEntangle(F, L)
– Critical code L is embedded into code fragment F, yielding F’.
– F’ is equivalent to “if L returns 1 then execute F”.
– It is desirable to make embedded code hard to remove.

22 The Algorithm
Harden(P, L, l, n, k, s, f):
  let G = (V,E) = GraphTransform(P, n)
  for i = 1 to l do
    select at random v ∈ V
    v = CodeEntangle(v, L)
  for i = 1 to k do
    select at random (v1,…,vs) ∈ V^s
    Ci = InsertCheck(v1,…,vs)
  InsertResponse(G, (C1,…,Ck), f)
Main ideas:
1. Transform the flow graph into a “random” one.
2. Replicate critical code in l random nodes.
3. Randomly insert k checks of size s.
4. Create a check response with threshold f.
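In the abstract graph model, Harden reduces to random placement. A sketch (treating GraphTransform, CodeEntangle and InsertCheck as the abstract operations above, with nodes as integers; not the paper's implementation):

```python
import random

def harden(n, l, k, s, f, seed=0):
    rng = random.Random(seed)
    nodes = range(n)                        # G = GraphTransform(P, n)
    # Step 2: replicate critical code L in l randomly chosen nodes.
    critical = {rng.randrange(n) for _ in range(l)}
    # Step 3: insert k checks, each over s random nodes (InsertCheck).
    checks = [tuple(rng.sample(nodes, s)) for _ in range(k)]
    # Step 4: the response threshold f travels with the checks
    # (InsertResponse).
    return critical, checks, f
```

The security analysis later in the talk is about how hard it is for an attacker to locate the `checks` tuples produced by this random placement.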

23 The Algorithm
Programmer assistance can help the algorithm:
o Choose places to embed critical code L.
o Identify code/data suitable for checking.
o Identify code/data suitable for the tamper response.

24 Attack Model
The attacker plays a game on the program graph G.
Goal: run the program while avoiding execution of the critical code L.
Game moves:
– Make a step on G:
  o either follow untampered execution of P,
  o or tamper to change execution (tampering is detected by checks…).
– Guess a check D = (u1,…,us): if D = Ci, then Ci is disabled.
If P crashes, restart.

25–35 Attack Model
The attacker A plays a game on the flow graph G = (V,E).
– A check is a set of nodes; some nodes contain the critical code.
– Execution is a walk on G.
– In each (random) step, A can either:
  o observe (models untampered execution), or
  o tamper with the current node.
– A check is activated when all its nodes are tampered.
– P crashes when f checks are activated.
– A tries to guess a check; if the guess is correct, the check is disabled (it can’t be activated).

36–37 Attack Model
Game moves: observe, tamper, guess.
The attacker wins if:
– P runs for > N steps without crashing, and
– every step in critical code is tampered.

38 Security Estimates
Security analysis in the graph model. Parameters:
– k = cn (number of checks proportional to number of nodes)
– f = cn/2 (response threshold is half of the checks)
– p = 1 (tamper detection is perfect)
– l = n (critical code replicated in every node)
– N = n^(1+ε) (required running time before crash)
The analyzed attacks take Ω(n^s) time! (s = check size)
No proof yet for arbitrary attacks; more work is needed.

39 Security Arguments
P runs for > N steps
⇒ a “long” rapidly mixing random walk
⇒ critical code is encountered “many” times
⇒ A must tamper with “many” nodes
⇒ the program crashes.
Claim 1: As long as no check is disabled, A wins with exponentially small probability. (It is enough that “not too many” checks are disabled.)

40 Security Arguments
Claim 1: As long as no check is disabled, A wins with exponentially small probability. (It is enough that “not too many” checks are disabled.)
Desired Claim 2: Any O(n^(cs)) attacker learns a check location with exponentially small probability.
Claim 1 + Claim 2 ⇒ no O(n^(cs)) attacker can win.
So far only some specific attacks have been analyzed; there is no complete proof of Claim 2 yet.

41 Attack 1: Voting Attack
Let V = {1,…,n}. Each check is an s-tuple of integers (v1,…,vs).
Main idea: Suppose A tampers with P, which subsequently crashes. Let W ⊆ V denote the tampered nodes. Then any s-tuple (v1,…,vs) ∈ W^s is more likely to be a check than not. So “vote” for all (v1,…,vs) ∈ W^s; do this D times and output the k candidates with the most votes.

42 Attack 1: Voting Attack
Let V = {1,…,n}. Each check is an s-tuple of integers (v1,…,vs).
1. Fill an s-dimensional n × n × … × n array B with zeros.
2. For i = 1 to D:
   a. Run P and tamper with it arbitrarily until it crashes (let W be the set of tampered nodes).
   b. For each (v1,…,vs) ∈ W^s, set B[v1,…,vs] = B[v1,…,vs] + 1.
3. Find the k entries of B with the highest values and output their indices as guesses for check nodes.
Can prove: updating the table of votes takes n^s steps, so n^s is a lower bound on the attack time.
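A runnable sketch of the voting attack for small n, with a Counter standing in for the s-dimensional array B (illustrative code, not from the paper):

```python
from collections import Counter
from itertools import product

def voting_attack(runs, k, s):
    # runs: one set W of tampered nodes per crashed execution.
    votes = Counter()
    for W in runs:
        for tup in product(sorted(W), repeat=s):
            votes[tup] += 1          # every tuple in W^s gets a vote
    # Tuples of genuine checks appear in many crashing runs, so they
    # accumulate votes faster than chance tuples.
    return [t for t, _ in votes.most_common(k)]
```

With a check on nodes {1, 2} activated in every run, the four tuples over {1, 2} out-vote all others.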

43 Attack 2: Intersection Attack
Let V = {1,…,n}. Each check is an s-tuple of integers (v1,…,vs).
Main idea: Suppose A considers m tampered runs of P, with W1,…,Wm denoting the sets of tampered nodes in each run. If some check C = (v1,…,vs) is activated in all m runs, then C ∈ B = (W1 ∩ W2 ∩ … ∩ Wm)^s. For large enough m, B could be of tractable size, and A could search all of it. But a small B is unlikely to contain any checks.
Can prove: the expected time to find a check is still > n^(cs).
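A sketch of the intersection attack (illustrative code): the candidate set B shrinks with each additional run, since only nodes tampered in every run survive.

```python
from itertools import product

def intersection_attack(runs, s):
    # Any check activated in every run lies inside core^s, where
    # core is the intersection of the tampered-node sets.
    core = set.intersection(*map(set, runs))
    return set(product(sorted(core), repeat=s))
```

The slide's caveat is visible here: making `core` small enough to search exhaustively also makes it unlikely to still contain a full check.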

44 Summary and Further Work
Main goals of this work:
o Modeling of software tamper-resistance.
o Algorithms for tamper-resistance with analyzable security.
Extensions:
o A more realistic model: allow some adversarial steps in the walk.
o More realistic parameters:
  – p < 1: tamper detection is unreliable.
  – l < n: critical code replicated in only a fraction of P.
  – Other parameters: number of checks, threshold, etc.
o Implementation.

