Topics in Cryptography Lecture 7 Topic: Side Channels Lecturer: Moni Naor.


Recap: chosen ciphertext security
Why chosen ciphertext/malleability matters
Taxonomy of Attacks and Security
Ideas for achieving CCA
–Redundancy + Verification
The NIZK approach
Simple scheme achieving CCA1
–Based on DDH
–Modification achieving CCA2
Chosen-Ciphertext Security via Correlated Products
CCA and IBE
Deniable Authentication

Adversarial Models
STANDARD MODEL:
–Abstract models of computation
–Interactive Turing machines
–Private memory, randomness...
–Well-defined adversarial access
–Can model powerful attacks
REAL LIFE:
–Physical implementations leak information
–Adversarial access not always captured by abstract models
[Figure: a device computing $E_k(m)$]


Adversarial Models
Attacks in the standard model:
–Chosen-plaintext attacks
–Chosen-ciphertext attacks
–Composition
–Self-referential encryption
–Circular encryption
–...
Attacks outside the standard model:
–Timing attacks [Kocher 96]
–Fault detection [BDL 97, BS 97]
–Power analysis [KJJ 99]
–Cache attacks [OST 05]
–Memory attacks [HSHCPCFAF 08]
–Electromagnetic radiation analysis
–...
Side channel: any information not captured by the abstract “standard” model

Countermeasures
GOALS
–Preserve functionality
–Security
–Efficiency
–Generic methods
HARDWARE
–E.g., minimizing electromagnetic leakage, “tamper-proof” devices,...
–Ad-hoc solutions
–Typically expensive or inefficient
SOFTWARE
–E.g., fixed timing (independent of input), oblivious RAM,...
–Many heuristics
–Require precise modeling

Thesis of this course
Many tools developed in the foundations of cryptography are helpful for protecting against side-channel attacks
–Proof by examples...
Must incorporate side-channel attacks in the design of systems, and not only at implementation time

Modeling Side-Channels
Canetti, Dodis, Halevi, Kushilevitz, and Sahai ’00
–Exposure-resilient functions: functions that “look” random even if several input bits are leaked
Ishai, Prabhakaran, Sahai, and Wagner ’03, ’06
–Private circuit evaluation allowing several wires to leak
Micali and Reyzin ’04
–Computation and only computation leaks information
Dziembowski and Pietrzak ’08, Pietrzak ’09
–Leakage-resilient stream-ciphers
–Computation and only computation leaks information, low-bandwidth leakage
Akavia, Goldwasser, and Vaikuntanathan ’09
–Memory attacks
–Regev’s scheme is resilient to leakage of L/polylog(L) bits of the secret key

Side Channels
–Timing Attacks
–Cache Attacks
–Memory Attacks

Timing Attacks
Kocher, Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems (CRYPTO 1996)
Brumley and Boneh, Remote Timing Attacks Are Practical (USENIX Security 2003)
Slides based on Vitaly Shmatikov

Timing Attack
Basic idea: learn the system’s secret by observing how long it takes to perform various computations
Typical goal: extract private key
Extremely powerful because isolation doesn’t help
–Victim could be remote
–Victim could be inside its own virtual machine
–Keys could be in tamper-proof storage or smartcard
Attacker wins simply by measuring response times

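RSA Cryptosystem
Key generation:
–Generate large primes P, Q
–Compute $N = PQ$ and $\varphi(N) = (P-1)(Q-1)$
–Choose small e, relatively prime to $\varphi(N)$
  Typically, $e = 3$ or $e = 2^{16}+1 = 65537$
–Compute unique d such that $ed \equiv 1 \pmod{\varphi(N)}$
–Public key = (e, N); private key = d
Encryption of m (simplified!): $c = m^e \bmod N$
Decryption of c: $c^d \bmod N = (m^e)^d \bmod N = m$
–Why? Since $ed = 1 + t\varphi(N)$ for some integer t, Euler’s theorem gives $m^{ed} \equiv m \pmod N$ for m relatively prime to N (the Chinese Remainder Theorem extends this to every m)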
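The key generation, encryption, and decryption steps above can be sketched in a few lines of Python. This is a toy illustration only: the function names are hypothetical, the primes are passed in directly with no primality testing, and there is no padding, so it matches the "simplified" textbook RSA of the slide and nothing more.

```python
from math import gcd

def rsa_keygen(p, q, e=65537):
    """Toy RSA key generation from two given primes (no primality checks)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(N)")
    d = pow(e, -1, phi)              # unique d with e*d = 1 (mod phi(N))
    return (e, n), d

def rsa_encrypt(m, public_key):
    e, n = public_key
    return pow(m, e, n)              # c = m^e mod N

def rsa_decrypt(c, d, n):
    return pow(c, d, n)              # m = c^d mod N

# Tiny, insecure parameters chosen only so the numbers are easy to follow.
public_key, d = rsa_keygen(61, 53, e=17)
ciphertext = rsa_encrypt(65, public_key)
assert rsa_decrypt(ciphertext, d, public_key[1]) == 65
```

The modular inverse uses the three-argument `pow` with exponent `-1`, which requires Python 3.8 or later.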

RSA Decryption
RSA decryption: compute $y^x \bmod N$
–A modular exponentiation operation
Naive algorithm: square and multiply

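Basic Timing
In square and multiply, an iteration that performs the multiplication takes a while to compute; an iteration that skips it is (nearly) instantaneous
Whether the k-th iteration takes a long time depends on the k-th bit of the secret exponent
Old observation: timing depends on the number of 1’s in the exponent
–If all multiplications take the same time, this is all you get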
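A minimal sketch of square and multiply, written so the data-dependent work is visible. The function name and the returned multiplication counter are illustrative assumptions; the counter stands in for the timing signal an attacker would measure.

```python
def square_and_multiply(y, x, n):
    """Left-to-right binary exponentiation.

    Returns (y^x mod n, number of conditional multiplications performed).
    """
    result = 1
    multiplies = 0
    for bit in bin(x)[2:]:                  # scan exponent bits, most significant first
        result = (result * result) % n      # every iteration squares
        if bit == "1":
            result = (result * y) % n       # only 1 bits trigger the extra multiply
            multiplies += 1
    return result, multiplies

value, work = square_and_multiply(7, 0b101, 1000)
assert value == pow(7, 5, 1000)
assert work == 2                            # two 1 bits in the exponent
```

With equal-cost multiplications the counter reveals only the Hamming weight of the exponent, which is the "old observation" of the slide.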

Not all multiplications were created equal
Different timing given operands
Assumption/Heuristic: timings of subsequent multiplications are independent
–Given that we know the first k-1 bits of x
–Given a guess for the k-th bit of x
–Time of remaining bits independent
Given measurement of total time, can see whether there is correlation between the events:
–the k-th step is long
–the total time is long
Exact timing / exact guess

Outline of Kocher’s Attack
Idea: guess some bits of the exponent
–Predict how long decryption will take
If guess is correct, will observe correlation; if incorrect, then prediction will look random
–The more bits you already know, the stronger the signal, thus easier to detect (error-correction property)
Start by guessing a few top bits, look at correlations for each guess, pick the most promising candidate and continue
Works against systems under direct control

RSA in OpenSSL
OpenSSL: popular open-source toolkit
–mod_SSL (in Apache = 28% of HTTPS market)
–stunnel (secure TCP/IP servers)
–sNFS (secure NFS)
–Many more applications
Kocher’s attack doesn’t work against OpenSSL
–Instead of square-and-multiply, OpenSSL uses CRT, sliding windows and two different multiplication algorithms for modular exponentiation
  CRT = Chinese Remainder Theorem
  Secret exponent is processed in chunks, not bit-by-bit

Chinese Remainder Theorem
$n = n_1 n_2 \cdots n_k$ where $\gcd(n_i, n_j) = 1$ when $i \ne j$
The system of congruences $x \equiv x_1 \pmod{n_1}, \ldots, x \equiv x_k \pmod{n_k}$
–Has a simultaneous solution x to all congruences
–There exists exactly one solution x between 0 and n-1
For RSA modulus N = PQ, to compute x mod N it is enough to know x mod P and x mod Q

RSA Decryption With CRT
To decrypt c, need to compute $m = c^d \bmod N$
Use Chinese Remainder Theorem
–$d_1 = d \bmod (P-1)$
–$d_2 = d \bmod (Q-1)$
–$qinv = Q^{-1} \bmod P$
  (these are precomputed)
–Compute $m_1 = c^{d_1} \bmod P$; $m_2 = c^{d_2} \bmod Q$
  (attack this computation in order to learn Q)
–Compute $m = m_2 + (qinv \cdot (m_1 - m_2) \bmod P) \cdot Q$
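The CRT decryption steps can be sketched as follows, assuming the two exponents are reduced modulo P-1 and Q-1 respectively. The function name is hypothetical, and a real implementation would precompute `d1`, `d2`, and `q_inv` once per key rather than on every call.

```python
def rsa_crt_decrypt(c, d, p, q):
    """Decrypt c with the private exponent d using the two prime factors p and q."""
    d1 = d % (p - 1)                     # exponent used modulo p
    d2 = d % (q - 1)                     # exponent used modulo q
    q_inv = pow(q, -1, p)                # q^{-1} mod p
    m1 = pow(c, d1, p)                   # c^d mod p
    m2 = pow(c, d2, q)                   # c^d mod q (the step the timing attack targets)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q                    # unique value below p*q matching both residues

# Toy key: p = 61, q = 53, e = 17, d = 2753 (insecure, for illustration only).
c = pow(65, 17, 61 * 53)
assert rsa_crt_decrypt(c, 2753, 61, 53) == 65
```

The two half-size exponentiations are what make CRT decryption faster than a single exponentiation modulo N, and they are also why the attacker learns a factor of N rather than d directly.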

Operations Involved in Decryption
What is needed to compute $c^d \bmod Q$ and $xy \bmod Q$?
Exponentiation
–Sliding windows
Multiplication routines
–“Normal”: when operands have unequal length
–Karatsuba: faster when operands have equal length, $O(n^{\log_2 3})$
Modular reduction
–Montgomery reduction
Time of these operations is input sensitive

Montgomery Reduction
Decryption requires computing $m_2 = c^{d_2} \bmod Q$
Done by repeated multiplication
–Simple: square and multiply (process $d_2$ one bit at a time)
–More clever: sliding windows (process $d_2$ in 5-bit blocks)
In either case, many multiplications modulo Q
Multiplications use Montgomery reduction
–Pick some $R = 2^k$
–To compute $x \cdot y \bmod Q$: convert x and y into their Montgomery forms $xR \bmod Q$ and $yR \bmod Q$
–Compute $(xR \cdot yR) \cdot R^{-1} = zR \bmod Q$, where $z = xy \bmod Q$
Multiplication by $R^{-1}$ can be done very efficiently
–R is a power of 2, so long divisions are avoided

Schindler’s Observation
At the end of Montgomery reduction: if zR > Q, then need to subtract Q
–Probability of this extra step is proportional to c mod Q
If c is close to Q, many subtractions will be done
If c mod Q = 0, very few subtractions
–Decryption will take longer as c gets closer to Q, then become fast as c passes a multiple of Q
By playing with different values of c and observing how long decryption takes, attacker can guess Q!
–If all other operations are fixed!
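To make the extra subtraction concrete, here is a small sketch of Montgomery reduction that reports whether the final, data-dependent subtraction was needed. The function names and the returned flag are illustrative assumptions; production code works on fixed-size word arrays rather than Python integers, and the modulus must be odd and smaller than R.

```python
def montgomery_setup(q, k):
    """Precompute R = 2^k and q' = -q^{-1} mod R for an odd modulus q < R."""
    r = 1 << k
    q_neg_inv = (-pow(q, -1, r)) % r
    return r, q_neg_inv

def montgomery_reduce(t, q, k, q_neg_inv):
    """Return (t * R^{-1} mod q, extra) for 0 <= t < q * R.

    `extra` is True when the final subtraction of q was performed.
    """
    mask = (1 << k) - 1
    m = ((t & mask) * q_neg_inv) & mask   # chosen so that t + m*q is divisible by R
    u = (t + m * q) >> k                  # exact division by R is just a shift
    if u >= q:
        return u - q, True                # the extra reduction
    return u, False

def montgomery_multiply(x, y, q, k):
    """Compute x*y mod q via Montgomery form; also report the extra reduction."""
    r, q_neg_inv = montgomery_setup(q, k)
    x_bar = (x * r) % q                   # Montgomery form of x
    y_bar = (y * r) % q                   # Montgomery form of y
    z_bar, extra = montgomery_reduce(x_bar * y_bar, q, k, q_neg_inv)
    z, _ = montgomery_reduce(z_bar, q, k, q_neg_inv)   # convert back out of Montgomery form
    return z, extra

assert montgomery_multiply(5, 7, 97, 8)[0] == 35
```

Counting how often `extra` is True over many multiplications gives a rough software model of the timing signal described above.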

Reduction Timing Dependency
[Figure: decryption time as a function of the value of ciphertext c; the horizontal axis marks Q, 2Q, and P]

Integer Multiplication Routines
30-40% of OpenSSL running time is spent on integer multiplication
If integers have the same number of words n, OpenSSL uses Karatsuba multiplication
–Takes $O(n^{\log_2 3})$
If integers have unequal number of words n and m, OpenSSL uses normal multiplication
–Takes $O(nm)$
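For readers who have not seen it, Karatsuba multiplication replaces four half-size products with three, which is where the $n^{\log_2 3}$ exponent comes from. The following is a generic textbook sketch on Python integers, not OpenSSL's word-array routine; the base-case threshold of 16 is an arbitrary assumption.

```python
def karatsuba(x, y):
    """Multiply non-negative integers using three recursive half-size products."""
    if x < 16 or y < 16:
        return x * y                                   # small inputs: multiply directly
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo equals the two cross terms combined
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo

assert karatsuba(12345678901234567890, 98765432109876543210) == \
    12345678901234567890 * 98765432109876543210
```

The split works best when both operands have the same length, which matches the slide's point that the choice of routine depends on operand sizes and therefore on the input.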

Summary of Time Dependencies
                        g < q      g > q
Montgomery effect       Longer     Shorter
Multiplication effect   Shorter    Longer
g is the decryption value (same as c)
Different effects… but one will always dominate!

Attack Is Binary Search
[Figure: decryption time as a function of the value of ciphertext, with curves for # reductions and mult routine; Q and the 0-1 gap are marked]

Attack Overview
Initial guess g for Q between $2^{511}$ and $2^{512}$
Try all possible guesses for the top few bits
Suppose we know the top i-1 bits of Q. Goal: the i-th bit
–Set g = (known i-1 bits of Q) 000…0
–Set $g_{hi}$ = (known i-1 bits of Q) 100…0 (note: $g < g_{hi}$)
If $g < Q < g_{hi}$ then the i-th bit of Q is 0
If $g < g_{hi} < Q$ then the i-th bit of Q is 1
Goal: decide whether $g < Q < g_{hi}$ or $g < g_{hi} < Q$

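Two Possibilities for $g_{hi}$
[Figure: decryption time as a function of the value of ciphertext, with curves for # reductions and mult routine; Q, g, and the two candidate positions of $g_{hi}$ are marked]
If $g < g_{hi} < Q$: difference in decryption times between g and $g_{hi}$ will be small
If $g < Q < g_{hi}$: difference in decryption times between g and $g_{hi}$ will be large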
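The bit-by-bit search can be sketched with an idealized oracle in place of real timing measurements. Everything here is an assumption made for illustration: `gap_is_large` is a hypothetical callback that answers whether the timing difference between g and g_hi is large, and the sketch recovers every bit, whereas the real attack guesses the top few bits exhaustively, works from noisy timings, and stops after the top half.

```python
def recover_q(gap_is_large, bits):
    """Recover a `bits`-bit value Q from the top down using a 0-1 gap oracle."""
    g = 1 << (bits - 1)                   # the top bit of Q is known to be 1
    for i in range(bits - 2, -1, -1):
        g_hi = g | (1 << i)               # same known prefix, candidate bit set to 1
        if not gap_is_large(g, g_hi):     # small gap: g < g_hi <= Q, so the bit is 1
            g = g_hi
        # large gap: g <= Q < g_hi, so the bit is 0 and g is unchanged
    return g

# Simulated oracle: in reality this decision comes from comparing decryption times.
secret_q = 0b1011011101
recovered = recover_q(lambda g, g_hi: g_hi > secret_q, 10)
assert recovered == secret_q
```

Each oracle call corresponds to many timed decryption queries in practice, which is how the query count reaches the order of a million.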

Timing Attack Details
What is “large” and “small”?
–Know from attacking previous bits
Decrypting just g does not work because of sliding windows
–Decrypt a neighborhood of values near g: g, g+1, …
–Will increase difference between large and small values, resulting in larger 0-1 gap
Attack required only 2 hours, about 1.4 million queries to recover the private key
–Only need to recover the most significant half of the bits of q

The 0-1 Gap
[Figure: the zero-one gap]

Extracting RSA Private Key
[Figure: the zero-one gap, with one region where Montgomery reduction dominates and one where the multiplication routine dominates]

Normal SSL Handshake
Parties: regular client, SSL server
1. ClientHello
2. ServerHello (send public key)
3. ClientKeyExchange (encrypted under public key)
Exchange data encrypted with new shared key

Attacking SSL Handshake
Parties: attacker, SSL server
1. ClientHello
2. ServerHello (send public key)
3. Record time $t_1$; send guess g or $g_{hi}$
4. Alert
5. Record time $t_2$; compute $t_2 - t_1$

Works On The Network Similar timing on WAN vs. LAN

Defenses
Require statically that all decryptions take the same time
–For example, always do the extra “dummy” reduction
–… but what if the compiler optimizes it away?
Dynamically make all decryptions the same or multiples of the same time “quantum”
–Now all decryptions have to be as slow as the slowest decryption
Use RSA blinding

RSA Blinding
Instead of decrypting ciphertext c, decrypt a random ciphertext related to c
–Choose random $r \in Z_N^*$
–Compute $x' = c \cdot r^e \bmod N$
–Decrypt x’ to obtain $m' = x'^d \bmod N$
–Calculate original plaintext $m = m'/r \bmod N$
Since r is random, decryption time is independent of ciphertext
2-10% performance penalty
–Can prepare ahead
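A minimal sketch of blinded decryption, following the four steps above. The function name is hypothetical and the key is a toy one; the point is only that the value actually exponentiated with d is $c \cdot r^e$, which the attacker cannot predict.

```python
import secrets
from math import gcd

def blinded_decrypt(c, d, e, n):
    """Decrypt c under private exponent d, blinding the input with a random r."""
    while True:
        r = secrets.randbelow(n - 2) + 2       # candidate r in [2, n-1]
        if gcd(r, n) == 1:                     # r must be invertible modulo n
            break
    x_blind = (c * pow(r, e, n)) % n           # x' = c * r^e mod N
    m_blind = pow(x_blind, d, n)               # m' = x'^d = m * r mod N
    return (m_blind * pow(r, -1, n)) % n       # m = m' / r mod N

# Toy key: N = 61 * 53, e = 17, d = 2753 (insecure, for illustration only).
n, e, d = 3233, 17, 2753
assert blinded_decrypt(pow(65, e, n), d, e, n) == 65
```

Because $r$ and $r^e$ do not depend on the ciphertext, they can be generated before a request arrives, which is what "can prepare ahead" refers to.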

Blinding Works

Cache Attacks
Cryptanalysis through Cache Address Leakage: Dag Arne Osvik, Adi Shamir, Eran Tromer
Slides based on Eran Tromer

Cache attacks
Pure software
No special privileges
No interaction with the cryptographic code
Very efficient
–(full AES key extraction from Linux encrypted partition in 65 milliseconds)
Compromise otherwise well-secured systems
“Commoditize” side-channel attacks:
–Easily deployed software breaks many common systems

Why cache?
                         CPU core               Main memory
Annual speed increase:   60% (until recently)   7-9%
Typical latency:         0.3ns                  50-150ns
→ timing gap (the cache sits between the CPU core and main memory)

Address leakage
The cache is a shared resource: cache state affects, and is affected by, all processes, leading to crosstalk between processes.
The cached data is subject to memory protection…
–Not attacked
But the “metadata” leaks information about memory access patterns: which addresses are being accessed.

Associative memory cache
[Figure: DRAM divided into memory blocks (64 bytes) that map into the cache; each cache line holds 64 bytes and each cache set holds 4 cache lines]

S-box tables in memory
[Figure: the S-box table in DRAM and the cache sets it maps to]

Detecting access to AES tables
[Figure: attacker memory and the S-box table in DRAM mapping to the same cache sets]

Measurement technique
Two approaches to exploit inter-process crosstalk:
Measuring the effect of the cache on the encryption
–Need precise timing
Measuring the effect of the encryption on the cache

Measuring effect of cache on encryption
1. Make sure the tables are cached
2. Evict one cache set
3. Time an encryption and see if it’s slow
[Figure: DRAM and cache, showing table T0 and attacker memory]


Measuring effect of encryption on cache
1. Completely evict tables from cache
2. Trigger a single encryption
3. Access attacker memory again. See which cache sets are slow
[Figure: DRAM and cache, showing the S-box table and attacker memory]

Advantages of second method
Yields more information (×64) from a single encryption
Insensitive to timing variance in encryption code path
No real need to trigger the encryption: can wait until it happens by itself

A typical software implementation of AES
char p[16], k[16];                          // plaintext and key
int32 T0[256], T1[256], T2[256], T3[256];   // lookup tables
int32 Col[4];                               // intermediate state
...
/* Round 1 */
Col[0] = T0[p[ 0] ^ k[ 0]] ^ T1[p[ 5] ^ k[ 5]] ^ T2[p[10] ^ k[10]] ^ T3[p[15] ^ k[15]];
Col[1] = T0[p[ 4] ^ k[ 4]] ^ T1[p[ 9] ^ k[ 9]] ^ T2[p[14] ^ k[14]] ^ T3[p[ 3] ^ k[ 3]];
Col[2] = T0[p[ 8] ^ k[ 8]] ^ T1[p[13] ^ k[13]] ^ T2[p[ 2] ^ k[ 2]] ^ T3[p[ 7] ^ k[ 7]];
Col[3] = T0[p[12] ^ k[12]] ^ T1[p[ 1] ^ k[ 1]] ^ T2[p[ 6] ^ k[ 6]] ^ T3[p[11] ^ k[11]];
lookup index = plaintext ⊕ key

Synchronous attack
A software service performs AES encryption using a secret key.
An attacker process runs on the same CPU.
The attacker process can somehow invoke the service on known plaintext.
Examples:
–Encrypted disk partition + filesystem
–IPsec, VPN

Synchronous attack on AES: Overview
Measure (possibly noisy) cache usage of many encryptions of known plaintexts.
Guess the first key byte. For each hypothesis:
–For each sampled plaintext, predict which cache line is accessed by “T0[p[0] ⊕ k[0]]”
Identify the hypothesis which yields maximal correlation between predictions and measurements.
Proceed for the rest of the key bytes.
Practically, a few hundred samples suffice.
Got 64 bits of the key (high nibble of each byte)!
Use these partial results to mount an attack on further AES rounds, exploiting S-box nonlinearity.
A few thousand samples for complete key recovery.
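The hypothesis test for one key byte can be illustrated with a small simulation. All of it is an assumption for demonstration: the cache measurement is replaced by a simulated set of touched T0 lines, noise is modelled as a few random extra lines (standing in for the other T0 lookups in an encryption), and a cache line is taken to hold 16 table entries (64 bytes divided by 4-byte entries), so only the high nibble of the key byte is visible.

```python
import random

ENTRIES_PER_LINE = 16        # assumption: 64-byte cache line / 4-byte table entry
LINES_IN_TABLE = 256 // ENTRIES_PER_LINE

def cache_line(plaintext_byte, key_byte):
    """Index of the T0 cache line touched by the lookup T0[p ^ k]."""
    return (plaintext_byte ^ key_byte) // ENTRIES_PER_LINE

def simulate_samples(key_byte, count, noise_lines, rng):
    """Simulated measurements: (known plaintext byte, set of T0 lines seen as touched)."""
    samples = []
    for _ in range(count):
        p = rng.randrange(256)
        touched = {cache_line(p, key_byte)}
        touched.update(rng.randrange(LINES_IN_TABLE) for _ in range(noise_lines))
        samples.append((p, touched))
    return samples

def recover_high_nibble(samples):
    """Pick the high-nibble hypothesis whose predicted line is touched most often."""
    scores = []
    for nibble in range(16):
        guess = nibble << 4               # low nibble does not change the cache line
        hits = sum(cache_line(p, guess) in touched for p, touched in samples)
        scores.append(hits)
    return max(range(16), key=scores.__getitem__)

samples = simulate_samples(0xA7, 300, 8, random.Random(1))
assert recover_high_nibble(samples) == 0xA
```

The correct hypothesis predicts a touched line in every sample, while a wrong one is right only when noise happens to cover it, which is why a few hundred samples are enough to separate them.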

Protection: The Oblivious RAM Model
Oblivious Turing Machine:
–At any point in time know where the heads are
–The access pattern is independent of the input
–Important: to convert to circuits
–Get good results for the Cook-Levin Theorem
Oblivious RAM
–The access pattern is independent of the input
–Probability distribution!
Suggested by Goldreich 1987

Model
[Figure: a CPU with small private memory sends requests $q_i$ to main memory and receives $M[q_i]$]

Oblivious RAM Requirements
Any sequence of locations $i_1, i_2, \ldots$ induces a distribution on sequences of requests $q_1, q_2, \ldots$
Functionality: should be able to figure out the original content
Security: for any two sequences of locations $i_1, i_2, \ldots$ and $i'_1, i'_2, \ldots$ of the same length, the induced distributions of requests should be indistinguishable

Oblivious RAM Constructions
Trivial: O(n) slowdown
–O(log n) bits private memory
Known: polylog slowdown [Goldreich-Ostrovsky 96]
–O(log n) bits private memory
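The trivial construction simply touches every memory cell on every access, so the sequence of physical requests is the same whatever location is wanted. The class below is a hypothetical toy model: `trace` plays the role of what an observer of main memory sees, and a real scheme would also re-encrypt every cell on write-back so that contents reveal nothing.

```python
class LinearScanORAM:
    """Trivial oblivious RAM: each logical access reads and rewrites all n cells."""

    def __init__(self, size):
        self.memory = [0] * size
        self.trace = []                        # physical addresses visible to an observer

    def access(self, index, new_value=None):
        """Read cell `index`; if new_value is given, also overwrite it."""
        result = None
        for address in range(len(self.memory)):
            self.trace.append(address)
            value = self.memory[address]       # read every cell
            if address == index:
                result = value
                if new_value is not None:
                    value = new_value
            self.memory[address] = value       # write every cell back
        return result

a, b = LinearScanORAM(8), LinearScanORAM(8)
a.access(3, 42)
b.access(7, 1)
assert a.trace == b.trace                      # identical request sequences
assert a.access(3) == 42
```

Only the loop counter and the wanted index need to live in private memory, which is the O(log n) bits mentioned above; the price is the O(n) slowdown per access.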

Memory Attacks [HSHCPCFAF 08]
Concern: not only computation leaks information
–Memory retains its content after power is lost
–Memory content can even last for several minutes
[Figure: a memory image decaying 5 seconds, 30 seconds, 60 seconds, and 5 minutes after power is lost]
Cold boot attacks
–Completely compromise popular disk encryption systems
–Recover “noisy” keys: reconstruct DES, AES, and RSA keys
–Can use redundancy in round keys
http://citp.princeton.edu/memory

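Public-Key Encryption
Semantic security [GM82] under CPA: for any $m_0$ and $m_1$, infeasible to distinguish $E_{pk}(m_0)$ and $E_{pk}(m_1)$
Game:
1. Challenger generates (sk, pk) and sends pk to the adversary
2. Adversary sends $m_0, m_1$
3. Challenger samples $b \leftarrow \{0,1\}$ and sends $E_{pk}(m_b)$
4. Adversary outputs $b'$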

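Key-Leakage Attacks
Semantic security with key leakage [AGV 09] (Akavia, Goldwasser and Vaikuntanathan): for any* leakage f(sk) and for any $m_0$ and $m_1$, infeasible to distinguish $E_{pk}(m_0)$ and $E_{pk}(m_1)$
Game:
1. Challenger generates (sk, pk) and sends pk to the adversary
2. Adversary sends a leakage function f and receives f(sk)
3. Adversary sends $m_0, m_1$
4. Challenger samples $b \leftarrow \{0,1\}$ and sends $E_{pk}(m_b)$
5. Adversary outputs $b'$
* Clearly, cannot allow f(sk) that easily reveals sk
–For now $f: SK \to \{0,1\}^\lambda$ for $\lambda < |sk|$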

Is this the right model?
Noisy leakage
–As opposed to low-bandwidth leakage
Leakage of intermediate values
–Are intermediate values always erased?
–Key generation process
–Decryption process
Keys generated using a “weak” random source
Not a perfect model, but still a good starting point
–Discuss extensions later on

What We Know
A generic method for protecting against key-leakage attacks
–Main building block: Hash Proof Systems [CS 02]
Efficient instantiations
–Based on decisional Diffie-Hellman, few exponentiations
Chosen-ciphertext key-leakage attacks
–A generic CPA-to-CCA transformation
–Efficient schemes
Extensions
–Noisy leakage
–Leakage of intermediate values
–Weak random sources

Outline of the Talk
Some tools
The generic construction by examples
–A simple scheme: $\lambda \approx |sk|/2$
–Improved schemes: $\lambda \approx |sk|$
Extensions of the model
Conclusions, further work, and some rest...

Min-Entropy
Probability distribution X over $\{0,1\}^n$
$H_\infty(X) = -\log \max_x \Pr[X = x]$
–Represents the probability of the most likely value of X
X is a k-source if $H_\infty(X) \ge k$ (i.e., $\Pr[X = x] \le 2^{-k}$ for all x)
Statistical distance: $\Delta(X,Y) = \frac{1}{2} \sum_a |\Pr[X=a] - \Pr[Y=a]|$
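Both quantities are easy to compute for small explicit distributions, which helps when checking intuition. The sketch below assumes a distribution is given as a Python dict from outcomes to probabilities and uses the usual convention of a factor 1/2 in the statistical distance; the function names are illustrative.

```python
from math import log2

def min_entropy(dist):
    """H_inf(X) = -log2 of the probability of the most likely outcome."""
    return -log2(max(dist.values()))

def statistical_distance(dist_x, dist_y):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(dist_x) | set(dist_y)
    return 0.5 * sum(abs(dist_x.get(a, 0.0) - dist_y.get(a, 0.0)) for a in support)

skewed = {"00": 0.5, "01": 0.25, "10": 0.25}
uniform = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
assert min_entropy(skewed) == 1.0          # most likely value has probability 1/2
assert min_entropy(uniform) == 2.0
assert statistical_distance(skewed, uniform) == 0.25
```

The skewed source above is a 1-source over 2-bit strings: no single value is more likely than $2^{-1}$, even though the distribution is far from uniform.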

Extractors
Universal procedure for “purifying” an imperfect source
Definition: $\mathrm{Ext}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^\ell$ is a $(k,\epsilon)$-extractor if for any k-source X, $\Delta(\mathrm{Ext}(X, U_d), U_\ell) \le \epsilon$
[Figure: Ext takes an input x from a k-source of length n and a seed s of d random bits, and outputs ℓ almost-uniform bits]

Strong Extractors
Output looks random even after seeing the seed
Definition: $\mathrm{Ext}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^\ell$ is a $(k,\epsilon)$-strong extractor if $\mathrm{Ext}'(x, s) = s \circ \mathrm{Ext}(x, s)$ is a $(k,\epsilon)$-extractor
Leftover hash lemma [ILL 89]: pairwise independent hash functions are strong extractors
–Example: $\mathrm{Ext}(x, (a,b))$ = first ℓ bits of $ax+b$ over $GF[2^n]$
–Output length $\ell = k - 2\log(1/\epsilon)$
–Seed length $d = 2n$; with almost pairwise independence, $d = O(\log n + k)$
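The ax+b example can be tried out over a small field. The sketch below assumes n = 8 and represents $GF(2^8)$ with the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (the one used by AES); any irreducible polynomial of degree 8 would serve, and the function names are illustrative.

```python
def gf256_multiply(a, b):
    """Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a          # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1                   # multiply a by x
        if a & 0x100:
            a ^= 0x11B            # reduce modulo the field polynomial
    return product

def extract(x, seed, out_bits):
    """Ext(x, (a, b)) = first out_bits bits of a*x + b over GF(2^8)."""
    a, b = seed
    y = gf256_multiply(a, x) ^ b  # addition in GF(2^8) is XOR
    return y >> (8 - out_bits)    # keep the most significant out_bits bits

assert gf256_multiply(0x57, 0x83) == 0xC1
assert 0 <= extract(0x3A, (0x57, 0x10), 3) < 8
```

Pairwise independence here means that for two distinct inputs, the pair of hash values is uniform over a random seed (a, b); the seed is 2n = 16 bits, matching the seed length on the slide.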

Decisional Diffie-Hellman
Alice sends $g^x$ to Bob; Bob sends $g^y$ to Alice
Both parties compute $K = g^{xy}$
DDH assumption:
–$(g, g^x, g^y, g^{xy}) \approx_c (g, g^x, g^y, g^z)$ for random $x, y, z \in Z_q$
–$(g_1, g_2, g_1^r, g_2^r) \approx_c (g_1, g_2, g_1^{r_1}, g_2^{r_2})$ for random $g_1, g_2 \in G$ and $r, r_1, r_2 \in Z_q$


A Simple Scheme
G - group of order q
$\mathrm{Ext}: G \times \{0,1\}^d \to \{0,1\}$ - strong extractor
Key generation:
–Choose $g_1, g_2 \in G$ and $x_1, x_2 \in Z_q$
–Let $h = g_1^{x_1} g_2^{x_2}$
–Output $sk = (x_1, x_2)$ and $pk = (g_1, g_2, h)$
MAIN IDEA:
–Redundancy: any pk corresponds to many possible sk’s
–$h = g_1^{x_1} g_2^{x_2}$ reveals only log(q) bits of information on $sk = (x_1, x_2)$
–Leakage of λ bits ⇒ sk still has min-entropy $\log(q) - \lambda$

A Simple Scheme
G - group of order q
$\mathrm{Ext}: G \times \{0,1\}^d \to \{0,1\}$ - strong extractor
Key generation:
–Choose $g_1, g_2 \in G$ and $x_1, x_2 \in Z_q$
–Let $h = g_1^{x_1} g_2^{x_2}$
–Output $sk = (x_1, x_2)$ and $pk = (g_1, g_2, h)$
$Enc_{pk}(m)$:
–Choose $r \in Z_q$ and a seed $s \in \{0,1\}^d$
–Output $(g_1^r, g_2^r, s, \mathrm{Ext}(h^r, s) \oplus m)$
$Dec_{sk}(u_1, u_2, s, e)$:
–Output $e \oplus \mathrm{Ext}(u_1^{x_1} u_2^{x_2}, s)$
Correctness: $u_1^{x_1} u_2^{x_2} = g_1^{r x_1} g_2^{r x_2} = (g_1^{x_1} g_2^{x_2})^r = h^r$
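The scheme is short enough to run end to end on toy parameters. Everything concrete in this sketch is an assumption chosen for illustration: G is the group of squares modulo the safe prime 2039 (order q = 1019), the extractor is a stand-in built from the hash $u \mapsto (au + b) \bmod P$ truncated to a few bits, and the message is 4 bits rather than the single bit on the slide. It demonstrates correctness only and has no security at this size.

```python
import secrets

P, Q = 2039, 1019        # toy safe prime P = 2Q + 1; G = squares mod P, of prime order Q
OUT_BITS = 4             # toy message length in bits

def random_group_element():
    """Sample a non-identity element of G by squaring a random residue."""
    while True:
        g = pow(secrets.randbelow(P - 3) + 2, 2, P)
        if g != 1:
            return g

def ext(u, seed):
    """Stand-in extractor: low OUT_BITS bits of (a*u + b) mod P."""
    a, b = seed
    return ((a * u + b) % P) % (1 << OUT_BITS)

def keygen():
    g1, g2 = random_group_element(), random_group_element()
    x1, x2 = secrets.randbelow(Q), secrets.randbelow(Q)
    h = (pow(g1, x1, P) * pow(g2, x2, P)) % P
    return (x1, x2), (g1, g2, h)                     # (sk, pk)

def encrypt(pk, m):
    g1, g2, h = pk
    r = secrets.randbelow(Q)
    seed = (secrets.randbelow(P), secrets.randbelow(P))
    return pow(g1, r, P), pow(g2, r, P), seed, ext(pow(h, r, P), seed) ^ m

def decrypt(sk, ciphertext):
    x1, x2 = sk
    u1, u2, seed, e = ciphertext
    shared = (pow(u1, x1, P) * pow(u2, x2, P)) % P   # equals h^r
    return e ^ ext(shared, seed)

sk, pk = keygen()
assert all(decrypt(sk, encrypt(pk, m)) == m for m in range(1 << OUT_BITS))
```

Note that many different pairs $(x_1, x_2)$ give the same h, which is the redundancy the leakage-resilience argument relies on.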

A Simple Scheme
Theorem: The scheme is resilient to any leakage of $\lambda \approx \log(q)$ bits (half the size of sk)
Proof by reduction: adversary for the encryption scheme ⇒ distinguisher for decisional Diffie-Hellman
