
1 Secure Cloud Database

2 Introduction
Cloud computing
– IT as a service from a third-party service provider
Security in a cloud environment
– What if an adversary corrupts the service provider?
– Goal: protect sensitive data

3 Related Work
Encryption approach
– NetDB2, IBM (outsourced database)
– Relational Cloud, CryptDB (MIT, CIDR 2011)
TrustedDB using secure hardware (VLDB 2011 demo, Radu Sion)
Fully homomorphic encryption (STOC 2009)
Secure multi-party computation approach
– ShareMind

4 NetDB2
Value-level encryption (simple deterministic encryption)
DB: Tuple 1 = (xxx, yyy); Tuple 2 = (aaa, bbb)
Encrypted DB: Tuple 1 = (!a4, a3g); Tuple 2 = (L%j, m*K)
Equality query: SELECT * WHERE value = 'xxx' becomes SELECT * WHERE value = '!a4'
Partition information stored with the encrypted DB: Tuple 1 in P2; Tuple 2 in P1
Partition: P1 if value < 'm'; otherwise P2
Range query: SELECT * WHERE value < 'xxx' becomes SELECT * WHERE value in [P1, P2]

5 CryptDB
Onion encryption: multiple layers of encryption applied to one data item
Original data: 10
Layer 1, OPES (numeric comparisons): E1(10) = A*65h
Layer 2, deterministic encryption (equality can be tested): E2(A*65h) = BB647
Layer 3, non-deterministic encryption (no computation is feasible): E3(BB647) = %j@9G
If the user wants more computation power, decrypt to the desired level (one way!)

6 Weakness of encryption approach
Functions supported are not generic
– For example:
  Supported (OPES): SELECT * WHERE SALARY > 6000
  Not supported: SELECT * WHERE SALARY + BONUS > 6000

7 TrustedDB Provides generic functionality – Owner puts its keys in a secure hardware – The hardware is given to service provider – When there is computation on sensitive data, it can be done by the secure hardware Weakness – Processing power limited by secure hardware – Hardware management by owner IBM 4764 PCI-X Cryptographic Coprocessor

8 Fully homomorphic encryption
Property (E: encryption function)
– $E(x) + E(y) = E(x + y)$: XOR gate for bits in {0, 1}
– $E(x) \cdot E(y) = E(xy)$: AND gate for bits in {0, 1}
Conceptually supports any computation that can be represented by a circuit
– Difference: no branch operation (if-then-else)
Weakness
– Does not naturally support the SELECT statement
– Poor efficiency for large circuits so far

9 ShareMind
Key: secret sharing + recursive processing
The database is split into shares A, B, C held by Service Providers 1, 2, 3: DB = A + B + C
A query returns one result share per provider, D, E, F: D + E + F = Result
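As a minimal sketch of the idea on this slide, the following Python splits a value into three additive shares and lets each provider add its shares locally. The modulus `2**32`, the function names, and the sample values are assumptions made for illustration; the slide only states DB = A + B + C and D + E + F = Result.

```python
import secrets

MODULUS = 2**32  # assumed ring size for this sketch


def share3(value):
    """Split value into three additive shares, one per service provider."""
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    c = (value - a - b) % MODULUS
    return (a, b, c)


def reconstruct(shares):
    """Add the shares back together to recover the secret."""
    return sum(shares) % MODULUS


def add_shared(x_shares, y_shares):
    """Each provider adds the two shares it holds; no communication is needed."""
    return tuple((x + y) % MODULUS for x, y in zip(x_shares, y_shares))


salary = share3(6000)              # hypothetical values
bonus = share3(500)
total = add_shared(salary, bonus)  # the result is still shared (D, E, F)
print(reconstruct(total))          # 6500
```

The output of `add_shared` is again a triple of shares, which is what allows the result of one computation to feed the next without being revealed.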

10 Properties of ShareMind
Generic operations
– Recursive processing: the result of one computation can be the input of another computation; both result and input are hidden in shares
Weakness
– Requires multiple non-colluding parties
– Owner has no control (no key), poor sense of security

11 Objective Two party problem: owner and service provider (SP) The owner keeps a `key’ SP keeps an encrypted database Functions to be supported: generic selection Efficient operations

12 Overview of our approach
MPC supports generic operations
– Data is hidden in shares
In other words: we encrypt by secret sharing
Questions that follow:
– How exactly do we encrypt?
– How do we compute queries?

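13 Our approach
MPC-based approach
Diagram: DB is split into shares A, B, C. In the first setting the shares go to SP1, SP2, SP3; in the second setting the owner keeps A, and B, C go to SP1, SP2.
The owner keeps a copy, but it is large and the owner has to be involved in query computation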

14 Our approach
Our model: DB is split into a share A kept by the owner and a share B kept by the SP
– Owner keeps a small share A (small storage)
– Without A, SP cannot recover DB (similar security strength as MPC)
– Owner has minimal involvement in MPC (low cost)
Techniques: share compression, message compression
Generality of functionality is inherited from MPC

15 Background

16 Secret sharing (around 1980)
Secret: 10. Shares: Alice holds 4, Bob holds 6 (6 + 4 = 10)
What is the secret value? Would Alice's share be 5? 20? -3?
The secret is recovered only when the two parties exchange their shares

17 Secret sharing: general case
The secret $s$ can be divided among $n$ parties as shares $s_1, s_2, \ldots, s_n$, for any $n$
$s = g(s_1, s_2, \ldots, s_n)$
Examples of $g$: sum of all shares (modular); bitwise XOR of all shares; product, string concatenation, etc.
Security requirement: given $k < n$ shares, it is hard to recover $s$
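To make the general case concrete, here is a small sketch of the bitwise-XOR variant for any number of parties. The function names, the 32-bit share width, and the choice of 5 parties are assumptions for illustration only.

```python
import secrets
from functools import reduce
from operator import xor


def xor_share(secret, n, bits=32):
    """Split secret into n shares whose bitwise XOR equals the secret."""
    shares = [secrets.randbits(bits) for _ in range(n - 1)]
    last = reduce(xor, shares, secret)  # secret XOR all random shares
    return shares + [last]


def xor_recover(shares):
    """XOR all shares together; only the full set yields the secret."""
    return reduce(xor, shares, 0)


shares = xor_share(10, n=5)
print(xor_recover(shares))      # 10
print(xor_recover(shares[:4]))  # with a share missing, a random-looking value
```

Because the first n - 1 shares are uniformly random, any subset that misses one share carries no information about the secret in this scheme.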

18 To design a generic secure database

19 How secure? The security model
Negative result
– Ideal security:
  Querying workflow: user issues query => service providers compute the result and return it to the user
  Knowledge gained by service providers: NONE. Not even anything about the query and result!
– A solution achieving ideal security is no more efficient than a non-outsourcing solution (not using the cloud)

20 Knowledge gained by service provider Output space of a simple selection query: varies from no tuple to the entire database – Even larger space if we consider joins Example knowledge gain – If the output size is small, the service provider knows it is not the case that the query selects entire table To hide the above information, each returned query result should be at least of size = entire table

21 Our security model
Provides adequate security for practical use
– Level 1 model: an attacker observes an instance of the encrypted database but no other values. Security is said to be enforced if the attacker cannot recover the original database
  Example: hack into the cloud server and copy the instance
– Level 2 model: an attacker observes an instance of the encrypted database and knows the original values of some of the tuples. Security is said to be enforced if the attacker cannot recover the values of the other tuples
  Example: a hacker plants adversarial programs on the SP and observes the encrypted value and
  Similar to a chosen-plaintext attack (CPA)

22 Which level to use?
Check which model fits! Example:
– Names of 40 students in a class
  Domain size is small and is assumed to be public
  Values are easy to map to the encrypted tuples
  Level 2 is recommended
– Account balances in banks
  Values are not known to the public
  Level 1 should be good enough
At the same time, we will try to hide as much other information as possible

23 Information revealed to SP The service provider can observe – Query content The tables that are related to the query Number of conditions, types of conditions, attributes that are related – Query answer the set of shares of tuples in some query answer

24 Example query
SELECT Name FROM Employer WHERE Salary > 6000
To one service provider, the transformed query may look like:
SELECT ATTRIBUTE_7 FROM TABLE_A WHERE ATTRIBUTE_3 - X > 0 WITH PARAM_X = [1234, 3335, 222, 1119] WITH PARAM_CMP_X = [335, 17778]

25 Some basic design – level 2 model
To hide the database, we use secret sharing: DB = A + B
In our case, we use multiplicative secret sharing
– To store a value $v$, we have $ab \equiv v \pmod{n}$
– $n$: system parameter
– The shares are $a$, $b$: share A is kept by the owner, share B by the SP

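26 Example sharing
n = 35 (5 * 7)
Real value: row 1: A = 3, B = 4; row 2: A = 6, B = 4
Owner shares: row 1: A = 24, B = 13; row 2: A = 8, B = 2
SP shares: row 1: A = 22, B = 3; row 2: A = 27, B = 2
For instance, 24 * 22 = 528 = 3 (mod 35)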
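The numbers on this slide can be checked with a short sketch of multiplicative sharing modulo n = 35. The rows are read here as (owner share, SP share, real value); the function names and the way the random share is drawn are assumptions for illustration, and `pow(a, -1, n)` requires Python 3.8 or later.

```python
import math
import secrets

N = 35  # system parameter from the slide (5 * 7)


def mul_share(v, n=N):
    """Return (a, b) with a * b = v (mod n); a is random and invertible mod n."""
    while True:
        a = secrets.randbelow(n - 1) + 1  # candidate in 1 .. n-1
        if math.gcd(a, n) == 1:           # must be invertible modulo n
            break
    b = (v * pow(a, -1, n)) % n           # modular inverse (Python 3.8+)
    return a, b


def mul_recover(a, b, n=N):
    """Multiply the two shares to recover the stored value."""
    return (a * b) % n


# (owner share, SP share, real value), as read from the slide
slide_rows = [(24, 22, 3), (13, 3, 4), (8, 27, 6), (2, 2, 4)]
for a, b, v in slide_rows:
    assert mul_recover(a, b) == v

a, b = mul_share(6)
print(a, b, mul_recover(a, b))  # the last number is always 6
```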

27 Share Compression
The shares of the DB are generated randomly
Who decides the random shares?
Let's use a pseudo-random function

28 Share compression function
Input:
– Key (secret to owner): $m$, $c$
– Tuple (row) ID: $r$
Requirements:
– Support generic functionality (shown later)
– Secure (note: now considering level 2)
Function: $f(\mathrm{ID}) = m x^{rc} \bmod n$, where $m$, $c$ are the secret key and $x$, $n$ are public
Example with m = 1, x = 2, c = 1:
Real value X: ID 1: 18; ID 2: 20
Share A (kept by owner): ID 1: 2; ID 2: 4
Share B (kept by SP): ID 1: 9; ID 2: 5
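A minimal sketch of how the owner could regenerate its share from the row ID alone, using f(ID) = m x^(rc) mod n. The slide does not state n for this example, so n = 35 is an assumption borrowed from the earlier sharing example; the function names are hypothetical.

```python
N = 35  # assumed modulus, borrowed from the n = 35 sharing example


def owner_share(row_id, m, x, c, n=N):
    """Owner's share for a row: f(r) = m * x^(r*c) mod n, recomputed on demand."""
    return (m * pow(x, row_id * c, n)) % n


def sp_share(value, row_id, m, x, c, n=N):
    """SP's share: the value multiplied by the inverse of the owner's share."""
    inv = pow(owner_share(row_id, m, x, c, n), -1, n)  # Python 3.8+
    return (value * inv) % n


key = dict(m=1, x=2, c=1)   # key from the slide
values = {1: 18, 2: 20}     # column X from the slide, keyed by row ID

share_a = {r: owner_share(r, **key) for r in values}
share_b = {r: sp_share(v, r, **key) for r, v in values.items()}
print(share_a)  # {1: 2, 2: 4}
print(share_b)  # {1: 9, 2: 5}
```

The owner stores only the key, not `share_a`; any entry can be recomputed from the row ID when needed.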

29 Compression function basis
Our function: $m x^{rc} \bmod n$
– $n$ is the product of two big primes $p$, $q$
Similar to the RSA function
– Encryption: $x^e \equiv y \pmod{n}$
– Decryption: $y^d \equiv x \pmod{n}$
– Properties: $n = pq$, $ed \equiv 1 \pmod{(p-1)(q-1)}$
Brief security analysis
– Ignore $m$ (= 1), view $x^r$ as the plaintext to encrypt
– Function format: $a^b \bmod n$
– Given $a$, $n$, $a^b \bmod n$, it is hard to obtain $b$ (RSA security protection)

30 Storage cost
Linear in the number of columns
– Assuming the IDs run from 1 to t, the owner just needs to remember t
Note on the random function:
– To make the input look random, we use $f(\mathrm{ID}) = m \cdot h(\mathrm{ID})^k \bmod n$, where $h$ is any one-way hash
Example with $f(\mathrm{ID}) = m \cdot \mathrm{ID}^k \bmod n$: ID 1: share 1; ID 2: share 4; ...
The storage part is easy; how about computation?

31 How to do multiplication?
Column-column multiplication
– The two values are both in share format
To compute $AB$:
– $A = a_1 a_2$
– $B = b_1 b_2$
– Part 1 is held by the user, part 2 is held by the SP
– $C = AB = a_1 a_2 b_1 b_2 = (a_1 b_1)(a_2 b_2)$
Each party computes on its own

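32 Column-column multiplication
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
Real value: ID 1: A = 2, B = 3; ID 2: A = 4, B = 1
Owner shares: ID 1: A = 8, B = 9; ID 2: A = 16, B = 11
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
C = A × B
Real value of C: ID 1: 6; ID 2: 4
SP share of C (easy computation at SP): ID 1: 3; ID 2: 4
Owner share of C: ? How to compute it without materializing the share table?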
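The example on this slide can be replayed directly: each party multiplies the two shares it holds, and the product of the results recovers C. This sketch materializes the owner's shares for checking purposes only; the dictionary layout and variable names are assumptions, and the numbers are read from the slide with n = 35.

```python
N = 35

# Shares keyed by row ID: (share of A, share of B), as read from the slide
owner = {1: (8, 9), 2: (16, 11)}
sp = {1: (9, 12), 2: (9, 16)}

# Each party multiplies its own two shares, with no interaction
sp_c = {r: (a2 * b2) % N for r, (a2, b2) in sp.items()}
owner_c = {r: (a1 * b1) % N for r, (a1, b1) in owner.items()}

# Combining both sides recovers the real value of C = A * B
recovered = {r: (owner_c[r] * sp_c[r]) % N for r in sp_c}

print(sp_c)       # {1: 3, 2: 4}
print(owner_c)    # {1: 2, 2: 1}
print(recovered)  # {1: 6, 2: 4}
```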

33 Computation at User
Operation on key level: *
– Keys: A (m = 4, x = 2, c = 1) * B (m = 1, x = 3, c = 2)
– We have $m_1 x^{rc} \cdot m_2 y^{rk} \bmod n = m_1 m_2 (x^c y^k)^r \bmod n$
– Let $z = x^c y^k$
– Result: $m_1 m_2 z^r \bmod n$
– Constant: not good for security
User will publish z (for later operations)

34 User's choice of parameter
Recall
– $x^{ed} \bmod pq = x$ for any $e$, $d$ such that $ed \bmod (p-1)(q-1) = 1$
Let $z = x^c y^k$
– $z^{ed} \equiv z \pmod{n}$ => use base $z^d$ with exponent $e$, since $(z^d)^e \equiv z \pmod{n}$

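35 Our example
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
New key: $(x^5)^5 \equiv x \pmod{35}$, since $5 \cdot 5 \equiv 1 \pmod{24}$
Here $z = 2^1 \cdot 3^2 = 18$ and $18^5 \bmod 35 = 23$, so the new base is 23 with exponent 5
C (m = 4, x = 23, c = 5): ID 1: 2; ID 2: 1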
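The key-level product can be sketched as follows, so that the owner never materializes a share table. This is one reading of the slides, in which the combined base z = x^c y^k is replaced by z^d and the exponent by e with ed = 1 (mod (p-1)(q-1)); the function names are hypothetical and the parameters are the toy values n = 35, e = 5.

```python
N = 35
PHI = (5 - 1) * (7 - 1)  # (p-1)(q-1) = 24 for n = 5 * 7


def f(row_id, key):
    """Owner's share for a row under key (m, x, c): m * x^(r*c) mod n."""
    m, x, c = key
    return (m * pow(x, row_id * c, N)) % N


def multiply_keys(key_a, key_b, e):
    """Key of the product column, re-expressed with base z^d and exponent e."""
    m1, x, c = key_a
    m2, y, k = key_b
    z = (pow(x, c, N) * pow(y, k, N)) % N  # z = x^c * y^k
    d = pow(e, -1, PHI)                    # e * d = 1 (mod 24), Python 3.8+
    return ((m1 * m2) % N, pow(z, d, N), e)


key_a = (4, 2, 1)
key_b = (1, 3, 2)
key_c = multiply_keys(key_a, key_b, e=5)

print(key_c)                          # (4, 23, 5)
print([f(r, key_c) for r in (1, 2)])  # [2, 1]
```

For every row, `f(r, key_c)` equals the product of the two original owner shares, which is the property the owner needs.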

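36 Recap: example
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
Real value: ID 1: A = 2, B = 3; ID 2: A = 4, B = 1
Owner shares: ID 1: A = 8, B = 9; ID 2: A = 16, B = 11
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
C = A × B
Real value of C: ID 1: 6; ID 2: 4
SP share of C (easy computation at SP): ID 1: 3; ID 2: 4
Owner share of C, key (m = 4, x = 23, c = 5): ID 1: 2; ID 2: 1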

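37 Column-constant multiplication
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
Real value: ID 1: A = 2, B = 3; ID 2: A = 4, B = 1
Owner shares: ID 1: A = 8, B = 9; ID 2: A = 16, B = 11
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
C = 2A
Real value of C: ID 1: 4; ID 2: 8
SP share of C (easy computation at SP): ID 1: 9; ID 2: 9
Owner share of C, key (m = 8, x = 2, c = 1): ID 1: 16; ID 2: 32
Just update m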

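38 Power
$A^2 = (a_1 a_2)^2 = a_1^2 a_2^2$
Key: A (m = 4, x = 2, c = 1)
Real value of A: ID 1: 2; ID 2: 4
Owner share of A: ID 1: 8; ID 2: 16
SP share of A: ID 1: 9; ID 2: 9
C = A^2
Real value of C: ID 1: 4; ID 2: 16
SP share of C: ID 1: 11; ID 2: 11
Owner share of C, key (m = 16, x = 2, c = 2): ID 1: 29; ID 2: 11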
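Multiplication by a constant and raising to a power both reduce to a small change of the owner's key. The sketch below uses hypothetical helper names and the slide values for column A with n = 35; note that squaring m = 4 gives m = 16, which is the value consistent with the owner shares 29 and 11.

```python
N = 35


def f(row_id, key):
    """Owner's share for a row under key (m, x, c): m * x^(r*c) mod n."""
    m, x, c = key
    return (m * pow(x, row_id * c, N)) % N


def scale_key(key, k):
    """Column times constant k: only m changes; the SP shares stay as they are."""
    m, x, c = key
    return ((m * k) % N, x, c)


def power_key(key, p):
    """Column to the power p: m becomes m^p and the exponent c becomes c * p."""
    m, x, c = key
    return (pow(m, p, N), x, c * p)


key_a = (4, 2, 1)
sp_a = {1: 9, 2: 9}  # SP shares of A from the slides

key_2a = scale_key(key_a, 2)                        # (8, 2, 1)
key_sq = power_key(key_a, 2)                        # (16, 2, 2)
sp_sq = {r: pow(s, 2, N) for r, s in sp_a.items()}  # SP squares its own shares

doubled = {r: (f(r, key_2a) * sp_a[r]) % N for r in sp_a}
squared = {r: (f(r, key_sq) * sp_sq[r]) % N for r in sp_a}
print(doubled)  # {1: 4, 2: 8}
print(squared)  # {1: 4, 2: 16}
```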

39 Next: resharing
Objective: to renew a key
– From an old key to a new key
– Reason? Some operations may result in dependent parameters
  Column-column multiplication: $m_1$, $m_2$ => $m_1 m_2$
– Note: only $m$ is dependent; $x$ and $c$ are hard for the SP to trace
– Fine if multiplication is used alone, but better to decouple the dependency

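40 Constant column
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2); C (m = 2, x = 2, c = 2)
Real value: ID 1: A = 2, B = 3, C = 3; ID 2: A = 4, B = 1, C = 3
Owner shares: ID 1: A = 8, B = 9, C = 8; ID 2: A = 16, B = 11, C = 32
SP shares: ID 1: A = 9, B = 12, C = 31; ID 2: A = 9, B = 16, C = 34
The constant value is a system parameter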

41 Resharing
Do an arbitrary multiplication
– $A C^p$ for a random $p$
Let $k$ be the constant value
– The column is multiplied by $k$; to cancel out the effect: =>
Let $z = x^c$
– $z^{ed} = z$, done

42 Column-column addition
$A = a_1 a_2$, $B = b_1 b_2$
– $C = A + B$ => $a_1 a_2 + b_1 b_2$
– Goal: $C = c_1 c_2 = a_1 a_2 + b_1 b_2$, so $c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Owner: $a_1$, $b_1$ (and $c_1$, kept by owner)
SP: $a_2$, $b_2$

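43 Example
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
Real value: ID 1: A = 2, B = 3; ID 2: A = 4, B = 1
Owner shares: ID 1: A = 8, B = 9; ID 2: A = 16, B = 11
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
C = A + B
Real value of C: ID 1: 5; ID 2: 5
Owner share of C, key (m = 3, x = 3, c = 3): ID 1: 11; ID 2: 17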

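44 Hint
$c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2); C (m = 3, x = 3, c = 3); C^{-1} (m = 12, x = 3, c = 21)
Part 1: $a_1 c_1^{-1}$, from the key-level product A * C^{-1}
– HA: ID 1: 23; ID 2: 3
Part 2: $b_1 c_1^{-1}$, from the key-level product B * C^{-1}
– HB: ID 1: 4; ID 2: 13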

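45 Computation at SP
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
Helper column HA: ID 1: 23; ID 2: 3
Helper column HB: ID 1: 4; ID 2: 13
C = HA * A + HB * B
SP share of C: ID 1: 10; ID 2: 25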
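The addition protocol can be traced end to end with the slide values. In this sketch the owner derives the two helper columns (HA and HB in the formula C = HA * A + HB * B) from its keys, and the SP combines them with its own shares. The function names are hypothetical, n = 35, and the key (3, 3, 3) for C is the one shown in the example.

```python
N = 35


def f(row_id, key):
    """Owner's share for a row under key (m, x, c): m * x^(r*c) mod n."""
    m, x, c = key
    return (m * pow(x, row_id * c, N)) % N


key_a, key_b, key_c = (4, 2, 1), (1, 3, 2), (3, 3, 3)
sp_a = {1: 9, 2: 9}    # SP shares of A
sp_b = {1: 12, 2: 16}  # SP shares of B


def helpers(row_id):
    """Owner side: HA = a1 * c1^-1 and HB = b1 * c1^-1 for one row."""
    c1_inv = pow(f(row_id, key_c), -1, N)  # Python 3.8+
    return (f(row_id, key_a) * c1_inv) % N, (f(row_id, key_b) * c1_inv) % N


ha = {r: helpers(r)[0] for r in sp_a}
hb = {r: helpers(r)[1] for r in sp_a}

# SP side: c2 = HA * a2 + HB * b2
sp_c = {r: (ha[r] * sp_a[r] + hb[r] * sp_b[r]) % N for r in sp_a}

# Check: owner share of C times SP share of C gives A + B
total = {r: (f(r, key_c) * sp_c[r]) % N for r in sp_c}

print(ha, hb)  # {1: 23, 2: 3} {1: 4, 2: 13}
print(sp_c)    # {1: 10, 2: 25}
print(total)   # {1: 5, 2: 5}
```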

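46 Example
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2)
Real value: ID 1: A = 2, B = 3; ID 2: A = 4, B = 1
Owner shares: ID 1: A = 8, B = 9; ID 2: A = 16, B = 11
SP shares: ID 1: A = 9, B = 12; ID 2: A = 9, B = 16
C = A + B
Real value of C: ID 1: 5; ID 2: 5
Owner share of C, key (m = 3, x = 3, c = 3): ID 1: 11; ID 2: 17
SP share of C: ID 1: 10; ID 2: 25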

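47 Security
$c_2 = a_1 c_1^{-1} a_2 + b_1 c_1^{-1} b_2$
Keys: A (m = 4, x = 2, c = 1); B (m = 1, x = 3, c = 2); C (m = 3, x = 3, c = 3); C^{-1} (m = 12, x = 3, c = 21)
Part 1: $a_1 c_1^{-1}$, from the key-level product A * C^{-1}, followed by resharing
– HA: ID 1: 23; ID 2: 3
Part 2: $b_1 c_1^{-1}$, from the key-level product B * C^{-1}
– HB: ID 1: 4; ID 2: 13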

48 Column-constant addition Remember we have constant column => reduce to column-column addition

49 Comparisons
Only one operation is required
– X > 0
Every other comparison can be transformed to the above format with 1 addition
Equivalent operation
– Check the sign bit of the data

50 Domain partitioning
Modular arithmetic: $-3 \equiv 32 \pmod{35}$, $-10 \equiv 25 \pmod{35}$
Domain: 0 to a 1024-bit value, split into two ranges of about 1023 bits each
– Positive if in the lower range
– Negative if in the upper range
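A small sketch of this partitioning, using the toy modulus 35 from the slide instead of a 1024-bit one. The function names are assumptions, and the boundary is placed at n // 2, so that residues up to 17 count as non-negative.

```python
def to_signed(value, n):
    """Interpret a residue modulo n as a signed number in the range -(n//2) .. n//2."""
    value %= n
    return value if value <= n // 2 else value - n


def is_positive(value, n):
    """True when the residue falls in the positive (lower) half of the domain."""
    return 0 < value % n <= n // 2


print(to_signed(32, 35))    # -3
print(to_signed(25, 35))    # -10
print(is_positive(32, 35))  # False
print(is_positive(6, 35))   # True
```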

51 Recap: security
We will let the SP observe the comparison result, to achieve efficient selections
Goal
– If the real value is +ve, map it to the +ve region
– If the real value is –ve, map it to the –ve region
Domain: 0 to a 1024-bit value; positive if in one range, negative if in the other

52 Controlling the parameter
$A = a_1 a_2$
– Reshare to make $a_1$ a small constant
Real value of A: ID 1: 2; ID 2: 4
Before resharing, key A (m = 4, x = 2, c = 1): owner share ID 1: 8; ID 2: 16; SP share ID 1: 9; ID 2: 9
After resharing, key A (m = 12, x = 1, c = 0): owner share ID 1: 12; ID 2: 12; SP share ID 1: 6; ID 2: 12
n = 35, n/2 = 17
As long as there is no overflow, the result is correct
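The effect of resharing to a constant owner share can be illustrated with the slide's numbers: with owner share 12 and n = 35, the SP's share is the real value times 12^-1 = 3 (mod 35), so the SP can read the sign as long as the scaled value stays within n/2 = 17. The function names and the extra test values -3 and 6 are assumptions added for illustration.

```python
N = 35
OWNER_CONST = 12  # owner share after resharing: m = 12, x = 1, c = 0


def sp_share_after_reshare(real):
    """SP share such that OWNER_CONST * share = real (mod N)."""
    return (real * pow(OWNER_CONST, -1, N)) % N  # Python 3.8+


def sp_sees_positive(share):
    """SP's sign test: positive when the share lies in the lower half of the domain."""
    return 0 < share <= N // 2


print(sp_share_after_reshare(2), sp_share_after_reshare(4))  # 6 12
print(sp_sees_positive(sp_share_after_reshare(4)))           # True
print(sp_sees_positive(sp_share_after_reshare(-3)))          # False
# Overflow: 6 * 3 = 18 exceeds n/2 = 17, so a positive value looks negative
print(sp_sees_positive(sp_share_after_reshare(6)))           # False
```

The last line shows why the slide insists on the absence of overflow: in a 1024-bit domain the margin is large, but in this toy domain it is exhausted quickly.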

53 Overflow?
Each region is around $2^{1023}$
– Should be more than enough for usual domains: a 4-byte int => $2^{32}$
– $m$ in the order of $2^{80}$ (standard brute-force search domain size)
Security issues
– Factoring attack: each value has $m$ as a factor
– Order-preserving: a larger value will give a larger share at the SP

54 Random column
$X > 0 \iff f(R) X > 0$ for $f(R) > 0$
Example of $f(R)$
– $(R - x)^2$: a 160-bit value
– $x$ is random in every query
Real value: ID 1: A = 2, B = 3, R = 2; ID 2: A = 4, B = 1, R = 99
R is random in $2^{80}$ (+ve domain, > 0)
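The masking idea can be sketched on plain integers, leaving out the share arithmetic: multiplying X by a positive random factor f(R) = (R - x)^2 keeps the sign of X while hiding its magnitude and order. The function names and the sample X values are hypothetical; the R values 2 and 99 are taken from the slide, and x is offset by 100 here only so that it can never equal one of them.

```python
import secrets


def random_factor(r, x):
    """f(R) = (R - x)^2, which is positive whenever R differs from x."""
    factor = (r - x) ** 2
    if factor == 0:
        raise ValueError("R must differ from x so that f(R) > 0")
    return factor


def masked(value, r, x):
    """Multiply the value by the positive factor; the sign is unchanged."""
    return random_factor(r, x) * value


# x is drawn fresh for every query; the +100 offset keeps it above both R values
x = 100 + secrets.randbits(80)

rows = {1: (-7, 2), 2: (5, 99)}  # row ID -> (hypothetical X, random column R)
for row_id, (value, r) in rows.items():
    hidden = masked(value, r, x)
    print(row_id, value > 0, hidden > 0)  # the two booleans always agree
```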

