
1 Secure Database System

2 Introduction. Demand for secure database systems: cloud computing, Database-as-a-Service. Current cloud database systems: Amazon RDS, Microsoft SQL Azure. Advantages of cloud database systems: economies of scale; customers can focus on their own business.

3 Security challenge. Security concern: data is handed to third-party service providers, and the servers may be compromised. To enforce security, encryption is necessary. Challenge: how to compute queries on encrypted data.

4 Single method approach. A standalone encryption system is developed to address a particular query pattern. Examples: an order-preserving encryption scheme (OPES) supports comparison ($E(x) > E(y)$ iff $x > y$); RSA supports multiplication ($E(x)E(y) = E(xy)$).
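The multiplicative property of textbook RSA mentioned on this slide can be checked with a toy example. This is only a sketch: the tiny primes 61 and 53, the exponent 17, and the helper name `rsa_encrypt` are assumptions chosen for illustration and do not come from the slides; real RSA uses far larger parameters and padding.

```python
# Toy textbook RSA (illustration only, not secure).
p, q = 61, 53          # assumed small primes
n = p * q              # public modulus, 3233
e = 17                 # assumed public exponent, coprime to (p-1)*(q-1)

def rsa_encrypt(x):
    """Textbook RSA encryption E(x) = x^e mod n."""
    return pow(x, e, n)

x, y = 7, 12
product_of_ciphertexts = (rsa_encrypt(x) * rsa_encrypt(y)) % n
ciphertext_of_product = rsa_encrypt(x * y)

# E(x) * E(y) = E(x * y): multiplication can be done on ciphertexts.
print(product_of_ciphertexts == ciphertext_of_product)
```

The same property does not hold for comparison, which is why a single scheme cannot cover both parts of a predicate such as price * quantity > 1000.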

5 Difficulty in building a generic query system. Each method (e.g. OPES, RSA) has its own encryption mechanism, and the values encrypted by different methods are not interoperable: there is no trivial way to translate a value encrypted by OPES into the corresponding value encrypted by RSA. The following query therefore cannot be supported: SELECT * WHERE price * quantity > 1000. The multiplication (price * quantity) is supported by RSA and the comparison (> 1000) is supported by OPES, but the whole predicate cannot be evaluated by OPES, by RSA, or by a composition of OPES and RSA.

6 Building a database system based on the single method approach. Example systems: NetDB2 (with encryption), CryptDB. Limitations: limited support for complex queries, since a new encryption method must be developed to support each query pattern; and a lowered security guarantee when one method is made to support more query patterns at the same time.

7 Our approach. How can we develop a query system that supports generic querying? Relational algebra: a few primitives are enough to build any query. Observation (data interchangeability): the result of one primitive operator can be used as input by other primitive operators.

8 Enforcing data interchangeability. There is only one encrypted data format, and all operations operate on this format. A similar secure mechanism with data interchangeability is ShareMind, which uses secure multiparty computation (SMC) with secret sharing: each data item is split into shares that are distributed to multiple parties, and a distributed algorithm executed among all parties gives the result in shared form.

9 Illustration of SMC + secret sharing. Secret sharing: the plain values are x = 10 and y = 5. Party 1 holds x: 3, y: 8; Party 2 holds x: 2, y: 4; Party 3 holds x: 5, y: -7. Note: 10 = 3 + 2 + 5 and 5 = 8 + 4 + (-7). SMC algorithms: after some communication, Party 1 holds z: 13, Party 2 holds z: 6, Party 3 holds z: 6. Plain value: $z = (x - y)^2 = 25$ (= 13 + 6 + 6).
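The share values on this slide can be reproduced with a few lines of additive secret sharing. This is a minimal sketch: the helper names `share` and `reconstruct` and the bound of 100 on random shares are assumptions made for illustration; real SMC frameworks share values in a finite ring and need an interactive protocol for multiplication, which is not shown here.

```python
import random

def share(value, parties=3, bound=100):
    """Split value into additive shares that sum to value (toy version)."""
    shares = [random.randint(-bound, bound) for _ in range(parties - 1)]
    shares.append(value - sum(shares))
    return shares

def reconstruct(shares):
    """Recover the plain value by adding all shares."""
    return sum(shares)

# Shares held by parties 1..3 in the slide's example.
x_shares = [3, 2, 5]      # x = 10
y_shares = [8, 4, -7]     # y = 5
z_shares = [13, 6, 6]     # z = (x - y)^2 = 25, after the SMC protocol

# Subtraction is local: each party subtracts its own shares.
diff_shares = [xs - ys for xs, ys in zip(x_shares, y_shares)]

print(reconstruct(x_shares), reconstruct(y_shares))   # 10 5
print(reconstruct(diff_shares))                       # 5
print(reconstruct(z_shares))                          # 25
```

No single party learns x, y, or z from its own shares; only the sum over all parties reveals a value.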

10 Generic operations in SMC. Basic operations: addition and multiplication. Any operation that can be expressed as a circuit can be computed: addition on binary data can be regarded as an XOR gate, multiplication on binary data can be regarded as an AND gate, and the two gates can be combined into a universal gate which can express any circuit.

11 Using the idea of SMC + secret sharing on an encrypted database? Multiple parties vs. client-server. With secret sharing, the storage size is the same (= original database size) for all parties; secure share generation reduces the storage cost at the user. [Diagram: data owner / user and cloud server.]

12 Development of new operators. Why? Comparison of SMC with a secure database system: (1) SMC: operations are done between multiple parties; secure database system: operations are done between the user and the service provider (SP). (2) SMC: no privileged party; secure database system: the user is privileged, can observe any plain data, and should always have a low cost in any computation. (3) SMC: shares in secret sharing are materialized at each party; secure database system: shares at the user are not materialized but can be generated. Our goal: to develop (i) a secure generator with (ii) its corresponding operators.

13 Attack model. Security is defined w.r.t. an attack model. The attack model in our case: chosen plaintext attack (CPA). Formally: an attacker can observe the ciphertext of any chosen plaintext, but it is still computationally hard to recover the key. Some remarks on CPA: CPA is also the model used for RSA; OPES cannot guard against CPA.

14 System scope. First address integer data. Focus on operations between columns in the same table, e.g. SELECT (PRICE * QUANTITY). Also support aggregate operations and limited join operations.

15 DESCRIPTION OF OUR SOLUTION

16 Encryption procedure. Secret sharing: multiplicative secret sharing. Given a plain value $v$, the share at the user $v_k$, and the share at the SP $v'$: $v = v_k v' \bmod n$ ($n$ is a parameter of the share-generating function). The share at the user is called the item key of the value $v$. The item key of each cell in the table is different, and each item key can be identified by its row ID and column ID.

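17 Encryption illustration (n = 35). Plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1. Item keys at user: row 1: A = 8, B = 9; row 2: A = 16, B = 11. Encrypted values at SP: row 1: A = 9, B = 12; row 2: A = 9, B = 16. Number of item keys = number of values in the table.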
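The numbers in this illustration can be checked directly from the relation $v = v_k v' \bmod n$. The sketch below is hedged: the function names `encrypt` and `decrypt` and the dictionary layout are assumptions for illustration, and it relies on Python 3.8+ where `pow(k, -1, n)` computes a modular inverse (item keys must be coprime to n).

```python
n = 35  # toy modulus from the slide; the real n is at least 1024 bits

def encrypt(value, item_key):
    """SP share v' such that value = item_key * v' mod n."""
    return (value * pow(item_key, -1, n)) % n

def decrypt(encrypted, item_key):
    """Recombine the user's item key with the SP's share."""
    return (item_key * encrypted) % n

# row ID -> (A, B), values taken from the slide
plain = {1: (2, 3), 2: (4, 1)}
item_keys = {1: (8, 9), 2: (16, 11)}

encrypted = {
    row: tuple(encrypt(v, k) for v, k in zip(plain[row], item_keys[row]))
    for row in plain
}
print(encrypted)   # {1: (9, 12), 2: (9, 16)}
```

Decrypting each stored value with its item key returns the plain table.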

18 Secure item key generator. We extend RSA as our generator. Each column has a column key (private values $m$ and $x$), and each row has a row ID $r$ (public value). Item key: $m x^r \bmod n$. Here $n$ is the system parameter generated as in RSA: a public composite number with two big prime factors. $m$, $x$, $r$ are non-zero random values $< n$. Note: $n$ is at least a 1024-bit value. Item keys at user: row 1: A = 8, B = 9; row 2: A = 16, B = 11.
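The generator $m x^r \bmod n$ can be sketched as follows. The column key values used here (A: m = 4, x = 2; B: m = 1, x = 9) are taken from the item key table on slide 28; the function name `item_key` and the dictionary `column_keys` are assumptions for illustration.

```python
n = 35  # toy modulus; the slides require at least 1024 bits in practice

# column -> (m, x), the private column key kept by the user
column_keys = {"A": (4, 2), "B": (1, 9)}

def item_key(column, row_id):
    """Generate the item key m * x^r mod n for one cell on demand."""
    m, x = column_keys[column]
    return (m * pow(x, row_id, n)) % n

for row_id in (1, 2):
    print(row_id, item_key("A", row_id), item_key("B", row_id))
# 1 8 9
# 2 16 11
```

Because item keys are regenerated from two numbers per column, the user stores only the column keys rather than one key per cell.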

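19 Actual storage (n = 35). Plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1. At user: the table schema (A, B) and the column keys. At SP: the encrypted values: row 1: A = 9, B = 12; row 2: A = 9, B = 16. Conceptual item keys: row 1: A = 8, B = 9; row 2: A = 16, B = 11. Note: the user does not need to keep row IDs.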

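20 Recovering values (n = 35). Example query: SELECT A. The user holds the table schema and column keys; the SP holds the encrypted values. The SP returns the encrypted values of A (row 1: 9; row 2: 9); the row IDs are passed to the user too. The user generates the item keys of A (row 1: 8; row 2: 16) and multiplies them with the encrypted values to recover the plain values of A: row 1: 2; row 2: 4.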

21 Security of our item key generator. Our generating function extends the RSA function. Ours: $m x^r \bmod n$ ($r$, $n$ are public; $m$, $x$ are private). RSA: $x^e \bmod n$ ($e$, $n$ are public; $x$ is private). With $m = 1$, the two functions are equivalent.

22 PRIMITIVE OPERATORS

23 Overview. Operations between columns: multiplication (SELECT A * B) and addition (SELECT A + B). We will show that these two are enough to support generic function evaluation (for any function that can be expressed as a circuit and whose inputs are values in the same row). Note: the above operations assume both inputs are encrypted. We are also interested in encrypt-plain column-column operations (SELECT A * B, where A is encrypted but B is not). Special case: one of the operands is a constant, i.e. a column-constant operation (SELECT 10 * A).

24 Basic primitive operations. Column-column multiplication; column-column addition; column-constant multiplication; column-constant addition; encrypt-plain column multiplication; encrypt-plain column addition. Necessary operations: power; key regeneration.

25 General procedure. At user: table schema and column keys. At SP: encrypted values (row 1: A = 9, B = 12; row 2: A = 9, B = 16). Each operation is an algorithm which may contain some communication between the user and the SP. The result is always a new column C: a new column key at the user and new encrypted values at the SP (row 1: y; row 2: z). Security remark: the underlying item key generator is secure. To show that the entire system is secure, it is adequate to show that the messages (if any) in the algorithm do not breach security w.r.t. CPA.

27 Column-column multiplication. C = AB (SELECT A*B AS C). In some row $r$, the values of A, B are $a$, $b$: $a = a_k a'$ ($a_k$: item key at user, $a'$: encrypted value of $a$) and $b = b_k b'$ ($b_k$: item key at user, $b'$: encrypted value of $b$). Then $c = ab = (a_k b_k)(a' b') \bmod n$. Example (n = 35): plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1. Encrypted values at SP: row 1: A = 9, B = 12; row 2: A = 9, B = 16. The product $a' b'$ can be computed by the SP, giving C = 3 (row 1) and C = 4 (row 2). Item keys are not materialized at the user; the user operates at the column key level.

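28 Column-column multiplication (column key level). Item key table at row $r$: A: $4 \cdot 2^r \bmod 35$; B: $1 \cdot 9^r \bmod 35$; C: $(4 \cdot 1)(2 \cdot 9)^r \bmod 35$. The user derives C's column key from the column keys of A and B in the table schema.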

29 Column-column multiplication: result (n = 35). Plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1. Encrypted values of C at SP: row 1: 3; row 2: 4. Item keys of C generated by the user: row 1: 2; row 2: 1. Answer C = AB: row 1: 6; row 2: 4. Security: no information about the item keys of A and B is sent to the SP.
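The whole multiplication example can be traced in a short sketch, with the SP multiplying encrypted values and the user multiplying column keys component-wise. The column keys and table values are those of the slides; the names `item_key`, `enc`, `enc_c`, and `key_c` are assumptions for illustration.

```python
n = 35
key_a, key_b = (4, 2), (1, 9)          # column keys (m, x) from slide 28
enc = {1: (9, 12), 2: (9, 16)}         # row ID -> encrypted (A, B) at the SP

def item_key(key, row_id):
    """Item key m * x^r mod n for a column key (m, x)."""
    m, x = key
    return (m * pow(x, row_id, n)) % n

# SP side: multiply the encrypted values row by row, no keys needed.
enc_c = {row: (a * b) % n for row, (a, b) in enc.items()}

# User side: C's column key is the component-wise product of the keys.
key_c = ((key_a[0] * key_b[0]) % n, (key_a[1] * key_b[1]) % n)

# Decryption of C, done by the user when the result is requested.
plain_c = {row: (item_key(key_c, row) * enc_c[row]) % n for row in enc_c}

print(enc_c)    # {1: 3, 2: 4}
print(key_c)    # (4, 18)
print(plain_c)  # {1: 6, 2: 4}
```

The user's work is constant per column, independent of the number of rows.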

31 Column-constant multiplication. C = kA (e.g., SELECT 5*A AS C), where $k$ is a constant. In some row $r$, the value of A is $a = a_k a'$ ($a_k$: item key at user, $a'$: encrypted value of $a$). Then $c = 5a = (5 a_k)(a') \bmod n$. No action is needed at the SP: the encrypted values of C are those of A (row 1: 9; row 2: 9, with n = 35). The user only updates the column key: $(4 \cdot 2^r \bmod 35) \cdot 5 = 20 \cdot 2^r \bmod 35$.

32 Column-constant multiplication: result (n = 35). Encrypted values of C at SP: row 1: 9; row 2: 9. Item keys of C generated by the user: row 1: 5; row 2: 10. Answer C = 5A: row 1: 10; row 2: 20. Security: no information about the item keys of A is sent to the SP.
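A short sketch of the constant multiplication shows that only the $m$ component of the column key changes. Values are from the slides; the names `item_key`, `key_c`, and `enc_c` are assumptions for illustration.

```python
n = 35
key_a = (4, 2)               # column key (m, x) of A
enc_a = {1: 9, 2: 9}         # encrypted values of A at the SP
k = 5                        # the constant in SELECT 5*A

def item_key(key, row_id):
    m, x = key
    return (m * pow(x, row_id, n)) % n

# User side only: scale m by the constant. The SP does nothing.
key_c = ((k * key_a[0]) % n, key_a[1])
enc_c = dict(enc_a)          # C reuses A's encrypted values

plain_c = {row: (item_key(key_c, row) * enc_c[row]) % n for row in enc_c}
print(key_c)     # (20, 2)
print(plain_c)   # {1: 10, 2: 20}
```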

34 Power. C = A^k (e.g., SELECT A^2 AS C), where $k$ is a constant. In some row $r$, the value of A is $a = a_k a'$ ($a_k$: item key at user, $a'$: encrypted value of $a$). Then $c = a^2 = (a_k)^2 (a')^2 \bmod n$. The SP computes $(a')^2$: with n = 35 and encrypted values 9 and 9 for A, C = 11 in both rows. The user updates the column key: $(4 \cdot 2^r \bmod 35)^2 = 16 \cdot 4^r \bmod 35$.

35 Power: result (n = 35). Encrypted values of C at SP: row 1: 11; row 2: 11. Item keys of C generated by the user: row 1: 29; row 2: 11. Answer C = A^2: row 1: 4; row 2: 16. Security: no information about the item keys of A is sent to the SP.
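The power operator raises both shares to the same exponent, as this sketch illustrates. Values are from the slides; the names `item_key`, `key_c`, and `enc_c` are assumptions for illustration.

```python
n = 35
key_a = (4, 2)               # column key (m, x) of A
enc_a = {1: 9, 2: 9}         # encrypted values of A at the SP
k = 2                        # exponent in SELECT A^2

def item_key(key, row_id):
    m, x = key
    return (m * pow(x, row_id, n)) % n

# SP side: raise every encrypted value to the power k.
enc_c = {row: pow(v, k, n) for row, v in enc_a.items()}

# User side: raise both components of the column key to the power k.
key_c = (pow(key_a[0], k, n), pow(key_a[1], k, n))

plain_c = {row: (item_key(key_c, row) * enc_c[row]) % n for row in enc_c}
print(enc_c)     # {1: 11, 2: 11}
print(key_c)     # (16, 4)
print(plain_c)   # {1: 4, 2: 16}
```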

37 Key regeneration. Objective: set C = A, but with a column key for C that is different from A's, so that C's key appears random to the SP. In the running example (n = 35) the plain values of C are those of A (row 1: 2; row 2: 4); C's column key and its encrypted values at the SP are still to be determined.

38 Adding a constant column. An artificial column K is added. The value of K is the same for all rows; it is randomly determined by the user at the beginning (CREATE TABLE). In the example it is $\alpha = 4$. Plain data: row 1: A = 2, B = 3, K = 4; row 2: A = 4, B = 1, K = 4; every further row also has K = 4. At user: table schema (A, B, K), column keys, and $\alpha = 4$. K is encrypted like the other columns. Encrypted values at SP: row 1: A = 9, B = 12, K = 16; row 2: A = 9, B = 16, K = 17.

39 Key regeneration. Set $C = (\alpha^{-1})^p A K^p$, where $\alpha^{-1}$ is the modular multiplicative inverse of $\alpha$ w.r.t. $n$ (the multiplicative inverse of 4 is 9 w.r.t. n = 35) and $p$ is randomly determined each time. The value of each row of C equals the value of A. Procedure: (1) $C_1 = (\alpha^{-1})^p A$, by column-constant multiplication; (2) $C_2 = K^p$, by power; (3) $C = C_1 C_2$, by column-column multiplication. Note: the SP has no action in step 1.

40 Key regeneration: example. $C = A = (\alpha^{-1})^p A K^p$ with $\alpha = 4$, $p = 2$, $\alpha^{-1} = 9$, so $C = 9^2 A K^2 = 11 A K^2 \pmod{35}$, computed as $C_1 = 11A$, $C_2 = K^2$, $C = C_1 C_2$. Encrypted values at SP before: row 1: A = 9, B = 12, K = 16; row 2: A = 9, B = 16, K = 17. Encrypted values of C at SP: row 1: 29; row 2: 11. Item keys of C: row 1: 23; row 2: 29. Plain values of C: row 1: 2; row 2: 4. Security: the only parameter sent to the SP is $p$. Even if C's key is sent to the SP, the SP cannot get K's key: it has the form $x^e$, where $e$ is known to the SP but $x$ is not, and $x$ is hard to compute (as in RSA). Thus it is hard to get A's key.
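The three-step procedure can be traced with the example numbers. One value below is an assumption: K's column key (3, 3) does not appear in the transcript and is inferred here from K's encrypted values 16 and 17 and its plain value 4. The helper names are likewise illustrative.

```python
n = 35
alpha, p = 4, 2
key_a = (4, 2)               # column key (m, x) of A, from slide 28
key_k = (3, 3)               # ASSUMED: inferred from K's encrypted values
enc_a = {1: 9, 2: 9}
enc_k = {1: 16, 2: 17}

def item_key(key, row_id):
    m, x = key
    return (m * pow(x, row_id, n)) % n

# Step 1 (user only): C1 = (alpha^-1)^p * A, a column-constant multiplication.
const = pow(pow(alpha, -1, n), p, n)                      # 9^2 mod 35 = 11
key_c1 = ((const * key_a[0]) % n, key_a[1])

# Step 2: C2 = K^p, the power operator (the SP raises K's values to p).
key_c2 = (pow(key_k[0], p, n), pow(key_k[1], p, n))
enc_c2 = {row: pow(v, p, n) for row, v in enc_k.items()}

# Step 3: C = C1 * C2, a column-column multiplication.
key_c = ((key_c1[0] * key_c2[0]) % n, (key_c1[1] * key_c2[1]) % n)
enc_c = {row: (enc_a[row] * enc_c2[row]) % n for row in enc_a}

plain_c = {row: (item_key(key_c, row) * enc_c[row]) % n for row in enc_c}
print(enc_c)     # {1: 29, 2: 11}
print(plain_c)   # {1: 2, 2: 4}
```

C decrypts to the same values as A, while its key and encrypted values differ from A's.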

42 Column-column addition. C = A + B (SELECT A+B AS C). In some row $r$, the values of A, B are $a$, $b$: $a = a_k a'$ ($a_k$: item key at user, $a'$: encrypted value of $a$) and $b = b_k b'$ ($b_k$: item key at user, $b'$: encrypted value of $b$). Then $c = a + b = (a_k a') + (b_k b') \bmod n$. We must combine $a_k$ and $a'$ to compute the addition, but $a_k$ is not materialized (it is generated from A's key). Idea: send A's key to the SP in a protected way.

43 Column-column addition (continued). As before, $c = a + b = (a_k a') + (b_k b') \bmod n$. In the end, $c$ should also be encrypted like other values, i.e., $c = c_k c' \bmod n$. Hence $c_k c' = (a_k a') + (b_k b') \bmod n$ and $c' = (c_k^{-1} a_k) a' + (c_k^{-1} b_k) b' \bmod n$. $c_k$ is determined by C's column key, which the user generates randomly. The remaining problem is to help the SP compute $c'$: the user prepares the two parts $c_k^{-1} a_k$ and $c_k^{-1} b_k$. The item keys are not materialized, but these parts can be expressed at the column key level. At row $r$: $c_k = m_c x_c^r \bmod n$, $c_k^{-1} = m_c^{-1} (x_c^{-1})^r \bmod n$, $a_k = m_a x_a^r \bmod n$, so $c_k^{-1} a_k = m_c^{-1} m_a (x_c^{-1} x_a)^r \bmod n$.

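44 Example (n = 35; plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1; encrypted values at SP: row 1: A = 9, B = 12; row 2: A = 9, B = 16). Steps: (1) the user generates C's key; (2) the user computes the inverse of C's key, $C^{-1}$; (3) the user computes the hints for A and B and sends them to the SP; (4) the SP materializes the hints for every row: row 1: hint for A = 23, hint for B = 4; row 2: hint for A = 3, hint for B = 13; (5) the SP obtains the encrypted values of C: row 1: 10; row 2: 25. Item keys of C: row 1: 11; row 2: 17. We obtain the correct answers if we look at the plain values: C = 5 in both rows.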
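The five steps can be reproduced as a sketch. C's column key (3, 27) is an assumption: it does not appear in the transcript and is inferred here from C's item keys 11 and 17. The helper names `inverse_key` and `multiply_keys` are also illustrative, and the key regeneration that the slides apply to the hints before sending them is omitted.

```python
n = 35
key_a, key_b = (4, 2), (1, 9)
key_c = (3, 27)                     # ASSUMED: inferred from item keys 11, 17
enc = {1: (9, 12), 2: (9, 16)}      # row ID -> encrypted (A, B) at the SP

def item_key(key, row_id):
    m, x = key
    return (m * pow(x, row_id, n)) % n

def inverse_key(key):
    """Column key whose item keys are the inverses of key's item keys."""
    return (pow(key[0], -1, n), pow(key[1], -1, n))

def multiply_keys(k1, k2):
    return ((k1[0] * k2[0]) % n, (k1[1] * k2[1]) % n)

# Steps 1-3 (user): hints c_k^-1 * a_k and c_k^-1 * b_k at column key level.
hint_a = multiply_keys(inverse_key(key_c), key_a)
hint_b = multiply_keys(inverse_key(key_c), key_b)

# Steps 4-5 (SP): materialize the hints per row and combine.
enc_c = {
    row: (item_key(hint_a, row) * a + item_key(hint_b, row) * b) % n
    for row, (a, b) in enc.items()
}

plain_c = {row: (item_key(key_c, row) * enc_c[row]) % n for row in enc_c}
print(hint_a, hint_b)   # (13, 26) (12, 12)
print(enc_c)            # {1: 10, 2: 25}
print(plain_c)          # {1: 5, 2: 5}
```

The hint for A, (13, 26), matches the two equation values quoted on the later security slide.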

45 Security. The user sends the hints for A and B to the SP (two keys, i.e. 4 values); these 4 values are what the SP can observe. From them the SP materializes the per-row hints (row 1: hint for A = 23, hint for B = 4; row 2: hint for A = 3, hint for B = 13) and computes the encrypted values of C (row 1: 10; row 2: 25).

46 Security. The hints for A and B are the keys $A C^{-1}$ and $B C^{-1}$; their 4 values are what the SP can observe. They give 4 equations: $p\, m_a \bmod 35 = 13$, $q\, x_a \bmod 35 = 26$, … $C^{-1}$'s key is different in each addition, but the keys of A and B are not, so in the long run an attacker can gather enough information to breach security. Remedy: each hint can be imagined as a column in the table; before sending the key of this column, we do a key regeneration.

47 Security. Recap: even if the newly regenerated key is revealed to the SP, the SP cannot associate it with the old key, because there is an exponentiation in the formula. After key regeneration, the SP's view consists only of the regenerated hints for A and B, and so it cannot learn A's, B's, or C's key.

48 Basic primitive operations: column-constant addition. This is trivial as we have a constant column: the constant can be obtained from K by a column-constant multiplication and then added with column-column addition.

50 Encrypt-plain operations. C = AB (SELECT A*B AS C), but now B is not encrypted. Encryption always incurs some overhead (e.g., decryption is needed), so encryption is done only when the data is sensitive. Example (n = 35): plain data: row 1: A = 2, B = 3; row 2: A = 4, B = 1. Values at SP: row 1: A = 9, B = 3; row 2: A = 9, B = 1. "B is not encrypted" is equivalent to B having a key whose item keys all equal 1 (the key itself is not shown in this transcript). All operations are the same, so encrypted columns and unencrypted columns are interoperable.

51 Generic column-column operations. With addition and multiplication, we can compute any function that can be expressed as a circuit. All data is in binary form, so it is sufficient to show that we can build a universal gate (e.g., a NAND gate) on top of binary data.

52 Building a NAND gate. NAND(X, Y) = 1 – XY (multiplication and addition), so any circuit can be expressed. Truth table (X, Y → result): (0, 0) → 1; (0, 1) → 1; (1, 0) → 1; (1, 1) → 0. Note: since we are using multiplicative sharing, we have poor protection of 0 values (similarly, RSA has poor protection of 0 and 1 values).

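53 Switching to other values. Use non-zero booleans (1 for false, 2 for true): NAND(X, Y) = X + Y – XY + 1 (addition and multiplication). Truth table (X, Y → result): (2, 2) → 1; (2, 1) → 2; (1, 2) → 2; (1, 1) → 2.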
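Both gate formulas can be checked against each other by mapping the usual bits 0/1 to the non-zero encoding 1/2. The function names `nand_01` and `nand_12` and the mapping `to_12` are assumptions for illustration.

```python
def nand_01(x, y):
    """NAND on ordinary bits: 1 - XY."""
    return 1 - x * y

def nand_12(x, y):
    """NAND on non-zero booleans (1 = false, 2 = true): X + Y - XY + 1."""
    return x + y - x * y + 1

to_12 = {0: 1, 1: 2}   # re-encode a bit as a non-zero boolean

for x in (0, 1):
    for y in (0, 1):
        print(x, y, nand_01(x, y), nand_12(to_12[x], to_12[y]))
```

In every row the second result is the re-encoded version of the first, so the non-zero encoding computes the same gate without ever storing a 0.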

54 Side note: addition revisited. We are protecting the column keys, but an attacker may observe some information in our operations: $c' = (c_k^{-1} a_k) a' + (c_k^{-1} b_k) b' \bmod n$, i.e. $c' = c_k^{-1} a + c_k^{-1} b \bmod n$, where both terms carry the same factor $c_k^{-1}$. This is dangerous in the binary case, as $c_k^{-1} a = c_k^{-1} b$ iff $a = b$: an attacker can identify whether the bits are the same.

55 A more secure method. $X + Y - XY + 1 = (X - p)(q - Y) + (1 - q)X + (1 - p)Y + (1 + pq)$, where $p$, $q$ are random numbers. Check: RHS $= qX + pY - XY - pq + (1 - q)X + (1 - p)Y + (1 + pq) = X + Y - XY + 1$. All parts are of different values.
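The identity can also be checked numerically for random $p$ and $q$. This sketch works on plain integers only; the function names and the range used for the random numbers are assumptions for illustration.

```python
import random

def nand_12(x, y):
    """Direct formula on non-zero booleans (1 = false, 2 = true)."""
    return x + y - x * y + 1

def nand_masked(x, y, p, q):
    """Right-hand side of the identity, with random masks p and q."""
    return (x - p) * (q - y) + (1 - q) * x + (1 - p) * y + (1 + p * q)

p = random.randrange(2, 10**6)
q = random.randrange(2, 10**6)

for x in (1, 2):
    for y in (1, 2):
        print(x, y, nand_masked(x, y, p, q) == nand_12(x, y))   # always True
```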

56 Note on circuit construction. It is not efficient to use the generic gate construction for everything: shortcut operations should be developed for common jobs (part of future work, e.g., on string data). There is still no comparison operation (branch); we will discuss comparison in later slides. The above generic gate construction is of theoretical interest only.

57 Summary of our operations so far. With addition and multiplication, we can compute any arithmetic function (using addition, multiplication, power) on integer columns relatively efficiently, with significantly smaller overhead at the user than the baseline. Baseline: the user downloads the encrypted database, decrypts it, and computes the query on its own.

58 EXTENSION OPERATORS

59 Comparison. Note: the objective is to let the SP filter tuples, so the result of the comparison should be revealed to the SP. Thus, data interchangeability cannot be achieved by comparison. Side note: if the comparison result must be hidden from the SP as well, the overhead at the user increases significantly; such a requirement has a cost at the user no less than the baseline.

60 Comparison. Only one operation is required: X > 0. Every other comparison can be transformed into this form with one addition. Equivalent operation: check the sign bit of the data.

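61 Domain partitioning. Modular arithmetic: $-3 \equiv 32 \pmod{35}$, $-10 \equiv 25 \pmod{35}$. The domain runs from 0 to a 1024-bit value. A value is positive if it lies in the lower range and negative if it lies in the upper range; each range is about 1023 bits.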
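The mapping between signed values and residues can be sketched with the toy modulus 35. The function names `encode` and `decode` are assumptions for illustration; the split point n // 2 follows the slide's two ranges.

```python
n = 35

def encode(value):
    """Map a signed integer to its residue in [0, n)."""
    return value % n          # Python's % is non-negative for positive n

def decode(residue):
    """Lower half of the domain is positive, upper half is negative."""
    return residue if residue <= n // 2 else residue - n

print(encode(-3), encode(-10))    # 32 25
print(decode(32), decode(25))     # -3 -10
print(decode(12))                 # 12
```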

62 Comparison. We let the SP observe the comparison result in order to achieve efficient selections. Goal: if the real value is positive, map it into the positive region of the domain (0 to a 1024-bit value); if the real value is negative, map it into the negative region.

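63 Controlling the parameter. $a = a_k a' \Rightarrow a' = a_k^{-1} a$. Regenerate the key to make $a_k^{-1}$ a small constant. Example (n = 35, n/2 = 17): real values of A: row 1: 2; row 2: 4. Before regeneration: item keys at user: row 1: 8; row 2: 16; values at SP: row 1: 9; row 2: 9. After regeneration: item key at user: 12 in both rows (so $a_k^{-1} = 3$); values at SP: row 1: 6; row 2: 12. As long as there is no overflow, the result is correct.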
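The effect of a constant item key on the SP's sign test can be sketched as follows. Only the final state after key regeneration is modeled; the names `value_at_sp` and `sp_says_positive`, and the inputs -3 and 6, are assumptions added to show a negative value and an overflow.

```python
n = 35
item_key = 12                          # constant item key after regeneration
inverse_key = pow(item_key, -1, n)     # 3, the small constant factor

def value_at_sp(real_value):
    """a' = a_k^-1 * a mod n, the value the SP stores."""
    return (inverse_key * real_value) % n

def sp_says_positive(stored):
    """SP's test: the lower half of the domain is the positive region."""
    return 0 < stored <= n // 2

print(value_at_sp(2), value_at_sp(4))       # 6 12
print(sp_says_positive(value_at_sp(4)))     # True
print(sp_says_positive(value_at_sp(-3)))    # False (stored as 26)
print(sp_says_positive(value_at_sp(6)))     # False: 18 > 17, an overflow
```

The last line shows why the domain must be far larger than the data: with a 1024-bit modulus and 32-bit values, this overflow does not occur.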

64 Overflow? Each region is around $2^{1023}$, which should be more than enough for usual domains (a 4-byte int gives $2^{32}$). Security issues: factoring attack (each value has the same factor, e.g., 3 in the last example); order preservation (a larger value gives a larger value at the SP).

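65 Random column. $X > 0 \Leftrightarrow f(R)\,X > 0$ for $f(R) > 0$. Example of $f(R)$: $(R - p + 1)^2$, a 160-bit value, where $p$ is random in every query. R is a random column with values in $2^{80}$ (positive domain, > 0). Real values: ID 1: A = 2, B = 3, R = 2; ID 2: A = 4, B = 1, R = 99.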
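The sign-preserving mask can be sketched on plain values. Everything concrete here is an assumption for illustration: the modulus (a product of two Mersenne primes standing in for an RSA modulus), the 32-bit test values, and the function names. In the actual system X would be encrypted and the product formed with the column operators.

```python
import random

# Stand-in composite modulus with two large prime factors (about 1128 bits).
n = (2**521 - 1) * (2**607 - 1)

def masked(x, r, p):
    """f(R) * X mod n with f(R) = (R - p + 1)^2."""
    f = (r - p + 1) ** 2
    return (f * x) % n

def is_positive(residue):
    """Lower half of the domain is the positive region."""
    return 0 < residue <= n // 2

r = random.randrange(1, 2**80)     # per-row random value R
p = random.randrange(1, 2**80)     # per-query random parameter
while r - p + 1 == 0:              # f(R) must be strictly positive
    p = random.randrange(1, 2**80)

for x in (5, -5, 2**31 - 1, -(2**31)):
    print(x > 0, is_positive(masked(x, r, p)))   # the two always agree
```

Because f(R) differs per row and per query, the values seen by the SP no longer share one common factor or preserve order.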

66 Aggregate query. Since aggregates are usually the last operations, data interchangeability is not important. COUNT: same as selection; after the SP filters the tuples, it just counts the qualifying ones. SUM: see the next slide.

67 SUM. SELECT SUM(A). Now the addition is between rows, using the same logic as column-column addition. Example data (n = 35): plain: row 1: A = 2, B = 3; row 2: A = 4, B = 1; encrypted at SP: row 1: A = 9, B = 12; row 2: A = 9, B = 16. The user generates the result item key $a_s$ (only a row ID $r$ is needed). $s = a_{k1} a'_1 + a_{k2} a'_2$; $a_s s' = a_{k1} a'_1 + a_{k2} a'_2$; $s' = a_s^{-1} a_{k1} a'_1 + a_s^{-1} a_{k2} a'_2$.

68 SUM. SELECT SUM(A). $s' = a_s^{-1} a_{k1} a'_1 + a_s^{-1} a_{k2} a'_2$, with $a_{k1} = m x^{r_1} \bmod n$, $a_s = m x^r \bmod n$, $a_s^{-1} = m^{-1} (x^{-1})^r \bmod n$, so $a_s^{-1} a_{k1} = x^{r_1} (x^{-1})^r \bmod n$. The SP needs $x$ and $(x^{-1})^r$ to compute this part. $x$ is part of the column key and cannot be sent to the SP directly, so the user performs a key regeneration (not exactly).

69 Key regeneration (for SUM). Keep C = pA for a random $p$ (not C = A). Note that an attacker may know A but cannot know C, so there is no CPA attack on C. As discussed, key regeneration does not let the attacker trace $x$ from knowing $m'$ and $x'$, so revealing this $x'$ is safe. The sum calculated is multiplied by $p$; the user just multiplies by $p^{-1}$ to get the actual sum.
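The SUM formula can be traced on column A of the running example. This sketch deliberately omits the key regeneration step (C = pA) and hands $x$ to the SP directly, which the slides say must not be done in practice; the result row ID 3 and all variable names are assumptions for illustration.

```python
n = 35
m, x = 4, 2                  # column key of A, from slide 28
enc_a = {1: 9, 2: 9}         # row ID -> encrypted value of A at the SP
result_row = 3               # ASSUMED row ID r for the result item key

# SP side: per row, x^(r_i) * (x^-1)^r * a'_i, then add everything up.
x_inv_r = pow(pow(x, -1, n), result_row, n)
enc_sum = sum(pow(x, row, n) * x_inv_r * v for row, v in enc_a.items()) % n

# User side: the result item key a_s = m * x^r mod n decrypts the sum.
sum_key = (m * pow(x, result_row, n)) % n
plain_sum = (sum_key * enc_sum) % n

print(enc_sum)     # 33
print(plain_sum)   # 6, i.e. 2 + 4
```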

70 Indexing. Processing each tuple by linear scan is feasible but slow, so indexing is needed. Note: an index itself is a compromise of security: if certain tuples are filtered without any processing, the attacker can obtain certain information about the data, e.g., a range for the data.

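71 An index option. Make the data uncertain (domain partitioning; index on uncertain data). At user: row 1: A = 2, B = 3; row 2: A = 4, B = 1. At SP: row 1: A in 1-2, B in 3-4; row 2: B in 1-2 (the range for A in row 2 is missing from the transcript).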

72 Index processing. First process the index and filter out all disqualified tuples; then use cryptographic operations to compute the actual answer.

73 Integration with existing DBMS. [Architecture diagram. Labels: User, SP, Applications, SDB Client Layer, SDB Server Layer, DBMS, Query, Query Execution Plan, Secure Operators, Memory, SQL, Result.]

74 Example. SELECT C WHERE A * B + D > 20. At user: table schema (A, B, C, D) and column keys (n = 35). At SP: encrypted values of A, B, C, D for rows with IDs 105, 278, … The predicate is rewritten as A*B + D – 20 > 0. Query execution plan (with corresponding parameters): (1) column-column multiplication: E = AB; (2) column-column addition: F = E + D – 20; (3) comparison on F. Note: E and F can be thrown away afterwards, since they are not needed in the result.

75 Example (continued). SELECT C WHERE A * B + D > 20. The SP receives the query plan, executes it, and finds the answers: row 105: no; row 278: yes; row 337: no; row 129: no; … It then projects on C only and sends the encrypted answer back to the user: row 278: C = 3; row 776: C = 12; … Row IDs must be included.

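76 Example (continued). SELECT C WHERE A * B + D > 20. The user holds the table schema and column keys (n = 35) and receives the encrypted answers (row 278: C = 3; row 776: C = 12; …). The user computes its own item keys for the returned rows (row 278: 9; row 776: 9; …) and decrypts: C = 27, 3, …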

