Secure Query Processing in an Untrusted (Cloud) Environment.


1 Secure Query Processing in an Untrusted (Cloud) Environment

2 Agenda
Introduction to the model
Overview of different approaches

3 Model
Parties: Data owner / Users, and a Database Service Provider (SP) that stores the data in its database
The SP provides a professional database service
  – Backup
  – Performance
  – …
Users get data back from the SP for their own use, e.g. "Find Alice’s record":
EmpID | HourlyRate | WorkingHour
2 | 40 | 36

4 Introduction: security concern
Trusted party: Data owner / Users (hold the sensitive data)
Untrusted party: Database Service Provider (SP)
Incidents
  – March 2009: Google Docs allowed unintended access to some private documents
  – June 2013: a Facebook bug leaked the contact info of 6 million users
Objectives of our work:
(1) Protect sensitive data from being seen by any untrusted party (including the SP)
(2) Users still enjoy the database service from the SP

5 Secure database system - overview
The data owner (DO) / user encrypts data before sending it to the service provider (SP)
Plain data at DO:
EmpID | HourlyRate | WorkingHour
1 | 30 | 23
2 | 40 | 36
Encrypted rows stored at SP:
79826334164104547322019
5730923749300489226453
Q: SELECT * WHERE HourlyRate * WorkingHour > 900
Q is sent to the SP as a transformed query Q’, with some ‘trapdoors’ to help the SP compute the answer
The SP returns the encrypted row 5730923749300489226453, which the user decrypts to (2, 40, 36)

6 Approaches to solve the problem
Hardware-based solutions
  – TrustedDB [SIGMOD 2011], Cipherbase [SIGMOD 2013]
Homomorphic-encryption-based solutions
  – CryptDB [CACM 2012], MONOMI [PVLDB 2013]
Secure Multiparty Computation (SMC) approach
  – ShareMind [PAISI 2012]
Our solution
Secure indexing approaches
  – Orthogonal to the above solutions; can be integrated into any of them
  – Domain partitioning [SIGMOD 2002]

7 Before we discuss different approaches…
The SP is assumed to be more powerful than the users.
Users are trusted. They can see plain data.
  – A baseline solution exists
      Users retrieve the entire encrypted database
      Decrypt it, then do whatever they want
  – Problems of the baseline solution
      High communication cost and high processing cost at users
What the different approaches are trying to do
  – Delegate the query processing job to the SP
      Utilize the power of the SP
  – Users obtain the (encrypted) query answers only
      Low communication cost and low processing cost at users
  – We can always fall back to the baseline method!

8 Hardware-based solutions
Use of a secure co-processor
Can store a key on it
  – No party can observe the key stored on it
Provides an API for cryptographic operations using the stored key
Tamper resistance
  – The device cannot be hacked through physical intrusion

9 Use of secure coprocessor
Setup
  – Secure co-processor(s) is/are installed at the SP side
  – Users encrypt the data using a secret key and store it in the SP’s database
  – The key is sent to the secure co-processor through a secure channel
  – Note: the key is known to users and the secure co-processor only
Query: "Find Alice’s record"
  – The secure co-processor decrypts the records one by one and processes the query
  – The result can be encrypted or plain; in this example, it just returns yes/no
  – Users decrypt the answer

10 Optimization strategies
Add more secure co-processors for parallel processing
Compute the part of the query that does not involve encrypted data on the DBMS first
  – Example: SELECT * FROM T WHERE Price > 10 AND Order_Date < “22 Feb 2014”
    If Price is encrypted while Order_Date is not, the DBMS first processes the predicate Order_Date < “22 Feb 2014”
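The second strategy can be pictured as splitting a conjunctive WHERE clause into the predicates the DBMS can evaluate directly and those that need the secure co-processor. The following is only a minimal sketch: the tuple representation of predicates and the name split_conjuncts are assumptions made for this example, not part of any of the cited systems.

```python
from datetime import date

# Assumption: Price is stored encrypted, Order_Date in plain text (as in the slide's example).
ENCRYPTED_COLUMNS = {"Price"}

# WHERE Price > 10 AND Order_Date < "22 Feb 2014", as (column, operator, constant) conjuncts.
conjuncts = [
    ("Price", ">", 10),
    ("Order_Date", "<", date(2014, 2, 22)),
]

def split_conjuncts(conjuncts, encrypted_columns):
    """Partition AND-ed predicates by whether they touch an encrypted column."""
    plain = [c for c in conjuncts if c[0] not in encrypted_columns]
    secure = [c for c in conjuncts if c[0] in encrypted_columns]
    return plain, secure

plain_part, secure_part = split_conjuncts(conjuncts, ENCRYPTED_COLUMNS)
print("DBMS evaluates first:", plain_part)
print("Secure co-processor evaluates:", secure_part)
```

Filtering on the plain predicate first reduces the number of records the co-processor has to decrypt.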

11 Pros and cons
Pros
  – Strong security protection as long as the secure coprocessor is not compromised
  – Can process any query
Cons
  – Requires special hardware
  – Expensive
[Price table, in USD; data obtained on 7 Feb 2014]

12 Homomorphic-encryption-based solutions
Homomorphic encryption
  – A special type of encryption that allows certain types of operations (on plain values) to be executed on encrypted values
Let E be an encryption function
  – Homomorphic property: E(f(x, y)) = g(E(x), E(y))
  – Examples
      RSA: E(a) * E(b) = E(a * b)
      OPES [SIGMOD 04]: E(a) > E(b) if and only if a > b
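The RSA example can be checked numerically. The sketch below uses textbook (unpadded) RSA with toy parameters chosen for this illustration (p = 61, q = 53, e = 17 are assumptions, not values from the slides); it is far too small to be secure and only demonstrates the property E(a) * E(b) = E(a * b). It needs Python 3.8+ for the modular inverse via pow.

```python
# Toy textbook RSA: illustration only, not secure.
p, q = 61, 53
n = p * q                      # public modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse of e)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 7, 12
product_of_ciphertexts = (encrypt(a) * encrypt(b)) % n

# Multiplying ciphertexts gives the ciphertext of the product (as long as a*b < n).
print(product_of_ciphertexts == encrypt(a * b))
print(decrypt(product_of_ciphertexts))
```

Real deployments pad RSA plaintexts, which destroys this property; the homomorphism holds only for the unpadded scheme.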

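13 Using homomorphic encryptions
Plain table at users (HourlyRate and WorkingHour are sensitive):
EmpID | HourlyRate | WorkingHour
1 | 50 | 23
2 | 30 | 36
The SP stores two encrypted versions of the table (EID, HR, WH), one by OPES and one by RSA:
1ka6fjh3a45
2d2s2aAnm24
and
1Hj%345877
2Ks12#AA244
Query: SELECT HR*WH WHERE HR > 35
  – The user sends E(35) encrypted by OPES (ejAAS); the SP evaluates HR > 35 on the OPES-encrypted values
  – The SP computes HR*WH on the RSA-encrypted values and returns z%^#5
  – The user decrypts the result: HR*WH = 1150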

14 Pros and cons
Pros
  – Low overhead in query processing at the SP
      Example: multiplying RSA-encrypted data needs only a multiplication, with no encryption or decryption
Cons
  – Multiple encrypted versions of the same data may be needed
  – Does not support composition of operations
      Without data interoperability
      Example: cannot compute HR*WH > 6000

15 Secure Multiparty Computation (SMC) approach
Plain table at users:
EID | HR | WH
1 | 50 | 23
2 | 30 | 36
Secret sharing: $v = v_1 + v_2 + v_3 \bmod 100$
Share tables (EID, HR, WH), one held by each of SP #1, SP #2, SP #3:
  (1, 60, 28), (2, 31, 18)
  (1, 40, 56), (2, 0, 5)
  (1, 50, 39), (2, 99, 13)
Each SP cannot derive the plain value $v$ from its single share $v_i$
Query: SELECT EID, WH WHERE HR > 35
By exchanging some information (possibly over multiple rounds), the SPs compute the result securely
Result shares (EID, WH) returned by the SPs: (2, 18), (2, 13), (2, 5)
Reconstructed result at users: (2, 36)
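The sharing rule v = v1 + v2 + v3 mod 100 can be sketched in a few lines. The function names share and reconstruct are chosen for this example; the share values in the checks are the ones shown on the slide.

```python
import secrets

MOD = 100  # modulus used in the slide's example

def share(v, parties=3):
    """Split v into additive shares that sum to v modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(parties - 1)]
    shares.append((v - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recover the plain value from all shares."""
    return sum(shares) % MOD

# Shares of HR and WH for EID 1, as listed on the slide.
hr_eid1 = reconstruct([60, 40, 50])
wh_eid1 = reconstruct([28, 56, 39])
print(hr_eid1, wh_eid1)

# Fresh random shares reconstruct to the same value.
print(reconstruct(share(36)))
```

Any two shares alone are uniformly random modulo 100, which is why a single SP learns nothing about v.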

16 Pros and cons
Pros
  – Theoretically supports any computation
  – Usually low processing cost at SPs
      Most protocols do not need cryptographic operations
Cons
  – High communication costs between SPs
      Multiple rounds of communication
  – The SPs must not collude
  – 3 times the cost due to 3 SPs

17 Our solution
2-party secret sharing between users and the SP
  – Users keep the keys; the SP keeps the encrypted data
  – Storage cost at the user is linear in the schema size (number of tables and number of columns)
Query: SELECT EID, WH WHERE HR > 35
  – Users send some hints (derived from the keys) for the SP to process the query
  – Message size depends on the keys (small)
  – The SP returns encrypted results

18 Key features of our design
Low processing cost at users
  – Operate on keys only
  – Make use of the SP’s processing power for query processing
Allows composition of operations
  – Example: evaluate Quantity * Price + Fixed_cost
      First compute A = Quantity * Price
      Then compute Ans = A + Fixed_cost
  – Data interoperability

19 Key features of our design
Allows operations between plain and encrypted data
  – Encrypting everything is not recommended
      Processing encrypted data incurs overhead
  – Queries may involve both plain and encrypted data
  – Example: SELECT * WHERE Num_Stock * Stock_Price > 5000
      Num_Stock is encrypted; Stock_Price and the constant are not.

20 What can our system do?
SQL structure (on integer type data):
SELECT T1.Price*T2.Quantity
FROM Inventory as T1 INNER JOIN SaleOrder as T2 ON T1.itemID = T2.itemID
WHERE T1.Stock*T1.Price < 10,000
  – SELECT: projection with numeric operations; can be expressions composed with addition and multiplication
  – INNER JOIN … ON: equi-join
  – WHERE: predicate(s) to filter result tuples; supports AND/OR/NOT; supports expressions

21 More operations
INSERT/UPDATE
  – Example: UPDATE T1 SET Salary = Salary * 1.05 WHERE PeerScore + ManagerScore > 30
Basic aggregate functions: COUNT/SUM/AVERAGE
  – Example: SELECT SUM(HR*WH) FROM T WHERE Age < 30
      The aggregated argument (HR*WH) can be an expression
      The WHERE clause works just like selection

22 Limitations
Incurs a high processing cost at the SP, due to massive cryptographic operations
Still under development
  – Currently focuses on integer type data
  – Query plan optimization

23 END.

24 ADDITIONAL MATERIALS

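25 SMC Example: addition protocol
Operation: z = x + y
Secret sharing: $v = v_1 + v_2 + v_3 \bmod n$
SP #i holds the shares $x_i$ and $y_i$ and a random value $r_i$
Each SP computes a masked sum:
$s_1 = x_1 + y_1 - r_1$
$s_2 = x_2 + y_2 - r_2$
$s_3 = x_3 + y_3 - r_3$
Each SP combines the masked sum received from another SP with its own random value:
$z_1 = s_3 + r_1$ (SP #1)
$z_2 = s_1 + r_2$ (SP #2)
$z_3 = s_2 + r_3$ (SP #3)
Result: $z_1 + z_2 + z_3 = x + y$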
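A minimal simulation of this addition protocol, assuming each SP draws its own random mask r_i and forwards its masked sum s_i to the next SP in a ring (the ring order is inferred from the indices on the slide). The modulus 100 and the input shares are borrowed from the earlier SMC example; add_shares is a name chosen for this sketch.

```python
import secrets

N = 100  # sharing modulus, borrowed from the earlier SMC example

def add_shares(x_shares, y_shares, n=N):
    """Simulate the 3-SP addition protocol; returns the shares z_1..z_k of x + y."""
    k = len(x_shares)
    # Each SP i picks a private random mask r_i.
    r = [secrets.randbelow(n) for _ in range(k)]
    # Each SP i computes s_i = x_i + y_i - r_i and sends it to the next SP.
    s = [(x_shares[i] + y_shares[i] - r[i]) % n for i in range(k)]
    # SP i receives s from the previous SP in the ring: z_1 = s_3 + r_1, z_2 = s_1 + r_2, ...
    return [(s[i - 1] + r[i]) % n for i in range(k)]

x_shares = [60, 40, 50]   # shares of x = 50
y_shares = [28, 56, 39]   # shares of y = 23
z_shares = add_shares(x_shares, y_shares)
print(sum(z_shares) % N)  # reconstructs x + y
```

The masks cancel in the final sum, so the z shares are re-randomized shares of x + y.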

26 Our solution
2-party secret sharing of a table with columns X and Y
Shares at users: a table of pseudo-random numbers
Row-id | X | Y
r_1 | x_1a | y_1a
r_2 | x_2a | y_2a
… | … | …
Shares at SP:
Row-id | X | Y
E(r_1) | x_1b | y_1b
E(r_2) | x_2b | y_2b
… | … | …
Row-ids are encrypted by some existing encryption method
Without knowing the shares at users, the SP can’t recover the plain data
Storing the full table of shares incurs a high storage overhead to users
Instead, a column key is kept for each column (ck_X for X, ck_Y for Y), and the users’ shares are generated from the row-ids r_1, r_2, … and the column keys

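27 The actual storage at both sides
Secret sharing: $v = v_1 v_2 \bmod n$, with n = 35
Plain data:
A | B
2 | 3
4 | 1
Shares at users:
Row-id | A | B
1 | 8 | 8
2 | 32 | 29
Stored at SP:
Row-id | A | B
E(1) | 9 | 31
E(2) | 22 | 29
Users only remember the column keys of A and B (each contains two values)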
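The numbers on this slide can be checked directly: multiplying the user-side share by the SP-side value modulo 35 gives back the plain data. The sketch below hard-codes the slide's values; the helper names recover and make_sp_share are assumptions for this example, and pow(key, -1, n) needs Python 3.8+.

```python
N = 35  # public modulus from the slide

# User-side shares (item keys) and SP-side encrypted values, indexed by row.
user_shares = {1: {"A": 8, "B": 8}, 2: {"A": 32, "B": 29}}
sp_values = {1: {"A": 9, "B": 31}, 2: {"A": 22, "B": 29}}

def recover(row, col):
    """v = v1 * v2 mod n: combine the user's share with the SP's value."""
    return (user_shares[row][col] * sp_values[row][col]) % N

def make_sp_share(value, user_share):
    """Encrypt a plain value: the SP's share is value * user_share^-1 mod n."""
    return (value * pow(user_share, -1, N)) % N

plain = {row: (recover(row, "A"), recover(row, "B")) for row in (1, 2)}
print(plain)
print(make_sp_share(2, 8))
```

The user-side share must be invertible modulo n, i.e. coprime to 35, for encryption to be well defined.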

28 Operation on our encrypted data
Operation: C = A + B
Stored at SP:
Row-id | A | B
E(1) | 9 | 31
E(2) | 22 | 29
Similar to SMC, there is some communication between the user and the SP
But the communication is uni-directional (user -> SP only)
Some ‘hints’ are sent to the SP to help it compute the operation
The SP computes $C_e = A' + B'$:
Row-id | C
E(1) | 20
E(2) | 5

29 Retrieving the data
Query: SELECT C WHERE A * B + D > 20
At user: table schema (A, B, C, D) and column keys
Encrypted values at SP:
Row-id | A | B | C | D
E(1) | … | … | … | …
E(2) | … | … | … | …
Find the answers:
Row-id | Match?
E(1) | No
E(2) | Yes
E(6) | No
E(4) | No
… | …
Projection on C only; the encrypted answer is sent back to the user (row-ids must be there):
Row-id | C
E(2) | 3
E(16) | 12
… | …

30 Decrypting the result
Query: SELECT C WHERE A * B + D > 20
At user: table schema (A, B, C, D) and column keys; $v = v_1 v_2 \bmod n$, n = 35
Encrypted answers:
Row-id | C
E(2) | 3
E(16) | 12
… | …
User computes own item keys:
Row-id | C
2 | 31
16 | 17
… | …
Decrypt (item key times encrypted value, mod n):
C
23
29
…

31 Without data interoperability
RSA: $E_1(x) * E_1(y) = E_1(x*y)$
  – Supports multiplication over encrypted data
OPES: $E_2(a) > E_2(b)$ if a > b
  – Supports comparison over encrypted data
How to compute x*y > b over encrypted data?
  – The SP computes $E_1(a) = E_1(x) * E_1(y) = E_1(x*y)$
  – The two schemes operate on different spaces, so the user has to decrypt $E_1(a)$ and then encrypt it as $E_2(a)$
  – The SP can then compare $E_2(a) > E_2(b)$

32 With data interoperability
How to compute x+y > b over encrypted data?
  – E(x) + E(y) = E(a) = E(x+y), then compare E(a) > E(b), all under the same encryption
Other examples: $(x_1 - x_2)^2 + (y_1 - y_2)^2$ can be computed using addition and multiplication only

33 Secure item key generator
INPUT: row key r, column key (m, x)
  – All are kept private
System parameters: n, g
  – Selected by the DO; n is public, g is not
Generation function: $v_k = m g^{xr} \bmod n$
Security:
  – Extension of the RSA function
  – Even if an attacker observes several item keys, it is computationally hard to derive the secret parameters and hence other item keys
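The generation function is easy to evaluate with modular exponentiation. The sketch below plugs in the toy parameters that appear in the worked illustrations later in the deck (n = 35, g = 2, column keys (2, 2) for A and (1, 3) for B, row keys 1 and 2); the function name item_key and the dictionary layout are assumptions for this example.

```python
def item_key(m, x, r, g, n):
    """v_k = m * g^(x*r) mod n for column key (m, x) and row key r."""
    return (m * pow(g, x * r, n)) % n

N, G = 35, 2                                # toy system parameters from the illustrations
COLUMN_KEYS = {"A": (2, 2), "B": (1, 3)}    # (m, x) per column

item_keys = {
    (col, r): item_key(m, x, r, G, N)
    for col, (m, x) in COLUMN_KEYS.items()
    for r in (1, 2)
}
print(item_keys)
```

These are the user-side shares shown in "The actual storage at both sides", so the user never needs to store them: they can be regenerated from the row key and the column key on demand.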

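34 Illustration 1: Multiplication of 2 columns
System parameters: n = 35, g = 2
Plain data:
Row | A | B
1 | 2 | 3
2 | 4 | 1
At DO: table schema and column keys of A and B
Encrypted values at SP:
Row | A_e | B_e
1 | 9 | 31
2 | 22 | 29
SP computes $C_e = A_e B_e$:
Row | C_e
1 | 34
2 | 8
DO computes the item keys of C:
Row | C
1 | 29
2 | 18
Result after decryption, C = AB:
Row | C
1 | 6
2 | 4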
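The illustration can be replayed in code. The sketch assumes the column keys (2, 2) for A and (1, 3) for B that appear in the addition example later in the deck, and indexes rows by their plain row key for brevity (the SP would actually see E(r)). The column key of C is taken as (m_a * m_b, x_a + x_b), which is what the decryption formula in the proof of correctness implies.

```python
N, G = 35, 2
KEY_A, KEY_B = (2, 2), (1, 3)        # column keys (m, x), assumed from the later slides

def item_key(col_key, r):
    m, x = col_key
    return (m * pow(G, x * r, N)) % N

# SP side: encrypted values per row, and the product computed without any key.
encrypted = {1: (9, 31), 2: (22, 29)}
c_enc = {r: (a_e * b_e) % N for r, (a_e, b_e) in encrypted.items()}

# DO side: derive the column key of C, then decrypt with the item keys of C.
KEY_C = ((KEY_A[0] * KEY_B[0]) % N, KEY_A[1] + KEY_B[1])
c_item_keys = {r: item_key(KEY_C, r) for r in c_enc}
c_plain = {r: (c_item_keys[r] * c_enc[r]) % N for r in c_enc}

print(c_enc)        # encrypted products at SP
print(c_item_keys)  # item keys of C at DO
print(c_plain)      # decrypted A*B
```

No hint is needed for multiplication: the SP multiplies ciphertexts on its own and the DO only updates the column key.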

35 Proof of correctness
DO holds the column keys of A, B and C; SP holds the row (E(r), a', b') and computes the row (E(r), a'b') of C_e
We have
$a = m_a g^{r x_a} a'$
$b = m_b g^{r x_b} b'$
Decryption on C:
$m_a m_b g^{r(x_a + x_b)} (a'b') = (m_a g^{r x_a} a')(m_b g^{r x_b} b') = ab$

36 Illustration 2
Addition C = A + B
  – Example: SELECT * WHERE salary + bonus > 40,000
Preparation stage
  – We add a constant column S to the plain database
  – S is encrypted, i.e., DO keeps a column key of S, SP keeps a column of encrypted values
Plain data before:
A | B
2 | 3
4 | 1
Plain data after adding S:
A | B | S
2 | 3 | 1
4 | 1 | 1

37 C = A + B
Plain data:
A | B | S | C = A + B
2 | 3 | 1 | 5
4 | 1 | 1 | 5
Storage at both sides
  – DO: column keys of A, B, S and C
  – SP: encrypted values
Row-id | A_e | B_e | S_e
E(1) | 9 | 31 | 8
E(2) | 22 | 29 | 4
DO gives hints to SP
$p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n) = 13^{-1} \cdot (5 - 2) \bmod 24 = 15$
$p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n) = 13^{-1} \cdot (5 - 3) \bmod 24 = 2$
$q_A = m_a m_S^{p_A} m_C^{-1} \bmod n = 2 \cdot 11^{15} \cdot 4^{-1} \bmod 35 = 18$
$q_B = m_b m_S^{p_B} m_C^{-1} \bmod n = 1 \cdot 11^{2} \cdot 4^{-1} \bmod 35 = 4$
SP computes the encrypted answers
$A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$
Row-id | A' | B'
E(1) | 29 | 26
E(2) | 4 | 1
$C_e = A' + B'$
Row-id | C_e
E(1) | 20
E(2) | 5
Item keys of C at DO
Row key | C
1 | 23
2 | 1
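The whole addition example can be reproduced with modular arithmetic. In the sketch below the column keys (m, x) are read off the numeric formulas on the slide: A = (2, 2), B = (1, 3), S = (11, 13), C = (4, 5), with n = 35, Φ(n) = 24 and g = 2. Function names are chosen for this example, rows are indexed by plain row key for brevity, and Python 3.8+ is needed for pow with a negative exponent.

```python
N, PHI, G = 35, 24, 2
KEYS = {"A": (2, 2), "B": (1, 3), "S": (11, 13), "C": (4, 5)}  # column keys (m, x)

def item_key(col, r):
    m, x = KEYS[col]
    return (m * pow(G, x * r, N)) % N

def hint(src):
    """DO side: hint (p, q) that lets the SP re-key column `src` to the key of C."""
    m, x = KEYS[src]
    m_s, x_s = KEYS["S"]
    m_c, x_c = KEYS["C"]
    p = (pow(x_s, -1, PHI) * (x_c - x)) % PHI
    q = (m * pow(m_s, p, N) * pow(m_c, -1, N)) % N
    return p, q

p_a, q_a = hint("A")
p_b, q_b = hint("B")

def sp_add(a_e, b_e, s_e):
    """SP side: C_e = A' + B' with A' = q_A*A_e*S_e^p_A and B' = q_B*B_e*S_e^p_B (mod n)."""
    a_prime = (q_a * a_e * pow(s_e, p_a, N)) % N
    b_prime = (q_b * b_e * pow(s_e, p_b, N)) % N
    return (a_prime + b_prime) % N

encrypted = {1: (9, 31, 8), 2: (22, 29, 4)}          # (A_e, B_e, S_e) per row
c_enc = {r: sp_add(*row) for r, row in encrypted.items()}
c_plain = {r: (item_key("C", r) * c_enc[r]) % N for r in c_enc}

print((p_a, q_a), (p_b, q_b))
print(c_enc)
print(c_plain)
```

The hints depend only on column keys, not on the rows, so their size is independent of the table size.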

38 Proof of correctness
DO holds the column keys of A, B, S and C; SP holds the row (E(r), a', b', s') and computes the row (E(r), c') of C_e
We have
$a = m_a g^{r x_a} a'$
$b = m_b g^{r x_b} b'$
$1 = m_s g^{r x_s} s' \Rightarrow s' = m_s^{-1} g^{-r x_s}$
The hints are
$p_A = x_s^{-1} (x_c - x_a) \bmod \Phi(n)$, $p_B = x_s^{-1} (x_c - x_b) \bmod \Phi(n)$
$q_A = m_a m_s^{p_A} m_c^{-1} \bmod n$, $q_B = m_b m_s^{p_B} m_c^{-1} \bmod n$
Following the procedure ($A' = q_A A_e S_e^{p_A}$, $B' = q_B B_e S_e^{p_B}$, $C_e = A' + B'$), we have
$c' = q_A a' (s')^{p_A} + q_B b' (s')^{p_B}$
$c' = (m_a m_s^{p_A} m_c^{-1}) a' (s')^{p_A} + (m_b m_s^{p_B} m_c^{-1}) b' (s')^{p_B}$
Since $(m_s^{-1} g^{-r x_s})^{p_A} = m_s^{-p_A} g^{-r x_s p_A}$ (and likewise for $p_B$),
$c' = (m_a m_c^{-1}) a' g^{-r x_s p_A} + (m_b m_c^{-1}) b' g^{-r x_s p_B}$
Substituting $x_s p_A = x_c - x_a$ and $x_s p_B = x_c - x_b$ (mod $\Phi(n)$),
$c' = (m_a m_c^{-1}) a' g^{-r(x_c - x_a)} + (m_b m_c^{-1}) b' g^{-r(x_c - x_b)}$
$c' = m_c^{-1} g^{-r x_c} (m_a g^{r x_a} a' + m_b g^{r x_b} b')$
$c' = m_c^{-1} g^{-r x_c} (a + b)$
Decryption on c':
$m_c g^{r x_c} c' = a + b$

