DBMask: Fine-Grained access control on encrypted relational databases

1 DBMask: Fine-Grained access control on encrypted relational databases
M I Sarfraz, M Nabeel, J Cao, E Bertino
Good morning everyone. My name is Ihsan. I am a graduate student at Purdue University in the Department of Electrical and Computer Engineering and a member of the Database and Information Security group led by Prof. Elisa Bertino. I will be presenting the work titled “DBMask: Fine-grained access control on encrypted relational databases”.

2 Overview
Organizations moving towards the paradigm of “database-as-a-service”.
Data usually encrypted to preserve confidentiality.
[Diagram: two numbered panels showing an organization and the cloud; labels: DB, SENSITIVE, Encrypt & upload, Encrypted DB, Download & decrypt.]
I will begin with a brief overview. For efficient data management and economic benefit, organizations are increasingly moving towards the paradigm of “database-as-a-service”, in which their data are managed by a DBMS hosted in a cloud. By outsourcing data to the cloud, organizations save the cost of building and maintaining a private database system and pay only for the services they use. However, data are among the most valuable assets of an organization, and inappropriate data disclosure or leakage puts the organization's business at risk. Therefore, data is encrypted in order to preserve confidentiality.

3 Problem
1. Encrypted Query Processing: How can users query and retrieve encrypted data without the cloud decrypting the data or query?
2. Fine-Grained Access Control: How to provide fine-grained access to data over encrypted relational data?
Ex. SELECT * FROM table1 WHERE col3 = x5a8c34
[Diagram: Query, Encrypted Table, Encrypted Results.]
I will now discuss the problems our work addresses. First, given that the data is encrypted, how can users query and retrieve it without the cloud decrypting it? Second, given that the data is encrypted, how can we provide fine-grained access control at the granularity of a table, a column, a row and even a cell?

4 Related Work
Encrypted Query Processing / Fine-Grained Access Control
Fully Homomorphic Encryption: Gentry et al.
Trusted Hardware: TrustedDB, CipherBase
Untrusted Server: Hacıgümüş et al.
CryptDB: first research effort that investigates access control for queries on encrypted relational data.
MONOMI: extends CryptDB to support OLAP
The works most directly related to ours are as follows. Past research has extensively investigated query processing over encrypted data. Gentry et al. provide a theoretical solution using fully homomorphic encryption, which allows servers to compute arbitrary functions over encrypted data. However, fully homomorphic encryption schemes are still prohibitively expensive by orders of magnitude. Approaches that utilize trusted hardware have also been proposed. Such approaches process sensitive queries inside the trusted module. However, they require special, expensive hardware as well as modifications to the existing database query processor. Hacıgümüş et al. proposed a pioneering approach that performs as much of the query as possible on the remote server and performs the remainder at the client site. However, they support only approximate queries on the server. CryptDB is the first work that systematically investigates access control for SQL queries, and it is the motivation behind our work. The CryptDB architecture assumes a proxy between a user and the cloud server. Given a plaintext query submitted by a user, the proxy first checks whether the query is authorized according to the access control policies. If so, the proxy encrypts the query with the corresponding secret key derived from the user’s password. The encrypted query is then forwarded to the cloud, which runs the query over encrypted data and returns the result to the proxy. The proxy then decrypts the query result and returns it to the user. MONOMI uses the same underlying security framework as CryptDB but a different architecture to provide support for complex queries.

5 CryptDB (Limitations)
Onions of encryption: multiple decryptions of entire columns; security level decreases over time
Unable to execute certain queries, e.g. computation and comparison on same column
Access control not cryptographically enforced: susceptible to bypass and SQL injection attacks
Cannot perform computations on values for different principals
CryptDB, however, suffers from the following limitations. The first limitation is the onions of encryption. An onion consists of multiple layers of encryption. Each layer is applied for a specific query operation or purpose, and the layers become increasingly weaker from the outermost to the innermost. Given a query, the server peels off layers in order to support certain operations. Hence query operations are supported at the cost of multiple decryptions of entire columns. In addition, although onions offer multiple levels of security, security decreases over time as layers are peeled off. The second limitation is that CryptDB is unable to execute queries that cannot be processed entirely on the server. For example, it does not support queries requiring both comparison and computation on the same column. The third limitation is that the row/cell level access control mechanism is not cryptographically enforced. Instead, CryptDB always encrypts a column using a single key and utilizes a proxy-based reference monitor to enforce row level access control. When access control is not fully enforced using cryptography, the system is susceptible to bypass and SQL injection attacks. For example, a malicious user may trick the database into returning more rows than they have access to. The risk of such attacks can be reduced by cryptographic enforcement, since malicious users are then unable to decrypt the result.
Further, CryptDB cannot process queries on the cloud server over data items that the access control policy encrypts under different keys for different users, because the ciphertexts are under different keys even when the users are authorized.

6 Contributions
To address the limitations of CryptDB, we propose DBMask, a novel solution that supports:
Cryptographically enforced fine-grained access control at the granularity level of table, column, row and cell
Relational query operators, by adding a comparison-friendly column per column
A new architecture to execute encrypted queries

7 DBMask Architecture
[Diagram: Data Owner, Users, Trusted Proxy and Untrusted Cloud (Encrypted DB), with steps numbered 1 to 7; labels: Encrypted Data, Encrypted Keys, Key Management, Attributes + request, Plaintext Data, Encrypted Query, Encrypted Resultset.]
Our system includes four entities: data owner, data user, proxy and data server. Their interactions are as follows. The data owner uses different secret keys to encrypt different portions of data, according to the access control policies. The encrypted data are uploaded to the data server. A data user with authenticated identity attributes can verify itself to the proxy. The successful attribute based verification of the user to the proxy allows the proxy to either derive or obtain one or multiple secret keys required to encrypt the user query. Given a plaintext query submitted by the user, the proxy uses these keys to rewrite the query into an encrypted query, which can then be executed on the encrypted data in the data server. The encrypted query results are returned from the data server to the proxy, which decrypts the results using the secrets established at the time of verification and forwards them to the data user. Notice that during the query processing stage, the data server learns neither the query being executed nor the result set of the query.

8 Adversary Model
Data owner: remains offline after uploading data to server. We assume data owner is fully trusted.
Data server: honest-but-curious
Proxy: trusted third party; attack compromises logged-in users; data owner does not outsource encryption to proxy
User: not trusted; no secret key stored at user side
The data owner remains offline after it uploads the encrypted data to the data server. We assume that the data owner is fully trusted. The remaining three entities might be compromised. The data server is assumed to be honest-but-curious. It does not attempt to actively attack the encrypted data, e.g., by altering the query answers or changing the encrypted data, but instead is passive. The server will never be given the key by which the ciphertext can be decrypted to obtain the plaintext. The proxy is a trusted third party. All the secret keys, which are generated by the data owner and stored at the proxy, are encrypted. Our key management scheme (see Section 5) requires that these encrypted secret keys cannot be decrypted by the proxy alone. Instead, they can only be decrypted by the proxy with the help of authorized data users. An attack that has compromised the proxy can access the keys of logged-in users. Consequently, it can also access the data authorized to those users. However, the secret keys of all the inactive users remain secure. In our model, the data owner does not outsource the data encryption operation to the proxy, although it is trusted. This is to avoid a “single point of failure”. Otherwise, if the proxy is compromised at the pre-processing stage (i.e., the stage of generating the keys to encrypt data), then the whole system is compromised. Users are not trusted in our system, and the proxy establishes trust via certified identity attributes issued by the data owner. Our system does not store at the user side any secret key that can be used to decrypt the data.

9 Our Approach
[Diagram: cell-level fine grained access control + encrypted query processing. Fine grained access control: Attribute Based Group Key Management (ABGKM) scheme. Encrypted query processing: specialized schemes: deterministic [AES], OPE [Boldyreva 09], keyword search [Song 00].]
We will now discuss the fine grained access control mechanism. We split fine grained access control from encrypted query processing. Access control policies require that different users have access to different data items. To prevent unauthorized access, different data items are encrypted under different keys. Although this supports access control, having different data items encrypted under different keys inhibits computations over a column. This is one of the limitations of CryptDB mentioned earlier. To address this limitation, we split …

10 Access Control Model
We utilize the attribute based access control (ABAC) model:
Attribute Condition (AC): nameA op l, e.g. age > 35
ABAC Policy: defined over (s, o)
s is a Boolean expression over a set of ACs, e.g. age < 30 and role = nurse
o denotes a set of cells, e.g. select * from patient where age > 40
Users have a set of identity attributes, e.g. roles, seniority, age …
Data is associated with ABAC policies
A user whose identity attributes satisfy the ABAC policies is allowed access to the data item
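As an illustration of the subject side of an ABAC policy, here is a minimal sketch that checks a conjunction of attribute conditions against a user's identity attributes. The class and function names (AttributeCondition, satisfies_all) and the sample users are assumptions made for this example; the check is done in plaintext and does not show the cryptographic enforcement that DBMask relies on.

```python
import operator
from dataclasses import dataclass

# Comparison operators allowed in an attribute condition "name op value".
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class AttributeCondition:
    """One attribute condition, e.g. age > 35 (hypothetical helper type)."""
    name: str
    op: str
    value: object

    def holds(self, attrs: dict) -> bool:
        # A user who lacks the attribute never satisfies the condition.
        if self.name not in attrs:
            return False
        return OPS[self.op](attrs[self.name], self.value)


def satisfies_all(conditions, attrs) -> bool:
    """Conjunction of attribute conditions, e.g. age < 30 and role = nurse."""
    return all(c.holds(attrs) for c in conditions)


# Subject expression from the slide: age < 30 and role = nurse.
policy_subject = [AttributeCondition("age", "<", 30),
                  AttributeCondition("role", "=", "nurse")]

alice = {"age": 27, "role": "nurse"}   # illustrative user
bob = {"age": 41, "role": "nurse"}     # illustrative user
```

With these sample users, satisfies_all(policy_subject, alice) holds while satisfies_all(policy_subject, bob) does not, because bob fails the age condition.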

11 Access Control Model (cont’d …)
Group
automatic assignment based on identity attributes
convert each Boolean expression into disjunctive normal form (DNF)
for each disjunctive clause, create a group
Group Poset
partially ordered set of groups
exploit hierarchical relationship among groups
e.g. Consider the following two Boolean expressions (BE) defined over the attribute conditions C1, C2 and C3: BE1 = C1 ∧ (C2 ∨ C3) and BE2 = C2. Then the corresponding expressions in DNF are: BE1 = (C1 ∧ C2) ∨ (C1 ∧ C3) and BE2 = C2
[Diagram: groups G1, G2 and G3, and the group poset over G, G1, G2 and G3.]
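The group construction above can be sketched in a few lines. This is a minimal illustration, not the DBMask implementation: expressions are written as nested tuples, the numbering of groups (G1 = C1 ∧ C2, G2 = C1 ∧ C3, G3 = C2) is an assumption chosen to mirror the slide, and the ordering is derived from the observation that a user who satisfies every condition of one clause also satisfies any clause that uses a subset of those conditions.

```python
from itertools import product


def to_dnf(expr):
    """Return the DNF of expr as a set of clauses (frozensets of condition names).

    expr is either a condition name such as "C1" or a tuple
    ("and", e1, e2, ...) / ("or", e1, e2, ...).
    """
    if isinstance(expr, str):
        return {frozenset([expr])}
    op, *args = expr
    parts = [to_dnf(a) for a in args]
    if op == "or":
        # The clauses of a disjunction are the union of the operands' clauses.
        return set().union(*parts)
    if op == "and":
        # Distribute AND over OR: pick one clause per operand and merge them.
        return {frozenset().union(*combo) for combo in product(*parts)}
    raise ValueError(f"unknown operator: {op}")


def subgroup_pairs(groups):
    """Pairs (a, b) where every member of group a is also a member of group b."""
    return [(a, b) for a, ca in groups.items() for b, cb in groups.items()
            if a != b and cb < ca]


BE1 = ("and", "C1", ("or", "C2", "C3"))
BE2 = "C2"

# One group per distinct disjunctive clause; the ordering here is illustrative.
clauses = sorted(to_dnf(BE1) | to_dnf(BE2), key=lambda c: (-len(c), sorted(c)))
groups = {f"G{i}": clause for i, clause in enumerate(clauses, start=1)}
```

Here subgroup_pairs(groups) yields the single pair (G1, G3): every user in the group for C1 ∧ C2 also satisfies C2, which is the kind of hierarchical relationship the group poset exploits.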

12 Key Management
BGKM schemes are a special type of GKM scheme in which private communication channels are not required. Subscribers are not given private keys. Instead, they are given a secret that, combined with public information, can be used to obtain the actual private keys. However, BGKM schemes do not support group membership policies over a set of attributes. AB-GKM provides all the benefits of BGKM schemes and supports attribute based access control policies.
[Diagram: s + PI = k1; attributes a1, a2, a3.]

13 ABGKM
The idea behind the AB-GKM scheme is as follows:
separate BGKM instance for each attribute condition
Boolean expression over a set of attribute conditions embedded in an access structure tree
internal nodes are threshold gates and leaves are BGKM instances
Example: the expression ((“type = regular” ∧ “region = Indiana”) ∨ “type = premium”, {new movie})
[Access tree: OR(AND(type = regular, region = Indiana), type = premium)]
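To make the access structure tree concrete, the following sketch evaluates whether a set of satisfied attribute conditions meets the tree from the example. It only models satisfaction: in the actual scheme each leaf is a BGKM instance and a successful evaluation yields a key, which is not shown here. The type names Leaf and Gate are assumptions made for this example; AND is modelled as an n-of-n threshold gate and OR as a 1-of-n gate.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """A leaf of the access tree; stands in for one BGKM instance."""
    condition: str


@dataclass
class Gate:
    """A threshold gate: at least `threshold` children must be satisfied."""
    threshold: int
    children: list = field(default_factory=list)


def satisfied(node, user_conditions) -> bool:
    """Return True if the attribute conditions the user meets satisfy the tree."""
    if isinstance(node, Leaf):
        return node.condition in user_conditions
    hits = sum(satisfied(child, user_conditions) for child in node.children)
    return hits >= node.threshold


# ("type = regular" AND "region = Indiana") OR "type = premium"
tree = Gate(1, [
    Gate(2, [Leaf("type = regular"), Leaf("region = Indiana")]),
    Leaf("type = premium"),
])
```

A regular subscriber in Indiana and any premium subscriber both satisfy the tree, while a regular subscriber from another region does not.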

14 ABGKM (cont’d …)
Setup (): Initializes the system
SecGen (Y): Generates a set of secrets for each commitment Yi (certified attributes)
KeyGen (BE): Outputs a symmetric key, public information and an access tree
KeyDer (iA, PI, T): Outputs the symmetric key only if identity attributes satisfy access structure T
ReKey (BE): Executed whenever the dynamics in the system change

15 Our Approach
[Diagram: cell-level fine grained access control + encrypted query processing. Fine grained access control: Attribute Based Group Key Management (ABGKM) scheme. Encrypted query processing: specialized schemes: deterministic [AES], OPE [Boldyreva 09], keyword search [Song 00].]
Now, we will discuss SQL-aware comparison.

16 SQL-aware Comparison
Numerical and Keyword Comparison
PPNC: PPNC-SEM (AES + blinding factor), PPNC-DET [AES], PPNC-OPE [Boldyreva ‘09]
PPKC: PPKC-SEM [Song ‘00]
Computing Join
JOIN-SEM (blinded trapdoor values), JOIN-DET [AES]
[Diagram: both lists are annotated with security and performance arrows.]

17 SQL-aware comparison (cont’d …)
Setup ({P}): Initializes the underlying scheme
EncVal (x): Generates the encrypted value ex
GenTrapdoor (t): Generates the trapdoor value et
Compare (ex, et): Compares the values ex and et and outputs result
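The four-operation interface can be illustrated with a toy, equality-only instantiation. This is a stand-in chosen for brevity, not the PPNC or PPKC construction: it uses a keyed HMAC-SHA256 tag as a deterministic encoding, so it supports only equality tests and leaks which values are equal. The class name ToyEqualityScheme and its method names are assumptions made for this example.

```python
import hashlib
import hmac
import secrets


class ToyEqualityScheme:
    """Equality-only stand-in for the Setup/EncVal/GenTrapdoor/Compare interface."""

    def __init__(self):
        # Setup: draw a fresh secret key held by the data owner / proxy.
        self._key = secrets.token_bytes(32)

    def _tag(self, value) -> bytes:
        # Deterministic keyed tag of the value's textual representation.
        message = repr(value).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def enc_val(self, x) -> bytes:
        """EncVal: value stored in the comparison-friendly column."""
        return self._tag(x)

    def gen_trapdoor(self, t) -> bytes:
        """GenTrapdoor: token derived from the constant in the query predicate."""
        return self._tag(t)

    @staticmethod
    def compare(ex: bytes, et: bytes) -> bool:
        """Compare: run by the server without access to the key."""
        return hmac.compare_digest(ex, et)


scheme = ToyEqualityScheme()
stored = scheme.enc_val(35)
```

The point of the split is that compare needs no key: the server can match scheme.gen_trapdoor(35) against stored, while only the key holder can produce encrypted values and trapdoors. Order comparisons would need a different underlying scheme, such as the OPE variant listed on the previous slide.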

18 Our Approach
[Diagram: cell-level fine grained access control + encrypted query processing. Fine grained access control: Attribute Based Group Key Management (ABGKM) scheme. Encrypted query processing: specialized schemes: deterministic [AES], OPE [Boldyreva 09], keyword search [Song 00].]
How to perform secure query evaluation over encrypted data?

19 Secure Query Evaluation
System Initialization
Data owner:
runs AB-GKM.Setup, PPNC.Setup and PPKC.Setup
converts Boolean expression over set of identity attributes to groups
constructs group poset
User Registration
users register attributes with data owner
data owner generates secrets for identity attributes
user-secret database is maintained with data owner and proxy

20 Secure Query Evaluation (cont’d)
Data Encryption and Upload
[Diagram: each cell is represented by data-col, match-col, trap-col and label-col.]
ACPs:
G1, SELECT * FROM patient WHERE age < 40
G2, SELECT Age FROM Patient WHERE Age > 30
G3, SELECT Age, Diagnosis FROM Patient WHERE Age < 40 and Diagnosis = ‘Asthma’
e.g. Consider the following two Boolean expressions (BE) defined over the attribute conditions C1 = “level > 3”, C2 = “role = doctor” and C3 = “role = nurse”: BE1 = C1 ∧ (C2 ∨ C3) and BE2 = C2. Then the corresponding expressions in DNF are: BE1 = (C1 ∧ C2) ∨ (C1 ∧ C3) and BE2 = C2
[Diagram: groups G1, G2, G3.]
Patient
ID | Age | Diagnosis
1 | 35 | HIV
2 | 30 | Cancer
3 | 40 | Asthma
4 | 38 | Asthma

21 Secure Query Evaluation (cont’d)
Data Encryption and Upload
[Diagram: data-col and label-col highlighted.]
For each Gi, secret key Ki generated using AB-GKM.KeyGen
To avoid multiple encryptions, master key generated
User can access the cell by executing AB-GKM.KeyDer
Ek(x) refers to the semantically secure encryption of x
Patient
ID | ID-grp | Age | Age-grp | Diagnosis | Diagnosis-grp
1 | G1 | 35 | G1,G2 | HIV |
2 | | 30 | | Cancer |
3 | | 40 | G2 | Asthma |
4 | | 38 | G2,G3 | | G3

22 Secure Query Evaluation (cont’d)
Data Encryption and Upload
[Diagram: match-col highlighted, Patient table.]
For each cell:
if numeric, PPNC.EncVal to encrypt
if string, PPKC.EncVal to encrypt
compn and compk refer to PPNC.EncVal and PPKC.EncVal respectively

23 Secure Query Evaluation (cont’d)
Data Encryption and Upload
[Diagram: trap-col highlighted.]
For each cell:
if numeric, PPNC.GenTrapDoor * blindingfactor to encrypt
if string, PPKC.GenTrapDoor * blindingfactor to encrypt
Patient
ID-enc | ID-com | ID-trap | ID-grp | Age-enc | Age-com | Age-trap | Age-grp | Diagnosis-enc | Diag-com | Diag-trap | Diag-grp
Ek1(1) | compn(1) | comp’n(1) | G1 | Ek12(35) | compn(35) | comp’n(35) | G1,G2 | Ek1(HIV) | compk(HIV) | comp’k(H..) |
Ek1(2) | compn(2) | comp’n(2) | | Ek1(30) | compn(30) | comp’n(30) | | Ek1(Cancer) | compk(Cancer) | comp’k(Ca..) |
Ek1(3) | compn(3) | comp’n(3) | | Ek2(40) | compn(40) | comp’n(40) | G2 | Ek1(Asthma) | compk(Asthma) | comp’k(As..) |
Ek1(4) | compn(4) | comp’n(4) | | Ek23(38) | compn(38) | comp’n(38) | G2,G3 | Ek3(Asthma) | | | G3

24 Secure Query Evaluation (cont’d …)
Data Querying and Retrieval
Processing a query is a filtering-refining procedure:
filter aggregates or clauses that cannot be computed on server
project the columns referenced by filtered aggregates or clauses to the query
replace each column to be included in query result by “data-col”
replace each column in the predicate of WHERE clause with “match-col”
compute trapdoor for each value in the predicate of WHERE clause

25 Secure Query Evaluation (cont’d …)
Data Querying and Retrieval
Processing a query is a filtering-refining procedure:
add a predicate to WHERE clause that determines the group of the user requesting the query
run the query on the server
decrypt the resultset
if clause(s) or aggregate functions were filtered, populate the resultset in-memory and refine the result according to the constraint in the clause
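The rewriting steps of the filtering-refining procedure can be sketched as a function that maps a plaintext query description to its encrypted counterpart. This is a simplified illustration under several assumptions: the query is given as already-parsed pieces rather than SQL text, the column suffixes -enc, -com and -grp are borrowed loosely from the example table, the group check is expressed as a per-column list of acceptable labels, and the trapdoor generator is passed in as a callable standing in for PPNC/PPKC GenTrapdoor. The filtering of clauses that cannot run on the server, and the final refinement at the proxy, are not shown.

```python
def rewrite_query(select_cols, predicates, user_groups, gen_trapdoor):
    """Build a structured description of the encrypted query.

    select_cols:  plaintext column names to return
    predicates:   (column, operator, plaintext value) triples from the WHERE clause
    user_groups:  groups the requesting user belongs to
    gen_trapdoor: callable standing in for PPNC/PPKC GenTrapdoor
    """
    return {
        # Result columns are read from the encrypted data column.
        "select": [f"{col}-enc" for col in select_cols],
        # Predicates are evaluated on the comparison-friendly column,
        # with each constant replaced by its trapdoor.
        "where": [(f"{col}-com", op, gen_trapdoor(value))
                  for col, op, value in predicates],
        # Extra predicate: a cell's label must name one of the user's groups.
        "group_filter": {f"{col}-grp": sorted(user_groups) for col in select_cols},
    }


def fake_trapdoor(value):
    """Placeholder trapdoor generator used only to make the sketch runnable."""
    return f"trapdoor({value})"


# SELECT Age FROM Patient WHERE Age > 30, issued by a member of G2.
rewritten = rewrite_query(["Age"], [("Age", ">", 30)], {"G2"}, fake_trapdoor)
```

The returned structure keeps the three rewriting decisions separate, which makes it easy to see that the server only ever receives column names of encrypted columns, trapdoors and group labels.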

26 Secure Query Evaluation
DBMask-SEC: PPNC-SEM, PPKC-SEM, JOIN-SEM, label-col encrypted
DBMask-PER: PPNC-OPE, PPKC-SEM, JOIN-DET, label-col in plaintext

27 Example Queries
level = 4: G1; role = doctor: G3
SELECT ID, Age, Diag FROM Patient WHERE Age > 35 AND Diag LIKE ‘Asthma’ ORDER BY Age ASC
[Diagram panels: DBMask-SEC, DBMask-PER.]

28 Example Queries
role = doctor: G3
SELECT p.Age, d.Description FROM Patient p, Diagnosis d WHERE p.ID = d.PatientID
[Diagram panels: DBMask-SEC, DBMask-PER.]

29 Special Case – Row Level Access Control
ID-enc | ID-com | ID-trap | Age-enc | Age-com | Age-trap | Diagnosis-enc | Diag-com | Diag-trap | Groups
Ek1(1) | compn(1) | comp’n(1) | Ek1(35) | compn(35) | comp’n(35) | Ek1(HIV) | compk(HIV) | comp’k(H..) | G1
Ek12(2) | compn(2) | comp’n(2) | Ek12(30) | compn(30) | comp’n(30) | Ek12(Cancer) | compk(Cancer) | comp’k(Ca..) | G1, G2
Ek23(3) | compn(3) | comp’n(3) | Ek23(40) | compn(40) | comp’n(40) | Ek23(Asthma) | compk(Asthma) | comp’k(As..) | G2, G3
Ek1(4) | compn(4) | comp’n(4) | Ek1(38) | compn(38) | comp’n(38) | Ek1(Asthma) | | |
SELECT ID, Age, Diag FROM Patient WHERE Age > 35 AND Diag LIKE ‘Asthma’ ORDER BY Age ASC
level = 4: G1; role = doctor: G3

30 Evaluation
Performance of a fully encrypted database: TPC-C query workload.
Analyze access control functionality: web application CRIS.
DBMask: C++ / PostgreSQL 9.1 / Ubuntu 12.04
NTL library / boolstuff library
data-col: 128-bit AES (CBC)
match-col: 128-bit keys
8 cores / 8 GB RAM

31 TPC-C Throughput
DBMask-PER: 33%; CryptDB: 34%; DBMask-SEC: 68%

32 Latency, Bandwidth, Space
Latency: 44% increase on server, proxy adds 4.3 ms (67% rewriting)
Bandwidth: 2.8x (worst case)
To assess latency, we measure the processing time of the same type of queries used above by studying the intervals at each stage of processing, namely at the server and at the proxy. Fig 3 shows server and proxy latency for several query operations with DBMask-PER as the underlying scenario. We observe an overall increase of 44% on the server side. The proxy adds on average 4.3 ms, of which 24% is spent in encryption/decryption and the largest share (67%) in query rewriting, parsing and processing. To assess the bandwidth utilization incurred by the split execution of queries between proxy and server in DBMask, we evaluate the average data transferred from the server to the proxy. Fig 4 shows the average data transferred for those queries that cannot be processed entirely on the server by DBMask. In the worst case, the data transferred is 2.8x the data transferred from the server when plaintext queries are executed entirely on the server. This does not significantly increase the bandwidth requirements between the proxy and the server. Table 6 shows the amount of disk space used on the server by the plaintext database, DBMask and CryptDB. DBMask increases the amount of data stored by 3.2 times and hence would not result in a significant increase in storage cost. This is also relatively small in comparison to the space overhead imposed by CryptDB (4.5x).

33 CRIS – Web Application
Column: 64%; Row: 36%; Cell: 79%
CRIS is a web based application that provides an easy-to-use system for managing and sharing scientific data. The data residing in CRIS, in the form of projects, experiments and jobs, is sensitive and hence must be protected from unauthorized usage. To test the functionality of DBMask, we select a workspace, which acts as a container for all activities and data managed by a single group of scientists, consisting of 19 users. We define four ACPs based on six attribute conditions over user identity attributes that capture the access control requirements of this particular workspace in CRIS. To evaluate the performance overhead imposed by DBMask and study the influence of access control, we run plaintext queries on Postgres and compare them to running queries on DBMask at different granularities of access control, namely column level, row level and cell level. The CRIS database has 87 tables with 298 columns in total. Figure 5 shows the effect on throughput of running CRIS on Postgres in comparison to DBMask with the underlying scenario DBMask-PER. Each HTTP request by a logged-in user consists of multiple queries in order to allow a user to create, read, update and/or delete a project(s), experiment(s) or job(s). The results show a throughput loss of 64% for column level, 36% for row level and 79% for cell level access control with DBMask, while a logged-in user is only able to access the objects it is permitted to access. We consider this to be a reasonable overhead considering the gains in confidentiality and privacy. The finer granularity of row level over column level results in better performance, as row level consumes less disk space and is able to take advantage of indexing to speed up table scans. Our scheme …

34 Summary
DBMask: support for fine grained access control and encrypted query processing using a novel column per column approach
Modest overhead, no modification to internals of DBMS

35 Thanks, questions?

