1
Integrity Coded Databases (ICDB)
Ensuring Correctness and Freshness of Outsourced Databases
Ujwal Karki, Graduate Student
Advisor: Dr. Jyh-haw Yeh
Department of Computer Science, Boise State University
2
Cloud Computing
Internet-based computing for shared processing, resources and data on demand
“The worldwide public cloud services market is projected to grow 18 percent in 2017 to total $246.8 billion, up from $209.2 billion in 2016, according to Gartner, Inc.”
“74% of Tech Chief Financial Officers (CFOs) say cloud computing will have the most measurable impact on their business in 2017.”
3
Cloud Computing
Reasons for growing popularity:
- Avoids upfront infrastructure costs
- Meets fluctuating and unpredictable business demand
- Requires minimal management effort
- Pay-as-you-go (PAYG) model (charges based on usage)
4
Cloud Computing
When we think of security, we probably think of bad guys or outsiders
What if the “bad guy” is authorized to use the system?
Risks from Insiders
- The data owner has no control over the data once it is outsourced
- System administrators have total access to your data
- They could steal, modify or even destroy sensitive information
- State-of-the-art technologies exist for external attacks
- A high risk of potential insider attacks still exists
5
Thesis Statement
Unauthorized modification to the outsourced database cannot be prevented
We present Integrity Coded Databases (ICDB), which allow the data owner to detect an insider’s modifications
Integrity: A data item returned from the cloud should be original, without unauthorized modification
Integrity Code (IC): ICs are codes generated by applying cryptographic functions to the data, a unique Serial Number and a Secret Key from the data owner
Use of the secret key allows only the data owner to generate ICs and verify the integrity of the data
6
Integrity Coded Databases (ICDB)
Standard Database Table
dept_no  dept_name
d001     Marketing
d002     Finance

ICDB Table
dept_no  dept_name  dept_no_ic  dept_no_serial  dept_name_ic   dept_name_serial
d001     Marketing  IC(d001)    10004           IC(Marketing)  10006
d002     Finance    IC(d002)                    IC(Finance)    10007
7
Integrity Coded Databases (ICDB)
Key Idea of ICDB:
- Store Integrity Codes (IC) in the database, alongside the data
- Each query fetches data along with the corresponding IC and serial number
- Verification: recompute the IC and compare it to the IC returned

ICDB Table
dept_no  dept_name  dept_no_ic  dept_no_serial  dept_name_ic   dept_name_serial
d001     Marketing  IC(d001)    10004           IC(Marketing)  10006
d002     Finance    IC(d002)                    IC(Finance)    10007
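The store, fetch and recompute cycle above can be sketched in a few lines. This is a minimal illustration, assuming HMAC-SHA256 as the IC function and a "|" separator between data and serial; the key value and the helper names `make_ic` and `verify` are hypothetical and not taken from the slides.

```python
import hashlib
import hmac

SECRET_KEY = b"data-owner-secret"  # hypothetical key, held only by the data owner


def make_ic(data: str, serial: int, key: bytes = SECRET_KEY) -> str:
    """Compute an integrity code over a data item and its serial number."""
    message = f"{data}|{serial}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(data: str, serial: int, returned_ic: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the IC locally and compare it with the IC returned by the cloud."""
    return hmac.compare_digest(make_ic(data, serial, key), returned_ic)


# The owner stores the IC next to the data when the row is outsourced.
stored_ic = make_ic("Marketing", 10006)

print(verify("Marketing", 10006, stored_ic))  # True: data unchanged
print(verify("Sales", 10006, stored_ic))      # False: data was modified
```

Because the key never leaves the data owner, an insider who edits the stored value cannot produce a matching IC for it.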
8
ICDB Concerns
Correctness: Returned data should be original, and not forged
Freshness: Returned data should be current and not include previously removed data
Completeness: All data items satisfying query conditions should be returned
9
Related Works
File system: The hash values are stored at a secure local hash repository for verification
RDBMS: Signature per tuple for correctness and a separate table signature for completeness
Using timestamp and probabilistic method: Freshness is detected by checking for deleted old fake tuples in the result
Use of signature chain for authenticating outsourced databases: Focuses on the completeness problem for range queries only
Integrity protection using Authenticated Data Structure based techniques: Focused on Merkle Hash Tree based data integrity techniques
- ICDB has an efficient technique for freshness, offers multiple schemes and is transparent to the database server
10
Focus of the Thesis
How effectively can ICDB ensure data correctness and freshness?
How much additional memory is required to store integrity codes?
- each data item to be protected has a corresponding IC and serial
How much additional time and data is required for verification compared to a standard DB query?
- additional computation is needed to regenerate the IC and compare it with the fetched IC
11
ICDB Models
Basic ICDB Model
Dual Mode Verification (DMV) Model
- built on top of the basic model
- adds the ability to verify query results in aggregate
12
Basic ICDB Model
Entities: Cloud Database Server (CDS), ICDB Client
Steps 1 & 2: Create ICDB instance and outsource to CDS
Steps 3 & 4: SQL query converted to ICDB query by the Query Conversion component and forwarded to CDS
13
Basic ICDB Model
Step 5: CDS returns the query result plus ICs and serials
Step 6: Integrity verification result presented to the user
14
Dual Mode Verification (DMV) Model
Entities
- Cloud Database Server (CDS)
- ICDB Cloud Application (CA)
- ICDB Client
ICDB Cloud Application (CA), in addition to CDS
- generates the Aggregate Integrity Code (AIC) required for AV
Aggregate Verification (AV) Mode: verify fetched data as a whole
- reduces the network load
Detailed Verification (DV) Mode:
- verify each fetched tuple or attribute data
15
Dual Mode Verification (DMV) Model
Aggregate Verification
Steps 1 & 2: DB to ICDB instance conversion remains the same as in the basic model
Steps 3 & 4: DMV converts the SQL query to ICDB queries Q1 & Q2
- Q1 is sent to CDS to fetch the SQL result plus serials
- Q2 is sent to CA as a delegate to fetch ICs and then generate the AIC (steps 5 & 6)
16
Dual Mode Verification (DMV) Model
Aggregate Verification
Step 7: CA generates the AIC and sends it to the ICDB client
Step 8: ICDB client generates an AIC from the result of Q1 (data plus serials)
- matches it against the AIC from CA for aggregate verification
17
Dual Mode Verification (DMV) Model
Detailed Verification
Steps 9 & 10: UI display
- If AV fails, DV is optional.
- If DV is not chosen by the user, the entire dataset is discarded.
If DV is chosen by the user:
Steps 11 & 12: All ICs for the data items (fetched earlier by Q1) need to be fetched by the ICDB client using Q2
18
Dual Mode Verification (DMV) Model
Detailed Verification
Steps 13 & 14: Q1 plus Q2 results are forwarded for detailed verification.
- individual corrupted data items are presented to the user.
Note:
- DV can detect which particular tuple or attribute has been altered
- AV can detect whether any data in the whole dataset has been altered, but not which item in particular
19
ICDB construction
Integrity Code (IC) (correctness), Serial Number (freshness)
IC(d) = G(m, k)
- ‘d’ is the data item, ‘k’ is the secret key
- ‘IC(d)’ is the integrity code of data ‘d’
- ‘m’ is the collection of information related to ‘d’
- ‘G’ is the IC generating function
The pair <IC(d), s> is defined as an IC unit, where ‘s’ is the unique Serial Number.
- the data owner keeps a list of serials that are revoked/invalid.
- if a query returns data with a valid IC but a revoked serial, the data is not fresh.
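The two checks on an IC unit (does the IC match, and is the serial still valid) can be sketched as follows. This is only an illustration: `G` is implemented here with HMAC-SHA256 as a stand-in, `m` is simplified to the data plus its serial, and the function name `check_ic_unit` and the key are hypothetical.

```python
import hashlib
import hmac


def G(m: str, k: bytes) -> str:
    """IC generating function; HMAC-SHA256 is a stand-in chosen for this sketch."""
    return hmac.new(k, m.encode("utf-8"), hashlib.sha256).hexdigest()


def check_ic_unit(d: str, ic: str, s: int, k: bytes, revoked: set) -> str:
    """Classify a returned data item given its IC unit <IC(d), s>."""
    expected = G(f"{d}|{s}", k)  # simplified m: the data item plus its serial
    if not hmac.compare_digest(expected, ic):
        return "forged"  # IC does not match: correctness is violated
    if s in revoked:
        return "stale"   # valid IC but revoked serial: the data is not fresh
    return "ok"


k = b"owner-key"           # hypothetical secret key
revoked_serials = {10004}  # serials the data owner has revoked
old_ic = G("d001|10004", k)
new_ic = G("d001|10008", k)

print(check_ic_unit("d001", old_ic, 10004, k, revoked_serials))  # stale
print(check_ic_unit("d001", new_ic, 10008, k, revoked_serials))  # ok
print(check_ic_unit("d009", new_ic, 10008, k, revoked_serials))  # forged
```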
20
IC generating Algorithms
RSA HMAC CMAC
RSA
Uses the public key for encryption (or signature verification) and the private key for decryption (or signature generation)
‘m’ is the message, (N, x) is the public key and ‘y’ is the private key
- In practice, RSA keys are typically 1024 to 4096 bits
21
IC generating Algorithms
RSA HMAC CMAC
RSA
Supports homomorphic encryption (multiplication)
The homomorphic property allows operations on ciphertexts without the need for decryption.
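The multiplicative property can be written out in one line. The derivation below assumes textbook (unpadded) RSA with modulus N and private exponent y, matching the symbols used on the previous slide; writing the IC as a plain modular power of m is an assumption of this sketch.

```latex
\begin{align*}
\mathrm{IC}(m) &= m^{y} \bmod N \\
\mathrm{IC}(m_1)\cdot\mathrm{IC}(m_2) \bmod N
  &= m_1^{y}\, m_2^{y} \bmod N
   = (m_1 m_2)^{y} \bmod N
   = \mathrm{IC}(m_1 m_2 \bmod N)
\end{align*}
```

So a party without the private key can combine two ICs into the IC of the product of their messages.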
22
IC generating Algorithms
RSA HMAC CMAC
Keyed-Hash Message Authentication Code (HMAC)
- HMAC uses a cryptographic hash and a secret key
- HMAC in this work uses SHA-128 as its digest
- HMAC does not use the naive construction Hash(key || message)
- HMAC is therefore not subject to the length extension attacks that affect a plain hash
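To make the difference from Hash(key || message) concrete, the sketch below rebuilds HMAC by hand from its two nested hashes and checks it against Python's standard `hmac` module. SHA-256 is used here only because it is available in `hashlib`; it is an assumption of this example, not the digest named on the slide, and `manual_hmac` is a hypothetical helper.

```python
import hashlib
import hmac


def manual_hmac(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 built from its definition: H((K ^ opad) || H((K ^ ipad) || msg))."""
    block_size = 64  # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b"\x00")    # then padded with zeros to one block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()


key, msg = b"owner-key", b"d001|Marketing"
library_tag = hmac.new(key, msg, hashlib.sha256).digest()
naive_tag = hashlib.sha256(key + msg).digest()  # the construction HMAC avoids

print(manual_hmac(key, msg) == library_tag)  # True
print(naive_tag == library_tag)              # False
```

The outer hash over the inner digest is what stops an attacker from extending the message and continuing the hash computation.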
23
IC generating Algorithms
RSA HMAC CMAC
Cipher-based Message Authentication Code (CMAC)
- CMAC is a technique for constructing a MAC from a block cipher
- CMAC in this project uses AES-128 as its backing cipher
- It fixes security vulnerabilities of CBC-MAC, such as the variable-length message attack
- Given CBC-MAC pairs (m, t) and (m’, t’) for two messages, an attacker can generate a third message m’’ whose CBC-MAC will also be t’
24
ICDB Granularity Schemes
Based on granularity levels of integrity protection
One Code per Field (OCF)
- Each entity attribute has an IC
- If the data doesn’t match its IC, then the field entry is invalid
One Code per Tuple (OCT)
- Each tuple has an IC
- If the data doesn’t match its IC, then the tuple entry is invalid
The scheme basically defines how data is grouped to construct an IC
25
ICDB Granularity Schemes
One Code per Field (OCF)
For every field in a table, there must exist a field to store the corresponding IC and serial

ICDB-OCF table example
dept_no  dept_name  dept_no_ic  dept_no_serial  dept_name_ic   dept_name_serial
d001     Marketing  IC(d001)    10004           IC(Marketing)  10006
d002     Finance    IC(d002)                    IC(Finance)    10007
26
ICDB Granularity Schemes
IC generating function for a data item ‘d’ in OCF:
IC_OCF(d) = G(m, k) = G(T.A(e) + D + T.K(e) + A + T + s, k)
‘m’ is the collection of information related to data ‘d’ that includes:
- the data ‘d’ itself, represented by T.A(e) (field A’s value of entity ‘e’ in table ‘T’)
- a delimiter ‘D’
- the primary key value T.K(e) of the same entity ‘e’
- the name of field ‘A’, the name of table ‘T’, and the unique serial ‘s’ assigned to the IC
G(m, k) is the IC generating function using the data owner’s secret key ‘k’
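A small sketch of how the OCF message m might be assembled and why binding the primary key matters. It assumes "+" means string concatenation, uses "|" as the delimiter D and HMAC-SHA256 as G; the helper names, the key and the choice of `departments` as the table name are illustrative only.

```python
import hashlib
import hmac


def ocf_message(value, key_value, field, table, serial, delimiter="|"):
    """Build m = T.A(e) + D + T.K(e) + A + T + s as a single string."""
    return f"{value}{delimiter}{key_value}{field}{table}{serial}"


def ic_ocf(value, key_value, field, table, serial, k):
    """G(m, k), with HMAC-SHA256 standing in for the IC generating function."""
    m = ocf_message(value, key_value, field, table, serial)
    return hmac.new(k, m.encode("utf-8"), hashlib.sha256).hexdigest()


k = b"owner-key"  # hypothetical secret key
ic_marketing = ic_ocf("Marketing", "d001", "dept_name", "departments", 10006, k)

# Substitution attack: the value (with its IC) is copied into row d002.
# The client recomputes the IC with the primary key of the row it received.
recomputed = ic_ocf("Marketing", "d002", "dept_name", "departments", 10006, k)

print(ocf_message("Marketing", "d001", "dept_name", "departments", 10006))
print(ic_marketing == recomputed)  # False: the copied IC does not verify
```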
27
ICDB Granularity Schemes
One Code per Tuple (OCT)
Every table has additional fields to store the corresponding IC and serial
In OCF, each IC is large relative to the data field it protects.
In OCT, storage efficiency is improved by using a single IC per tuple.

dept_no  dept_name  ic                   serial
d001     Marketing  IC(d001,Marketing)   1005
d002     Finance    IC(d002,Finance)     1006
28
ICDB Granularity Schemes
One Code per Tuple (OCT)
IC_OCT(d) = G(m, k) = G(T.A1(e) + D + T.A2(e) + ... + T.An(e) + T + s, k) = G(d + T + s, k)
- d = (T.A1(e) + D + T.A2(e) + ... + T.An(e)) are the field values in a tuple
- ‘D’ is the delimiter between field values
- ‘m’ is the collection of information related to ‘d’
- ‘T’ is the table name and ‘s’ is a unique serial number
- G(m, k) is the IC generating function on ‘m’ using secret key ‘k’
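The tuple-level message can be sketched the same way. As before, this assumes string concatenation for "+", "|" as the delimiter D, and HMAC-SHA256 as a stand-in for G; `oct_message`, `ic_oct` and the table name are hypothetical.

```python
import hashlib
import hmac


def oct_message(values, table, serial, delimiter="|"):
    """Build m = T.A1(e) + D + ... + T.An(e) + T + s for one tuple."""
    return delimiter.join(values) + table + str(serial)


def ic_oct(values, table, serial, k):
    """One IC for the whole tuple, with HMAC-SHA256 standing in for G."""
    m = oct_message(values, table, serial)
    return hmac.new(k, m.encode("utf-8"), hashlib.sha256).hexdigest()


k = b"owner-key"  # hypothetical secret key
row_ic = ic_oct(["d001", "Marketing"], "departments", 1005, k)

print(oct_message(["d001", "Marketing"], "departments", 1005))
# Changing any single field invalidates the IC of the whole tuple.
print(row_ic == ic_oct(["d001", "Sales"], "departments", 1005, k))  # False
```

The trade-off is visible here: one IC covers the row, but a mismatch only says that something in the tuple changed, not which field.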
29
ICDB Conversion Schema Conversion Data Conversion Query Conversion
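Schema Conversion (OCF)
Before: dept_no  dept_name
Applied to each column (pseudo-SQL template; the new column names are formed by appending '_ic' and '_serial' to column_name):
ALTER TABLE table_name
  ADD COLUMN CONCAT(column_name, '_ic') TEXT NOT NULL,
  CONCAT(column_name, '_serial') TEXT NOT NULL
  AFTER column_name;
After: dept_no  dept_name  dept_no_ic  dept_no_serial  dept_name_ic  dept_name_serial

Schema Conversion (OCT)
Before: dept_no  dept_name
After: dept_no  dept_name  ic  serial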
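Since the ALTER TABLE template above is pseudo-SQL, one way to turn it into concrete statements is to generate them per column. The sketch below only builds MySQL-style statement strings and does not execute them; the function names are hypothetical, and the resulting column order (each IC and serial placed right after its column) is one possible layout rather than the one shown on the slide.

```python
def ocf_alter_statements(table, columns):
    """One ALTER TABLE per column, adding <col>_ic and <col>_serial (OCF)."""
    statements = []
    for col in columns:
        statements.append(
            f"ALTER TABLE {table} "
            f"ADD COLUMN {col}_ic TEXT NOT NULL AFTER {col}, "
            f"ADD COLUMN {col}_serial TEXT NOT NULL AFTER {col}_ic;"
        )
    return statements


def oct_alter_statement(table):
    """A single ALTER TABLE adding one ic and one serial column (OCT)."""
    return (
        f"ALTER TABLE {table} "
        "ADD COLUMN ic TEXT NOT NULL, "
        "ADD COLUMN serial TEXT NOT NULL;"
    )


for stmt in ocf_alter_statements("departments", ["dept_no", "dept_name"]):
    print(stmt)
print(oct_alter_statement("departments"))
```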
30
ICDB Conversion Schema Conversion Data Conversion Query Conversion
The data of each table is copied to a text file. An Integrity Code for each data item is created based on the level of protection granularity and saved in a new file. The converted data is then uploaded to the ICDB in the cloud.
31
ICDB Conversion Schema Conversion Data Conversion Query Conversion
Schema conversion and data conversion are the same for both the Basic ICDB Model and the DMV Model
Query conversion for the Basic ICDB Model is different from that of the DMV Model
For both models, the ICDB query is derived from the standard SQL query based on the level of protection granularity
32
ICDB Conversion Query Conversion for Basic-OCF (Algorithm A)
Input (an SQL query): attribute names, attribute names in condition
Output (an OCF-Basic query): attribute names, key attribute names, attribute names in condition, plus the serials and ICs for each
33
ICDB Conversion Query Conversion for Basic-OCF (example SELECT query conversion)
Original SQL query:
SELECT salary FROM salaries WHERE emp_no = 1001;
Applying Algorithm A, the converted OCF-Basic query is:
SELECT salary, emp_no, from_date, salary_IC, salary_serial, emp_no_IC, emp_no_serial, from_date_IC, from_date_serial FROM salaries WHERE emp_no = 1001;
34
ICDB Conversion Query Conversion for Basic-OCF (example aggregate function query conversion)
Original SQL query:
SELECT SUM(salary) FROM salaries;
Applying Algorithm A, the converted OCF-Basic query is:
SELECT salary, emp_no, from_date, salary_IC, salary_serial, emp_no_IC, emp_no_serial, from_date_IC, from_date_serial FROM salaries;
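The two conversions above follow the same pattern, which can be sketched as a small query builder. This is not Algorithm A itself (the slides do not list its steps): it does no SQL parsing and assumes the caller already knows the selected attributes, the table's key attributes and the attributes used in the condition. The function names are hypothetical.

```python
def dedupe(names):
    """Remove duplicates while keeping first-seen order."""
    seen, out = set(), []
    for name in names:
        if name not in seen:
            seen.add(name)
            out.append(name)
    return out


def convert_ocf_basic(select_attrs, key_attrs, cond_attrs, table, where=None):
    """Build an OCF-Basic query: data columns first, then each column's IC and serial."""
    cols = dedupe(list(select_attrs) + list(key_attrs) + list(cond_attrs))
    extras = []
    for col in cols:
        extras += [f"{col}_IC", f"{col}_serial"]
    query = f"SELECT {', '.join(cols + extras)} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query + ";"


# SELECT salary FROM salaries WHERE emp_no = 1001;
print(convert_ocf_basic(["salary"], ["emp_no", "from_date"], ["emp_no"],
                        "salaries", "emp_no = 1001"))
# SELECT SUM(salary) FROM salaries;  (the aggregate is evaluated by the client)
print(convert_ocf_basic(["salary"], ["emp_no", "from_date"], [], "salaries"))
```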
35
ICDB Conversion Query Conversion for DMV-OCF
- makes use of two different cloud services: Cloud Database Server (CDS) and ICDB Cloud Application (CA)
- CA requires only the ICs to generate an Aggregate Integrity Code (AIC)
- ICDB client fetches data plus serials from CDS
- two different modes of verification: Aggregate Verification (AV) and Detailed Verification (DV)
36
ICDB Conversion Query Conversion for DMV-OCF (Algorithm B)
AV Mode: issue two different queries Query Q1 to CDS and Q2 to CA
37
ICDB Conversion Query Conversion for DMV-OCF (Algorithm B)
For detailed verification, the same Q2 is issued to CDS to fetch ICs
38
ICDB Conversion Query Conversion for Basic-OCT (Algorithm C)
Input (an SQL query) Output (an OCT-Basic query)
39
ICDB Conversion Query Conversion for Basic-OCT (example SELECT query conversion)
Original SQL query:
SELECT salary FROM salaries WHERE emp_no = 1001;
Applying Algorithm C, the converted OCT-Basic query is:
SELECT emp_no, salary, from_date, to_date, salaries_IC, salaries_serial FROM salaries WHERE emp_no = 1001;
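For OCT the conversion has to pull the whole tuple, since the single IC covers every field. A minimal builder for the example above might look like this; it is an illustration rather than Algorithm C, the caller supplies the table's full column list, and the `<table>_IC` / `<table>_serial` naming simply follows the converted query shown on this slide.

```python
def convert_oct_basic(table, all_columns, where=None):
    """Build an OCT-Basic query: every column of the tuple plus its IC and serial."""
    cols = list(all_columns) + [f"{table}_IC", f"{table}_serial"]
    query = f"SELECT {', '.join(cols)} FROM {table}"
    if where:
        query += f" WHERE {where}"
    return query + ";"


# Even though the user asked only for salary, all four columns are fetched.
print(convert_oct_basic("salaries",
                        ["emp_no", "salary", "from_date", "to_date"],
                        "emp_no = 1001"))
```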
40
ICDB Conversion Query Conversion for DMV-OCT (Algorithm D)
- AV Mode: similar to DMV-OCF, issue two different queries
- Query Q1 to CDS and Q2 to CA
41
ICDB Conversion Query Conversion for DMV-OCT (Algorithm D)
-for detailed verification, the same Q2 is issued to CDS to fetch ICs
42
AIC generation and Verification
With RSA, the cloud application generates the AIC by homomorphic multiplication of the fetched ICs.
The ICDB client regenerates the AIC by applying the RSA algorithm to the aggregate of the returned data.
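A toy run of RSA-based aggregate verification is sketched below. It assumes textbook (unpadded) RSA, a deliberately tiny key (N = 61 * 53) and small integers standing in for the encoded messages m; the function names are hypothetical and none of this reflects the prototype's actual key sizes or encoding.

```python
from functools import reduce

# Toy key in the slides' notation: (N, x) is public, y is private.
N, x, y = 3233, 17, 2753  # N = 61 * 53; x * y = 1 mod 3120


def rsa_ic(m: int) -> int:
    """IC of one message, computed by the data owner with the private key."""
    return pow(m, y, N)


def aic_from_ics(ics):
    """Cloud application: multiply the fetched ICs modulo N (no key needed)."""
    return reduce(lambda a, b: (a * b) % N, ics, 1)


def aic_from_data(messages):
    """ICDB client: multiply the returned messages, then apply RSA once."""
    product = reduce(lambda a, b: (a * b) % N, messages, 1)
    return pow(product, y, N)


messages = [42, 1001, 77]               # stand-ins for encoded data items
ics = [rsa_ic(m) for m in messages]     # stored in the cloud next to the data

print(aic_from_ics(ics) == aic_from_data(messages))        # True
print(aic_from_ics(ics) == aic_from_data([43, 1001, 77]))  # False: data altered
```

The client performs a single private-key operation regardless of how many rows were returned.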
43
AIC generation and Verification
For MACs, the cloud application aggregates all ICs and hashes them (SHA-256) to generate the AIC.
The ICDB client has to regenerate the ICs for all the returned data and then generate the AIC from the regenerated ICs.
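The MAC-based aggregate check can be sketched as follows. It assumes the ICs are aggregated by feeding them to SHA-256 in result order, that both sides see the rows in the same order, and that HMAC-SHA256 is the per-item IC; the helper names and the key are hypothetical.

```python
import hashlib
import hmac


def mac_ic(data: str, serial: int, k: bytes) -> str:
    """Per-item IC; HMAC-SHA256 is a stand-in for the MAC used in the prototype."""
    return hmac.new(k, f"{data}|{serial}".encode("utf-8"), hashlib.sha256).hexdigest()


def aic(ics) -> str:
    """Aggregate IC: SHA-256 over all ICs, taken in result order."""
    h = hashlib.sha256()
    for ic in ics:
        h.update(ic.encode("utf-8"))
    return h.hexdigest()


k = b"owner-key"  # hypothetical secret key, known only to the client
rows = [("Marketing", 10006), ("Finance", 10007)]
stored_ics = [mac_ic(d, s, k) for d, s in rows]

aic_from_ca = aic(stored_ics)                               # cloud application side
aic_from_client = aic(mac_ic(d, s, k) for d, s in rows)     # client regenerates ICs

print(aic_from_ca == aic_from_client)  # True
tampered = [("Marketing", 10006), ("Sales", 10007)]
print(aic_from_ca == aic(mac_ic(d, s, k) for d, s in tampered))  # False
```

Only one short digest crosses the network from the CA, but the client still recomputes one MAC per returned item.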
44
Experimental Results and Analysis
Hardware and software used:
- Boise State University’s onyx server
- MySQL (MariaDB) with InnoDB as its database engine
- Java SE 1.8
- Bouncy Castle (an open source crypto library)
- MySQL’s publicly available Employees sample database (v1.0.6)
45
Experimental Results and Analysis
Integrity Protection
Forgery Attack: An attack that mutates or alters fields in a database.
-> An IC cannot be generated without the secret key of the data owner
Substitution Attack: An attack that modifies fields by substituting them with existing fields within the database.
-> All data to be protected is tied to its primary key and other properties such as attribute name and table name
46
Experimental Results and Analysis
Old Data Attack: An attack that returns data (along with a correct IC) that was previously stored but is no longer in the database.
-> Use of the ICRL (the list of revoked serials) detects this
Tuple Insertion/Deletion Attack: An attack that adds new rows or deletes existing rows.
-> Since forgery is detected, insertion is easily detected.
-> Deletion can be detected only with a completeness guarantee
47
Experimental Results and Analysis
Memory Penalty
48
Experimental Results and Analysis
Performance Penalty (Basic Model)
SELECT query process
- ICDB client converts the SELECT query to an ICDB SELECT query
- CDS executes the ICDB SELECT query and returns the result
- ICDB client verifies the returned result
Results of SELECT * query on the Employees.salaries table.
49
Experimental Results and Analysis
We can also interpret the Performance Penalty Rate as a Process Rate
Process Rate: How many MB of user data can be processed in one second?
- Total fetched user data size excludes ICs and serials
- Total process time = query conversion + query execution + query verification
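As a worked example of the definition above, the process rate is the user data size divided by the sum of the three phase times. The numbers below are made up for illustration and are not measurements from the experiments; `process_rate` is a hypothetical helper.

```python
def process_rate(user_data_mb, conversion_s, execution_s, verification_s):
    """MB of user data (excluding ICs and serials) processed per second."""
    total_time = conversion_s + execution_s + verification_s
    return user_data_mb / total_time


# Illustrative values only: 10 MB of user data, 0.5 s + 1.0 s + 2.5 s in total.
print(process_rate(10.0, 0.5, 1.0, 2.5))  # 2.5 (MB per second)
```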
50
Experimental Results and Analysis
Performance Penalty (Basic Model)
INSERT query process
- ICDB client converts the SQL INSERT query to an ICDB INSERT query
- this requires generating an IC and a unique serial for each data item to be protected
- CDS then executes the ICDB INSERT query
Results of INSERT query on the Employees.salaries table.
51
Experimental Results and Analysis
Performance Penalty (Basic Model)
DELETE query process
- ICDB client issues a SELECT query to fetch the to-be-deleted data from the database server
- Verify the fetched data
- If verified, execute the DELETE query
- Revoke the serial numbers in the ICRL
Results of DELETE query on the Employees.salaries table.
52
Experimental Results and Analysis
Performance Penalty (Basic Model)
JOIN query process
- ICDB client converts the SQL JOIN query to an ICDB JOIN query (using Algorithm A)
- CDS executes the ICDB JOIN query and returns the result
- ICDB client verifies the returned result
Results of JOIN query on the Employees.salaries and Employees.employees tables.
53
Experimental Results and Analysis
Performance Penalty (Basic Model)
Functional query process
- ICDB client issues a SELECT query to fetch the data required for evaluating the aggregate operation
- ICDB client verifies the fetched data
- ICDB client locally computes SUM, MIN, MAX, AVG or COUNT
Results of Functional query on the Employees.salaries table.
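The fetch, verify, then compute locally sequence can be sketched for SUM. The rows below stand in for the result of the converted SELECT query; the salary values, serials, key and the helper names `ic_for` and `verified_sum` are all hypothetical, and HMAC-SHA256 again stands in for the IC function.

```python
import hashlib
import hmac


def ic_for(value: int, serial: int, k: bytes) -> str:
    """IC of one salary value bound to its serial (HMAC-SHA256 stand-in)."""
    return hmac.new(k, f"{value}|{serial}".encode("utf-8"), hashlib.sha256).hexdigest()


def verified_sum(rows, k: bytes) -> int:
    """Verify every fetched (value, serial, ic) row, then compute SUM locally."""
    total = 0
    for value, serial, ic in rows:
        if not hmac.compare_digest(ic_for(value, serial, k), ic):
            raise ValueError(f"integrity check failed for serial {serial}")
        total += value
    return total


k = b"owner-key"  # hypothetical secret key
rows = [(v, s, ic_for(v, s, k)) for v, s in [(60117, 1), (62102, 2), (66074, 3)]]

print(verified_sum(rows, k))  # 188293
```

The aggregate is never trusted from the server; a single altered salary makes the whole computation fail before a total is produced.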
54
Experimental Results and Analysis
Performance Penalty (DMV Model)
SELECT query process (Aggregate Verification)
- ICDB client converts the SELECT query to ICDB SELECT queries
- Q1 is issued to CDS and Q2 to CA
- CA forwards Q2 to CDS to fetch all corresponding ICs and computes the AIC
- ICDB client, on retrieving all data and serials, computes the AIC and verifies it against the AIC from CA
Results of SELECT * query on the Employees.salaries table.
55
Experimental Results and Analysis
Performance Penalty (DMV Model)
DELETE query process (Aggregate Verification)
- ICDB client generates and issues ICDB SELECT queries Q1 & Q2 to verify the to-be-deleted data from the database server
- Verify using Q1 & Q2 as in the SELECT query process
- If verified, execute the DELETE query
- Revoke the serial numbers in the ICRL
Results of DELETE query on the Employees.salaries table.
56
Experimental Results and Analysis
Performance Penalty (DMV Model)
JOIN query process (Aggregate Verification)
- ICDB client converts the SQL JOIN query to ICDB JOIN queries
- Q1 is issued to CDS and Q2 to CA
- CA forwards Q2 to CDS to fetch all corresponding ICs and computes the AIC
- ICDB client, on retrieving all data and serials from CDS, computes the AIC and verifies it against the AIC from CA
Results of JOIN query on the Employees.salaries and Employees.employees tables.
57
Experimental Results and Analysis
Performance Penalty (DMV Model)
Functional query process (Aggregate Verification)
- ICDB client issues ICDB SELECT queries Q1 & Q2 to fetch and verify the data required for evaluating the aggregate operation
- Verify using Q1 & Q2 as in the SELECT query process
- ICDB client locally computes SUM, MIN, MAX, AVG or COUNT
Results of Functional query on the Employees.salaries table.
58
ICDB Summary
OCT vs. OCF
- OCT has a lower memory penalty and a lower performance penalty
- OCF can detect whether a particular field entry is corrupted, while OCT can only detect whether a tuple is corrupted
- OCT must fetch entire tuples even to verify a single column
HMAC-SHA vs. CMAC-AES vs. RSA
- The large (1024-bit) ICs for RSA incur a higher memory penalty than the MAC algorithms with 128-bit ICs
- RSA can be used for homomorphic operations on behalf of the data owner
- CMAC-AES has a slightly lower performance penalty rate than HMAC-SHA in all experiments and is the best among all
59
ICDB Summary SELECT vs. INSERT vs. DELETE vs. JOIN vs. Functional
From the smallest to the largest penalty (for both models): [DELETE < INSERT << SELECT ≈ JOIN < Functional]
- the symbol ‘<<’ means a notable (significant) increase
Basic Model vs. DMV Model
- DMV (AV mode) reduces the network overhead significantly (only a single IC is fetched)
- The memory penalty is the same for both models
- In DMV (AV mode), there is a huge performance improvement for the RSA algorithm (performance of RSA is similar to the MACs)
60
Ranking of ICDB Schemes
A 'High' ranking in the scale refers to the worst result (highest penalty rate) and a 'Low' ranking refers to the best result (least penalty rate) in our experiment.
61
Conclusion
- ICDB successfully protects data integrity for outsourced databases in clouds.
- Assigning a unique serial number to each IC ensures the freshness of outsourced data.
- We implemented a working ICDB prototype.
- We presented all the experimental results and analyzed their implications.
- We also investigated the pricing schemes (storage, data inflow/outflow and number of instances) of existing Database Service Providers so that we are able to suggest suitable ICDB schemes: Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure.
62
Conclusion
- When only outbound data is charged, the ICDB schemes in the AV mode of the DMV model are the best choice, as the outbound data is almost the same as for a standard SQL database
- For unpredictable DB size and transactions, the best choice would be CMAC-AES in OCT, which has the minimum performance penalty and increases the database size by only approximately 1.53 times
- Users can choose the scheme that best fits their requirements from the provided list of 12 ICDB schemes
63
THANK YOU Ujwal Karki, Graduate Student
64
Future Works
- More databases of different natures and larger sizes can be tested to gain a better understanding of performance.
- The experiment could be performed in a real cloud environment to study the performance.
- Our experiment is configured to communicate only with a MySQL database, but other database options can be tested (e.g., PostgreSQL, SQLite).
- Experiments based on different SELECT queries (not just SELECT *) could reduce the performance gap between OCT and OCF.
- Further research is necessary to assure the completeness of queried data returned from the CDS.
92
Related Works Securing storage in public cloud Infrastructure
File system:
- The hash values are stored at a secure local hash repository
- The data file is requested in order to regenerate its hash for verification
RDBMS:
- Signature per tuple for correctness and a separate table signature for completeness
- Uses an incremental signature scheme with XOR MACs
These works do not address freshness
93
Related Works Freshness guarantee for an Outsourced Database
Using timestamp and probabilistic method:
- A timestamp sets an expiration time for each signature, which must be updated when it expires
- Audits can insert new fake tuples and delete old ones. Freshness is detected by checking for deleted old fake tuples in the result.
Use of Trusted Third Party for Verification:
- The user compares the checksum (by the data owner) and sum' (by the service provider) to check for freshness and completeness.
ICDB uses a simple technique, keeping a list of revoked serials, to detect stale data
94
Related Works
Use of signature chain for authenticating outsourced databases:
- The construction achieves completeness by securely linking tuple-level signatures to form a so-called signature chain.
- No freshness guarantees are provided; the work focuses solely on the completeness problem for range queries.
- Sorting in ascending order prior to the signature computation adds extra computational overhead to both the database server (cloud) and the client.
95
Related Works
Integrity protection using Authenticated Data Structure based techniques
- Focused on Merkle Hash Tree based data integrity techniques
- Only the hashes of the nodes involved in computing the root hash are transmitted
- The computed hash is matched against the stored root hash
- The structure of the relational database is changed
- Tracing the nodes involved in the authentication path adds performance overhead
The ICDB model is more transparent to the database server
96
Experimental Results and Analysis
Process Rate
97
Conclusion Major Service Providers:
Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure.
- Charges are mostly based on storage, data inflow/outflow and number of instances
- Some cloud providers, such as Microsoft Azure, offer a packaged scheme
- Google Cloud and AWS offer free inbound data but charge for all outbound data
98
ICDB Conversion Query Conversion for Basic-OCF (example JOIN query conversion)
Original SQL query:
SELECT departments.dept_name, dept_emp.from_date, dept_emp.to_date
FROM departments, dept_emp
WHERE departments.dept_no = dept_emp.dept_no AND departments.dept_no = 'd002';
Converted OCF-Basic query:
SELECT departments.dept_name, dept_emp.from_date, dept_emp.to_date, departments.dept_no, dept_emp.emp_no,
  IC(departments.dept_name), S(departments.dept_name),
  IC(dept_emp.from_date), S(dept_emp.from_date),
  IC(dept_emp.to_date), S(dept_emp.to_date),
  IC(departments.dept_no), S(departments.dept_no),
  IC(dept_emp.emp_no), S(dept_emp.emp_no)
FROM departments, dept_emp
WHERE departments.dept_no = dept_emp.dept_no AND departments.dept_no = 'd002';