Secure Data Outsourcing

Presentation transcript:

Secure Data Outsourcing

Outline: Motivation (problem description); Background knowledge (review of cryptographic primitives); Research issues; Summary.

Motivation. Maintaining large data costs 4-5 times as much as acquiring it, and DBAs are paid well → more and more service providers. Low cost through cloud computing: one database is maintained for multiple users instead of for a single user. Examples: Alentus.com, Datapipe.com, Discountasp.net, ... Concerns about data security and privacy: the service provider is untrusted.

Untrusted server. Lazy: incentives to perform less work. Curious: incentives to acquire information. Malicious: denial of service, incorrect results, possibly compromised.

Challenges: data confidentiality, access privacy, query assurance. Data confidentiality: data need to be encrypted (?); how to query protected data? (mapping, indexing). Access privacy: the SQL query; the access pattern (accesses to index/data). Query assurance: results must be correct, complete, and fresh.

Why is it hard? Arbitrary expressivity: SQL statements; often restricted to certain types of query for simplicity (e.g., range query, kNN query). Cost: communication; computation (server side vs. client side).

Data confidentiality: bucketization method (crypto-index); order-preserving encryption; perturbations.

Bucketization method Hacigumus (SIGMOD02)

Main steps: (1) Partition sensitive attributes. Order-preserving partition IDs support comparison; with random IDs, query rewriting becomes hard. (2) Build an index on the partitions. (3) Rewrite queries to target partitions, e.g. 'john doe' → 105, giving: Select * from T' where name=105. (4) Execute queries and return results. (5) Prune/post-process results on the client.
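
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheme from Hacigumus (SIGMOD02): the key, the partition boundaries, the `age` attribute, and all function names are hypothetical, and `seal` is only a placeholder where a real system would encrypt the row.

```python
import base64
import bisect
import hashlib
import hmac
import json

KEY = b"demo-key"            # hypothetical client-side secret
BOUNDS = [20, 40, 60, 80]    # hypothetical partition boundaries for an "age" attribute

def label(idx):
    """Opaque (non-order-preserving) label for partition number idx."""
    return hmac.new(KEY, str(idx).encode(), hashlib.sha256).hexdigest()[:16]

def seal(row):
    """Placeholder for real encryption of the whole row; base64 is NOT secure."""
    return base64.b64encode(json.dumps(row).encode()).decode()

def unseal(blob):
    return json.loads(base64.b64decode(blob))

def outsource(rows):
    """Client side: build the server table T' of (bucket label, sealed row)."""
    return [(label(bisect.bisect_right(BOUNDS, r["age"])), seal(r)) for r in rows]

def rewrite_range(lo, hi):
    """Client side: rewrite lo <= age <= hi into the set of bucket labels to fetch."""
    first = bisect.bisect_right(BOUNDS, lo)
    last = bisect.bisect_right(BOUNDS, hi)
    return {label(i) for i in range(first, last + 1)}

def server_select(table, labels):
    """Server side: the lookup uses only the bucket labels."""
    return [blob for lab, blob in table if lab in labels]

rows = [{"name": n, "age": a} for n, a in
        [("a", 15), ("b", 25), ("c", 38), ("d", 41), ("e", 59), ("f", 77), ("g", 90)]]
table = outsource(rows)
candidates = server_select(table, rewrite_range(30, 45))               # superset of the answer
result = [r for r in map(unseal, candidates) if 30 <= r["age"] <= 45]  # client-side pruning
```

The server returns every row in the two touched buckets (four rows here), and the client prunes them down to the two rows that actually satisfy the predicate. Wider buckets would return more false positives, which is the overhead side of the trade-off.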

Trade-off between confidentiality and overhead: larger partitions → increased privacy → increased overhead.

Order-preserving encryption (Agrawal 2004, Boldyreva 2009). The data set is securely transformed so that the order is preserved but the distribution and domain are changed. Benefit: indexing and searching work directly on OPE-encrypted data. Weakness: once the original distribution is known, OPE is broken.

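Not attribute-wise order preserving. Order-preserving encryption (OPE, Agrawal et al. 2004) is not resilient to distribution-based attacks: when the original distribution of X_i is known, it can be matched against the distribution of the transformed X_i' (bucket-based estimation). [Figure: the known original X_i distribution and the transformed X_i' distribution under OPE.]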
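
The weakness can be illustrated with a toy example. The sketch below is not the Agrawal or Boldyreva construction: `make_ope` is a hypothetical stand-in that simply draws a random strictly increasing table over a small integer domain. The attacker is assumed to know the exact multiset of plaintext values, which is the strongest form of "the original distribution is known"; with only an approximate distribution the recovery would be approximate.

```python
import random

def make_ope(domain, seed=0):
    """Toy order-preserving mapping: random positive gaps give a strictly increasing table."""
    rng = random.Random(seed)
    table, c = {}, 0
    for v in sorted(domain):
        c += rng.randint(1, 1000)
        table[v] = c
    return table

ope = make_ope(range(100))
rng = random.Random(1)
plaintexts = [rng.randint(0, 99) for _ in range(50)]
ciphertexts = [ope[v] for v in plaintexts]

# Order is preserved, so comparisons and range indexes work on ciphertexts.
order_ok = all((a < b) == (ope[a] < ope[b]) for a in range(100) for b in range(100))

# Attack: align ciphertexts and known plaintext values by rank.
known_values = sorted(plaintexts)                        # attacker's background knowledge
guess = dict(zip(sorted(ciphertexts), known_values))
recovered = [guess[c] for c in ciphertexts]
```

Because the mapping is monotone, the i-th smallest ciphertext must encrypt the i-th smallest plaintext, so `recovered` equals `plaintexts` without the key ever being touched.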

Perturbation-based methods: multiplicative perturbations; RASP perturbation for query services (range query, kNN query) (Xu 2014).

[Figure: RASP framework for confidential query services in the cloud. The data owner transforms the data D into D' = F(D) and sends D' to the honest-but-curious cloud. Authorized users on the trusted client transform a query q into q' = Q(q); the cloud evaluates H(q', D') and returns the result R'; the client recovers R = G(R').]

RASP perturbation: k-dimensional numeric data with n records, represented as a k x n matrix; x denotes a record.
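
The transcript carries no formula for the perturbation itself. As an assumption drawn from the published RASP description (Xu 2014) rather than from this slide, and consistent with the later slides' references to A^{-1}, R^(k+2), and an always-positive X_{k+2}, the transformation of a record x is usually written as:

```latex
y = F(x) = A \begin{pmatrix} E_{\mathrm{ope}}(x) \\ 1 \\ v \end{pmatrix},
\qquad A \in \mathbb{R}^{(k+2)\times(k+2)} \ \text{invertible}, \quad v > 0 \ \text{random}
```

Here E_ope is an order-preserving encryption applied per dimension, the constant 1 is the (k+1)-th dimension, and the random positive value v is the (k+2)-th dimension; A and the OPE key together form the secret key.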

Properties: not an OPE. Preserves convexity of the dataset: a convex dataset in R^k → another convex dataset in R^(k+2). Good for range queries: each range query in R^k → hyperplane-based query → range query in R^(k+2).

RASP properties: convexity preserving. The queried range (a hypercube) is convex, and RASP transforms the range into another convex set (a polyhedron). A half-space w^T x <= a, bounded by the hyperplane w^T x = a, is convex, and the intersection of convex sets is also convex.

[Figure: illustration of convexity preserving, with panels for the original space, the OPE space, and the perturbed space. In the OPE space, X_i < a ⇔ E(X_i) < E(a).]

Secure query transformation: a naive solution, based on the convexity-preserving property. Problems: (1) A^{-1} can be probed; (2) if a is known, the whole dimension i is breached.

Secure query transformation: enhanced solution. X_{k+2} is always positive, so (X_i - a) ≤ 0 ⇔ (X_i - a) X_{k+2} ≤ 0. Correspondingly, in the encrypted space, y^T Θ y ≤ 0. Problems addressed: (1) A^{-1} cannot be derived from Θ; (2) (X_i - a) X_{k+2} ≤ 0 contains the random component X_{k+2}, which protects the condition (X_i - a) ≤ 0.
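
A small numeric sketch can make the enhanced condition concrete. It assumes, following the published RASP description rather than this transcript, that a record is perturbed as y = A z with z = (X_1, ..., X_k, 1, X_{k+2}), and it omits the OPE step. The query matrix, called `theta` here, is built as (A^{-1})^T u w^T A^{-1}, where u^T z = X_i - a and w^T z = X_{k+2}. The matrix `A`, the sample records, and all function names are hypothetical.

```python
from fractions import Fraction
import random

def inverse(m):
    """Gauss-Jordan inverse with exact Fraction arithmetic."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

K = 2                                   # data dimensionality
A = [[2, 1, 0, 1],
     [1, 3, 1, 0],
     [0, 1, 2, 1],
     [1, 0, 1, 3]]                      # hypothetical secret (k+2) x (k+2) matrix, det = 12
A_INV = inverse(A)

def perturb(x, rng):
    """Data owner: y = A (x_1, ..., x_k, 1, v) with a fresh random v > 0."""
    v = Fraction(rng.randint(1, 100), 7)
    z = [Fraction(c) for c in x] + [Fraction(1), v]
    return [sum(a * b for a, b in zip(row, z)) for row in A]

def query_matrix(i, a):
    """Client: theta for the condition X_i <= a (i is 0-based)."""
    u = [Fraction(0)] * (K + 2)
    u[i], u[K] = Fraction(1), Fraction(-a)          # u^T z = X_i - a
    w = [Fraction(0)] * (K + 2)
    w[K + 1] = Fraction(1)                          # w^T z = X_{k+2}
    p = [sum(A_INV[r][j] * u[r] for r in range(K + 2)) for j in range(K + 2)]  # (A^-1)^T u
    q = [sum(A_INV[r][j] * w[r] for r in range(K + 2)) for j in range(K + 2)]  # (A^-1)^T w
    return [[pj * ql for ql in q] for pj in p]

def satisfies(theta, y):
    """Server: evaluate y^T theta y <= 0 on the perturbed record only."""
    n = len(y)
    return sum(y[j] * theta[j][l] * y[l] for j in range(n) for l in range(n)) <= 0

rng = random.Random(0)
theta = query_matrix(0, 50)                         # condition X_1 <= 50
records = [(10, 99), (50, 3), (51, 0), (75, 20)]
matches = [x for x in records if satisfies(theta, perturb(x, rng))]
```

Since y^T theta y expands to (X_i - a) X_{k+2} and X_{k+2} is positive, the server's test agrees with X_i <= a for every record, whatever random value was drawn.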

[Figure: efficient two-stage query processing, shown in the original space and the transformed space. Stage 1: query the bounding box. Stage 2: filter out the junk records.] A multidimensional tree index is built on the encrypted data (in the transformed space) on the server.

Stage 1: the client calculates the large bounding box, and the server uses the index to find the initial results. Stage 2: filter the initial results with the conditions y^T Θ_i y ≤ 0 for i = 1, ..., 2m. Note: the two-stage strategy works if the output of stage 1 is significantly smaller than the original database and fits into memory. Otherwise, use a linear scan with stage 2 filtering.

Access pattern privacy on database queries. The problem is the same as PIR. Attackers may use the access pattern to breach data confidentiality. Each of the previous approaches should handle this problem!

PIR is impractical. Solutions based on private information retrieval (PIR) exist, but PIR is still impractical.

For the bucketization approach: based on the architecture of Hacigumus (SIGMOD02); Hore VLDB04 (paper 138), for range queries. Privacy concern: the distribution of values in each bucket is revealed. "Diffusion": split buckets and combine parts of different buckets. Trade-off: the server now needs to return noisier results, so the result size is larger.

For OPE: queries are protected (assuming the original distribution is unknown). The access pattern is not protected and may give some information that helps break the mapping (e.g., by estimating the original distribution); no study yet.

For RASP: queries are protected, but the privacy of the access pattern is not preserved.

Integrity checking. Common methods: checksums, hash functions, hash trees, hash chains.

Integrity guarantee: Merkle hash tree. A parent node stores H(H(x1)+H(x2)), where + is string concatenation. The tree can be stored with tree-like structures such as indexes and XML.
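
A minimal sketch of the construction, using SHA-256 for H. The function names are hypothetical, and pairing an unpaired node with itself is one common convention assumed here; the slide does not say how odd levels are handled.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(items):
    """All tree levels, leaf hashes first; an unpaired node is paired with itself."""
    level = [h(x) for x in items]
    levels = [level]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels

def prove(levels, idx):
    """Verification object: sibling hashes on the path from leaf idx to the root."""
    proof = []
    for level in levels[:-1]:
        sib = idx ^ 1
        sib_hash = level[sib] if sib < len(level) else level[idx]
        proof.append((sib_hash, sib < idx))      # (hash, sibling is on the left)
        idx //= 2
    return proof

def verify(item, proof, root):
    """Client: recompute the root from one item and its sibling hashes."""
    cur = h(item)
    for sib_hash, sib_is_left in proof:
        cur = h(sib_hash + cur) if sib_is_left else h(cur + sib_hash)
    return cur == root

items = [b"x1", b"x2", b"x3", b"x4", b"x5"]
levels = build_levels(items)
root = levels[-1][0]                              # the only value the client must trust
all_ok = all(verify(x, prove(levels, i), root) for i, x in enumerate(items))
tampered_ok = verify(b"x9", prove(levels, 2), root)
two_leaf_root = build_levels([b"x1", b"x2"])[-1][0]
```

The client keeps only `root`; the server ships each returned item together with its logarithmic-size list of sibling hashes, and any modified item fails the recomputation.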

Hash chains

Applications: query correctness with Merkle trees.

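Using a Merkle tree. Example: for the range query 5 <= q <= 10, the boundary values are GLB(q) = 4 (the greatest value below the range) and LUB(q) = 11 (the least value above the range).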
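
The role of the two boundary values can be sketched as follows. The leaf values are hypothetical, chosen so that the values adjacent to the range 5..10 are 4 and 11 as in the example, and the function name is an assumption. Returning the neighbours on both sides lets the client see that the answer is a contiguous run of leaves with nothing omitted at either end, once each returned value has been authenticated against the Merkle root.

```python
import bisect

def range_with_boundaries(sorted_values, lo, hi):
    """Server: answer lo <= v <= hi plus the neighbouring value on each side."""
    left = bisect.bisect_left(sorted_values, lo)
    right = bisect.bisect_right(sorted_values, hi)
    below = sorted_values[left - 1] if left > 0 else None                 # greatest value < lo
    above = sorted_values[right] if right < len(sorted_values) else None  # least value > hi
    return below, sorted_values[left:right], above

leaves = [1, 4, 6, 8, 9, 11, 15]          # hypothetical sorted leaf values
below, answer, above = range_with_boundaries(leaves, 5, 10)
```

With these leaves the server returns 4, the answer 6, 8, 9, and 11. A server that silently dropped 9 could no longer present a run of adjacent, verifiable leaves from 4 to 11.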

Operations: selections, projections, equijoins, set operations. Issues: works only on data with verification objects; query expressiveness; expensive. Related work: Pang et al. (ICDE04, SIGMOD05), using the ElGamal function; Sion VLDB05: challenge token; F. Li SIGMOD06: freshness.

Trusted hardware

Possible benefits

Discussion. Data confidentiality/access pattern: a strict cryptographic definition (keyword search) or a relaxed definition (perturbation, bucketization, OPE, etc.). It is very difficult to formulate and prove the security of non-traditional approaches. Do we need to reformulate the security model, and how?