CPT-S 580-06 Advanced Databases 1 Yinghui Wu EME 49 ADB (ln29)

1 CPT-S 580-06 Advanced Databases 1 Yinghui Wu EME 49 ADB (ln29)

2 DBMS: privacy and security in the Cloud
Data security and privacy
Security and privacy in the cloud
Data confidentiality
Research challenges
CPT-S 580-08 Advanced Databases; adapted from “Secure and Privacy-preserving database services in the cloud”, Divy Agrawal et al., ICDE 2013 tutorial

3 Database systems: security & privacy issues

4 Access Control [Bertino et al. TDSC’05]
Problem statement: authorizing the scope of data (relations, attributes, tuples) that each DBMS user may access.
Discretionary access control
– Authorization administration policies, i.e., granting and revoking authorizations (centralized, ownership-based, etc.)
– Content-based access control, using views and query rewriting for fine-grained control
– Role-based access control: a role is a function with a set of permitted actions, and it has users as members
Mandatory access control
– Classification of objects and subjects (e.g., top secret, secret, unclassified)
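The role-based model on this slide can be sketched in a few lines. This is a minimal illustration only: the class name `RBAC`, its method names, and the example role and object names are hypothetical and do not come from the slides or from any particular DBMS.

```python
class RBAC:
    """Toy role-based access control: users get actions only through roles."""

    def __init__(self):
        self.role_permissions = {}  # role -> set of (action, object) pairs
        self.user_roles = {}        # user -> set of roles

    def grant(self, role, action, obj):
        # Granting an authorization to a role.
        self.role_permissions.setdefault(role, set()).add((action, obj))

    def revoke(self, role, action, obj):
        # Revoking an authorization from a role (no error if it was absent).
        self.role_permissions.get(role, set()).discard((action, obj))

    def assign(self, user, role):
        # Making a user a member of a role.
        self.user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, action, obj):
        # A user may act if any of the user's roles carries the permission.
        return any(
            (action, obj) in self.role_permissions.get(role, set())
            for role in self.user_roles.get(user, set())
        )


acl = RBAC()
acl.grant("nurse", "select", "patients")   # hypothetical role and relation
acl.assign("alice", "nurse")
print(acl.is_allowed("alice", "select", "patients"))  # permission via role
print(acl.is_allowed("alice", "update", "patients"))  # never granted
```

Permissions attach to roles rather than to users, so revoking one authorization from a role affects every member at once.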

5 Data Anonymization
Problem: protecting Personally Identifiable Information (PII) and the associated sensitive attributes.

Quasi-identifier                 Sensitive
DOB       Gender   Zipcode       Disease
1/21/76   Male     53715         Heart Disease
4/13/86   Female   53715         Hepatitis
2/28/76   Male     53703         Bronchitis
1/21/76   Male     53703         Broken Arm
4/13/86   Female   53706         Flu
2/28/76   Female   53706         Hang Nail

Quasi-identifiers are sets of attributes that can be linked with external data to uniquely identify an individual.
Quasi-identifiers need to be generalized or suppressed.

6 Solution: k-Anonymity [Samarati et al. TR’98]
The quasi-identifiers of each record are indistinguishable among at least k individuals.
Records in the same equivalence class share the same quasi-identifier (QI) values.
Implemented by building a generalization hierarchy or by partitioning the multi-dimensional data space.
Known attacks: homogeneity attack, background knowledge attack.
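A k-anonymity check amounts to grouping records by their quasi-identifier values and finding the smallest group. The sketch below uses the patient table from the data anonymization slide. The `generalize` function is a hypothetical generalization chosen for illustration (year of birth only, gender suppressed, last two zipcode digits masked) and is not taken from the slides.

```python
from collections import Counter

QI = ("dob", "gender", "zipcode")

RAW = [
    {"dob": "1/21/76", "gender": "Male", "zipcode": "53715", "disease": "Heart Disease"},
    {"dob": "4/13/86", "gender": "Female", "zipcode": "53715", "disease": "Hepatitis"},
    {"dob": "2/28/76", "gender": "Male", "zipcode": "53703", "disease": "Bronchitis"},
    {"dob": "1/21/76", "gender": "Male", "zipcode": "53703", "disease": "Broken Arm"},
    {"dob": "4/13/86", "gender": "Female", "zipcode": "53706", "disease": "Flu"},
    {"dob": "2/28/76", "gender": "Female", "zipcode": "53706", "disease": "Hang Nail"},
]


def k_anonymity(rows, qi_columns):
    """Return the size of the smallest equivalence class (0 for an empty table)."""
    classes = Counter(tuple(row[c] for c in qi_columns) for row in rows)
    return min(classes.values()) if classes else 0


def generalize(row):
    """Hypothetical generalization: keep birth year, suppress gender, mask zipcode."""
    return {
        "dob": "19" + row["dob"][-2:],
        "gender": "*",
        "zipcode": row["zipcode"][:3] + "**",
        "disease": row["disease"],
    }


print(k_anonymity(RAW, QI))                            # every record is unique
print(k_anonymity([generalize(r) for r in RAW], QI))   # after generalization
```

On the raw table every quasi-identifier combination is unique, so the table is only 1-anonymous. After the illustrative generalization the smallest class holds two records.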

7 Enhanced Solution: l-Diversity [Machanavajjhala et al. ICDE’06]
At least l values for the sensitive attributes in each equivalence class.

A 3-diverse patient table:
Zipcode   Age   Salary   Disease
476**     2*    20K      Gastric Ulcer
476**     2*    25K      Gastritis
476**     2*    30K      Stomach Cancer
4790*     ≥40   50K      Gastritis
4790*     ≥40   100K     Flu
4790*     ≥40   70K      Bronchitis
476**     3*    60K      Bronchitis
476**     3*    80K      Pneumonia
476**     3*    90K      Stomach Cancer

Known attacks: similarity attack, skewness attack.
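The simplest reading of this definition counts distinct sensitive values per equivalence class. The sketch below checks the patient table from this slide under that reading. Stricter variants of l-diversity exist and are not covered here, and the function name is hypothetical.

```python
from collections import defaultdict

# (zipcode, age, salary, disease) rows of the patient table on this slide.
TABLE = [
    ("476**", "2*", "20K", "Gastric Ulcer"),
    ("476**", "2*", "25K", "Gastritis"),
    ("476**", "2*", "30K", "Stomach Cancer"),
    ("4790*", ">=40", "50K", "Gastritis"),
    ("4790*", ">=40", "100K", "Flu"),
    ("4790*", ">=40", "70K", "Bronchitis"),
    ("476**", "3*", "60K", "Bronchitis"),
    ("476**", "3*", "80K", "Pneumonia"),
    ("476**", "3*", "90K", "Stomach Cancer"),
]


def distinct_l_diversity(rows, qi_indexes, sensitive_index):
    """Return the minimum number of distinct sensitive values in any class."""
    classes = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in qi_indexes)
        classes[key].add(row[sensitive_index])
    return min(len(values) for values in classes.values()) if classes else 0


# Quasi-identifiers are zipcode and age; the sensitive attribute is disease.
print(distinct_l_diversity(TABLE, qi_indexes=(0, 1), sensitive_index=3))
```

Every class holds three different diseases, so the table passes for l = 3. All three diseases in the first class are stomach-related, however, which is the kind of situation the similarity attack exploits.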

8 Enhanced Solution: t-Closeness [Li et al. ICDE’07]
The distance between the distribution of sensitive attribute values in each equivalence class and their distribution in the overall table is bounded by t.
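To make the bound concrete, the sketch below computes the smallest t that a table satisfies. It assumes a categorical sensitive attribute and uses the variational distance (half the sum of absolute probability differences) as the distance measure. The slide does not fix a distance measure, so this choice is an assumption, and the data is the disease column of the 3-diverse patient table grouped by equivalence class.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# (equivalence class, disease) pairs from the 3-diverse patient table.
ROWS = [
    ("476**/2*", "Gastric Ulcer"), ("476**/2*", "Gastritis"), ("476**/2*", "Stomach Cancer"),
    ("4790*/>=40", "Gastritis"), ("4790*/>=40", "Flu"), ("4790*/>=40", "Bronchitis"),
    ("476**/3*", "Bronchitis"), ("476**/3*", "Pneumonia"), ("476**/3*", "Stomach Cancer"),
]


def distribution(values):
    """Exact relative frequency of each value."""
    counts = Counter(values)
    return {v: Fraction(c, len(values)) for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys) / 2


def smallest_t(rows):
    """Largest class-to-table distance, i.e. the smallest t the table satisfies."""
    overall = distribution([value for _, value in rows])
    classes = defaultdict(list)
    for key, value in rows:
        classes[key].append(value)
    return max(variational_distance(distribution(v), overall) for v in classes.values())


print(smallest_t(ROWS))
```

Exact fractions avoid floating-point rounding when comparing the result against a chosen threshold t.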

9 Differential Privacy for Statistical Data [Dwork ICALP’06]
A randomized function K gives ε-differential privacy iff, for all datasets D_1 and D_2 differing on at most one element, and for all S ⊆ Range(K),
\[ \Pr[K(D_1) \in S] \le e^{\varepsilon} \cdot \Pr[K(D_2) \in S]. \]
Strong privacy guarantees while querying a database.
[Figure: query A is perturbed to give P(A) and query A’ is perturbed to give P(A’); the two outputs are indistinguishable.]
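One standard way to obtain this guarantee for a counting query is the Laplace mechanism, which the slide does not show. The sketch below is an illustration under these assumptions: the query is a count, so one record changes the answer by at most 1, and the noise is drawn from a Laplace distribution with scale 1/ε. The function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count; a count has sensitivity 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # fixed seed only to make the demo repeatable
ages = [23, 35, 41, 52, 67, 29, 48]  # made-up data for illustration
print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

A smaller ε means a larger noise scale and therefore stronger privacy at the cost of accuracy. A production implementation would also need to consider floating-point side effects, which this sketch ignores.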

10 Secure Devices for Privacy [Anciaux et al. SIGMOD’07]
Problem: protecting private data during queries that involve both private (hidden) and public (visible) data.
Solution: carry the private data on a secure USB key, ensure that private data never leaves the key, and let only public data flow to the key.
Query optimization for a USB key with small RAM.

11 Database security & privacy in the cloud

12 Cloud – A Tempting Attack Target
Why the cloud?
– Ubiquitous access to consolidated data
– Shared infrastructure → economies of scale
– Used by many small and medium businesses
Why attack?
– Target one service provider, attack multiple companies
– Financial gain from trading sensitive information

13 Cloud Provides Novel Attack Opportunities
Co-residence attack [Ristenpart et al. CCS’09]
– Adversary: malicious parties not affiliated with the provider
– Map and identify the location of the target VM
– Place the attacker VM co-resident with the target VM
– Mount cross-VM side-channel attacks (possible because physical resources are shared), e.g., estimating the number of visitors to a page, or keystroke attacks for password retrieval
Signature wrapping attack [Somorovsky et al. CCSW’11]
– Compromise the control interface by capturing a SOAP message
– Manipulate the SOAP message with arbitrary XML fragments
– Use an XML signature vulnerability to pass authentication
– Take control of a victim’s account

14 A Barrier to Conquer
Security and privacy: a barrier to cloud adoption.
Data, especially sensitive data: a key concern.
We need to solve data security and privacy problems in the cloud.

15 Problems Amplified by the Cloud
[Figure: a user sends data and queries to cloud servers and receives answers.]
Data confidentiality
– Attacks: unauthorized accesses, side-channel attacks
– Solutions: encryption and querying encrypted data; trusted computing
Access privacy
– Attacks: inferences on access patterns or query results
– Solutions: private information retrieval; query obfuscation

16 Challenges: Conflicting Goals
[Figure: trade-off chart with axes Functionality/Performance and Confidentiality/Privacy, each running from Low to High; regions labeled Existing Services, Many Crypto Systems/Protocols, and Ideal State.]

17 Data confidentiality

18 Database as a Service [Hacigümüs et al. ICDE’02]
Protects stored data from theft, but plaintext data can still be seen on the server.
Write: encrypt before storing
– insert into lineitem (discount) values (encrypt(10, key))
Read: decrypt before access
– select decrypt(discount, key) from lineitem where custid = 300
Encryption alternatives
– Software-level vs. hardware-level (cryptographic coprocessor) encryption
– Granularity: field, row, page

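19 Partition and Identification Index [Hacigümüs et al. SIGMOD’02]
E(tuple) = (encrypted tuple, {attribute index})
Attribute index: the ids of the partitions that contain the attribute values
[Figure: the attribute domain 0 to 1000 is split at 200, 400, 600, and 800 into five partitions with ids 2, 7, 5, 1, 4.]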

20 Partition and Identification Index
The client knows a map function: Map(val) = id of the partition containing val.

Partition range            0-200   200-400   400-600   600-800   800-1000
Random mapping             2       7         5         1         4
Order-preserving mapping   1       2         4         5         7

21 Mapping Predicate Conditions
Map(< val): ids of the partitions that could contain values < val
– E.g., Map(eid < 280) = {2, 7} for the random mapping
Map(> val): ids of the partitions that could contain values > val
Map(A_i = A_j): pairs of ids of the partitions that could have equal A_i and A_j values
Decryption and final processing happen on the client.
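The mapping functions above can be sketched as follows, using the partition boundaries and ids from the preceding slides. Treating each partition as a half-open interval [low, high) is an assumption, since the slides do not say which boundary is inclusive. The `decrypt` argument is a stand-in for real client-side decryption, and all function names are hypothetical.

```python
# Partition boundaries and ids from the slides; [low, high) is an assumption.
BOUNDS = [(0, 200), (200, 400), (400, 600), (600, 800), (800, 1000)]
RANDOM_IDS = [2, 7, 5, 1, 4]
ORDER_PRESERVING_IDS = [1, 2, 4, 5, 7]


def map_value(val, ids):
    """Map(val): id of the partition containing val."""
    for (low, high), pid in zip(BOUNDS, ids):
        if low <= val < high:
            return pid
    raise ValueError(f"{val} is outside the attribute domain")


def map_less_than(val, ids):
    """Map(< val): ids of partitions that could contain a value below val."""
    return {pid for (low, _), pid in zip(BOUNDS, ids) if low < val}


def map_greater_than(val, ids):
    """Map(> val): ids of partitions that could contain a value above val
    (conservative, so it may include a partition with no matching value)."""
    return {pid for (_, high), pid in zip(BOUNDS, ids) if high > val}


def query_less_than(server_rows, val, ids, decrypt):
    """Server filters on partition ids; client decrypts and removes false hits."""
    wanted = map_less_than(val, ids)
    candidates = [etuple for etuple, pid in server_rows if pid in wanted]
    return [v for v in map(decrypt, candidates) if v < val]


print(map_less_than(280, RANDOM_IDS))  # matches Map(eid < 280) on the slide
```

The server only ever compares partition ids, so it returns a superset of the answer, and the client discards the extra tuples after decryption.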

22 Mapping Predicate Conditions
[Figure: example of mapping the join condition emp.did = mrg.did.]

23 Partition / Bucketization Review
Pros
– Efficient computation on the server
Cons
– Data updates are hard (they may require re-distribution across partitions)
– Filtering the superset of the answer on the client can be time consuming, depending on the partition sizes
– Relative changes in partition sizes during dynamic data updates might reveal the value distribution

24 CryptDB [Popa et al. SOSP’11]
Supports a wide range of SQL queries over encrypted data.
The server fully evaluates queries on encrypted data, and the client does not perform query processing.
SQL-aware encryption
– Leverages provable, practical techniques for different SQL operators over encrypted data
Adjustable query-based encryption
– Dynamically adjusts the encryption level of data items according to the user’s queries
Onion of encryptions
– Ranges from stronger forms of encryption that reveal no information to weaker forms that allow certain computations

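25 SQL-Aware Onion Encryption
Each onion wraps a value in layers, listed here from the outermost layer to the innermost.
Equality onion (any value)
– RND: no functionality
– DET: equality selection
– JOIN: equality join
Order onion (any value)
– RND: no functionality
– OPE: comparison
– OPE-JOIN: inequality join
Search onion (only for text fields)
– SEARCH: word selection
Addition onion (int value)
– HOM: sum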
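The adjustable, query-driven part of this design can be sketched as bookkeeping over layers: each column starts at the outermost layer of every onion, and a query peels layers only as far as its operator requires. The sketch below models only that bookkeeping and performs no cryptography. The class name, the operation names, and the onion keys "eq" and "ord" are hypothetical labels, not CryptDB's API.

```python
# Layers listed from outermost to innermost, as on the slide.
ONIONS = {
    "eq": ["RND", "DET", "JOIN"],
    "ord": ["RND", "OPE", "OPE-JOIN"],
}

# Which onion and layer each SQL operation needs (per the slide's annotations).
REQUIRED = {
    "equality_selection": ("eq", "DET"),
    "equality_join": ("eq", "JOIN"),
    "comparison": ("ord", "OPE"),
    "inequality_join": ("ord", "OPE-JOIN"),
}


class ColumnOnions:
    """Tracks the currently exposed layer of each onion for one column."""

    def __init__(self):
        self.depth = {name: 0 for name in ONIONS}  # 0 = outermost layer

    def current_layer(self, onion):
        return ONIONS[onion][self.depth[onion]]

    def adjust(self, operation):
        """Peel just enough layers for the operation; return the layers removed."""
        onion, layer = REQUIRED[operation]
        target = ONIONS[onion].index(layer)
        removed = ONIONS[onion][self.depth[onion]:target]
        self.depth[onion] = max(self.depth[onion], target)
        return removed


col = ColumnOnions()
print(col.adjust("equality_selection"))  # layers peeled for WHERE col = ...
print(col.current_layer("eq"), col.current_layer("ord"))
```

In this model a layer is never re-applied, and a column that has never appeared in a comparison keeps its order onion at the RND layer.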

26 CryptDB System
[Figure: CryptDB system architecture, with annotations marking the part used for performing cryptographic operations and the part used for sending certain onion layer keys.]

27 Open problems

28 Open Research Problems
Encryption schemes for processing range and join queries on encrypted data
Improving the performance of querying encrypted data for use in practical OLTP applications
– Pre-computation
– Parallel calculation
End-to-end security in the cloud
– Needs information flow control and auditing in addition to approaches based on cryptography or trusted computing

29 Concluding Remarks
Cloud security and privacy is not a completely new problem.
Some issues are amplified by the cloud.
Protecting data confidentiality and access privacy
Maintaining practical functionality and performance while achieving security and privacy

30 References
[Bertino et al. TDSC’05] E. Bertino et al. Database security-concepts, approaches, and challenges. In IEEE TDSC, 2(1), 2005.
[Samarati et al. TR’98] P. Samarati et al. Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. TR 1998.
[Machanavajjhala et al. ICDE’06] A. Machanavajjhala et al. l-diversity: privacy beyond k-anonymity. In ICDE 2006.
[Li et al. ICDE’07] N. Li et al. t-closeness: privacy beyond k-anonymity and l-diversity. In ICDE 2007.
[Dwork ICALP’06] C. Dwork. Differential privacy. In ICALP(2) 2006.
[Verykios et al. SIGMOD’04] V. S. Verykios et al. State-of-the-art in privacy preserving data mining. In SIGMOD 2004.
[Agrawal et al. SIGMOD’00] R. Agrawal et al. Privacy-preserving data mining. In SIGMOD 2000.
[Clifton et al. KDD’02] C. Clifton et al. Tools for privacy preserving distributed data mining. In KDD 2002.
[Anciaux et al. SIGMOD’07] N. Anciaux et al. GhostDB: querying visible and hidden data without leaks. In SIGMOD 2007.

31 References
[Chaudhuri et al. CIDR’11] S. Chaudhuri et al. Database access control & privacy: is there a common ground? In CIDR 2011.
[Ristenpart et al. CCS’09] T. Ristenpart et al. Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds. In CCS 2009.
[Somorovsky et al. CCSW’11] J. Somorovsky et al. All your clouds are belong to us: security analysis of cloud management interfaces. In CCSW 2011.
[Hacigümüs et al. ICDE’02] H. Hacigümüs et al. Providing database as a service. In ICDE 2002.
[Song et al. S&P’00] D. Song et al. Practical techniques for searches on encrypted data. In S&P 2000.
[Hacigümüs et al. SIGMOD’02] H. Hacigümüs et al. Executing SQL over encrypted data in the database-service-provider model. In SIGMOD 2002.
[Hore et al. VLDB’04] B. Hore et al. A privacy-preserving index for range queries. In VLDB 2004.
[Agrawal et al. SIGMOD’04] R. Agrawal et al. Order preserving encryption for numeric data. In SIGMOD 2004.

32 References
[Popa et al. SOSP’11] R. A. Popa et al. CryptDB: protecting confidentiality with encrypted query processing. In SOSP 2011.
[Damiani et al. CCS’03] E. Damiani et al. Balancing confidentiality and efficiency in untrusted relational DBMSs. In CCS 2003.
[Wang et al. SDM’11] S. Wang et al. A comprehensive framework for secure query processing on relational data in the cloud. In SDM 2011.
[Aggarwal et al. CIDR’05] G. Aggarwal et al. Two can keep a secret: a distributed architecture for secure database services. In CIDR 2005.
[Emekci et al. ICDE’06] F. Emekci et al. Privacy preserving query processing using third parties. In ICDE 2006.
[Agrawal et al. SRDS’88] D. Agrawal et al. Quorum consensus algorithms for secure and reliable data. In SRDS 1988.
[Bajaj et al. SIGMOD’11] S. Bajaj et al. TrustedDB: a trusted hardware based database with privacy and data confidentiality. In SIGMOD 2011.
[Song et al. IEEE’12] D. Song et al. Cloud data protection for the masses. In IEEE Computer, 45(1), 2012.
[Chor et al. JACM’98] B. Chor et al. Private information retrieval. In J. ACM, 45(6), 1998.

33 References
[Kushilevitz et al. FOCS’97] E. Kushilevitz et al. Replication is not needed: single database, computationally private information retrieval. In FOCS 1997.
[Sion et al. NDSS’07] R. Sion et al. On the computational practicality of private information retrieval. In NDSS 2007.
[Olumofin et al. FC’11] F. G. Olumofin et al. Revisiting the computational practicality of private information retrieval. In FC 2011.
[Williams et al. NDSS’08] P. Williams et al. Usable private information retrieval. In NDSS 2008.
[Wang et al. DBSEC’10] S. Wang et al. Generalizing PIR for practical private retrieval of public data. In DBSec 2010.
[Wang et al. DAPD’13] S. Wang et al. Towards practical private processing of database queries over public data. In DAPD 2013.
[Vimercati et al. ICDCS’11] S. D. C. Vimercati et al. Efficient and private access to outsourced data. In ICDCS 2011.

