Secure outsourcing of XML data Barbara Carminati University of Insubria at Varese
Software as a Service Get What you need When you need it Pay for What you use Don’t worry about Deployment, installation, maintenance, upgrades Hire/train/retain people
Emerging trend: data outsourcing Database as a Service (DBaaS), why? Most organizations need efficient data management. DBMSs are extremely complex to deploy, set up, and maintain, and require skilled DBAs (at very high cost!). The trend is driven by faster, cheaper, and more accessible networks.
Traditional architecture Client DBMS Server
Third-party architecture [Figure: the data owner sends its data to a data provider, which hosts the outsourced db; clients send queries to the provider over the Internet and receive results.]
Research issues Distributed query management Consistency Security & Privacy: Main requirements: confidentiality, integrity, authenticity, completeness, etc…
Security & Privacy Naïve solution: data providers are trusted, i.e., they always operate according to the owner's security and privacy policies.
Security & Privacy The requirements must be satisfied even in the presence of an untrusted provider that: can modify/delete the data; can access sensitive/private information; can send data to unauthorized users; can send a user only part of the information he/she is authorized to access; can be attacked from outside. They must also be satisfied while incurring minimal computation and bandwidth overhead.
Main requirements Confidentiality Authenticity/integrity Completeness
Confidentiality Confidentiality: Data are disclosed only to authorized users Usually, confidentiality requirements are expressed through a set of access control policies
Access control [Figure: users submit access requests to the reference monitor, which checks them against the authorizations derived from the access control policies specified by the SAs; access is either granted (partially or totally) or denied.]
Confidentiality When data are outsourced, confidentiality has a twofold meaning: Confidentiality wrt users: protect data against read accesses by unauthorized users. Confidentiality wrt providers: protect the Owner's data from read accesses by untrusted providers.
Integrity It refers to the protection of information from modifications; it involves several goals: assuring the integrity of information with respect to the original information, often referred to as authenticity; protecting information from unauthorized modifications.
Integrity/authenticity Usually enforced through signature techniques When data are outsourced: Traditional signature techniques are not enough A user can be returned only selected portions of the data signed by the owner
Completeness It refers to ensuring that users receive all the information they are entitled to access, according to the owner's policies.
Secure outsourcing of XML data our proposal
Scenario [Figure: the Owner holds the XML source, the policy base and the credential base, and sends XML docs to the Provider.] We focus on XML. The Owner is the producer of information and specifies access control policies. The Provider is responsible for managing (a portion of) the Owner's information and for answering user queries according to the access control policies specified by the Owner.
Scenario We focus on XML data. The Owner specifies access control policies according to an access control model supporting: fine-grained and credential-based access control; an XML-based language to express access control policies and credentials (X-Sec).
Example X-Sec [Figure: Alice's credential (Alice Rossi, marketing) and an access control policy on organization.xml, both encoded in the X-Sec language; the policy table has the columns Cred expression, target, Path and PM.]
Example [Figure: the portion of the document that the access control policy authorizes Alice to see, and an XPath query submitted by Alice that is denied. The sample records shown are: Alice Rossi 80K 7; Bob Red 50K 5; Tom Black 170K 12; Kim 150K 11; Ann 80K 7.]
Problem [Figure: the Owner, holding the XML source, the policy base and the credential base, distributes XML docs to untrusted Providers 1 to 4.] We need strategies for ensuring confidentiality, authenticity and completeness even if the provider is not trusted.
Proposed solution: overall idea The owner outsources to providers a Security Enhanced Encryption of the original XML docs, where: Authenticity and integrity are enforced by an alternative digital signature devised for XML docs, i.e., Merkle Signature; Confidentiality is ensured by the properties of Well formed encryption; It contains security information, that makes the providers able to evaluate queries. Moreover, the owner provides users with auxiliary data structures (i.e., Query templates), that make them able to submit queries directly to providers and verify the obtained query results
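Owner-side processing [Figure: the owner turns the XML document into the SE-ENC document by applying well-formed encryption (keys K1, ..., Kj, ..., Km, ..., Kp) and the Merkle Signature, and by adding security information (partitioning information and authenticity information). The Query Template is obtained by removing the encrypted content.]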
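System architecture [Figure: the owner sends the SE-ENC document to the provider, and decryption keys and credentials to the user's client.]
System architecture [Figure: the owner sends the SE-ENC document to the provider and the Query Template to the client; the client sends an XML query to the provider and receives a Reply Document, from which the answer for the user is obtained.]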
Confidentiality enforcement
Confidentiality issues Secure data outsourcing implies two different confidentiality issues: Confidentiality with respect to users Confidentiality with respect to providers
Confidentiality Problem: Providers must be able to evaluate queries and enforce access control policies on XML documents, by respecting at the same time confidentiality requirements Solution based on encryption techniques
Well Formed Encryption The idea is that, before sending a document to a provider, the owner encrypts it (well-formed encryption). The approach is based on encrypting with the same key all document portions to which the same set of access control policies applies.
Well-Formed Encryption [Figure, shown in successive steps: a document tree with nodes &1 to &16, some of them labelled with the policy configuration that applies to them ({P1,P3}, {P2} or {P3}). The node with configuration {P2} is encrypted with key K1, the nodes with configuration {P1,P3} with key K2, the nodes with configuration {P3} with key K3, and the remaining nodes with key Kd. Resulting keys per policy: P1: K2; P2: K1; P3: K2, K3.]
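The key assignment illustrated by the figure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the node-to-policy labelling is hypothetical (the exact labelling of the figure cannot be recovered from the slide text), key names are plain strings instead of real keys, and treating Kd as the key for nodes with no policy is an assumption.

```python
from collections import defaultdict

# Hypothetical labelling: node id -> set of policies applying to the node.
# It is chosen only so that the key names agree with the slide's table.
node_policies = {
    "&8": frozenset({"P2"}),
    "&2": frozenset({"P1", "P3"}),
    "&3": frozenset({"P1", "P3"}),
    "&9": frozenset({"P3"}),
    "&10": frozenset({"P3"}),
    "&1": frozenset(),  # no policy applies (assumed to map to Kd)
}

def assign_keys(node_policies):
    """Give each distinct policy configuration one key name."""
    config_key = {}
    node_key = {}
    for node, config in node_policies.items():
        if not config:
            node_key[node] = "Kd"  # assumption: Kd covers unlabelled nodes
            continue
        if config not in config_key:
            config_key[config] = f"K{len(config_key) + 1}"
        node_key[node] = config_key[config]
    return config_key, node_key

def keys_per_policy(config_key):
    """A user satisfying policy P needs the key of every configuration containing P."""
    result = defaultdict(set)
    for config, key in config_key.items():
        for policy in config:
            result[policy].add(key)
    return dict(result)

config_key, node_key = assign_keys(node_policies)
print(node_key)
print(keys_per_policy(config_key))
```

With this labelling, a user satisfying only P1 receives K2, while a user satisfying P3 receives both K2 and K3, which matches the per-policy key table on the slide.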
The owner does not supply any key to providers Keys are properly stored by the owner into the user entries in the directory server. Each user entry contains the key(s) corresponding to access control policies satisfied by the user: Hierarchical key management scheme that minimizes the number of keys to be permanently stored Well Formed Encryption: Key management
Well Formed Encryption: pros Each node of the resulting encrypted document is accessible only by authorized users. It also prevents the provider from accessing the data it manages. Well-formed encryption therefore ensures confidentiality both wrt users and wrt providers.
Well Formed Encryption: cons Issue: how can the Provider evaluate queries on encrypted XML data?
Querying XML encrypted data - Querying encrypted documents is a difficult issue and greatly depends on the kinds of queries that are submitted to providers. - In our scenario, we assume users submit XPath expressions.
Querying XML encrypted data - XPath expressions fall into two classes: queries that impose conditions only on the structure of the XML document (structure queries), and queries that also impose conditions on data content (content-dependent queries).
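Well Formed Encryption The well-formed encryption is encoded as an XML document that preserves the structure of the original XML document. [Figure: the original tree, with tags tg1, tg2, tg3 and attributes Att, next to its encrypted counterpart, with Enc(tg1,K1), Enc(tg2,K2), Enc(tg3,K1) carrying Enc(Att,K1), and Enc(tg3,K3) carrying Enc(Att,K3).]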
Preserving the original doc structure greatly facilitates the evaluation of structure queries over the encrypted document But it implies some security threats: Data dictionary attacks by providers and users: At schema level (tag/attribute names) On element data contents/attribute values Well Formed Encryption
To prevent data dictionary attacks we adopt the encryption scheme proposed by Song, Wagner and Perrig for textual data (IEEE Symposium on Security and Privacy,2000) : Different occurrences of the same word, encrypted with the same key, result in different encryptions It is possible to perform keyword-based searches on the encrypted textual data without knowing decryption keys Well Formed Encryption
Querying XML encrypted data: structure queries XPath expressions specify only the location path, e.g. //tag1/tag2/tag3//. Since the structure is preserved, the client simply generates the corresponding encrypted query, e.g. //Enc(tag1,K1)/Enc(tag2,K2)/Enc(tag3,K1)//. Providers are able to evaluate the encrypted query directly on the encrypted document.
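The client-side rewriting of a structure query can be sketched as below. The function names, the keys and the tag-to-key table are hypothetical, and a keyed HMAC token stands in for Enc(tag, K); the talk itself relies on the Song et al. scheme, which this sketch does not implement.

```python
import hashlib
import hmac

def enc_tag(tag: str, key: bytes) -> str:
    """Stand-in for Enc(tag, K): a deterministic keyed token (an assumption)."""
    digest = hmac.new(key, tag.encode("utf-8"), hashlib.sha256).hexdigest()
    return "e" + digest[:16]  # the letter prefix keeps the token a valid XML name

def encrypt_path(xpath: str, tag_keys: dict) -> str:
    """Replace every location step of a structure-only XPath by its encrypted tag."""
    steps = []
    for step in xpath.split("/"):
        # Empty steps come from the '/' and '//' separators and are kept as-is.
        steps.append(enc_tag(step, tag_keys[step]) if step else step)
    return "/".join(steps)

# Hypothetical keys and tag-to-key assignment mirroring the slide's example.
K1, K2 = b"key-one", b"key-two"
tag_keys = {"tag1": K1, "tag2": K2, "tag3": K1}

print(encrypt_path("//tag1/tag2/tag3", tag_keys))
```

Because the path separators are left untouched, the provider can match the rewritten query against the encrypted document with an ordinary XPath engine, without holding any key.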
Querying XML encrypted data: content-dependent queries To enable a provider to evaluate conditions on encrypted data, we provide it with additional information. In particular, depending on the data domain, we use two different strategies: for non-textual data, Hacigumus et al. (SIGMOD 2002); for textual data, Song et al. (IEEE Symposium on Security and Privacy, 2000).
Querying XML encrypted data: content-dependent queries Proposed solution for non-textual data, based on previous research on querying encrypted relational databases (H. Hacigumus et al.): Given a relation R, the data owner divides the domain of each attribute into distinct partitions and assigns a different id to each of them. For each encrypted tuple, the provider also receives the partition ids of each of its attributes. The provider is thus able to perform queries directly on the encrypted tuples, by exploiting the partition ids.
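Querying XML encrypted data: content-dependent queries [Figure: the owner's Employee relation, with attributes Eid, Name, Dip and Salary; the partition table for Salary, mapping each partition to an id; and the relation stored at the provider, with attributes etuple, ID_Eid, ID_Name, ID_Dip and ID_Salary.] The query SELECT * FROM Employee WHERE Salary=275 is rewritten for the provider as SELECT * FROM Employee WHERE ID_salary=46.
Querying XML encrypted data: content-dependent queries [Figure: well-formed encryption and node partition ids. The owner's tree has tags tg1, tg2, tg3 and Salary values; the provider's tree has Enc(tg1,K1), Enc(tg2,K2), Enc(tg3,K1) with content Enc(380,K1) and partition id 30, and Enc(tg3,K3) with content Enc(275,K3) and partition id 46, looked up in the partition table for salary.]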
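A minimal sketch of the partition-id idea follows. The bucket boundaries and the ids 12 and 77 are hypothetical; only the pairs 275 → 46 and 380 → 30 come from the slides, and the boundaries are chosen so that those two values agree.

```python
import bisect

# Hypothetical partitioning of the Salary domain: bucket i covers
# [bounds[i], bounds[i + 1]). The ids are opaque labels chosen by the owner.
bounds = [0, 100, 200, 300, 400]
partition_ids = [12, 77, 46, 30]

def partition_id(value: int) -> int:
    """Return the id of the partition containing value."""
    if not bounds[0] <= value < bounds[-1]:
        raise ValueError("value outside the partitioned domain")
    return partition_ids[bisect.bisect_right(bounds, value) - 1]

def rewrite_equality(attribute: str, value: int) -> str:
    """Rewrite an equality condition on plaintext into one on partition ids."""
    return f"SELECT * FROM Employee WHERE ID_{attribute}={partition_id(value)}"

print(rewrite_equality("salary", 275))
```

Since a partition id identifies a whole range of values, the provider's answer may contain tuples whose actual value differs from the requested one; the client discards them after decryption.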
Querying XML encrypted data: content-dependent queries Proposed solution for textual data: in a first phase, the Owner preprocesses the textual data contained in an attribute/element and extracts from it a set of meaningful keywords; in a second phase, each keyword is encrypted according to the Song et al. scheme.
Querying XML encrypted data: content-dependent queries [Figure: well-formed encryption and encrypted keywords. The owner's tree has tags tg1, tg2, tg3 and tg4; the keywords extracted from tg3 are XML and DB. The provider's tree has Enc(tg1,K1), Enc(tg2,K2), Enc(tg3,K1) with content Enc(tg3.content,K1) and keywords Enc(XML,k1), Enc(DB,k1), and Enc(tg4,K3) with content Enc(tg3.content,K3).] The query //tg1/tg2/tg3[contains(.,'DB')]// is rewritten as //Enc(tg1,K1)/Enc(tg2,K2)/Enc(tg3,K1)[contains(., Enc('DB',k1))]//.
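The two phases for textual data (keyword extraction, then keyword encryption) can be sketched as below. This is a simplification under stated assumptions: the stop-word list and function names are hypothetical, and deterministic HMAC tokens replace the Song et al. scheme, in which different occurrences of the same word encrypt differently.

```python
import hashlib
import hmac
import re

STOPWORDS = {"a", "an", "and", "the", "of", "on", "in"}  # hypothetical list

def extract_keywords(text: str) -> set:
    """Phase 1 (owner): keep the words that are not stop words."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return {w for w in words if w.lower() not in STOPWORDS}

def token(keyword: str, key: bytes) -> str:
    """Phase 2 (owner or client): keyed token standing in for Enc(keyword, k)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()

def index_element(text: str, key: bytes) -> set:
    """Encrypted keyword set stored next to the encrypted element content."""
    return {token(k, key) for k in extract_keywords(text)}

def provider_contains(index: set, query_token: str) -> bool:
    """Provider-side test for contains(., Enc(keyword, k)); no key is needed."""
    return query_token in index

k1 = b"hypothetical-key"
index = index_element("A course on XML and DB security", k1)
print(provider_contains(index, token("DB", k1)))
```

The provider only compares opaque tokens, so it can answer the keyword condition without learning the keyword or the element content.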
Authenticity and Integrity enforcement
Authenticity/integrity To ensure authenticity in two-party architectures, traditional digital signatures work well. [Figure: the user sends a query to the owner and receives a signed view.]
Authenticity/integrity But traditional digital signatures have some problems in third-party architectures. [Figure: the user sends an XPath query to the provider rather than to the owner.]
Merkle Signature An alternative way to sign an XML doc By applying a unique digital signature on an XML doc it is possible to ensure the authenticity of: the whole document any portions of it It uses a different way to compute the digest of XML docs, based on the Merkle tree authentication mechanisms
Merkle Signature [Figure, shown in successive steps on a tree with nodes N1 to N7.] For a leaf node: MhX(N7) = h(h(N7.content) || h(N7)). For an internal node: MhX(N3) = h(h(N3.content) || h(N3) || MhX(N6) || MhX(N7)). The value obtained for the whole document is shown as J8ygVS8nqtlF5HP3FBj9eZU/KYY= and labelled Merkle Signature.
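The recursive digest can be sketched as below. Several points are assumptions: SHA-256 as the hash function h, reading h(N) as the hash of the node's name, and the tree shape (N1 with children N2 and N3, N2 with N4 and N5, N3 with N6 and N7). The signing step is omitted; the owner would sign the digest of the root.

```python
import hashlib
from dataclasses import dataclass, field

def h(data: bytes) -> bytes:
    """Hash function h; SHA-256 is an assumption, the slides do not name one."""
    return hashlib.sha256(data).digest()

@dataclass
class Node:
    name: str
    content: str = ""
    children: list = field(default_factory=list)

def mhx(node: Node) -> bytes:
    """MhX(N) = h(h(N.content) || h(N.name) || MhX(child_1) || ... || MhX(child_k))."""
    parts = [h(node.content.encode("utf-8")), h(node.name.encode("utf-8"))]
    parts += [mhx(child) for child in node.children]
    return h(b"".join(parts))

# Assumed tree shape with placeholder contents.
n6, n7 = Node("N6", "c6"), Node("N7", "c7")
n4, n5 = Node("N4", "c4"), Node("N5", "c5")
n3 = Node("N3", "c3", [n6, n7])
n2 = Node("N2", "c2", [n4, n5])
n1 = Node("N1", "c1", [n2, n3])

root_digest = mhx(n1)  # the value the owner would sign
print(root_digest.hex())
```

Because every digest depends on the digests of the children, a change anywhere in the document changes the root digest, so a single signature covers the whole tree.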
Merkle Hash Paths How can a user validate the Merkle signature computed on the whole XML document when holding only a portion of it? By means of Merkle hash paths.
Merkle Hash Paths The Merkle hash path between v' and v consists of the Merkle hash values of all the siblings of the nodes belonging to the path connecting v' to v. [Figure: the Merkle hash path for a leaf node, MhPath(4,1).]
Merkle Hash Paths Since the provider operates on encrypted data, it is not able to compute Merkle hash paths by itself. The owner therefore includes in the SE-ENC docs the hash value of each node.
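How a user could recompute the root digest from a single node plus its Merkle hash path is sketched below, for the case MhPath(4,1). The tree shape, SHA-256 and the layout of a path step are assumptions. In particular, each step also carries the content and name hashes of the ancestor itself, which this sketch needs in order to rebuild the ancestor's digest.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash function h (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()

def node_hash(content_hash: bytes, name_hash: bytes, child_hashes: list) -> bytes:
    """Merkle hash of a node from its own hashes and its children's Merkle hashes."""
    return h(content_hash + name_hash + b"".join(child_hashes))

def root_from_path(start_hash: bytes, path: list) -> bytes:
    """Walk up from the received node to the root.

    Each path step describes one ancestor as a tuple
    (content_hash, name_hash, left_sibling_hashes, right_sibling_hashes).
    """
    current = start_hash
    for content_hash, name_hash, left, right in path:
        current = node_hash(content_hash, name_hash, left + [current] + right)
    return current

def leaf(name: bytes, content: bytes) -> bytes:
    return node_hash(h(content), h(name), [])

# Owner side: full tree N1(N2(N4, N5), N3(N6, N7)) with placeholder contents.
mh4, mh5 = leaf(b"N4", b"c4"), leaf(b"N5", b"c5")
mh6, mh7 = leaf(b"N6", b"c6"), leaf(b"N7", b"c7")
mh2 = node_hash(h(b"c2"), h(b"N2"), [mh4, mh5])
mh3 = node_hash(h(b"c3"), h(b"N3"), [mh6, mh7])
signed_root = node_hash(h(b"c1"), h(b"N1"), [mh2, mh3])

# User side: only N4 is received, plus the path from N4 up to N1.
path_4_to_1 = [
    (h(b"c2"), h(b"N2"), [], [mh5]),  # ancestor N2, right sibling N5
    (h(b"c1"), h(b"N1"), [], [mh3]),  # ancestor N1, right sibling N3
]
print(root_from_path(leaf(b"N4", b"c4"), path_4_to_1) == signed_root)
```

The user never sees the contents of N5, N3, N6 or N7, only their digests, yet can still check the result against the value signed by the owner.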
Completeness enforcement
Completeness Completeness is verified through the use of Query Templates The query template consists of the SE-ENC document (i.e., the well formed encryption, plus the additional information) without data content. By executing queries submitted to the provider on the query template, a user is able to verify the completeness of the query answer without accessing information he/she is not allowed to see.
Completeness The encrypted data structure enables a user to verify the completeness of structure queries. By exploiting partition information and encrypted keywords, a user is also able to verify the completeness of content-dependent queries.
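The completeness check on a structure query can be sketched as follows. The template layout, the `id` attribute and the short names e1, e2, e3 (standing for encrypted tag names) are hypothetical. The sketch also leaves out the restriction to the nodes the user is authorized to see.

```python
import xml.etree.ElementTree as ET

# Hypothetical query template: encrypted structure only, no data content.
TEMPLATE = (
    '<e1 id="1">'
    '<e2 id="2"><e3 id="3"/><e3 id="4"/></e2>'
    '<e2 id="5"><e3 id="6"/></e2>'
    '</e1>'
)

def expected_ids(template_xml: str, path: str) -> set:
    """Run the (encrypted) structure query locally on the query template."""
    root = ET.fromstring(template_xml)
    return {element.get("id") for element in root.findall(path)}

def is_complete(template_xml: str, path: str, reply_ids: set) -> bool:
    """The reply is complete if it contains every node the template says matches."""
    return expected_ids(template_xml, path) <= set(reply_ids)

print(is_complete(TEMPLATE, "./e2/e3", {"3", "4", "6"}))
print(is_complete(TEMPLATE, "./e2/e3", {"3", "6"}))
```

The first call reports a complete answer; in the second, the provider has withheld node 4 and the check fails.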
Conclusion
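Owner-side processing [Figure: the owner turns the XML document into the SE-ENC document by applying well-formed encryption (keys K1, ..., Kj, ..., Km, ..., Kp) and the Merkle Signature, and by adding security information (partitioning information and authenticity information). The Query Template is obtained by removing the encrypted content.]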
Provider-side processing [Figure: the provider evaluates the query on the SE-ENC document and creates the Reply document, inserting the Merkle Signature and the information needed for authenticity verification.]
User-side processing [Figure: on the Reply document the user performs confidentiality verification and authenticity verification; using the Query Template the user performs completeness verification.]
System architecture [Figure: the overall architecture, with interactions numbered 1 to 9. Components: the owner, the providers, the clients of users such as Alice, Bob and Frank, and a directory server holding provider entries and user entries (User_ID, key). Labelled flows: subscription requests; SE-ENC XML documents; Query Template documents; user policy configuration + encryption keys; user policy configuration + query; reply document; query and answer between user and client.]
References
Papers on XML
B. Carminati, E. Ferrari. Confidentiality Enforcement for XML Outsourced Data. In Proc. of the Second International EDBT Workshop on Database Technologies for Handling XML Information on the Web, Munich, Germany, March.
B. Carminati, E. Ferrari, E. Bertino. Assuring Security Properties in Third Party Architecture. In Proc. of the International Conference on Data Engineering (ICDE'05), poster paper.
B. Carminati, E. Ferrari. Trusted Privacy Manager: A System for Privacy Enforcement on Outsourced Data. In Proc. of the International Workshop on Privacy Data Management, Tokyo, Japan, April.
E. Bertino, B. Carminati, E. Ferrari, B. Thuraisingham, A. Gupta. Selective and Authentic Third-party Distribution of XML Documents. IEEE Transactions on Knowledge and Data Engineering, 16(10).
E. Bertino, B. Carminati, E. Ferrari. A Flexible Authentication Method for UDDI Registries. In Proc. of the 2003 International Conference on Web Services (ICWS'03), Las Vegas, June.
E. Bertino, B. Carminati, E. Ferrari. A Temporal Key Management Scheme for Secure Broadcasting of XML Documents. In Proc. of the 9th ACM Conference on Computer and Communications Security, Washington, November.
E. Bertino, E. Ferrari. Secure and Selective Dissemination of XML Documents. ACM Transactions on Information and System Security (TISSEC), 5(3).
Papers on relational data
H. Hacigumus, B. Iyer, C. Li, S. Mehrotra. Executing SQL over Encrypted Data in the Database Service Provider Model. In Proc. of the SIGMOD Conference, 2002.
D. X. Song, D. Wagner, A. Perrig. Practical Techniques for Searches on Encrypted Data. In Proc. of the IEEE Symposium on Security and Privacy, Oakland, California, 2000.
B. Chor, O. Goldreich, E. Kushilevitz, M. Sudan. Private Information Retrieval. In Proc. of the Symposium on Foundations of Computer Science, 1995.
P. Devanbu, M. Gertz, C. Martel, S. G. Stubblebine. Authentic Third-party Data Publication. In Proc. of the 14th Annual IFIP WG 11.3 Working Conference on Database Security, Schoorl, the Netherlands.
E. Goh. Secure Indexes. Cryptology ePrint Archive, Report 2003/216, 2003.
P. Golle, J. Staddon, B. Waters. Secure Conjunctive Keyword Search over Encrypted Data. In Proc. of the Applied Cryptography and Network Security Conference.
E. Mykletun, M. Narasimha, G. Tsudik. Authentication and Integrity in Outsourced Databases. In Proc. of the 11th Annual Symposium on Network and Distributed System Security, San Diego, California.
H. Pang, A. Jain, K. Ramamritham, K. Tan. Verifying Completeness of Relational Query Results in Data Publishing. In Proc. of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, 2005.