1 Secure XML Querying with Security Views Wenfei Fan University of Edinburgh & Bell Laboratories Chee-Yong Chan National University of Singapore Minos.

1 Secure XML Querying with Security Views Wenfei Fan University of Edinburgh & Bell Laboratories Chee-Yong Chan National University of Singapore Minos Garofalakis Bell Laboratories

2 The need for XML security
Data in XML format:
– Business information: confidential
– Health-care data: Patient Privacy Act, …
Access control: multiple groups simultaneously query the same XML document; each user group has a different access-control policy
Enforcement of access-control policies:
[Figure: user groups 1 … n query one XML document through the XML query engine; each group has its own accessible and inaccessible parts]

3 Secure XML querying
For each user group of an XML document T:
– specify an access-control policy S
– enforce S: for any query Q posed by the group over the document T, Q(T) consists only of data accessible wrt S
Access control for XML:
– How to specify access policies at various levels of granularity?
– How to efficiently enforce those access policies?
[Figure: a user group sends query Q to the XML query engine over document T, which has accessible and inaccessible parts, and receives Q(T)]

4 Example: an XML document of patients
Document DTD D:
hospital → patient*
patient → SSN, name, record*
record → date, diagnosis, treatment
treatment → (trial + regular)
trial → trName, treatment*
regular → tname, bill
[Figure: DTD graph of D]
Access-control policies over docs of D:
– Doctors in the hospital are granted access to all the data in the docs
– The insurance company is allowed to access billing information only

5 Access-control policy for syndrome surveillance
patients: accessible only if diagnosed with a certain disease “DIS” (a constant)
records:
– only those with diagnosis = “DIS”
– accessible parts of “DIS” records: date, diagnosis, treatment, tname
– denied from seeing whether a patient is in a clinical trial or not (trial, regular, trName)
– denied from accessing billing information
[Figure: DTD graph of D with trial, trName, regular and bill crossed out]

6 Challenge: access-control specification
– various levels of granularity: restricting access to entire subtrees or specific elements
– conditional access: e.g., a patient is accessible if and only if it has a descendant diagnosis = “DIS”
– overriding: e.g., tname overrides the accessibility of its parent regular
– inheritance: e.g., SSN and name inherit the accessibility of patient
[Figure: DTD graph of D with conditionally accessible edges marked]

7 Challenge: access-control enforcement
Enforcement should not imply any drastic degradation in performance
Example: an XPath query Q posed by a syndrome surveillance group over a document T:
//patient[name='Joe']//tname
access-control requirement: Q(T) ⊆ {accessible tname}
enforcement: ensure that the query considers
– all and only those patients named Joe having a descendant diagnosis = “DIS”
– all and only those records with diagnosis = “DIS”
[Figure: DTD graph of D with conditionally accessible edges marked]

8 Challenge: schema availability
One needs schema information to facilitate query formulation and optimization
– How to define a schema (DTD) characterizing all and only the accessible information, without a security breach?
– How to automatically derive such a DTD from the document DTD and an access-control specification?
An XML DTD is far more complicated than its relational counterpart: it may be recursive and nondeterministic
[Figure: DTD graph of D with conditionally accessible edges marked]
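To make the remark about recursive DTDs concrete, the following minimal Python sketch encodes the patient DTD from slide 4 as a child graph and reports the element types that can reach themselves. The names DTD and recursive_types are hypothetical, and the encoding drops the content-model operators (sequence, choice, star); it is an illustration only, not part of the presented system.

```python
# Child graph of the document DTD from slide 4 (content-model operators omitted).
DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["trial", "regular"],
    "trial": ["trName", "treatment"],
    "regular": ["tname", "bill"],
}


def recursive_types(dtd):
    """Return the element types that can reach themselves in the DTD graph."""

    def reachable(start):
        # Depth-first search over the child edges, starting below `start`.
        seen, stack = set(), list(dtd.get(start, []))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(dtd.get(current, []))
        return seen

    return sorted(t for t in dtd if t in reachable(t))


print(recursive_types(DTD))
```

Here treatment and trial are reported, because a trial may contain further treatment elements.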

9 Previous proposals/standards for XML security
Dozens of models have been proposed for XML: XACML, XACL, …
Specifying and enforcing access control at a physical level:
– annotate data nodes in an XML document with accessibility, and check accessibility at runtime (with optimizations for tree-pattern queries and tree/DAG DTDs), or
– materialize a view consisting of accessible data
Problems:
– costly (time, space): multiple accessibility annotations/views
– error-prone: integrity maintenance becomes a problem when the underlying data or access policy is updated
No support for schema availability: either deny access to any schema information, or expose the entire document DTD (a security breach)

10 A seemingly plausible model
– annotate data nodes with accessibility
– check accessibility at runtime, and expose the document DTD D
Example: permissible XPath queries:
Q1: //patient[name='Joe']/record/treatment/*/tname
Q2: //patient[name='Joe']/record/treatment//tname
Security breach: from the document DTD it follows that if Q2(T) – Q1(T) is nonempty, then Joe is involved in a clinical trial
[Figure: DTD graph of D]
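The inference on this slide can be reproduced with a small Python sketch using the standard library's xml.etree.ElementTree, which supports the limited XPath subset needed here. The sample document, patient data and variable names are invented for illustration; the queries are Q1 and Q2 from the slide, prefixed with "." as ElementTree requires.

```python
import xml.etree.ElementTree as ET

# Hypothetical document conforming to the patient DTD of slide 4.
# Joe's second record nests a regular treatment inside a clinical trial.
DOC = (
    "<hospital><patient><SSN>1</SSN><name>Joe</name>"
    "<record><date>d1</date><diagnosis>DIS</diagnosis>"
    "<treatment><regular><tname>aspirin</tname><bill>10</bill></regular></treatment>"
    "</record>"
    "<record><date>d2</date><diagnosis>DIS</diagnosis>"
    "<treatment><trial><trName>T1</trName>"
    "<treatment><regular><tname>drugX</tname><bill>20</bill></regular></treatment>"
    "</trial></treatment>"
    "</record></patient></hospital>"
)

root = ET.fromstring(DOC)

Q1 = ".//patient[name='Joe']/record/treatment/*/tname"
Q2 = ".//patient[name='Joe']/record/treatment//tname"

q1_nodes = root.findall(Q1)
q2_nodes = root.findall(Q2)

# Nodes returned by Q2 but not by Q1 can only sit below a trial element,
# because the exposed DTD says treatment -> (trial + regular).
leaked = [n.text for n in q2_nodes if all(n is not m for m in q1_nodes)]
print(leaked)
```

A nonempty difference tells the user that Joe takes part in a trial, even though every returned tname node is itself accessible.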

11 Our security model for XML
Security administrator: specifies an access-control policy for each group by extending the document DTD with XPath qualifiers
Derivation module: automatically derives a security-view definition from each policy: a view DTD and a mapping via XPath
Query translation module: rewrites and optimizes queries over views into equivalent queries over the underlying document
[Figure: architecture. Specifications 1 … n over the XML document feed the derivation module, which produces security views 1 … n, each a pair (view DTD, xpath( )); queries over a view pass through the rewriter and optimizer of the query translation module]

12 Overcome the limitations of previous proposals
Specification and enforcement: at the conceptual (schema) level
– no need to update the underlying XML data
– no need to materialize views or perform runtime checks
Schema availability: the view schema is automatically derived
– characterizing accessible data
– exposing necessary schema information only
[Figure: architecture diagram as on slide 11]

13 Access-control specification
DTD D: element type definitions A → α, where
α ::= PCDATA | ε | A1, …, Ak | A1 + … + Ak | A*
Specification S = (D, access( )): a mapping access( ) from the edges in the document DTD to {Y, N, [q]}.
For each A → α, for each B in α, define access(A, B) as
– Y: accessible (true)
– N: inaccessible (false)
– [q]: XPath qualifier, conditional: accessible iff [q] holds
XPath fragment:
p ::= ε | A | * | // | p/p | p ∪ p | p[q]
q ::= p | p = “c” | q1 ∧ q2 | q1 ∨ q2 | ¬q
Access policy = document DTD + XPath qualifiers

14 Example: access policy S for syndrome surveillance
access(hospital, patient) = [//diagnosis = “DIS”] -- [q1]
access(patient, record) = [diagnosis = “DIS”] -- [q2]
access(treatment, trial) = N
access(treatment, regular) = N
access(regular, tname) = Y
– overriding: if access(A, B) = Y (N), then the B children of A are accessible (inaccessible), overriding the accessibility of A
– inheritance: if access(A, B) is not explicitly defined, then the B children of A inherit the accessibility of A
– content-based: conditional accessibility via XPath qualifiers
[Figure: DTD graph of D with [q1] on the hospital-patient edge and [q2] on the patient-record edge]
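The overriding, inheritance and qualifier rules can be illustrated by annotating a small in-memory tree. This is a hedged Python sketch, not the presented system (which never annotates data): the Node class, the ACCESS table, the helper functions and the sample document are all hypothetical, and the two qualifiers are hard-coded as Python predicates rather than parsed from XPath.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def descendants(node):
    for child in node.children:
        yield child
        yield from descendants(child)


def q1(patient):  # stands in for [//diagnosis = "DIS"]
    return any(d.tag == "diagnosis" and d.text == "DIS" for d in descendants(patient))


def q2(record):  # stands in for [diagnosis = "DIS"]
    return any(c.tag == "diagnosis" and c.text == "DIS" for c in record.children)


# access(A, B): "Y", "N", or a qualifier evaluated at the B child.
ACCESS = {
    ("hospital", "patient"): q1,
    ("patient", "record"): q2,
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}


def accessible_nodes(node, accessible=True, blocked=False, out=None):
    """Collect accessible nodes in document order.

    `blocked` records that a qualifier failed higher up; it constrains the
    whole subtree, so a later "Y" cannot re-enable access.
    """
    if out is None:
        out = []
    if accessible and not blocked:
        out.append(node)
    for child in node.children:
        rule = ACCESS.get((node.tag, child.tag))
        child_blocked = blocked
        if rule is None:            # inheritance
            child_accessible = accessible
        elif rule == "Y":           # overriding
            child_accessible = True
        elif rule == "N":           # overriding
            child_accessible = False
        else:                       # conditional access
            child_accessible = rule(child)
            child_blocked = blocked or not child_accessible
        accessible_nodes(child, child_accessible, child_blocked, out)
    return out


def patient(name, diagnosis):
    regular = Node("regular", children=[Node("tname", "aspirin"), Node("bill", "10")])
    record = Node("record", children=[
        Node("date", "d1"), Node("diagnosis", diagnosis),
        Node("treatment", children=[regular]),
    ])
    return Node("patient", children=[Node("SSN", "1"), Node("name", name), record])


doc = Node("hospital", children=[patient("Joe", "DIS"), patient("Ann", "flu")])
print([n.tag for n in accessible_nodes(doc)])
```

In this sample, Joe's tname is visible while regular and bill are hidden, and nothing below Ann is visible because [q1] fails for her.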

15 Properties of the specification language
XML tree of the document DTD: the accessibility of each data node is uniquely defined by an access specification
– relative to the path from the root
– a qualifier at a node a constrains the entire subtree rooted at a, e.g., [q2] constrains tname
various levels of granularity: entire subtrees or specific elements
schema level: the underlying XML data is not touched; efficient, easy to specify and maintain
[Figure: DTD graph of D with [q1] and [q2] on the conditionally accessible edges]

16 Enforce access control – security views
XML security view: σ = (Dv, xpath( )) with respect to an access policy S = (D, access( ))
Dv: view DTD, exposed to the user and characterizing the accessible information (of document DTD D) wrt S
– schema availability: to facilitate query formulation
xpath( ): mapping from instances of D to instances of Dv, defined in terms of XPath queries and the view DTD Dv
– for each A → α in Dv, for each B in α, xpath(A, B) = p
– p: generates the B children of an A element in a view
p ::= ε | A | * | // | p/p | p ∪ p | p[q]
q ::= p | p = “c” | q1 ∧ q2 | q1 ∨ q2 | ¬q

17 Example: view DTD for syndrome surveillance
σ = (Dv, xpath( )) with respect to access policy S = (D, access( ))
View DTD Dv: hide trial, trName, regular, bill; expose accessible information only
[Figure: document DTD D (with [q1], [q2]) next to view DTD Dv, whose graph is hospital → patient*; patient → SSN, name, record*; record → date, diagnosis, treatment; treatment → tname*]

18 Example: view definition for syndrome surveillance
xpath( ): maps edges in the view DTD Dv to paths in the document DTD D
semantics: top-down construction preserving the qualifiers in a specification
hospital → patient*
xpath(hospital, patient) = hospital/patient[q1], where [q1]: [//diagnosis = “DIS”]
patient → SSN, name, record*
xpath(patient, SSN) = SSN /* similarly for name */
xpath(patient, record) = record[q2], where [q2]: [diagnosis = “DIS”]
[Figure: view tree built so far: hospital, patient, SSN, name, record]

19 DTD-directed construction of security views
record → date, diagnosis, treatment
xpath(record, date) = date /* similarly for diagnosis, treatment */
treatment → tname*
xpath(treatment, tname) = //tname
DTD-directed construction: view DTD conformance
Never materialized: the construction strategy is just to give the semantics
[Figure: view tree hospital, patient, SSN, name, record, date, diagnosis, treatment, tname, next to the document subtree treatment, trial, trName, regular, tname, bill]

20 Derivation of security-view definition
XML security views are far more intriguing than relational views:
– multiple XPath queries vs. a single SQL query
– DTD vs. relational schema
One needs an algorithm to compute a security-view definition:
Input: an access policy S = (D, access( ))
Output: a security-view definition σ = (Dv, xpath( ))
– sound: accessible information only
– complete: all the accessible data (structure preserving)
– DTD-conformant: conforming to the view DTD
– efficient: O(|S|²) time
– generic: recursive/nondeterministic document DTDs

21 Algorithm: deriving a security-view definition
– top-down traversal of the document DTD D
– short-cutting/renaming (via dummy) inaccessible element types
– normalizing the view DTD Dv and reducing dummy types
Derived mappings:
xpath(hospital, patient) = hospital/patient[q1]
xpath(patient, record) = record[q2]
xpath(record, treatment) = treatment
[Figure: document DTD fragments (hospital, patient, SSN, name, record, date, diagnosis, treatment) next to the corresponding view DTD fragments]

22 Deriving a security-view definition: recursive and non-deterministic productions
Renaming inaccessible element types via dummies:
xpath(treatment, dummy1) = trial
xpath(treatment, dummy2) = regular
Reducing dummy element types:
(dummy1/treatment)*/dummy2/tname ∪ dummy2/tname → (dummy1/treatment)*/dummy2/tname → tname*
xpath(treatment, tname) = //tname
[Figure: treatment subtree of D (trial, trName, regular, tname, bill), the intermediate DTD with dummy1 and dummy2, and the final view production treatment → tname*]
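The short-cutting step can be sketched in Python under strong simplifying assumptions: content models are reduced to child lists, conditionally accessible edges are treated as accessible, and neither xpath( ) mappings nor dummy types are produced. The names DTD, ACCESS, view_children and derive_view_dtd are hypothetical; this is not the derivation algorithm from the talk, only the reachability idea behind it.

```python
from collections import deque

# Child graph of the document DTD (slide 4) and the policy of slide 14.
DTD = {
    "hospital": ["patient"],
    "patient": ["SSN", "name", "record"],
    "record": ["date", "diagnosis", "treatment"],
    "treatment": ["trial", "regular"],
    "trial": ["trName", "treatment"],
    "regular": ["tname", "bill"],
}
ACCESS = {
    ("hospital", "patient"): "[q1]",
    ("patient", "record"): "[q2]",
    ("treatment", "trial"): "N",
    ("treatment", "regular"): "N",
    ("regular", "tname"): "Y",
}


def view_children(parent):
    """Accessible types reachable from an accessible `parent`
    through inaccessible element types only."""
    found, hidden_seen = [], set()
    queue = deque([(parent, True)])
    while queue:
        current, accessible = queue.popleft()
        for child in DTD.get(current, []):
            inherited = "Y" if accessible else "N"
            annotation = ACCESS.get((current, child), inherited)
            if annotation != "N":
                if child not in found:
                    found.append(child)       # becomes a child in the view DTD
            elif child not in hidden_seen:
                hidden_seen.add(child)        # short-cut through the hidden type
                queue.append((child, False))
    return found


def derive_view_dtd(root):
    """Top-down traversal collecting view productions for reachable types."""
    view, todo = {}, [root]
    while todo:
        current = todo.pop()
        if current not in view:
            view[current] = view_children(current)
            todo.extend(view[current])
    return view


view = derive_view_dtd("hospital")
print(view["treatment"])
```

On the running example this yields treatment → tname in the view, with trial, trName, regular and bill absent, which matches the view DTD of slide 17 up to the omitted star operators.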

23 Enforce access control via query rewriting
Security views are virtual: not materialized
– efficiency: no extra costs to support multiple security views over the same large document simultaneously
– consistency/integrity: updating the underlying data introduces no difficulties/overhead
Query translation: one needs an efficient algorithm to rewrite queries over a security view into equivalent and efficient queries over the underlying document
[Figure: a query over security view k (view DTD, xpath( )) passes through the rewriter and optimizer of the query translation module before reaching the XML document]

24 Algorithm rewrite
Input:
– σ = (Dv, xpath( )), a security view wrt S = (D, access( )), and
– an XPath query Qv over the view (Dv)
Output: an equivalent XPath query Qt over the document
– for any XML document T of D, Qt(T) = Qv(σ(T))
Dynamic programming: for any subquery Qv' of Qv and any node A in the view-DTD graph Dv, rewrite Qv' at A by incorporating xpath(A, _) → Qt'(A)
efficient: O(|Qv| |σ|²) time
Handles a practical class of XPath (with union, descendant, qualifiers), vs. the tree-pattern queries studied in previous security models

25 Example: query rewriting for syndrome surveillance
Query over the view:
Qv = //patient[name = “Joe”]//tname
Rewriting by composing the view mappings:
xpath(hospital, patient)[name = “Joe”] / xpath(patient, record) / xpath(record, treatment) / xpath(treatment, tname)
Equivalent query over the document:
Qt = /hospital/patient[name = “Joe” and //diagnosis = “DIS”]/record[diagnosis = “DIS”]/treatment//tname
[Figure: view DTD Dv next to document DTD D with [q1] and [q2]]
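The composition of xpath( ) mappings on this slide can be imitated for the simplest case only: an absolute view query made of child steps, with no qualifiers, unions or descendant axes. The Python sketch below is an assumption-laden illustration, not the dynamic-programming algorithm rewrite; XPATH and rewrite_child_path are hypothetical names, and the mappings are written relative to the parent element, following the slide's XPath fragment.

```python
# xpath(A, B) for each edge of the view DTD, relative to an A element.
XPATH = {
    ("hospital", "patient"): "patient[//diagnosis='DIS']",
    ("patient", "SSN"): "SSN",
    ("patient", "name"): "name",
    ("patient", "record"): "record[diagnosis='DIS']",
    ("record", "date"): "date",
    ("record", "diagnosis"): "diagnosis",
    ("record", "treatment"): "treatment",
    ("treatment", "tname"): "//tname",
}


def rewrite_child_path(view_query):
    """Rewrite an absolute, child-axis-only view query into a document query.

    A step that is not an edge of the view DTD raises KeyError, i.e. the
    query refers to information the view does not expose.
    """
    steps = view_query.strip("/").split("/")
    doc_query = "/" + steps[0]
    for parent, child in zip(steps, steps[1:]):
        mapped = XPATH[(parent, child)]
        # Mappings that start with '//' already carry their own axis.
        doc_query += mapped if mapped.startswith("//") else "/" + mapped
    return doc_query


print(rewrite_child_path("/hospital/patient/record/treatment/tname"))
```

The rewritten query re-introduces the qualifiers [q1] and [q2] and the hidden trial/regular levels (through //tname), so it can be evaluated directly on the underlying document.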

26 Query optimization with structural constraints
Optimize Qt = rewrite(σ, Qv) by leveraging the document DTD D
Q = A[B]//E[F]//H ∪ A[B and C]//H ∪ //F[G]/H → Q' = A/B/E/F/H
disjunction: exclusive constraints
– A[B and C] → empty-set
– exclusive constraint: an A element cannot have both B and C children at the same time
conjunction: existence (nonexistence) constraints
– //F[G]/H → empty-set
– non-existence constraint: an F element does not have a G child
– A[B]//E[F]//H → A/B/E/F/H
– exclusive constraint: B and C do not coexist under an A element
[Figure: DTD graph over element types A, B, C, E, F, G, H]
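As a hedged illustration of the exclusive and non-existence constraints, the Python sketch below checks whether a qualifier that requires certain children can ever hold, given a DTD whose productions are tagged as sequence, choice or star. It uses the patient DTD of slide 4 rather than the A … H graph of this slide, whose exact edges are not recoverable from the transcript; DTD and qualifier_unsatisfiable are hypothetical names.

```python
# Productions of the patient DTD, tagged with the kind of content model.
DTD = {
    "hospital": ("star", ["patient"]),
    "patient": ("seq", ["SSN", "name", "record"]),
    "record": ("seq", ["date", "diagnosis", "treatment"]),
    "treatment": ("choice", ["trial", "regular"]),
    "trial": ("seq", ["trName", "treatment"]),
    "regular": ("seq", ["tname", "bill"]),
}


def qualifier_unsatisfiable(parent, required_children):
    """True if parent[c1 and c2 and ...] can never hold under the DTD."""
    kind, children = DTD.get(parent, ("seq", []))
    # Non-existence constraint: a required child type is not allowed at all.
    if any(child not in children for child in required_children):
        return True
    # Exclusive constraint: a choice admits only one of its alternatives.
    if kind == "choice" and len(set(required_children)) > 1:
        return True
    return False


print(qualifier_unsatisfiable("treatment", ["trial", "regular"]))
print(qualifier_unsatisfiable("regular", ["trName"]))
print(qualifier_unsatisfiable("regular", ["tname", "bill"]))
```

A subquery whose qualifier is found unsatisfiable can be replaced by the empty set before evaluation, which is the kind of pruning the slide describes.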

27 Example: heuristic for XPath containment
Q = //*[C]//E ∪ //E → Q' = A/B/E
Rule: Q1 ∪ Q2 ≡ Q2 if Q1 ⊆ Q2
//*[C]//E ∪ //E → //E → A/B/E
Heuristic for XPath containment (NP-hard for small fragments in the presence of DTDs):
– image graph: evaluation of sub-queries over the DTD graph
– containment test: extension of simulation
– Q1 ⊆ Q2 if image(Q1) is simulated by image(Q2)
– qualifiers: inverse simulation
– effective: preliminary experimental study (speedup up to a factor of 2)
[Figure: DTD graph over A, B, C, E; image graph for //*[C]//E (A, B [C], E); image graph for //E (A, B, E)]
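The containment test relies on graph simulation. The Python sketch below computes plain simulation between two labelled graphs by fixpoint refinement; it ignores qualifiers (which the slide handles by inverse simulation) and is therefore only the basic building block, not the heuristic itself. The graph encoding, the function simulated_by and the two sample graphs are hypothetical.

```python
def simulated_by(g1, root1, g2, root2):
    """True if root1 in g1 is simulated by root2 in g2.

    Each graph maps a node id to (label, [successor ids]).
    """
    # Start from all label-compatible pairs and refine to a fixpoint.
    sim = {(u, v) for u in g1 for v in g2 if g1[u][0] == g2[v][0]}
    changed = True
    while changed:
        changed = False
        for u, v in list(sim):
            # Every move of u must be matched by some move of v.
            for u_next in g1[u][1]:
                if not any((u_next, v_next) in sim for v_next in g2[v][1]):
                    sim.discard((u, v))
                    changed = True
                    break
    return (root1, root2) in sim


# Hypothetical image graphs: a chain A -> B -> E, and the same chain
# with an extra C child under A.
chain = {"a": ("A", ["b"]), "b": ("B", ["e"]), "e": ("E", [])}
wider = {
    "a": ("A", ["b", "c"]),
    "b": ("B", ["e"]),
    "c": ("C", []),
    "e": ("E", []),
}

print(simulated_by(chain, "a", wider, "a"))
print(simulated_by(wider, "a", chain, "a"))
```

The chain is simulated by the wider graph but not the other way round, since the A → C edge has no counterpart in the chain.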

28 Summary
Security views: the first model for specifying/enforcing XML security at a schema level and providing schema availability
– a fine-grained access-control specification language
– an effective enforcement framework via security views
– view DTD: characterizing accessible information
– algorithm for deriving security-view definitions
– algorithms for query rewriting/optimization: no need to materialize views or to perform runtime security checks
Future work:
– reasoning about security views (soundness, completeness, DTD conformance; these subsume XPath satisfiability with DTDs)
– inference control in the presence of external knowledge
A practical solution for securing XML querying