1 Management of XML Documents in Object-Relational Databases
Thomas Kudrass, Matthias Conrad
HTWK Leipzig
EDBT Workshop XML-Based Data Management, Prague, 24 March 2002

2 © T. Kudrass, HTWK Leipzig
Overview
  – Motivation
  – Object-Relational Database Concepts
  – Parsing XML Documents
  – XML-to-ORDB Mapping
  – Meta-Data
  – Special Issues
  – Conclusions

3 Motivation
Storing of XML documents in DBMS
Use existing database technology
Dealing with complex objects:
  – XML documents = complex objects
  – avoid any decomposition
  – object-relational database technology: good choice to represent complex objects

4 User-Defined Types in ORDB
Complex Data Types
  – Object Type
  – Collection Type
Object References
Object Views

5 Example: Object Types
    CREATE TYPE Type_Professor AS OBJECT (
      PName   VARCHAR(80),
      Subject VARCHAR(120)
    );
Object-valued attribute:
    CREATE TYPE Type_Course AS OBJECT (
      Name      VARCHAR(100),
      Professor Type_Professor
    );
Object table:
    CREATE TABLE TabProfessor OF Type_Professor;

6 Example: Collection Types
    CREATE TYPE Type_Professor AS OBJECT (
      PName   VARCHAR(80),
      Subject VARCHAR(120)
    );
Array:
    CREATE TYPE TypeVa_Professor AS VARRAY(5) OF Type_Professor;
Nested Table:
    CREATE TYPE Type_TabProfessor AS TABLE OF Type_Professor;
    CREATE TABLE TabDept (
      DName     VARCHAR(80),
      Professor Type_TabProfessor
    ) NESTED TABLE Professor STORE AS TabProfessor_List;

7 Example: Object References
    CREATE TYPE Type_Professor AS OBJECT (
      PName VARCHAR(80),
      Dept  VARCHAR(120)
    );
    CREATE TABLE TabProfessor OF Type_Professor;
    CREATE TYPE Type_Course AS OBJECT (
      Name     VARCHAR(200),
      Prof_Ref REF Type_Professor
    );
    CREATE TABLE TabCourse OF Type_Course;
Prof_Ref: Reference to objects of object table TabProfessor

8 Parsing DTD and XML
[Figure: processing pipeline; image not preserved. Components shown: XML Document, XML V2 Parser (Well-Formedness and Validity Check), XML DOM Tree; DTD, DTD Parser (Syntax Check), DTD DOM Tree; XML2Oracle; Schema Definition; DBMS Oracle, accessed via JDBC / ODBC.]
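The well-formedness check named in the figure can be tried with any XML parser. The following is a minimal sketch in Python that uses the standard-library `xml.dom.minidom` in place of the Oracle XML V2 parser from the slide. It builds a DOM tree and checks well-formedness only; it does not validate against a DTD. The function name `parse_well_formed` and the sample documents are illustrative assumptions.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parse_well_formed(xml_text):
    """Return a DOM document if xml_text is well-formed, else None."""
    try:
        return minidom.parseString(xml_text)
    except ExpatError:
        # Raised for violations such as unclosed or mismatched tags.
        return None


# Hypothetical sample documents, not taken from the slides.
good = "<University><Student><LName>Conrad</LName></Student></University>"
bad = "<University><Student></University>"

dom = parse_well_formed(good)
print(dom.documentElement.tagName)      # name of the root element
print(parse_well_formed(bad) is None)   # mismatched tag is rejected
```

A validity check against the DTD would need a validating parser, which the Python standard library does not provide.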

9 [Figure: diagram with nodes numbered 1 to 14, presumably the DTD tree of the running example; image not preserved]

10 Object-Based Mapping
DTD → Classes → Tables
Classes:
    CLASS A {
      STRING b;
      C c;
    }
    CLASS C {
      STRING d;
    }
Tables:
    CREATE TABLE A (
      a_pk INTEGER NOT NULL,
      b    VARCHAR(30) NOT NULL
    );
    CREATE TABLE C (
      c_pk INTEGER NOT NULL,
      a_fk INTEGER NOT NULL,
      d    VARCHAR(10) NOT NULL
    );
Modification of the Mapping Algorithm [Bourret]:
  – No class definitions
  – Use objects of the DTD tree

11 Step 1
Each Complex Element → Table
Each Set-Valued Element → Table
Primary Key in each Table
[Figure: diagram with nodes numbered 1 to 14; image not preserved. Leading numbers below are the node labels from the figure.]
    1  CREATE TABLE TabUniversity ( IDUniversity ...
    2  CREATE TABLE TabStudent ( IDStudent ...
    4  CREATE TABLE TabCourse ( IDCourse ...
    5  CREATE TABLE TabProfessor ( IDProfessor ...
       CREATE TABLE TabSubject ( IDSubject ...

12 Step 2
Other Elements & Attributes → Table Columns
    CREATE TABLE TabUniversity ( IDUniversity, attrStudyCourse, ...
    CREATE TABLE TabStudent ( IDStudent, attrStudNr, attrLName, attrFName, ...
    CREATE TABLE TblMatrikelNr ( IDMatrikelNr, attrMNummer, ...
    CREATE TABLE TabCourse ( IDCourse, attrName, attrCreditPts, ...
    CREATE TABLE TabProfessor ( IDProfessor, attrPName, attrDept, ...
    CREATE TABLE TabSubject ( IDSubject, attrSubject, ...
[Figure: diagram with nodes numbered 1 to 14; image not preserved]

13 Step 3
Relationships between Elements → Foreign Keys
    CREATE TABLE TabUniversity (
      IDUniversity    INTEGER NOT NULL,
      attrStudyCourse VARCHAR(4000) NOT NULL,
      PRIMARY KEY (IDUniversity)
    );
    CREATE TABLE TabStudent (
      IDStudent    INTEGER NOT NULL,
      IDUniversity INTEGER NOT NULL,
      attrStudNr   VARCHAR(4000) NOT NULL,
      attrLName    VARCHAR(4000) NOT NULL,
      attrFName    VARCHAR(4000) NOT NULL,
      PRIMARY KEY (IDStudent),
      CONSTRAINT conMatrikel FOREIGN KEY (IDUniversity)
        REFERENCES TabUniversity (IDUniversity)
    );
    ...
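Steps 1 to 3 can be sketched as one recursive pass over a tree of elements. The Python below is an illustration under stated assumptions, not the XML2Oracle implementation: the `Node` class is a hypothetical stand-in for the DTD tree, every column is emitted as `VARCHAR(4000) NOT NULL` as on the slide, and the foreign key is left unnamed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One complex or set-valued element of the DTD tree (hypothetical model)."""
    name: str
    columns: list = field(default_factory=list)   # simple elements and attributes
    children: list = field(default_factory=list)  # nested complex elements


def create_tables(node, parent=None):
    """Return one CREATE TABLE statement per node, parent tables first."""
    lines = [f"ID{node.name} INTEGER NOT NULL"]              # step 1: key column
    if parent is not None:
        lines.append(f"ID{parent.name} INTEGER NOT NULL")   # step 3: foreign key column
    # Step 2: remaining elements and attributes become columns.
    lines += [f"attr{c} VARCHAR(4000) NOT NULL" for c in node.columns]
    lines.append(f"PRIMARY KEY (ID{node.name})")
    if parent is not None:
        lines.append(
            f"FOREIGN KEY (ID{parent.name}) "
            f"REFERENCES Tab{parent.name} (ID{parent.name})"
        )
    ddl = [f"CREATE TABLE Tab{node.name} (\n  " + ",\n  ".join(lines) + "\n);"]
    for child in node.children:
        ddl += create_tables(child, node)
    return ddl


university = Node("University", ["StudyCourse"], [
    Node("Student", ["StudNr", "LName", "FName"]),
])
statements = create_tables(university)
print("\n\n".join(statements))
```

Emitting the parent before its children keeps the statements in an order in which each `REFERENCES` clause points at a table that already exists.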

14 ORDBS Oracle and XML
Basic Idea:
  – Generate an object-relational schema from the DTD
  – Natural representation of an XML document by combining user-defined types
Different Mapping Rules:
  – Simple elements
  – Complex elements
  – Set-valued elements
  – Complex set-valued elements

15 XML Attributes & Simple Elements
Elements of #PCDATA type and XML attributes → Attributes of the object type
Domain of Simple Elements:
  – No type information in the DTD: numeric vs. alphanumeric? length?
  – Restrictions of the DBMS (e.g. VARCHAR [Oracle] 4000 characters)
Mapping of an XML attribute of a simple element → Definition of an object type for both attribute and element
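The two rules can be sketched as a small DDL generator. This Python is a hedged illustration: the function names are hypothetical, and the element and attribute names are borrowed from the `Type_Professor` / `Type_Dept` example on the next slide. Because the DTD carries no type information, every simple member falls back to `VARCHAR(4000)`.

```python
def object_type_ddl(type_name, members):
    """Build CREATE TYPE ... AS OBJECT from (name, sql_type) pairs.

    A sql_type of None marks a simple element or XML attribute, which
    gets the fallback domain VARCHAR(4000).
    """
    cols = [f"  attr{name} {sql_type or 'VARCHAR(4000)'}" for name, sql_type in members]
    return f"CREATE TYPE Type_{type_name} AS OBJECT (\n" + ",\n".join(cols) + "\n);"


def simple_element_with_attributes(element, xml_attributes):
    """A simple element that carries XML attributes becomes its own object type."""
    members = [(element, None)] + [(a, None) for a in xml_attributes]
    return object_type_ddl(element, members)


# Dept is a simple element with one XML attribute, DAddress.
dept = simple_element_with_attributes("Dept", ["DAddress"])
professor = object_type_ddl(
    "Professor",
    [("PAddress", None), ("PName", None), ("Subject", None), ("Dept", "Type_Dept")],
)
print(dept)
print(professor)
```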

16 XML Attributes & Simple Elements
    CREATE TYPE Type_Dept AS OBJECT (
      attrDept     VARCHAR(4000),
      attrDAddress VARCHAR(4000)
    );
    CREATE TYPE Type_Professor AS OBJECT (
      attrPAddress VARCHAR(4000),
      attrPName    VARCHAR(4000),
      attrSubject  VARCHAR(4000),
      attrDept     Type_Dept
    );
    CREATE TABLE TabProfessor OF Type_Professor;

17 Complex Elements
Nesting of elements by composite DB object types
    CREATE TYPE Type_Professor AS OBJECT (
      attrPName   VARCHAR(4000),
      attrSubject VARCHAR(4000),
      attrDept    VARCHAR(4000)
    );
    CREATE TYPE Type_Course AS OBJECT (
      attrName      VARCHAR(4000),
      attrProfessor Type_Professor,
      attrCreditPts VARCHAR(4000)
    );
    CREATE TYPE Type_Student AS OBJECT (
      attrStudNr VARCHAR(4000),
      attrLName  VARCHAR(4000),
      attrFName  VARCHAR(4000),
      attrCourse Type_Course
    );
    CREATE TABLE TabUniversity (
      attrStudyCourse VARCHAR(4000),
      attrStudent     Type_Student
    );
    INSERT INTO TabUniversity VALUES (
      'Computer Science',
      Type_Student('23374', 'Conrad', 'Matthias',
        Type_Course('Databases II',
          Type_Professor('Kudrass', 'Database Systems', 'Computer Science'),
          '4')));
    SELECT u.attrStudent.attrLName
    FROM TabUniversity u
    WHERE u.attrStudent.attrCourse.attrProfessor.attrPName = 'Kudrass';
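The nested constructor call in the INSERT statement mirrors the element nesting of the document, so it can be derived by walking the DOM tree. The sketch below is an assumption-laden illustration in Python: the XML element names are guessed from the `attr...` names on the slide (the source document itself is not shown), XML attributes are ignored, and `to_constructor` is a hypothetical helper.

```python
from xml.dom import minidom


def sql_literal(text):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def to_constructor(element):
    """Map an element to a nested type constructor or a string literal."""
    children = [c for c in element.childNodes if c.nodeType == c.ELEMENT_NODE]
    if not children:
        # Simple element: its text content becomes a VARCHAR literal.
        text = "".join(c.data for c in element.childNodes if c.nodeType == c.TEXT_NODE)
        return sql_literal(text.strip())
    args = ", ".join(to_constructor(c) for c in children)
    return f"Type_{element.tagName}({args})"


doc = minidom.parseString(
    "<Student><StudNr>23374</StudNr><LName>Conrad</LName><FName>Matthias</FName>"
    "<Course><Name>Databases II</Name>"
    "<Professor><PName>Kudrass</PName><Subject>Database Systems</Subject>"
    "<Dept>Computer Science</Dept></Professor>"
    "<CreditPts>4</CreditPts></Course></Student>"
)
expr = to_constructor(doc.documentElement)
print(expr)
```

The constructor arguments follow document order, so this only works when the child order in the document matches the attribute order of the object type.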

18 Set-Valued Elements
Multiple Occurrence (in DTD): marked by + or *
DBMS Restrictions
  – collection type applicable to set-valued elements with text-valued subelements, e.g. ARRAY OF VARCHAR
  – collection type not applicable to set-valued elements with complex subelements (subelements may be set-valued again)
Solutions
  – use newer DBMS releases (e.g. Oracle 9i)
  – model relationships with object references
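Detecting set-valued elements amounts to looking for the `+` and `*` occurrence indicators in a content model. The Python sketch below handles only flat content models such as `(A, B+, C*)`; nested groups are out of scope. The content models used are assumptions, since the DTD is not reproduced in the transcript, and the choice between collection and reference follows the restriction listed above.

```python
import re

# A name followed by an optional occurrence indicator.
OCCURRENCE = re.compile(r"([A-Za-z_][\w.-]*)([+*?]?)")


def set_valued_children(content_model):
    """Return the child names marked + or * in a flat DTD content model."""
    return [
        name
        for name, occ in OCCURRENCE.findall(content_model)
        if occ in ("+", "*")
    ]


def plan(content_model, text_valued):
    """Map each set-valued child to 'collection' or 'reference'.

    text_valued: names of children whose content is #PCDATA only.
    """
    return {
        name: "collection" if name in text_valued else "reference"
        for name in set_valued_children(content_model)
    }


student_model = "(StudNr, LName, FName, Course+)"      # assumed content model
professor_model = "(PName, Subject+, Dept)"            # assumed content model
print(set_valued_children(student_model))
print(plan(professor_model, {"Subject"}))
print(plan(student_model, set()))
```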

19 Set-Valued Elements
Set-valued element Student: modeling in object type Type_Student with a reference to objects of the table TabUniversity
    CREATE TYPE Type_University AS OBJECT (
      attrStudyCourse VARCHAR(4000)
    );
    CREATE TABLE TabUniversity OF Type_University;
    CREATE TYPE Type_Student AS OBJECT (
      attrJahrgang   VARCHAR(4000),
      attrUniversity REF Type_University   -- Reference to University objects
    );
    CREATE TABLE TabStudent OF Type_Student;

20 Set-Valued Elements
    CREATE TYPE TypeVA_Subject AS VARRAY(100) OF VARCHAR(4000);
    CREATE TYPE Type_Professor AS OBJECT (
      attrPName   VARCHAR(4000),
      attrSubject TypeVA_Subject,
      attrDept    VARCHAR(4000)
    );
    CREATE TYPE TypeVA_Professor AS VARRAY(100) OF Type_Professor;
    CREATE TYPE Type_Course AS OBJECT (
      attrName      VARCHAR(4000),
      attrProfessor TypeVA_Professor,
      attrCreditPts VARCHAR(4000)
    );
    CREATE TYPE TypeVA_Course AS VARRAY(100) OF Type_Course;
    CREATE TYPE Type_Student AS OBJECT (
      attrStudNr VARCHAR(4000),
      attrLName  VARCHAR(4000),
      attrFName  VARCHAR(4000),
      attrCourse TypeVA_Course
    );
    -- Assumed: TypeVA_Student is used by the INSERT on slide 21 but is not declared on this slide.
    CREATE TYPE TypeVA_Student AS VARRAY(100) OF Type_Student;
    CREATE TABLE TabUniversity (
      attrStudyCourse VARCHAR(4000),
      attrStudent     TypeVA_Student
    );

21 Set-Valued Elements Example
    INSERT INTO TabUniversity VALUES (
      'Computer Science',
      TypeVA_Student(
        Type_Student('23374', 'Conrad', 'Matthias',
          TypeVA_Course(
            Type_Course('Databases II',
              TypeVA_Professor(
                Type_Professor('Kudrass',
                  TypeVA_Subject('Database Systems', 'Operating Systems'),
                  'Computer Science')),
              '4'),
            Type_Course('CAD Intro',
              TypeVA_Professor(
                Type_Professor('Jaeger',
                  TypeVA_Subject('CAD', 'CAE'),
                  'Computer Science')),
              '4'),
            ...)),
        Type_Student('00011', 'Meier', 'Ralf', ... ) ... )...);

22 Dealing with Null Values
Restrictions with NOT NULL constraints in object-relational DB schema
  – NOT NULL constraints in table, not in object type!
  – NOT NULL constraints not applicable to collection types
Object-valued attributes:
  – use CHECK constraints for NOT NULL
Loss of DTD semantics
DTD in the database

23 Dealing with CHECK Constraints
    CREATE TYPE Type_Address AS OBJECT (
      attrStreet VARCHAR(4000),
      attrCity   VARCHAR(4000)
    );
    CREATE TYPE Type_Course AS OBJECT (
      attrName    VARCHAR(4000),
      attrAddress Type_Address
    );
    CREATE TABLE TabCourse OF Type_Course (
      attrName NOT NULL,
      CHECK (attrAddress.attrStreet IS NOT NULL)
    );
1. Desired error message, ORA-02290 (check constraint violated):
    INSERT INTO TabCourse VALUES ('CAD Intro', Type_Address(NULL, 'Leipzig'));
2. Undesired error message, ORA-02290 (check constraint violated):
    INSERT INTO TabCourse VALUES ('RN', NULL);
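Why both inserts fail can be traced with a small model of the constraint. In the Python sketch below, `None` stands in for SQL NULL, and reading `attrStreet` through a NULL address yields NULL, so the slide's check rejects a course without any address as well. `check_relaxed` models an alternative constraint, `CHECK (attrAddress IS NULL OR attrAddress.attrStreet IS NOT NULL)`, which is an assumption of this sketch: it does not appear on the slides and has not been verified against Oracle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    street: Optional[str]
    city: Optional[str]


@dataclass
class Course:
    name: Optional[str]
    address: Optional[Address]


def street_of(course):
    """Like attrAddress.attrStreet: NULL when the address itself is NULL."""
    return course.address.street if course.address is not None else None


def check_on_slide(course):
    """Models CHECK (attrAddress.attrStreet IS NOT NULL)."""
    return street_of(course) is not None


def check_relaxed(course):
    """Assumed alternative: accept a missing address, reject a missing street."""
    return course.address is None or street_of(course) is not None


missing_street = Course("CAD Intro", Address(None, "Leipzig"))
no_address = Course("RN", None)
print(check_on_slide(missing_street), check_on_slide(no_address))   # both rows rejected
print(check_relaxed(missing_street), check_relaxed(no_address))     # only the first rejected
```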

24 Meta-Data about XML Documents
Unique DocumentID for each Document
Prolog Information
Document Location (URL)
Name Space
Element vs. Attribute

25 Naming Conventions for DB Objects
Rules:
  – TabElementname → Table Name
  – Type_Elementname → Object Type Name
  – TypeVa_Elementname → Array Name
No Conflicts with Keywords
Introduction of a Schema ID
Naming Rule: SchemaID + Naming Convention + Name
    CREATE TYPE DTD01_Type_University AS OBJECT (
      attrStudyCourse VARCHAR(4000)
    );
    CREATE TYPE DTD02_Type_University AS OBJECT (
      attrRegister VARCHAR(4000)
    );
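The naming rule is a plain string concatenation, which a few lines of Python can illustrate. The function `db_name` and the `kind` keys are hypothetical; the prefixes and the underscore after the schema ID are taken from the examples on the slide.

```python
# Prefix per kind of database object, as listed in the rules.
PREFIXES = {"table": "Tab", "type": "Type_", "array": "TypeVa_"}


def db_name(schema_id, kind, element):
    """SchemaID + naming convention prefix + element name."""
    return f"{schema_id}_{PREFIXES[kind]}{element}"


print(db_name("DTD01", "type", "University"))   # DTD01_Type_University
print(db_name("DTD02", "type", "University"))   # DTD02_Type_University
print(db_name("DTD01", "table", "Order"))       # DTD01_TabOrder
```

The last call shows how the prefix keeps an element named like an SQL keyword from colliding with it, and the first two show how the schema ID separates equally named elements from different DTDs.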

26 Conclusions: Advantages
Non-atomic domains possible
  – Natural representation of XML Documents
  – Nesting of any complexity possible
Simple queries by using dot notation
Using object references to represent relationships (OIDs)

27 Conclusions: Drawbacks
Mapping Deficiencies
  – Possible restrictions of element types in collections
  – No adequate mapping of NOT NULL constraints
Loss of Information
  – Prolog, Comments, Processing Instructions
  – Entity References
  – Attribute vs. Element?
Schema Evolution
  – Modification of DTD → Modification of DB
Type Information
  – Target type: VARCHAR, not sufficient!

28 Outlook
Graph-based creation of a schema
Source: XML Schema
Use CLOB datatype
Enhance Meta-Schema
  – Comments, Processing Instructions and their position in document
  – Entity references and their substitution text

