Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation MQP Advisor: Prof. Elke A. Rundensteiner Sponsor: Verizon Laboratories Incorporated MQP Project Members: Tien Vu, Mirek Cymer, John Lee
HTML vs. XML Microsoft, IBM, Informix, Oracle, Sun,...
XML Data Management by RDBMS
Advantages: efficient query and analysis tools; mature database tools available; easy integration with existing business databases.
Issues: mapping between XML and the relational model; update propagation; query translation and optimization.
Motivation for Mapping
Query performance varies with how the data is mapped.
Flexible mapping: fixed translation and restructure.
[Figure: car with make, model, and year (Ford, Mustang, 2001), and an alternate mapping as a car table with columns make, model, year and the row Ford, Mustang, 2001.]
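To make the point about mapping concrete, here is a minimal sketch of the same car stored under two relational layouts. It uses Python's sqlite3 as a stand-in for the RDBMS. The table layouts, the `car_inlined` name, and the reuse of the `iid`/`pid` key columns are assumptions for illustration, not the Rainbow schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed fixed translation: one table per element,
    -- each child row points to its parent through pid.
    CREATE TABLE car   (iid INTEGER PRIMARY KEY);
    CREATE TABLE make  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
    CREATE TABLE model (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
    CREATE TABLE year  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
    INSERT INTO car   VALUES (1);
    INSERT INTO make  VALUES (1, 1, 'Ford');
    INSERT INTO model VALUES (1, 1, 'Mustang');
    INSERT INTO year  VALUES (1, 1, '2001');

    -- Assumed alternate mapping: the children are inlined as columns of car.
    CREATE TABLE car_inlined (iid INTEGER PRIMARY KEY, make TEXT, model TEXT, year TEXT);
    INSERT INTO car_inlined VALUES (1, 'Ford', 'Mustang', '2001');
""")

# The same question needs three joins under the first mapping...
nested_query = """
    SELECT make.value, model.value, year.value
    FROM car, make, model, year
    WHERE make.pid = car.iid AND model.pid = car.iid AND year.pid = car.iid
"""
# ...and none under the alternate mapping.
inlined_query = "SELECT make, model, year FROM car_inlined"

nested_rows = conn.execute(nested_query).fetchall()
inlined_rows = conn.execute(inlined_query).fetchall()
```

Both queries return the same row, but the work the database does to produce it differs, which is why the choice of mapping can affect query performance.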
Rainbow Architecture
[Figure: architecture diagram. Inputs: DTD, XML, XML Query. Components: User, XML Query Engine, DTDM Manager, XML Manager, Restructuring Subsystem, RDBMS. Legend labels: XML, Data, Subsystem.]
Goals of our MQP
What: Implement and evaluate restructuring subsystems within the large-scale Rainbow system.
How: Learn about database technologies and web tools. Translate research ideas into a software system design. Practice software engineering techniques: UML, engineering and reusing code. Design an experimental test plan and test bed. Conduct a performance study and analysis.
Restructuring Subsystem
[Figure: subsystem diagram. Inputs: DTD, XML, XML Query. Components: User, XML Query Engine, DTDM Manager, XML Manager, Restructuring (Mapping, Restructure Operator Library, Restructurer), Query, Storage. Legend: XML Model, Subsystem, Relational Model, Internal Process.]
Restructuring Operators
11 restructuring operators: Rename Item/Attribute; Switch Nesting; Pushup/Pushdown Attribute; Pushup/Pushdown Nesting; Split/Merge Nesting; Reference/Dereference.
Mapping: Sequence of Restructure Operators
A mapping is modeled as a sequence of reversible restructuring operators, each given as operator name + parameters.
For example:
pushUpAttribute( ‘account_number’, ‘value’, ‘invoice’, ‘account_number’ );
pushUpAttribute( ‘bill_period’, ‘value’, ‘invoice’, ‘bill_period’ );
renameItem( ‘invoice’, ‘summary’ );
[Figure: before, invoice with account_num and bill_period, each holding a value; after, summary with account_num and bill_period.]
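Because every operator is reversible, a whole mapping can be undone by inverting its operators in reverse order. The sketch below illustrates that idea only. `Op`, `inverse`, and `invert_mapping` are hypothetical names. The assumption that `pushDownAttribute` takes the same parameter list as `pushUpAttribute` is also ours, since the slides give only the operator pairs and not their signatures.

```python
from typing import List, NamedTuple, Tuple


class Op(NamedTuple):
    """One restructuring step: operator name + parameters (hypothetical model)."""
    name: str
    params: Tuple[str, ...]


def inverse(op: Op) -> Op:
    """Return the operator assumed to undo `op`."""
    if op.name == "renameItem":
        old, new = op.params
        return Op("renameItem", (new, old))  # rename back
    if op.name == "pushUpAttribute":
        # Assumption: push-down mirrors push-up with the same parameters.
        return Op("pushDownAttribute", op.params)
    if op.name == "pushDownAttribute":
        return Op("pushUpAttribute", op.params)
    raise ValueError(f"no inverse known for operator {op.name!r}")


def invert_mapping(mapping: List[Op]) -> List[Op]:
    """Undo a mapping: invert each operator, last applied first."""
    return [inverse(op) for op in reversed(mapping)]


mapping = [
    Op("pushUpAttribute", ("account_number", "value", "invoice", "account_number")),
    Op("pushUpAttribute", ("bill_period", "value", "invoice", "bill_period")),
    Op("renameItem", ("invoice", "summary")),
]
undo = invert_mapping(mapping)
```

The rename is undone first, so the push-down steps that follow can again refer to `invoice` by its original name.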
SQLs for Push-Up Attributes
CREATE VIEW new.A (, a) AS SELECT A., B.b FROM old.A, old.B WHERE B.pid = A.iid
CREATE VIEW new.B () AS SELECT B. FROM old.B
[Figure: push-up of b from B into A, where it becomes a.]
Example SQLs
Inline: account_number.value into invoice as attribute account_number.
Mapping:
pushUpAttribute( ‘account_number’, ‘value’, ‘invoice’, ‘account_number’ );
SQL statements:
CREATE VIEW new.invoice (iid, pid, account_number) AS
SELECT invoice.iid, invoice.pid, account_number.value
FROM old.invoice, old.account_number
WHERE account_number.pid = invoice.iid
CREATE VIEW new.account_number (iid, pid) AS
SELECT account_number.iid, account_number.pid
FROM old.account_number
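The two views can be tried end to end on a toy database. This sketch uses Python's sqlite3 rather than Oracle 8i. It replaces the `old.` and `new.` schema prefixes with `old_` and `new_` name prefixes, an adaptation made because SQLite views cannot reference tables in another attached database. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old schema: account_number is a child table of invoice (pid -> invoice.iid).
    CREATE TABLE old_invoice (iid INTEGER PRIMARY KEY, pid INTEGER);
    CREATE TABLE old_account_number (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
    INSERT INTO old_invoice VALUES (10, 1);                    -- made-up sample row
    INSERT INTO old_account_number VALUES (20, 10, 'ACC-001'); -- made-up sample row

    -- Push-up: invoice gains an account_number column taken from the child's value.
    CREATE VIEW new_invoice (iid, pid, account_number) AS
        SELECT old_invoice.iid, old_invoice.pid, old_account_number.value
        FROM old_invoice, old_account_number
        WHERE old_account_number.pid = old_invoice.iid;

    -- The child keeps its keys but no longer exposes the value column.
    CREATE VIEW new_account_number (iid, pid) AS
        SELECT old_account_number.iid, old_account_number.pid
        FROM old_account_number;
""")

new_invoice_rows = conn.execute("SELECT * FROM new_invoice").fetchall()
new_account_rows = conn.execute("SELECT * FROM new_account_number").fetchall()
```

Since the restructured schema is defined as views over the old tables, no data is copied: a query against `new_invoice` reads the pushed-up attribute without naming the join itself.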
Rainbow Implementation
Development tools: Java (Visual Café 2, Javadoc, Java 2); Oracle 8i, XML4J, JDBC 1.2, SQL queries.
Code facts: 44 total system classes (17 Rainbow classes, 27 classes reused); ? lines of system code; ? lines of Rainbow code; ? lines of code reused.
Screen Shot
Screen Shot
Rainbow Test & Experimental Evaluation
Experimental setup: Oracle 8i, Windows NT.
Data: created a DTD; randomly generated XML; hand-translated queries.
Factors: type of query; number of operations.
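As a rough illustration of the "randomly generated XML" step, the sketch below builds a small test document with a seeded random generator so that runs are repeatable. The element names reuse the invoice example from the mapping slide. The `invoices` root, the `value` attribute layout, and the value ranges are assumptions, not the DTD used in the experiments.

```python
import random
import xml.etree.ElementTree as ET


def random_invoices(count: int, seed: int = 0) -> ET.Element:
    """Build <invoices> with `count` random <invoice> children (hypothetical layout)."""
    rng = random.Random(seed)  # seeded so the same test data can be regenerated
    root = ET.Element("invoices")
    for _ in range(count):
        invoice = ET.SubElement(root, "invoice")
        account = ET.SubElement(invoice, "account_number")
        account.set("value", str(rng.randint(100000, 999999)))
        period = ET.SubElement(invoice, "bill_period")
        period.set("value", f"2000-{rng.randint(1, 12):02d}")
    return root


doc = random_invoices(3, seed=42)
xml_text = ET.tostring(doc, encoding="unicode")
```

Fixing the seed keeps the data identical across runs, so differences in measured query time can be attributed to the factors under study rather than to the data.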
Query Performance Evaluation
Rainbow Conclusions
Technical accomplishments: functional prototype system; feasibility of Rainbow concepts; automated test bed designed; performance evaluations show that (ideally) moving data up at the embedded-relational level yields better query performance for join queries.
Knowledge gained: OO, Java, JDBC, SQL, RDBMS, XML, DTD; teamwork, software engineering, and software reuse; logistics of setting up an experiment.
Future work: experiment test plans and test beds to realize the full potential of the restructuring component.
Rainbow: XML and Relational Database Design, Implementation, and Evaluation
Project Members: Tien Vu, Mirek Cymer, John Lee
Advisor: Elke A. Rundensteiner
Ph.D. Student: Xin Zhang
Sponsored by: Verizon Laboratories Incorporated
Visit Rainbow at
Recycled!!!
XML: The Future of the Web
Benefits: efficient query and analysis tools; mature data warehousing support; easy integration with existing business databases.
Applications: e-commerce; web-based industries.
[Figure: sample bill entry: Jun 9 - Jul 8, 2000, Sprint, $0.25.]
XML and Relational Database
Problem: Many applications change their data very frequently, e.g., flight reservation, online billing, inventory.
Current solution: Reload the complete XML document whenever it changes, which is very expensive.
Rainbow solution: Incrementally propagate XML document updates to the stored XML data.
Goal: XML repository implemented using an RDBMS.
Approach: Flexible mapping.
Features: DTD metadata management in the RDB; automatic schema creation; incremental update propagation; XML query optimization.
Rainbow Analysis
Rainbow Analysis Cont..
HTML vs. XML
HTML: <h1>Car</h1><h2>Make</h2> Ford Mustang <h2>Seats</h2><p>5 Top Speed 70 m.p.h
XML: Car Ford Mustang 5 70
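The practical difference is that XML names each piece of data, so a program can ask for a field directly instead of guessing from presentation markup. A minimal sketch with Python's standard library follows. The tag names `car`, `make`, `seats`, and `top_speed` and the `unit` attribute are assumptions, since the slide's XML markup is not preserved here.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of the car from the slide; tag names are assumed.
xml_doc = """
<car>
  <make>Ford Mustang</make>
  <seats>5</seats>
  <top_speed unit="m.p.h">70</top_speed>
</car>
"""

car = ET.fromstring(xml_doc)

# Each value is addressed by the name of its element, not by its position on a page.
make = car.findtext("make")
seats = int(car.findtext("seats"))
top_speed = int(car.findtext("top_speed"))
speed_unit = car.find("top_speed").get("unit")
```

In the HTML version the same values sit between heading and paragraph tags, which describe how to display them but not what they mean.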