Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 17 Data Warehousing, Archival, and Repositories

Similar presentations


Presentation on theme: "Chapter 17 Data Warehousing, Archival, and Repositories"— Presentation transcript:

1 Chapter 17 Data Warehousing, Archival, and Repositories
데이터베이스연구실 석사3학기 박보영

2 Introduction Common applications of XML in the enterprise
Large-scale transaction system Content management system Persistence of information using XML Traditional strategies for these processes How adding XML to the mix XML improve

3 In this chapter Data warehousing Data archival Data repositories
How data warehousing work How XML may be used to streamline the transaction and warehousing process Data archival For most system Data loses its relevance over time Moved off the system when some threshold is reached Typically want to set this data aside in some form so that, should we need to refer to it later Can retrieve the data with the minimum of effort How XML may be used to make this process easier Data repositories How XML may be used to persist detail-only information in a way that makes the relational database perform better, while taking advantage of XML technologies(such as XSLT) to present the detail information when necessary “All the examples in this chapter are designed to work with SQL Server 6.x+, ADO 2.5+, VBScript(IE 4.0+)”

4 Data Warehousing In this section
One of the problems often faced by data architects is to design a system Detail information or transactionsquick retrieval Query and summarize that dataeasily and efficiently In this section Define data warehousing Discuss the concepts that drive data warehouse design Take a look XML can facilitate the data warehousing process Look at examples of XML in use in a data warehouse

5 The Two Roles of Data Detailed information gathering
In an enterprise-level data solution, the data in the database plays two roles: • Information gathering • Querying and summarization of data Detailed information gathering The first use of the database is to gather data from external sources(such as other databases, XML, or simple delimited text files) There are some things that remain consistent across all implementations that gather data: Detail oriented Write-heavy Transacted Space-conscious Heavily normalized

6 Infromation Querying and Summarization
The other use of database is to provide the ability to query the data and summarize it to extrapolate trends, volumes, and other useful information from the details. The specific mechanisms will vary from implementation to implementation, but there are some constants: Summary-oriented Read-only Results-conscious Less normalized

7 The Traditional Solution
The traditional approach to designing a relational database to support a platform is to design one database to perform both data acquisition function and querying/summarization functions (ch17_ex01.sql) CREATE TABLE Customer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE shipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) INSERT shipMethod (shipMethodKey, shipMethod) VALUES (1, 'FedEx') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (2, 'USPS') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (3, 'UPS') CREATE TABLE Invoice ( InvoiceKey integer PRIMARY KEY, invoiceDate datetime, shipDate datetime, shipMethodKey integer CONSTRAINT FK_Invoice_shipMethodKey FOREIGN KEY (shipMethodKey) REFERENCES shipMethod (shipMethodKey), CustomerKey integer CONSTRAINT FK_Invoice_Customer FOREIGN KEY (CustomerKey) REFERENCES Customer (CustomerKey)) CREATE INDEX ix_Invoice_invoiceDate ON Invoice (invoiceDate) CREATE INDEX ix_Invoice_shipDate ON Invoice (shipDate) CREATE INDEX ix_Invoice_CustomerKey ON Invoice (CustomerKey) CREATE TABLE Part ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) CREATE TABLE LineItem ( LineItemKey integer PRIMARY KEY, InvoiceKey integer CONSTRAINT FK_LineItem_Invoice FOREIGN KEY (InvoiceKey) REFERENCES Invoice (InvoiceKey), PartKey integer CONSTRAINT FK_LineItem_Part FOREIGN KEY (PartKey) REFERENCES Part (PartKey), Quantity integer, Price float) CREATE INDEX ix_LineItem_PartKey ON LineItem (PartKey) CREATE INDEX ix_LineItem_InvoiceKey ON LineItem (InvoiceKey)

8 - When run, this script creates the following table structure:
Customer Part LineItem Invoice ShipMethod This set of tables supports the addition of invoice Space is conserved for these tables where possible by normalizing out Part and Customer This normalization also facilitates querying and reporting on Part and Customer however, there are a couple of issues with using this type of dual-purpose structure the system is well tuned to either data entry or reporting serious danger of wait states for reporting and querying due to the locking being performed by the data entry

9 separate from the transaction tables
Many databases try to get around the problem by adding structures that will be used to drive reporting In this table separate from the transaction tables have all the appropriate indices to allow data to be queried and summarized from those tables in any way the business rules might call for for example, add a MonthlyPartTotal table (ch17_ex02.sql): this is a step on the right direction another problems CREATE TABLE MonthlyPartTotal ( summaryMonth tinyint, summaryYear smallint, PartKey integer CONSTRAINT FK_MPT_PartKey FOREIGN KEY (PartKey) REFERENCES Part (PartKey), Quantity integer)

10 (Ch17_03.sql) CREATE TRIGGER UpdateMPT ON LineItem FOR INSERT, UPDATE
AS BEGIN IF (SELECT COUNT(*) FROM MonthlyPartTotal, inserted, Invoice WHERE summaryMonth = DATEPART(mm, Invoice.invoiceDate) AND summaryYear = DATEPART(yyyy, Invoice.invoiceDate) AND inserted.InvoiceKey = Invoice.InvoiceKey AND MonthlyPartTotal.PartKey = inserted.PartKey) > 0 UPDATE MonthlyPartTotal SET Quantity = MonthlyPartTotal.Quantity + inserted.Quantity FROM inserted, Invoice AND MonthlyPartTotal.PartKey = inserted.PartKey ELSE INSERT MonthlyPartTotal (summaryMonth, summaryYear, PartKey, Quantity) SELECT DATEPART(mm, Invoice.invoiceDate), DATEPART(yyyy, Invoice.invoiceDate), inserted.PartKey, inserted.Quantity WHERE inserted.InvoiceKey = Invoice.InvoiceKey IF (SELECT COUNT(*) FROM deleted) > 0 SET Quantity = MonthlyPartTotal.Quantity - deleted.Quantity FROM deleted, Invoice AND deleted.InvoiceKey = Invoice.InvoiceKey AND MonthlyPartTotal.PartKey = deleted.PartKey END

11 The Data Warehousing Solution
On-Line Transaction Processing(OLTP) databases – the information gatherers On Line Analytical Processing(OLAP) databases – the query and summarization handlers Parts that make up an OLAP database The role of XML in improving the function of OLAP databases

12 On-Line Transaction Processing(OLTP)
The gathering of detail information is often referred to as OLTP Handle all data gathering processes Design tables to support the acquisition of transactional data The database is an normalized ad possible, with the spedific table or tables being kept as small as possible to reduce insert time and disk consumption Data archival strategies - In place to make OLTP database does not grow too large

13 OLTP database structure (ch17_ex04.sql):
CREATE TABLE Customer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE shipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) INSERT shipMethod (shipMethodKey, shipMethod) VALUES (1, 'FedEx') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (2, 'USPS') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (3, 'UPS') CREATE TABLE Invoice ( InvoiceKey integer PRIMARY KEY, invoiceDate datetime, shipDate datetime, shipMethodKey integer CONSTRAINT FK_Invoice_shipMethodKey FOREIGN KEY (shipMethodKey) REFERENCES shipMethod (shipMethodKey), CustomerKey integer CONSTRAINT FK_Invoice_Customer FOREIGN KEY (CustomerKey) REFERENCES Customer (CustomerKey)) CREATE TABLE Part ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) CREATE TABLE LineItem ( LineItemKey integer PRIMARY KEY, InvoiceKey integer CONSTRAINT FK_LineItem_Invoice FOREIGN KEY (InvoiceKey) REFERENCES Invoice (InvoiceKey), PartKey integer CONSTRAINT FK_LineItem_Part FOREIGN KEY (PartKey) REFERENCES Part (PartKey), Quantity integer, Price float) Customer Part LineItem Invoice ShipMethod

14 On-Line Analytical processing(OLAP)
Database intended to support querying and summarization Designed with one goal in mind – the querying and summarization of detail data by any number of specifics defined in the business rules for our system Create our tables to support the querying and retrieval of this information, with the ability to rapidly insert information(lower priority) Indexing technologies designed specifically for OLAP querying  optimize(query and report)

15 Parts of an OLAP Database
Fact Tables-where the information we wish to report on is stored Measure Tables-where you store the measures used to do the reporting Schema-where the two types of table above interact to give you your reports Fact Tables Contain the data that we are planning to report on, at the lowest level of granularity we will need to access Denormalizationall facts together into oe table Fact table creation script for the inventory control team (ch17-ex05.sql) Fact table creation script for the executive team (ch17-ex06.sql) CREATE TABLE factInvoicePart ( InvoiceKey integer PRIMARY KEY IDENTITY, CustomerKey integer, ShipDateKey integer, ShipMethodKey integer, PartKey integer, Quantity integer, Price float) CREATE TABLE factDailyTotal ( DailyTotalKey integer PRIMARY KEY IDENTITY, InvoiceDate integer, partKey integer, partCount integer, partUnitPrice float)

16 Measure/Dimension Tables
Parameters(use in the WHERE clause in query) (Ch17_ex07.sql) CREATE TABLE measureCustomer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE measureShipDate ( ShipDateKey integer PRIMARY KEY, Month tinyint, Day tinyint, Year smallint) CREATE TABLE measureShipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) CREATE TABLE measurePart ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL)

17 Schema Star schema Snowflake schema
Composed of the tables in our database Joined together by the foreign keys that relate the individual tables together Two types of schema that are normally used when designing an OLAP database: Star schema Snowflake schema

18 Star schema(ch17_ex08.sql)
CREATE TABLE measureCustomer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE measureShipDate ( ShipDateKey integer PRIMARY KEY, ShipMonth tinyint, ShipDay tinyint, ShipYear smallint) CREATE TABLE measureShipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) CREATE TABLE measurePart ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) CREATE TABLE factInvoicePart ( InvoiceKey integer PRIMARY KEY IDENTITY, CustomerKey integer CONSTRAINT fk_fact_Customer FOREIGN KEY (CustomerKey) REFERENCES measureCustomer (CustomerKey), ShipDateKey integer CONSTRAINT fk_fact_ShipDate FOREIGN KEY (ShipDateKey) REFERENCES measureShipDate (ShipDateKey), ShipMethodKey integer, CONSTRAINT fk_fact_ShipMethod FOREIGN KEY (ShipMethodKey) REFERENCES measureShipMethod (shipMethodKey), PartKey integer, CONSTRAINT fk_fact_Part FOREIGN KEY (PartKey) REFERENCES measurePart (PartKey), Quantity integer, Price float) CREATE INDEX ix_fact_Customer ON factInvoicePart (CustomerKey) CREATE INDEX ix_fact_ShipDate ON factInvoicePart (ShipDateKey) CREATE INDEX ix_fact_ShipMethod ON factInvoicePart (shipMethodKey) CREATE INDEX ix_fact_Part ON factInvoicePart (PartKey) measureCustomer factInvoicePart measureShipDate measureShipMethod measurePart This script creates the structure seen below:

19 Snowflake schema(ch17_ex09.sql)
CREATE TABLE measureCustomer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE measureShipYear ( ShipYearKey integer PRIMARY KEY, ShipYear smallint) CREATE TABLE measureShipMonth ( ShipMonthKey integer PRIMARY KEY, ShipMonth tinyint, ShipYearKey integer CONSTRAINT fk_measure_ShipYear FOREIGN KEY (ShipYearKey) REFERENCES measureShipYear (ShipYearKey)) CREATE TABLE measureShipDate ( ShipDateKey integer PRIMARY KEY, ShipDay tinyint, ShipMonthKey integer CONSTRAINT fk_measure_ShipMonth FOREIGN KEY (ShipMonthKey) REFERENCES measureShipMonth (ShipMonthKey)) CREATE TABLE measureShipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) CREATE TABLE measurePart ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) CREATE TABLE factInvoicePart ( InvoiceKey integer PRIMARY KEY IDENTITY, CustomerKey integer CONSTRAINT fk_fact_Customer FOREIGN KEY (CustomerKey) REFERENCES measureCustomer (CustomerKey), ShipDateKey integer CONSTRAINT fk_fact_ShipDate FOREIGN KEY (ShipDateKey) REFERENCES measureShipDate (ShipDateKey), ShipMethodKey integer, CONSTRAINT fk_fact_ShipMethod FOREIGN KEY (ShipMethodKey) REFERENCES measureShipMethod (shipMethodKey), PartKey integer, CONSTRAINT fk_fact_Part FOREIGN KEY (PartKey) REFERENCES measurePart (PartKey), Quantity integer, Price float) CREATE INDEX ix_fact_Customer ON factInvoicePart (CustomerKey) CREATE INDEX ix_fact_ShipDate ON factInvoicePart (ShipDateKey) CREATE INDEX ix_fact_ShipMethod ON factInvoicePart (shipMethodKey) CREATE INDEX ix_fact_Part ON factInvoicePart (PartKey) CREATE INDEX ix_measure_Month ON measureShipDate (shipMonthKey) CREATE INDEX ix_measure_Year ON measureShipMonth (shipYearKey)

20 This SQL script creates the following set of tables:
measureCustomer factInvoicePart measureShipMethod measureShipDate measureShipMonth measureShipYear measurePart

21 Cubes To better facilitate the query and retrieval of OLAP data – each star schema or snowflake schema creates a cube(the detail information is aggregated along each of the measures) Example-star schema measure Customer factInvoicePart measureShipDate measureShipMethod measurePart factInvoicePart table-100,000 invoice(shipping method) FedEx UPS USPS ALL invoices 72,102 15,209 12,689 Add another dimension-customer FedEx UPS USPS Customer 1 22,615 5,192 3,972 Customer 2 12,541 2,188 918 Customer 3 7,342 3,182 1,212 Customer 4 15,320 716 3,132 Customer 5 14,284 3,931 3,455

22 Building your Fact Tables
Two step: Pulling the data out of the OLTP database updating the OLAP database with that data In order to perform rapidly update Native export forms are often used (such as BCP files for SQL Server) Loaded into the OLAP database Some problems The most notable of which is the relative fragility of the export and import code If the OLTP or OLAP database change, this code will need to be revisited

23 The Role of XML Make it easier to update the OLAP database from the OLTP database Effective way to handle OLTP data Using XML for OLAP Update Data Using XML for OLTP Data rather than building specialized import routines for all the different formats, if you build one XML importer and than building leverage the data provider’s ability to export in XML Can save a lot of aggravation and coding time

24 (Ch17_ex09.xml) (Ch17_ex09.dtd)
<!ELEMENT InvoiceBulk (Invoice+)> <!ELEMENT Invoice (LineItem+)> <!ATTLIST Invoice CustomerKey CDATA #REQUIRED ShipDate CDATA #REQUIRED ShipMethodKey CDATA #REQUIRED> <!ELEMENT Part EMPTY> <!ATTLIST Part PartKey CDATA #REQUIRED Quantity CDATA #REQUIRED Price CDATA #REQUIRED> <?xml version="1.0"?> <!DOCTYPE listing SYSTEM "ch27_ex09.dtd" > <InvoiceBulk> <Invoice CustomerKey="17" ShipDate="10/17/2000" ShipMethodKey="1"> <Part PartKey="33" Quantity="17" Price="0.20" /> PartKey="29" Quantity="13" Price="0.15" /> </Invoice> CustomerKey="12" ShipDate="10/11/2000" ShipMethodKey="2"> PartKey="31" Quantity="19" Price="0.10" /> </InvoiceBulk>

25 Function using XSLT(ch17_ex10.xsl):
<?xml version="1.0"?> <xsl:stylesheet xmlns:xsl=" version="1.0"> <xsl:output method="text" /> <xsl:template match="/"> <xsl:for-each select="InvoiceBulk/Invoice/Part"> <xsl:text>INSERT factInvoicePart ( CustomerKey, ShipDateKey, ShipMethodKey, PartKey, Quantity, Price) SELECT </xsl:text> <xsl:value-of /><xsl:text>, </xsl:text> <xsl:text>measureShipDate.ShipDateKey, <xsl:value-of /><xsl:text>, <xsl:value-of /><xsl:text>, <xsl:value-of /><xsl:text>, <xsl:value-of /> <xsl:text> FROM measureShipDate</xsl:text> <xsl:text> WHERE DATEPART(mm, '</xsl:text> <xsl:value-of /> <xsl:text>')=measureShipDate.shipMonth AND DATEPART(dd, '</xsl:text> <xsl:text>')=measureShipDate.shipDay AND DATEPART(yyyy, '</xsl:text> <xsl:text>')=measureShipDate.shipYear</xsl:text> <xsl:text> GO </xsl:text> </xsl:for-each> </xsl:template> </xsl:stylesheet>

26 ch17_ex10. xsl -> Ch17_ex09. xml => ch17_10a. sql
INSERT factInvoicePart ( CustomerKey, ShipDateKey, ShipMethodKey, PartKey, Quantity, Price) SELECT 17, measureShipDate.ShipDateKey, 1, 33, 17, 0.20 FROM measureShipDate WHERE DATEPART(mm, '10/17/2000')=measureShipDate.shipMonth AND DATEPART(dd, '10/17/2000')=measureShipDate.shipDay AND DATEPART(yyyy, '10/17/2000')=measureShipDate.shipYear GO 29, 13, 0.15 INSERT factInvoicePart ( CustomerKey, ShipDateKey, ShipMethodKey, PartKey, Quantity, Price) SELECT 12, measureShipDate.ShipDateKey, 2, 31, 19, 0.10 FROM measureShipDate WHERE DATEPART(mm, '10/11/2000')=measureShipDate.shipMonth AND DATEPART(dd, '10/11/2000')=measureShipDate.shipDay AND DATEPART(yyyy, '10/11/2000')=measureShipDate.shipYear GO 29, 17, 0.15

27 (Ch17_ex10b.sql): INSERT measureCustomer ( CustomerKey, [Name],
Address, City, State, PostalCode) VALUES ( 12, 'Homer J. Simpson', '742 Evergreen Terrace', 'Springfield', 'KY', '12345') GO 17, 'Kevin B. Williams', '744 Evergreen Terrace', INSERT measureShipMethod ( shipMethodKey, shipMethod) VALUES ( 1, 'Fedex') GO 2, 'USPS') 3, 'UPS') INSERT measurePart ( PartKey, [name], [size], color) VALUES ( 31, 'grommets', '3 in.', 'blue') GO INSERT measurePart ( PartKey, [name], [size], color) 29, 'sprockets', '2 in.', 'silver') 33, 'brackets', '1 in.', 'red') INSERT measureShipDate ( ShipDateKey, ShipMonth, ShipDay, ShipYear) VALUES ( 1, 10, 11, 2000 ) GO 2, 17,

28 Result table structure
Run ch17_ex08.sql against a sample database to create the tables in the firdt place Transformed ch17_ex10.xml with ch17_ex10.xsl to produce the output script ch17_ex10a.sql Run ch17_ex10b.sql to populate tables with initial data Run ch17_ex10a.sql to add data to populate table factInvoicePart with data measure Customer factInvoicePart measureShipDate measureShipMethod measurePart

29 Using XML for OLAP Data OLTP systems typically access one discrete transaction at a time Locking can be avoid XML technology can be easily leveraged

30 RDBMS structure (ch17_ex11.sql):
(ch17_ex12.dtd): <!ELEMENT Invoice (LineItem+)> <!ATTLIST Invoice CustomerKey CDATA #REQUIRED ShipDate CDATA #REQUIRED ShipMethodKey CDATA #REQUIRED> <!ELEMENT Part EMPTY> <!ATTLIST Part PartKey CDATA #REQUIRED Quantity CDATA #REQUIRED Price CDATA #REQUIRED> CREATE TABLE Customer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE shipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) INSERT shipMethod (shipMethodKey, shipMethod) VALUES (1, 'FedEx') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (2, 'USPS') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (3, 'UPS') CREATE TABLE Part ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) (ch17_ex12.xml): <?xml version="1.0"?> <!DOCTYPE listing SYSTEM "ch27_ex12.dtd" > <InvoiceBulk> <Invoice CustomerKey="17" ShipDate="10/17/2000" ShipMethodKey="1"> <Part PartKey="33" Quantity="17" Price="0.20" /> PartKey="29" Quantity="13" Price="0.15" /> </Invoice> </InvoiceBulk>

31 Data Archival Classical Approches What about the LineItem data?
What about the other tables in the database? What about human readability? OLTP database structure (ch17_ex04.sql): CREATE TABLE Customer ( CustomerKey integer PRIMARY KEY, Name varchar(50), Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10)) CREATE TABLE shipMethod ( shipMethodKey integer PRIMARY KEY, shipMethod varchar(5)) INSERT shipMethod (shipMethodKey, shipMethod) VALUES (1, 'FedEx') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (2, 'USPS') INSERT shipMethod (shipMethodKey, shipMethod) VALUES (3, 'UPS') CREATE TABLE Invoice ( InvoiceKey integer PRIMARY KEY, invoiceDate datetime, shipDate datetime, shipMethodKey integer CONSTRAINT FK_Invoice_shipMethodKey FOREIGN KEY (shipMethodKey) REFERENCES shipMethod (shipMethodKey), CustomerKey integer CONSTRAINT FK_Invoice_Customer FOREIGN KEY (CustomerKey) REFERENCES Customer (CustomerKey)) CREATE TABLE Part ( PartKey integer PRIMARY KEY, name varchar(20), size varchar(10) NULL, color varchar(10) NULL) CREATE TABLE LineItem ( LineItemKey integer PRIMARY KEY, InvoiceKey integer CONSTRAINT FK_LineItem_Invoice FOREIGN KEY (InvoiceKey) REFERENCES Invoice (InvoiceKey), PartKey integer CONSTRAINT FK_LineItem_Part FOREIGN KEY (PartKey) REFERENCES Part (PartKey), Quantity integer, Price float)

32 Using XML for Data Archival
(Ch17_ex13.dtd) (Ch17_ex13.xml) <!ELEMENT Invoice (Customer, LineItem+)> <!ATTLIST Invoice invoiceDate CDATA #REQUIRED shipDate CDATA #REQUIRED shipMethod (USPS | UPS | FedEx) #REQUIRED> <!ELEMENT Customer EMPTY> <!ATTLIST Customer Name CDATA #REQUIRED Address CDATA #REQUIRED City CDATA #REQUIRED State CDATA #REQUIRED PostalCode CDATA #REQUIRED> <!ELEMENT LineItem (Part)> <!ATTLIST LineItem Quantity CDATA #REQUIRED Price CDATA #REQUIRED> <!ELEMENT Part EMPTY> <!ATTLIST Part Size CDATA #REQUIRED Color CDATA #REQUIRED> <?xml version="1.0"?> <!DOCTYPE listing SYSTEM "ch27_ex13.dtd" > <Invoice invoiceDate="10/17/2000" shipDate="10/20/2000" shipMethod="USPS"> <Customer Name="Homer J. Simpson" Address="742 Evergreen Terrace" City="Springfield" State="KY" postalCode="12345" /> <LineItem Quantity="12" Price="0.10"> <Part Color="Blue" Size="3-inch" Name="Grommets" /> </LineItem> </Invoice>

33 Data Repositories Classical Approaches (Ch17_ex14a.sql)
(Ch17_ex14b.sql) CREATE TABLE Property ( PropertyKey integer PRIMARY KEY IDENTITY, NumberOfBedrooms tinyint, HasSwimmingPool bit, Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10), SellerName varchar(50), SellerAgent varchar(50)) CREATE TABLE Property ( PropertyKey integer PRIMARY KEY IDENTITY, NumberOfBedrooms tinyint, HasSwimmingPool bit) CREATE TABLE PropertyDetail ( PropertyKey integer PRIMARY KEY, Address varchar(50), City varchar(30), State char(2), PostalCode varchar(10), SellerName varchar(50), SellerAgent varchar(50))

34 Using XML for Data Repositories
(Ch17_ex15.dtd) (Ch17_ex15.sql) <!ELEMENT Property EMPTY> <!ATTLIST Property NumberOfBedrooms CDATA #REQUIRED HasSwimmingPool CDATA #REQUIRED Address CDATA #REQUIRED City CDATA #REQUIRED State CDATA #REQUIRED PostalCode CDATA #REQUIRED SellerName CDATA #REQUIRED SellerAgent CDATA #REQUIRED> CREATE TABLE Property ( PropertyKey integer PRIMARY KEY IDENTITY, NumberOfBedrooms tinyint, HasSwimmingPool tinyint, DocumentFile varchar(50)) There are a number of advantages to using XML for data repositories: Greater flexibility in providers Faster querying and summarization More presentation options Fewer locking concerns

35 Summary How XML may be used to help create a data warehouse
The benefits you can realize by using XML as your archival strategy How XML can improve the functionality of your data repository


Download ppt "Chapter 17 Data Warehousing, Archival, and Repositories"

Similar presentations


Ads by Google