Slide 1 Web-Base Management Systems Aaron Brown and David Oppenheimer CS294-7 February 11, 1999.

1 Slide 1 Web-Base Management Systems Aaron Brown and David Oppenheimer CS294-7 February 11, 1999

2 Slide 2 Introduction Online data is stored in both databases (relational) and web sites (hypertext) Need single framework to manage both types of data and present integrated views Solution: Web Base Management Systems (WBMSs) –2 challenges 1) querying and extracting structure from semi-structured web data, transforming it, and presenting custom views 2) mapping structured database data to the web (adding navigational access paths, redundancy,...) –To address these challenges, we need a data model that maps between relational and hypertextual models

3 Slide 3 ARANEUS Data Models [Diagram labels: Relational, ADM, HTML; Navigational access; Structure]

4 Slide 4 ARANEUS Data Model ADM = Logical data model for web hypertexts –Based on page schemes and navigational access paths –Page scheme = logical structure shared by a set of pages »Like a “class” –Web page = instance of page scheme »Like an “object” with identifier (URL) + attributes
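The class/object analogy on this slide can be sketched in Python. This is only an illustration of the idea, not ARANEUS code: the names `PageScheme`, `WebPage`, and `conforms`, and the example URL and attribute values, are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PageScheme:
    """Logical structure shared by a set of pages (the "class")."""
    name: str
    attributes: tuple  # attribute names every instance must provide


@dataclass
class WebPage:
    """One instance of a page scheme (the "object"), identified by its URL."""
    scheme: PageScheme
    url: str
    values: dict = field(default_factory=dict)

    def conforms(self) -> bool:
        # A page conforms when it supplies exactly the scheme's attributes.
        return set(self.values) == set(self.scheme.attributes)


# Hypothetical scheme and page, for illustration only.
author_scheme = PageScheme("AuthorPage", ("Name", "WorkList"))
page = WebPage(
    author_scheme,
    "http://example.org/authors/a1.html",
    {"Name": "A. Author", "WorkList": []},
)
```

Here `page.conforms()` holds because the page supplies both attributes named by its scheme; a page missing `WorkList` would not conform.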

5 Slide 5 ADM Example Fragment

6 Slide 6 Adding Structure to HTML [Diagram labels: Relational, ADM, HTML; Navigational access; Structure]

7 Slide 7 EDITOR: Structuring HTML EDITOR starts with an existing ADM scheme –Generated by inspection of web site EDITOR maps web page text to attributes of an ADM page scheme –“Wrapping” a web page –Imposes structure on web pages EDITOR uses a procedural language to guide the wrapping process –Each page seen as object with extraction methods »One method for each attribute of page »Method accesses page’s HTML source, extracts value of corresponding attribute
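The "one extraction method per attribute" idea might look roughly like the following in Python. EDITOR's own procedural language is not shown in the slides, so this is a stand-in using regular expressions; the class name, the attribute names, and the assumed HTML layout (`<h1>` for the name, `<li>` per work) are all hypothetical.

```python
import re


class AuthorPageWrapper:
    """Hypothetical wrapper: one extraction method per page attribute."""

    def __init__(self, html: str):
        self.html = html

    def name(self) -> str:
        # Assumes the author's name is the text of the first <h1> element.
        match = re.search(r"<h1>(.*?)</h1>", self.html, re.S)
        return match.group(1).strip() if match else ""

    def work_list(self) -> list:
        # Assumes each work is the text of one <li> element.
        return [item.strip() for item in re.findall(r"<li>(.*?)</li>", self.html, re.S)]

    def wrap(self) -> dict:
        # Impose the page scheme's structure on the raw HTML source.
        return {"Name": self.name(), "WorkList": self.work_list()}


sample = "<h1> A. Author </h1><ul><li>Paper One</li><li>Paper Two</li></ul>"
record = AuthorPageWrapper(sample).wrap()
```

A regex-based wrapper like this is brittle against layout changes, which is the kind of exception the retrospective slide mentions for physically heterogeneous pages.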

8 Slide 8 Querying ADM-Structured Hypertext [Diagram labels: Relational, ADM, HTML; Navigational access; Structure]

9 Slide 9 ULIXES: A Navigational Query Lang. Language for defining relational views over hypertext that follows an ADM scheme –Based on navigational expressions (path expressions) DEFINE TABLE statement creates relational views based on page schemes –local materialized view (tuples) or –virtual view »user can then pose SQL queries across multiple views »optimizer chooses optimal navigation path through site to satisfy query; fetches hypertext pages and extracts attributes via EDITOR wrappers; cost metric is number of HTML page fetches

10 Slide 10 ULIXES Example
DEFINE TABLE VLDBPapers (Authors, Title, Reference)
AS AuthorSearchPage.NameForm.Submit -> AuthorPage.WorkList
IN DBLPScheme
USING AuthorPage.WorkList.Authors, AuthorPage.WorkList.Title, AuthorPage.WorkList.Reference
WHERE AuthorSearchPage.NameForm.Name = ‘Leonardo Da Vinci’
AuthorPage.WorkList.Reference LIKE ‘%VLDB%’
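To make the navigation in the ULIXES example concrete, here is a rough Python simulation of what evaluating such a view could involve: submit the name form, follow the link to the author page, and keep the work-list entries whose reference mentions VLDB. It is a sketch under stated assumptions, not the ULIXES engine: the in-memory `SITE`, its URLs and rows, and the `ToAuthorPage` link attribute are invented for illustration. The `fetches` counter mirrors the cost metric named on the previous slide.

```python
# Hypothetical in-memory "site": URL -> already-wrapped page attributes.
SITE = {
    "search?name=A. Author": {"ToAuthorPage": "authors/a1.html"},
    "authors/a1.html": {
        "WorkList": [
            {"Authors": "A. Author", "Title": "Paper One", "Reference": "VLDB 1"},
            {"Authors": "A. Author", "Title": "Paper Two", "Reference": "Other Conf 2"},
        ]
    },
}

fetches = 0  # cost metric: number of page fetches


def fetch(url: str) -> dict:
    """Return the wrapped page at url and count the access."""
    global fetches
    fetches += 1
    return SITE[url]


def vldb_papers(author: str) -> list:
    """Materialize (Authors, Title, Reference) tuples along one navigation path."""
    result_page = fetch(f"search?name={author}")      # submit the name form
    author_page = fetch(result_page["ToAuthorPage"])  # follow the link
    return [
        (w["Authors"], w["Title"], w["Reference"])
        for w in author_page["WorkList"]
        if "VLDB" in w["Reference"]                   # plays the role of LIKE '%VLDB%'
    ]


rows = vldb_papers("A. Author")
```

With this toy site the view costs two page fetches; an optimizer choosing between alternative paths would compare such counts.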

11 Slide 11 Generating ADM from existing DB [Diagram labels: Relational, ADM, HTML; Navigational access; Structure]

12 Slide 12 The ARANEUS Design Methodology [Diagram boxes: Database Conceptual Design (ER); Hypertext Conceptual Design (NCM); Hypertext Logical Design (ADM); DB Mapping (PENELOPE) + Page Design (HTML); Database Logical Design (relational); Web Site Generation]

13 Slide 13 Database Conceptual Model Starting point for database design Conceptual description of a domain Represents essential properties of data abstractly Entity-Relationship Model –Based on entities and relationships among entities –Rectangles = entity sets »Associated attributes are connected with lines –Diamonds = relationship sets »Lines connect entity sets via relationship sets

14 Slide 14 ER Example

15 Slide 15 Hypertext Conceptual Design ER not suitable for modeling hypertext –no directed paths (links) –hypertext access paths not modeled (web page hierarchies) –no way to group related entities into a single “macroentity” Navigational Conceptual Model (NCM) describes these conceptual properties of hypertext –macroentities (groups of related ER entities) model hypertext nodes »associated with simple (atomic) or complex (structured) attributes, either mono- or multi-valued –directed relationships model links (may be bidirectional) –union nodes model link targets that can be of different types –aggregations model hierarchical access paths

16 Slide 16 Mapping ER to NCM: Example [Figure. ER Model labels: Seminar, Professor, SPlace, Responsible, Room, Room#, Name, Phone, Title, Speaker, Date, 1:1, 1:N. NCM Model labels: Department, General, Education, Research, Seminar, People, Professor, Responsible, 1:1, Room#, Title, Speaker, Date, Name, Phone]

17 Slide 17 Mapping NCM to ADM 1) macroentity -> one or more pages single-valued attribute -> ADM simple attribute multi-valued attribute -> ADM list 2) directed relationship -> link to another page scheme –anchor = a descriptive key of target macroentity –reference = URL of target page scheme 3) aggregation node -> ADM “unique” page scheme –unique page scheme = page scheme with only one instance 4) long lists -> forms –list items retrieved through program running on server
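Rules 1 and 2 above are mechanical enough to sketch as a small Python function. This is an illustrative reading of the rules, not the ARANEUS tooling: the function name, the dictionary layout, and the `Page` suffix are assumptions, and the type strings simply borrow the `TEXT`, `LIST OF`, and `LINK TO` keywords that appear in the PENELOPE example later in the deck. The Seminar/Professor input reuses labels from the Slide 16 figure, but how they are grouped here is a guess.

```python
def map_macroentity(name: str, attributes: dict, links: dict) -> dict:
    """Map one NCM macroentity to a single ADM page scheme (rules 1 and 2).

    attributes: attribute name -> "single" or "multi"
    links: directed relationship name -> target macroentity name
    """
    scheme = {"page_scheme": f"{name}Page", "attributes": {}}
    for attr, valued in attributes.items():
        # Rule 1: single-valued -> simple attribute, multi-valued -> list.
        scheme["attributes"][attr] = "TEXT" if valued == "single" else "LIST OF TEXT"
    for rel, target in links.items():
        # Rule 2: a directed relationship becomes a link to the target page scheme.
        scheme["attributes"][rel] = f"LINK TO {target}Page"
    return scheme


# Hypothetical input based on labels from the Slide 16 figure.
seminar = map_macroentity(
    "Seminar",
    {"Title": "single", "Speaker": "single", "Date": "single"},
    {"Responsible": "Professor"},
)
```

Rules 3 and 4 (unique page schemes for aggregation nodes, forms for long lists) are left out of the sketch; they would need extra flags on the scheme rather than new attribute types.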

18 Slide 18 Mapping NCM to ADM: Example

19 Slide 19 The ARANEUS Design Methodology [Diagram boxes: Database Conceptual Design (ER); Hypertext Conceptual Design (NCM); Hypertext Logical Design (ADM); DB Mapping (PENELOPE) + Page Design (HTML); Database Logical Design (relational); Web Site Generation]

20 Slide 20 Generating web site from ADM + DB [Diagram labels: Relational, ADM, HTML; Navigational access; Structure]

21 Slide 21 Hypertext Views of DB Data Given a database and an ADM scheme for it –database may be local »derived from design methodology »uses derived ADM scheme –composed from one or more remote sites »derived from integrated relational view produced by one or more ULIXES queries »uses new ADM scheme concocted to match integrated view PENELOPE language used to integrate ADM and DB in a generated hypertext –PENELOPE description = ADM augmented with URL’s and references to database fields

22 Slide 22 PENELOPE Description Query: reorganize (Da Vinci’s VLDB) papers based on year
DEFINE PAGE YearPage
AS URL URL( );
Year: TEXT ;
WorkList: LIST OF (Authors: TEXT ; Title: TEXT ; Reference: TEXT ; ToRefPage: LINK TO ConferencePage UNION JournalPage );
FROM DaVinciPapers
DEFINE PAGE DaVinciYearsPage UNIQUE
AS URL ‘result.html’;
YearList: LIST OF (Year: TEXT ; ToYearPage: LINK TO YearPage (URL( )));
FROM DaVinciPapers
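The effect of the two page definitions above (one page per year, plus a unique index page linking to them) can be approximated in Python. This is a sketch of the idea only, not PENELOPE's generator. The rows are made up, the `Year` column is an assumption because the slides do not show the relation's schema, and `year_url` is an invented URL scheme because the `URL( )` arguments did not survive in the transcript. Only `result.html` comes from the slide.

```python
from collections import defaultdict
from html import escape

# Made-up rows standing in for the DaVinciPapers relation.
DA_VINCI_PAPERS = [
    {"Year": "1998", "Authors": "A. Author", "Title": "Paper One", "Reference": "VLDB"},
    {"Year": "1998", "Authors": "A. Author", "Title": "Paper Two", "Reference": "VLDB"},
    {"Year": "1999", "Authors": "A. Author", "Title": "Paper Three", "Reference": "VLDB"},
]


def year_url(year: str) -> str:
    # Assumed URL scheme for YearPage instances.
    return f"year_{year}.html"


def build_pages(rows: list) -> dict:
    """Return {url: html}: one page per year plus the unique index page."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["Year"]].append(row)

    pages = {}
    for year, works in sorted(by_year.items()):
        items = "".join(
            f"<li>{escape(w['Authors'])}: {escape(w['Title'])} ({escape(w['Reference'])})</li>"
            for w in works
        )
        pages[year_url(year)] = f"<h1>{escape(year)}</h1><ul>{items}</ul>"

    # The UNIQUE page: a single instance listing every year with a link.
    links = "".join(
        f'<li><a href="{year_url(y)}">{escape(y)}</a></li>' for y in sorted(by_year)
    )
    pages["result.html"] = f"<ul>{links}</ul>"
    return pages


pages = build_pages(DA_VINCI_PAPERS)
```

The grouping step corresponds to the nested `LIST OF` in `YearPage`, and the final index corresponds to `DaVinciYearsPage`.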

23 Slide 23 Derived Hypertext View

24 Slide 24 Resulting Web Pages

25 Slide 25 Retrospective Exceptions during wrapping –Logically homogeneous pages may be physically heterogeneous »Different ways of laying out the same information »Errors masked by browsers ULIXES syntax is difficult for beginners –Alternatives »Fill out forms corresponding to pre-determined ULIXES queries »Developed POLYPHEMUS query interface: user selects path for query by clicking on graphical representation of ADM page schemes Push vs. Pull –Either supported; hybrid model preferred –Dealing with updates »each DB update generates a mixed transaction that updates both the DB and any pushed (static) HTML pages Managing internal sites –PENELOPE-generated HTML includes description of page scheme and tags attributes »Like XML but uses HTML comments
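The "mixed transaction" bullet can be illustrated with Python's standard `sqlite3` module: the database change and the regenerated static page either both take effect or neither does. This is a minimal sketch under several assumptions, not the ARANEUS mechanism. The `papers` table, the `static_pages` dictionary standing in for files on disk, and the exact HTML comment format used to tag the attribute are all invented here.

```python
import sqlite3
from html import escape

static_pages = {}  # stands in for pushed (static) HTML files


def render(conn) -> str:
    rows = conn.execute("SELECT title FROM papers ORDER BY title").fetchall()
    # Each attribute is tagged with HTML comments; this comment format is
    # hypothetical, chosen only to echo "like XML but uses HTML comments".
    return "".join(
        f"<!-- Title --><li>{escape(title)}</li><!-- /Title -->" for (title,) in rows
    )


def add_paper(conn, title: str) -> None:
    """Update the DB and the pushed page together; undo both on failure."""
    previous = static_pages.get("papers.html")
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO papers (title) VALUES (?)", (title,))
            static_pages["papers.html"] = render(conn)
    except Exception:
        # Restore the page so it never disagrees with the rolled-back DB.
        if previous is None:
            static_pages.pop("papers.html", None)
        else:
            static_pages["papers.html"] = previous
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (title TEXT UNIQUE)")
add_paper(conn, "Paper One")
```

A failed insert, such as a duplicate title violating the `UNIQUE` constraint, leaves both the table and `papers.html` as they were before the call.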

26 Slide 26 Conclusion ARANEUS provides database-like functionality for mixed web/relational DB data More to be filled in later... [Diagram labels: Relational, ADM, HTML]

