Published by Edgar Clark. Modified over 8 years ago.
1
Semi-structured Data and XML ● What is semi-structured data? ● Using XML to describe semi-structured data ● Querying XML
2
Semi-structured Data ● Some data, like the text on a typical web page, is unstructured ● Other data, like the data represented in a relational database, is structured ● Much of the data we meet in everyday life falls somewhere in between; this is semi-structured data
3
Why study semi-structured data? ● It would be great to use the web as a database ● The web cannot be constrained by a schema ● It would be great to base searches on meaning as well as on text ● XML is starting to make this possible by implementing concepts derived from the study of semi-structured data ● XML is also emerging as a standard for data exchange
4
Examples of semi-structured data ● Minutes of a meeting: title, date, place, time, present, apologies, actions ● Poster for a show: artist, date, venue, time, ticket prices, where to buy ● Car for sale: model, colour, year, price, contact ● Need a housemate: location, rooms, rent ● Module description: module code, title, short name, aims, learning outcomes,...
5
Semi-structured Data ● Has some structure: – similar entity instances are grouped together – elements are fairly well understood – instances have similar elements ● The structure may be irregular: – elements may occur in any order – some elements may be missing – elements don't all have the same form; elements with the same name may have different types
6
Definition ● Semistructured data is data that may be irregular or incomplete, and have a structure that may change rapidly or unpredictably (Connolly and Begg, 5th ed., p. 1056)
7
Example: Sports Club Members ● member number:22; given name:Joan; family name:Jenkins; sports:soccer, chess; teams:Soccer A ● member number:15; name:Jeremy Fox; sport:squash; committee role:treasurer ● member number:17; member:family name:Li; sport:soccer; membership renewal due:Jan 2012
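The irregularity in these three records can be made concrete with a small sketch. It assumes each record is held as a plain Python dictionary keyed by the field names on the slide, and that the third record's "member:family name:Li" means a family name of Li. The variable names are illustrative only.

```python
# The three club-member records from the slide, written as plain dictionaries.
# Key names follow the slide text; there is no fixed schema behind them.
members = [
    {"member number": 22, "given name": "Joan", "family name": "Jenkins",
     "sports": ["soccer", "chess"], "teams": ["Soccer A"]},
    {"member number": 15, "name": "Jeremy Fox", "sport": "squash",
     "committee role": "treasurer"},
    # Assumption: "member:family name:Li" on the slide is read as family name Li.
    {"member number": 17, "family name": "Li", "sport": "soccer",
     "membership renewal due": "Jan 2012"},
]

# Union of every field name that appears in any record.
all_fields = sorted({key for m in members for key in m})

# Fields present in every record: the only part a rigid schema could rely on.
common_fields = set(members[0])
for m in members[1:]:
    common_fields &= set(m)

print(all_fields)
print(common_fields)
```

Only the member number appears in every record, and the same idea is stored under different names ("sport" and "sports", "name" and "given name"/"family name"). That is the kind of variation a fixed relational schema handles poorly.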
8
XML can describe semi-structured data ● member number:22; given name:Joan; family name:Jenkins; sports:soccer, chess; teams:Soccer A ● As XML (the markup was lost in extraction; only the text values survive): Joan Jenkins, soccer, badminton, Soccer A
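Since the slide's original markup is not available, the following is only a plausible reconstruction, parsed with Python's standard `xml.etree.ElementTree`. The tag names (`member`, `name`, `given`, `family`, `sport`, `team`) are assumptions. The text values are those that survive on the slide, which lists badminton where the record above it lists chess.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the text values come from the slide, the tag names are assumed.
member_xml = """
<member number="22">
  <name>
    <given>Joan</given>
    <family>Jenkins</family>
  </name>
  <sport>soccer</sport>
  <sport>badminton</sport>
  <team>Soccer A</team>
</member>
"""

member = ET.fromstring(member_xml)

# The member number is carried as an attribute of the <member> element.
number = member.get("number")

# The name is a nested element; join the text of its children in document order.
full_name = " ".join(part.text for part in member.find("name"))

# Repeating elements hold multi-valued fields such as sports and teams.
sports = [s.text for s in member.findall("sport")]
teams = [t.text for t in member.findall("team")]

print(number, full_name, sports, teams)
```

Repeating the `sport` element is one way to hold a multi-valued field without changing any schema; a second team would simply be another `team` element.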
9
XML can describe semi-structured data ● member number:15; name:Jeremy Fox; sport:squash; committee role:treasurer ● As XML (the markup was lost in extraction; only the text values survive): Jeremy Fox, squash, treasurer
10
XML can describe semi-structured data ● member number:17; member:family name:Li; sport:soccer; membership renewal due:Jan 2012 ● As XML (the markup was lost in extraction; only the text values survive): Li, soccer, Jan 2012
11
Attribute or Element? ● XML attributes, such as number, usually describe facts about the data ● XML elements, such as name and sport, usually hold parts of the data itself
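The practical difference shows up when a document is parsed. The sketch below uses Python's standard `xml.etree.ElementTree` on a hypothetical encoding of member 15; the tag names `member`, `name`, `sport` and `role` are assumptions, not the slide's original markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for member 15; the tag names are assumptions.
member = ET.fromstring(
    '<member number="15">'
    "<name>Jeremy Fox</name>"
    "<sport>squash</sport>"
    "<role>treasurer</role>"
    "</member>"
)

# Attributes arrive as a flat name-to-string mapping attached to the element.
attributes = dict(member.attrib)

# Child elements form an ordered sequence; they may repeat and may nest further.
children = [(child.tag, child.text) for child in member]

print(attributes)
print(children)
```

An attribute name can occur only once per element and its value is a plain string, so anything that might repeat or need internal structure (a second sport, a name split into parts) fits better as an element.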
12
XPath ● A way to select parts of an XML document ● Used when transforming documents, for example to identify document elements so that they can be transformed to HTML for display ● But XPath can be used more generally to search documents ● Essential foundation for XQuery
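As a rough illustration of selecting parts of a document, the sketch below runs a few path expressions over a hypothetical club document holding the three members from the earlier slides. The tag names and the `club` root element are assumptions. Python's standard `xml.etree.ElementTree` supports only a limited subset of XPath, so this shows the idea rather than the full language.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the values come from the slides, the tag names are assumed.
club_xml = """
<club>
  <member number="22">
    <name><given>Joan</given><family>Jenkins</family></name>
    <sport>soccer</sport>
    <sport>badminton</sport>
    <team>Soccer A</team>
  </member>
  <member number="15">
    <name>Jeremy Fox</name>
    <sport>squash</sport>
    <role>treasurer</role>
  </member>
  <member number="17">
    <name><family>Li</family></name>
    <sport>soccer</sport>
    <renewal>Jan 2012</renewal>
  </member>
</club>
"""
club = ET.fromstring(club_xml)

# Members with at least one <sport> child whose text is "soccer".
soccer_numbers = [m.get("number") for m in club.findall("member[sport='soccer']")]

# The committee role of the member whose number attribute is 15.
role_of_15 = club.find("member[@number='15']/role").text

# Members that have a <team> child at all (an element that is optional here).
with_team = [m.get("number") for m in club.findall("member[team]")]

# Every <family> element at any depth below the root.
family_names = [f.text for f in club.findall(".//family")]

print(soccer_numbers, role_of_15, with_team, family_names)
```

The predicates in square brackets filter on a child's text, on an attribute value, or on the mere presence of a child, which suits data where elements may be missing or repeated.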
13
Summary ● Semi-structured data ● Using XML to describe semi-structured data ● XPath ● Next we move on to XQuery and Native XML databases