
1 Structured-Document Processing Languages Spring 2005 Course Review Repetitio mater studiorum est!

2 SDPL 2005 Course Review: Goals of the Course
• Learn about central models and languages for
  – manipulating
  – representing
  – transforming and
  – querying
  structured documents (or XML)
• "Generic XML processing technology"

3 Methodological Goals
• Some central professional skills
  – consulting technical specifications
  – experimenting with SW implementations
• Ability to think…?
  – to find out relationships
  – to apply knowledge in new situations
• ("Pidgin English" for scientific communication)

4 XML?
• Extensible Markup Language is not a markup language!
  – it does not fix a tag set or its semantics (as markup languages such as HTML do)
• XML is
  – a way to use markup to represent information
  – a metalanguage
    » supports the definition of specific markup languages through XML DTDs or Schemas
    » e.g. XHTML, a reformulation of HTML using XML

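5 XML Encoding of Structure: Example
[Figure: a document tree with root S, an element E carrying the attribute A=1, and two W elements containing the text "Hello" and "world!", shown alongside its XML markup built from <S>, <E> and <W> tags.]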

6 Basics of XML DTDs
• A Document Type Declaration provides a grammar (document type definition, DTD) for a class of documents
• Syntax (in the prolog of a document instance): <!DOCTYPE … [ … ]>
• The DTD is the union of the external and internal subsets

7 What Do Declarations Look Like?
<!ATTLIST item price NMTOKEN #REQUIRED unit (FIM | EUR) "EUR" >

8 Element type declarations
• The general form is <!ELEMENT … E>, where E is a content model ≈ a regular expression over element names
• Content model operators:
  E | F : alternation      E, F : concatenation
  E?    : optional         E*   : zero or more
  E+    : one or more      (E)  : grouping
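The declarations above can be tried out with a validating JAXP parser. The following is a minimal sketch, not course material: it reuses the ATTLIST from the slides, while the class name `DtdDemo`, the `list` and `item` element declarations and the sample documents are assumptions made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdDemo {
    // Internal DTD subset. Only the ATTLIST is from the slides; the two
    // ELEMENT declarations are hypothetical.
    static final String DOCTYPE =
          "<!DOCTYPE list [\n"
        + "  <!ELEMENT list (item+)>\n"
        + "  <!ELEMENT item (#PCDATA)>\n"
        + "  <!ATTLIST item price NMTOKEN #REQUIRED\n"
        + "                 unit (FIM | EUR) \"EUR\">\n"
        + "]>\n";

    /** Parses DOCTYPE + body with validation on; validity errors are added to 'errors'. */
    static Document parse(String body, List<String> errors) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // record the error and keep parsing
                }
            });
            return builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Number of validity errors reported for the given document body. */
    static int errorCount(String body) {
        List<String> errors = new ArrayList<>();
        parse(body, errors);
        return errors.size();
    }

    /** Value of the 'unit' attribute when the document does not specify it. */
    static String defaultUnit() {
        Document doc = parse("<list><item price=\"5\">tea</item></list>", new ArrayList<>());
        return doc.getDocumentElement().getElementsByTagName("item").item(0)
                  .getAttributes().getNamedItem("unit").getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println(errorCount("<list><item price=\"5\">tea</item></list>")); // valid
        System.out.println(errorCount("<list><item>tea</item></list>")); // price is missing
        System.out.println(errorCount("<list/>"));                       // violates (item+)
        System.out.println(defaultUnit());                               // default from the ATTLIST
    }
}
```

The second and third documents break the #REQUIRED attribute and the `item+` content model respectively, so the error handler is called for each; the last call shows the parser filling in the declared default value.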

9 XML Schema Definition Language
• XML syntax
  – schema documents are easier for programs to manipulate (than the special DTD syntax)
• Compatibility with namespaces
  – can validate documents using declarations from multiple sources
• Content datatypes
  – 44 built-in datatypes (including primitive Java datatypes, datatypes of SQL, and XML attribute types)
  – mechanisms to derive user-defined datatypes

10 XML Namespaces
[Example lost in extraction: only the closing tags </xsl:template></xsl:stylesheet> of an XSLT stylesheet using the xsl: prefix survive.]

11 3. XML Processor APIs
• How can applications manipulate structured documents?
  – An overview of document parser interfaces
    3.1 SAX: an event-based interface
    3.2 DOM: an object-based interface
    3.3 JAXP: Java API for XML Processing

12 A SAX-based application
[Figure: the application's main routine calls parse(); the parser then invokes the callback routines startDocument(), startElement(), characters() and endElement(). For the input <A i="1">Hi!</A>, the callbacks receive "A",[i="1"], then "Hi!", then "A".]
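The callback flow of this slide can be sketched in Java as follows. This is an illustration under assumptions: the class name `SaxEvents` and the event strings are made up, and the input is read from a string rather than a file so that the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEvents {
    /** Parses xml and returns one line per SAX callback, in call order. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        // The callback routines: the parser calls these while it reads the input.
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() {
                events.add("startDocument");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                StringBuilder sb = new StringBuilder("startElement " + qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
                }
                events.add(sb.toString());
            }
            @Override public void characters(char[] ch, int start, int length) {
                // A parser is allowed to deliver one text run in several chunks.
                events.add("characters " + new String(ch, start, length));
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }
            @Override public void endDocument() {
                events.add("endDocument");
            }
        };
        try {
            // The main routine: hand the input and the handler to the parser.
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<A i=\"1\">Hi!</A>").forEach(System.out::println);
    }
}
```

The application never sees a tree: it only reacts to events, which is why SAX suits large documents and simple one-pass processing.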

13 DOM: What is it?
• An object-based, language-neutral API for XML and HTML documents
  – allows programs and scripts to build, navigate, and modify documents
• In contrast to "Serial Access XML" (SAX), DOM could be thought of as "Directly Obtainable in Memory"

14 DOM structure model
[Figure: an invoice document and its DOM tree. Markup residue: <invoice form="00" type="estimated"> … John Doe … Pyynpolku 1 … 70460 KUOPIO … Tree nodes: invoice (attributes form="00", type="estimated"), addressdata, name ("John Doe"), address, streetaddress ("Pyynpolku 1"), postoffice ("70460 KUOPIO"). Node kinds labelled: Document, Element, NamedNodeMap, Text.]
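The node kinds named on this slide (Document, Element, NamedNodeMap, Text) can be seen at work in a short Java sketch. The exact nesting of the invoice elements is an assumption, since the slide's markup did not survive intact; the class name `DomInvoice` is also made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class DomInvoice {
    // Assumed nesting; the slide preserves only the element names and the text.
    static final String XML =
          "<invoice form=\"00\" type=\"estimated\">"
        + "<addressdata><name>John Doe</name>"
        + "<address><streetaddress>Pyynpolku 1</streetaddress>"
        + "<postoffice>70460 KUOPIO</postoffice></address>"
        + "</addressdata></invoice>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Navigates Document -> Element -> NamedNodeMap / Text. */
    static String describe(Document doc) {
        Element invoice = doc.getDocumentElement();          // root Element
        NamedNodeMap atts = invoice.getAttributes();         // its attributes
        String form = atts.getNamedItem("form").getNodeValue();
        Node name = doc.getElementsByTagName("name").item(0);
        Text text = (Text) name.getFirstChild();             // Text child of <name>
        return invoice.getTagName() + " form=" + form + " name=" + text.getData();
    }

    /** Modifies the tree in memory and reads the new value back. */
    static String retype(Document doc, String newType) {
        doc.getDocumentElement().setAttribute("type", newType);
        return doc.getDocumentElement().getAttribute("type");
    }

    public static void main(String[] args) {
        Document doc = parse();
        System.out.println(describe(doc));
        System.out.println(retype(doc, "final"));
    }
}
```

Unlike the SAX sketch, the whole document is held in memory, so the program can revisit and change any node after parsing.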

15 Overview of XSLT Transformation

16 JAXP 1.1
• An interface for "plugging in" and using XML processors in Java applications
  – includes packages
    » org.xml.sax: SAX 2.0 interface
    » org.w3c.dom: DOM Level 2 interface
    » javax.xml.parsers: initialization and use of parsers
    » javax.xml.transform: initialization and use of transformers (XSLT processors)
• Included in JDK starting from version 1.4

17 JAXP: Using a SAX parser (1)
[Figure: call chain .newSAXParser(), .getXMLReader(), .parse("f.xml"), applied to the XML document f.xml.]
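The call chain in this figure corresponds to the following JAXP usage. This is a minimal sketch: the class name `JaxpSax` and the element-counting handler are illustrative assumptions, and the document is parsed from a string instead of the file f.xml to keep the example self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class JaxpSax {
    /** Counts the elements of a document through an XMLReader obtained via JAXP. */
    static int countElements(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance(); // picks the plugged-in implementation
            SAXParser parser = factory.newSAXParser();
            XMLReader reader = parser.getXMLReader();                  // the SAX 2.0 interface

            int[] count = {0};
            reader.setContentHandler(new DefaultHandler() {
                @Override public void startElement(String uri, String localName,
                                                   String qName, Attributes atts) {
                    count[0]++; // one start tag per element
                }
            });
            // For a file, reader.parse("f.xml") takes a system identifier instead.
            reader.parse(new InputSource(new StringReader(xml)));
            return count[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```

The application code names only JAXP and SAX interfaces; which parser implementation runs behind them is decided by the factory.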

18 JAXP: Using a DOM parser (1)
[Figure: call chain .newDocumentBuilder(), followed by either .parse("f.xml") on the document f.xml or .newDocument().]
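Both branches of this figure, parsing an existing document and creating an empty one, can be sketched as below. The class name `JaxpDom` and the `greeting` element are assumptions for illustration, and a string stands in for the file f.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class JaxpDom {
    static DocumentBuilder builder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Branch 1: parse an existing document and return the name of its root element. */
    static String rootOf(String xml) {
        try {
            // For a file, builder().parse("f.xml") takes a URI instead.
            Document doc = builder().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTagName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Branch 2: start from an empty Document and build the tree by hand. */
    static Document build() {
        Document doc = builder().newDocument();
        Element root = doc.createElement("greeting"); // nodes are created by their Document
        root.appendChild(doc.createTextNode("Hello"));
        doc.appendChild(root);
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(rootOf("<x><y/></x>"));
        System.out.println(build().getDocumentElement().getTextContent());
    }
}
```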

19 JAXP: Using Transformers (1)
[Figure: call chain .newTransformer(…) on an XSLT script, followed by .transform(.,.).]
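The two calls in this figure can be sketched as follows. The stylesheet, the `greeting` input document and the class name `JaxpTransform` are made up for illustration; only the `newTransformer` and `transform` calls come from the slide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class JaxpTransform {
    // Hypothetical XSLT script that turns <greeting to="..."/> into plain text.
    static final String XSLT =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">Hello, <xsl:value-of select=\"greeting/@to\"/>!</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the XSLT script to xml and returns the serialized result. */
    static String transform(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // newTransformer(Source) compiles the script into a Transformer.
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            // transform(Source, Result): input document in, result document out.
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<greeting to=\"world\"/>"));
    }
}
```

Sources and results are interchangeable: a DOMSource or SAXSource can replace the StreamSource, which is how JAXP connects the transformer to the two parser interfaces.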

20 CSS - Cascading Style Sheets
• A stylesheet language
  – mainly to specify the presentation of web pages by attaching style (fonts, colours, margins, …) to HTML/XML documents
• Example style rule: H1 {color: blue; font-weight: bold;}

21 CSS Processing Model (simplified)
0. Parse the document into a tree
1. Match style rules to elements of the tree
  – annotate each element with a value assigned for each relevant property
    » inheritance and, in case of competing rules, elaborate "cascade" rules are applied to select which value is assigned
2. Generate a formatting structure of the annotated document tree
  – consists of nested rectangular boxes
3. Render the formatting structure
  – display, print, audio-synthesize, ...

22 Transformation & Formatting
[Figure: a two-stage process with stages labelled I and II, taking an XSLT script as input.]

23 Page regions
A simple page can contain 1-5 regions, specified by child elements of the simple-page-master

24 Top-level formatting objects
• Slightly simplified:
  fo:root
    fo:layout-master-set
      (fo:simple-page-master | fo:page-sequence-master)+
        fo:simple-page-master: fo:region-body fo:region-before? fo:region-after? fo:region-start? fo:region-end?
        fo:page-sequence-master: specify masters for page sequences, by referring to simple-page-masters
    fo:page-sequence+ : contents of pages
      fo:flow

25 XQuery in a Nutshell
• Functional expression language
• Strongly-typed: (XML Schema) types may be assigned to expressions statically
• Extends XPath 2.0 (but not all axes required)
  – common for XQuery 1.0 and XPath 2.0:
    » Functions and Operators, W3C WD 4/4/2005
• Roughly: XQuery ≈ XPath' + XSLT' + SQL'

26 FLWOR ("flower") Expressions
• Constructed from for, let, where, order by and return clauses (~SQL select-from-where)
• Form: (ForClause | LetClause)+ WhereClause? OrderByClause? "return" Expr
• FLWOR binds variables to values, and uses these bindings to construct a result (an ordered sequence of nodes)

27 XQuery Example
for $pn in distinct-values(doc("sp.xml")//pno)
let $sp := doc("sp.xml")//sp_tuple[pno=$pn]
where count($sp) >= 3
order by $pn
return { {$pn}, {avg($sp/price)} }
[The element constructor tags around the enclosed expressions were lost in extraction.]
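What this query computes can be approximated in plain Java over a DOM tree, which may help relate the FLWOR clauses to familiar loops. This is a rough analogue, not an XQuery implementation: the class name `FlworSketch`, the `sp` root element and the sample data are assumptions, since the slides do not show the contents of sp.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FlworSketch {
    // Hypothetical stand-in for sp.xml: P1 has three suppliers, P2 only one.
    static final String SP_XML = "<sp>"
        + tuple("P1", "10") + tuple("P1", "20") + tuple("P1", "30")
        + tuple("P2", "5") + "</sp>";

    static String tuple(String pno, String price) {
        return "<sp_tuple><pno>" + pno + "</pno><price>" + price + "</price></sp_tuple>";
    }

    /** Maps each part number with at least three sp_tuples to its average price. */
    static Map<String, Double> wellSupplied(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // for + let: group the prices of all sp_tuple elements by pno.
        // The TreeMap keeps keys sorted, which plays the role of "order by $pn".
        Map<String, List<Double>> pricesByPno = new TreeMap<>();
        NodeList tuples = doc.getElementsByTagName("sp_tuple");
        for (int i = 0; i < tuples.getLength(); i++) {
            Element t = (Element) tuples.item(i);
            String pno = t.getElementsByTagName("pno").item(0).getTextContent();
            double price = Double.parseDouble(
                t.getElementsByTagName("price").item(0).getTextContent());
            pricesByPno.computeIfAbsent(pno, k -> new ArrayList<>()).add(price);
        }
        // where + return: keep groups of size >= 3 and emit (pno, average price).
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Double>> e : pricesByPno.entrySet()) {
            if (e.getValue().size() >= 3) {
                double avg = e.getValue().stream()
                    .mapToDouble(Double::doubleValue).average().orElse(0.0);
                result.put(e.getKey(), avg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wellSupplied(SP_XML));
    }
}
```

The six-line query replaces roughly thirty lines of tree walking and grouping, which is the point of a dedicated query language.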

28 Course Main Message
• XML is a universal way to represent information as tree-like data structures
• There are specialized and powerful technologies for processing it
• Development is still ongoing

