
1 Advanced but ever-so-useful XML Technologies: Schema, XPath, XQuery, XSL
an incomplete introduction
Ray Plante
The US National Virtual Observatory
7 September 2005, NVO Summer School 2005 - Aspen

2 XML Schema
What it is:
– A W3C* standard for defining and verifying an XML grammar
An XML Schema document describes…
– a set of legal XML tags and attributes,
– what order they go in, and
– what values are allowed.
An XML Schema-aware XML parser can tell you if an XML document follows the rules of the grammar
*World Wide Web Consortium
Why you might care:
– VO uses XML to encode metadata and service messages
– XML Schema is used to define metadata encoding and message syntax
– Ability to read XML Schema will help you understand what metadata is needed by an app and how to encode it
  – VO end users and many developers will never need to know about Schema
  – Can be helpful for debugging XML documents and messages
What you'll get from this session:
– Rudimentary skills for reading an XML Schema document to discern the specified XML syntax
– An understanding of the role of namespaces in supporting multiple schemas
– A tool for validating an instance document

3 Schema uses XML to describe syntax
http://www.w3.org/TR/xmlschema-0/
Contains a list of definitions
– Elements and types
– Attributes, groups, attribute-groups
Types
– Simple types: string, integer, dateTime, etc.
  – Example values: The Astronomy Digital Image Library (string), 10000 (integer), 2002-09-30T07:45:00 (dateTime)
– Complex types: contain other elements
– Defining types
  – Anonymous type: directly inside the definition of an element
  – Global type: top-level definition, can be reused

4 Namespaces
Namespace: a schema's unique identifier
– Set with the targetNamespace attribute
– Used by an XML document to indicate which schema it is compliant with
– URI format (URL or URN)
Using a schema
– Instance document: XML document that follows the grammar defined by the schema
– xmlns attribute used to identify the default namespace that elements belong to
– Tagging elements with namespace prefixes
  – Prefix defined (anywhere) with xmlns:prefix, e.g. xmlns:res="http://nvoss.org/Resource"
  – A prefix attached to an element or attribute name denotes that it belongs to the associated schema
  – …
– Unqualified elements: a special technique for namespaces
  – Tag the root element (or xsi:type) only
  – Do not use xmlns
  – Can make using multiple schemas easier
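To make the relationship between the default-namespace form and the prefix form concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The resource and title element names are illustrative assumptions; only the namespace URI http://nvoss.org/Resource comes from the slide.

```python
import xml.etree.ElementTree as ET

NS = "http://nvoss.org/Resource"

# Default namespace: xmlns applies to the element and everything inside it.
default_doc = '<resource xmlns="%s"><title>ADIL</title></resource>' % NS

# Prefix form: every element is tagged individually with res:.
prefixed_doc = (
    '<res:resource xmlns:res="%s"><res:title>ADIL</res:title></res:resource>' % NS
)

default_root = ET.fromstring(default_doc)
prefixed_root = ET.fromstring(prefixed_doc)

# ElementTree reports names as {namespace-URI}local-name, so the prefix
# itself disappears and both documents yield identical tags.
print(default_root.tag)
print(prefixed_root.tag)
```

Both forms resolve to the same qualified names; the prefix is only a local abbreviation for the namespace URI.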

5 Validating an instance document
A validating parser can test whether an instance document is compliant with its schema
xsi:schemaLocation attribute tells the parser where to find Schema document(s)
– Look-up list made up of namespace-location pairs
  – xsi:schemaLocation="namespace file-or-URL …"
xsi:schemaLocation is just a recommendation
– Parser may have local copies cached for more efficient parsing
validate: a tool for validating XML documents against schemas
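The namespace-location pairing can be sketched in Python as follows. This only reads the look-up list; it does not validate anything (the Python standard library has no XML Schema validator). The sample document is a hypothetical stand-in for xmltech-simple.xml, and schema_locations is a helper name invented for this sketch.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document; the namespace and file name follow the slides.
DOC = """<resource xmlns="http://nvoss.org/VOResource"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://nvoss.org/VOResource xmltech-simple.xsd">
  <title>NCSA Astronomy Digital Image Library</title>
</resource>"""


def schema_locations(root):
    """Return {namespace: location} from the root's xsi:schemaLocation."""
    # The attribute value is a whitespace-separated list that alternates
    # namespace URI, location, namespace URI, location, ...
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


locations = schema_locations(ET.fromstring(DOC))
print(locations)
```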

6 Simple schema
xmltech-simple.xsd

7 Simple schema
xmltech-simple.xsd
targetNamespace

8 Simple schema
xmltech-simple.xsd
Globally-defined element
– global = direct child of xs:schema
– Only global elements can serve as a document's root element

9 Simple schema
xmltech-simple.xsd
Anonymous type definition

10 Simple schema
xmltech-simple.xsd
Content model
– sequence: a list of elements that must appear in order
– choice: one from a list of elements may appear
– Other models: group, all, any

11 Simple schema
xmltech-simple.xsd
Locally-defined elements
– title: any string
– referenceURL: URI format
– type: restricted to a specified list of strings

12 Simple schema
xmltech-simple.xsd
Occurrence restrictions
– Default: minOccurs=1, maxOccurs=1
– minOccurs=0: optional
– minOccurs=1: required
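As a rough illustration of how the occurrence defaults are read, the Python sketch below parses a small schema and reports each local element's limits. The schema text is a hypothetical reconstruction in the spirit of xmltech-simple.xsd, not the original file, and occurrence_limits is a helper invented here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: one global element with an anonymous complex type
# whose content model is a sequence of three locally-defined elements.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://nvoss.org/VOResource"
           elementFormDefault="qualified">
  <xs:element name="resource">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="referenceURL" type="xs:anyURI" minOccurs="0"/>
        <xs:element name="type" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def occurrence_limits(schema_root):
    """Map each local element name to its (minOccurs, maxOccurs) pair."""
    limits = {}
    path = ".//{%s}sequence/{%s}element" % (XS, XS)
    for decl in schema_root.iterfind(path):
        # Both attributes default to 1 when they are omitted.
        limits[decl.get("name")] = (
            decl.get("minOccurs", "1"),
            decl.get("maxOccurs", "1"),
        )
    return limits


limits = occurrence_limits(ET.fromstring(SCHEMA))
print(limits)
```

Here title is required exactly once, referenceURL is optional, and type must appear at least once but may repeat.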

13 The compliant instance document
xmltech-simple.xml
Element values:
– title: NCSA Astronomy Digital Image Library
– referenceURL: http://adil.ncsa.uiuc.edu/
– type: Archive

14 The compliant instance document
xmltech-simple.xml
Default namespace
Element values:
– title: NCSA Astronomy Digital Image Library
– referenceURL: http://adil.ncsa.uiuc.edu/
– type: Archive

15 The compliant instance document
xmltech-simple.xml
Default namespace
xsi namespace: prefix defined
Element values:
– title: NCSA Astronomy Digital Image Library
– referenceURL: http://adil.ncsa.uiuc.edu/
– type: Archive

16 The compliant instance document
xmltech-simple.xml
Default namespace
xsi namespace: prefix defined
xsi:schemaLocation
– xsi:schemaLocation says, "Load the schema called http://nvoss.org/VOResource from the local file xmltech-simple.xsd"
Element values:
– title: NCSA Astronomy Digital Image Library
– referenceURL: http://adil.ncsa.uiuc.edu/
– type: Archive

18 Global (Reusable) Types
xmltech-globaltypes.xml
– Define a prefix for the targetNamespace

19 Global (Reusable) Types
xmltech-globaltypes.xml
– Define a prefix for the targetNamespace
– Type defined here…

20 Global (Reusable) Types
xmltech-globaltypes.xml
– Define a prefix for the targetNamespace
– Type defined here…
– …and used here

21 Global (Reusable) Elements
xmltech-elrefs.xml
– Element defined here…
– …and used here

22 Schema Documentation
xmltech-documented.xml
Documentation text in the example:
– Any entity or component of a VO application that is describable and identifiable by an IVOA Identifier.
– the full name given to the resource
– URL pointing to a human-readable document describing this resource.
– …
Most schema components can have documentation attached
Documentation is important for defining metadata schemas
Carnivore registry extracts documentation directly from the schema for display to users

23 Derived Types
Two ways to derive a new type from an existing one
– Extension
  – Applicable only to complex types
  – Adds additional elements or attributes to the content model
Example values: ADIL Query Page; http://adil.ncsa.uiuc.edu/help.html; Archive; http://adil.ncsa.uiuc.edu/QueryPage.html

24 Derived Types
Two ways to derive a new type from an existing one
– Restriction
  – Simple types: restrict the legal values in some way
    – Ex: integer: restrict range
    – Ex: string: restrict to match a pattern
  – Complex types:
    – Disallowing optional elements, attributes
    – Restricting occurrences
    – Setting default or fixed values where none were previously set

25 Extending Schemas I: plugging-in derived entities
Suppose you want to define an element in terms of a base type but allow any type derived from it to be inserted in its place
– a form of polymorphism
Two techniques
– xsi:type (a label in the instance document)
– Substitution groups (a label in the schema document)

26 Extending Schemas I: plugging-in derived entities
xsi:type technique
From our example…
– The resource element has the type Resource
– The Service type is derived from the Resource type
– Declaring a service element is not necessary
In the schema document…
In the instance document…
– Example values: ADIL Query Page; http://adil.ncsa.uiuc.ed; Archive; http://adil.ncsa.uiuc.edu/Q
IVOA VOResource schema uses this technique

27 Extending Schemas I: plugging-in derived entities
Substitution group technique
From our example…
– The Service type is derived from the Resource type
– Adding the substitutionGroup attribute to the service element definition means we can substitute a service element anywhere a resource is allowed
In the schema document…
In the instance document…

28 Extending Schemas II: placing extensions into a separate namespace
Separate schema file, separate namespace
– Enables schema evolution in a backward-compatible way.
– Extension file uses xs:import to load the schema being extended
– Instance documents generally must define prefixes for both the original schema namespace and the extension namespace.
Example: VOResource metadata schemas
– Core metadata schema: VOResource
– Extension schemas: VODataService, VORegistry, ConeSearch, SimpleImageAccess

29 A word about namespaces in instance documents… (FYI)
An instance document must indicate what namespace(s) the elements belong to
– So that the document can be validated
– 3 ways to indicate this, controlled by the elementFormDefault attribute in the Schema file
elementFormDefault="qualified"
1. Default namespace: xmlns tags a whole section
   – Advantages: great for simple ns use
   – Disadvantages: can be confusing for authors when drawing on multiple schemas
2. Namespace prefixes: individually tag elements
   – Advantages: visually explicit about ns membership; can mix in elements from different ns with the same name
   – Disadvantages: error-prone, ugly when drawing on multiple schemas
elementFormDefault="unqualified"
3. Unqualified: tag the (global) root element
   – Advantages: single prefix tag needed at the top of the document, no other tags needed; good for when using multiple schemas together; least error-prone; makes XPaths simpler
   – Disadvantage: can't have 2 elements from different ns with the same name
   – IVOA VOResource uses this technique

31 XPath
What it is:
– A W3C standard syntax for pointing to elements, attributes, and/or their values in an XML file
Why you might care:
– XPath is used in two other important XML technologies: XQuery and XSL
– XPath is used in ADQL to query a registry via the standard Registry Interface
What you'll get from this session:
– Ability to form simple XPath queries

32 Can you tell me how to get to Sesame Street?
An XPath is a set of directions from one point in an XML document to another
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
– Begin at the start of the document

33 Can you tell me how to get to Sesame Street?
An XPath is a set of directions from one point in an XML document to another
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
– Go to NewYork

34 Can you tell me how to get to Sesame Street?
An XPath is a set of directions from one point in an XML document to another
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
– Find the Borough named Brooklyn

35 Can you tell me how to get to Sesame Street?
An XPath is a set of directions from one point in an XML document to another
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
– Go to the 3rd light

36 Can you tell me how to get to Sesame Street?
An XPath is a set of directions from one point in an XML document to another
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
– And there's SesameStreet

37 XPath syntax
XPath fields (between the /'s)
– Each represents a descent in the XML hierarchy
  – // means drop any number of levels
– Points to an XML node
  – element name or, if preceded with an @, an attribute name
– Other useful node-matching symbols
  – * = wildcard (any name)
  – . = current node
  – .. = parent node
Context node = starting point
– If it does not start with a /, the XPath is relative to a context-specific starting point.
  – In /NewYork/Burough[@name="Brooklyn"], @name is an XPath relative to /NewYork/Burough
Predicates: [ ]
– Read […] as "where …"
– XPaths inside resolve to the string value inside the element/attribute pointed to
– Operators: = != < <= > >= and or
– Many useful functions: contains(string, string), position(), count(), last(), local-name()
– [3] is short-hand for [position()=3]
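The Sesame Street directions can be tried out with a short Python sketch. The XML below is a hypothetical reconstruction of the example document (only the weather value survives in this transcript), and ElementTree implements only a subset of XPath: paths are written relative to a context element (here the NewYork root) and functions such as contains() are not available.

```python
import xml.etree.ElementTree as ET

# Hypothetical city document; element names follow the XPath on the slides.
CITY = """<NewYork>
  <Burough name="Manhattan">
    <light/>
    <light/>
  </Burough>
  <Burough name="Brooklyn">
    <light/>
    <light/>
    <light>
      <SesameStreet><weather>sunny</weather></SesameStreet>
    </light>
  </Burough>
</NewYork>"""

root = ET.fromstring(CITY)  # context node: the NewYork element

# Relative form of /NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
streets = root.findall("Burough[@name='Brooklyn']/light[3]/SesameStreet")

# An ambiguous path returns every matching node.
all_lights = root.findall("Burough/light")

# // drops any number of levels; .// is the relative equivalent.
weather = root.findall(".//weather")

print(len(streets), len(all_lights), [w.text for w in weather])
```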

38 XPath as a Query
An XPath returns matched XML nodes
/NewYork/Burough[@name="Brooklyn"]/light[3]/SesameStreet
On… sunny
Returns… sunny

39 XPath as a Query
An XPath returns matched XML nodes
/NewYork/Burough/light
On… sunny
Returns… sunny

40 XPath as a Query
An XPath returns matched XML nodes
/NewYork/Burough/light/SesameStreet/weather
On… sunny
Returns… sunny

41 XPath as a Query
An XPath returns matched XML nodes
string(/NewYork/Burough/light/SesameStreet/weather)
On… sunny
Returns… sunny

42 XPath as a Query
An XPath returns matched XML nodes
/NewYork/Burough/light/SesameStreet[weather='sunny']
On… sunny
Returns… sunny
weather was automatically converted to a string before the comparison operator was applied.

43 XPath as a Query
An XPath returns matched XML nodes
– If the path is ambiguous, all matching nodes are returned
– If the path does not resolve to an existing node, the empty set is returned
– Predicates, [ … ], provide constraints
In some contexts, an XPath is automatically converted to the string value inside the matched element or attribute.
Examples querying a set of VOResource documents
/Resource[contains(content/description, 'cluster')]
– Return all resource elements where the description contains the word cluster
/Resource[facility]
– Return all resources that have a facility element
/Resource/capability[@xsi:type='cs:ConeSearch']/interface/accessURL
– Return the interface URLs of all ConeSearch services
/Resource[@xsi:type='vs:DataCollection']/coverage/stc:STCResourceProfile//stc:AllSky
– Return all data collections that purport to have data distributed over all the sky
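A hedged Python sketch of three of these queries follows. The registry wrapper element and both Resource records are invented for illustration, with unqualified element names as in the slide's XPaths. ElementTree has no contains() function, so that test is written in plain Python, and the xsi:type attribute is addressed by its full {namespace}name.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical set of resource records wrapped in a single root element.
REGISTRY = """<registry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Resource>
    <title>Cluster Survey</title>
    <content><description>A catalog of galaxy cluster images</description></content>
    <facility>Example Telescope</facility>
    <capability xsi:type="cs:ConeSearch">
      <interface><accessURL>http://example.org/cone</accessURL></interface>
    </capability>
  </Resource>
  <Resource>
    <title>Quasar List</title>
    <content><description>Positions of bright quasars</description></content>
  </Resource>
</registry>"""

registry = ET.fromstring(REGISTRY)

# /Resource[facility]: resources that have a facility child element.
with_facility = registry.findall("Resource[facility]")

# /Resource/capability[@xsi:type='cs:ConeSearch']/interface/accessURL
cone_path = "Resource/capability[@{%s}type='cs:ConeSearch']/interface/accessURL" % XSI
cone_urls = [url.text for url in registry.findall(cone_path)]

# /Resource[contains(content/description, 'cluster')], with the
# contains() test done in Python instead of XPath.
about_clusters = [
    res for res in registry.findall("Resource")
    if "cluster" in res.findtext("content/description", default="")
]

print([r.findtext("title") for r in with_facility])
print(cone_urls)
print([r.findtext("title") for r in about_clusters])
```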

45 XQuery (a.k.a. XML Query)
What it is:
– A W3C standard for querying XML documents
  – An analogue to SQL for tables
Why you might care:
– It is one of the supported query languages in the standard VO Registry Interface.
– It can handle certain complex registry queries that ADQL cannot
What you'll get from this session:
– Ability to query XML documents by modifying an existing XQuery

46 XQuery: an analogy to SQL
SQL queries tables
– Result of an SQL statement is a table
– Columns of the result table are controlled by the SELECT clause
– Rows are controlled by the WHERE clause
XQuery queries XML documents
– Result of an XQuery is an XML document
– The form of the XML document is set by the return clause
– The contents of the result are controlled by the for, let, and where clauses

47 XQuery syntax: think FLWOR
XQuery supports several types of expressions that can return XML results
– XPath, Constructors, …
– FLWOR = for, let, where, order by, return
FLWOR clauses
– for/let clause selects data from source XML documents

48 XQuery syntax: think FLWOR
Searching for Cone Search services suitable for getting data about quasars
To be used with the Carnivore Registry
  declare namespace vr="http://www.ivoa.net/xml/VOResource/v0.10";
  declare namespace vs="http://www.ivoa.net/xml/VODataService/v0.5";
  for $vr in //Resource/capability[@xsi:type="cs:ConeSearch"]
  where contains($vr//description, "quasar")
  return {string(title)} {string($vr/capability/interface/accessURL)}

49 XQuery syntax: think FLWOR
Searching for Cone Search services suitable for getting data about quasars
To be used with the Carnivore Registry
Annotations:
– Declare namespace prefixes to use in query
  declare namespace vr="http://www.ivoa.net/xml/VOResource/v0.10";
  declare namespace vs="http://www.ivoa.net/xml/VODataService/v0.5";
  for $vr in //Resource/capability[@xsi:type="cs:ConeSearch"]
  where contains($vr//description, "quasar")
  return {string(title)} {string($vr/capability/interface/accessURL)}

50 XQuery syntax: think FLWOR
for clause sets up a loop around matching occurrences
– XPath both selects the Resource element node to put into the variable and constrains which Resources are included
let clause (not used here) can also set variables.
– Used to join across documents, self-joins
– If $vr is used in the variable definition, the new value would be different in each pass of the loop.
Annotations:
– Declare namespace prefixes to use in query
– Loop over all ConeSearch resources
  declare namespace vr="http://www.ivoa.net/xml/VOResource/v0.10";
  declare namespace vs="http://www.ivoa.net/xml/VODataService/v0.5";
  for $vr in //Resource/capability[@xsi:type="cs:ConeSearch"]
  where contains($vr//description, "quasar")
  return {string(title)} {string($vr/capability/interface/accessURL)}

51 XQuery syntax: think FLWOR
where clause further restricts the output
– Optional, often don't need it:
  for $vr in //vr:Resource[@xsi:type="cs:ConeSearch" and contains(.//vr:description, "quasar")]
Annotations:
– Declare namespace prefixes to use in query
– Loop over all ConeSearch resources
– Restrict output to ConeSearch services about quasars
  declare namespace vr="http://www.ivoa.net/xml/VOResource/v0.10";
  declare namespace vs="http://www.ivoa.net/xml/VODataService/v0.5";
  for $vr in //Resource/capability[@xsi:type="cs:ConeSearch"]
  where contains($vr//description, "quasar")
  return {string(title)} {string($vr/capability/interface/accessURL)}

52 XQuery syntax: think FLWOR
return clause sets the output template
– { } used to denote non-literal output
– Applied to each value of $vr
XQuery supports several other expression types not shown here
– Conditionals, function definition, etc.
– Many more predefined functions
XQuery: a full XML processing language
Annotations:
– Declare namespace prefixes to use in query
– Loop over all ConeSearch resources
– Restrict output to ConeSearch services about quasars
– Extract and display desired information
  declare namespace vr="http://www.ivoa.net/xml/VOResource/v0.10";
  declare namespace vs="http://www.ivoa.net/xml/VODataService/v0.5";
  for $vr in //Resource/capability[@xsi:type="cs:ConeSearch"]
  where contains($vr//description, "quasar")
  return {string(title)} {string($vr/capability/interface/accessURL)}
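For readers more at home in a procedural language, the for/where/return structure can be mirrored in Python. This is an analogy only, not XQuery and not what the Carnivore Registry runs. The input records, the results, service, and url output element names, and the cone_searches_about helper are all assumptions made for this sketch; unlike the slide's query, the loop variable here ranges over Resource elements so that both the title and the capability are reachable from it.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical registry contents with unqualified element names.
REGISTRY = """<registry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Resource>
    <title>Quasar Cone Search</title>
    <content><description>Cone search over a quasar catalog</description></content>
    <capability xsi:type="cs:ConeSearch">
      <interface><accessURL>http://example.org/quasars</accessURL></interface>
    </capability>
  </Resource>
  <Resource>
    <title>Cluster Cone Search</title>
    <content><description>Cone search over galaxy clusters</description></content>
    <capability xsi:type="cs:ConeSearch">
      <interface><accessURL>http://example.org/clusters</accessURL></interface>
    </capability>
  </Resource>
</registry>"""


def cone_searches_about(registry, keyword):
    """Build a <results> element listing matching ConeSearch services."""
    results = ET.Element("results")
    for res in registry.iter("Resource"):  # "for": loop over candidates
        cone = next(
            (c for c in res.findall("capability")
             if c.get(XSI_TYPE) == "cs:ConeSearch"),
            None,
        )
        description = res.findtext("content/description", default="")
        if cone is None or keyword not in description:  # "where": filter
            continue
        service = ET.SubElement(results, "service")  # "return": template
        ET.SubElement(service, "title").text = res.findtext("title")
        ET.SubElement(service, "url").text = cone.findtext("interface/accessURL")
    return results


found = cone_searches_about(ET.fromstring(REGISTRY), "quasar")
print(ET.tostring(found, encoding="unicode"))
```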

53 XSL – XML Stylesheet Language
What it is
– A language for describing the transformation of XML data from one form to another
  – XML -> HTML
  – XML -> XML
  – XML -> Plain text
Why you might care
– XSL can be used to create human-readable renderings of XML data from the VO
– XSL is used by the Java Skynode toolkit for converting ADQL to local SQL
What you'll get from this session:
– Ability to modify XML transformations via simple changes to a stylesheet

54 XML Stylesheet Language
XSLT = XSL Transformation
An XML document that describes a transformation
Contains a list of templates
– Each describes the transformation for one type of node (e.g. an element with a particular name)
– Node is identified by an XPath
  – Relative to current context node
– Usually one template for /, the root of the document
– A template can call other templates
XSL provides a number of programming structures
– Conditionals, looping, user-defined functions, built-in functions, extensibility
– Variables are immutable!

55 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Literal text in the stylesheet: Resource Description Record

56 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
– Define prefixes for all namespaces we'll be using
Literal text in the stylesheet: Resource Description Record

57 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
– Define prefixes for all namespaces we'll be using
– Our output format will be plain text
  – Three output types: xml, html, text
Literal text in the stylesheet: Resource Description Record

58 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
– Define prefixes for all namespaces we'll be using
– Our output format will be plain text
– Our root document template sets up the output document and calls the next template
  – Resource is the root of our data document
  – When raw text appears, the XSLT engine will preserve spacing (like carriage returns) around the text
Literal text in the stylesheet: Resource Description Record

59 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– When a template runs, it changes the context node to the node matched by the template
– Subsequent XPaths within the template are relative to that context node
Literal text in the template: ( ) IVOA Identifier: Target Communities:,

60 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– text tags can be used to take explicit control of spacing
Literal text in the template: ( ) IVOA Identifier: Target Communities:,

61 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– text tags can be used to take explicit control of spacing
– value-of will print the string values of nodes
  – Our XPaths point to elements, relative to vr:Resource!
  – value-of will convert each one to a string
Literal text in the template: ( ) IVOA Identifier: Target Communities:,

62 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– text tags can be used to take explicit control of spacing
– value-of will print the string values of nodes
– Pass control to other templates
Literal text in the template: ( ) IVOA Identifier: Target Communities:,

63 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– text tags can be used to take explicit control of spacing
– value-of will print the string values of nodes
– Pass control to other templates
– Loop over all occurrences of contentLevel
  – Normally, apply-templates will automatically loop over multiple occurrences
  – Here, we need to insert commas
  – for-each also changes the context node
Literal text in the template: ( ) IVOA Identifier: Target Communities:,

64 A tour through a stylesheet
xmltech-VOResource.xsl to transform xmltech-adil.xml into plain text
Resource template
– text tags can be used to take explicit control of spacing
– value-of will print the string values of nodes
– Pass control to other templates
– Loop over all occurrences of contentLevel
– If block
  – Use choose/when for if-then-else blocks
Literal text in the template: ( ) IVOA Identifier: Target Communities:,
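The stylesheet itself is not reproduced in this transcript, so the following Python sketch only imitates the kind of plain-text rendering the tour describes: literal text, string values pulled from elements, a comma-joined loop over contentLevel, and a conditional. It is an analogy in Python, not XSLT. The record, the shortName and identifier element names, the ivo://example/adil value, and the condition tested by the if block are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical data document standing in for xmltech-adil.xml.
RECORD = """<Resource>
  <title>NCSA Astronomy Digital Image Library</title>
  <shortName>ADIL</shortName>
  <identifier>ivo://example/adil</identifier>
  <content>
    <contentLevel>University</contentLevel>
    <contentLevel>Research</contentLevel>
  </content>
</Resource>"""


def render_resource(res):
    """Render one Resource element as plain text."""
    lines = ["Resource Description Record", ""]
    title = res.findtext("title", default="")
    short = res.findtext("shortName")
    # Plays the role of choose/when: add "(shortName)" only if present.
    lines.append("%s (%s)" % (title, short) if short else title)
    # Plays the role of value-of: take the string value of an element.
    lines.append("IVOA Identifier: " + res.findtext("identifier", default=""))
    # Plays the role of for-each: visit every contentLevel, comma-separated.
    levels = [level.text for level in res.iter("contentLevel")]
    lines.append("Target Communities: " + ", ".join(levels))
    return "\n".join(lines)


text = render_resource(ET.fromstring(RECORD))
print(text)
```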

65 XSL for metadata
Transformation is a powerful paradigm for metadata processing
– Consider all uses of metadata as a transformation to another form…
  – User display
  – An SQL statement
  – A workflow script
  – Compilable code
– An XSL stylesheet sits somewhere between a configuration file and a script
  – Rapid prototyping and adaptation
  – xsl:import: ability to extend and override other stylesheets
