Vocabulary Services: “Huuh - what is it good for…” (in WDTS anyway…). 4th September 2009, Jonathan Yu, CSIRO Land and Water.

1 Vocabulary Services: “Huuh - what is it good for…” (in WDTS anyway…). 4th September 2009, Jonathan Yu, CSIRO Land and Water

2 CSIRO. Talk outline: Water Data Transfer Standards. What are vocab services? How are they used? How are we using them in WDTS? Future work.

3 CSIRO. Talk outline
What are vocab services? A set of services for querying and managing vocabularies.
Vocabularies themselves (water regulation codes, units of measure, pizza classifications, wine ontologies): metadata about a domain, covering concepts, properties, relationships and assertions. RDF-based vocabulary languages: RDFS, SKOS, OWL. Embedding RDF into XHTML (RDFa).
Interfaces: SPARQL Protocol and RDF Query Language (SPARQL) queries; HTTP GET/POST, REST; HTML forms.
How are they used?
Dictionary lookup: What does this term mean? What is a margherita pizza? What is the German equivalent of that pizza (marguerite?)?
Discovery and analysis: What is it related to? What pizza has similar toppings to a margherita pizza? Where does the concept “Gold” occur in geological surveys in Australia?
Interoperability and shared definitions: Oh, this concept in my business model maps to this other concept in your business model! Oh, frutti di mare pizza is actually a seafood pizza!
Data validation: Is this a valid pizza order? Is my XML data consistent with an international standard?
Config. management and generating code: Fill templates or spit out some code based on concepts, properties, or conceptual structure in the vocabulary (pizza ordering website, sitemaps, Java code, Schematron rules).
How are we using them in WDTS?
Validation services: Validate potentially lots and lots of XML data in WDTF format. More than 200 data providers are transferring their water data to BoM, so we need to ensure the format is followed.
Hang on, can’t we just use XML Schema to enforce validation rules? XML Schema alone is not sufficient: it cannot capture much of the semantics in the business rules, such as specific cardinality constraints and vocabulary checking.
Examples, using HTTP GET queries: Is this a valid vocabulary definition? Is this URN valid? Does this water regulation code parameter have the right measurement unit associated with it?
Generating Schematron code: To check cardinality between one element and another. Example: your HydroCollection xml data may have as many nodes but must only have one node.
Future work
Validation using complex business rules: Currently unknown. We suspect we will continue to push the boundary by leveraging vocabulary services.
Documentation generation: Leverage the vocabulary service to aid documentation generation, i.e. constraints.

4 CSIRO. Water Data Transfer Standards (WDTS): Joint effort by CSIRO & Bureau of Meteorology (BoM). Problem space: standardising the format of water observation data. Currently, water data providers send data in various formats.

5 CSIRO. WDTF: Develop the Water Data Transfer Format (WDTF), a standardised XML format for sending and receiving water-related data (e.g. groundwater, river flow). Primarily used by water data providers to send their data for ingestion by BoM, but also for exchange between other organisations. Part of an integrated national water information system to help with the water crisis.

6 CSIRO. Validating WDTF: Potentially lots of agencies (over 200) submitting WDTF. We can’t possibly examine each XML file by hand for valid structure and content! We need mechanism(s) for validating WDTF.

7 CSIRO. Why not use XML Schema? Hang on, can’t we just use XML Schema to enforce validation rules? XML Schema alone is not sufficient: it cannot capture much of the semantics in the business rules.
Example WDTF content (identifiers and values only):
http://www.bom.gov.au/std/water/xml/wio0.2/procedure/QualityMethod/bom/611
http://www.bom.gov.au/std/water/xml/wio0.2/party/laboratory/w00233/SWC-LAB
http://www.bom.gov.au/std/water/xml/wio0.2/property//bom/WaterpH_pH
Reg200806.s3.9g, Unclassified, 7.56
What needs checking: appropriate identifiers; valid content and contextual use.

8 CSIRO. What are Vocab Services? A set of services for querying and managing vocabularies. 1. Interfaces: SPARQL Protocol and RDF Query Language (SPARQL) queries; HTTP GET/POST, REST; HTML forms. 2. Vocabularies: descriptions about a domain in a specification language, covering concepts, properties, relationships and assertions.

9 CSIRO. Vocabularies What do they look like?

10 CSIRO. Vocabularies: What do they look like? Examples: water regulation codes, units of measure, pizza classifications (http://www.co-ode.org/ontologies/pizza/2007/02/12/), wine vocabularies.

11 CSIRO. Examples of specification languages: Limited here to RDF-based vocabulary languages (RDF/XML). Simple Knowledge Organization System (SKOS): simple taxonomic descriptions using broader, narrower and related relationships. Web Ontology Language (OWL): ability to describe a domain in very specific logic, e.g. Class A is disjoint from Classes B, C and D, has a custom relationship with a defined cardinality constraint to Class B, and is a subclass of Class X.
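
To make the SKOS relationships on this slide concrete, here is a minimal sketch in plain Python. It is only an illustration: the `ex:` concept names are invented for this example, and the triples are held in a simple list rather than a real RDF store.

```python
# Hypothetical SKOS-style statements as (subject, predicate, object) triples.
# The "ex:" concepts are made up for illustration only.
TRIPLES = [
    ("ex:Margherita", "skos:broader", "ex:VegetarianPizza"),
    ("ex:VegetarianPizza", "skos:broader", "ex:Pizza"),
    ("ex:FruttiDiMare", "skos:broader", "ex:SeafoodPizza"),
    ("ex:SeafoodPizza", "skos:broader", "ex:Pizza"),
    ("ex:Margherita", "skos:related", "ex:Mozzarella"),
]


def broader_closure(concept, triples=TRIPLES):
    """Return every concept reachable from `concept` via skos:broader."""
    found = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def narrower(concept, triples=TRIPLES):
    """Return the direct children of `concept` (the inverse of skos:broader)."""
    return sorted(s for s, p, o in triples if p == "skos:broader" and o == concept)
```

Calling `broader_closure("ex:Margherita")` walks up the taxonomy, while `narrower("ex:Pizza")` lists its direct children. OWL-style constraints such as disjointness or cardinality need a reasoner and are outside what this sketch covers.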

12 CSIRO. What are vocab services good for?
1. Dictionary lookup: What does this term mean? What is beetroot? What is Metres? http://localhost:8080/VocabLookup/get/concept/vocab1.0/unit:Metres
2. Discovery and analysis: What is it related to? I know I have beetroot in my fridge, what other related food is in my fridge? What water regulation parameters use the unit Metres? Where does the concept “Gold” occur in geological surveys in Victoria? http://portal.auscope.org/gmap.html
3. Interoperability and shared definitions and semantics: Oh, this concept in my business model maps to this other concept in your business model! Oh, your parameter of WaterCourseLevel is measured in metres? Mine is in millimetres – let’s talk.
4. Data validation: Do I have milk in my fridge? Is this a valid water parameter? Is my XML data consistent with WDTF?
5. Config. management and generating code: Fill templates or spit out some code or artifact based on concepts, properties, or conceptual structure in the vocabulary (my dinner, sitemaps, Java code, Schematron rules).

13 CSIRO. Vocab Services in WDTS: Leveraging vocabulary services for (1) representing the schema control lists currently maintained in an Excel spreadsheet, and (2) validation services: validating potentially lots and lots of XML data in WDTF format. More than 200 data providers are transferring their water data to BoM, so we need to ensure the format is followed.

14 CSIRO. Typical usage of vocabulary service: Specific cardinality constraints and vocabulary checking, using HTTP GET queries like: Is this a valid vocabulary definition?
http://localhost:8080/VocabLookup/get/concept/vocab1.0/param%3AWaterpH_pH
http://localhost:8080/VocabLookup/check/concept/vocab1.0/param%3AWaterpH_pH
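
The query URLs on this slide follow a visible pattern of action, kind, vocabulary and term. The sketch below shows one way a client might assemble them with the Python standard library. The helper name `lookup_url` is hypothetical, and the response format of the service is not shown on the slides, so the sketch stops at building the URL.

```python
from urllib.parse import quote

# Host, port and service name exactly as they appear on the slide.
BASE = "http://localhost:8080/VocabLookup"


def lookup_url(action, kind, vocab, *terms):
    """Build a VocabLookup-style URL, percent-encoding each path segment.

    `action` is e.g. "get" or "check"; `kind` is e.g. "concept".
    This helper is an illustrative assumption, not part of the service.
    """
    segments = [action, kind, vocab, *terms]
    return BASE + "/" + "/".join(quote(segment, safe="") for segment in segments)


get_url = lookup_url("get", "concept", "vocab1.0", "param:WaterpH_pH")
check_url = lookup_url("check", "concept", "vocab1.0", "param:WaterpH_pH")
```

Encoding each segment separately turns the colon in `param:WaterpH_pH` into `%3A`, matching the slide, and keeps any slash inside a term from being read as a path separator. The resulting string could then be passed to `urllib.request.urlopen` against a running service.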

15 CSIRO. More query examples
Is this URN or HTTP URI valid? E.g. urn:ogc:def:crs:EPSG::28349
http://localhost:8080/VocabLookup/check/property/vocab1.0/dc:identifier/%27urn:ogc:def:crs:EPSG::28349%27
Does this water regulation code parameter have the right measurement unit associated with it?
http://localhost:8080/VocabLookup/check/relation/vocab1.0/param:WaterpH_pH/skos:related/dc:identifier/%27[pH]%27
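
One plausible reading of the relation check on this slide is: "does the parameter have a `skos:related` concept whose `dc:identifier` is `[pH]`?" The sketch below implements that reading over an in-memory list of triples. The vocabulary content (including the `unit:pH` concept and the `m` identifier) and the function name `check_relation` are assumptions for illustration, since the slides show neither the stored vocabulary nor how the service evaluates the query.

```python
# Hypothetical vocabulary content as (subject, predicate, object) triples.
TRIPLES = [
    ("param:WaterpH_pH", "skos:related", "unit:pH"),
    ("unit:pH", "dc:identifier", "[pH]"),
    ("unit:Metres", "dc:identifier", "m"),
]


def check_relation(subject, relation, prop, value, triples=TRIPLES):
    """True if `subject` is linked via `relation` to a concept whose `prop` equals `value`."""
    # Step 1: collect every concept the subject points at through the relation.
    related = {o for s, p, o in triples if s == subject and p == relation}
    # Step 2: see whether any of those concepts carries the expected property value.
    return any(s in related and p == prop and o == value for s, p, o in triples)


ph_ok = check_relation("param:WaterpH_pH", "skos:related", "dc:identifier", "[pH]")
metres_ok = check_relation("param:WaterpH_pH", "skos:related", "dc:identifier", "m")
```

Here `ph_ok` is true and `metres_ok` is false, which is the kind of answer a validator needs when deciding whether a submitted measurement uses the unit the vocabulary expects.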

16 CSIRO. Generating Schematron code from query: To check cardinality between one element and another. Example: your HydroCollection xml data may have as many nodes but must only have one node.
http://localhost:8080/VocabLookup/get/cardinality/wdtf-structure/wdtf:HydroCollection/wdtf:metadata
Or just get all of them:
http://localhost:8080/VocabLookup/getall/cardinality/wdtf-structure2/
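
As a rough sketch of how a cardinality answer could be turned into a Schematron rule, the function below emits a `rule` with an `assert` whose XPath test counts child elements. Several things here are assumptions: the function name `cardinality_rule`, the idea that the service answer reduces to a minimum and maximum count, and the pairing of `wdtf:HydroCollection` with exactly one `wdtf:metadata` (inferred from the query URL on the slide, where the element names in the sentence were lost).

```python
from xml.sax.saxutils import escape, quoteattr


def cardinality_rule(parent, child, minimum, maximum=None):
    """Return a Schematron rule asserting how often `child` occurs inside `parent`."""
    if maximum is None:
        # Open-ended upper bound: only the minimum is enforced.
        test = f"count({child}) >= {minimum}"
        message = f"{parent} must contain at least {minimum} {child} element(s)."
    elif minimum == maximum:
        test = f"count({child}) = {minimum}"
        message = f"{parent} must contain exactly {minimum} {child} element(s)."
    else:
        test = f"count({child}) >= {minimum} and count({child}) <= {maximum}"
        message = f"{parent} must contain between {minimum} and {maximum} {child} element(s)."
    # quoteattr adds the surrounding quotes and escapes XML-special characters.
    return (
        f"<sch:rule context={quoteattr(parent)}>\n"
        f"  <sch:assert test={quoteattr(test)}>{escape(message)}</sch:assert>\n"
        f"</sch:rule>"
    )


rule = cardinality_rule("wdtf:HydroCollection", "wdtf:metadata", 1, 1)
```

In a complete Schematron file, rules like this sit inside a `sch:pattern` within a `sch:schema` that declares the `sch` and `wdtf` namespace prefixes. Generating one rule per cardinality entry is what the "get all of them" query would make convenient.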

17 CSIRO. Problems and areas of difficulty
Emerging requirements: open source tools; standards for representing vocabularies; implementation-specific details of how vocabularies are stored and managed; exploring what we can and can’t do with vocabulary services.
Method or approach used: encapsulating the vocabulary in the ‘right’ way (there are various ways to represent something); best practices for querying; versioning.

18 CSIRO. Conclusion: The WDTS project is the driving problem space for vocabulary services. Vocabulary services: Huuh - what is it good for… absolutely something! At least, for validating content and business rules in WDTF.

19 CSIRO. Future work
Continuing to develop solutions to current business rules and best practices for WDTF 1.0, 1.1.
Validation of future (complex) business rules: currently unknown. We suspect that we will continue to push the boundary by leveraging vocabulary services.
Documentation generation: leverage the vocabulary service to aid documentation generation, i.e. populating constraints.
WaterML 2.0: a worldwide standard for a water data exchange format.

20 CSIRO. Questions?

