1 Semantics Session 1 (mon 19, 16:30-18:00, Vulcania 1) Vocabularies: –Overview of vocabulary document (APM) –Discussion to resolve WD open issues (NG, AG,...) Contributions: –Mapping between Vocabulary terms (A. Gray) –Towards an IVOA Vocabulary (APM, NG, SD,...) –Publishing and maintaining vocabularies (NG...)

2 Status: –July 2007: vocabularies in XML format + UCD-like syntax –October 2007: agreement on the standard W3C formats RDF and SKOS –March 2008: WD “Vocabularies in the Virtual Observatory” v1.00 –Open Issues: discuss and validate them –WD v1.0 > PR v1.1

3 Vocabularies: Open Issues – see the summary Note by NG at /ivoa/vocabularies/issues

4 Vocabularies: Open Issues
1. Format of the master vocabulary CLOSED?
2. Format of the distributed vocabularies CLOSED?
3. Identifying vocabulary versions
4. Who maintains vocabularies? CLOSED?
5. What vocabularies are included in the standard? CLOSED?
6. Inclusion of mappings in vocabularies

5 1. Format of the master vocabulary
What should be the format of the master files?
Possible resolution 1: nothing mandated in the document; the format of the master file should be whatever is most convenient, as long as the generated and distributed files are valid SKOS.
Possible resolution 2: SKOS, in Turtle notation, possibly requiring some post-processing to add omitted-but-inferrable relations. This is easy to read and write, and it is simple enough that it would be feasible to create from scratch a parser for the relevant subset of it.
Possible resolution 3: some more fundamental no-punctuation format, such as that for the Lexicon program.
Provisional resolution: option (1) above – nothing mandated. Only the distribution format is to be specified (no objections on the list).
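The post-processing step mentioned under possible resolution 2 could look roughly like the sketch below. It assumes the master file has already been read into a set of (subject, predicate, object) triples, and it only fills in the inverse of skos:broader / skos:narrower and the symmetric skos:related. The function name and the example.org concept URIs are hypothetical, chosen for illustration only.

```python
# Minimal sketch: add omitted-but-inferrable SKOS relations to a set of triples.
# Assumption: triples are plain (subject, predicate, object) tuples of URI strings.
SKOS = "http://www.w3.org/2004/02/skos/core#"
BROADER = SKOS + "broader"
NARROWER = SKOS + "narrower"
RELATED = SKOS + "related"

# broader and narrower are inverses of each other; related is symmetric.
INVERSE = {BROADER: NARROWER, NARROWER: BROADER, RELATED: RELATED}


def add_inverse_relations(triples):
    """Return the input triples plus any missing inverse statements."""
    closed = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE.get(predicate)
        if inverse is not None:
            closed.add((obj, inverse, subject))
    return closed


# Hypothetical concept URIs, for illustration only.
EX = "http://example.org/vocab#"
master = {(EX + "spiralGalaxy", BROADER, EX + "galaxy")}
distributed = add_inverse_relations(master)
```

With this input, the distributed set keeps the original skos:broader statement and gains the matching skos:narrower statement, so the master file only needs to state each relation once.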

6 2. Format of the distributed vocabularies
In which format should vocabularies be distributed?
Possible resolution 1: the standard simply mandates that they be distributed in at least one well-known RDF format (which means either RDF/XML or Turtle, which is equivalent to N3 for this purpose). This implies that an RDF parser will, realistically, be required in order to process the vocabulary files.
Possible resolution 2: the standard requires them to be distributed in a format which is parseable as RDF, but which is also regular enough that it is usefully interpretable as ‘normal’ XML.
Provisional resolution: option (1) above – distribution in any RDF serialization.

7 3. Identifying vocabulary versions
Do vocabulary users refer to a concept URI with an explicit version, or to a constant URI which always refers to the latest version?
Possible resolution 1: users always refer to the same concept URI, as for example in [...], and this refers to the latest version of the vocabulary. The Dublin Core metadata set does this.
Possible resolution 2: users refer to a concept URI without a version; this URL returns a vocabulary with a versioned namespace (this violates good practice).
Possible resolution 3: users refer to concepts which have a version explicit within the namespace, as for example in [...] (the precise location of the version number or date in the URI is a distribution/maintenance detail).
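To make the difference between resolutions 1 and 3 concrete, here is a small sketch. The base URL, the vocabulary and concept names, and the choice to place the version as a path segment before the fragment are all assumptions made for illustration; the slides do not fix any URI layout.

```python
# Minimal sketch: unversioned vs. versioned concept URIs.
# Assumption: the version, when present, is a path segment of the namespace.
def concept_uri(base, vocabulary, concept, version=None):
    """Build a concept URI; a version, if given, becomes part of the namespace."""
    namespace = f"{base}/{vocabulary}"
    if version is not None:
        namespace = f"{namespace}/{version}"
    return f"{namespace}#{concept}"


BASE = "http://example.org/vocabularies"  # hypothetical base URL

# Resolution 1: one constant URI, always meaning the latest version.
stable = concept_uri(BASE, "constellations", "Orion")

# Resolution 3: the version is explicit, so each release has its own URI.
v10 = concept_uri(BASE, "constellations", "Orion", version="v1.0")
v11 = concept_uri(BASE, "constellations", "Orion", version="v1.1")
```

Under resolution 1, data annotated with `stable` never needs rewriting but may silently change meaning when the vocabulary is revised. Under resolution 3, `v10` and `v11` are different identifiers, so relating them across releases requires an explicit statement.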

8 4. Maintenance (1/2)
By whom, and by what process, are vocabularies maintained?
Option 1: the vocabularies in the standardized document are regarded purely as examples, with no normative force and no specified maintenance process.
Option 2: the document's vocabularies are normative, and the document should define a maintenance process, possibly modelled on the UCD process.
Option 3: the document's vocabularies are normative, but not claimed to be more than merely adequate. They will not be developed as part of this standard's evolution, but instead be maintained by other interest groups, either within or outside the IVOA process.

9 4. Maintenance (2/2)
Are there minimal standards of curation which conforming vocabularies must abide by? For example, need we require vocabulary maintainers to use the mechanisms, or just rely on their good sense?
Provisional resolution: Option 3. The final published standard will include a number of SKOS vocabularies produced as part of this process. These will be usable and citable, and the community will be encouraged to use them, but they will not be maintained after the standard is complete. Instead, the ‘owners’ of the underlying vocabularies (for example the UCD maintenance group) will be encouraged to maintain the SKOS version alongside their other forms. In particular, the IVOA-T vocabulary will be developed and maintained in a parallel standard to this one.

10 5. What vocabularies are included in the standard? (1/2)
There are six vocabularies which have been associated with the draft standardization process, namely:
the A&A journal keyword list, the IVOA AOIM list, and the 1993 IAU thesaurus, whose inclusion should be completely uncontroversial;
an IVOA Thesaurus based on the IAU-93, which may or may not be in this standard depending on whether people would prefer a completely separate process to develop it;
a UCD1+ vocabulary (though this deals with a different set of concepts, namely data types, from the other vocabularies and might arguably connect poorly to them);
and a SKOS version of the list of constellations, which is very simple, and which might reasonably find a home in this standard on that ground alone.

11 5. What vocabularies are included in the standard? (2/2)
In addition, there are multiple informal keyword lists associated with VOEvent. These haven't been SKOSified at all, and Rick's excellent suggestion is that these be left as homework for the VOEvent group. Plus Theory/Simulations...
Provisional resolution: include all five/six. The A&A, AOIM, UCD1+, IAU-93 and constellations vocabularies will be finished and immediately usable. The IVOA-T vocabulary will be developed in a parallel process to this vocabularies standard: it will be referred to, and a snapshot of it may be included in the standard, but it will be clearly marked as a work in progress.

12 6. Inclusion of mappings in vocab.s (1/2)
Consideration 1: the mappings spec is still in flux, and likely to remain so for some time after the SKOS core document is standardized.
Consideration 2: the situation could develop where there are multiple third-party mappings between vocabularies, maintained by specific communities, or which describe mappings at different levels of granularity, or which represent significant labour on the part of individuals, adding value to the network of vocabularies.

13 6. Inclusion of mappings in vocab.s (2/2)
Suggested resolution: include mappings as non-normative parts of this standard, published alongside, but separate from, the normative SKOS versions of the vocabularies, and using whatever are the then-current best mapping practices. In this standard, and in the best-practice guidelines we include, we should proscribe inter-vocabulary mappings being published as part of a vocabulary. Vocabularies and the mappings between them are conceptually separate entities, although they will in practice likely be maintained together.
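One way to keep vocabularies and mappings in separate files while maintaining them together is to split a combined set of triples at publication time. The sketch below assumes triples are (subject, predicate, object) tuples and uses the mapping property names from the SKOS Reference as later published by the W3C; since the slides note that the mapping spec was still in flux, treat that property list as an assumption. The function name and the example.org URIs are hypothetical.

```python
# Minimal sketch: separate inter-vocabulary mappings from the vocabulary proper.
# Assumption: mapping statements are recognised by their SKOS mapping predicate.
SKOS = "http://www.w3.org/2004/02/skos/core#"
MAPPING_PROPERTIES = {
    SKOS + name
    for name in (
        "mappingRelation",
        "closeMatch",
        "exactMatch",
        "broadMatch",
        "narrowMatch",
        "relatedMatch",
    )
}


def split_mappings(triples):
    """Return (vocabulary_triples, mapping_triples) as two disjoint sets."""
    vocabulary, mappings = set(), set()
    for triple in triples:
        predicate = triple[1]
        if predicate in MAPPING_PROPERTIES:
            mappings.add(triple)
        else:
            vocabulary.add(triple)
    return vocabulary, mappings


# Hypothetical concepts in two vocabularies, for illustration only.
A = "http://example.org/vocabA#"
B = "http://example.org/vocabB#"
combined = {
    (A + "galaxy", SKOS + "prefLabel", "galaxy"),
    (A + "galaxy", SKOS + "exactMatch", B + "Galaxies"),
}
vocabulary, mappings = split_mappings(combined)
```

The `vocabulary` set could then be published as the normative file and the `mappings` set as a separate, non-normative one; an empty `mappings` result also serves as a check that a vocabulary file contains no mappings.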

