Beyond ISOcat: CLARIN-NL ISOcat tutorial, 20 June 2013 (www.isocat.org)


1 Beyond ISOcat
www.isocat.org
CLARIN-NL ISOcat tutorial, 20 June 2013

2 Vision
[Diagram labels: Linguistic resources (MPI archive, TDS database, resource); Data category registries (MPI DCR, ISO DCR); Relation registries (Typological Database System RR, MPI RR)]

3 How to make semantics explicit?
Associate data categories with your resources using the PIDs
Where to put the PIDs?
  – Preferably in a schema
  – Or in the resource itself (redundant)
  – Or in the metadata of the resource (less specific)

4 What is a schema?
"comes from the Greek word "σχήμα" (skhēma), which means shape, or more generally, plan." (Wikipedia)
A collection of building blocks and rules on how to combine them into a valid resource
  – XML document: DTD, XML Schema, Relax NG, …
    » easy; see http://www.isocat.org/12620/
  – RDF graph
    » annotation property
    » easy; see http://www.isocat.org/ns/dcr.rdf
  – Text document:
    » A grammar – Extended Backus–Naur Form (EBNF) – ...
    » how to embed Data Category PIDs?
  – …

5 XML resource
nihongo … … …

6 XML resource
<lmf:writtenForm dcr:datcat="http://www.isocat.org/datcat/…"> nihongo … … …
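The idea on this slide, an element that carries its data category PID in a `dcr:datcat` attribute, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `dcr` namespace URI is assumed to be `http://www.isocat.org/ns/dcr`, the `lmf` namespace URI and the enclosing `lexicalEntry` and `note` elements are hypothetical, and `DC-0000` is a made-up PID standing in for the one elided on the slide.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI for the dcr prefix.
DCR_NS = "http://www.isocat.org/ns/dcr"
# Hypothetical placeholder; the slide does not show the lmf namespace URI.
LMF_NS = "http://example.org/lmf"

# Hypothetical sample resource. DC-0000 is a made-up PID, not a real data category.
SAMPLE = f"""
<lmf:lexicalEntry xmlns:lmf="{LMF_NS}" xmlns:dcr="{DCR_NS}">
  <lmf:writtenForm dcr:datcat="http://www.isocat.org/datcat/DC-0000">nihongo</lmf:writtenForm>
  <lmf:note>this element has no data category</lmf:note>
</lmf:lexicalEntry>
""".strip()


def collect_datcats(xml_text):
    """Return (local element name, PID, text) for every element with dcr:datcat."""
    root = ET.fromstring(xml_text)
    attr = f"{{{DCR_NS}}}datcat"  # ElementTree spells qualified names as {uri}local
    found = []
    for elem in root.iter():
        pid = elem.get(attr)
        if pid is not None:
            local_name = elem.tag.rsplit("}", 1)[-1]
            found.append((local_name, pid, (elem.text or "").strip()))
    return found


if __name__ == "__main__":
    for name, pid, text in collect_datcats(SAMPLE):
        print(name, pid, text)
```

Putting the PID on every instance element is the "redundant" option from slide 3; annotating the schema once, as on the next slide, avoids the repetition.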

7 XML Relax NG schema
ipa …

8 CGN/DCOI grammar with DC references
http://lux13.mpi.nl/schemacat/schema/CGN (early alpha version)
  (* @dcr:datcat 'N' http://www.isocat.org/datcat/DC-4909 *)
  ...
  tag = 'N', '(', NTYPE, ',', GETAL, ',', GRAAD, ',', GENUS, ',', NAAMVAL, ')'
  ...
  (* @dcr:datcat NTYPE http://www.isocat.org/datcat/DC-4908 *)
  (* @dcr:datcat 'soortnaam' http://www.isocat.org/datcat/DC-4910 *)
  (* @dcr:datcat 'eigennaam' http://www.isocat.org/datcat/DC-4911 *)
  NTYPE = 'soortnaam' | 'eigennaam' ;
  ...
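Because the data category references live in ordinary EBNF comments, a tool can recover them without parsing the grammar itself. The following is a minimal sketch, assuming the annotation always has the shape `(* @dcr:datcat SYMBOL URL *)` as in the excerpt above; the names `ANNOTATION` and `datcat_annotations` are illustrative and not part of SCHEMAcat.

```python
import re

# Excerpt of the grammar as shown on the slide (elided parts left out).
GRAMMAR = """
(* @dcr:datcat 'N' http://www.isocat.org/datcat/DC-4909 *)
tag = 'N', '(', NTYPE, ',', GETAL, ',', GRAAD, ',', GENUS, ',', NAAMVAL, ')'
(* @dcr:datcat NTYPE http://www.isocat.org/datcat/DC-4908 *)
(* @dcr:datcat 'soortnaam' http://www.isocat.org/datcat/DC-4910 *)
(* @dcr:datcat 'eigennaam' http://www.isocat.org/datcat/DC-4911 *)
NTYPE = 'soortnaam' | 'eigennaam' ;
"""

# Assumed annotation shape: (* @dcr:datcat <symbol> <PID> *)
ANNOTATION = re.compile(r"\(\*\s*@dcr:datcat\s+(\S+)\s+(\S+)\s*\*\)")


def datcat_annotations(grammar):
    """Map each annotated symbol to its data category PID.

    Terminals keep their quotes ('N'), so they stay distinguishable
    from non-terminals (NTYPE).
    """
    return {symbol: pid for symbol, pid in ANNOTATION.findall(grammar)}


if __name__ == "__main__":
    for symbol, pid in datcat_annotations(GRAMMAR).items():
        print(f"{symbol:14} {pid}")
```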

9 Multiple DCRs?
Actually we don't need multiple DCRs to have overlapping subsets
Overlaps arise for several reasons:
  – Data categories are typed, and might not have the type you need
    » The POS field (closed DC) of the lexical entry "walk" gets the value 'verb' (simple DC): PoS = 'verb'
    » The Verb feature (open DC) of a feature structure gets the value "walk": Verb = 'walk'
  – External sets are imported just as they are
    » NKJP, GOLD, STTS, …
    » Only some take the effort to also provide mappings
  – There might be very fine differences between your data category and an existing one, and the owner doesn't want to adapt
Still, we would like to know that these data categories are the same or almost the same!

10 Relation Registry - RELcat
http://lux13.mpi.nl/relcat/ (alpha version)
Stores user-specific sets of relations:
  – time coverage (isocat:DC-1502) relcat:subClassOf dc:coverage
  – language name (isocat:DC-2484) relcat:sameAs dc:language
  – language ID (isocat:DC-2482) relcat:sameAs dc:language
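A user-specific relation set of this kind is essentially a list of subject, relation type, object triples. The sketch below models the three relations from this slide in plain Python; the direction of each relation is read off the slide's diagram and is therefore an assumption, and `Relation` and `related_to` are hypothetical names, not the RELcat API.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Relations from the slide; directions are an assumed reading of the diagram.
RELATIONS = [
    Relation("isocat:DC-1502", "relcat:subClassOf", "dc:coverage"),  # time coverage
    Relation("isocat:DC-2484", "relcat:sameAs", "dc:language"),      # language name
    Relation("isocat:DC-2482", "relcat:sameAs", "dc:language"),      # language ID
]


def related_to(target, relations):
    """Return the sorted subjects of all relations that point at `target`."""
    return sorted(r.subject for r in relations if r.obj == target)


if __name__ == "__main__":
    print(related_to("dc:language", RELATIONS))
```

A search tool could use such a lookup to treat a query on `dc:language` as also covering resources described with either ISOcat data category.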

11 Relation types
There already exist large collections of relations with their own vocabularies, e.g., OWL (2), SKOS, ...
RELcat has a basic relation type hierarchy
  rel:related
    rel:sameAs
    rel:almostSameAs
    rel:broaderThan
      – rel:superClassOf
      – rel:hasPart
    rel:narrowerThan
      – rel:subClassOf
      – rel:partOf
which can be extended for other vocabularies
  rel:sameAs
    owl:sameAs
    skos:exactMatch
  rel:almostSameAs
    skos:closeMatch
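One way to picture this hierarchy is as a child-to-parent table, where vocabulary-specific types such as `skos:closeMatch` hang under the basic RELcat types. The following is a minimal sketch of that reading, assuming the nesting shown above; `PARENT`, `ancestors` and `is_a` are illustrative names and not part of RELcat.

```python
# Child -> parent table for the relation types listed on the slide.
PARENT = {
    "rel:sameAs": "rel:related",
    "rel:almostSameAs": "rel:related",
    "rel:broaderThan": "rel:related",
    "rel:narrowerThan": "rel:related",
    "rel:superClassOf": "rel:broaderThan",
    "rel:hasPart": "rel:broaderThan",
    "rel:subClassOf": "rel:narrowerThan",
    "rel:partOf": "rel:narrowerThan",
    # Extensions for other vocabularies
    "owl:sameAs": "rel:sameAs",
    "skos:exactMatch": "rel:sameAs",
    "skos:closeMatch": "rel:almostSameAs",
}


def ancestors(rel_type):
    """Return the chain of parent types, nearest first."""
    chain = []
    while rel_type in PARENT:
        rel_type = PARENT[rel_type]
        chain.append(rel_type)
    return chain


def is_a(rel_type, ancestor):
    """True if rel_type equals ancestor or specialises it."""
    return rel_type == ancestor or ancestor in ancestors(rel_type)


if __name__ == "__main__":
    print(ancestors("skos:exactMatch"))
    print(is_a("rel:partOf", "rel:narrowerThan"))
```

With such a table, a client that only understands `rel:sameAs` can still make use of relations stated as `owl:sameAs` or `skos:exactMatch`.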

12 RELcat usage
RELcat is still in an alpha phase
  – no user interface yet
  – relations are uploaded via the system administrator (isocat@mpi.nl)
  – however, there is a read-only API, which is used by (experimental) parts of the CLARIN infrastructure, e.g., the CMDI semantic mapping component

13 Another new kitten: SCHEMAcat
Resource schemata of any type should be stored somewhere persistently
  – Get a PID
These schemata are preferably annotated with data categories
  – SCHEMAcat → ISOcat
These data categories will then have (typed) relationships among each other
  – SCHEMAcat → RELcat
Status: very early alpha, but some schemata are already available
  – CGN: http://lux13.mpi.nl/schemacat/schema/CGN

14 A whole litter!
[Diagram labels: Data Category Registry - ISOcat; Relation Registry - RELcat; Schema Registry - SCHEMAcat; Concept Registry; Linguistic knowledge base; Linguistic resource (schema); Data categories; Containers; Concepts; Relation]

