CLARIN-NL Call 4 ISOcat follow-up, 2/10/2013 (www.isocat.org)



Topics
  Data Category types
  Bulk import
  Beyond ISOcat
    – CMDI
    – SCHEMAcat
    – RELcat

Data Category types
  complex:
    – open: writtenForm (string)
    – closed: grammaticalGender (string), with values neuter, masculine, feminine
    – constrained: (string)
  simple: neuter, masculine, feminine

Data Category types
  container: [diagram; labels: language, alphabet, writtenForm, japanese, ipa, lexicon, entry, lemma]

Which type?
Which type is appropriate depends on the place of the data category in the structure of your resource:
  1. Can it have a value? Complex Data Category with a data type
    – Any of the values of the data type?
      » Open Data Category
    – Can you enumerate the values?
      » Closed Data Category
        Fill its value domain with simple Data Categories
    – Is there a rule to constrain the values?
      » Constrained Data Category
        Express the rule/constraint in one of the rule languages
  2. Is it a value? Simple Data Category
  3. Does it group other (container or complex) Data Categories? Container Data Category
  If a Data Category both has a value and groups Data Categories
    – Complex Data Category
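The questions above can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not ISOcat code: the function name, the parameter names and the order in which the "enumerate" and "rule" questions are tested are assumptions made for illustration.

```python
from enum import Enum


class DCType(Enum):
    OPEN = "open"                # complex DC: any value of the data type
    CLOSED = "closed"            # complex DC: enumerated value domain
    CONSTRAINED = "constrained"  # complex DC: values restricted by a rule
    SIMPLE = "simple"            # a value in a closed value domain
    CONTAINER = "container"      # groups other container or complex DCs


def choose_dc_type(has_value=False, is_value=False, groups_dcs=False,
                   values_enumerable=False, has_constraint_rule=False):
    """Suggest a Data Category type from the slide's questions (hypothetical helper)."""
    if has_value:
        # Question 1. A DC that has a value is complex, even if it also
        # groups other DCs (last rule on the slide).
        if values_enumerable:
            return DCType.CLOSED
        if has_constraint_rule:
            return DCType.CONSTRAINED
        return DCType.OPEN
    if is_value:
        # Question 2.
        return DCType.SIMPLE
    if groups_dcs:
        # Question 3.
        return DCType.CONTAINER
    raise ValueError("the DC neither has a value, is a value, nor groups DCs")
```

For example, `choose_dc_type(has_value=True, values_enumerable=True)` suggests a closed DC for something like grammaticalGender, and `choose_dc_type(is_value=True)` suggests a simple DC for a value such as neuter.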

CMDI example
  CMD component relates to a container DC
  CMD element relates to a complex DC
  CMD value relates to a simple DC
  The ISOcat search in the CMD Component Editor enforces this
    – Also, a DC should be public and a member of the Metadata profile
  However, if you link to a DC, nothing of the specification is taken over into your profile

Some examples
Feature structure (category: noun phrase; agreement: person: third, number: singular)
  /category/ a closed DC
  /noun phrase/ a simple DC
  /agreement/ a container DC
  /number/ a closed DC
  /singular/ a simple DC
  /person/ a closed DC
  /third/ a simple DC
  (Encoded as TEI P5 FSR; the XML elements and attributes are seen as syntactic sugar)
Syntax tree [S [NP Text="John"] [VP [V Text="hit"] [NP [Det Text="the"] [N Text="ball"]]]]
  /S/ a container DC
  /NP/ an open DC
  /VP/ a container DC
  /V/ an open DC
  /NP/ a container DC
  /Det/ an open DC
  /N/ an open DC
  (Text is seen as syntactic sugar)

Bulk import: DCIF
  Create a valid DCIF XML document
    – In general by converting an existing digital resource
      » XSLT, Perl, …
    – DCIF Schema:
      Human readable:
    – DCIF Validation levels:
      » Structure: Relax NG validation
      » Referential integrity: Schematron validation
  Example: example.dcif
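To illustrate what a referential integrity check looks for, as opposed to the structural check, here is a minimal Python sketch over a simplified in-memory model. It is not the DCIF Schematron rule set: the dictionary layout, the field names `value_domain` and `is_a`, and the assumption that every reference must resolve inside the same file are all hypothetical.

```python
def find_dangling_references(dcs):
    """Return (referring PID, field, missing PID) for every unresolved reference.

    `dcs` maps a PID to a dict with two optional, hypothetical fields:
    'value_domain' (list of simple DC PIDs) and 'is_a' (PID of a super simple DC).
    """
    problems = []
    for pid, dc in dcs.items():
        # A closed DC lists the simple DCs in its conceptual domain.
        for ref in dc.get("value_domain", []):
            if ref not in dcs:
                problems.append((pid, "value_domain", ref))
        # A simple DC may point to one super simple DC.
        parent = dc.get("is_a")
        if parent is not None and parent not in dcs:
            problems.append((pid, "is_a", parent))
    return problems
```

An empty result means every reference in this simplified model points at a DC that is actually declared; a structurally valid document can still fail this kind of check.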

DCIF Validation Scenario in oXygen

What will be overwritten?
  PIDs
    – Just invent your own URI, e.g., my:DC-1
    – Use them to relate DCs:
      » Closed DC conceptual domain to simple DC
      » Simple DC is-a relation to another simple DC
    – Will be overwritten by ISOcat PIDs
      » Unless you have ISOcat acceptable PIDs
  Version -> 1:0
  Registration status -> private
  Creation date -> date of import
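The effect of these rules can be sketched in Python as follows. This only simulates the behaviour described on the slide and is not the ISOcat import code: the record layout, the function name, the key numbering and the PID prefix used to decide whether a PID is "ISOcat acceptable" are assumptions.

```python
from datetime import date

# Assumed pattern for an ISOcat PID; what counts as "acceptable" is not
# specified on the slide.
ISOCAT_PID_PREFIX = "http://www.isocat.org/datcat/DC-"


def simulate_import(dcs, first_key, import_date):
    """Return copies of `dcs` with the fields an import would overwrite."""
    # First pass: decide the new PID for every DC.
    pid_map = {}
    next_key = first_key
    for dc in dcs:
        old_pid = dc["pid"]
        if old_pid.startswith(ISOCAT_PID_PREFIX):
            pid_map[old_pid] = old_pid  # kept as-is
        else:
            pid_map[old_pid] = f"{ISOCAT_PID_PREFIX}{next_key}"
            next_key += 1
    # Second pass: rewrite PIDs and the references that use them.
    imported = []
    for dc in dcs:
        parent = dc.get("is_a")
        imported.append({
            **dc,
            "pid": pid_map[dc["pid"]],
            "value_domain": [pid_map.get(r, r) for r in dc.get("value_domain", [])],
            "is_a": pid_map.get(parent, parent),
            "version": "1:0",
            "registration_status": "private",
            "creation_date": import_date.isoformat(),
        })
    return imported
```

Because the temporary PIDs are only used to resolve relations inside the file, any consistent scheme such as my:DC-1, my:DC-2 works; the relations survive the renaming.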

Contact ISOcat sysadmin
  If you need:
    – Additional languages
    – Additional profiles
    – Additional constraint rule languages
  If you’re done:
    – Send DCIF file
    – Will be validated (again)
    – Test import cycles on the ISOcat test server
    – Actual import on isocat.org
  If you want to do bulk updates

Relationships
  In ISOcat
    – Between simple and closed DCs
    – Simple DCs can have an is-a relationship with one super simple DC
      » This can be used to hierarchically structure a value domain
    – In the definition, notes or examples you can embed a link (an untyped relationship) to another DC
      » “… noun (DC-1234) …”
        (in the future ISOcat will recognize the DC-nnnn pattern and make it clickable; it can’t recognize an arbitrary ‘word’ (“noun”) as a DC identifier, or an arbitrary number (1234) as a DC key, so use the DC-nnnn pattern!)
  In SCHEMAcat/CMDI Component Registry
    – A schema annotated with DC references also relates these DCs together
      » Depending on the schema type the type of this relationship is (un)known
  In RELcat
    – Typed relationships among DCs, but also with other Semantic Registries
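Why the DC-nnnn pattern matters can be shown with a short Python sketch. The regular expression below is an assumption about how such references could be recognized, not the pattern ISOcat itself uses.

```python
import re

# "DC-" followed by digits, delimited by word boundaries (assumed pattern).
DC_REFERENCE = re.compile(r"\bDC-(\d+)\b")


def find_dc_references(text):
    """Return every DC-nnnn reference found in a definition, note or example."""
    return [match.group(0) for match in DC_REFERENCE.finditer(text)]
```

With this pattern, the text "a noun (DC-1234) phrase" yields one reference, while "noun" or a bare "1234" yields none, which is the ambiguity the slide warns about.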

Beyond ISOcat: CMDI
  Tutorial in November 2013 (TBA)

Beyond ISOcat: SCHEMAcat
Annotate your resource schema with ISOcat DC PIDs
  1. Use what your schema language provides to link to an external semantic specification
     ODD:
  2. … in an XML-based schema language
     RNG:
  3. Embed annotation in a comment in another (text-based) schema language
     EBNF: MORFL …/DC-nnn *)
  4. Embed annotation in a description or note or …
     MDF: …/DC-nnn
  5. Contact

Beyond ISOcat: RELcat
Collect typed relationships between your new DCs and existing DCs in an Excel spreadsheet or CSV file with at least three columns
  1. Your ISOcat DC PID
  2. Typed relationship
    – sameAs: same semantics, just different types or an uncooperative DC owner
    – almostSameAs: minor, but for you important, differences
    – subClassOf: yours is more specific
    – superClassOf: yours is more general
    – hasPart/partOf: partitive relationships
  3. Related ISOcat DC PID (or a URL to an entry in another persistent concept/data category registry)
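Before sending such a file, a quick self-check of the three columns can catch simple mistakes. The Python sketch below assumes a CSV file without a header row and treats hasPart and partOf as two separate relationship names; the function name and the error messages are made up for illustration and are not part of RELcat.

```python
import csv
import io

# Relationship types listed on the slide.
RELATION_TYPES = {"sameAs", "almostSameAs", "subClassOf",
                  "superClassOf", "hasPart", "partOf"}


def check_relcat_rows(csv_text):
    """Return (row number, message) for every row that looks wrong."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) < 3:
            problems.append((row_number, "fewer than three columns"))
            continue
        subject, relation, target = (cell.strip() for cell in row[:3])
        if not subject or not target:
            problems.append((row_number, "empty PID column"))
        if relation not in RELATION_TYPES:
            problems.append((row_number, f"unknown relationship type: {relation!r}"))
    return problems
```

Extra columns beyond the third are ignored, which matches the "at least three columns" wording.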

ISOcat user interface
  Problematic:
    – Simple DC selector for a closed value domain
      » Too slow, especially when the closed DC is a member of the Private profile; if the profile is more specific, e.g., Metadata, the number of simple DCs loaded will be much smaller
      » Upcoming: replace full list by a search or selection from the basket or viewed DCs
    – Links between DCs
      » Upcoming: become clickable
      » Later: integration with RELcat for typed relationships