CMDI/ISOcat And Semantic Operability. Ineke Schuurman, ISOcat content coordinator, CLARIN-NL; Menzo Windhouwer, ISOcat system administrator. Utrecht, 4-3-2014.

Presentation transcript:

1 CMDI/ISOcat And Semantic Operability
Ineke Schuurman, ISOcat content coordinator, CLARIN-NL
Menzo Windhouwer, ISOcat system administrator
Utrecht

2 Uhhh?
“Give me a list with all forms of ‘wijf’ in 14th-century documents in Dutch by female authors, and the same for the 16th and 18th centuries. Contrast them with documents by male authors and by unknown authors. Present the results ordered per region and per genre.”
How to find data that could answer such a research question?

3 Metadata and machine
Not ‘just by hand’ ► machine
Subset selection ► metadata
Some problem(s):
– The question is not formulated in ‘Metadatish’
– What is clear for us is not clear for a machine
– What is meant by the concepts used (‘author’, ‘region’, ‘Dutch’)?
– Several ‘definitions’ / ‘encoding schemes’ are in use
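To make "subset selection ► metadata" concrete, here is a minimal Python sketch. It assumes every record already uses the same field names and value encodings (`language`, `century`, `author_gender`, `region`, `genre`), which is exactly what cannot be taken for granted in practice. The records and field names are invented toy data for illustration, not real CLARIN metadata.

```python
from collections import defaultdict

# Invented toy records; field names and values are assumptions for illustration.
RECORDS = [
    {"id": "doc1", "language": "Dutch", "century": 14, "author_gender": "female",
     "region": "Holland", "genre": "letter"},
    {"id": "doc2", "language": "Dutch", "century": 14, "author_gender": "male",
     "region": "Brabant", "genre": "chronicle"},
    {"id": "doc3", "language": "Dutch", "century": 16, "author_gender": "female",
     "region": "Holland", "genre": "letter"},
    {"id": "doc4", "language": "Dutch", "century": 14, "author_gender": "female",
     "region": "Brabant", "genre": "letter"},
]

def select(records, **criteria):
    """Keep the records whose metadata match every given field/value pair."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

def group_by_region_and_genre(records):
    """Order the selected record ids per (region, genre) pair."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["region"], record["genre"])].append(record["id"])
    return dict(groups)

subset = select(RECORDS, language="Dutch", century=14, author_gender="female")
grouped = group_by_region_and_genre(subset)
```

The selection only works because all four records agree on what `author_gender` or `region` means and how it is encoded; once records come from different schemes, that agreement has to be established explicitly.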

4 CLARIN
No single metadata scheme is favored
You may combine elements of several schemes
► “semantic interoperability” is to be ensured
– Is a ‘kopiist’ (copyist) an author?
– What defines a ‘genre’, a ‘region’?
► May differ in the various metadata schemes that come with documents!

5 Consequence
Within CLARIN, metadata concepts are to be defined
[Diagram labels: CMDI, ISOcat, other Concept Registries, related, RELcat]

6 CMD Infrastructure
Each CMD record contains some information to be used for interoperability:
– Metadata header information
  Author
  Metadata profile used
  …
Share profiles/components (structure and semantics)
Still, different profiles/components can also share semantics by sharing concepts
– Main focus of this presentation
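The idea that different profiles can share semantics by sharing concepts can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CMDI implementation: the two profiles, their element names, and the `concept:…` identifiers are hypothetical placeholders rather than real registry entries.

```python
from collections import defaultdict

# Hypothetical profiles: element name -> concept identifier.
# The "concept:..." strings are placeholders, not real ISOcat identifiers.
PROFILE_A = {"Author": "concept:author", "Region": "concept:region"}
PROFILE_B = {"Creator": "concept:author", "Genre": "concept:genre"}

def shared_concepts(profiles):
    """Return the concepts that are linked from more than one profile.

    The result maps each shared concept to the (profile, element) pairs
    that point at it, so differently named elements can be treated alike.
    """
    by_concept = defaultdict(list)
    for profile_name, elements in profiles.items():
        for element, concept in elements.items():
            by_concept[concept].append((profile_name, element))
    return {
        concept: uses
        for concept, uses in by_concept.items()
        if len({profile for profile, _ in uses}) > 1
    }

shared = shared_concepts({"A": PROFILE_A, "B": PROFILE_B})
```

Here `Author` in profile A and `Creator` in profile B differ in name and structure, yet a search over both can treat them as the same field because both link to the same concept.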

7 ISOcat
ISOcat: Data Category Registry
– Data Category Registry defining widely accepted data categories (DCs)
– Registry that stores DCs for language resources and their metadata, together with properties of the DCs (definition, administration, examples, etc.)

8 ISOcat and CLARIN(-NL)
ISOcat is used by CLARIN
– For defining metadata concepts in CMDI
  Focus of this tutorial
– For defining resource (content) concepts
  This has been the main focus of the ISOcat tutorial
– Ineke Schuurman is the CLARIN-NL ISOcat content coordinator
  Guidelines (do’s and don’ts) for (reusing) DC specifications
  Review and recommendation

9 A good example
NEHOL project
Alphabet (DC-4143)
– any set of characters representing the simple sounds used in a language or in speech generally
In principle good because:
- No language / project dependency
- No tautology
- Reusable (not too strict)

10 Some ‘rules’
Adopt an existing entry; if that is not possible, create a new entry
In all cases: the entries should be GOOD ones
But: what makes an entry a good one, one that you can (re)use?

11 Do’s
Create a DCS for your scheme (name project, annotation scheme, …)
– Share your DCS with the CLARIN-NL/VL ISOcat group
  Not a member yet? Contact Ineke Schuurman to get an invitation
Adopt DCs where possible (see don’ts)
– Check ‘adopted’ DCs regularly until standardization
  Use the Atom feed of your DCS
Provide a clear definition (short, to the point) for your scheme, application, …
– as general as possible, as specific as necessary
Take care not to leave concepts used in your definition undefined or vague
Use the appropriate profile (for CMDI: metadata)
Use the appropriate vocabulary (per profile)
Keep track of relationships to existing DCs

12 Don’ts
Be (too) language-specific in the definition
Mention the project/scheme in the definition
Use several definitions in one DC
Use circular definitions
Rely only on authority
Definition should fit YOUR purpose!

13 Athens Core
Use these DCs!
– We will take care of those definitions that are
  tautological
  too strict
  …
– When you spot DCs that are imperfect, let us know!

14 Obsolete/flagged DCs
Try to avoid linking with ‘deprecated’ or ‘superseded’ DCs
– Can be needed for legacy data
In other cases the flags show whether the DC specification is correct from a purely technical point of view
– Note that only DCs with a green marking are qualified for recommendation
– Reuse might trigger the owner into fixing the DC
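The two checks on this slide can be written down as a small Python sketch. The `DataCategory` class, its field names, and the status label "active" are assumptions made for illustration; only the ‘deprecated’ and ‘superseded’ statuses and the green marking come from the slide, and the sketch does not reflect how ISOcat itself stores these flags.

```python
from dataclasses import dataclass

# Statuses the slide says to avoid linking to.
AVOID_STATUSES = {"deprecated", "superseded"}

@dataclass
class DataCategory:
    identifier: str  # placeholder identifier, not a real ISOcat identifier
    status: str      # "deprecated" / "superseded" per the slide; "active" is an assumed label
    marking: str     # "green" per the slide; any other value is an assumed label

def may_link(dc: DataCategory, legacy_data: bool = False) -> bool:
    """Avoid deprecated or superseded DCs unless legacy data requires them."""
    if dc.status in AVOID_STATUSES:
        return legacy_data
    return True

def may_be_recommended(dc: DataCategory) -> bool:
    """A green marking is required for recommendation (necessary, perhaps not sufficient)."""
    return dc.marking == "green"

current = DataCategory("dc-example-1", status="active", marking="green")
old = DataCategory("dc-example-2", status="superseded", marking="green")
```

With these toy objects, `may_link(old)` is refused by default and only allowed when `legacy_data=True` is passed, which mirrors the "can be needed for legacy data" exception.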

15 CMDI and DC types
A CMD component should map to a container DC
A CMD element/attribute should map to a complex DC
A CMD value should map to a simple DC
The Component Registry enforces this mapping in the ISOcat search dialogue
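The mapping rule above amounts to a lookup table. The following Python sketch shows one way such a check could look; the function name and the string labels are hypothetical, and this is not the Component Registry's actual code.

```python
# Expected DC type for each kind of CMD construct, as stated on the slide.
EXPECTED_DC_TYPE = {
    "component": "container",
    "element": "complex",
    "attribute": "complex",
    "value": "simple",
}

def mapping_is_valid(cmd_kind: str, dc_type: str) -> bool:
    """Return True if a CMD construct of this kind may link to a DC of this type."""
    if cmd_kind not in EXPECTED_DC_TYPE:
        raise ValueError(f"unknown CMD construct: {cmd_kind!r}")
    return EXPECTED_DC_TYPE[cmd_kind] == dc_type
```

A search dialogue that applies this rule would only offer container DCs when the user is linking a component, complex DCs for elements and attributes, and simple DCs for values.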

16 CLARIN-NL
Thank you for your attention. Any questions?