CLARIN Metadata Infrastructure Component Metadata and intermediate solutions Daan Broeder Claus Zinn Dieter van Uytvanck - Max-Planck Institute for Psycholinguistics.

Presentation transcript:

CLARIN Metadata Infrastructure: Component Metadata and intermediate solutions
Daan Broeder, Claus Zinn, Dieter van Uytvanck - Max-Planck Institute for Psycholinguistics
CLARIN NL Info session

Content
- Component metadata
- Infrastructure
- Intermediate solutions
- CMD Toolkit
- Create CMD components now
- Virtual Language Observatory
- What can we do with metadata

Context
- Other metadata infrastructures in our domain: IMDI, OLAC/DC, TEI
- Problems:
  - Inflexible: too many (IMDI) or too few (OLAC) fields
  - Limited interoperability
  - Problematic (unfamiliar) terminology for some sub-communities
  - etc.

CLARIN Project - CMDI
- Metadata infrastructure based on a “Component Metadata Model”
- Aims:
  - Flexibility
    - Researchers should decide for themselves what metadata fits their needs
    - Offer ready-made metadata components
    - Allow creation of new metadata components where needed
  - Interoperability built in
  - Complete infrastructure: software for editing, harvesting, exploitation
  - Compatibility with existing frameworks: OLAC, IMDI

CMDI history
- Berlin WP2 workshop, Oct
- Oxford WP2 workshop, Feb
- Documents:
  - Metadata Infrastructure for Language Resources and Technology v3, Dec 2008
  - Metadata Infra Work Document, Feb 2009
  - Requirements for Virtual Collections, Mar 2009 (limited circulation)
- CMDI developers wiki
- Nijmegen Developers Workshop, May 2009

Metadata Components
Let's describe a sound recording with reusable components:
- Technical Metadata: Sample frequency, Format, Size, …
- Language: Name, Id, …
- Actor: Sex, Language, Age, Name, …
- Location: Continent, Country, Address, …
- Project: Name, Contact, …
Together, the selected components form a metadata profile (the metadata schema), which is then used to write a metadata description.
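
The build-up on these slides can be sketched in a few lines of Python. This is only an illustration of the idea of composing components into a profile, not the CMDI toolkit itself: the class names `Component` and `Profile`, the profile name `SoundRecording`, the CamelCase element names and the value "Dutch" are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """A reusable metadata component: a name plus its element names."""
    name: str
    elements: list[str]


@dataclass
class Profile:
    """A profile is an ordered selection of components (hypothetical model)."""
    name: str
    components: list[Component] = field(default_factory=list)

    def empty_description(self) -> dict[str, dict[str, None]]:
        """Return a blank metadata description with one slot per element."""
        return {c.name: {e: None for e in c.elements} for c in self.components}


# Components taken from the slides; element lists are shortened like the slides' "…".
technical = Component("TechnicalMetadata", ["SampleFrequency", "Format", "Size"])
language = Component("Language", ["Name", "Id"])
actor = Component("Actor", ["Name", "Sex", "Age", "Language"])
location = Component("Location", ["Continent", "Country", "Address"])
project = Component("Project", ["Name", "Contact"])

# The profile (schema) is just the chosen combination of components.
sound_recording = Profile(
    "SoundRecording", [technical, language, actor, location, project]
)

# A metadata description is an instance of the profile with values filled in.
description = sound_recording.empty_description()
description["Language"]["Name"] = "Dutch"  # illustrative value only
```

A different resource type would reuse the same components in another combination, which is the flexibility the component model is after.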

Metadata Components
Selecting metadata components from the registry
- Component registry:
  - Location: Country, Coordinates
  - Actor: BirthDate, MotherTongue
  - Text: Language, Title
  - Recording: CreationDate, Type
  - Dance: Name, Type
- ISOcat concept registry: BirthDate (dcr:1000), Country (dcr:1001), Language (dcr:1002)
- DCMI concept registry: Title (dc:title)
- The user selects appropriate components to create a metadata description.
- Semantic interoperability is partly solved via references to the ISOcat concept registry.
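
As a rough sketch of why the concept references help, the snippet below indexes metadata elements by the concept they point to. The identifiers `dcr:1000`, `dcr:1001`, `dcr:1002` and `dc:title` come from the slide; the pairing of each identifier with a specific component, the extra `Song.Name` entry and the function name `index_by_concept` are assumptions added for illustration.

```python
from collections import defaultdict

# Hypothetical registry snapshot: (component, element) -> concept identifier.
CONCEPT_LINKS = {
    ("Actor", "BirthDate"): "dcr:1000",
    ("Location", "Country"): "dcr:1001",
    ("Text", "Language"): "dcr:1002",
    ("Text", "Title"): "dc:title",
    # Invented entry: a differently named element that points to the same concept.
    ("Song", "Name"): "dc:title",
}


def index_by_concept(links):
    """Group element paths by the concept they reference."""
    index = defaultdict(list)
    for (component, element), concept in links.items():
        index[concept].append(f"{component}.{element}")
    return dict(index)


concept_index = index_by_concept(CONCEPT_LINKS)
# Elements with different names can now be treated as the same thing.
same_as_title = concept_index["dc:title"]
```

Here `Text.Title` and `Song.Name` end up under one concept even though their names differ, which is the sense in which interoperability is "partly solved" by the references.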

CLARIN MD Life-cycle
- Parts shown: CLARIN Component Registry; ISOcat Concept Registry, DCMI Concept Registry and other concept registries; Relation Registry; Semantic Mapping; Metadata Repositories; Joint Metadata Repository; Search Service
- Create a metadata schema from a selection of existing components. Allow creation of new components if they have references to ISOcat.
- A metadata component profile is selected from the metadata component registry.
- Metadata descriptions are created.
- Metadata is harvested via the OAI protocol.
- Search and browse the metadata catalog using the ISO DCR, other concept registries and the CLARIN relation registry.
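
The search step can be pictured as query expansion over a relation registry. The following is a minimal sketch under stated assumptions: the equivalence between `dcr:1002` and `dc:language`, the three sample records and the function names `expand` and `search` are all made up for the example and do not describe the actual CLARIN search service.

```python
def expand(concept, equivalences):
    """Return the concept plus everything reachable through equivalence pairs."""
    seen = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for a, b in equivalences:
            # Equivalence is treated as symmetric, so follow it in both directions.
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return seen


# Hypothetical relation-registry content.
EQUIVALENCES = [("dcr:1002", "dc:language")]

# Hypothetical harvested records, keyed by concept rather than by field name.
RECORDS = [
    {"id": "rec-1", "fields": {"dcr:1002": "Dutch"}},
    {"id": "rec-2", "fields": {"dc:language": "Dutch"}},
    {"id": "rec-3", "fields": {"dcr:1001": "Netherlands"}},
]


def search(records, concept, value, equivalences):
    """Find records whose value matches under the concept or any equivalent one."""
    concepts = expand(concept, equivalences)
    return [
        r["id"]
        for r in records
        if any(r["fields"].get(c) == value for c in concepts)
    ]


hits = search(RECORDS, "dcr:1002", "Dutch", EQUIVALENCES)
```

With the assumed equivalence in place, a query on the ISOcat language concept also finds the record that was described with a Dublin Core field.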

Current solution
What if you want to contribute metadata now?
- The CLARIN ad-hoc registry (800+ resources, 130+ tools)
- Provide IMDI or OLAC metadata
- Harvesting (metadata transport) via:
  - OAI protocol for OLAC records, or provide static records
  - XML harvesting for IMDI
- Harvested metadata will be shown in a special CLARIN catalog
  - using the standard MPI/LAT catalog software
  - and integrated in VLO specializations
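
For readers unfamiliar with OAI harvesting, here is a small sketch of the client side of an OAI-PMH `ListRecords` exchange for OLAC records. The verb, the `metadataPrefix` and `resumptionToken` arguments and the XML namespace are part of OAI-PMH 2.0; the base URL `https://example.org/oai`, the sample response and the helper names are placeholders, and no real CLARIN endpoint is implied.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="olac", resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH a resumptionToken is an exclusive argument, so it replaces
    metadataPrefix when fetching the next page.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response body, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai")
identifiers, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://example.org/oai", resumption_token=next_token)
```

A harvester would repeat the request with each returned token until the response carries no token, at which point the list is complete.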

Use & Create CMD components now
What if you are adventurous?
- The CLARIN metadata toolkit lets you start creating metadata components or use existing ones.
- We have an existing set of components derived from:
  - IMDI metadata for sessions
  - IMDI catalog metadata
- A small CLARIN NL project is planned to test and report on this
- But you can try it too!

THE END