CLARIN Metadata Infrastructure: Component Metadata and intermediate solutions. Daan Broeder, Claus Zinn, Dieter van Uytvanck (Max-Planck Institute for Psycholinguistics).

1 CLARIN Metadata Infrastructure: Component Metadata and intermediate solutions
Daan Broeder, Claus Zinn, Dieter van Uytvanck (Max-Planck Institute for Psycholinguistics)
CLARIN NL Info session, 1-7-2009

2 Content
- Component metadata
- Infrastructure
- Intermediate solutions
- CMD Toolkit
- Create CMD components now
- Virtual Language Observatory
- What can we do with metadata

3 Context
- Other metadata infrastructures in our domain: IMDI, OLAC/DC, TEI
- Problems:
  - Inflexible: too many (IMDI) or too few (OLAC) fields
  - Limited interoperability
  - Problematic (unfamiliar) terminology for some sub-communities
  - etc.

4 CLARIN Project - CMDI
- Metadata infrastructure based on a “Component Metadata Model”
- Aims:
  - Flexibility
  - Researchers should decide for themselves what metadata fits their needs
  - Offer ready-made metadata components
  - Allow creation of new metadata components where needed
  - Interoperability built in
  - Complete infrastructure: software for editing, harvesting, exploitation
  - Compatibility with existing frameworks: OLAC, IMDI

5 CMDI history
- Berlin WP2 workshop, Oct. 2008
- Oxford WP2 workshop, Feb. 2009
- Documents:
  - Metadata Infrastructure for Language Resources and Technology v3, Dec. 2008
  - Metadata Infra Work Document, Feb. 2009
  - Requirements for Virtual Collections, Mar. 2009 (limited circulation)
- CMDI developers wiki
- Nijmegen Developers Workshop, May 2009

6 Metadata Components: Let's describe a sound recording
- Technical Metadata: Sample frequency, Format, Size, …

7 Metadata Components: Let's describe a sound recording
- Technical Metadata
- Language: Name, Id, …

8 Metadata Components: Let's describe a sound recording
- Technical Metadata
- Language
- Actor: Sex, Language, Age, Name, …

9 Metadata Components: Let's describe a sound recording
- Technical Metadata
- Language
- Actor
- Location: Continent, Country, Address, …

10 Metadata Components: Let's describe a sound recording
- Technical Metadata
- Language
- Actor
- Location
- Project: Name, Contact, …

11 Metadata Components: Let's describe a sound recording
- Components: Language, Technical Metadata, Actor, Location, Project
- Diagram labels: Metadata schema, Metadata profile

12 Metadata Components: Let's describe a sound recording
- Components: Language, Technical Metadata, Actor, Location, Project
- Diagram labels: Metadata schema, Metadata description

13 Metadata Components: Selecting metadata components from the registry
- Component registry: Location (Country, Coordinates); Actor (BirthDate, MotherTongue); Text (Language, Title); Recording (CreationDate, Type)
- User: Dance (Name, Type)
- ISOcat concept registry: BirthDate dcr:1000, Country dcr:1001, Language dcr:1002
- DCMI concept registry: Title dc:title
- User selects appropriate components to create a metadata description
- Semantic interoperability partly solved via references to ISOcat concept registry
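
As a rough sketch of why concept references help, the snippet below treats two elements as interoperable when they point to the same concept identifier. The identifiers `dcr:1000`, `dcr:1001`, `dcr:1002` and `dc:title` are the ones shown on the slide; the `Speaker/DateOfBirth` entry, the table `CONCEPT_LINKS`, and the functions `same_concept` and `elements_for` are hypothetical and only serve the illustration.

```python
# Element -> concept identifier, as (component, element) pairs.
CONCEPT_LINKS = {
    ("Actor", "BirthDate"): "dcr:1000",
    ("Location", "Country"): "dcr:1001",
    ("Text", "Language"): "dcr:1002",
    ("Text", "Title"): "dc:title",
    # Hypothetical component from another sub-community with its own terminology.
    ("Speaker", "DateOfBirth"): "dcr:1000",
}


def same_concept(a, b, links=CONCEPT_LINKS):
    """True if both elements are linked to the same concept identifier."""
    concept_a = links.get(a)
    return concept_a is not None and concept_a == links.get(b)


def elements_for(concept, links=CONCEPT_LINKS):
    """List all (component, element) pairs that reference the given concept."""
    return sorted(key for key, value in links.items() if value == concept)
```

Elements without a concept link, such as `Location/Coordinates` in this table, cannot be matched this way, which is one reading of "partly solved" on the slide.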

14 CLARIN MD Life-cycle
- Diagram labels: Search Service; Joint Metadata Repository; Metadata Repository (twice); Relation Registry; ISOcat Concept Registry; DCMI Concept Registry; other Concept Registry; CLARIN Component Registry; Semantic Mapping
- Create metadata schema from selection of existing components. Allow creation of new components if they have references to ISOcat
- Perform search/browsing on the metadata catalog using the ISO DCR and other concept registries and CLARIN relation registry
- Metadata component profile was selected from metadata component registry
- Metadata harvesting by OAI protocol
- Metadata descriptions created
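
The search step can be illustrated with a small sketch. It assumes that a relation registry stores pairs of equivalent concept identifiers from different concept registries and that a search expands the queried concept with its equivalents before matching records. The pairing of `dcr:1002` with `dc:language`, the record layout, and the functions `equivalent_concepts` and `search` are assumptions made for this example, not the CLARIN implementation.

```python
# Assumed example of a relation between an ISOcat and a DCMI concept.
RELATIONS = [("dcr:1002", "dc:language")]


def equivalent_concepts(concept, relations):
    """Return the concept plus everything reachable via equivalence relations."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for left, right in relations:
            # Relations are symmetric, so follow them in both directions.
            for source, target in ((left, right), (right, left)):
                if source == current and target not in found:
                    found.add(target)
                    frontier.append(target)
    return found


def search(records, concept, value, relations):
    """Return ids of records holding `value` under `concept` or an equivalent."""
    concepts = equivalent_concepts(concept, relations)
    return [
        record["id"]
        for record in records
        if any(c in concepts and v == value for c, v in record["fields"])
    ]


# Harvested records from different repositories (invented data).
records = [
    {"id": "rec-1", "fields": [("dcr:1002", "Dutch")]},
    {"id": "rec-2", "fields": [("dc:language", "Dutch")]},
    {"id": "rec-3", "fields": [("dc:language", "German")]},
]
```

With this data, `search(records, "dcr:1002", "Dutch", RELATIONS)` finds both `rec-1` and `rec-2`, although the two records reference different concept registries.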

15 Current solution: What if you want to contribute metadata now?
- The CLARIN ad-hoc registry (800+ resources, 130+ tools)
- Provide IMDI or OLAC metadata
- Harvesting (metadata transport) via:
  - OAI protocol for OLAC records, or provide static records
  - XML harvesting for IMDI
- Harvested metadata will be shown in a special CLARIN catalog
  - Using the standard MPI/LAT catalog software
  - and integrated in VLO specializations
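
For the OAI route, a harvester issues HTTP requests of the following shape. This is a minimal sketch that only builds the request URL and makes no network call. `ListRecords`, `metadataPrefix` and `resumptionToken` are part of OAI-PMH; the endpoint `https://example.org/oai` is a placeholder, the default prefix `olac` is assumed to be what an OLAC data provider exposes, and the function name `list_records_url` is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="olac", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    When resuming a partial list, OAI-PMH expects the resumption token as
    the only argument besides the verb, so the prefix is left out.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# First request against a placeholder endpoint.
first_page = list_records_url("https://example.org/oai")
```

A harvester would fetch `first_page`, read the resumption token from the response if one is present, and call `list_records_url` again with that token until the list is complete.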

16 Use & Create CMD components now: What if you are adventurous?
- The CLARIN metadata toolkit lets you start creating metadata components or use existing ones
- We have an existing set of components derived from:
  - IMDI metadata for sessions
  - IMDI catalog metadata
- A small CLARIN NL project is planned to test and report on this
- But you can try it too!

17 THE END

