Marc Kemps-Snijders Menzo Windhouwer Sue Ellen Wright


ISOcat Data Category Registry: Defining widely accepted linguistic concepts
Marc Kemps-Snijders, Menzo Windhouwer, Sue Ellen Wright
CLARIN-NL Info dag, 1 July 2009

ISOcat: a reference implementation
ISO 12620:2009, Terminology and other language and content resources: Specification of data categories and management of a Data Category Registry for language resources
ISO 12620:1999 was a fixed list of data categories; this revision provides a data model and management procedures.
ISO Technical Committee 37: Terminology and other language and content resources

ISO 24613:2008 Lexical Markup Framework
[UML diagram. Classes: Lexicon, Lexical Entry (1..* per Lexicon), Form, Lemma, Word Form, Sense. Data categories shown on the classes: partOfSpeech, writtenForm, grammaticalNumber, lexicalType.]

Data categories
- Data category: “result of the specification of a given data field” (ISO 12620:2009)
- Data element concept (ISO 11179): “concept for which the definition, identification and conceptual domain are specified independently of any particular representation”
- Complex data categories are data element concepts

Data category types
Complex data categories:
- open: writtenForm (string)
- closed: grammaticalGender (string), with the values neuter, masculine, feminine
- constrained: email (string), Constraint: .+@.+
Simple data categories:
- the values of a closed data category, here neuter, masculine and feminine
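The three kinds of complex data category on this slide can be sketched in a few lines of Python. This is only an illustration of the open / closed / constrained distinction, not ISOcat's actual data model: the class name `ComplexDataCategory` and its fields are assumptions made for the example, while the three sample categories and the constraint `.+@.+` come from the slide.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ComplexDataCategory:
    """Hypothetical model of a complex data category (not the ISOcat schema)."""
    identifier: str
    data_type: str = "string"
    values: Optional[FrozenSet[str]] = None  # set only for closed categories
    constraint: Optional[str] = None         # regex, set only for constrained ones

    @property
    def kind(self) -> str:
        if self.values is not None:
            return "closed"
        if self.constraint is not None:
            return "constrained"
        return "open"

    def accepts(self, value: str) -> bool:
        """Check a value against the conceptual domain of this category."""
        if self.values is not None:
            return value in self.values
        if self.constraint is not None:
            return re.fullmatch(self.constraint, value) is not None
        return True  # open: any string is allowed


# The three examples from the slide.
written_form = ComplexDataCategory("writtenForm")
grammatical_gender = ComplexDataCategory(
    "grammaticalGender",
    values=frozenset({"neuter", "masculine", "feminine"}),  # simple data categories
)
email = ComplexDataCategory("email", constraint=r".+@.+")
```

With these definitions, `grammatical_gender.accepts("neuter")` holds while `grammatical_gender.accepts("common")` does not, and `email.accepts(...)` simply applies the regular expression to the whole string.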

Data category specification
Parts of a specification:
- Administration Information Section
- Description Section
- Data Element Name
- Language Section
- Name Section
- Conceptual Domain
- Linguistic Section
Mandatory:
- A mnemonic identifier
- An English definition
- An English name
- A conceptual domain
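A draft specification can be checked against the four mandatory parts before it is submitted. The sketch below assumes the draft is held as a plain dictionary; the key names (`identifier`, `english_definition`, `english_name`, `conceptual_domain`) are made up for this example and are not the element names used by ISOcat or DCIF.

```python
# Hypothetical keys standing in for the four mandatory parts listed on the slide.
MANDATORY_PARTS = (
    "identifier",          # a mnemonic identifier
    "english_definition",  # an English definition
    "english_name",        # an English name
    "conceptual_domain",   # a conceptual domain
)


def missing_mandatory_parts(spec: dict) -> list:
    """Return the mandatory parts that are absent or empty in a draft specification."""
    return [part for part in MANDATORY_PARTS if not spec.get(part)]


# An incomplete draft: the English definition has not been written yet.
draft = {
    "identifier": "grammaticalGender",
    "english_name": "grammatical gender",
    "conceptual_domain": ["neuter", "masculine", "feminine"],
}
```

Here `missing_mandatory_parts(draft)` reports that only the English definition is still missing.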

Data Category Selections
Anyone:
- can register with ISOcat
- can create data categories
- can create data category selections (DCSs)
- can share DCSs
- can make DCSs public
- can submit DCSs for standardization

ISO standardization process
[Workflow diagram. Labels, in order: Submission group, Thematic Domain Group, Evaluation, Data Category Registry Board, Validation, Stewardship group, ISO Publication.]

Using data categories
Each data category has a Persistent Identifier (PID):
http://www.isocat.org/datcat/DC-1297
This PID can be embedded in the schemata of linguistic resources:
<rng:element name="gender" dcr:datcat="…/DC-1297">
The full data category specification can be downloaded from ISOcat in the Data Category Interchange Format (DCIF).
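Once PIDs are embedded in a schema, a tool can read them back to find out which data category each element refers to. The following sketch does this for a small Relax NG fragment with Python's standard `xml.etree.ElementTree`. The Relax NG namespace is the standard one; the namespace URI bound to the `dcr` prefix and the surrounding `grammar`/`start` wrapper are assumptions made for the example, and the PID is the one shown on the slide.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
DCR_NS = "http://www.isocat.org/ns/dcr"  # assumed namespace URI for the dcr prefix

# A minimal schema fragment built around the element from the slide.
SCHEMA = f"""<rng:grammar xmlns:rng="{RNG_NS}" xmlns:dcr="{DCR_NS}">
  <rng:start>
    <rng:element name="gender" dcr:datcat="http://www.isocat.org/datcat/DC-1297">
      <rng:text/>
    </rng:element>
  </rng:start>
</rng:grammar>"""


def datcat_references(schema_xml: str) -> dict:
    """Map each rng:element name to the data category PID it is annotated with."""
    root = ET.fromstring(schema_xml)
    references = {}
    for element in root.iter(f"{{{RNG_NS}}}element"):
        pid = element.get(f"{{{DCR_NS}}}datcat")
        if pid is not None:
            references[element.get("name")] = pid
    return references
```

Calling `datcat_references(SCHEMA)` yields a dictionary that maps `gender` to its PID; elements without a `dcr:datcat` attribute are skipped.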

ISOcat demonstration
http://www.isocat.org/

Status of ISOcat
ISOcat is under active development.
Now:
- You can access public data categories and selections
- You can create your own data categories and selections
Future:
- Group features
- Cleanup by the Thematic Domain Groups (TDGs)
- Standardization workflow

Relation Registry
ISOcat contains a flat list of concepts.
The Relation Registry will support storing (user-specific) relations between these concepts:
- is-a
- part-of
- equivalent-to
- related-to
- …
Will support:
- Ontologies and taxonomies on top of data categories
- Searches across related data categories
- …
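How relations on top of a flat list enable searches across related data categories can be sketched as a small graph structure. This is not the design of the Relation Registry, which the slide only announces; the class, its methods and the choice to treat equivalent-to and related-to as symmetric are assumptions for illustration. The identifiers `tagsetA:gender` and `tagsetB:genus` are hypothetical, and `DC-1297` is the gender data category from the earlier slide.

```python
from collections import defaultdict

# Relation types named on the slide (the slide's list is open-ended).
RELATION_TYPES = {"is-a", "part-of", "equivalent-to", "related-to"}
SYMMETRIC = {"equivalent-to", "related-to"}  # assumption made for this sketch


class RelationRegistry:
    """Hypothetical store of typed relations between data category identifiers."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        if relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation}")
        self._edges[subject].add((relation, obj))
        if relation in SYMMETRIC:
            self._edges[obj].add((relation, subject))

    def related(self, start, relations=None):
        """Return every identifier reachable from start via the given relation types."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for relation, target in self._edges.get(node, ()):
                if relations is not None and relation not in relations:
                    continue
                if target != start and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


registry = RelationRegistry()
registry.add("tagsetA:gender", "equivalent-to", "DC-1297")
registry.add("tagsetB:genus", "equivalent-to", "DC-1297")
```

A search starting from `tagsetA:gender` and following only equivalent-to links then also reaches `tagsetB:genus`, although the two were never linked directly.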

Thanks for your attention!
http://www.isocat.org/
Menzo.Windhouwer@mpi.nl