www.isocat.org ISOcat: known issues. 10 May 2011, CLARIN-NL ISOcat workshop.

Presentation transcript:

Known issues
ISOcat: ongoing effort
As will be clear from the last session by Menzo, there are still a number of 'loose ends':
- RELcat
- Searching
- Mapping
- Definitions

RELcat
Linking DCs is not just a 'nice' feature:
- Proper noun
- Common noun
- Mass noun
- Count noun
are all instances of 'noun' (i.e. they have an IsA relation with it).

RELcat
Essential for several Dutch tagsets: N(soort, ….) comes with 2 DCs:
1. Noun
2. Common
How do we relate this pair to one of the DCs for 'common noun', even if we find that DC's definition perfect?
Good news: in progress!
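The two RELcat slides can be made concrete with a minimal Python sketch. It is not RELcat's data model or API: the `IS_A` and `COMBINATIONS` tables, the functions `is_a` and `resolve`, and the use of plain names instead of real DC identifiers are all assumptions made for illustration.

```python
# Hypothetical illustration only: not RELcat's data model or API.
# DCs are represented by plain names here for readability.

# IsA links: each specific DC points to the more general DC it is an instance of.
IS_A = {
    "proper noun": "noun",
    "common noun": "noun",
    "mass noun": "noun",
    "count noun": "noun",
}

def is_a(dc, ancestor):
    """Return True if dc equals ancestor or reaches it by following IsA links."""
    seen = set()
    while dc is not None and dc not in seen:
        if dc == ancestor:
            return True
        seen.add(dc)
        dc = IS_A.get(dc)
    return False

# A tag such as N(soort, ...) is described by two DCs: 'noun' and 'common'.
# This invented table relates that pair to one single DC.
COMBINATIONS = {
    frozenset({"noun", "common"}): "common noun",
}

def resolve(tag_dcs):
    """Return the single DC equivalent to the given set of DCs, or None if unknown."""
    return COMBINATIONS.get(frozenset(tag_dcs))
```

With these tables, `is_a("mass noun", "noun")` holds, and `resolve(["noun", "common"])` yields the single DC 'common noun'. The point of the sketch is only that such links have to be stored somewhere outside the individual DC definitions.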

Searching
- How to detect which DCs are standardized? Or have a Dutch language section?
- How to search using the keys? And what about the language of keywords?
- How to detect which DCs 'belong together' (unless one mentions the tag set in the definition)?

Searching
- How to search for alternative names (Data Element Names): transitief/vergankelijk; prepositie/voorzetsel
- And the results: when not using 'exact' match and a specific field, MANY results come up, apparently unordered, while using exact match + field may make you miss relevant entries.
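The trade-off between the two search modes can be sketched in a few lines of Python. This is a toy model, not the ISOcat search implementation: the records, the field names (`name`, `data_element_names`, `definition`) and the functions `search_exact` and `search_loose` are invented for illustration, and the ranking is just one possible way to order loose results.

```python
# Toy records; field names and contents are invented for illustration.
DCS = [
    {"name": "transitive",
     "data_element_names": ["transitief", "vergankelijk"],
     "definition": "a verb taking a direct object"},
    {"name": "intransitive",
     "data_element_names": ["intransitief", "onvergankelijk"],
     "definition": "a verb not taking a direct object"},
    {"name": "preposition",
     "data_element_names": ["prepositie", "voorzetsel"],
     "definition": "an adposition placed before its complement"},
]

def values(dc, field):
    """Return the values of a field as a list, whether stored as str or list."""
    v = dc[field]
    return v if isinstance(v, list) else [v]

def search_exact(query, field):
    """Exact match on one field: precise, but misses hits stored in other fields."""
    return [dc["name"] for dc in DCS if query in values(dc, field)]

def search_loose(query):
    """Substring match on all fields, with exact hits ranked before partial ones."""
    hits = []
    for dc in DCS:
        texts = [t for field in dc for t in values(dc, field)]
        if any(query in t for t in texts):
            rank = 0 if query in texts else 1  # 0 = exact hit, 1 = partial hit
            hits.append((rank, dc["name"]))
    return [name for rank, name in sorted(hits)]
```

Here `search_exact("vergankelijk", "name")` finds nothing, because the Dutch term is stored as a data element name, while `search_loose("vergankelijk")` also returns 'intransitive' through the substring match on "onvergankelijk". Ranking exact hits first keeps the longer result list usable.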

Consequences of mapping
- Suppose you map to a specific DC, and some essential changes are made to that DC
  - You may no longer want to map, but how do you know?
- Suppose there are several relevant DCs, you select one, and just that one doesn't get standardized
  - You have to redo your work (but you first have to be aware that …)
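One way a project could notice such changes on its own side is to store a fingerprint of the DC at the moment of mapping and compare it later. The Python sketch below assumes a local copy of each DC as a dictionary with `id`, `definition`, `profile` and `status` fields; these field names and the functions `fingerprint`, `record_mapping` and `stale_mappings` are hypothetical and not part of ISOcat.

```python
import hashlib

def fingerprint(dc):
    """Hash the parts of a DC that a mapping depends on (assumed fields)."""
    text = "|".join([dc["definition"], dc["profile"], dc["status"]])
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def record_mapping(tag, dc):
    """Store which DC a tag was mapped to, plus the DC's state at mapping time."""
    return {"tag": tag, "dc_id": dc["id"], "fingerprint": fingerprint(dc)}

def stale_mappings(mappings, current_dcs):
    """Return the tags whose mapped DC has changed or disappeared since mapping."""
    stale = []
    for m in mappings:
        dc = current_dcs.get(m["dc_id"])
        if dc is None or fingerprint(dc) != m["fingerprint"]:
            stale.append(m["tag"])
    return stale
```

Running `stale_mappings` against a fresh copy of the DCs lists the tags that need a second look. It does not say whether a change is essential; that judgement stays with the person who made the mapping.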

Ill-defined DCs
- Profile: morphosyntax
  - Definition: semantic
- 'Concept' in definition not defined in ISOcat, or
- That concept comes with several DCs (which one was meant?)

Too many DCs
There are too many 'almost the same' DCs, even within the same profile.

Too vague DCs
There are many DCs with rather 'empty' definitions:
- Proper noun: a noun or adjective denoting a single object
- Common noun: a noun or adjective denoting a class of objects

Too language-specific DCs
Quite a number of DCs are too language-specific, mostly Polish ones, which makes it difficult to map to them; i.e. stuff that belongs in the Polish language section is in the general, English one.

Therefore, while solutions will come up for some technical issues, YOU should also be very careful yourself, especially with respect to the 'soundness' of the DCs, in particular as far as definitions, profile, and translation are concerned! Only then can ISOcat become a success story!

Follow-up
Contact
- Menzo for technical problems
- Ineke for content problems
Next workshop
- September or October
- Before 15 August, share a first substantial selection for your project with the CLARIN-NL group, and a spreadsheet for RELcat

Thanks!