CLARIN? ISOCAT! Ineke Schuurman, ISOcat content coordinator, CLARIN-NL. Amsterdam, 30-08-2012.

Presentation transcript:

1 CLARIN? ISOCAT! Ineke Schuurman, ISOcat content coordinator, CLARIN-NL, Amsterdam

2 Overview
ISOcat
– general
– use in CLARIN
An example
Your task with respect to ISOcat

3 ISOcat
ISOcat: Data Category Registry
– Data Category Registry defining widely accepted data categories (DCs)
– Registry that stores DCs for language resources and their metadata, together with properties of the DCs (definition, administration, examples, etc.)

4 Use in CLARIN
What does DC X mean in resource A?
– There may be several (valid) definitions!
Does X have the same meaning in resources A and B?
In CLARIN, this is needed first and foremost for tools (so that they ‘know’ what the elements in resources mean)
– Especially important for search in data and metadata
– But also for other tools that apply to data (cf. the last talk, on TTNWW)
Human use is only secondary, but humans must, after all, fill the ISOcat registry and make the right mappings

5 An example with ‘ev’
Have a look at these two tags:
– WW(pv,tgw,ev)
– N(soort,ev,dim,onz,stan)
All parts of such tags, like ev, are to be included in ISOcat. The full tags are to be included as well.
ev, enkelvoud, sg, sing, singular, singulier, …
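To make "all parts of such tags" concrete, here is a minimal Python sketch that splits the two example tags into their parts and lists everything that would need an ISOcat entry. The helper names (`split_tag`, `registry_candidates`) are hypothetical, and treating the head (WW, N) as one of the "parts" is an assumption; the slides only name ev explicitly.

```python
import re

# Assumed tag shape: HEAD(feature1,feature2,...), as in the two slide examples.
TAG_PATTERN = re.compile(r"^(?P<head>[^()]+)\((?P<features>[^()]*)\)$")


def split_tag(tag: str) -> tuple[str, list[str]]:
    """Split a tag such as 'WW(pv,tgw,ev)' into its head and its features."""
    match = TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"not a tag of the form HEAD(f1,f2,...): {tag!r}")
    features = [f.strip() for f in match.group("features").split(",") if f.strip()]
    return match.group("head"), features


def registry_candidates(tags: list[str]) -> list[str]:
    """Collect each full tag plus each of its parts, once, in order of appearance."""
    seen: dict[str, None] = {}
    for tag in tags:
        head, features = split_tag(tag)
        for item in (tag, head, *features):
            seen.setdefault(item, None)
    return list(seen)


candidates = registry_candidates(["WW(pv,tgw,ev)", "N(soort,ev,dim,onz,stan)"])
for item in candidates:
    print(item)
```

Note that ev occurs in both tags but is listed only once: it is one data category, however many tags use it.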

6 singular
All these representations can be mapped onto one DC:
singular (DC-4918): word form indicating that one entity is involved
In full: [ISOcat entry for DC-4918; image not included in transcript]

7 Other cats
ISOcat: defining DCs (ongoing)
RELcat: relating DCs (started)
SCHEMAcat: a registry of schemas, a schema being a description of the structure of your data format (just started)

8 Call 4 projects
Each Call 4 project must
– check, for each DC used in its resource or metadata, whether a corresponding DC exists in ISOcat; if not, extend ISOcat with such a DC, with all its properties (definitions, examples, etc.)
– create a schema with a mapping that maps each DC used in the resources and metadata to an ISOcat DC
All this will be explained in tutorials
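The check described above might look roughly like the following Python sketch. It is an assumption-laden illustration, not a CLARIN tool: the function name `check_coverage` and the dictionary-based mapping are hypothetical, and only the ev entry (DC-4918) is taken from the slides. The other labels are deliberately left unmapped rather than given invented identifiers.

```python
def check_coverage(
    labels_used: list[str], schema_mapping: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Split the labels used in a resource into mapped and still-missing ones."""
    mapped: dict[str, str] = {}
    missing: list[str] = []
    for label in labels_used:
        dc = schema_mapping.get(label)
        if dc is None:
            # No ISOcat DC known yet: look it up in ISOcat or add a new DC.
            missing.append(label)
        else:
            mapped[label] = dc
    return mapped, missing


# Only this entry comes from the slides; the mapping is otherwise empty on purpose.
schema_mapping = {"ev": "DC-4918"}

mapped, missing = check_coverage(["pv", "tgw", "ev"], schema_mapping)
print("mapped: ", mapped)
print("missing:", missing)
```

The missing list is the project's to-do list: each entry either turns out to exist in ISOcat already, or has to be added with its definition and examples before the schema mapping is complete.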

9 Call 4 projects
Do NOT underestimate this ISOcat task!
Good news: DCs used in some common formats are already included in ISOcat
– CGN / D-Coi tagset
– TEI header elements
– Many DCs concerning metadata
Contact a CLARIN centre as soon as possible for help with this, OR contact the helpdesk

10 CLARIN-NL Thank you for your attention. Any questions?

12 [Diagram labels: XML-format CGN CGN-format VU-DNC FoLiA-format; is … is]

