ISO 16642 TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria.


1 ISO 16642 TMF - Terminological Markup Framework Laurent Romary - Laboratoire Loria

2 Background - ISO etc.; The need for abstraction; Structure and content of terminological data - picture virtual-actual; The meta-model (structural skeleton); Describing data categories; Styles and vocabularies; XTMF as a mapping tool - examples; Further work: extending the model to a wider scope (language engineering)

3 Overview

4 General principles
• Expressing constraints on the representation of computerized terminologies
– What is the underlying structure of computerized terminologies?
– Which data category is used, and under which conditions?
• Maintaining interoperability between representations
– Providing a conceptual tool to compare two given formats

5 Definitions
• TMF: Terminological Markup Framework
– Definition of the underlying structures and mechanisms needed for the computer representation of terminological data
– Independence with regard to any specific format
• TML: Terminological Markup Language
– One specific representation format generated within TMF
– E.g.: DXLT is a possible TML

6 A family of formats
[Diagram: TMF as the parent of a family of TMLs (TML 1, TML 2, TML 3, …), with DXLT and Geneter given as examples]

7 Meta-model Representing the underlying structure of terminological data

8 [Diagram of the meta-model. Classes: Terminological Data Collection, Global Information, Complementary Information, Terminological Entry, Terminology-related Information, Language Section, Term Section, Term Component Section. The links between them carry multiplicities (*, 1, 0:1).]

9 The structural skeleton
[Diagram: a tree of levels, four of its links marked *]
– Terminological Data Collection (TDC)
– Global Information (GI)
– Complementary Information (CI)
– Terminological Entry (TE)
– Language Section (LS)
– Term Level (TL)
– Term Component Level (TCL)
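The skeleton above can be pictured as nested containers. The following is a minimal sketch, not part of the standard: the class and field names are illustrative assumptions, and the data categories attached to each level are held in a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Illustrative classes only: names and fields are assumptions, not ISO 16642 text.
@dataclass
class TermComponentSection:  # TCS, called "Term Component Level" on the slide
    info: Dict[str, str] = field(default_factory=dict)


@dataclass
class TermSection:  # TS, called "Term Level" on the slide
    info: Dict[str, str] = field(default_factory=dict)
    components: List[TermComponentSection] = field(default_factory=list)


@dataclass
class LanguageSection:  # LS
    info: Dict[str, str] = field(default_factory=dict)
    terms: List[TermSection] = field(default_factory=list)


@dataclass
class TerminologicalEntry:  # TE
    info: Dict[str, str] = field(default_factory=dict)
    languages: List[LanguageSection] = field(default_factory=list)


@dataclass
class TerminologicalDataCollection:  # TDC
    global_info: Dict[str, str] = field(default_factory=dict)         # GI
    complementary_info: Dict[str, str] = field(default_factory=dict)  # CI
    entries: List[TerminologicalEntry] = field(default_factory=list)


# A small entry with one language section and one term section.
entry = TerminologicalEntry(
    info={"id": "ID67", "subjectField": "manufacturing"},
    languages=[
        LanguageSection(
            info={"lang": "en"},
            terms=[TermSection(info={"term": "alpha smoothing factor",
                                     "termType": "fullForm"})],
        )
    ],
)
tdc = TerminologicalDataCollection(entries=[entry])
```

Each level only knows about the level directly below it, which is what lets different formats share the same skeleton while differing in how they write the data categories.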

10 How does this work? Walking through an example…

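11 DXLT example
[XML markup not preserved; content in document order]
– subject field: manufacturing
– definition: A value between 0 and 1 used in...
– term (en): alpha smoothing factor, term type fullForm
– term (hu): Alfa...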

12 Identifying the structural skeleton
– TE (Terminological Entry)
» id=‘ID67’ [attribute]
» subjectField=‘manufacturing’ [typedElement]
» definition=‘A value…’ [typedElement]
– LS (Language Section)
» lang=‘hu’ [attribute]
» lang=‘en’ [attribute]
– TS (Term Section)
» term=‘…’ [element]
» term=‘alpha smoothing factor’ [element]
» termType=‘fullForm’ [typedElement]

13 TMF information model
– TE: id=‘ID67’, subjectField=‘manufacturing’, definition=‘A value…’
– LS: lang=‘hu’
» TS: term=‘…’
– LS: lang=‘en’
» TS: term=‘alpha smoothing factor’, termType=‘fullForm’

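14 GMT representation
[XML markup not preserved; content in document order]
– id: ID67
– subjectField: manufacturing
– definition: A value between 0 and 1 used in...
– lang: en
» term: alpha smoothing factor
» termType: fullForm
– lang: hu
» term: Alfa...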
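Since the markup of this slide did not survive, here is a hedged sketch of how the same entry could be assembled in GMT form with Python's standard `xml.etree.ElementTree`. The `struct` element and its `type` attribute are described later in the deck. Writing each data category as a `feat` element whose `type` attribute names the category is an assumption made for illustration, as are the helper names `struct` and `feat`.

```python
import xml.etree.ElementTree as ET


def struct(level, parent=None):
    """Create a <struct type=LEVEL> element, optionally nested in a parent."""
    if parent is None:
        return ET.Element("struct", {"type": level})
    return ET.SubElement(parent, "struct", {"type": level})


def feat(parent, datcat, value):
    """Attach a data category to a level (assumed form: <feat type=...>value</feat>)."""
    el = ET.SubElement(parent, "feat", {"type": datcat})
    el.text = value
    return el


te = struct("TE")
feat(te, "id", "ID67")
feat(te, "subjectField", "manufacturing")
feat(te, "definition", "A value between 0 and 1 used in...")

ls_en = struct("LS", te)
feat(ls_en, "lang", "en")
ts_en = struct("TS", ls_en)
feat(ts_en, "term", "alpha smoothing factor")
feat(ts_en, "termType", "fullForm")

ls_hu = struct("LS", te)
feat(ls_hu, "lang", "hu")
ts_hu = struct("TS", ls_hu)
feat(ts_hu, "term", "Alfa...")

gmt_xml = ET.tostring(te, encoding="unicode")
```

The nesting of `struct` elements mirrors the TE, LS and TS levels identified on the previous slides, independently of how DXLT happens to spell each data category.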

16 TML à la mode ISO
– Ingredients
– A structural skeleton
» (take the TMF meta-model)
– A reference Data Category Registry
» ISO 12620 is a good place to find one
– Recipe
– Choose some data categories from the registry
» You can even constrain the values of your DatCats
– Associate a style and vocabulary with each DatCat
» You can take inspiration from others (DXLT)
– Serve it hot to your software guy with a piece of SALT software

17 GMT Generic Mapping Tool

18 Background
• Interoperability principle
– If any two TMLs have exactly the same DCS, even though they differ radically in style and vocabulary, they are equivalent.
• Consequence
– It is always possible to define a filter from one TML to another when they are interoperable
– GMT is the intermediate representation to do so
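The interoperability principle can be illustrated with a small check. This is a sketch under stated assumptions: a TML is reduced to a mapping from data category name to a (style, vocabulary) pair, and the DCS is read simply as the set of data categories the TML uses; a real DCS may also carry value constraints. The three TML specifications are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical, simplified TML description: datcat -> (style, vocabulary).
TmlSpec = Dict[str, Tuple[str, str]]


def dcs(tml: TmlSpec) -> FrozenSet[str]:
    """Return the set of data categories used by a TML (simplified DCS)."""
    return frozenset(tml)


def interoperable(a: TmlSpec, b: TmlSpec) -> bool:
    """Two TMLs are treated as equivalent when their DCS are identical."""
    return dcs(a) == dcs(b)


tml_a = {"definition": ("Element", "def"),
         "partOfSpeech": ("TypedElement", "termNote")}
tml_b = {"definition": ("Element", "definition"),
         "partOfSpeech": ("ValuedElement", "pos")}
tml_c = {"definition": ("Element", "def")}

same_dcs = interoperable(tml_a, tml_b)       # styles differ, categories match
different_dcs = interoperable(tml_a, tml_c)  # tml_c lacks partOfSpeech
```

Style and vocabulary play no part in the comparison, which is why a filter between `tml_a` and `tml_b` can go through a neutral intermediate form.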

19 From one TML to another
• GMT - Generic mapping tool
– an abstract XML representation
– identification of levels: …
» a recursive element
– representation of data categories: …

20 GMT description cont. Bracketing features xxx Lenoc

21 GMT description cont. Annotating information: pencil whose casing is fixed around a central graphite medium which is used for writing or making marks

23 The tmf element
Description:
– The tmf element is the root element of any valid XTMF document. It contains the global information that corresponds to a terminological data collection, the collection itself, and the complementary information (external resources in particular) needed to describe the various terminological entries.
Content model:

24 The struct element
Description:
– The struct element should be used to represent a locus in a given structural skeleton. One such locus will be represented by exactly one struct element. The struct element is recursive and may also contain feat and/or brack elements to express attributes belonging to the corresponding level of the meta-model.
Attributes:
– type: level in the meta-model (TDC, TE, LS, TS or TCS)
Content model:
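As an illustration of the description above, the sketch below walks a tree of `struct` elements and reports problems. The slide's content model did not survive, so the rules checked here are assumptions: `type` must be one of the five listed levels, a nested `struct` must sit exactly one level below its parent, and the only other children allowed are `feat` and `brack`. The function name `check_struct` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Levels listed on the slide, from outermost to innermost.
LEVELS = ["TDC", "TE", "LS", "TS", "TCS"]


def check_struct(el, parent_level=None):
    """Return a list of problems found in a <struct> element and its children."""
    problems = []
    level = el.get("type")
    if level not in LEVELS:
        return [f"unknown level {level!r}"]
    # Assumption: each level nests directly inside the one just above it.
    if parent_level is not None and \
            LEVELS.index(level) != LEVELS.index(parent_level) + 1:
        problems.append(f"{level} cannot be nested directly in {parent_level}")
    for child in el:
        if child.tag == "struct":
            problems.extend(check_struct(child, level))
        elif child.tag not in ("feat", "brack"):
            problems.append(f"unexpected element <{child.tag}> in {level}")
    return problems


good = ET.fromstring(
    '<struct type="TE"><feat type="id">ID67</feat>'
    '<struct type="LS"><struct type="TS"/></struct></struct>'
)
bad = ET.fromstring('<struct type="TE"><struct type="TS"/></struct>')

good_problems = check_struct(good)
bad_problems = check_struct(bad)
```

Because `struct` is recursive, a single function covers every level; only the `type` attribute tells the levels apart.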

25 Styles and vocabularies

26 Implementing a DatCat
– Definitions:
» ‘style’: the way a given DatCat is implemented as an XML object…
» ‘vocabulary’: the symbols needed to express the implementation of a given DatCat in its associated style
– E.g.:
» DatCat: /definition/
» Vocabulary = [def]
» Style = Element
» pencil whose casing … (the DatCat value)

27 Implementing a DatCat (cont.)
– Definition: ‘anchor’: the XML element(s) to which the implementation of a given DatCat can be attached
– E.g.: alpha smoothing factor

28 Styles - element
• Element
– Def.: The DatCat is implemented as an element, child of its anchor
– Vocabularies: the name of the corresponding element
– E.g.: pencil whose casing … alpha smoothing factor

29 Styles - attribute
• Attribute
– Def.: The DatCat is implemented as an attribute of its anchor
– Vocabularies: the name of the corresponding attribute
– E.g.: … DatCat value

30
• TypedElement
• ValuedElement
• TypedValuedElement

31 Data Categories A Formal Description

32 Data Category Registry
[Diagram. Nodes: DCRegistry, Data Category, Description, VersionNumber. Labels: dcsd:DataCategory, rdf:about]

33 Data Category description
[Diagram: a Data Category node and its properties, each with its RDF property name]
– DCIdentifier (dcsd:DCIdentifier)
– DCName (dcsd:DCName)
– DCDefinition (dcsd:DCDefinition)
– DCType (S, C) (dcsd:DCType)
– DCParent (dcsd:DCParent)
– Content (dcsd:Content)
– Locus (dcsd:Level)
– DCExample (dcsd:DCExample)
– DCComment (dcsd:DCComment)
– DCAdmin (dcsd:DCAdmin)
Salt 2000-11-08/SEW

34 Levels and content
[Diagram. Content has a DataType (dcsd:DataType) and a TargetType (dcsd:TargetType). Lists of references to other DatCats, for Content and for Level/Loci, are expressed as rdf:Alt containers of rdf:li items.]

35 Administrative properties
[Diagram: a Data Category is linked to a DCAdmin node through dcsd:DCAdmin. Properties of DCAdmin:]
– Status (dcsd:Status)
– StatusDate (dcsd:StatusDate)
– StatusNote (dcsd:StatusNote)
– EditionDate (dcsd:EditionDate)
– Source (dcsd:Source)
– VariantNames (dcsd:VariantNames)
– ShortForm (dcsd:ShortForm)
– AdmittedName (dcsd:AdmittedName)
– ForbiddenName (dcsd:ForbiddenName)

36 Actualizing a DatCat TMF specific properties

37 Styling properties
[Diagram: a Data Category is linked to a Style node through dcsd:Style. Properties of Style:]
– StyleName (dcsd:StyleName): Simple, Element, Attribute, TypedElement, ValuedElement, TVElement
– ElementName (dcsd:ElementName)
– AttributeName (dcsd:AttributeName)
– TypeValue (dcsd:TypeValue)
– Value (dcsd:Value), for the Simple style
– Anchor (dcsd:Anchor)

38 Attribute style description
dcsd:StyleName=“Attribute”
– Conditions of use: Not valid for annotations
– Required properties: dcsd:AttributeName
– Example: dcsd:AttributeName=“id” …

39 Element style description
dcsd:StyleName=“Element”
– Required properties: dcsd:ElementName
– Example: dcsd:ElementName=“definition” …

40 TypedElement style description
dcsd:StyleName=“TypedElement”
– Required properties: dcsd:ElementName, dcsd:TypeValue
– Example: dcsd:ElementName=“termNote” dcsd:TypeValue=“partOfSpeech” N

41 ValuedElement style description
dcsd:StyleName=“ValuedElement”
– Conditions of use: Not valid for annotations
– Required properties: dcsd:ElementName
– Example: dcsd:ElementName=“pos”

42 TVElement style description
dcsd:StyleName=“TVElement”
– Conditions of use: Not valid for annotations
– Required properties: dcsd:ElementName, dcsd:TypeValue
– Example: dcsd:ElementName=“free” dcsd:TypeValue=“pos”

43 Simple style description
dcsd:StyleName=“Simple”
– Conditions of use: Express the value of simple data categories
– Required properties: dcsd:Value
– Example: dcsd:Value=“Nom” Nom
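The style descriptions above can be read as a small rendering procedure. The sketch below is one possible reading, not the SALT tooling: the function `apply_style`, the anchor element name `termSec`, and the XML attribute names `type` and `value` used for the typed and valued styles are assumptions, since the example markup on the slides was not preserved. The Simple style is left out.

```python
import xml.etree.ElementTree as ET

# Required properties per style, as listed on the slides (Simple is omitted).
REQUIRED = {
    "Attribute": ("attribute_name",),
    "Element": ("element_name",),
    "TypedElement": ("element_name", "type_value"),
    "ValuedElement": ("element_name",),
    "TVElement": ("element_name", "type_value"),
}


def apply_style(anchor, style, value, **props):
    """Attach a DatCat value to its anchor element according to a style."""
    if style not in REQUIRED:
        raise ValueError(f"unsupported style: {style}")
    missing = [p for p in REQUIRED[style] if not props.get(p)]
    if missing:
        raise ValueError(f"{style} requires {', '.join(missing)}")
    if style == "Attribute":
        # The value becomes an attribute of the anchor itself.
        anchor.set(props["attribute_name"], value)
        return anchor
    # All other styles create a child element of the anchor.
    child = ET.SubElement(anchor, props["element_name"])
    if style in ("TypedElement", "TVElement"):
        child.set("type", props["type_value"])  # assumed attribute name
    if style in ("ValuedElement", "TVElement"):
        child.set("value", value)               # assumed attribute name
    else:
        child.text = value
    return child


anchor = ET.Element("termSec")  # hypothetical anchor element
apply_style(anchor, "Attribute", "ID67", attribute_name="id")
typed = apply_style(anchor, "TypedElement", "N",
                    element_name="termNote", type_value="partOfSpeech")
valued = apply_style(anchor, "ValuedElement", "N", element_name="pos")
tv = apply_style(anchor, "TVElement", "N",
                 element_name="free", type_value="pos")
```

Under this reading, the styles marked "Not valid for annotations" are exactly those that store the value in an attribute, where inline markup cannot be placed.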

44 Bracketing information

45 Annotating content

46 Rationale
• Why should we annotate specific content?
– To identify components which are not explicitly expressed as a specific part of a terminological entry
» E.g.: Characteristics of a concept
– To relate a component to another entry or an external resource
» E.g.: bibliographical reference

47 Dealing with languages

48 Two types of languages
• Working language
– The language used at a given place in a document, along the XML hierarchy
– Representation: xml:lang
• Object language
– The language about which you speak at a given place in your terminological entry (e.g. describes the Language Section level)
– Representation: as a data category “language”, with a narrow scope
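The distinction can be made concrete with a short sketch. It assumes a GMT-like form with `struct` and `feat` elements, a `feat` of type `lang` for the object language, and a French working language declared once on the entry; these choices and the helper `working_language` are illustrative assumptions, not the deck's own markup.

```python
import xml.etree.ElementTree as ET

# ElementTree's name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

te = ET.Element("struct", {"type": "TE"})
te.set(XML_LANG, "fr")  # working language, inherited by descendants

ls = ET.SubElement(te, "struct", {"type": "LS"})
lang = ET.SubElement(ls, "feat", {"type": "lang"})
lang.text = "en"        # object language: the language being described

definition = ET.SubElement(ls, "feat", {"type": "definition"})
definition.text = "Une valeur entre 0 et 1 utilisée…"

ts = ET.SubElement(ls, "struct", {"type": "TS"})
term = ET.SubElement(ts, "feat", {"type": "term"})
term.text = "alpha smoothing factor"


def working_language(root, target):
    """Find the nearest xml:lang on the target or one of its ancestors."""
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        value = node.get(XML_LANG)
        if value:
            return value
        node = parents.get(node)
    return None


definition_language = working_language(te, definition)
object_language = ls.find("feat[@type='lang']").text
```

Here the definition is written in French although the language section describes English terms, which is the situation shown in the two examples that follow.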

49 Example — DXLT Une valeur entre 0 et 1 utilisée… alpha smoothing factor fullForm

50 Example — GMT en Une valeur entre 0 et 1 utilisée… alpha smoothing factor fullForm

51 Conclusion
– A general model for analysing and representing terminological data collections
– An underlying formalism expressed in XML and RDF
– Associated tools (SALT project): DCSEditor, DCSBrowser, automatic generation of XSLT filters and XML schemas from a given TML specification

52 Useful pointers
• SALT project
– http://www.loria.fr/projets/SALT
– http://www.ttt.org/
• The TMF site
– http://www.loria.fr/projets/TMF

