Using OLIF, The Open Lexicon Interchange Format Susan McCormick OLIF2 Consortium October 1, 2004.

1 Using OLIF, The Open Lexicon Interchange Format
Susan McCormick, OLIF2 Consortium
October 1, 2004

2 The OLIF Format
- The Open Lexicon Interchange Format
- XML-compliant standard
- Supports exchange of lexical and terminological data for language technology applications
- Handles basic exchange as well as more complex applications such as MT lexicons

3 The OLIF2 Consortium
- OLIF v.2 was developed by the OLIF2 Consortium, a group of language technology companies and organizations interested in the exchange of MT data and terminology data
- Led by SAP
- Members include Xerox, Microsoft, Trados, IBM, Systran, IAI, DFKI and Comprendium

4 Developing OLIF v.2
- Based on the OLIF prototype
- Prototype developed in the EC-funded OTELO project, which proposed standards for users of disparate language tools
- Original purpose of OLIF was to facilitate terminology exchange for industrial users of MT

5 Developing OLIF v.2
- Version 2 adapted from the OLIF prototype using input from:
  - Developers/users of 3+ MT systems
  - Developers/users of terminology management systems
  - Other language standards projects: EAGLES, SALT, ISLE, MARTIF, TBX

6 OLIF Version 2
- Released as open standard in 2002
- XML-compliant
- Covers 6 European languages: English, German, French, Spanish, Danish, Portuguese
- Includes options for modeling administrative, morphological, syntactic and semantic data

7 Available to Users
- XML implementation of the OLIF specification in a DTD
- Available from the OLIF2 Consortium web site: www.olif.net

8 The OLIF File
Follows Terminology Markup Framework (TMF) structure:
- Header
- Body
- Shared resources

9 The OLIF Entry
- Collection of monolingual data on a specified sense of a word or phrase
- Optional links for cross-reference and transfer
- Transfer is bilingual and unidirectional
- Multiple transfers in multiple languages possible for a single word sense

10 Key Data Categories
The OLIF entry is uniquely identified by 5 key data categories:
- Canonical form
- Language
- Part of speech
- Subject field
- Semantic reading
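One way to picture the uniqueness rule is as a composite key. The sketch below is illustrative only and is not part of OLIF: the class name `EntryKey`, its field names, and the helper `find_duplicates` are hypothetical. The sample values are the ones shown on the entry slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryKey:
    """Hypothetical holder for the five key data categories of an entry."""
    canonical_form: str
    language: str
    part_of_speech: str
    subject_field: str
    semantic_reading: str


def find_duplicates(keys):
    """Return the keys that occur more than once, in first-seen order."""
    seen = set()
    duplicates = []
    for key in keys:
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Values taken from the sample entries in the slides.
table_en = EntryKey("table", "en", "noun", "general", "86")
row_en = EntryKey("row", "en", "noun", "general", "69")
table_de = EntryKey("Tabelle", "de", "noun", "general", "86")
```

Because the dataclass is frozen, instances are hashable and compare by value, so two entries clash only when all five categories match.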

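11 Basic Well-Formed OLIF Entry
An entry carrying only the five key data categories:
- Canonical form: table
- Language: en
- Part of speech: noun
- Subject field: general
- Semantic reading: 86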
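The markup of this entry did not survive in the transcript, so the following is only a sketch of how such an entry might be serialized. The element names (`mono`, `keyDC`, `canForm`, `language`, `ptOfSpeech`, `subjField`, `semReading`) are assumptions recalled from the OLIF v.2 DTD and should be checked against the DTD published on www.olif.net. The function `build_entry` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed element names for the five key data categories; verify against the OLIF DTD.
KEY_DC_ELEMENTS = ("canForm", "language", "ptOfSpeech", "subjField", "semReading")


def build_entry(can_form, language, pos, subj_field, sem_reading):
    """Build a minimal entry holding only the five key data categories."""
    entry = ET.Element("entry")
    mono = ET.SubElement(entry, "mono")        # assumed wrapper for monolingual data
    key_dc = ET.SubElement(mono, "keyDC")      # assumed wrapper for key data categories
    values = (can_form, language, pos, subj_field, sem_reading)
    for name, value in zip(KEY_DC_ELEMENTS, values):
        ET.SubElement(key_dc, name).text = value
    return entry


entry = build_entry("table", "en", "noun", "general", "86")
xml_text = ET.tostring(entry, encoding="unicode")
```

Printing `xml_text` gives a single-line XML fragment that can be compared by eye with the sample entries distributed with the specification.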

12 <entry> table en noun general 86 </entry> Weber; ver; like book,books; cnt; [gencomp-opt]; inform

13 OLIF Entry with Cross-Reference
<entry> table en noun general 86 </entry>
<crossRefer> row en noun general 69 has-meronym

14 OLIF Entry with Transfer
<entry> table en noun general 86 </entry>
<transfer> Tabelle de noun general 86
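Since a transfer is bilingual and unidirectional, it can be thought of as a one-way link from a source entry to a target entry. The sketch below models that idea in memory; the class `TransferTable` and its methods are hypothetical and are not part of any OLIF tool. Entries are represented by plain tuples of the five key data categories.

```python
from collections import defaultdict


class TransferTable:
    """Hypothetical store of one-way links from source entries to target entries."""

    def __init__(self):
        self._targets = defaultdict(list)

    def add(self, source, target):
        """Record a transfer from source to target; nothing is added in reverse."""
        self._targets[source].append(target)

    def targets(self, source, language=None):
        """Return the targets of a source entry, optionally filtered by language."""
        found = self._targets.get(source, [])
        if language is None:
            return list(found)
        # Index 1 of each key tuple is the language code.
        return [target for target in found if target[1] == language]


# Keys are (canonical form, language, part of speech, subject field, semantic reading).
table_en = ("table", "en", "noun", "general", "86")
tabelle_de = ("Tabelle", "de", "noun", "general", "86")

transfers = TransferTable()
transfers.add(table_en, tabelle_de)
```

Looking up `tabelle_de` as a source returns nothing, which mirrors the one-way nature of the link. A German-to-English transfer would have to be recorded separately.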

15 Data Category Values
- Allowed values specified by OLIF
- Administrative, terminological, linguistic values based on:
  - General industry standards
    - E.g., allowed values for date derived from recommendations from ISO 8601:1988
  - MT/Terminology standards
    - E.g., suggested values for subject field adapted from EC
  - Widely-recognized linguistic standards
    - E.g., allowed values for gender based on longstanding gender description for European languages

16 User Extensions: The OLIF Data Category Registry
Users may declare and use their own values for certain data categories:
- Subject field
- Semantic reading
- Morphological structure
- Part of speech
- Inflection
- Aspect
- Syntactic type
- Syntactic frame
- Semantic type
- Concept hierarchy

17 Organizing Based on Concept
- Users may link monolingual entries via a concept identifier
- These IDs can be used to organize entries as equivalent word senses associated with the same concepts rather than source word senses associated with transfers.

18 Entries Linked by Concept
<entry ConceptUserId="0731F16CCCD2D3119B4D"> table en noun general 86 </entry>
<entry ConceptUserId="0731F16CCCD2D3119B4D"> Tabelle de noun general 86 </entry>
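Grouping entries by their shared concept identifier could look roughly like the sketch below. Only the `entry` element and its `ConceptUserId` attribute come from the slide. The wrapper element `body`, the inner element names (`mono`, `keyDC`, `canForm`, `language`), and the function `group_by_concept` are assumptions made for illustration, and the third sample entry without an identifier is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified sample; the inner element names are assumptions, not confirmed by the slides.
SAMPLE = """
<body>
  <entry ConceptUserId="0731F16CCCD2D3119B4D">
    <mono><keyDC><canForm>table</canForm><language>en</language></keyDC></mono>
  </entry>
  <entry ConceptUserId="0731F16CCCD2D3119B4D">
    <mono><keyDC><canForm>Tabelle</canForm><language>de</language></keyDC></mono>
  </entry>
  <entry>
    <mono><keyDC><canForm>row</canForm><language>en</language></keyDC></mono>
  </entry>
</body>
"""


def group_by_concept(xml_text):
    """Map each concept ID to a {language: canonical form} dictionary."""
    root = ET.fromstring(xml_text.strip())
    groups = defaultdict(dict)
    for entry in root.iter("entry"):
        concept_id = entry.get("ConceptUserId")
        if concept_id is None:
            continue  # entries without a concept ID are left out of the grouping
        key_dc = entry.find("mono/keyDC")
        groups[concept_id][key_dc.findtext("language")] = key_dc.findtext("canForm")
    return dict(groups)


concepts = group_by_concept(SAMPLE)
```

With this grouping, the English and German entries appear as equivalent word senses under one concept, with no transfer direction implied.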

19 What's Available to the OLIF User?
On www.olif.net:
- Complete XML DTD for download
- Hyperlinked DTD for viewing
- Graphical view of structure of DTD
- Current specification for OLIF v.2
- Formalization of OLIF data categories
- Alphabetic list of XML elements and attributes
- Fixed and recommended values for elements and attributes
- Guidelines for formulating canonical forms
- Sample OLIF entries

21 Using OLIF
Some applications:
- SAP has implemented an OLIF converter to exchange terminological data from its central termbase SAPterm
- MT developers in the OLIF2 Consortium are currently developing OLIF converters (Comprendium, Systran)
- OLIF User Forum = 60+ members

22 What's New: XML Schema
OLIF XSD offers:
- 40+ built-in data types
- Allows creation of user-defined data types
- Supports inheritance

23 What's New: The OLIF API
- Based on OLIF XSD, Java classes created
- Supports:
  - Converting .csv files to OLIF
  - Converting from XML format to OLIF
  - Creating OLIF documents from scratch
  - Modifying OLIF documents
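The slides do not show the API itself, and the OLIF API is a set of Java classes. The Python sketch below is therefore not that API. It only illustrates the general idea of converting .csv rows to OLIF-style entries. The CSV column names, the function `csv_to_entries`, and the XML element names are all assumptions to be checked against the OLIF DTD or XSD.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV column names mapped to assumed OLIF element names.
COLUMN_TO_ELEMENT = {
    "canonical_form": "canForm",
    "language": "language",
    "part_of_speech": "ptOfSpeech",
    "subject_field": "subjField",
    "semantic_reading": "semReading",
}


def csv_to_entries(csv_text):
    """Convert CSV rows into <entry> elements under an assumed <body> wrapper."""
    body = ET.Element("body")
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = ET.SubElement(body, "entry")
        mono = ET.SubElement(entry, "mono")
        key_dc = ET.SubElement(mono, "keyDC")
        for column, element in COLUMN_TO_ELEMENT.items():
            ET.SubElement(key_dc, element).text = row[column].strip()
    return body


CSV_TEXT = (
    "canonical_form,language,part_of_speech,subject_field,semantic_reading\n"
    "table,en,noun,general,86\n"
    "Tabelle,de,noun,general,86\n"
)

body = csv_to_entries(CSV_TEXT)
```

A real converter would also need to write the header and shared-resources sections and validate the result against the schema, which this sketch leaves out.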

24 What to Expect this Year from OLIF
- OLIF XSD and API are available to the user from www.olif.net
- OLIF web site upgraded, updated
- Requirements for modeling Japanese entries integrated

25 OLIF User Forum
Users of OLIF can access and post questions, messages and sample data from the OLIF group site: http://groups.yahoo.com/group/olifConsortium/

