Representing dictionaries with the TEI
Proposal for basic guidelines
Laurent Romary - Max Planck Digital Library
With the help of Susanne Alt - CNRS


Background
The P5 edition of the TEI Guidelines
– XML
– ODD - Roma
  Modules and classes
– DTD, RELAX NG, W3C schemas
The dictionary chapter
– Very close to the P4 version
– Work to be done
  Enhancing the coherence with the class system
  Providing more examples
  …

Proposal for today
Browse through the main features of the dictionary chapter
– Identify questionable issues
– Select best practices
Work with Roma and implement (part of) the best practices
– Minimal schema that a dictionary project can start with
  Bottom-up approach to customization
Discuss conformance

Dictionaries as TEI documents
Same general document structure as any other TEI document
– ,
  Define a common strategy concerning source identification with general text sources
  Specific documentation of previous editions
  Intuition that is not to be retained here
– , ,
– Divisions…
  Strong case for unnumbered <div>s
  Can we recommend/implement a basic dictionary-oriented typology?

Issues [see Wuerzburg.xml]
Providing precise guidelines for
– publicationStmt
  Elicit the role and possible content of
– sourceDesc
  Base the guidelines on ( ?) and biblStruct, biblItem

Describing dictionary entries
A variety of possible objects
– , ,
– ,
First issue: dealing with the editorial workflow
– Keep for ongoing tagging activity
  Depends on the degree of structure of the dictionary
– Stay consistent in the use of entry/entryFree/superEntry/hom
  Strong feeling for limiting ourselves to
– Point to the importance of
  Embedded entries

Finding the right granularity
The core lexical unit: <entry>
– Should be used coherently in a dictionary project to gather up homogeneous lexical objects
Possible combination with:
– <superEntry> to group sets of homographs
  Should only be used to record such a feature when it exists in legacy data
  Should be avoided for new editorial projects
– <hom> to subdivide senses in groups of homonyms

Example
Recording a series of homographs with <superEntry>
Issues
– Values of ‘n’ attribute according to the source
– Values of type defined in ‘att.entryLike’
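The markup of this example did not survive in the transcript. A minimal sketch of grouping homographs under a superEntry might look as follows, assuming current TEI P5 element names; the headword and definitions are illustrative only and are not taken from the slides.

```xml
<!-- Illustrative data only: headword and definitions are not from the slides. -->
<superEntry xmlns="http://www.tei-c.org/ns/1.0">
  <!-- 'n' carries the homograph number as printed in the source -->
  <entry n="1" type="hom">
    <form><orth>bank</orth></form>
    <sense><def>land alongside a river</def></sense>
  </entry>
  <entry n="2" type="hom">
    <form><orth>bank</orth></form>
    <sense><def>institution that holds and lends money</def></sense>
  </entry>
</superEntry>
```

Here type="hom" is one of the values suggested by att.entryLike; a project could restrict the list further in its customization.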

Example
Recording a series of homographs with <hom>
Issues
– Weak boundary between polysemes and homonyms
– Why not just have separate entries?
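The original markup for this slide is also lost. As a hedged sketch, the same kind of data recorded with hom inside a single entry could look like this, again with illustrative content that is not from the slides:

```xml
<!-- Illustrative data only: headword and definitions are not from the slides. -->
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form><orth>bank</orth></form>
  <hom n="1">
    <sense><def>land alongside a river</def></sense>
  </hom>
  <hom n="2">
    <sense><def>institution that holds and lends money</def></sense>
  </hom>
</entry>
```

The two homographs share one form, which is where the boundary with plain polysemy becomes hard to draw; encoding them as two separate entries avoids that question.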

From word to senses…
Background
– Semasiological vs. onomasiological views on lexical data
  Two complementary data organisations
  Two sets of standards
– In ISO: TMF (ISO 16642) vs. LMF
– In the TEI: Terminology vs. Print dictionary chapters

The LMF Model
[Diagram: components Lexical DB, Global Info, Lexical Entry, Form and Sense, linked with cardinality labels such as 1..1 and 0..n]

Consequences for dictionaries
Strong <form> to <sense> orientation
– <form> qualifies the entry, with the identification of the headword and its morphological variations
– <sense> is subordinated to the choice made for <form>
– Role of grammatical information
  Overall qualification of the entry
  Qualification of morphological variants
Issue
– does not necessarily fit into the theory

Example
Basic structure of an <entry>
chat
Petit animal familier
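Only the text content of this example survives. A plausible reconstruction, assuming the headword sits in form/orth and the definition in sense/def (the element choice is an assumption, the two strings are from the slide):

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <!-- form identifies the headword -->
  <form>
    <orth>chat</orth>
  </form>
  <!-- sense carries the semantic description -->
  <sense>
    <def>Petit animal familier</def>
  </sense>
</entry>
```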

Representing form and grammar
General issues
– Multiple forms , , etc.
– Compounds
  May be represented using embedded forms
– Role of grammar (<gramGrp>)
  In isolation: qualifies the entry
  Within a form: marks special features associated with the form
– Inflexions
  Can be represented by means of additional <form>s

Example
A simple entry
chat ∫a N f
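The tags of this example were lost. One way to read the four surviving values, assuming they are orthography, pronunciation, part of speech and gender in that order (an interpretation, not something the transcript states):

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form>
    <orth>chat</orth>
    <pron>∫a</pron>
  </form>
  <!-- gramGrp in isolation qualifies the whole entry -->
  <gramGrp>
    <pos>N</pos>
    <gen>f</gen>
  </gramGrp>
</entry>
```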

Example
Simple entry with inflected form
chat N m
chats p
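As a hedged sketch of how the surviving values could be encoded, assuming "N m" qualifies the entry and "p" marks the plural of the inflected form (the form type values "lemma" and "inflected" are among those suggested by TEI P5, but their use here is an assumption):

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form type="lemma">
    <orth>chat</orth>
  </form>
  <!-- entry-level grammar: part of speech and gender -->
  <gramGrp>
    <pos>N</pos>
    <gen>m</gen>
  </gramGrp>
  <!-- an additional form records the inflected variant -->
  <form type="inflected">
    <orth>chats</orth>
    <!-- grammar within a form marks features of that form only -->
    <gramGrp>
      <number>p</number>
    </gramGrp>
  </form>
</entry>
```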

: the case of the Campe dictionary
Step 1: Dealing with the presence of determiners
Das Aak

: the case of the Campe dictionary
Step 2: adding grammatical information
Das n
Aak N n

: the case of the Campe dictionary
Step 3: dealing with inflected forms
des … -es G

Main arguments for the proposed changes
Coherent use of <form> and <orth>
– Accounts for a coherent access to orthographic information in form/orth
Coherent use of grammatical features
– Danger of tag abuse with Das
– ‘type’ attribute should indicate a grammatical feature
– content should be the value of that feature
– Non-differentiation of features (art_n -> pos + gen)
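The split of a fused feature such as art_n into separate part-of-speech and gender values can be sketched as follows. This is an illustration under assumptions: the headword "Aak" and the values N, n and G come from the Campe slides, but how the determiner "Das" itself was encoded cannot be recovered from the transcript, so it is left out.

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form type="lemma">
    <!-- orthographic information is always reachable at form/orth -->
    <orth>Aak</orth>
  </form>
  <gramGrp>
    <!-- one element per feature, the content is the value of that feature -->
    <pos>N</pos>
    <gen>n</gen>
  </gramGrp>
  <form type="inflected">
    <orth>-es</orth>
    <gramGrp>
      <case>G</case>
    </gramGrp>
  </form>
</entry>
```

The generic alternative, `<gram type="gen">n</gram>`, follows the same principle: the type attribute names the feature and the content gives its value.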

: main components
Core elements
– <def>: to provide the definition
–
  Need to establish guidelines on the identification of sources
– <etym>: a complex issue…

Documenting examples
Ta gamine est assise trop, elle ne dépasse pas de la table.
BENOIT M, MICHEL C. Le Parler de Metz et du pays messin, Metz, Serpenoise, 2001, p. 38
Ta gamine est assise trop, elle ne dépasse pas de la table.
Ta gamine est assise trop, elle ne dépasse pas de la table.
Benoit M., Michel C., Le Parler de Metz...
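The slide appears to contrast several ways of attaching a source to the same example. A minimal sketch of the fully structured variant, assuming a cit element with a quote and a bibl (element choices follow current TEI P5 and are an assumption about the original markup; the text values are from the slide):

```xml
<cit type="example" xmlns="http://www.tei-c.org/ns/1.0">
  <quote>Ta gamine est assise trop, elle ne dépasse pas de la table.</quote>
  <bibl>
    <author>BENOIT M</author>
    <author>MICHEL C.</author>
    <title>Le Parler de Metz et du pays messin</title>
    <pubPlace>Metz</pubPlace>
    <publisher>Serpenoise</publisher>
    <date>2001</date>
    <biblScope unit="page">38</biblScope>
  </bibl>
</cit>
```

A lighter variant would keep the reference as unstructured text inside bibl, at the cost of making sources harder to identify consistently across the dictionary.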

A quick glimpse into Roma
A journey in three steps
– Adding the PD module and generating a schema
– Checking out elements
– Expressing constraints on specific values
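The three steps correspond roughly to what Roma writes into an ODD customization. The following is a sketch only, using current TEI P5 ODD syntax: the schema identifier, the choice to delete entryFree and the closed value list for form/@type are hypothetical decisions made for illustration, not recommendations from the slides.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="tei_dict_minimal" start="TEI">
  <!-- Step 1: base modules plus the dictionaries module -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="dictionaries"/>

  <!-- Step 2: check out an element the project does not want -->
  <elementSpec ident="entryFree" module="dictionaries" mode="delete"/>

  <!-- Step 3: constrain the values of an attribute -->
  <elementSpec ident="form" module="dictionaries" mode="change">
    <attList>
      <attDef ident="type" mode="change">
        <valList type="closed" mode="replace">
          <valItem ident="lemma"/>
          <valItem ident="inflected"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```

From such a specification, Roma can generate DTD, RELAX NG or W3C schema output.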

Final discussion
What does it mean to be TEI conformant?