Representing dictionaries with the TEI: Proposal for basic guidelines. Laurent Romary (Max Planck Digital Library), with the help of Susanne Alt (CNRS).

1 Representing dictionaries with the TEI: Proposal for basic guidelines
Laurent Romary (Max Planck Digital Library), with the help of Susanne Alt (CNRS)

2 Background
The P5 edition of the TEI guidelines
– XML
– ODD and Roma: modules and classes
– DTD, RelaxNG, W3C schemas
The dictionary chapter
– Very close to the P4 version
– Work to be done: enhancing the coherence with the class system, providing more examples, …

3 Proposal for today
Browse through the main features of the dictionary chapter
– Identify questionable issues
– Select best practices
Work with Roma and implement (part of) the best practices
– Minimal schema that a dictionary project can start with
– Bottom-up approach to customization
Discuss conformance

4 Dictionaries as TEI documents
Same general document structure as any other TEI document
– […], […]
  Define a common strategy for source identification, shared with general text sources
  Specific documentation of previous editions
  Intuition that […] is not to be retained here
– […], […], […]
– Divisions…
  Strong case for unnumbered […]s
  Can we recommend/implement a basic dictionary-oriented typology?
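As a rough illustration of this general structure, a dictionary file might be laid out as follows. This is only a sketch: the elements are standard TEI P5 elements, but the `type` and `n` values on the division and all text content are hypothetical placeholders that do not come from the slides.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Placeholder dictionary title</title>
      </titleStmt>
      <publicationStmt>
        <p>Placeholder: who publishes the electronic edition</p>
      </publicationStmt>
      <sourceDesc>
        <p>Placeholder: description of the printed source, if any</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <front>
      <div type="preface"><p>Placeholder front matter</p></div>
    </front>
    <body>
      <!-- Unnumbered div; the type value "letter" is an assumed typology -->
      <div type="letter" n="A">
        <entry>
          <form><orth>placeholder</orth></form>
        </entry>
      </div>
    </body>
    <back>
      <div type="appendix"><p>Placeholder back matter</p></div>
    </back>
  </text>
</TEI>
```

A small closed list of `type` values for divisions (for instance one value per kind of front matter and one for alphabetical sections) would be one possible answer to the typology question raised on the slide.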

5 Issues [see Wuerzburg.xml]
Providing precise guidelines for publicationStmt
Clarifying the role and possible content of sourceDesc
Basing the guidelines on […] ([…]?) and […] (biblStruct, biblItem)

6 Describing dictionary entries
A variety of possible objects
– […], […], […]
– […], […]
First issue: dealing with the editorial workflow
– Keep […] for ongoing tagging activity (depends on the degree of structure of the dictionary)
– Stay consistent in the use of entry/entryFree/superEntry/hom
Strong feeling for limiting ourselves to […]
– Point to the importance of […]
  Embedded entries

7 Finding the right granularity
The core lexical unit: […]
– Should be used coherently in a dictionary project to gather homogeneous lexical objects
Possible combination with:
– […] to group sets of homographs
  Should only be used to record such a feature when it exists in legacy data
  Should be avoided for new editorial projects
– […] to subdivide senses into groups of homonyms

8 Example
Recording a series of homographs with […]
Issues
– Values of the ‘n’ attribute according to the source
– Values of ‘type’ defined in ‘att.entryLike’

9 Example
Recording a series of homographs with […]
Issues
– Weak boundary between polysemes and homonyms
– Why not just have separate entries?
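The two options for recording homographs can be sketched side by side. The slide markup is not preserved here, so the use of `superEntry` for the first option and `hom` for the second is an assumption based on the element list given earlier in the deck (entry/entryFree/superEntry/hom). The word "avocat" and its definitions are invented for illustration only.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0" type="illustration">

  <!-- Option A (assumed): a superEntry grouping one entry per homograph.
       The n values would follow the numbering of the source;
       type="hom" is one of the values suggested in att.entryLike. -->
  <superEntry>
    <entry n="1" type="hom">
      <form><orth>avocat</orth></form>
      <sense><def>Personne qui plaide en justice</def></sense>
    </entry>
    <entry n="2" type="hom">
      <form><orth>avocat</orth></form>
      <sense><def>Fruit de l'avocatier</def></sense>
    </entry>
  </superEntry>

  <!-- Option B (assumed): a single entry subdivided with hom -->
  <entry>
    <form><orth>avocat</orth></form>
    <hom n="1">
      <sense><def>Personne qui plaide en justice</def></sense>
    </hom>
    <hom n="2">
      <sense><def>Fruit de l'avocatier</def></sense>
    </hom>
  </entry>

</div>
```

Dropping the `superEntry` wrapper in option A leaves two plain entries, which is the "separate entries" alternative the slide asks about.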

10 From word to senses…
Background
– Semasiological vs. onomasiological views on lexical data
  Two complementary data organisations
  Two sets of standards
– In ISO: TMF (ISO 16642) vs. LMF
– In the TEI: Terminology vs. Print dictionary chapters

11 The LMF Model
[Diagram: Lexical DB, Global Info, Lexical Entry, Form and Sense components, linked by associations with cardinalities 1..1 and 0..n]

12 Consequences for dictionaries
Strong […] to […] orientation
– […] qualifies the entry, with the identification of the headword and its morphological variations
– […] is subordinated to the choice made for […]
– Role of grammatical information
  Overall qualification of the entry
  Qualification of morphological variants
Issue
– […] does not necessarily fit into the theory

13 Example
Basic structure of an […]
– chat (headword)
– Petit animal familier (definition)
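A minimal sketch of such a basic structure in TEI P5 could look like this. The choice of `entry`, `form`, `orth`, `sense` and `def` is an assumption (the original markup is not preserved), guided by the form-and-sense organisation described on the previous slide; only the two text strings come from the slide.

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Form part: identifies the headword -->
  <form>
    <orth>chat</orth>
  </form>
  <!-- Sense part: carries the definition -->
  <sense>
    <def>Petit animal familier</def>
  </sense>
</entry>
```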

14 Representing form and grammar
General issues
– Multiple forms: […], […], etc.
– Compounds
  May be represented using embedded forms
– Role of grammar ([…])
  In isolation: qualifies the entry
  Within a form: marks special features associated with the form
– Inflexions
  Can be represented by means of additional […]s

15 Example
A simple entry: chat, ʃa, N, f

16 Example
Simple entry with inflected form: chat, N, m; chats, p
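The entry with an inflected form might be encoded along the following lines. This is a hedged sketch: the element choices (`form`, `orth`, `gramGrp`, `pos`, `gen`, `number`) are assumptions, since the slide markup is lost, and the `type` values "lemma" and "inflected" are taken from the values suggested for `form` in the TEI Guidelines rather than from the slide.

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form type="lemma">
    <orth>chat</orth>
  </form>
  <!-- Grammar in isolation: qualifies the entry as a whole -->
  <gramGrp>
    <pos>N</pos>
    <gen>m</gen>
  </gramGrp>
  <!-- Additional form: the plural, with grammar that applies to this form only -->
  <form type="inflected">
    <orth>chats</orth>
    <gramGrp>
      <number>p</number>
    </gramGrp>
  </form>
</entry>
```

Placing `gramGrp` at entry level versus inside a `form` mirrors the distinction made on the "Representing form and grammar" slide between qualifying the entry and marking features of a particular form.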

17 […]: the case of the Campe dictionary
Step 1: dealing with the presence of determiners
Das Aak

18 […]: the case of the Campe dictionary
Step 2: adding grammatical information
Das n Aak N n

19 […]: the case of the Campe dictionary
Step 3: dealing with inflected forms
des … -es G

20 Main arguments for the proposed changes
Coherent use of […] and […]
– Accounts for a coherent access to orthographic information in form/orth
Coherent use of grammatical features
– Danger of tag abuse with […] Das
– ‘type’ attribute should indicate a grammatical feature
– […] content should be the value of that feature
– Non-differentiation of features (art_n -> pos + gen)
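One way to read these arguments against the three Campe steps is sketched below. Everything about the markup is an assumption, because the slides' own encoding is not preserved: the nesting of `form` inside `form`, the hypothetical `type` values "determiner" and "headword", and the use of `gram` with `type` naming the feature ("pos", "gen", "case") and the content giving its value. Only the strings Das, Aak, des, -es, N, n and G come from the slides.

```xml
<entry xmlns="http://www.tei-c.org/ns/1.0">
  <form type="lemma">
    <!-- Step 1: the determiner is kept apart from the headword,
         so the headword is always found in form/orth -->
    <form type="determiner">
      <orth>Das</orth>
      <!-- Step 2: the determiner only signals gender -->
      <gramGrp>
        <gram type="gen">n</gram>
      </gramGrp>
    </form>
    <form type="headword">
      <orth>Aak</orth>
    </form>
    <!-- Step 2: one gram per feature (pos, gen) instead of a fused art_n -->
    <gramGrp>
      <gram type="pos">N</gram>
      <gram type="gen">n</gram>
    </gramGrp>
  </form>
  <!-- Step 3: the inflected form gets its own form element -->
  <form type="inflected">
    <form type="determiner">
      <orth>des</orth>
    </form>
    <form type="headword">
      <orth>-es</orth>
    </form>
    <gramGrp>
      <gram type="case">G</gram>
    </gramGrp>
  </form>
</entry>
```

With this layout a query for orthographic forms always goes through `form/orth`, and a query for a grammatical feature goes through `gram/@type`, which seems to be the coherence the slide argues for.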

21 […]: main components
Core elements
– […]: to provide the definition
– […]
  Need to establish guidelines on the identification of sources
– etym: a complex issue…

22 Documenting examples
– Ta gamine est assise trop, elle ne dépasse pas de la table. BENOIT M, MICHEL C. Le Parler de Metz et du pays messin. Metz: Serpenoise, 2001, p. 38
– Ta gamine est assise trop, elle ne dépasse pas de la table.
– Ta gamine est assise trop, elle ne dépasse pas de la table. Benoit M., Michel C., Le Parler de Metz...
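Two possible ways of attaching a source to such an example are sketched below, assuming the `cit`/`quote`/`bibl` pattern from the TEI Guidelines; the slide's own markup is not preserved, so this is not necessarily the encoding that was shown. The identifier `BenoitMichel2001` is hypothetical and would have to exist in a bibliography elsewhere in the file.

```xml
<sense xmlns="http://www.tei-c.org/ns/1.0">

  <!-- Variant 1: the full reference is given with the example -->
  <cit type="example">
    <quote>Ta gamine est assise trop, elle ne dépasse pas de la table.</quote>
    <bibl>
      <author>BENOIT M</author>, <author>MICHEL C.</author>
      <title>Le Parler de Metz et du pays messin</title>
      <pubPlace>Metz</pubPlace>
      <publisher>Serpenoise</publisher>
      <date>2001</date>
      <biblScope unit="page">38</biblScope>
    </bibl>
  </cit>

  <!-- Variant 2: the example only points to a bibliography entry;
       the target identifier is a hypothetical placeholder -->
  <cit type="example">
    <quote>Ta gamine est assise trop, elle ne dépasse pas de la table.</quote>
    <ref target="#BenoitMichel2001">Benoit M., Michel C., Le Parler de Metz...</ref>
  </cit>

</sense>
```

The pointer variant keeps entries short and the bibliography consistent, at the cost of requiring a maintained list of sources.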

23 A quick glimpse into Roma
A journey in three steps
– Adding the PD module and generating a schema
– Checking out elements
– Expressing constraints on specific values
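Roma produces an ODD customization, so the three steps can be approximated by a hand-written fragment like the one below. It is a sketch under assumptions: the schema name `tei_dictionary_minimal`, the decision to delete `superEntry`, and the closed value list for `gram/@type` are all illustrative choices, not the customization demonstrated in the talk.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="tei_dictionary_minimal" start="TEI">
  <!-- Step 1: the basic modules plus the print dictionaries (PD) module -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="dictionaries"/>

  <!-- Step 2: an element the project decides not to use (assumed choice) -->
  <elementSpec ident="superEntry" module="dictionaries" mode="delete"/>

  <!-- Step 3: constrain the values of an attribute (assumed value list) -->
  <elementSpec ident="gram" module="dictionaries" mode="change">
    <attList>
      <attDef ident="type" mode="change" usage="req">
        <valList type="closed" mode="replace">
          <valItem ident="pos"/>
          <valItem ident="gen"/>
          <valItem ident="number"/>
          <valItem ident="case"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```

A `schemaSpec` of this kind sits inside a TEI document and is processed into a RELAX NG, W3C or DTD schema; a customization that only removes elements and restricts values, as here, accepts a subset of what the unmodified TEI schema accepts.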

24 Final discussion
What does it mean to be TEI conformant?

