The Descriptive Grammar as a (Meta)Database Jeff Good University of Pittsburgh and Max Planck Institute for Evolutionary Anthropology.


1 The Descriptive Grammar as a (Meta)Database Jeff Good University of Pittsburgh and Max Planck Institute for Evolutionary Anthropology

2 Structure of Presentation Discuss major features found in four descriptive grammars surveyed for this paper Discuss special features particular to each of the grammars Propose a conceptual model for the structure of the information found in descriptive grammars Propose a basic XML representation of that model

3 The Grammars Surveyed A “best practice” grammar: Haspelmath’s (1993) Lezgian Grammar (Northeast Caucasian/Nakh-Daghestanian). A subcommunity grammar: Maganga and Schadeberg’s (1992) Kinyamwezi Grammar (Bantu). A Lingua Questionnaire grammar: Huttar and Huttar’s (1994) Ndyuka Grammar (Atlantic creole). A “legacy” grammar: Williamson’s (1965) Ijaw Grammar (Niger-Congo).

4 Four common features Nested, labeled sections Descriptive prose Exemplars Reference to multiple ontologies A fifth feature: “Structured description” See Penton, Bow, Bird, and Hughes “Towards a General Model for Linguistic Paradigms”

5 Sections Like most relatively long documents, grammars are divided into marked sections. Sections can be nested inside other sections. The content of sections tends to be partially standardized (e.g., most grammars will have sections on consonants, vowels, basic sentence structure, etc.). Sections are typically associated with a label and often with a title. Sectioning can also be sensitive to ontologies.

6 Ndyuka Grammar Sections

7 Lezgian Grammar Sections Verbal inflection Introduction The three stems of strong verbs Verbal inflectional categories Forms derived from the Masdar stem Forms derived from the Imperfective stem Forms derived from the Aorist stem Secondary verbal categories Prefixal negation and the Periphrasis forms Illustrative verbal paradigms Irregular verbs The copulas Verbs lacking a Masdar and Aorist stem Secondary verbal categories Verbs with root in ä(g)- Functions of basic tense-aspect categories Imperfective Future Aorist Perfect Continuative Imperfective and Continuative Perfect Past Functions of non-indicative finite verb forms Imperative Prohibitive Hortative Optative Conditional Interrogative... Morphological criteria Semantic criteria

8 Ontologies Descriptive grammars make extensive use of multiple ontologies (i.e., structured sets of categories) Three kinds of ontologies General (assumed to be understood by the entire linguistics community) Subcommunity (assumed to be understood by a well-defined subcommunity of linguists) Local (only taken to be meaningful in the context of the particular language being described)

9 General and Local Ontologies in Ijaw Grammar Arbitrary labels: Local Ontology Terms from a General Ontology

10 Subcommunity Ontology in Kinyamwezi Grammar Numbering scheme consistent throughout Bantu

11 Ontologies In the description, the use of different ontologies often overlaps In Ijaw, the labels I, II, III, IV, and V are drawn from a local ontology—but these labels designate tone classes, a concept drawn from a general ontology Furthermore, the terms from local and subcommunity ontologies are often explicitly defined using terms from general ontologies (e.g., a Bantu extension is defined as a type of suffix)
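The three kinds of ontology, and the way local and subcommunity terms are defined through general ones, can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `OntologyKind`, `Term`, `defined_as`, and `general_anchor` are hypothetical and do not come from the presentation; only the two examples (Ijaw tone-class labels and the Bantu extension) are taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OntologyKind(Enum):
    GENERAL = "general"            # understood by the whole linguistics community
    SUBCOMMUNITY = "subcommunity"  # understood by a well-defined subcommunity
    LOCAL = "local"                # meaningful only for the language described


@dataclass(frozen=True)
class Term:
    """A category label plus the kind of ontology it is drawn from (hypothetical model)."""
    label: str
    kind: OntologyKind
    # Optional term that this term is defined through.
    defined_as: Optional["Term"] = None


suffix = Term("suffix", OntologyKind.GENERAL)
tone_class = Term("tone class", OntologyKind.GENERAL)

# A Bantu "extension" is a subcommunity term defined as a type of suffix.
extension = Term("extension", OntologyKind.SUBCOMMUNITY, defined_as=suffix)

# The Ijaw labels I-V are local labels that designate tone classes.
ijaw_classes = [
    Term(label, OntologyKind.LOCAL, defined_as=tone_class)
    for label in ("I", "II", "III", "IV", "V")
]


def general_anchor(term: Term) -> Optional[Term]:
    """Follow defined_as links until a general-ontology term is reached, if any."""
    current: Optional[Term] = term
    while current is not None and current.kind is not OntologyKind.GENERAL:
        current = current.defined_as
    return current
```

Keeping the link to a general term optional reflects the point that such definitions are often, but not always, given explicitly.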

12 Descriptive Prose Descriptive prose forms the heart of the traditional grammar In addition to free-form prose, it can contain References to lexical items References to other sections References to exemplar data References to terms drawn from ontologies

13 Descriptive Prose The various references can have a standardized format. References to lexical items, for example, can have an orthographic_form ‘gloss’ format. References to other sections make use of a (typically numeric) label like “1.2.1”. References to exemplar data also make use of a label of some kind, like “(3a)”. General practice appears to be that references to ontologies are implicit.
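Because these reference formats are fairly regular, they could in principle be picked out of running prose mechanically. The following is a rough sketch, not a method from the presentation: the patterns, the function name `find_references`, and the sample sentence (with the invented form "foo") are all assumptions made for illustration, and real grammars would need looser patterns.

```python
import re

# Section labels such as 1.2.1 (two or more dot-separated numbers).
SECTION_REF = re.compile(r"\b\d+(?:\.\d+)+\b")
# Exemplar labels such as (3a) or (12).
EXEMPLAR_REF = re.compile(r"\(\d+[a-z]?\)")
# Lexical items in the form: orthographic_form ‘gloss’ (curly or straight quotes).
LEXICAL_REF = re.compile(r"(\w+) [‘']([^’']+)[’']")


def find_references(prose: str) -> dict:
    """Collect the three explicitly formatted reference types from a prose string."""
    return {
        "sections": SECTION_REF.findall(prose),
        "exemplars": EXEMPLAR_REF.findall(prose),
        "lexical_items": LEXICAL_REF.findall(prose),
    }


# Invented sample sentence; "foo" is a placeholder, not a real word of any language.
sample = "As noted in 1.2.1, the noun foo ‘house’ takes the suffix shown in (3a)."
references = find_references(sample)
```

References to ontologies are left out on purpose: since they are usually implicit, there is no surface format to match.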

14 Descriptive Prose in Lezgian Grammar [Figure callouts: reference to lexical item; reference to exemplar data; implicit grouping of prose and exemplar data]

15 Exemplar Data I use the term exemplar for data specifically selected as an example of some phenomenon in a descriptive grammar. There appear to be two major classes of exemplars: lexical exemplars (often arranged in a paradigm) and textual exemplars (typically in the form of interlinear text).

16 Exemplar Data Some features of exemplar data It is typically associated with a label (most commonly a number and/or a number followed by a letter) Exemplars can be grouped together Data may deviate from standard presentation format for illustrative purposes

17 Exemplar Data from Lezgian Grammar Lexical Exemplars (part of an exemplar group) Textual Exemplars Syntactic bracketing References to external set of texts Comparison forms

18 Structured Description I use the term structured description to refer to description, typically in tabular format, covering a particular, coherent domain of a language’s grammar. Structured description, as understood here, is broader than the notion of a paradigm, as discussed in the Penton, Bow, Bird, and Hughes paper at this conference. However, there is a large degree of overlap between the two.

19 Schematic Structured Description from Kinyamwezi Grammar Schematic tone patterns

20 Particular features: Lezgian A subject index with conventions for explicitly indicating the lack of a grammatical phenomenon. An example index indicating which examples, anywhere in the grammar, exemplify a given phenomenon. A typographic distinction between language-particular morphological categories and universal and semantic categories (and, by extension, between terms drawn from local and general ontologies).

21 Subject Index from Lezgian Grammar “Negative” Indexation

22 Example Index from Lezgian Grammar

23 Particular features: Lezgian Terms referring to language-specific categories are capitalized Ergative case Involuntary Agent construction “Universal” and semantic categories are not capitalized complement clause adverbial modifier The choice of labels for language-specific categories often implies a default mapping to a universal category
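The capitalization convention amounts to a simple typographic test, which can be sketched as below. This is an illustrative assumption rather than anything the grammar itself provides: the function names are hypothetical, and the test would misfire on a lowercase term that happens to start a sentence.

```python
def looks_language_specific(term: str) -> bool:
    """Return True when the term starts with a capital letter."""
    stripped = term.strip()
    return bool(stripped) and stripped[0].isupper()


def classify(term: str) -> str:
    """Map the typographic cue to the kind of ontology the term is drawn from."""
    return "local" if looks_language_specific(term) else "general"


# The four example terms from the slide.
examples = [
    "Ergative case",
    "Involuntary Agent construction",
    "complement clause",
    "adverbial modifier",
]
labels = {term: classify(term) for term in examples}
```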

24 Particular features Ndyuka: Based on Comrie and Smith’s 1977 Lingua “Questionnaire” Kinyamwezi: Extensive use of a subcommunity ontology Ijaw: Extensive use of “legacy” formalisms

25 Excerpt from Lingua “Questionnaire”, Basis of Ndyuka Grammar

26 Legacy Formalism in Ijaw Grammar

27 Towards a Model A descriptive grammar can be understood as a series of annotations over a lexicon and set of texts for a given language—that is, as a type of metadatabase over more “primary” linguistic data
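One way to picture the metadatabase idea is that annotations point at lexical items and texts by identifier instead of copying them. The sketch below makes that concrete under several assumptions: the class and field names (`Annotation`, `Grammar`, `lexeme_ids`, `text_ids`, `dangling_references`) are hypothetical, and all of the data is placeholder material rather than content from any of the grammars surveyed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Annotation:
    """A labeled piece of description that refers to primary data by id."""
    label: str
    prose: str
    lexeme_ids: List[str] = field(default_factory=list)
    text_ids: List[str] = field(default_factory=list)


@dataclass
class Grammar:
    """Annotations layered over a lexicon and a set of texts."""
    lexicon: Dict[str, str]   # lexeme id -> gloss
    texts: Dict[str, str]     # text id -> text content
    annotations: List[Annotation] = field(default_factory=list)

    def dangling_references(self) -> List[Tuple[str, str]]:
        """List (annotation label, id) pairs whose target is missing from the primary data."""
        missing = []
        for ann in self.annotations:
            for lexeme_id in ann.lexeme_ids:
                if lexeme_id not in self.lexicon:
                    missing.append((ann.label, lexeme_id))
            for text_id in ann.text_ids:
                if text_id not in self.texts:
                    missing.append((ann.label, text_id))
        return missing


# Placeholder data only.
grammar = Grammar(
    lexicon={"lex1": "house"},
    texts={"text1": "placeholder text"},
    annotations=[
        Annotation("1.2.1", "Placeholder prose.", lexeme_ids=["lex1"], text_ids=["text1"]),
        Annotation("1.2.2", "Placeholder prose.", lexeme_ids=["lex9"]),
    ],
)
```

Storing only identifiers in the annotations is what makes a consistency check like `dangling_references` possible.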

28 Towards a Model The structure of an annotation

29 Towards a Model Relationships among annotations

30 Towards an XML representation Grammar Ontologies Annotations Descriptive Prose Exemplar Set Lexicon Texts Various positions: Ontology References

31 Towards an XML representation An internal general ontology, or a reference to an external general ontology, would be placed in this element. An annotation can be associated with a reference to an ontology. Descriptive prose for an annotation would be placed here. In addition, there could be inline references to a lexical item via an element like the following. There can also be an exemplar set using the markup immediately below. The descriptive prose could also draw a term from an ontology by using an ontology reference as follows. ...

32 Towards an XML representation ... Ontology references can also be directly related to exemplars. An internal lexicon, or reference to an external lexicon. An internal set of texts, or reference to an external set of texts.
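The original markup did not survive in this transcript, so the following is only a guess at what a document of this shape might look like, built with Python's standard `xml.etree.ElementTree`. Every element and attribute name here (`grammar`, `ontologyRef`, `descriptiveProse`, `lexicalRef`, `exemplarSet`, `idref`, and so on) is an assumption loosely modeled on the component list given on slide 30, not the schema actually proposed, and all content is placeholder text.

```python
import xml.etree.ElementTree as ET


def build_grammar() -> ET.Element:
    """Assemble a hypothetical grammar document with the components listed on slide 30."""
    grammar = ET.Element("grammar")

    # Internal ontology (a reference to an external one could go here instead).
    ontologies = ET.SubElement(grammar, "ontologies")
    ET.SubElement(ontologies, "ontology", {"id": "general", "type": "general"})

    # One annotation, associated with an ontology reference.
    annotations = ET.SubElement(grammar, "annotations")
    annotation = ET.SubElement(annotations, "annotation", {"label": "1.2.1"})
    ET.SubElement(annotation, "ontologyRef", {"ontology": "general", "term": "suffix"})

    # Descriptive prose with an inline reference to a lexical item.
    prose = ET.SubElement(annotation, "descriptiveProse")
    prose.text = "Placeholder prose mentioning "
    lexical_ref = ET.SubElement(prose, "lexicalRef", {"idref": "lex1"})
    lexical_ref.text = "foo"
    lexical_ref.tail = " as an example."

    # Exemplar set; the exemplar carries its own ontology reference.
    exemplar_set = ET.SubElement(annotation, "exemplarSet")
    exemplar = ET.SubElement(exemplar_set, "exemplar", {"label": "3a", "textref": "text1"})
    ET.SubElement(exemplar, "ontologyRef", {"ontology": "general", "term": "suffix"})

    # Internal lexicon and internal set of texts (placeholders).
    lexicon = ET.SubElement(grammar, "lexicon")
    entry = ET.SubElement(lexicon, "entry", {"id": "lex1"})
    entry.text = "foo"

    texts = ET.SubElement(grammar, "texts")
    text = ET.SubElement(texts, "text", {"id": "text1"})
    text.text = "Placeholder text."

    return grammar


root = build_grammar()
xml_string = ET.tostring(root, encoding="unicode")
```

Printing `xml_string` shows the nesting; a real design would also have to settle how references to external ontologies, lexicons, and texts are addressed.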

33 Future Research Representations for structured description other than paradigms. Representations for special annotations on exemplar data. Incorporating machine-readable formal representations of phenomena into annotations. Development of methods to transform XML representation into human-readable forms.

