On Building a Multimedia Language Archive. Dagmar Jung (Institut für Linguistik, Allgemeine Sprachwissenschaft, Universität zu Köln). CCeH opening workshop.

1 On Building a Multimedia Language Archive. Dagmar Jung (Institut für Linguistik, Allgemeine Sprachwissenschaft, Universität zu Köln). CCeH opening workshop “IT in der Forschung an der Philosophischen Fakultät der Universität zu Köln” (IT in Research at the Faculty of Arts and Humanities, University of Cologne)

2 A brief step back in time: setting the scene of language documentation The linguist/anthropologist The native speaker The transcription The translation The analysis The ideal outcome: texts, grammar, dictionary CCeH

3 Pliny Earle Goddard 1914: The present condition of our knowledge of North American Languages
“There remains a great amount of linguistic work to be done. With so little known of the origin of languages, and the conditions controlling their development and their dispersion, it is important that a record should be preserved of every language spoken. In order that that record be adequate, great care must be taken in phonetic representation. The sounds which correspond to the characters employed in writing should be so carefully described as to their manner of articulation and their acoustic effects as to make them thoroughly intelligible for all time. Sufficient material from each dialect should be recorded in the connected form of texts to furnish a fairly complete lexicon of the words it contains and a representation of the grammatical forms in use.” (1914:592, American Anthropologist Vol. 16)

4 The DoBeS Program (Dokumentation bedrohter Sprachen, “Documentation of Endangered Languages”)
Funded by the Volkswagen Foundation
Started in 2000; ca. 45 projects worldwide
Technical team and archive development: MPI
Two main goals:
– Documentation of endangered languages (gathering audio and video data in the field and annotating them)
– Creation of a web-accessible, multi-media digital archive that will persist over the long term

5 The DoBeS projects (2008)

6 KÖBES: Kölner DoBeS-Projekte (Cologne DoBeS projects)
Prof. G. Dimmendaal (African Studies):
– “A multi-media documentation of verbal communication among the Tima” (2006-2009)
– “A linguistic and anthropological documentation of Tima” (2009-2011)
Dr. K. Haude (ASW):
– “Documenting Movima, an unclassified language of the Moxos region (Bolivia)” (2006-2009)
– “Making Movima visible: documenting a linguistic isolate in the Moxos cultural complex” (2009-2011)
Dr. D. Jung (ASW):
– “Beaver knowledge systems: language documentation from a placenames’ perspective” (2004-2008)
– “Real places and virtual representation - Beaver language documentation” (2008-2010)

7 Challenges today
Once the fieldwork situation is set up, a vast amount of language data can be recorded
Hardware no longer sets a limit on the quantity of recordings
Potentially a flood of audio and video data is collected -> how can it be processed to be useful?

8 Flexible Annotation Tools
ELAN (time-aligned video/audio annotation)
Toolbox (parsing tool and lexical database)
Interoperable with other representational and analytic tools (e.g. by providing XML interfaces)
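
To illustrate what time-aligned annotation behind an XML interface can look like, here is a minimal Python sketch. It reads a heavily simplified, hand-written document modeled on ELAN's annotation format; the element and attribute names follow that format as far as we are aware and should be checked against the ELAN documentation, and the tier names and annotation texts are invented placeholders, not project data.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified annotation document. A real ELAN file
# carries a header, linguistic types, and further attributes omitted here.
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3200"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>second utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="translation">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a3" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>gloss of first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tiers(xml_text):
    """Return {tier_id: [(start_ms, end_ms, text), ...]} for time-aligned annotations."""
    root = ET.fromstring(xml_text)
    # Map each time-slot id to its position in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
    }
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, text))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


if __name__ == "__main__":
    for tier_id, rows in read_tiers(EAF_SAMPLE).items():
        for start, end, text in rows:
            print(tier_id, start, end, text)
```

Because every annotation resolves to a start and end time, other tools can line the tiers up against the same audio or video without knowing anything about the annotation program itself.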

9 ELAN: annotation of multi-modal data

10 ELAN: multiple tiers

11 Toolbox

12 Tools: LEXUS (under development)
Web-based lexical database: allows for customized lexicon creation
Also imports from Toolbox
Multi-media links allowed
Its online nature is ideal for collaborative efforts

13 The Multi-Media Archive
Is not a place to merely ‘dump’ data and forget about them, but serves for:
– Data preservation
– Data presentation
– Data analysis (e.g. by making use of metadata or intelligent searches)
And last but not least (for the scientific community):
– Data accountability: unique resource identifiers

14 The Archive Location

15 The Archive: flexible corpus structures

16 Metadata
Necessary for archival organization:
– Identity of resources: language name, etc.
– Physical characteristics: quality, quantity
Desirable for scientific use of resources:
– Sociolinguistic data of participants
– Characteristics of genre
– Key words (free)
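
The two groups of metadata above could be modeled roughly as follows. This is only a sketch in Python: the class names, field names, and sample records are invented for illustration and are not the archive's actual metadata schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    # Sociolinguistic data of a participant (hypothetical fields).
    role: str
    age_group: str
    first_language: str


@dataclass
class SessionMetadata:
    # Needed for archival organization: identity and physical characteristics.
    session_id: str
    language_name: str
    recording_quality: str
    duration_seconds: int
    # Desirable for scientific use: participants, genre, free key words.
    participants: List[Participant] = field(default_factory=list)
    genre: str = ""
    keywords: List[str] = field(default_factory=list)


def find_sessions(sessions, genre: Optional[str] = None, keyword: Optional[str] = None):
    """Return the sessions that match every filter that was given."""
    result = []
    for session in sessions:
        if genre is not None and session.genre != genre:
            continue
        if keyword is not None and keyword not in session.keywords:
            continue
        result.append(session)
    return result


# Invented sample records, for illustration only.
sessions = [
    SessionMetadata(
        "s001", "Beaver", "good", 1800,
        participants=[Participant("speaker", "elder", "Beaver")],
        genre="narrative",
        keywords=["placenames"],
    ),
    SessionMetadata(
        "s002", "Beaver", "fair", 600,
        genre="conversation",
        keywords=["hunting"],
    ),
]

if __name__ == "__main__":
    for session in find_sessions(sessions, genre="narrative"):
        print(session.session_id, session.language_name, session.keywords)
```

Keeping the scientific fields optional reflects the split on the slide: a session can be archived with identity and physical characteristics alone and enriched later.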

17 ANNEX searches in the archive
Allows for simple searches or advanced multi-tier searches within annotations
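
The idea of a multi-tier search can be sketched in a few lines of Python. This is not ANNEX's implementation or query language, only an assumed illustration: annotations are (start_ms, end_ms, text) tuples per tier, a hit is an annotation on the first queried tier that matches its pattern and overlaps in time with a matching annotation on every other queried tier, and all tier names and texts are invented placeholders.

```python
import re


def overlaps(a, b):
    """True if two (start_ms, end_ms, text) annotations share some time span."""
    return a[0] < b[1] and b[0] < a[1]


def multi_tier_search(tiers, patterns):
    """Search several tiers at once.

    tiers:    {tier_id: [(start_ms, end_ms, text), ...]}
    patterns: {tier_id: regex}; the first entry is the anchor tier.
    Returns the anchor-tier annotations that satisfy all patterns.
    """
    (anchor_tier, anchor_pattern), *others = patterns.items()
    hits = []
    for ann in tiers.get(anchor_tier, []):
        if not re.search(anchor_pattern, ann[2]):
            continue
        # Every other tier needs an overlapping annotation matching its pattern.
        if all(
            any(
                overlaps(ann, other) and re.search(pattern, other[2])
                for other in tiers.get(tier_id, [])
            )
            for tier_id, pattern in others
        ):
            hits.append(ann)
    return hits


# Invented placeholder annotations, for illustration only.
tiers = {
    "transcription": [(0, 1500, "word-a word-b"), (1500, 3200, "word-c")],
    "translation": [(0, 1500, "the river is wide"), (1500, 3200, "we camped there")],
}

if __name__ == "__main__":
    query = {"translation": r"river", "transcription": r"word-a"}
    for start, end, text in multi_tier_search(tiers, query):
        print(start, end, text)
```

A simple search is the special case with a single tier in the query; adding tiers narrows the hits to stretches of the recording where all conditions hold at once.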

18 ANNEX: multiple views

19 Ways of Access and Visualization: Google Earth layer

20 Ways of Access and Visualization

21 Ways of access: web-accessible stories (derived from ELAN)

22 Ways of access: Community Portal

23 Changes in Language Resources: Data and Tools
Data are not the same (audio, video, quantity and quality)
The archive is inherently a work in progress, NOT a published end product
Tools are certainly not the same (annotation, presentation, search engines)
Linguistic work has become more cooperative: with communities, with international colleagues, with other disciplines
New foundation for linguistics as an empirical science

24 PS: Goddard, Documentation of Beaver Athabaskan (1917). Rousselot apparatus (kymography)

