The Language Archive – Max Planck Institute for Psycholinguistics Nijmegen, The Netherlands Increasing the usage of endangered language archives in the.


1 The Language Archive – Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Increasing the usage of endangered language archives in the years to come
Paul Trilsbeek and Alexander König

2 Users of EL Archives
– Regionally oriented archives: the largest proportion of users are community members*
– “Global” archives: the largest proportion are researchers*
– How to attract more users and a larger variety of users to both types of archives?
– How to increase the language documentation effort?
* Cf. P. Austin: Who uses endangered languages archives?

3 Community involvement
– Getting community members to engage in the documentation process?
– Providing an easy, YouTube-like upload mechanism
Some possible issues:
– Technical quality of the recordings
– Metadata
– Ethics, methodology

4 Technical quality of the data
– Limited by available equipment (camcorder, photo camera, mobile phone?)
– Limited by a/v recording skills
– Offering (online) training could help

5 Metadata
– It is already difficult enough to get current depositors to provide high-quality metadata, even though they are generally obliged to do so
– Resources without any kind of metadata are useless
– Come up with a core set of essential fields that should be filled in
– Controlled vocabularies might be problematic; perhaps re-use previously entered values as auto-complete suggestions and provide mappings (curation process) to standard CVs
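The last two points on this slide can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the slides do not say which fields belong to the core set, and `CORE_FIELDS`, `CURATED_MAPPING`, and the three functions are hypothetical, not part of any existing deposit tool.

```python
from collections import Counter

# Hypothetical core set of essential fields; the slides do not name one.
CORE_FIELDS = ("title", "language", "date", "depositor")

# Hypothetical curation table: free-text values mapped to a standard CV term.
CURATED_MAPPING = {
    "story": "narrative",
    "folk tale": "narrative",
    "chat": "conversation",
}


def missing_core_fields(record):
    """Return the core fields that are absent or blank in a metadata record."""
    return [f for f in CORE_FIELDS if not str(record.get(f, "")).strip()]


def suggest(prefix, previous_values, limit=5):
    """Suggest previously entered values that start with prefix.

    Values entered more often come first; ties are ordered alphabetically.
    """
    counts = Counter(v.strip() for v in previous_values if v.strip())
    wanted = prefix.casefold()
    matches = [(v, n) for v, n in counts.items()
               if v.casefold().startswith(wanted)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [v for v, _ in matches[:limit]]


def to_standard(value):
    """Map a free-text value to its curated CV term, or None if not curated."""
    return CURATED_MAPPING.get(value.strip().casefold())
```

With such a scheme, a deposit form could block submission while `missing_core_fields` returns anything, offer `suggest` results as the depositor types, and leave values for which `to_standard` returns `None` to a curator.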

6 Ethics
– We assume that current depositors are aware of generally applicable ethical guidelines such as the DOBES code of conduct, i.e. they know that it is, for example, required to obtain informed consent from the people being recorded
– For third-party deposits, one does not know whether this is the case
– Displaying the guidelines prominently on the deposit site might help

7 Integration with large-scale infrastructures
– There are currently many developments in the field of data and research infrastructures, such as the CLARIN and DARIAH infrastructure projects funded by the European Commission
– The idea is to make data and service providers interoperable, so that a researcher can use them all together seamlessly in “virtual research environments”
Some topics:
– Federated authentication and authorization
– Interoperable metadata framework
– Interoperable data providers and web service providers

8 Integration with large-scale infrastructures
What should EL archives do?
– Provide metadata records in the formats required by these infrastructures (OLAC, CMDI, …)
– Make use of central data category registries such as ISOcat
– Follow the developments regarding federated AAI infrastructure and participate when possible
– Follow the developments regarding web service specifications for language resources
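As a rough illustration of the first point, the sketch below serialises a few fields as an OLAC-style record using only the Python standard library. The function `build_olac_record` and its parameters are hypothetical. The namespace URIs and the `xsi:type="olac:language"` convention reflect the OLAC metadata format as best understood here and should be checked against the OLAC specification; the output is not validated against any schema.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as assumed for OLAC 1.1 and Dublin Core; verify before use.
OLAC_NS = "http://www.language-archives.org/OLAC/1.1/"
DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def build_olac_record(title, creator, date, language_code):
    """Return a small OLAC-style metadata record as an XML string.

    language_code is expected to be an ISO 639-3 code for the subject language.
    """
    # Use readable prefixes instead of the default ns0, ns1, ...
    ET.register_namespace("olac", OLAC_NS)
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("xsi", XSI_NS)

    root = ET.Element(f"{{{OLAC_NS}}}olac")
    for tag, text in (("title", title), ("creator", creator), ("date", date)):
        ET.SubElement(root, f"{{{DC_NS}}}{tag}").text = text

    # Subject language expressed through the OLAC language extension.
    subject = ET.SubElement(root, f"{{{DC_NS}}}subject")
    subject.set(f"{{{XSI_NS}}}type", "olac:language")
    subject.set(f"{{{OLAC_NS}}}code", language_code)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_olac_record("Wedding song", "A. Depositor", "2012-05-01", "nld"))
```

An archive that keeps its metadata in another internal format would typically generate such records in an export step, so that aggregators can harvest them without changes to the archive's own storage.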

9 Examples of metadata aggregators
– VLO faceted browser
– OLAC faceted browser
– NaLiDa faceted browser

10 Example of interoperable web services
– WebLicht

11 Access restrictions
– Access restrictions are necessary in the field of endangered languages archives to respect the wishes of the speech communities and to protect their privacy
– They do, however, frustrate many users of EL archives
– Some researchers keep material restricted for personal career reasons. In the case of a young scholar writing a PhD thesis, this is understandable and acceptable. It is less so for established researchers who are the main experts on a certain language and who were able to collect the data with public funding.

