1 Workshop Goals DELAMAN and DAM-LR Peter Wittenburg MPI for Psycholinguistics Access Management Nijmegen November 2004.

Presentation transcript:

1 Workshop Goals DELAMAN and DAM-LR Peter Wittenburg MPI for Psycholinguistics Access Management Nijmegen November 2004

2 When did we start?
it is just 5 years since we started speaking in our discipline about
– large digital online collections
– standardizing the formats
  XML was new and users were very skeptical
  MPEG was and is something still not well understood
– open metadata to come to browsable and searchable domains
– using metadata to create well-organized archives
– interoperability
LREC Athens 2000
– first workshop on these issues
– start of the ISLE project (linguistic concepts, lexicon, metadata, …)
– start of the IMDI work
in 2000 also first LDC workshop with OLAC as focus
a little later DOBES was granted and E-Meld started
this is a very short time when you want to convince a community

3 What did we achieve?
have “large” on-line digital archives/collections/Digital Libraries
– MPI ~ session bundles / ~10 TB
– DOBES ~1.500 session bundles / 1500 h
– AILLA
– PARADISEC
– Lund corpora
– also in HLT domain: LDC, ELRA, BAS
– also “traditional” archives (Phonogramm Archiv, NAA, …)
– etc
some of us became “archivists” by practice
idea of web visibility and online accessibility spreads
despite archiving attempts: according to D. Schüller ~80% of the digitized material is endangered

4 What did we achieve?
much evangelization and agreement about standards
– DOBES workshops and documents
– LDC workshops and documents
– E-Meld workshops and excellent web-site
– ISLE workshops with IMDI result
– PARADISEC workshop with DELAMAN result
– HRELP workshops
– LREC workshops and contributions
– ACL workshops and contributions
– IASA/IAML conference
– etc
“everyone” agrees with XML, UNICODE and linear PCM
“everyone” understands the relevance of schemas to make linguistic structure and encoding explicit
with respect to JPEG and MPEG we are shooting at a moving target, but don’t yet have real alternatives

5 What did we achieve?
created awareness about the need for metadata for visibility
created operational metadata infrastructures within 4 years
– structured IMDI for discovery and management
– OLAC for overall discovery
– gateways between the two domains
however, the situation is still not satisfying
– > 50 institutions are using IMDI (as far as we know)
– ?? institutions are providing OLAC records
– still only a small fraction of the language resources is visible
– MD creation is hard
  it is work for others – although this increasingly often is wrong
  it means cleaning up your own holdings and figuring out what is available
  it means writing “correct” scripts and learning new software
  it means being disciplined
have done our development job – have to continue dissemination
despite limitations we hope that people stick to what is out there

6 What did we achieve?
interoperability is still a dream
however …
– have metadata gateways in our discipline (OLAC-IMDI)
– increasingly often tools are producing correct XML, UNICODE, …
– have filters for character encodings and formats, although we miss well-designed and comprehensive services
– have started with ontological work to tackle the linguistic aspects
  GOLD ontology from E-Meld
  ISO TC37/SC4 Data Category Registry
  TDS (Dutch Typology Project) meta-language
  EAGLES/ISLE/TEI specifications
we are at the beginning
cannot speak yet about fully operational infrastructures
but there are islands like FIELD, LEXUS, ONTO-ELAN, …

7 Changing role of Language Archives
[Diagram: The Archive]
– different groups of people contribute
– different groups of people use the content
– specialists maintain, unify, check quality, etc
at the MPI it is understood that the archive is the capital to build on
in the DOBES programme the point is to make results explicit and accessible
this only works if we don’t have an “inert, dusty” archive – not an attractive perspective – hear more about this from D. Schüller

8 Vision for a single archive
[Diagram: The Archive; legend: done, in progress, to start. Labels in slide order: Metadata Tools Archive Utility Layer; Domain of Registered Primary and Secondary Resources; Domain of Descriptive Metadata; Primary Resources: Texts, Images, Sound, Movies; User Data Ingestion & Management; User Authentication; Access Rights; Web-based Archive Exploration; Annotation Exploration; Lexicon Exploration; Text Exploration; Ontological Knowledge; Media Annotation (Web-based) Archive Enrichment; Lexical Encoding; Web Commentary]

9 Everything ok – so let’s go home …
what about the following scenario?
[Diagram: archive A and archive B, each with Raw Data and Metadata; data exchange for data survival reasons]

10 Everything ok – so let’s go home …
what about the following scenario?
[Diagram: DOBES Archive and AILLA Archive, each with Raw Data and Metadata; my personal Trumai archive (AILLA Trumai, DOBES Trumai)]
not just copies but result of own creative process

11 DELAMAN
Digital Endangered Languages and Music Archive Network
loose network of “archives” sharing a set of visions such as
– want to exchange data automatically (list driven)
– want to allow people to create integrated virtual working spaces
– want to have an integrated access management domain
first talks in Nijmegen and at HRELP workshops 2003
foundation at PARADISEC meeting in Sydney 2003
no deep discussions about wishes in detail and implementation
therefore this workshop in Nijmegen
it’s about future usage scenarios with distributed archives

12 DELAMAN / DAM-LR
[Map: MPI, AILLA, EMELD, ANLC, LACITO, ELAR, PARADISEC, AMPM, Lund, INL, AIATSIS]
DELAMAN is an international network
DAM-LR – Distributed Access Management for Language Resources
– 3 year EU project starting at
– yes we have money to start
– centered around the DELAMAN intentions

13 Workshop
want to get a deeper understanding of what “we” want
need good requirements specifications
want to get a deeper understanding of what others are doing
– our ideas are not new – we share them with others
– Digital Library initiatives (FEDORA, …)
– GRID initiative(s) (SRB, GTK, …)
– compute/function/data GRID
therefore we invited
– linguists knowing about potential and real user wishes
– “archivists” knowing about maintaining large repositories
– technologists knowing about current and future developments
– some of us looked into the legal and ethical aspects
at the end we should be ready to start

14 Programme 1. Day
Setting the Framework
9.00  W. Klein  Welcome
9.10  P. Wittenburg  DELAMAN and Workshop Goals
9.40  D. Schüller  Audiovisual archiving: Visions, Challenges, Strategies
10.15  Discussion
10.30  Coffee Break
Researcher Requirements (Kamp)
11.00  T. Aristar/H. Dry  Linguist Wishes
11.30  P. Austin/D. Nathan  Linguist Wishes
12.00  G. Holton/H. Johnson  Legal & Ethical Aspects
12.30  Lunch Break
Archivist Requirements (Strömquist)
13.30  H. Johnson  AILLA Setup and Implications
14.00  L. Barwick  Paradisec Setup and Implications
14.30  Wittenburg/Skiba/Trilsbeek  DOBES Setup and Implications
15.00  Coffee Break
Summary and Discussion (Strömquist)
15.30  Uneson/Broeder/Strömquist  Summary of Requirements
Questions and Discussion
17.00  W. Krull  DOBES Program and the VW Foundation
17.15  Soddemann/Neumair/Verharen/Wbg  Technology - Broad View
End
20.00  Joint Dinner at Kwok Paw

15 Programme 2. Day
Technology Components (Nathan)
9.00  T. Soddemann (got the Billing Award)  Web Services
9.40  D. Barry  GRID Components
10.10  B. Kerver  Authentication and Authorization Systems
11.00  Coffee Break
Nathan
11.30  L. Lannom  Handle System
12.00  R. Moore  Storage Resource Broker
12.45  Discussion
13.00  Lunch Break
Mapping Requirements and Technology (Aristar/Broeder)
14.00  Aristar/Dry/Johnson/Barwick/…  Understanding Technology Linguists/Archivists
14.15  Broeder/Nathan/Jacobson/Neumair/...  Choice and Integration Aspects
14.30  Discussion
15.00  Coffee Break
15.30  Grand Summary and Open Discussion
16.00  Wittenburg  Summary
16.30  Discussion
17.00  End
times not too strict – it’s a workshop

16 Let’s go …
The MPI team wishes us two interesting and highly interactive days in Nijmegen
Daan, Andreas: Technology
Paul, Roman: Archive
Peter: ??