CLARIN - a European Research Infrastructure Peter Wittenburg Max-Planck Institut für Psycholinguistik, Nijmegen.

eResearch - Infrastructures (Bozen)
J. Taylor: “eScience is about global collaboration in key areas of science and the next generation of infrastructures that will enable it”
Requires new persistent platforms
- to enable researchers to combine resources and tools to solve the big challenges of today (global migration, crisis of cultures and minds)
- to increase the efficiency of researchers in the many small tasks
- 40% of the time of "knowledge workers" is spent finding useful material (Forrester Research)

CLARIN Goal
What:
- Offer a distributed Research Infrastructure of integrated and interoperable Language Resources and Tools that serves researchers and students in the SSH (social sciences and humanities)
How:
- allow the combination of existing and web-accessible digital centers hosting resources in a common federation
- offer language tools and services as distributed services with a common web interface

Key Application/Mission
A researcher authenticates at his own organization, creates a virtual collection of resources from different repositories, and executes a virtual pipeline of processes on them.
King Arthur failed, by the way. Will CLARIN fail as well?
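
The mission statement above can be pictured with a small sketch. It is only an illustration under stated assumptions: the class names, the `hdl:example/...` identifiers, the repository host names, and the two processing steps are all hypothetical and not part of any CLARIN API, and authentication is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Resource:
    """One resource hosted by some repository (all values are hypothetical)."""
    pid: str          # persistent identifier of the resource
    repository: str   # host name of the repository that stores it
    text: str         # stand-in for the actual content


@dataclass
class VirtualCollection:
    """A user-defined collection that only references resources."""
    name: str
    resources: List[Resource] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)


def run_pipeline(collection: VirtualCollection,
                 steps: List[Callable[[Any], Any]]) -> Dict[str, Any]:
    """Apply every step, in order, to each resource in the collection."""
    results: Dict[str, Any] = {}
    for resource in collection.resources:
        data: Any = resource.text
        for step in steps:
            data = step(data)
        results[resource.pid] = data
    return results


# Resources from two different (hypothetical) repositories in one collection.
collection = VirtualCollection("greetings")
collection.add(Resource("hdl:example/1", "repo-a.example.org", "Guten Tag"))
collection.add(Resource("hdl:example/2", "repo-b.example.org", "Dobar dan"))

# A two-step "virtual pipeline": lowercase the text, then split it into tokens.
results = run_pipeline(collection, [str.lower, str.split])
```

The collection stores references rather than copies, which is what lets resources from different repositories sit side by side; in a real federation each step would be a remote web service rather than a local function.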

CLARIN is pan-European
CLARIN: 3-year preparatory phase
~ 200 members
~ 25 centre candidates

CLARIN Work Dimensions
IT-oriented aspects (at least):
- how to come to a persistent and stable infrastructure? community centres
- how to come to a federation and how to get access? service provider federation
- how to make all of their LRT visible? CMDI (future) & short-term solution
- how to come to interoperable services? service oriented architecture
- how to get it all together for user services? pan-European demo cases
CLARIN has other very important aspects:
- Relation with SSH disciplines - mainly driven by national funds
- Education/Training, Help/Support/Advice, Dissemination
- Harmonization of licensing and codes of conduct
- Specification of the ERIC legal framework to ensure persistency

Community Centres
CLARIN Centres
- Centres Criteria
- Long-term Preservation
- REPLIX Replication
25 Centre Candidates
- all are busy with restructuring plans
- 2 already give long-term preservation service

Service Provider Federation
Diagram labels: Trust Domain; Initial Federation; PID Service
Tasks:
- set up federation technology
- build initial federation
- set up EPIC service
- central user attribute server
Service Provider Federation
- Agreement 1: n centers (members)
- Agreement 2: link up with national IdFs: DFN (De), HAKA (Fi), SURFnet (Nl)
- 1 million potential user IDs currently; more countries and centers coming

Metadata Domain
Diagram labels: Component Metadata; Metadata now; Virtual Collection
Tasks:
- CMDI Infra
- ISOcat development
- set up OAI-PMH machinery
Components: ISOcat Registry; VLO Observatory; Category Definition; LRT Inventory; Virtual Language World; ARBIL MD Editor
CMDI diagram labels: ISOcat concept registry; component editor; myprofile; metadata editor; metadata descriptions; CLARIN component registry; user area; component registration; concept registration
Note: this is where the ILSP team played a central role
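
To make the OAI-PMH machinery on this slide more concrete, here is a minimal sketch of how a harvester could build `ListRecords` request URLs. The `verb`, `metadataPrefix`, `set`, and `resumptionToken` arguments come from the OAI-PMH protocol itself; the endpoint URL and the `cmdi` prefix value are assumptions chosen for illustration, since each repository advertises its own supported prefixes.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str,
                     set_spec: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    A resumption token is an exclusive argument in OAI-PMH: when it is
    present, no other argument besides the verb may be sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; "cmdi" is an assumed metadataPrefix value.
first_page = list_records_url("https://repository.example.org/oai", "cmdi")
next_page = list_records_url("https://repository.example.org/oai", "cmdi",
                             resumption_token="abc")
```

A harvester would fetch `first_page`, read any resumption token from the response, and keep requesting further pages until no token is returned.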

Service Oriented Architecture
Diagram labels: Service Oriented Infrastructure; Web Services Interoperability; Standards & Best Practices
- Service Framework Specification
- Web Service and Processing Chains
- Standards and Best Practices
- Web 2.0 Application for Tool Chaining and Execution
- Repository: Stuttgart, Tübingen, Berlin, Leipzig, Finland
- Standard-conformant Text Corpus Encoding: Stuttgart, Tübingen, Leipzig, Romania

Demo Cases (just started)
- EU Identity Index Case
- Multimedia/multimodal Case
- Folkstory Case
- C4/WebLicht Corpus Case

not alone... EUDAT Meta-Net

need to take care of data...
Architecture created by the EC High Level Expert Group; it will be a guideline for the coming decades.
Diagram layers:
- Data generators; Users
- User functionalities, data capture & transfer, Virtual Research Environments
- Community Support Services (CLARIN, DARIAH etc.): data discovery & navigation, workflow generation, annotation, interpretability
- Common Data Services (data e-Infrastructure): safe & persistent storage, identifiers, authenticity, workflow execution, mining
- Across the layers: Data Curation; Trust

why European?
- we live in a multilingual Europe with a joint historical tradition and need to exploit this strength
- many research questions are cross-national
- required standards cannot be national
- sharing costs in all respects is more efficient
- finally, it's about global competition, also in SSH

Why now?
- there is the ESFRI process and all countries are synchronized, which is a unique chance to build infrastructures
- in total 44 initiatives are on the ESFRI roadmap, and there is potential gain from an ecosystem of RI
- we need to organize our resource domain due to the huge increase of data (MPI: 200 TB)
- we need to take care not to lose our cultural and scientific memory
- there is a huge uptake of RI and there will be many funding streams!

who and when?
- current EU CLARIN consortium in prep phase (08-10): 32 partners from 24 countries
- CLARIN construction phase from 2011; main funds by national programs, but additional funding streams by EC connected to RI
- legal issue: foundation of a European Research Infrastructure Consortium (ERIC) as basis for the future, with automatic qualification to participate in programs

Organisation of the CLARIN ERIC (CLARIN, Utrecht, March)

who seems to be on board?
Belgium, Bulgaria, Germany, Denmark, Estonia, Latvia, Finland, Croatia, Netherlands, Norway, Austria, Portugal, Spain, Czech Republic, Hungary, South Tyrol, ?
Some are discussing: FR, SW, GR?, etc.

Advantage of membership
- privileged access to the CLARIN federation
- networked with CLARIN centres (direct technology transfer)
- a say when discussing priorities, agreements, best practices
- access to EC funding streams
- access to education and training programs to make our young generation competitive

Further information
- CLARIN web site:
- CLARIN office:
- CLARIN Newsletter:
- CLARIN members:

Thanks for your attention.

CLARIN Usage Scenario
- Scenario: A Serbian and a German PhD student want to study language variation in the Balkan area
- Resource: via the VLO they find all relevant language variation data for that area
- Tools/Services: modern clustering methods available via the web make it possible to quickly build dialect continua on top of a geographic map; visualization services can be pipelined after them to produce a nice output
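
As a rough illustration of the clustering step in this scenario, the sketch below groups dialect sites by single-linkage agglomerative clustering. This is one common method chosen here for simplicity, not necessarily the one the CLARIN services use, and the site names and distance values are invented purely for the example.

```python
from typing import List, Optional, Tuple


def single_linkage(names: List[str],
                   dist: List[List[float]],
                   k: int) -> List[List[str]]:
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters: List[List[int]] = [[i] for i in range(len(names))]

    def cluster_distance(a: List[int], b: List[int]) -> float:
        # Single linkage: distance between the closest pair of members.
        return min(dist[i][j] for i in a for j in b)

    while len(clusters) > k:
        best: Optional[Tuple[float, int, int]] = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected

    return [sorted(names[i] for i in c) for c in clusters]


# Invented pairwise linguistic distances between four hypothetical sites.
sites = ["site_A", "site_B", "site_C", "site_D"]
distances = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.70, 0.85],
    [0.80, 0.70, 0.00, 0.20],
    [0.90, 0.85, 0.20, 0.00],
]
groups = single_linkage(sites, distances, k=2)
```

Each resulting group could then be drawn in one colour on a map, which is the kind of output the visualization services in the scenario would produce.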

Visualization of Dialect Data: Clustering

CLARIN Usage Scenario
- Scenario: Linguists, sociologists and ethnologists want to study the cultural and linguistic differences of parliament debates in SE, DE and GR about the swine flu and compare how such global problems are dealt with
- Resource: building a virtual collection of all debates (audio, video, transcription)
- Tools/Services: allowing researchers to analyse and annotate gestures, intonation, word choices, timing etc., where powerful computers partly need to be used
- Vision: in 2011/12 such computational services will be made available in CLARIN