LKR2004, Tokyo March 8+9 2004 The European Resources Landscape Steven Krauwer ELSNET / Utrecht University The Netherlands.

1 LKR2004, Tokyo March 8+9 2004 steven.krauwer@elsnet.org The European Resources Landscape Steven Krauwer ELSNET / Utrecht University The Netherlands

2 Overview
About ELSNET
Main characteristics of the European scene
Impact of EU funding policies
Bottom-up resources infrastructure actions
Concluding remarks

3 What is ELSNET
European Network in Human Language Technologies (ca 145 academic and industrial member organisations)
Funded by the European Commission
Created in 1991 as one network out of (eventually) ca 25, covering all subfields of ICT
Objectives
– bringing together the language and speech communities
– bringing together academia and industry
– facilitating R&D in language and speech technology
Info:

4 What we do
Spreading knowledge, e.g.:
– Training (e.g. annual summer schools, curriculum development)
– Information dissemination (newsletter, website, etc.)
– Knowledge transfer (directories, workshops)
Creating common foundations:
– language resources
– common standards and evaluation methods
Roadmapping:
– Establishing a broadly supported common vision of where the language and speech field is going

5 Main characteristics of the European Landscape
Multilinguality: coping with many languages and crossing language boundaries
Fragmentation of all R&D efforts over national funding schemes and policies
Unbalanced efforts over languages, even though all languages are equally hard

6 Languages in Europe
European Union has
– 15 member states, with 11 official languages (plus quite a few ‘unofficial languages’)
– 10 new member states with (at least) 10 new official languages joining May 1st 2004
– 3 applicant countries in the waiting room with at least 3 extra languages
Europe has
– 17 other countries, with quite a few additional languages (think of Russia!)

7 Languages in the world
The Ethnologue (
Europe: 230 languages
The Americas: 1013 languages
The Pacific: 1311 languages
Africa: 2058 languages
Asia: 2197 languages

8 Languages in Japan
Just one language: Japanese …
But even in Japan multilinguality is a factor, e.g.:
– Export market requires localized products (e.g. user interfaces)
– Users require documentation in their own language
– Business to business communication crosses language boundaries
– Immigrants

9 Resources in Europe
Language resources collection started in most countries as a cultural or political activity
Most activities in larger countries with bigger funding programmes
Adoption or creation of resources for industrial application started much later
Most of them addressing commercially interesting languages
Result: very uneven coverage

10 Impact of the EU
During the 70s and 80s the EU becomes a major funder of technology programmes
For smaller languages the EU becomes the main funding source
Political requirement of multinational consortia and balanced participation over member states gave a strong boost to resources development for smaller languages

11 Recent EU policies
EU focus shifting to activities with a more direct commercial impact
EU focus shifting from spreading excellence to boosting excellence: only invest in sectors where Europe can maintain or strengthen world leadership (over e.g. US and Japan)
EU moves from many small projects (up to 5 million euro) to few big projects (up to 50 million)
Language and speech technology have disappeared from the agenda, and Interfaces and Knowledge Systems have taken their place

12 Result of new policies
Strong emphasis on the commercially interesting languages
Language and speech will only appear as embedded technologies
Creation of language resources in EU projects only if needed for the main objectives of the project, i.e. never as a goal per se
Fragmentation of language and speech technology activities over many projects

13 Impact on infrastructures
Creation and distribution of resources, standards, and evaluation are infrastructural in nature (as opposed to research and development)
They require continuity and active industrial involvement
Very hard to accomplish in EU funding context because of short duration of projects and requirement that industries contribute 50% of their costs themselves
Resources actions now mostly at national level

14 Overall picture …
… not very good: very little to expect from the EU as far as improvement of the language resources situation is concerned for the duration of the present Framework Programme (2003-2007)
But there are some signs that the situation will improve in the next Framework Programme
And there are still a number of bottom-up activities (emerging from the community, with or without EU support)

15 Ongoing resources infrastructure actions
ELSNET: still running (since 1991, hopefully secured until summer 2005; funded by the EU as a series of independent 2-3 year projects), still supporting resources and evaluation, now focusing on the roadmap for language and speech technology and for language and speech resources
ELRA/ELDA: Resources Association and Agency; European counterpart (although not twin sister) of LDC

16 Ongoing actions, continued
ENABLER:
– Network aiming at coordination of national resources activities; EU funding has ended, but it remains active.
– Surveys and other useful material on website (
– Involved in resources roadmap and landscape (see later)
– Asian and US participation

17 Cocosda
International committee for the coordination and standardisation of speech databases and assessment techniques
International, not just European; also active Asian involvement
Not funded, but alive

18 ICCWLRE
International coordination committee for written language resources and evaluation
Written language counterpart of Cocosda
Goal is to join forces with Cocosda
To be launched at LREC 2004 in Lisbon
International, active Asian participation

19 LREC
Biennial international conference on resources and evaluation
Initiated in 1998, very successful, and truly international
Only conference on this topic and only conference bringing together language and speech communities

20 Ongoing actions, continued
The Language Resources Roadmap:
– Joint activity of ELSNET/ENABLER/ELRA
– Aimed at creating a broadly supported common vision of where the field is going, and what the implications are for language resources
– Workshops (
– Graphical representation at

21 Ongoing actions, continued
The Resources Landscape:
– Joint project by ELSNET/ENABLER
– Aimed at creation and continued maintenance of a full landscape of the world of language resources (actors, actions, projects, events, resources, etc.)
– Still under construction
– See

22 EAGLES/ISLE/WordNet
EAGLES (and its successor ISLE) were EU-funded projects aimed at standards in language and speech processing
Projects have ended, but there are still some ongoing activities, such as MILE (the Multilingual ISLE Lexical Entry)
WordNet has had a number of European spin-offs, such as EuroWordNet, BalkaNet and local instantiations for other languages

23 Ongoing actions: BLARK
Define (in a language-independent way) the minimal set of language resources that is necessary to do any precompetitive R&D and education at all for a language (the Basic Language Resource Kit or BLARK)
Determine for each language which components are already available (survey)
Make for each language a priority plan to complete the BLARK (and to get funding)
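
The three BLARK steps (define the kit, survey each language, derive a priority plan) can be illustrated with a minimal sketch. The slide does not list the actual BLARK components, so the component names, language names and survey data below are hypothetical placeholders, and ranking languages by number of gaps is only one possible way to prioritise.

```python
# Hypothetical BLARK definition: these component names are illustrative
# placeholders, not the actual BLARK contents.
BLARK_COMPONENTS = frozenset(
    {"written_corpus", "spoken_corpus", "lexicon", "tokenizer"}
)


def missing_components(available):
    """Return the BLARK components a language still lacks, sorted by name."""
    return sorted(BLARK_COMPONENTS - set(available))


def priority_plan(survey):
    """List (language, missing components) pairs, largest gap first."""
    gaps = {lang: missing_components(avail) for lang, avail in survey.items()}
    return sorted(gaps.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical survey result: which components each language already has.
survey = {
    "language_a": {"written_corpus", "lexicon", "tokenizer"},
    "language_b": {"lexicon"},
}

for lang, gaps in priority_plan(survey):
    print(lang, gaps)
```

Because the kit is defined once and independently of any language, the same survey and planning code applies to every language unchanged.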

24 New initiatives
Proposal to create BLARKnet: rejected by the EU because language and speech are not core objectives
In France the successful launch of the new national programme TechnoLangue, explicitly addressing resources and evaluation
In Europe the initiative towards LangNet, a network aimed at coordination of national language and speech technology programmes (including resources and evaluation)
Some of the new EU projects will address resources problems, but project info has not been released yet

25 Concluding remarks
We have seen some problems that are inherent to the situation in Europe and that will not go away: linguistic fragmentation and uneven distribution of R&D efforts over languages
We have seen self-imposed problems (EU funding schemes and policies); they may go away if and when the funders change their minds
But we have also seen that there is still room for a variety of resources-related initiatives in Europe, many of which could benefit from collaboration with e.g. Japan
