Linked Data as an enabler of cross-media and multilingual content analytics for enterprises across Europe A.Gómez-Pérez (UPM) Project Coordinator.

Presentation transcript:

Linked Data as an enabler of cross-media and multilingual content analytics for enterprises across Europe
A. Gómez-Pérez (UPM), Project Coordinator
CSA
Budget: €
Starting date: 1. Nov
Duration: 2 Years

The LIDER consortium
• Universidad Politécnica de Madrid (UPM, Spain) [COORDINATOR]
• Trinity College Dublin (Ireland)
• DFKI (Germany)
• National University of Ireland, Galway (Ireland)
• Institut für Angewandte Informatik EV (INFAI, Germany)
• University of Bielefeld (Germany)
• Universita degli Studi di Roma La Sapienza (Italy)
• GEIE ERCIM (France)

Evidence of industrial demand
• Multilingual multimedia content annotation
  o Increasing demand for NLP services that combine text processing with multimedia metadata and media processing components.
• LOD generation from linguistic resources
  o Data is already being published by companies, but linguistic resources are not yet published as LLOD.
• LOD-based NLP services for Content Analytics
  o CA-related companies actively use the English DBpedia (OpenCalais, Zemanta, Ontos, Yahoo!, NERD, etc.).
  o Multilingual LOD would be vital for reaching EU-wide and global markets.

The use of LOD for NLP in Content Analytics
• Which extensions to the LOD are needed to support a new generation of large-scale content analytics applications that will overcome language barriers?
  o Identification of key NLP tasks that require background knowledge
  o Specification of a new generation of NLP services that are LOD-aware and can exploit LOD
• Licensed linguistic linked data (LLD or LLOD)

Linked Open Data and Language
• 2007
• 2009
1. LOD is increasingly multilingual
2. LOD interconnects resources in many languages

LOD is dominated by the English language
[Charts comparing January 2012, June 2012 and December: Current usage of language tagging capabilities in RDF (RDF literals with language tag vs. RDF literals without language tag); Number of monolingual and multilingual datasets; English tags versus other languages' tags (RDF literals with English tag vs. RDF literals with other language tag); 4. Evolution of top-10 languages (non-English)]
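The language-tagging charts on this slide rest on counting RDF literals with and without a language tag. A minimal sketch of such a count over N-Triples lines is given below. It is not the method used to produce the slide's figures: the function name `language_tag_stats`, the regular expression, and the sample triples with their `example.org` IRIs are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches a literal in object position at the end of an N-Triples statement:
# a quoted lexical form, optionally followed by @lang or ^^<datatype>, then " ."
LITERAL_RE = re.compile(
    r'"(?:[^"\\]|\\.)*"'
    r'(?:@(?P<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*)|\^\^<[^>]*>)?'
    r'\s*\.\s*$'
)


def language_tag_stats(ntriples_lines):
    """Return (tagged, untagged, languages) for the literals in the input.

    tagged/untagged count literals with/without a language tag;
    languages tallies the lower-cased tags that were seen.
    """
    tagged, untagged = 0, 0
    languages = Counter()
    for line in ntriples_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = LITERAL_RE.search(line)
        if match is None:
            continue  # object is an IRI or blank node, not a literal
        lang = match.group("lang")
        if lang:
            tagged += 1
            languages[lang.lower()] += 1
        else:
            untagged += 1
    return tagged, untagged, languages


# Hypothetical sample data (example.org IRIs are made up for this sketch).
SAMPLE = [
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@es .',
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid"@en .',
    '<http://example.org/madrid> <http://www.w3.org/2000/01/rdf-schema#label> "Madrid" .',
    '<http://example.org/madrid> <http://example.org/population> "3200000"^^<http://www.w3.org/2001/XMLSchema#integer> .',
    '<http://example.org/madrid> <http://example.org/locatedIn> <http://example.org/spain> .',
]

tagged, untagged, languages = language_tag_stats(SAMPLE)
```

One possible reading of "multilingual dataset" is a dataset whose literals carry more than one distinct language tag, which here would be `len(languages) > 1`. That definition is an assumption; the slide does not state the criterion it used.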

LOD as large background knowledge for NLP
[Diagram labels: Multimedia and Multilingual Content; Producers; Metadata Generation; Multilingual content metadata; Consumers; Content Analytics; Language Resources (lexicon, corpora, ...), some of them are FOI, others are private; Linguistic LOD generation; LLOD (language resources as LD); LOD-aware NLP services]

Iterative approach
[Diagram labels: Industry use cases; Roadmap, guidelines, target architecture; Community building, networking]

Expected Contributions from the Community
• Use case definition from industry will be input to the roadmap
• Linguistic resources
• LLOD
• Validation of guidelines and reference architecture
• Participation in surveys
• Participation in events:
  o Roadmapping WS, hackathons, etc.
LIDER will help with travel grants for participants in Roadmapping WS

The use of (Linguistic) LOD for NLP
Linguistic LOD (LLOD)
• Subset of LOD
• Linguistic and Open resources in RDF interconnected with other Linguistic and Open resources
• Not many linguistic resources are available as LOD yet
Linguistic LD (LLD)
• Licensed linguistic linked data
LOD, LLOD and LLD as a source of large background knowledge for NLP
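To make the idea of interconnected linguistic resources as background knowledge concrete, here is a minimal sketch in which lexicon entries from different languages point to a shared concept, and an NLP service follows those links to find translation candidates. It does not use any real vocabulary or dataset: the prefixes (`ex-en:`, `concept:`), the property names `ex:writtenForm` and `ex:reference`, and the function `translations` are hypothetical.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object). All identifiers are
# made up for this sketch and do not come from an actual LLOD dataset.
TRIPLES = [
    ("ex-en:bank_n", "ex:writtenForm", ("bank", "en")),
    ("ex-en:bank_n", "ex:reference", "concept:Bank"),
    ("ex-es:banco_n", "ex:writtenForm", ("banco", "es")),
    ("ex-es:banco_n", "ex:reference", "concept:Bank"),
    ("ex-de:Bank_n", "ex:writtenForm", ("Bank", "de")),
    ("ex-de:Bank_n", "ex:reference", "concept:Bank"),
    ("ex-en:river_n", "ex:writtenForm", ("river", "en")),
    ("ex-en:river_n", "ex:reference", "concept:River"),
]


def translations(triples, word, source_lang):
    """Find forms in other languages whose entries share a concept with `word`."""
    forms = defaultdict(list)  # lexicon entry -> [(written form, language)]
    refs = defaultdict(set)    # lexicon entry -> {referenced concepts}
    for subject, predicate, obj in triples:
        if predicate == "ex:writtenForm":
            forms[subject].append(obj)
        elif predicate == "ex:reference":
            refs[subject].add(obj)

    # Concepts referenced by any entry whose written form is the source word.
    concepts = set()
    for entry, entry_forms in forms.items():
        if (word, source_lang) in entry_forms:
            concepts |= refs[entry]

    # Collect written forms of other-language entries linked to those concepts.
    result = {}
    for entry, entry_forms in forms.items():
        if refs[entry] & concepts:
            for form, lang in entry_forms:
                if lang != source_lang:
                    result.setdefault(lang, set()).add(form)
    return result


candidates = translations(TRIPLES, "bank", "en")
```

With the sample data, `candidates` maps `"es"` to `{"banco"}` and `"de"` to `{"Bank"}`. The lookup works only because the three lexicons link to the same concept identifier, which is the kind of interlinking the slide describes.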

Workplan and outcomes
1. Definition of business use cases
  o Extract requirements needed to exploit LLD in content analytics processes
  o Extract common and frequent NLP-based tasks that are needed for content analytics
2. Definition of guidelines and best practices for:
  o Multimedia and multilingual content metadata generation and use
  o LLD generation
  o NLP services built on top of LLD
3. Reference Architecture and Roadmap for content analytics
  o Reference architecture: reference model + architectural patterns
  o Roadmap involving the academic community and industry
[Diagram labels: Business use cases: LLD in CA; Guidelines and best practices: LLD for CA; Linguistic LOD; LLD; Reference Architecture; Roadmap: LLD for CA]

Workplan and Outcomes
4. Community Building and Dissemination
  o Industrial Board
  o Open community
    - Events tailored to the different audiences
    - Roadmapping Workshops
    - Surveys to localization industry and general Web companies
    - Sessions at W3C Multilingual Web Workshop and European Data Forum
    - Publication of best practices material via W3C community groups
    - Hackathons
  o Community portal
    - Relying on portal and the related social channels
  o Dissemination activities

A lot of domain data in LOD…
• Music
• Geographic
• Life Sciences
• Publications
• E-Gov
• On-line activities
• Cross-domains