
Presentation transcript:

The MultilingualWeb-LT Working Group receives funding from the European Commission (project name LT-Web) through the Seventh Framework Programme (FP7) in the area of Language Technologies. Grant Agreement No
Implementation Basket
Moderator: Felix Sasaki (DFKI / W3C Fellow)

What is in the basket?
Tools to support work with W3C ITS 2.0:
① ITS 2.0 in editing environments
② Generate and validate ITS 2.0
③ (Automatically) process ITS 2.0 enhanced content
What the audience should do:
– Think about the area that interests you
– Remember faces and use META-FORUM for hallway conversations

W3C ITS 2.0 in editing environments
– In the CMS 1: Adobe. Presenter: Felix Sasaki
– In the CMS 2: Cocomore. Presenter: Clemens Weins
– In a word processor: ]init[. Presenter: Steffen Haller
– In a Web content editor: Disruptive Innovations. Presenter: Daniel Glazman

Adobe’s ITS2 Implementation
Adobe’s fully open source implementation imports and exports content enabled with ITS2 metadata to/from a JCR Content Repository.
– Diagram labels: CMS, REST Framework, XML (XLIFF), HTML5
– Implemented data categories: Translate, Localization Note, Id Value, Target Pointer
– To access content: GET
– To access the same content, ITS enabled: GET html
– Accessible via ‘selector’ REST URLs. E.g.:

Build the bridge: Web CMS <> TMS
– Drupal ITS 2.0 integration
– JavaScript ITS 2.0 parser
– Real-life ITS 2.0 showcase with a customer (VDMA) and a Language Service Provider (Linguaserve)
– Diagram labels: XHTML + ITS 2.0, LSP

W3C ITS LibreOffice Extension
– ]init[ AG für Digitale Kommunikation
– Downloadable at the LibreOffice Extension Centre: rg/extension-center
– Open source (GPL v3): free to use and to be developed further
– More on: ficeWriter

Generate and validate ITS 2.0
– Generate Terminology: Tilde. Presenter: Andrejs Vasiļjevs
– Generate Text Analysis information: Institut “Jožef Stefan”. Presenter: Felix Sasaki
– Transform HTML5+ITS2 to NIF (NLP Interchange Format): Univ. of Leipzig. See the NIF poster from Sebastian Hellmann
– Validate all ITS 2.0 data categories: University of Economics Prague. Presenter: Jirka Kosek

W3C ITS 2.0 Enriched Terminology Annotation Showcase
taws.tilde.com

Creating translation context with disambiguation
– Problem: localizing content containing proper names without sufficient context
– Solution: use natural language processing techniques to provide context for ambiguous content
– ITS 2.0 markup provides the key information about which entities are mentioned, so they can be correctly processed within translation
– Data category: Text Analysis
– Implemented and demonstrated with the Enrycher NLP tool
– Demo: enrycher.ijs.si/mlw/
– Questions:
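As a rough illustration of the Text Analysis data category named on this slide, the sketch below reads entity annotations back out of an HTML5 fragment. The its-ta-ident-ref attribute is the ITS 2.0 HTML5 form for an entity identifier. The sample sentence, the DBpedia IRIs and the EntityCollector helper are assumptions made up for this example; they are not Enrycher output, and the sketch assumes well-nested markup.

```python
from html.parser import HTMLParser

# Hypothetical input: a sentence whose place names were annotated by some
# text-analysis tool. The IRIs are illustrative only.
SAMPLE = (
    '<p>The office moved from '
    '<span its-ta-ident-ref="http://dbpedia.org/resource/Paris,_Texas">Paris</span>'
    ' to '
    '<span its-ta-ident-ref="http://dbpedia.org/resource/Dublin">Dublin</span>.</p>'
)

# Elements that never have an end tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class EntityCollector(HTMLParser):
    """Collects (surface text, entity IRI) pairs from its-ta-ident-ref."""

    def __init__(self):
        super().__init__()
        self._open = []     # one [ref, text_parts] entry per open element
        self.entities = []  # filled as annotated elements are closed

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self._open.append([dict(attrs).get("its-ta-ident-ref"), []])

    def handle_data(self, data):
        # Text belongs to every annotated element that is currently open.
        for ref, parts in self._open:
            if ref is not None:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        ref, parts = self._open.pop()
        if ref is not None:
            self.entities.append(("".join(parts), ref))


def extract_entities(html):
    collector = EntityCollector()
    collector.feed(html)
    collector.close()
    return collector.entities


if __name__ == "__main__":
    for text, iri in extract_entities(SAMPLE):
        print(text, "->", iri)
```

A translation tool could use such pairs to tell which "Paris" is meant before choosing a target-language form.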

W3C ITS 2.0 Support in Modern Document Formats
HTML5 support
– Native support (its-* attributes)
– Supported by validators: validator.w3.org and validator.nu
– You can use ITS markup right now in your pages and get them validated
DocBook support
– Supported by standard schema and stylesheets
DITA support
– Coming soon
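To make the "its-* attributes" point concrete, here is a minimal sketch that lists the ITS-related attributes found in an HTML5 page. The attribute names (translate, its-loc-note, its-term) come from ITS 2.0 and HTML5. The sample page and the ItsAttributeCollector helper are assumptions for illustration, and the sketch only covers local markup, not global rules.

```python
from html.parser import HTMLParser

# Hypothetical page using three kinds of local ITS 2.0 markup in HTML5.
SAMPLE = """<!DOCTYPE html>
<html lang="en">
<body>
<p>Press <span translate="no">Ctrl+S</span> to save.</p>
<p its-loc-note="Button label; keep it short.">Submit</p>
<p>A <span its-term="yes">motherboard</span> connects the components.</p>
</body>
</html>"""


class ItsAttributeCollector(HTMLParser):
    """Records every translate or its-* attribute with its element name."""

    def __init__(self):
        super().__init__()
        self.found = []  # (element, attribute, value) in document order

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # HTMLParser lowercases attribute names before calling us.
            if name == "translate" or name.startswith("its-"):
                self.found.append((tag, name, value))


def collect_its_attributes(html):
    collector = ItsAttributeCollector()
    collector.feed(html)
    collector.close()
    return collector.found


if __name__ == "__main__":
    for element, attribute, value in collect_its_attributes(SAMPLE):
        print(f"<{element}> {attribute}={value!r}")
```

This only reads the markup; checking that a page uses it correctly is the job of the validators mentioned above.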

(Automatically) process ITS 2.0 enhanced content (1/2)
– Machine translation, statistical: Dublin City University. Presenter: Felix Sasaki
– Machine translation, rule based: Lucy Software. See presentation from Pedro Díez Orzas later
– Building localization processes: ENLASO. Presenter: Felix Sasaki
– Building localization Web services: University of Limerick, Moravia. Presenter: David Filip
– Workflow for creating global content: Trinity College Dublin. Presenter: Dave Lewis
– Preview in the browser: Logrus. Presenter: Serge Gladkoff

ITS 2.0 & Machine Translation
Translation Web Service
– Translation of HTML / XLIFF documents tagged with ITS 2.0 metadata: Domain, Lang Info, Locale Filter, Terminology, Translate, MT Confidence, Provenance
– Demonstrates that pre/post-process wrapper scripts are sufficient to adapt a pre-existing MT system to the ITS 2.0 standard
– Benefits include integration of the MT system into the larger localization pipeline
Training Web Service
– Use of metadata to train statistical MT components (translation and language models): Translate, Terminology
– Extract do-not-translate and named-entity terms and force-feed them into the training cycle
– Significant improvement observed in translation accuracy
– Benefits include added consistency in translation across multiple documents
Web Service located at:
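The pre/post-process wrapper idea can be sketched as follows, under loud assumptions: this is not the Dublin City University implementation. The placeholder scheme, the protect and restore helpers and the fake_mt stand-in (which just uppercases text) are all hypothetical. The sketch only handles un-nested span elements with translate="no" written exactly as in the sample.

```python
import re

# Matches the simplest form of "do not translate" markup. A real wrapper
# would use an HTML or XLIFF parser instead of a regular expression.
NO_TRANSLATE = re.compile(r'<span translate="no">.*?</span>', re.DOTALL)


def protect(html):
    """Pre-process: swap do-not-translate spans for opaque placeholders."""
    protected = []

    def stash(match):
        protected.append(match.group(0))
        return f"__ITS{len(protected) - 1}__"

    return NO_TRANSLATE.sub(stash, html), protected


def restore(text, protected):
    """Post-process: put the original spans back in place of placeholders."""
    for index, original in enumerate(protected):
        text = text.replace(f"__ITS{index}__", original)
    return text


def fake_mt(text):
    """Stand-in for a pre-existing MT system; it only uppercases the text."""
    return text.upper()


def translate_with_its(html, mt=fake_mt):
    masked, protected = protect(html)
    return restore(mt(masked), protected)


if __name__ == "__main__":
    source = 'Press <span translate="no">Ctrl+S</span> to save.'
    print(translate_with_its(source))
```

The MT system itself is untouched: only the text handed to it and the text received from it are rewritten, which is the point the slide makes about wrapper scripts.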

W3C ITS in the Okapi Framework
– Open-source and cross-platform set of libraries and tools for building localization processes.
– Offers ITS support for XML, HTML5 and XLIFF, as well as in many components: Quality Check, Term Extraction, Microsoft Batch Translation, Enrycher, LanguageTool, etc.
– Makes adoption of ITS easy for developers and immediate for users of Okapi’s tools.
– Work continues after the MLW-LT project.

ITS and XLIFF in a full roundtrip test bed
[Diagram. Components: Source CMS; Target CMS; RDF provenance store; Named Entity Recogniser; Term Annotator; Web-based PE; MT (Matrex, Bing, M4LOC); CAT; XLIFF store; Parse, filter, segment; XLIFF/PROV-O QA viewer; Workflow Management; Services Brokers (MT, TA, CAT, …). Interface labels: ITS + HTML5 + CMIS; ITS + XLIFF 1.2 & 2.0; ITS + XLIFF; ITS + SPARQL.]

ITS 2.0 for Global Intelligent Content
– Linked Data and Multilingual Content Processing
– Multilingual Content Interoperability
– New FP7: FALCON
– New FP7: LIDER
[Diagram: the roundtrip test bed architecture from the previous slide, repeated.]

Preview of ITS 2.0 Metadata in Web Browsers (part of the Multilingual Web-LT Program)
Complex metadata at your fingertips: part of the Work in Context Solution (WICS) from Logrus

(Automatically) process W3C ITS 2.0 enhanced content (2/2)
– Capturing ITS 2.0 metadata: VistaTEC. Presenter: Phil Ritchie, separate slot
– Localization CMS / TMS / MT integration: Linguaserve. Presenter: Pedro Díez Orzas, separate slot

What will or may come next?

What will or may come next?
– Standardization break: let’s use W3C ITS 2.0 and gather experience!
– Outreach involving ordinary Web (content) developers: “ITS 2.0 for everybody”
– Strengthen the bridge to the Semantic Web, e.g. via ITS2<>NIF conversion (Sebastian Hellmann poster), FALCON (Dave Lewis poster), LIDER (Asunción Gómez Pérez presentation)

What will or may come next?
Further contributions to the development of multilingual services and data analytics technologies: a long and open list of ideas
– Mining provenance information for business analytics
– “Terminology-Translation-Web technology” triangle
– Multilingual technologies for multimedia content
– ...
We are looking for your ideas and thoughts: let’s discuss here at META-FORUM

Implementation Basket
Moderator: Felix Sasaki (DFKI / W3C Fellow)