L10N Standards Warszawa 2014

Presentation on theme: "L10N Standards Warszawa 2014"— Presentation transcript:

1 L10N Standards Warszawa 2014

2 Why Standards?

3 Why have Standards?

4 L10N Standards
What are we going to cover:
1. Why L10N standards are important
2. The role XML has to play
3. Key L10N data standards
4. How to leverage L10N standards
5. Creating a totally data driven automated L10N process
6. Interoperability

5 Why have Standards?

6 Current State of Art

7 L10N Typical Workflow

8 What you need is a better crane!???

9 Localization without Standards
[Workflow diagram. Labels: customer source text, extract, extracted text, TM process, prepared text, translate, translated text, merge, target text, QA]

10 True Cost of Translation

11 Standards = Uniform Data

12 ISO Standard

13 Standards = Efficiency

14 Standards = Lower Costs

15 Standards = Safe to Implement

16 Standards = Greater Interoperability

17 Standards: Unforeseen Benefits

18

19 Standards: Misuse

20 Standards: Abuse

21 Standards: Sabotage
Sabotaged Standards:
– Proprietary extensions
– Bad implementations

22

23 The importance of XML
Everything is now XML
– HTML/XHTML
– Web Services
– Adobe FrameMaker
– Microsoft Office
– Open Office
– ASP
– XAML
– Java Properties
– DITA
– Standards: TMX, XLIFF, SRX, GMX, TBX, xml:tm
– OAXAL: Open Architecture for XML Authoring and Localization

24 The power of XML
Any electronic format not in XML can be converted to XML
– FrameMaker
– RTF
– Microsoft Office pre 2007
– QuarkXPress
– Windows resource files
– Java resources
– PO/POT
– YAML
– Etc.
And then back into the original format
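
The round trip described on this slide can be sketched for one of the listed formats, Java resources. This is only an illustration, not a production filter: the element names `resources` and `entry` are invented for the example, and only plain `key=value` lines are handled (no escapes, comments, or continuation lines).

```python
import xml.etree.ElementTree as ET


def properties_to_xml(text: str) -> str:
    """Convert simple key=value lines into a hypothetical XML vocabulary."""
    root = ET.Element("resources")  # element names are assumptions
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition("=")
        entry = ET.SubElement(root, "entry", {"key": key.strip()})
        entry.text = value.strip()
    return ET.tostring(root, encoding="unicode")


def xml_to_properties(xml_text: str) -> str:
    """Convert the XML produced above back into key=value lines."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{e.get('key')}={e.text or ''}" for e in root.iter("entry"))


source = "greeting=Hello\nfarewell=Goodbye"
as_xml = properties_to_xml(source)
round_tripped = xml_to_properties(as_xml)
```

Once the content is in XML, the same extraction and translation memory tooling can be applied to it as to any other XML document, which is the point the slide makes.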

25 Benefits of XML for L10N
Separation of form and content
Should make documents easier to translate
There are some critical design decisions
Mistakes can hinder translatability
XML can bootstrap its own localization

26 The significance of XML
XML is not just another electronic format
XML is an eXtensible syntax
XML is a formal IT grammar
XML is programmable
XML can bootstrap its own localization

27 Benefits of XML for L10N
Why use XML for Localization?
– Most localizable documents are now in XML
– One input format
– Elegant
– Uses the latest IT technology
– Separation of source and content
– One single data bus
– Open Standards based
– You can use XML to assist its own localization
– One extraction + TM + SMT engine

28 Core L10N Standards
– W3C ITS Document Rules
– ETSI LIS SRX
– ETSI LIS xml:tm
– ETSI LIS TMX
– ETSI LIS TBX
– ETSI LIS GMX
– OASIS XLIFF
– W3C/OASIS DITA (XHTML, DocBook, or any XML Vocabulary)
– Linport Interoperability: TIPP, XLIFF:doc

29 ITS
Internationalization and Localization Tag Set – http://www.w3.org/International/its
Internationalization Tag Set – Document Rules for a given XML vocabulary:
– Inline elements (within text)
– Sub flows
– Non-translatable
– Translatable attributes
Guidelines for localizing XML documents
Internationalization and Localization Markup Requirements
Version 1.0, 2008
Version 2.0, 2013
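
As an illustration of document rules, the sketch below applies an ITS `translateRule` to a small document to find the non-translatable elements. The `its:rules` and `its:translateRule` elements and the ITS namespace come from the W3C specification; the sample document, the function name, and the handling of selectors are assumptions made for the example. Real ITS selectors are full XPath expressions, whereas this sketch only copes with simple `//name` selectors because it relies on the limited XPath support in Python's standard library.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# A rules document marking every <code> element as non-translatable.
RULES = f"""
<its:rules xmlns:its="{ITS_NS}" version="2.0">
  <its:translateRule selector="//code" translate="no"/>
</its:rules>
"""

# Hypothetical source document in some XML vocabulary.
DOC = "<doc><p>Press <code>Ctrl+S</code> to save.</p><p>Done.</p></doc>"


def non_translatable(doc_root, rules_root):
    """Return the elements that translateRule entries mark translate="no"."""
    blocked = []
    for rule in rules_root.findall(f"{{{ITS_NS}}}translateRule"):
        if rule.get("translate") == "no":
            # ElementTree only accepts relative paths, so "//code" becomes ".//code".
            blocked.extend(doc_root.findall("." + rule.get("selector")))
    return blocked


doc_root = ET.fromstring(DOC)
blocked = non_translatable(doc_root, ET.fromstring(RULES))
blocked_text = [el.text for el in blocked]
```

An extraction step would then leave the text of the blocked elements out of the material sent for translation.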

30 TMX
http://www.etsi.org/deliver/etsi_gs/lis/001_099/002/01.04.02_60/gs_lis002v010402p.pdf
Translation Memory Exchange
Current version 1.4b, 2.0 undergoing review
Allows for the interchange of translation memories between different vendor systems
– No translation vendor lock-in
– Free exchange of translation assets
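
To make the interchange idea concrete, here is a rough sketch that writes a tiny translation memory as TMX and reads it back, as a second tool might. The element and attribute names (`tmx`, `header`, `body`, `tu`, `tuv`, `seg`, `xml:lang`) follow TMX 1.4; the header values, the function names, and the sample segments are placeholders chosen for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, srclang="en", tgtlang="pl"):
    """Build a minimal TMX tree from (source, target) segment pairs."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "example-tool",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": srclang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((srclang, source), (tgtlang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


def read_tmx(xml_text):
    """Read a TMX string back into a list of {language: segment} dicts."""
    root = ET.fromstring(xml_text)
    return [
        {tuv.get(XML_LANG): tuv.find("seg").text for tuv in tu.iter("tuv")}
        for tu in root.iter("tu")
    ]


serialized = ET.tostring(build_tmx([("Hello", "Cześć")]), encoding="unicode")
units = read_tmx(serialized)
```

This corresponds to Level 1 (plain text only); inline markup inside `seg` is not handled.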

31 TMX History
First LISA OSCAR Standard
– Version 1.1 1998
– Version 1.2 1999
– Version 1.3 2001
– Version 1.4b 2002
Moved to ETSI/LIS 2012
– Version 2.0 2014?
Two levels of implementation:
– Level 1 (Plain Text Only)
– Level 2 (Content Markup)

32 SRX
http://www.gala-global.org/oscarStandards/srx/srx20.html
Segmentation Rules Exchange
Current version 2.0 2008
How sentences are segmented
Allows for the exchange of segmentation rules using regular expressions
Complements TMX standard
Quoted by XLIFF, TMX and xml:tm

33 SRX Key Concepts
Unicode Regular expression syntax defined
Meta characters
– Unicode regular expressions: "\X", "\s", "\S" etc.
Operators
– "*", "|", "?", "+" etc.
Defines:
– Language rules: segmentation rules
– Map rules: how to apply the segmentation rules
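
The two concepts on this slide can be sketched as follows. Each rule is a pair of regular expressions (text before and after a candidate break) plus a break/no-break flag, and a language map selects which rule sets apply to a given language code. The rule contents, the names `LANGUAGE_RULES` and `LANGUAGE_MAP`, and the cascading behaviour are assumptions for the example; Python's `re` module is used in place of the regular expression dialect SRX specifies.

```python
import re

# Language rules: (is_break, before_break_regex, after_break_regex).
LANGUAGE_RULES = {
    "English": [(False, r"\b(?:Mr|Dr|etc)\.", r"\s")],  # exception: no break after abbreviations
    "Default": [(True, r"[.?!]+", r"\s+")],             # break after sentence-final punctuation
}

# Map rules: which language rules apply to which language codes, in order.
LANGUAGE_MAP = [(r"en.*", "English"), (r".*", "Default")]


def rules_for(lang):
    """Collect the rules of every language rule whose pattern matches lang."""
    rules = []
    for pattern, name in LANGUAGE_MAP:
        if re.fullmatch(pattern, lang):
            rules.extend(LANGUAGE_RULES[name])
    return rules


def segment(text, rules):
    """Split text at positions where the first matching rule is a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            if re.search(f"(?:{before})$", text[:pos]) and re.match(after, text[pos:]):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # the first matching rule decides this position
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]


english = segment("Dr. Smith arrived. He sat down.", rules_for("en"))
other = segment("Dr. Smith arrived. He sat down.", rules_for("pl"))
```

With the English exception in place the text splits into two sentences; without it, the abbreviation "Dr." is wrongly treated as a sentence end.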

34 GMX
http://docbox.etsi.org/ISG/Open/ISGLIS/GMX-V/GMX-V/GMX-V-2.0.html
Global Information Management Metrics eXchange
GMX/V Approved LISA OSCAR Standard February 2007
Tripartite
– GMX-V : Volume, published for public comment
– GMX-C : Complexity, initial specification
– GMX-Q : Quality
Standard for defining a L10N job
Allows for quantifying job complexity
GMX/V 2.0 Approved ETSI LIS
– added support for CJK word counts
– overall character count including white space characters

35 GMX-V
GIM Metrics eXchange – Volume
Objectives:
– Unambiguous and verifiable definition of word and character counts
– A method of exchanging counts within an XML framework
Two types of count:
– Verifiable, based on electronic documents
– Non-verifiable
Canonical form: XLIFF based
Word boundaries: Unicode TR29
Unicode character encoding
Minimum conformance
– Total Character Count
– Total Word Count
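
The minimum conformance counts can be approximated as below. This is a simplification, not the GMX-V algorithm: the standard defines its counts on an XLIFF-based canonical form and uses Unicode TR29 word boundaries, while this sketch works on a plain string and treats every run of word characters as one word. The function name and the dictionary keys are made up for the example.

```python
import re


def count_metrics(text):
    """Return rough word and character counts for a plain text string."""
    words = re.findall(r"\w+", text)  # approximation of word boundaries
    return {
        "words": len(words),
        # characters excluding white space
        "characters": sum(1 for ch in text if not ch.isspace()),
        # overall count including white space
        "characters_with_whitespace": len(text),
    }


metrics = count_metrics("Standards lower costs.")
```

Because every tool applies the same definition, two parties counting the same file should arrive at the same figures, which is what makes the count verifiable.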

36 XLIFF
http://www.oasis-open.org/committees/xliff
XLIFF – XML Localization Interchange File Format
Current status
– XLIFF 1.1 Committee Specification (31 Oct 2003)
– XLIFF 1.2 Approved as an OASIS Standard 2008
Segmentation support
(X)HTML XLIFF 1.1 Representation Guide
PO / POT XLIFF 1.1 Representation Guide
Java / Windows / .Net Representation Guide
– XLIFF 2.0 currently out for public comment (not backwards compatible)

37 XLIFF

38 XLIFF
Single format for exchanging L10N data from disparate sources
Loss-less
Tool-neutral
Formalized as an XML vocabulary
Can embed skeleton file
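
A small sketch of the exchange role: the code below takes an XLIFF 1.2 file containing source text only, adds translations as `target` elements, and reads them back. The XLIFF element names and namespace are those of XLIFF 1.2; the file name, unit ids, translations, and the `add_targets` helper are invented for the example, and no skeleton handling is shown.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Minimal XLIFF 1.2 document with two untranslated units.
XLIFF = f"""<xliff version="1.2" xmlns="{XLIFF_NS}">
  <file original="hello.txt" source-language="en" target-language="pl" datatype="plaintext">
    <body>
      <trans-unit id="greeting"><source>Hello</source></trans-unit>
      <trans-unit id="farewell"><source>Goodbye</source></trans-unit>
    </body>
  </file>
</xliff>"""


def add_targets(root, translations):
    """Attach a <target> to each trans-unit that has a translation."""
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        text = translations.get(unit.get("id"))
        if text is not None:
            target = ET.SubElement(unit, f"{{{XLIFF_NS}}}target")
            target.text = text
    return root


root = add_targets(
    ET.fromstring(XLIFF),
    {"greeting": "Cześć", "farewell": "Do widzenia"},
)
targets = {
    unit.get("id"): unit.find(f"{{{XLIFF_NS}}}target").text
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit")
}
```

Any tool that understands the vocabulary can fill in or consume the targets, regardless of the format the text was originally extracted from.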

39 xml:tm
http://www.xtm-intl.com/manuals/xml-tm/xml-tm2.0.html
XML based Text Memory
– Radical rethink of how to handle Translation Memory
– Donated by XML INTL to LISA OSCAR
– OSCAR Standard Feb 2007
– Adopted by ETSI LIS, version 2.0 ready for adoption
Takes the DITA reuse principle down to sentence level
– Author Memory
– Translation Memory

40 xml:tm - Namespace
Namespace is a major feature of XML
Allows the mapping of different ontological entities onto the same representation
Allows different ways to look at the same data
Namespaces can be made transparent

41 xml:tm
XML based text memory
Revolutionary approach to translating XML documents
First significant advance in translation memory technology
Uses XML namespace to transparently embed contextual information
The one ring that binds them all

42 xml:tm namespace Example of the use of tm namespace in an XML document: Namespace is very flexible. It is very easy to use.
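
The markup of the slide's example did not survive in this transcript, so the following is a reconstruction under assumptions rather than the original. It wraps the two example sentences in text unit elements from a tm namespace and shows that the plain text of the paragraph is unchanged, which is what makes the namespace transparent. The namespace URI, the `te` and `tu` element names, and the id values are assumptions based on the terms used on the surrounding slides.

```python
import xml.etree.ElementTree as ET

TM_NS = "urn:xmlintl-tm-tags"  # assumed namespace URI for xml:tm

# Hypothetical document: one paragraph whose sentences are wrapped in tm:tu.
DOC = (
    f'<doc xmlns:tm="{TM_NS}"><para><tm:te id="e1">'
    '<tm:tu id="u1.1">Namespace is very flexible.</tm:tu> '
    '<tm:tu id="u1.2">It is very easy to use.</tm:tu>'
    "</tm:te></para></doc>"
)

root = ET.fromstring(DOC)

# tm namespace view: the individually addressable text units.
text_units = {tu.get("id"): tu.text for tu in root.iter(f"{{{TM_NS}}}tu")}

# Source document view: the paragraph text, ignoring the tm markup.
plain_text = "".join(root.find("para").itertext())
```

A tool that does not know the tm namespace sees only the paragraph, while a text memory tool sees identified sentences it can track across revisions.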

43 xml:tm namespace
[Diagram: two views of the same source document. Source document tm namespace view, node labels: doc, title, section, para, tm, te, tu, sentence. Source document view, node labels: doc, title, section, para, text.]

44 xml:tm Text Memory
Author memory
– Maintain memory of source text
– Authoring statistics
– Authoring tool input
Translation memory
– Automatic alignment
– Maintain perfect link of source and target text
– Reduce translation costs

45 xml:tm DOM differencing
Original Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="5", tu id="6"
Updated Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8"
Labels in the diagram: deleted, modified, new
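
The idea behind the diagram can be approximated with a sequence diff over the text units: unchanged units keep their ids, changed or added units receive new ids, and ids that disappear are reported as deleted. This is only a sketch using Python's `difflib` on lists of strings, not the DOM differencing that xml:tm itself performs; the function name, status labels, and sample texts are assumptions.

```python
from difflib import SequenceMatcher


def diff_units(original, updated):
    """Compare text units.

    original: list of (id, text) pairs from the previous document version.
    updated:  list of texts from the new version.
    Returns (units, deleted_ids), where units is a list of (id, text, status).
    """
    old_texts = [text for _, text in original]
    next_id = max(unit_id for unit_id, _ in original) + 1
    units, deleted_ids = [], []
    matcher = SequenceMatcher(a=old_texts, b=updated, autojunk=False)
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            # Unchanged text keeps its existing id.
            for offset in range(b2 - b1):
                units.append((original[a1 + offset][0], updated[b1 + offset], "unchanged"))
        else:
            # Old units in this range are gone; new texts get fresh ids.
            deleted_ids.extend(unit_id for unit_id, _ in original[a1:a2])
            for text in updated[b1:b2]:
                units.append((next_id, text, "new or modified"))
                next_id += 1
    return units, deleted_ids


original = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F")]
updated = ["A", "B", "C", "D", "E changed", "F", "G"]
units, deleted_ids = diff_units(original, updated)
unit_ids = [unit_id for unit_id, _, _ in units]
```

Units that keep their ids need no new translation work, so only the new or modified units have to be sent to the translator.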

46 xml:tm translated document in Polish
[Diagram: two views of the same translated document. Translated document tm namespace view, node labels: doc, title, section, para, tm, te, tu, zdanie. Translated document view, node labels: doc, title, section, para, tekst.]

47 Putting It All Together

48 Open Architecture for XML Authoring and Localization (OAXAL)
– http://wiki.oasis-open.org/oaxal/FrontPage

49 OAXAL 2.0

50

51 OAXAL Benefits
– SOA (Service Oriented Architecture)
– Open Architecture
– Open Standards - Open APIs
– Easy Exchange
– Modular design
– Interoperability
– Very high level of automation

52 Interoperability Now!/Linport
Interoperability Now!
http://www.interoperability-now.org/
Born out of frustration and necessity
Early 2012
Members
– Bioloom Group
– Kilgray
– Medtronic
– Ontram
– Spartan Software
– XTM-INTL
The goal: True 100% roundtrip interoperability between TMS/CAT tools
Now part of Linport

53 Interoperability Now!/Linport
Linport
http://www.linport.org/
LINPort: Language INteroperability Portfolio
Created in 2012 by the merging of two initiatives:
– Multilingual Electronic Dossier
– The Container Project
Sponsored by:
– the European Union DG Translation
– JIAMCATT (http://jiamcatt.org/): Joint Inter-Agency Meeting on Computer-Assisted Translation and Terminology

54 OAXAL in Action

55 Translating English Soccer Articles into Arabic 24x7

56

57 Browser-Based Workbench

58 OAXAL In Action

59 Contact details: Andrzej Zydroń azydron@xtm-intl.com http://www.xtm-intl.com

