XLIFF - the XML based Open Standard for Localisable Content
Tony Jewtushenko, Oracle Corporation - Principal Product Manager; Chair – OASIS XLIFF TC


Slide 1: XLIFF - the XML based Open Standard for Localisable Content
The XML Localisation Interchange File Format
Tony Jewtushenko, Oracle Corporation - Principal Product Manager; Chair – OASIS XLIFF TC

Slide 2: Agenda
– Open Standards: definition and process
– Overview of XLIFF: definition, goals, and benefits of XLIFF; architecture and main features of XLIFF; use cases
– Open Source Localisation: technical overview; process overview; use case
– Where does XLIFF fit? Tools support for XLIFF; XLIFF adoption by the Open Source community

Slide 3: What is an Open Standard?
Open standards are:
– Publicly available in stable, persistent versions
– Developed and approved under a published process
– Open to public input: public comments, public archives, no NDAs
– Subject to explicit, disclosed IPR terms
See the US, EU, WTO governmental & treaty definitions of “standards”. Anything else is proprietary.
Source: “Relationship Between Open Standards and Open Source Software”, Patrick Gannon – CEO OASIS, Open Source in Government, Washington, DC, 15-17 March 2004

Slide 4: OASIS: Standards Body Home of XLIFF
OASIS: Organization for the Advancement of Structured Information Standards
– World’s largest independent, non-profit organization dedicated to the standardisation of eBusiness specifications
– More than 150 member companies plus individuals
– Operates XML.ORG Registry, the open community clearinghouse of XML application schemas
– Technical work on XML interoperability includes XML conformance and XML Registries/Repositories
– General XML and eBusiness technical resource

Slide 5: OASIS Standards Process
Specifications are created under an open, democratic, vendor-neutral process:
– Anyone may participate
– No single organisation can dictate the specification; specifications must meet everyone’s needs
– All discussions are open to public view and comment
Two-tiered specification approval process:
– Committee Draft approved by Technical Committee
– OASIS members approve specification as OASIS Standard
The process guarantees that specifications are created by a broad range of industry, not just a single vendor.

Slide 6: XLIFF Overview
A glance at the definitions, goals and benefits of the XML Localisation Interchange File Format.

Slide 7: What is XLIFF?
A specification for the lossless interchange of localizable data and its related information, which is tool-neutral, has been formalized as an XML vocabulary, and features an extensibility mechanism.

Slide 8: Why Is XLIFF Needed?
Localization poses the following challenges:
– Insufficient interoperability between tools.
– Lack of support for the overall localization workflow.
– Localization tool developers must deal with many formats.
– Large number of proprietary intermediate formats.

Slide 9: Advantages – Technology (1/2)
– For a given utility, only one implementation is necessary (e.g. not one spell checker for PO files, and another one for HTML).
– Increases usability of utilities (i.e. all formats with XLIFF filters can be used with XLIFF-enabled utilities).
– Can contain either UI or document content.
– Metadata provides integration with automated workflow.

Slide 10: Advantages – Technology (2/2)
All advantages of XML-based processing:
– Content validation (XSD).
– Use of its internationalization features.
– Better interoperability and cross-platform support.
– Powerful rendering options (XSL-FO, CSS).
– Powerful transformation options (XSLT).
– Greater integration with Web services.
Access to existing, and often open-source, XML implementations.

Slide 11: XLIFF Timeline
– September 2000: DataDefinition kickoff
– December 2000: first face to face
– March 2001: second face to face
– End March 2001: draft 1.0 spec and DTD published
– June 2001: White Paper published
– December 2001: OASIS XLIFF Technical Committee Proposal submitted
– April 2002: XLIFF 1.0 Specification approved by formal vote as an OASIS Committee Specification
– May 2003: XLIFF 1.1 Specification approved by formal vote as an OASIS Committee Specification
– August/September 2003: XLIFF 1.1 Peer Review
– November 2003: Revised XLIFF 1.1 Specification approved as OASIS Committee Specification
– November 2003: XLIFF 1.1 Specification submitted for public review

Slide 12: Drivers Behind XLIFF
Alchemy Software, Bowne Global Solutions, Convey Software, Ektron Inc, ENLASO Corp (RWS), Globalsight, HP, Lotus/IBM, Lionbridge, LRC, Moravia IT, Novell, Oracle, PASS Engineering, Microsoft, SAP, SDL International, Sun Microsystems, Tektronix, TRADOS, XML-Intl

Slide 13: XLIFF TC in the Standards Community
Shared interests with the OASIS Translation Web Services Technical Committee:
– XLIFF may be used as data container for WS.
Shared interests with the OSCAR SIG at LISA:
– Segmentation and word-count.
– Content markup (inline codes).
Shared interests with the W3C i18n WG:
– Localization directives.
– Best practices.
– Localization aspects of the W3C recommendations.
– Web services.

Slide 14: Architecture
A look at XLIFF’s main features and how they work together.

Slide 15: Extract-Localize-Merge Paradigm
– Separate data related to localization from parts not related to localization.
– Merge translated data with codes at the end of the process to create the final document.
– The skeleton file is optional, so this paradigm is also optional.
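The extract and merge steps can be sketched roughly as follows. This is a minimal illustration, not the method of any particular XLIFF tool: the function names `extract` and `merge`, the `%%key%%` placeholder convention for the skeleton, and the file name `messages.properties` are all assumptions made for the example, and the XLIFF namespace is left out to keep the lookups short.

```python
import xml.etree.ElementTree as ET


def extract(properties_text, source_lang="en", target_lang="fr"):
    """Split key=value text into an XLIFF-like tree and a skeleton string."""
    xliff = ET.Element("xliff", version="1.1")
    file_el = ET.SubElement(xliff, "file", {
        "original": "messages.properties",  # hypothetical file name
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "javapropertyresourcebundle",
    })
    body = ET.SubElement(file_el, "body")
    skeleton = []
    for line in properties_text.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            key, value = line.split("=", 1)
            key = key.strip()
            unit = ET.SubElement(body, "trans-unit", id=key)
            ET.SubElement(unit, "source").text = value.strip()
            # The skeleton keeps the non-localizable part plus a placeholder.
            skeleton.append(f"{key}=%%{key}%%")
        else:
            skeleton.append(line)  # comments and blank lines stay as they are
    return xliff, "\n".join(skeleton)


def merge(xliff, skeleton):
    """Put target text (or source text, if untranslated) back into the skeleton."""
    for unit in xliff.iter("trans-unit"):
        target = unit.find("target")
        chosen = target if target is not None else unit.find("source")
        skeleton = skeleton.replace(f"%%{unit.get('id')}%%", chosen.text or "")
    return skeleton
```

A translator (or tool) only touches the `trans-unit` elements; everything that is not localizable lives in the skeleton and is reattached by `merge`.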

Slide 16: A Bird's-Eye View
An XLIFF document can capture anything needed for a localization project:
1. Localizable objects (e.g. text strings) in source and target languages.
2. Supplementary information (e.g. glossaries, or material to recreate the original format).
3. Administrative information (e.g. workflow data).
4. Custom data (e.g. initialization information for tools).

Slide 17: The XLIFF Document
– An XLIFF document is designed to store the extracted data related to localization.
– Each given source container (e.g. a file, a database table, and so forth) corresponds to a <file> element in XLIFF.
– Each XLIFF document can include several <file> elements.
– A whole localization project can possibly be stored in a single XLIFF document.

Slide 18: Bilingual Model
– Each <file> element is designed to store one source language and one target language.
– The rationale is that translation into different target languages is done by different people most of the time.
– However, languages in the <alt-trans> element can be different. For example, proposed matches in European Portuguese when translating into Brazilian Portuguese.
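As a hedged illustration of the bilingual model, the sketch below parses a small hand-written XLIFF 1.1 fragment whose file targets `pt-BR` while a proposed match carries `xml:lang="pt-PT"`. The sample content, the file name `app.properties`, and the helper name `cross_language_proposals` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample; not taken from the presentation.
SAMPLE = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1">
  <file original="app.properties" source-language="en" target-language="pt-BR"
        datatype="javapropertyresourcebundle">
    <body>
      <trans-unit id="menu.file">
        <source>File</source>
        <alt-trans match-quality="100%">
          <source>File</source>
          <target xml:lang="pt-PT">Ficheiro</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def cross_language_proposals(xliff_text):
    """List alt-trans targets whose language differs from the file's target language."""
    root = ET.fromstring(xliff_text)
    found = []
    for file_el in root.iter(f"{{{XLIFF_NS}}}file"):
        file_target = file_el.get("target-language")
        for alt in file_el.iter(f"{{{XLIFF_NS}}}alt-trans"):
            target = alt.find(f"{{{XLIFF_NS}}}target")
            if target is None:
                continue
            lang = target.get(XML_LANG)
            if lang and lang != file_target:
                found.append((lang, target.text))
    return found
```

Each `<file>` still has exactly one source and one target language; only the proposals are allowed to come from another locale.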

Slide 19: Localizable Objects
– XLIFF allows not only text strings as localizable objects but also other object types such as graphics.
– Supplementary information can be represented in a generic way through inline codes (e.g. formatting of text).
– Relationships between objects can be captured (e.g. all items in a menu).

Slide 20: An XLIFF Snippet
A simple menu represented as XLIFF
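The transcript does not carry the snippet itself, so the following is only a plausible sketch of a simple menu in XLIFF 1.1, wrapped in a few lines of Python that read it back. The ids, the file name `menu.rc`, the French targets, and the helper `menu_items` are illustrative assumptions; the `<group>` element is what ties the menu items together.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.1"}

# Hypothetical menu; not the snippet shown on the original slide.
MENU_XLIFF = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1">
  <file original="menu.rc" source-language="en" target-language="fr" datatype="winres">
    <body>
      <group id="file_menu" restype="menu">
        <trans-unit id="file_open" restype="menuitem">
          <source>Open</source>
          <target>Ouvrir</target>
        </trans-unit>
        <trans-unit id="file_save" restype="menuitem">
          <source>Save</source>
          <target>Enregistrer</target>
        </trans-unit>
        <trans-unit id="file_exit" restype="menuitem">
          <source>Exit</source>
          <target>Quitter</target>
        </trans-unit>
      </group>
    </body>
  </file>
</xliff>"""


def menu_items(xliff_text):
    """Return (id, source, target) for every item inside a menu group."""
    root = ET.fromstring(xliff_text)
    items = []
    for group in root.iterfind(".//x:group[@restype='menu']", NS):
        for unit in group.findall("x:trans-unit", NS):
            items.append((
                unit.get("id"),
                unit.findtext("x:source", namespaces=NS),
                unit.findtext("x:target", namespaces=NS),
            ))
    return items
```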

Slide 21: Supplementary Info
– XLIFF provides “hooks” for storing supplementary information (for example, pointers to the glossaries or translation memories that should be used).
– The supplementary information can be referenced (i.e. reside outside of the document), or embedded within the document.

Slide 22: Administrative Info
XLIFF provides mechanisms for capturing administrative information:
– For relating source material to XLIFF documents.
– For storing workflow data.
– For providing pre-translation entries generated by TM, MT, translation repository.
– For keeping track of changes.

Slide 23: XLIFF 1.1 Custom Data
In XLIFF 1.1, we have the ability to customise XLIFF by extending it via a private namespace:
– Elements
– Attributes
– Attribute values
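A small sketch of what such an extension might look like, and how a consumer can tell standard from custom markup. The namespace `urn:example:acme-l10n`, the `acme:max-pixels` attribute, and the `acme:screenshot` element are invented for illustration and are not part of XLIFF.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"

# The acme namespace, attribute and element are hypothetical extensions.
SAMPLE = """<xliff version="1.1"
       xmlns="urn:oasis:names:tc:xliff:document:1.1"
       xmlns:acme="urn:example:acme-l10n">
  <file original="dialog.rc" source-language="en" datatype="winres">
    <body>
      <trans-unit id="ok_button" acme:max-pixels="80">
        <source>OK</source>
        <acme:screenshot href="ok_button.png"/>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def custom_data(xliff_text):
    """Collect namespaced attributes and child elements that are not XLIFF's own."""
    root = ET.fromstring(xliff_text)
    unit = root.find(f".//{{{XLIFF_NS}}}trans-unit")
    xliff_prefix = f"{{{XLIFF_NS}}}"
    attrs = {
        name: value
        for name, value in unit.attrib.items()
        if name.startswith("{") and not name.startswith(xliff_prefix)
    }
    children = [child.tag for child in unit if not child.tag.startswith(xliff_prefix)]
    return attrs, children
```

A tool that does not know the private namespace can skip these items and still process the standard `source` and `target` content.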

Slide 24: Embedding XLIFF 1.1
– An entire XLIFF document, or part of one, can be embedded in another XML document.
– The host XML is defined by an XML Schema (XSD) that includes an <any> element in the definition of the element where the XLIFF data can be inserted.

Slide 25: Use Cases
XLIFF in the localisation process.

Slide 26: Basic Use Case – without XLIFF
[Diagram] Publisher/Customer Domain: Developer Applications; Native File 1 (e.g., HTML), Native File 2 (e.g., Java Files), Native File 3 (e.g., Java Properties), ... Native File n. Localisation Domain: Tool Resource Filters; Customer Specific Tool(s); Translator.

Slide 27: Basic Use Case – with XLIFF
[Diagram] Publisher/Customer Domain: XLIFF compliant Developer Applications (direct to XLIFF authoring), or non XLIFF compliant Developer Applications with pre-processing of HTML, Java, Properties and RC Data. XLIFF file(s) containing HTML, Java, Properties, etc. translatable resources. Localisation Domain: Translator; XLIFF Compliant Editor.

Slide 28: Automated Localisation with CAT Use Case
[Diagram] Roles: Developer, Localization Engineer, Translator. Steps and components: Generate XLIFF; Pseudo Translate / Test; Defect Report; XLIFF Translation Kit; Translation Repository (100% match); Translation Memory (fuzzy match); Machine Translation (Machine Translate); XLIFF Editor (Translate); Update. Status labels: 0% Translated, Requires Translation, 100% Translated.
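The pre-translation step in this use case (exact matches from a repository, fuzzy matches from a translation memory, everything else sent on for translation) could be sketched along these lines. The function `classify`, the 0.75 threshold, and the use of `difflib` similarity as the fuzzy score are assumptions for illustration; real CAT tools use their own matching algorithms.

```python
from difflib import SequenceMatcher


def classify(source, repository, memory, fuzzy_threshold=0.75):
    """Return (status, proposed_target) for one source string.

    repository: dict of exact source -> approved target
    memory: dict of previously translated source -> target
    """
    if source in repository:
        return "100% match", repository[source]

    best_score, best_target = 0.0, None
    for known_source, known_target in memory.items():
        score = SequenceMatcher(None, source, known_source).ratio()
        if score > best_score:
            best_score, best_target = score, known_target

    if best_score >= fuzzy_threshold:
        return "fuzzy match", best_target
    return "requires translation", None
```

In an XLIFF kit, the exact match would typically become the `<target>`, while a fuzzy proposal would be offered to the translator for review rather than applied directly.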

Slide 29: Open Source Localisation
Issues specific to localising Open Source software.

Slide 30: Open Source Resource Formats
User Assistance (Help):
– DocBook as intermediate container
UI Resources:
– Many different format types, but converge on PO / POT and Java Resource Bundles (.properties & .java)

Slide 31: DocBook
– Formed in 1991
– SGML and XML versions
– Many commercial XML editors optimised for DocBook
– No good Open Source XML editors available.
– GNU converts DocBook to (XML->) PO files, translates, then converts back.
– DocBook converted to HTML dynamically by Yelp Help Browser. To optimise performance, can pre-convert to HTML.

Slide 32: UI Resource Format – Java Resources
ListResourceBundle:
– .java file
– Can contain binary data
– Compiled into class file
PropertyResourceBundle:
– .properties file
– Contains strings only
– Values acquired at runtime
– Requires ISO 8859-1 encoding
– Non 8859-1 characters represented as Unicode escape sequences (i.e., \uxxxx)
– native2ascii used to convert non 8859-1 content
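The escaping that a .properties file needs can be approximated in a few lines. This sketch escapes every non-ASCII character, which is always safe for an ISO 8859-1 file; it is an illustration of the `\uxxxx` convention and is not claimed to reproduce native2ascii's exact behaviour. The function name `escape_for_properties` is made up for the example.

```python
def escape_for_properties(text):
    """Replace non-ASCII characters with \\uXXXX escapes (UTF-16 code units)."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        # Characters outside the Basic Multilingual Plane need two escapes
        # (a surrogate pair), so encode to UTF-16 and emit each 16-bit unit.
        data = ch.encode("utf-16-be")
        for i in range(0, len(data), 2):
            out.append("\\u%04X" % int.from_bytes(data[i:i + 2], "big"))
    return "".join(out)
```

For example, the Polish plural form `plików` becomes `plik\u00F3w` in the resource file.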

Slide 33: UI Resource Format – Java Resources
Localization challenges:
– Each file contains one language/locale pair
– Key / value pairs
– No normalized metadata; comments are often used for ad hoc metadata.

Slide 34: UI Resource Format - PO
PO (Portable Object) files, and POT (templates):
– A “catalog”
– Bilingual model
– Resource bundle accessed by “gettext()”
– Text files
– Utilities available to convert from many resource types to PO (e.g., C, Delphi, Java, Python, etc.)
– Compiled into “MO” files
– Support for plurals
– Limited metadata
– Used by most GNU, GNOME, KDE and other Open Source projects

Slide 35: PO File Syntax
Header:
    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
    msgid ""
    msgstr ""
    "Project-Id-Version: Project Version \n"
    "PO-Revision-Date: YYYY-MM-DD HH:MM+ZZZZ\n"
    "Last-Translator: TranslatorName \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=code\n"
    "Content-Transfer-Encoding: 8bit\n"
    "POT-Creation-Date: \n"
    "Language-Team: \n"
Separator:
    white-space (usually a single new line)
Resource(s):
    # translator-comments
    #. automatic-comments
    #: reference...
    #, flag...
    msgid untranslated-string
    msgstr translated-string
Callouts on the slide: Header, Resource(s), Segment, Metadata, Comments, Separator.
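To make the entry layout concrete, here is a deliberately small reader for a single PO entry. It is a sketch under simplifying assumptions: one-line `msgid` and `msgstr` values, no escape handling, no plural forms. The function names and the sample entry are invented for the example; production code would normally rely on an existing gettext library.

```python
SAMPLE_ENTRY = '''
# Keep this short
#. Shown in the File menu
#: src/menu.c:42
#, c-format
msgid "%s file"
msgstr "%s fichier"
'''


def unquote(value):
    """Strip the surrounding double quotes from a one-line PO string."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def parse_po_entry(block):
    """Sort the lines of one PO entry into comments, metadata and strings."""
    entry = {
        "translator_comments": [],
        "automatic_comments": [],
        "references": [],
        "flags": [],
        "msgid": "",
        "msgstr": "",
    }
    for line in block.strip().splitlines():
        line = line.strip()
        if line.startswith("#."):
            entry["automatic_comments"].append(line[2:].strip())
        elif line.startswith("#:"):
            entry["references"].extend(line[2:].split())
        elif line.startswith("#,"):
            entry["flags"].extend(f.strip() for f in line[2:].split(",") if f.strip())
        elif line.startswith("#"):
            entry["translator_comments"].append(line[1:].strip())
        elif line.startswith("msgid "):
            entry["msgid"] = unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            entry["msgstr"] = unquote(line[len("msgstr "):])
    return entry
```

The comment prefixes are the only metadata channel a PO entry has, which is the limitation the later slides contrast with XLIFF.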

Slide 36: PO File Plural Form
The plural form of a message in the PO file looks like this:
    white-space
    # translator-comments
    #. automatic-comments
    #: reference...
    #, flag...
    msgid untranslated-string-singular
    msgid_plural untranslated-string-plural
    msgstr[0] translated-string-plural-form
    msgstr[1] translated-string-plural-form
    msgstr[n] translated-string-plural-form
The number of forms (“n”) is language specific.

Slide 37: PO File Plural Forms Syntax / Examples
Syntax:
    msgid untranslated-string-singular
    msgid_plural untranslated-string-plural
    msgstr[0] translated-string-plural-form
    msgstr[1] translated-string-plural-form
    msgstr[n] translated-string-plural-form
French:
    msgid "%s file"
    msgid_plural "%s files"
    msgstr[0] "%s fichier"
    msgstr[1] "%s fichiers"
Polish:
    msgid "%s file"
    msgid_plural "%s files"
    msgstr[0] "%s plik"
    msgstr[1] "%s pliki"
    msgstr[2] "%s plików"
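How a plural index is chosen for the two example languages can be sketched as below. The rules follow the plural expressions commonly used in gettext catalogs for French and Polish; the dictionary layout and the function `translate_count` are illustrative assumptions, not gettext's API.

```python
def plural_index_fr(n):
    # French: form 0 for 0 and 1, form 1 otherwise.
    return int(n > 1)


def plural_index_pl(n):
    # Polish: 1 -> form 0; numbers ending in 2-4 (except 12-14) -> form 1; else form 2.
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2


CATALOG = {
    "fr": (plural_index_fr, ["%s fichier", "%s fichiers"]),
    "pl": (plural_index_pl, ["%s plik", "%s pliki", "%s plików"]),
}


def translate_count(lang, n):
    """Pick the msgstr[i] for n and fill in the number."""
    rule, forms = CATALOG[lang]
    return forms[rule(n)] % n
```

The point of the example is that the number of `msgstr[n]` slots and the rule that selects among them both change per language, which is what makes plural handling hard for tools.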

Slide 38: PO File Localization Challenges
Plural forms challenges:
– Rules differ across languages, and implementations differ across platforms.
– PO editing tools (poedit, KBabel) don’t support plural forms well, and recommend using text editors.
Limited normalized metadata.
Little or no context information for translators.
DocBook represented as PO files loses metadata.
Limited support for segmentation, alignment.

Slide 39: Simplified GNU/KDE Style Use Case
[Diagram] Developer Domain: Documentation Author (DocBook), UI Developer (Generate PO Files), DocBook/PO converter, CVS. Localisation Domain: i18n Coordinator (PO Preparation & Project Management), CVSUP, Translator (PO Text Editor, PO Editor), Translation TM, PO/DocBook converter.

Slide 40: Open Source Localisation Process
– Localization in the Open Source community is very technical, and almost entirely manual; the primary interface is CVS, even for translators (e.g., http://i18n.kde.org/translation-howto/index.html).
– Process and tools differ from project to project, even language to language.
– Little or no formal linguistic review: quality and style consistency vary widely.
– Project management and translation are performed by volunteers.

Slide 41: Tools Support
A survey of localization tools that support XLIFF

Slide 42: XML-Enabled Translation Tools
– Any XML-enabled translation tool can work with an XLIFF document, as long as the text to translate is initially copied into the <target> elements.
– However, this does not mean the tool supports all XLIFF features; it just permits translation of content.
– Many tools cannot handle conditional translation (for example: translate="no"). In that case, you need to add extra elements temporarily.
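Copying the source text into the target elements, so that a generic XML translation tool has something to overwrite, might look like the following. The helper name `prefill_targets` and the sample document are assumptions for illustration; inline markup inside `<source>` is carried over by the deep copy.

```python
import copy
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.1"

# Hand-written sample: the first unit is translated, the second is not.
SAMPLE = """<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1">
  <file original="menu.rc" source-language="en" target-language="fr" datatype="winres">
    <body>
      <trans-unit id="1"><source>Open</source><target>Ouvrir</target></trans-unit>
      <trans-unit id="2"><source>Save</source></trans-unit>
    </body>
  </file>
</xliff>"""


def prefill_targets(root):
    """Add a <target> copied from <source> wherever a unit has none; return the count."""
    source_tag = f"{{{XLIFF_NS}}}source"
    target_tag = f"{{{XLIFF_NS}}}target"
    filled = 0
    # Materialise the list first so the tree is not modified while iterating.
    for unit in list(root.iter(f"{{{XLIFF_NS}}}trans-unit")):
        source = unit.find(source_tag)
        if source is None or unit.find(target_tag) is not None:
            continue
        target = copy.deepcopy(source)
        target.tag = target_tag
        # Place the new element directly after <source>.
        unit.insert(list(unit).index(source) + 1, target)
        filled += 1
    return filled
```

Existing translations are left untouched, so the step can be run repeatedly on a partially translated file.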

Slide 43: XLIFF Enabled Commercial Tools
– Alchemy Software: Catalyst 5.0, Visual XLIFF 1.1 Editor. http://www.alchemysoftware.ie
– Heartsome: XLIFF Editor, support for PO files, DocBook. http://www.heartsome.net
– PASS: Passolo, Visual XLIFF Editor. http://www.passolo.com
– Trados: no direct XLIFF support yet, but can edit XLIFF files using a modified INI
– XML-Intl: XLIFF Editor. http://www.xml-intl.com

Slide 44: XLIFF Enabled Shareware/Freeware
– ENLASO Corp (formerly “RWS Group”): Extraction Utility for RC Data and Java Properties to XLIFF 1.1. http://dotnet.goglobalnow.net/
– Various freeware utilities, including converters for PO files: http://www.translate.com/shared/tools

Slide 45: XLIFF Enabled Open Source
International Components for Unicode (ICU):
– Open Source set of C/C++ and Java libraries for Unicode support, software internationalization and globalization; extends JDK i18n
– genrb, and XLIFF2ICUConverter class to convert between common formats and XLIFF
– Includes RBManager, a Java based resource bundle editor with XLIFF support
http://oss.software.ibm.com/icu/

Slide 46: XLIFF Enabled Open Source
Okapi Framework XSL Template Collection:
– Sample utilities for transforming XLIFF to PO, RC, Java Properties. http://sourceforge.net/project/showfiles.php?group_id=42949&release_id=67485
xliffRoundTrip tool:
– Transforms any XML file to/from XLIFF using XSLT. http://sourceforge.net/projects/xliffroundtrip/
Lionbridge ForeignDesk:
– Incomplete XLIFF support. http://sourceforge.net/projects/foreigndesk/

Slide 47: Future Support for XLIFF
Announced:
– Apple Corp: Apple’s resource editor AppleGlot
– Idiom: Worldserver V.6.0
– SDL International: SDLX support for XLIFF currently in development. See http://www.sdlx.com for more information.
– uPortal: Open Source Web portal infrastructure for universities; XLIFF support announced for Version 3.0, to be released in 2005

Slide 48: Where does XLIFF fit?
Good choice for projects with multiple resource formats, especially good for XML.
XLIFF addresses the process and metadata related problems of Open Source projects:
– Supports workflow metadata.
– Supports multiple resource formats.
– Normalised translation memory / repository data.
– Simplifies translator usability experience.

Slide 49: Where does XLIFF fit?
Issues blocking adoption by Open Source:
– Adoption requires retooling; lack of existing open source XLIFF tools for PO and DocBook.
– PO tools deemed adequate for current requirements.
– “Volunteer” model reduces urgency to reduce costs.

Slide 50: Where does XLIFF fit?
Issues encouraging adoption by Open Source:
– Increase in commercial product development for Open Source platforms: translation is not a volunteer effort, so cost control is important; integration with existing automation is required; increased availability of commercial tools that support XLIFF.
– Increase in Java Open Source projects: Java projects are well supported by XLIFF; well documented L10n best practices include XLIFF; commercial and Open Source tools are available.

Slide 51: More Information
– The XLIFF TC Web Site: http://www.xliff.org
– A “best practice” from Sun Developer Network: http://developers.sun.com/dev/gadc/technicalpublications/whitepapers/translation_technology_sun.html
– Presenter: XLIFF TC Chair Tony Jewtushenko (Oracle), tony.jewtushenko@oracle.com

Slide 52: Thank You... Questions?

