Localization Enablers Technology Development for Indian Languages (TDIL) Programme Department of Information Technology, Ministry of Communication & Information.


1 Localization Enablers. Technology Development for Indian Languages (TDIL) Programme, Department of Information Technology, Ministry of Communication & Information Technology, Govt. of India. Swaran Lata, Director, slata@mit.gov.in. Elitex-2008, January 17, 2008

2 Globalization of IT

3 Internationalization (I18N): Process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design.
Localization (L10N): Taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold.
GLOBALIZATION: I18N + L10N

4 Focus of National Knowledge Commission of India
• The National Knowledge Commission focuses on the objective of transforming India into a knowledge society.
• It has concentrated on five focus areas of the knowledge paradigm:
  • Access
  • Creation
  • Concepts
  • Application
  • Services
• Information technology applications, services, tools and resources based on natural language processing techniques would be a key enabler for these five focus areas.

5 NLP: CONCEPTS (Semantic Web); APPLICATIONS (multilingual office tools & database); SERVICES [e-Governance: G2C, G2G]; CREATION [multilingual e-content]; ACCESS (information retrieval across languages). Local area portals for globalizing local knowledge: digitize earlier existing communities

6 Localization activities Localized Application Internationalized application to be Localized

7 The Tree of Localization Complexities
Presentation of dates, times, numbers, lists, and other values; collation and sorting; alternate calendars, which may include holidays, work rules, weekday/weekend; currency; tax or regulatory regime; machine translation; optical character recognition; speech technologies; cross-lingual information retrieval; project management; translation memory; translation tools; natural language for text processing (parsing, spell checking, grammar checking, etc.); automatic testing tools; encoding standards; multimodal input device standards; fonts & rendering engines; transliteration & translation guidelines; best practices; case studies; consultancy; showcasing of tools & technologies; parallel corpora; speech corpora; lexical resources; ontologies; dictionaries; thesaurus; reference terminologies; certified localization professionals; PG specialization in localization; PhD programmes; minimizing time lag; benchmarking w.r.t. English version; political sensitivity; pricing issues; testing methodologies; metrics for linguistic testing; certification by Government for linguistic compliance

8 Guidelines for enhancing the Localizability
• Design and develop information and applications in a way that meets the needs of the international user.
• Design that allows for easy localization at the point of need.
• Means to reduce the cost and length of localization.
• Checklists grouped by task, and supported by backup examples and explanations. For example, browser feature applicability charts: which browsers and browser versions support which i18n features (e.g. ruby, bidi, UTF-8, lang attribute, :lang, white-space handling, writing-mode: lr-tb, etc.). This would help us implement pages that use the most up-to-date internationalization features appropriate to our audience without the pain of trial and error (or, perhaps more likely, erring too far on the side of caution).
• Use of constructs in existing markup languages (e.g. (X)HTML) to either enable interoperability in a globalised system or improve the localizability of data. For example, avoid the use of deprecated HTML tags.

9 I18n considerations applicable to document and UI design
• These also include such things as navigation, screen space and layout, implementing graphics, creating source text, designing interoperable systems, choosing and implementing fonts and complex script rendering, multimedia design, handling data format conventions, supplying data for translation/localization, etc.
• For example, standard icons:
  a) Allowing for regional variation, for pointing to a list of (or a link to) country/language site selections.
  b) Text-based approaches can be problematic in two ways:
     – they may not be understood, which is often why you are going to the selection list (e.g. how would the average American find the 'global sites' link on a page in Arabic or Japanese? These are not made-up examples!)
     – they may make the user feel like his/her needs are secondary.
• Separation of localizable data from style sheets and templates, for example using CSS to separate presentation aspects from the content while designing websites.
• Guidelines focusing on content development, DTD design and stylesheet development relating to implementation in XHTML, XML, XSL, XSLT, CSS, XForms, SVG, and other similar specifications.
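The separation of localizable data from templates can be sketched in a few lines of Python. This is only an illustration, not something taken from the slides: the names LAYOUT, CATALOG and render, the message key "welcome" and the sample strings are all hypothetical, and a real project would more likely use an established message-catalog format.

```python
import html
from string import Template

# Presentation: one layout shared by every language (hypothetical markup).
LAYOUT = Template('<p lang="$lang" class="greeting">$message</p>')

# Localizable data: kept apart from the layout so translators never touch markup.
# The strings below are illustrative samples, not taken from the presentation.
CATALOG = {
    "en": {"welcome": "Welcome, $name!"},
    "hi": {"welcome": "$name, आपका स्वागत है!"},
}

def render(lang: str, key: str, **params: str) -> str:
    """Fill the shared layout with the message for `lang`, falling back to English."""
    used = lang if lang in CATALOG else "en"
    text = Template(CATALOG[used][key]).substitute(params)
    # Escape the message so translated text cannot break the markup.
    return LAYOUT.substitute(lang=used, message=html.escape(text))

print(render("hi", "welcome", name="Asha"))
print(render("en", "welcome", name="Asha"))
```

Note that each message is a whole sentence with a named placeholder, so the translator is free to move the name to wherever the target language needs it.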

10 Guidelines for developing internationalized DTDs
• Topics such as: white space handling, use of markup vs. Unicode control characters, use of alternative content or entities for different markets, provision of metadata to describe document structure for localization tools, provision of information about available space and other aspects of content affected by localization, the ability to tag terminology and semantics within content.
• Language tags: RFC 3066 for 'language tagging' in XML and HTML has inherent difficulties in distinguishing between language and dialect, as well as historical variations. The aim is to devise a way of expanding the language tag concept to adequately cover the locale- and script-oriented needs of the localization community, and to incorporate markup to support international script features (such as ruby and Arabic directionality).
• Internationalization tag set:
  a) Develop a set of tags that others could use for creating DTDs.
  b) In the form of a namespace for inclusion in a schema, or simply a partial DTD and set of recommendations.
  c) Methodology for identifying non-translatable content for automatic identification by the localization tool.
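To make the language-tag point concrete, here is a minimal Python sketch that splits a tag into language, script and region. It assumes the subtag shapes used by the later revisions of the language-tag RFCs (a 4-letter script subtag, a 2-letter or 3-digit region subtag) and deliberately ignores variants, extensions, private-use and grandfathered tags. The names LanguageTag and parse_tag are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]

def parse_tag(tag: str) -> LanguageTag:
    """Split a tag such as 'hi-Deva-IN' into language, script and region.

    Simplified sketch: anything after the region (or any subtag that is
    neither a script nor a region) is ignored.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    if not re.fullmatch(r"[a-z]{2,3}", language):
        raise ValueError(f"unsupported primary language subtag: {parts[0]!r}")
    script: Optional[str] = None
    region: Optional[str] = None
    for sub in parts[1:]:
        if script is None and region is None and re.fullmatch(r"[A-Za-z]{4}", sub):
            script = sub.title()      # scripts are conventionally title case
        elif region is None and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", sub):
            region = sub.upper()      # regions are conventionally upper case
        else:
            break                     # variants and extensions: out of scope here
    return LanguageTag(language, script, region)

# The same language written in two scripts in two countries.
print(parse_tag("pa-Guru-IN"))
print(parse_tag("pa-Arab-PK"))
```

The Punjabi pair shows why a language code alone is not enough for localization: the script, not only the language, determines fonts, rendering and text direction.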

11 Internationalized data formats
Time and date formats are just two of many ways in which people represent the same or similar information differently. Other examples include numbers, currencies, temperatures, weights, dimensions, addresses, telephone numbers, personal names, paper sizes, etc. It would be great if there were a way of capturing this information in a non-culturally-specific way, and of rendering it and (more difficult) recognising it automatically in a culture-specific format, that could be used by people implementing web-based communication, be it web page forms or exchange of information between machines. The work involved in this is not trivial, but it is desperately needed. Whether the W3C should attempt to produce this or work with others to achieve it is for discussion, but either way I believe it would be very useful.
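As a rough illustration of the idea, the Python sketch below keeps values in a culture-neutral form (a date object, a plain integer) and applies culture-specific conventions only when rendering. The pattern table and function names are hypothetical and hand-written for this example; production systems would draw such conventions from a maintained locale database rather than a hard-coded dictionary.

```python
from datetime import date

# Hypothetical, hand-written conventions for illustration only.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",
    "en-IN": "%d/%m/%Y",
    "de-DE": "%d.%m.%Y",
    "ja-JP": "%Y/%m/%d",
}

def format_date(value: date, locale_tag: str) -> str:
    """Render a culture-neutral date using the pattern assumed for the locale."""
    return value.strftime(DATE_PATTERNS[locale_tag])

def group_indian(digits: str) -> str:
    """Group digits as 3 on the right, then 2s (lakh/crore style)."""
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return ",".join(groups + [tail])

def format_integer(value: int, locale_tag: str) -> str:
    """Render an integer with the digit grouping assumed for the locale."""
    sign = "-" if value < 0 else ""
    digits = str(abs(value))
    if locale_tag == "en-IN":
        return sign + group_indian(digits)
    grouped = f"{int(digits):,}"               # groups of three with commas
    if locale_tag == "de-DE":
        grouped = grouped.replace(",", ".")    # period as the grouping mark
    return sign + grouped

stored_date = date(2008, 1, 17)   # culture-neutral value
stored_amount = 1234567           # culture-neutral value
for tag in ("en-US", "en-IN", "de-DE"):
    print(tag, format_date(stored_date, tag), format_integer(stored_amount, tag))
```

Rendering is the easier half. Recognising user input such as 01/02/2008 is harder, because the same string is a valid but different date under different conventions, which is why the slide calls recognition "more difficult".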

12 धन्यवाद Thank You

