An Introduction Part II: Enabling Internationalization.


1 An Introduction Part II: Enabling Internationalization

2 License This presentation and its associated materials are licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 License. You may use these materials without obtaining permission from the author. Any materials used or redistributed must contain this notice. [Derivative works may be permitted with permission of the author.] This work is copyright © 2008-2011 by Addison P. Phillips

3 Who is this guy? Globalization Architect, Lab126 (we make the technology behind the Kindle). Chair, W3C Internationalization WG.

4 Internationalization is: the design and development of a product that is enabled for target audiences that vary in culture, region, or language. [W3C] a fundamental architectural approach to software development

5 Related Concepts Localization: creation of a product tailored to a particular target market Translation: process of converting text from one language to another Globalization: unified approach to creating global products, especially those that support multiple geographies simultaneously

6 Mystic Numbering (M4C N7G) Opinions differ on capitalization (C12N); choose from: i18N, I18n, I18N. Very geeky; not very internationalized (I19G?)
I N T E R N A T I O N A L I Z A T I O N: 18 letters sit between the first I and the last N, hence I18N.
Localization=L10N Globalization=G11N Canonicalization=C14N Accessibility=A11Y

7 The Internationalization Approach Gather requirements globally Enable Externalize Customize Test and support globally Localize

8 The Internationalization Approach Enabling—the same code supports multiple regions or cultures. Sometimes called a “global binary”. Externalization—plan for localizability by separating “content” from code. This makes localization for specific languages, regions, or cultures easy, fast, and cheap. Customization—add culturally specific functionality, presentation, or content to an application.

9 A Global Approach Internationalization turns technical problems into business decisions Balance priorities based on real user distribution/requirements – Consider global user population as a whole – Consider specific market requirements on an equal footing – Potential markets for the product

10 Internationalization Myths We (wrote it in Java/C#, used Unicode, etc.), so it is internationalized. We made the assumption that the product would only ever have English screens: all our users understand it anyway. A localized product is internationalized. An internationalized product is slow/slower. It takes longer to write internationalized code. We can’t read the screens/it is too hard to test. We have no intention of localizing, so no need to internationalize. We don’t have any customers there. The users in (some country) never complained, so it must work. This product is 100% fully internationalized. We need special experts. We need an extra development cycle. We need six more months to build it. We need people who speak (language).

11 Internationalization Truths: “Well, it depends…” Generalize designs – Locale independent data structures – Locale sensitive display Externalize cultural or linguistic variations Customize as a last resort

12 Buy In: The Key to Success For internationalization to be a success over time, there must be commitment: – Management – Product Team – Development Team All developers, not a splinter group

13 Addressable Market: Why Do Internationalization?

14 Globalized Product Development Internationalization turns technical problems into business decisions. – Localization: Choose which markets to translate user interface or documentation for with no engineering. – Deployment : Choose whether to serve applications from a single site, cluster of sites, or in each target market. – Development : Add content and features to products as necessary in each target market. – Integration and Interoperability: Servers and products can work together around the world, so customers can truly create “Enterprise” solutions.

15 Development Methodologies Independent of development methodology. Agile? Waterfall? You make the choice. Encompasses the full development cycle: Design, Development, QC, Release, Support

16 The Customization Approach Let’s do it in a separate release. Let’s make a branch for the international customers. Let’s get a special team of people to work on the international release.

17 How That Model Really Looks [Figure: timeline. Main line: 1.0, 1.0a, 2.0 (sexy new features, bug fixes). International branch: 1.0i, merges and fixes, lots more people and cost. Callouts: international release 1.0 functionality gaps; intl users waiting for 2.0i now; lost $ and opportunity; lots of cost to get there.]

18 The Problem with Customization Code forks. (double, triple coding) Lag time for international releases. Non-adoption of localized release. Full regression of every language. Quality or commitment perception. Lack of data exchange between language versions. Difficult to repeat (every version is a repeat) Proliferation of bugs and of support problems. International features are cancelled. Core product still doesn’t work/can’t address similar markets. Loss of market share.

19 ANALYZING AND DEVELOPING A DESIGN Large Animal Pictures

20 The Problem [Figure: “Your Application” with locale-dependent items embedded in it: dates, numbers, images, colors, addresses, local rules, strings. Examples: local rules, regulatory requirements, postal addresses, default bookmark lists, your company’s customer service phone numbers.]

21 The Solution Locale-independent global binary Locale-dependent resources (includes code)

22 Large Animal Pictures [Figure labels: Software Component; Input; I/O; Global Code; Resources; Output]

23 Enterprise Animal Pictures [Figure labels: Your System (Front End, Business Logic, Data Store, Operating Env., API); Unicode; Legacy Encoding; Detect / Convert; Capture Encoding; Unicode Cloud; Unicode Interface; Convert to Legacy; Partner/Content Provider]

24 Internationalization Issues Text Processing – Character encodings, including Unicode, spelling, word breaks, collation, and so on Language – Of the software (localization) – Of solutions built using the software (localizability, data) Locale-affected formats – dates, numbers and the like Regionally-affected formats – names, addresses, currency, and the like Time-related issues – time zone, calendar, holidays, work rules and the like Cultural adaptation – presentation, style, position, color use, and the like Legal requirements – accessibility, SOX, DRM, moderation, security, content, and the like

25 Levels of Enablement Not Enabled Single-Language-at-a-Time (SLAAT) All components run in the same language and encoding environment correctly. Multi-Locale Unicode support; components run in different locales, languages, encodings, and time zones

26 Test Your Assumptions Gender: [ ] Male [×] Female

27 Choose Your Language

28 How is this company doing?

29 ENABLING Making Code Aware of Culture

30 What is “enabling”? Enabled software: adapts the display, processing, validation, storage, and transmission of data according to the cultural, linguistic, and regional needs of the users – Text, Characters, and Encodings – Locale Awareness – Times and Time Zones A “global binary” is a single object-code version that is used in all markets, regardless of localization.

31 Don’t Code What You Think You Know
5/2/7: sometime in February? sometime in May? sometime in 2005?
1.234: more than 1000? less than 2?
4.32.MD: number, time, currency? morning or afternoon?

32 Date Formats
Culture | Format | Example
U.S.A. | mdy, / | 2/16/05
France | dmy, . | 16.2.05
France | dmy, - | 16-2-05
CJKT | ymd, / | 2005/2/16
CJKT | ymd, 年月日 | 2005年2月16日
Japan | e¥md, | 平成17年2月16日
Japan | ¥md, / | 17/2/16

33 Time Formats
U.S.A.: 4:00 p.m.
France: 16.00
Japan: 1600
Japan: ごご 4:00
Korea: 오후 4:32
Thai: 16:32 น.
Albanian: 4.32.MD
Arabic: 04:32 م

34 More Examples Assumptions about date tokens:
USA: Sun, Mon, Tue (3 positions, titlecase)
French: lun. mar. mer. (four positions, lowercase)
Russian: Пн Вт Ср (two positions, Cyrillic)
USA: Jan, Feb, Mar (3 positions, titlecase)
French: janv. févr. mars avr. (variable, 4 or 5 positions, lowercase)
Spanish (Spain): ene, feb, mar (not titlecase)
Spanish (Americas): Ene, Feb, Mar (titlecase)

35 Calendars: What Year Is It? Legal, ceremonial, or popular requirement
Gregorian: 2012
Japan Emperor: 24 Heisei (平成24年)
Thailand (Buddhist): 2555 (Gregorian + 543)
Chinese (traditional): 4704 (lunar)
Hebrew: 5767 תשסו (lunar)
Hijri (Islamic): 1428 (lunar)
Armenian: 1461 (ԹՎ ՌՆԾԶ)
etc. etc. etc.

36 Weekends and Holidays When is the weekend? – Friday is part of the weekend in some countries. Both official and unofficial holidays vary widely in number. Here are a few to watch for: – USA: July 4, MLK, Presidents’ Day, Veterans Day, Flag Day, Columbus Day, Thanksgiving… – Japan: Golden Week – China: New Year’s – Britain: Guy Fawkes Day, Boxing Day – France: Bastille Day – Spain: Reyes Magos

37 Calendar Display

38 Numbers Grouping and decimal separators:
England: 12,345.67
Germany: 12.345,67
Switzerland: 12’345,67
Swiss money: 12’345.67
France: 12 345,67
India: 12,34,567.89
France uses a non-breaking space! India: number of digits in groupings changes!
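The separator rules on this slide do not have to be hand-coded. A minimal sketch, assuming the standard `java.text.NumberFormat` API; the class name `LocaleNumbers` is illustrative, and the exact output for each locale depends on the locale data shipped with the Java runtime:

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocaleNumbers {
    // Formats a value with the grouping and decimal separators of the given locale.
    static String format(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(value);
    }

    public static void main(String[] args) {
        Locale[] locales = {
            Locale.US, Locale.GERMANY, Locale.FRANCE,
            Locale.forLanguageTag("de-CH"), Locale.forLanguageTag("en-IN")
        };
        // Same number, same code path; only the locale argument changes.
        for (Locale locale : locales) {
            System.out.println(locale.toLanguageTag() + ": " + format(1234567.89, locale));
        }
    }
}
```

The calling code never mentions a comma, period, or space, which is the point of the slide.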

39 Lists List delimiters & separators can conflict. French example:
2 345,67, 1 012,34, 45,67 (hard to read)
2 345,67 ; 1 012,34 ; 45,67 (easier to read)
    import java.text.NumberFormat;
    import java.util.Iterator;
    import java.util.List;

    List myNumberList = getList();
    NumberFormat nf = NumberFormat.getInstance();
    StringBuffer buf = new StringBuffer();
    Iterator iter = myNumberList.listIterator();
    while (iter.hasNext()) {
        buf.append(nf.format(((Number) iter.next()).doubleValue()));
        buf.append(", "); // hard-coded list separator: collides with a decimal comma
    }
    System.out.println(buf.toString());

40 Collation (A FANCY WORD FOR “SORTING”)
English: ABC...RSTUVWXYZ
German: AÄB...NOÖ...SßTUÜV…YZ
Swedish/Finnish: AB...STUVWXYZÅÄÖ
Norwegian: AB...VWXYÜZÆØÅ

41 Organizing Information “Alphabet” differences Additional information – for example: yomi ASCII vs. the world Mixed information sets

42 “Should I be writing all of this down…” Wide range of variation Obscure formats Difficult to obtain reliable information on formats Lots of work to implement and maintain Enabling means not having to know (m)any of the details

43 Supporting International Formats Use neutral data structures – Makes code independent of locale – Most data types are locale-neutral: Boolean; String, char; Number classes; Date, Calendar. Encapsulate formatting/validation in a function – Format style chosen dynamically at runtime – Format details don’t have to be specified or researched – APIs know the gory details
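One way to read “encapsulate formatting/validation in a function” is a single parse helper that turns locale-formatted input into a locale-neutral `Number`. The sketch below assumes `java.text.NumberFormat`; the class and method names are hypothetical:

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

class LocaleInput {
    // Parses text using the locale's number rules.
    // Returns null unless the entire string is a valid number in that locale.
    static Number parseNumber(String text, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number result = nf.parse(text, pos);
        if (result == null || pos.getIndex() != text.length()) {
            return null; // nothing parsed, or trailing characters were left over
        }
        return result;
    }
}
```

The caller stores the returned `Number` and never needs to know which separators the user typed.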

44 Essence of Enabling Object to Presentation, Presentation to Object – Integers – Floats – Percents – Currencies – Dates – Times – Durations – Collation (lists) – Weights/measures/sizes – Resources (user interface strings) Locale user presentation

45 Locale an identifier or data structure that allows programmers to access culturally and linguistically affected functionality in a system. Many systems now based on IETF BCP 47; for example JavaScript, Java 7, and CLDR
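As a small illustration of a BCP 47-based locale identifier, here is a sketch using `java.util.Locale` (Java 7 and later); the helper name `describe` is made up for this example:

```java
import java.util.Locale;

class LocaleTags {
    // Breaks a BCP 47 language tag into its language, script, and region subtags.
    static String describe(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return "language=" + locale.getLanguage()
             + " script=" + locale.getScript()
             + " region=" + locale.getCountry();
    }

    public static void main(String[] args) {
        System.out.println(describe("zh-Hans-SG"));
        System.out.println(describe("fr"));
    }
}
```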

46 Complex Types Data structures, APIs, or classes built from basic types must include similar capabilities. – Store data in a locale-neutral or independent format. – Display in a language/regional/culturally sensitive manner – Convert from locale format to locale-neutral or locale-independent storage format.

47 Design Time and Data Structures Identify your own “locale bias” – Field names matter! “Postal Code”, not “ZIP code”. Family Name/Given Name, not First Name/Last Name – Avoid problematic fields: Postal address parsing? Area code? Etc.

48 Currency Currency formatting is usually similar to number formatting. But things can vary widely here, too: – $1,100.00 [USA] – €1 100,00 [France-Euro] – ¥1,100 [Japan] – 1.100$00 Esc. [Portugal, obsolete] – SFr. 1’000.00 [Switzerland] Currency associated with the locale doesn’t always apply. Store the currency type with value. – Use ISO 4217 std. codes (USD, JPY, EUR, RUB) Not always one symbol. Not always two decimal places. $100 + ¥100 = $101 Consider neutral displays!
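“Store the currency type with value” can be sketched as a small value class. This assumes `java.util.Currency` for ISO 4217 codes; the `Money` class itself is hypothetical and deliberately minimal:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;

final class Money {
    final BigDecimal amount;
    final Currency currency;

    Money(String amount, String isoCode) {
        this.currency = Currency.getInstance(isoCode); // rejects unknown ISO 4217 codes
        // Not always two decimal places: USD has 2 minor digits, JPY has 0.
        int digits = Math.max(0, currency.getDefaultFractionDigits());
        this.amount = new BigDecimal(amount).setScale(digits, RoundingMode.HALF_EVEN);
    }

    // Refuses to add amounts in different currencies instead of guessing a rate.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch: " + currency + " vs " + other.currency);
        }
        return new Money(amount.add(other.amount).toPlainString(), currency.getCurrencyCode());
    }

    // Locale-neutral display: plain digits plus the ISO code.
    String neutral() {
        return amount.toPlainString() + " " + currency.getCurrencyCode();
    }
}
```

Because the currency travels with the value, a display layer can later format it for any user locale without guessing what the number meant.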

49 Being Locale Neutral Avoid or reduce locale-affected display to increase portability – Use unambiguous formats, such as ISO 8601-like dates, especially in log files and the like: 2005-04-01 14:17:00 UTC – Use consistent formats (‘user locale’), especially in columns or collections of data
Consistent (one user locale):
Amount | Currency
351,234.56 | USD
102,556.78 | EUR
65,336.00 | JPY
212,345.00 | INR
Inconsistent (each row in its own format):
Amount | Currency
351,234.56 | USD
102 556,78 | EUR
65336 | JPY
2,12,345.00 | INR
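For the log-file case, a sketch of the ISO 8601-like timestamp shown on the slide, assuming the `java.time` API (Java 8 and later); the class name is illustrative:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class LogTimestamps {
    // Fixed pattern, fixed zone, root locale: the output never depends on the machine's settings.
    private static final DateTimeFormatter LOG_FORMAT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss 'UTC'", Locale.ROOT)
                         .withZone(ZoneOffset.UTC);

    static String stamp(Instant instant) {
        return LOG_FORMAT.format(instant);
    }

    public static void main(String[] args) {
        System.out.println(stamp(Instant.now()));
    }
}
```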

50 “The String is the Thing” Text doesn’t get translated on the fly. Don’t use text as an identifier or foreign key. – Use ID Numbers or not-human-readable values instead of requiring text fields to match. – “Intrinsic” data value versus “display” data value. Enumerated values displayed as strings. Use display strings. Displayed: “Accounts Payable”, “pagável de clientes”. Enumerated: ACCOUNTS_PAYABLE
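The intrinsic-versus-display split can be sketched with an enum whose name is the stored value and whose label is looked up per locale. The label table below is a stand-in for real resource files, and the strings are the ones from the slide:

```java
import java.util.Locale;
import java.util.Map;

enum AccountType {
    ACCOUNTS_PAYABLE;

    // Hypothetical in-memory label table; a real product would load these from resources.
    private static final Map<String, Map<AccountType, String>> LABELS = Map.of(
        "en", Map.of(ACCOUNTS_PAYABLE, "Accounts Payable"),
        "pt", Map.of(ACCOUNTS_PAYABLE, "pag\u00e1vel de clientes"));

    // Display value: varies by locale and is never stored or compared.
    String displayName(Locale locale) {
        Map<AccountType, String> labels =
            LABELS.getOrDefault(locale.getLanguage(), LABELS.get("en"));
        return labels.getOrDefault(this, name());
    }
}
```

Databases and APIs exchange `ACCOUNTS_PAYABLE` (the result of `name()`); only the user interface calls `displayName`.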

51 English-like Construction Concatenation – string1 + string2 Pluralization – Dog + “s” = “dogs” This topic will be covered in greater depth in the section on localization.

52 Databases Most databases can only handle one collation sequence per instance or one collation per index. – Remove reliance on alphalists. – Self-collate short lists. – Pre-collate long lists? Example: NLS_SORT controls the way Oracle returns data (collation sequence). – Global environment variable. – Not necessarily under your control. – Indices are built on a predetermined or binary sort.
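“Self-collate short lists” can mean sorting in the application, with the user's locale, after the rows come back from the database. A sketch assuming `java.text.Collator`; the class name is illustrative:

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ShortListSorter {
    // Returns a copy of the list sorted by the locale's collation rules,
    // independent of whatever order the database returned.
    static List<String> sortForDisplay(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(items);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Zebra", "apple", "\u00c9clair");
        System.out.println(sortForDisplay(names, Locale.US));
        // A plain binary sort would place "Zebra" first, because uppercase code points come first.
    }
}
```

This is practical only for lists small enough to hold in memory, which is why the slide treats long lists as a separate question.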

53 Enabling Summary Understand Encodings and Unicode – All text has an encoding! Be Locale-Aware – Create locale-neutral data structures – Separate display from storage

54 IT’S ABOUT TIME Dates, Times, Durations, Calendars a little aside…

55 Observed Time

56 Incremental Time Computed time based on “clock ticks” in an “epoch” – The epochal date is arbitrary. The UNIX epoch is midnight, January 1, 1970, UTC.

57 Field Based Time Time based on calendric fields (day, month, year, hour, minute, second) Some systems have data types for “field based” time also.
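The two representations convert into each other only when a time zone is supplied. A sketch assuming `java.time`, where `Instant` plays the incremental role and `LocalDateTime` the field-based role; the class name is illustrative:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

class TimeViews {
    // Incremental -> field-based: needs a zone to decide which wall-clock fields apply.
    static LocalDateTime toFields(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId)).toLocalDateTime();
    }

    // Field-based -> incremental: the same fields in a different zone give a different instant.
    static long toEpochMillis(LocalDateTime fields, String zoneId) {
        return fields.atZone(ZoneId.of(zoneId)).toInstant().toEpochMilli();
    }
}
```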

58 What is a Time Zone A time zone is a geographical region or area that has common rules for determining the local observed time as it relates to monotonic (computer) time. Distinctions include: – Offset from UTC – Daylight Savings (Summer Time) behavior – Historic changes in offset or DST behavior – Political control

59 Durations and Repeating Events Wall-time: this meeting is at 2 PM Pacific time every Tuesday – interval between meetings may vary in number of seconds (daylight time transitions, changes in DST rules) Fixed-duration: run the virus scanner every 57 minutes – interval is always 3,420,000 milliseconds
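The difference between the two kinds of recurrence shows up across a daylight-time transition. A sketch assuming `java.time`; the class and method names are hypothetical:

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class Recurrence {
    // Wall-time recurrence: same local clock time next week, whatever the elapsed seconds.
    static ZonedDateTime nextWeeklyWallTime(ZonedDateTime meeting) {
        return meeting.plusWeeks(1);
    }

    // Fixed-duration recurrence: exact elapsed time, whatever the local clock says.
    static ZonedDateTime nextFixed(ZonedDateTime lastRun, Duration interval) {
        return lastRun.plus(interval);
    }

    public static void main(String[] args) {
        // Tuesday before the 2021 US spring-forward transition.
        ZonedDateTime meeting = ZonedDateTime.of(2021, 3, 9, 14, 0, 0, 0, ZoneId.of("America/Los_Angeles"));
        ZonedDateTime next = nextWeeklyWallTime(meeting);
        System.out.println(next + " after " + Duration.between(meeting, next).toHours() + " hours");
    }
}
```

The weekly meeting still starts at 14:00 local time, but one hour fewer than a full 168-hour week has elapsed.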

60 Time Zone Affected Scenarios Zone independent – only “incremental” times are necessary Local time, past only – future changes to time zone rules not applicable – example: logging system Local time, both past and future – time zone rule changes may affect some time values – example: calendar program Floating times – events not tied to a specific time zone – example: birthdate, start date, definition of “night” for phone usage Recurring events – events that recur—sometimes during and sometimes not during daylight savings. – example: weekly status meeting

61 Time Zone Scenarios Zone Independent — generally timestamps that don’t refer to a specific time zone. – Record local offset or (better) use UTC – May want wall time for analysis

62 Time Zone Scenarios Local Time (Past Only)— times that cannot change their relationship to DST – Store zone ID and time value [may store offset instead of zone ID] Local Time (Past+Future) — time values may need to change if DST rules change – Store original offset along with zone ID and time value – May require a database crawl if DST rules change

63 Time Zone Scenarios Floating Times — times that don’t change regardless of where you are in the world. – Publication dates – Birth dates (or any anniversary date) – Etc. Handle using UTC and avoiding zone casting
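What “zone casting” does to a floating date can be sketched as follows. This assumes `java.time` and a hypothetical helper that deliberately reproduces the mistake:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

class FloatingDates {
    // Pitfall: treat a date as midnight UTC, then view that instant in a local zone.
    static LocalDate zoneCast(LocalDate date, String zoneId) {
        Instant midnightUtc = date.atStartOfDay(ZoneOffset.UTC).toInstant();
        return midnightUtc.atZone(ZoneId.of(zoneId)).toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate birthDate = LocalDate.of(1990, 5, 17);
        // West of UTC the cast lands on the previous calendar day.
        System.out.println(zoneCast(birthDate, "America/Los_Angeles"));
        // Keeping the value as a plain LocalDate (no zone attached) avoids the shift.
        System.out.println(birthDate);
    }
}
```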

64 Time Zone Scenarios Recurring Events — time values that occur in both DST and non-DST time – Store time, recurrence period, zone ID, original offset, and whether to tie recurrence to DST

65 Time Zone Identifiers Often based on the IANA time zone database (tzinfo) [formerly “Olson IDs”]
Offset: Etc/UTC, Etc/GMT+1
Continent/City: America/Los_Angeles, Europe/Paris, Asia/Tokyo, Antarctica/DumontDUrville
Ocean/Island(City): Atlantic/Canary, Pacific/Auckland, Pacific/Pago_Pago
Continent/Region/City: America/Indiana/Indianapolis

66 Time Zone Hints Only 21 countries have more than one time zone (if you know the country, you often know the time zone) Argentina, Australia, Brazil, Canada, Chile, Democratic Republic of the Congo, Ecuador, France, Greenland, Indonesia, Kazakhstan, Kiribati, Mexico, Micronesia, Mongolia, New Zealand, Portugal, Russia, Spain, and the United States. – Of these, most have maritime or overseas regions. Examples: Ecuador: Galapagos Islands Chile: Easter Island Portugal: Azores

67 Locale-Neutral Formats Use locale-neutral formats for interchange: – ISO 8601 – Incremental time values (e.g. time_t ) – Distinguish time zone if necessary for interpretation Offset is not the same as time zone At any given time, in UTC, it is the same time everywhere that time is measured. SQL data types and XML formats are often field-based, while programming languages are usually incremental.
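“Offset is not the same as time zone” can be seen by asking one zone for its offset at two different instants. A sketch assuming `java.time`; the helper name is illustrative:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

class OffsetVsZone {
    // A zone is a set of rules; an offset is what those rules yield at one instant.
    static ZoneOffset offsetAt(String zoneId, Instant instant) {
        return ZoneId.of(zoneId).getRules().getOffset(instant);
    }

    public static void main(String[] args) {
        Instant january = Instant.parse("2021-01-15T12:00:00Z");
        Instant july = Instant.parse("2021-07-15T12:00:00Z");
        System.out.println(offsetAt("America/Los_Angeles", january));
        System.out.println(offsetAt("America/Los_Angeles", july));
    }
}
```

Storing only the offset records what the clock difference was at that moment; it cannot tell you what the offset will be for a future date in the same place.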

68 Formatting Dates and Times Requires more than just a locale!
Value being formatted: 1034197545321L
Time zone (defines relation to “wall time”): Asia/Tokyo
Calendar (defines rules for calculating field values): Japanese Imperial
Result: October 10, 14H 6:05:45 AM JST
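The slide's example can be reproduced step by step. The sketch below assumes `java.time` and its `JapaneseDate` chronology; it computes the fields rather than the final localized string, and the class name is illustrative:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.chrono.JapaneseDate;
import java.time.temporal.ChronoField;

class ImperialDate {
    // Step 1: the time zone relates the incremental value to wall time.
    static ZonedDateTime wallTime(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId));
    }

    // Step 2: the calendar decides how the date fields are counted (here, the year within the era).
    static int yearOfEra(ZonedDateTime wallTime) {
        return JapaneseDate.from(wallTime.toLocalDate()).get(ChronoField.YEAR_OF_ERA);
    }

    public static void main(String[] args) {
        ZonedDateTime tokyo = wallTime(1034197545321L, "Asia/Tokyo");
        System.out.println(tokyo.toLocalTime() + ", year of era " + yearOfEra(tokyo));
    }
}
```

Only after both steps does the locale come in, to choose month names, field order, and punctuation.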

69 Externalization Making software localizable

70 What is localization? “What is localization?” Zula asked. Peter sighed, letting her know it was a stupid question. “Translating foreign software into Hungarian, making things work correctly in the special environment of Hungary,” Csongor explained, and Zula thought that she could glimpse, here, in the way that he contentedly explained things, Csongor’s father the school-teacher. Reamde by Neal Stephenson

71 What is Localization? The process of tailoring a product to a specific target market. – Translation of messages – Adaptation to local preferences – Addition (or subtraction) of content or features

72 Localization is Obvious … but it isn’t “internationalization” Localizability is internationalization. – Externalize text – Externalize presentation – Dynamic composition – Distribution of language content – “Plug-in” features

73 What is a ‘Resource’? any application component loaded dynamically at runtime, rather than compiled into the application In localization: source code files containing language, region, or culturally-affected materials – Text – Error messages – Icons – Pictures – Fonts – Colors – Graphics – Sizes – Positions – Magic Numbers – Mnemonics (“Alt+G”, “F4”, etc.) – File Locations – Dictionaries – Glossaries – Grammar Rules – Code

74 Why Resources? Text Error messages Icons Pictures Fonts Colors Graphics Sizes Positions Magic Numbers Mnemonics Dictionaries Glossaries Grammar Rules Culturally specific code BeforeAfter

75 Avoiding Forks Global Binary Resources English Version Resources Language +1 Version

76 Forked Code Woes Hard to fix and maintain Different versions in the field Delays in releasing localized product Different functionality by region Confusing for customers/users Versions are not interoperable and might not be able to exchange data!

77 More Benefits Rename or re-brand product Fix spelling or grammar mistakes Fix usability Make terminology consistent Test drive new customer experiences, try new designs, etc. … all without a rebuild!

78
"Project-Id-Version: blanket 1.0\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2011-03-23 15:43-0700\n"
"PO-Revision-Date: 2011-03-23 15:43-0700\n"
"Last-Translator: Richard Gillam \n"
"Language-Team: en \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
# font
msgid "my.font.name"
msgstr "dialog"
#: progress bar in point based
msgid "progress_bar.rect"
msgstr "43.11,64.67,172.45,12.93"
msgid "progress_bar.border"
msgstr "2"
# bounding box: x_pos,y_pos,width,height
msgid "shutdown.cust_service.header.rect"
msgstr "0,14.65,258.68,12.07"
msgid "shutdown.cust_service.header"
msgstr "Repair Needed"

79 What’s wrong here? String1 = “There are”; String2 = “no”; String3 = “tables in”; String4 = “files”; String5 = “.” Messages I Could Build: There are files. There are no files. There are 50 files. There are tables in files. There are no tables in files. There are 50 tables in no files. There are tables in.

80 Let’s Google Translate That! Messages I Could Build: There are files. There are no files. There are 50 files. There are tables in files. There are no tables in files. There are 50 tables in no files. There are tables in. Il ya des fichiers. Il n'y a pas les fichiers. Il ya 50 fichiers. Il ya des tables dans des fichiers. Il n'ya pas de tables dans des fichiers. Il ya 50 tableaux dans aucun fichier. Il ya des tables po.

81 Don’t Build Text From Fragments Text fragments are hard to translate – Fragments may not follow grammar rules – Cannot know which parts go together – Parts can be reused in incompatible ways Internationalization APIs offer “patterns” to fix: [] files out of [] were deleted. An error occurred at [] on []. Page [] of [] Processing: []% complete.

82 Example: MessageFormat (Java) There were {0} tables on {1}. There were {0,number,integer} tables on {1,date,short}. {1,date} に {0,number,integer} のテーブルがあった。 Number replacement variables. Provide typing and formatting information where possible. Externalize as a single unitary string.
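A sketch of how such a pattern is applied, assuming `java.text.MessageFormat`. The wrapper class is hypothetical, and the example uses a number and a string argument rather than the slide's date so that the result does not depend on the runtime's date data:

```java
import java.text.MessageFormat;
import java.util.Locale;

class Messages {
    // The whole sentence is one externalized pattern; arguments are typed and numbered.
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // A translator can move {0} and {1} freely; the code passes arguments in a fixed order.
        System.out.println(format("There were {0,number,integer} tables on {1}.", Locale.US, 1234, "Tuesday"));
        System.out.println(format("{1}: {0,number,integer}", Locale.US, 1234, "Tuesday"));
    }
}
```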

83 What’s My Gender “Documenti del Chris” “Documenti della Chris” “Documenti - Chris”

84 More Issues With Text Composition – There were one errors found. – You have earned your 22th set of bonus points.

85 Sentence Parts Must Agree Endings, Gender, Plurality, Case – e.g. Japanese counting uses different words for different kinds of objects – e.g. Slavic languages use different endings for singular, few, many…

86 Complex Message Formatting There were no errors. There was 1 error. There were 2 errors. “choice format” APIs allow for different resources to be used based on runtime values.
0: There were no errors.
1: There was {0} error.
2: There were {0} errors.
0: не было ошибок
1: была {0} ошибка
2: были {0} ошибки
5: были {0} ошибок
The number of resources may need to vary by locale or language. Examples: ordinal numbers (1st, 2nd, 3rd, 4th, etc.); complex messages, such as “27 seconds ago” vs. “10 minutes ago”
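The English rows above can be written as one choice pattern. A sketch assuming the `choice` format type of `java.text.MessageFormat`; the class name and the pattern constant are illustrative:

```java
import java.text.MessageFormat;
import java.util.Locale;

class ErrorCount {
    // One externalized string; the branch is picked at runtime from the value of {0}.
    static final String PATTERN =
        "{0,choice,0#There were no errors.|1#There was {0} error.|1<There were {0,number,integer} errors.}";

    static String message(int count) {
        return new MessageFormat(PATTERN, Locale.US).format(new Object[] { count });
    }

    public static void main(String[] args) {
        for (int n : new int[] { 0, 1, 2 }) {
            System.out.println(message(n));
        }
    }
}
```

Simple numeric ranges like these fit English well; languages with several plural categories, such as the Russian rows above, need a resource per category, which is why the number of resources has to be allowed to vary by language.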

87 Images and Icons Avoid metaphors Avoid cultural sensitivities Avoid body parts Replace as necessary Avoid putting text into graphics Graphic: $20 Text: $0.06

88 Images and Culture Beware your biases, even “good” ones. Meet your friends on our new social website for India

89 Isn’t it Swell? English is very succinct. – Words in other languages are longer – Sentences are longer – Characters may be larger

90 More Swollen Text 30% in length (alphabetics, abjads, etc.) 30% in height (ideographics) But … a rule of thumb, not a “fact” – Measure your results with care.

91 A Cautionary Tale

92 GUI Layout

93 Dereferencing Minimize sentence building Minimize arguments per string Use subject:predicate wherever possible When you can do this: Balance: $100.00 Don’t do this: Your balance is $100.00.

94 Dynamic vs. Static Layout Magic numbers Externalized layouts Mnemonics Colors

95 Localizing Styles Bolding is not universal for emphasis – Italicization, Capitalization, etc. are also not universal (some scripts don’t have these attributes) Use Logical not Presentational names – Describe the function not the appearance. For example, use “emphasis” instead of “italics”. 中国 AmikakeWakiten

96 Use of Color “Going Down” “Going Up”

97 Non-Translatable Resources Some content should be externalized but not translated – Sometimes referred to as “DNT” for “do not translate” Externalize? Yes … – Segregate DNT material from translated material if possible (by using separate resource files or separate resource blocks within a file). – Developers can’t always tell when something should or should not be DNT … and neither can translators (context is missing)

98 The “Locale” in “Localization” Resources “ fall back ” to find the best match Global Binary Resources zh-Hans-SG (Chinese, Simplified script, Singapore) zh-Hans (Chinese, Simplified script) zh (Chinese) (root) Falling back

99 Sparse Population A given language resource may not contain a complete set of resources. – Lookup falls back separately for each sub-resource (such as a particular value)
Base resources: “appName” = “Demo”; “maxRows” = 57; “dialogTitle” = “Hello World”
Localized resources: “appName” = “Démo”; “dialogTitle” = “Bonjour monde”
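Per-key fallback over sparse resources can be sketched with plain maps. This is a simplified stand-in written for illustration, not how `java.util.ResourceBundle` is implemented; it truncates the language tag one subtag at a time until it reaches the root (the empty tag):

```java
import java.util.HashMap;
import java.util.Map;

class FallbackResources {
    // language tag -> (key -> value); the empty tag "" is the root bundle.
    private final Map<String, Map<String, String>> bundles = new HashMap<>();

    void put(String languageTag, String key, String value) {
        bundles.computeIfAbsent(languageTag, t -> new HashMap<>()).put(key, value);
    }

    // Looks for the key in the requested tag, then in each shorter tag, then in the root.
    String get(String languageTag, String key) {
        String tag = languageTag;
        while (true) {
            Map<String, String> bundle = bundles.get(tag);
            if (bundle != null && bundle.containsKey(key)) {
                return bundle.get(key);
            }
            if (tag.isEmpty()) {
                return null; // not found anywhere, even in the root
            }
            int cut = tag.lastIndexOf('-');
            tag = cut < 0 ? "" : tag.substring(0, cut);
        }
    }
}
```

With the slide's data, a request for `maxRows` in French finds nothing in the French bundle and falls through to the root value, while `appName` is answered by the French bundle directly.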

100 Getting the Right Locale Business Logic Data Store Front End Operating Env. client Client Locale Server Locale API Request Locale System Mgmt Locale One request might serve multiple purposes or be seen in multiple contexts

101 Resources and Translation
“key”, “display string”
“dialogTitle”, “Dialog Title”
“aMessage”, “This is a message.”
Pseudo-Translation
“key”, “ðìsplàÿ stríñg”
“dialogTitle”, “Ðîálòg Tïtlè”
“aMessage”, “Thìß ís â Mésßãgê.”
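A pseudo-translator can be very small. The sketch below is one possible scheme, not the tool behind the slide: it accents vowels, leaves `{...}` placeholders untouched, brackets the string so truncation is visible, and pads by roughly 30% to simulate text swell. All names are hypothetical:

```java
class PseudoTranslator {
    private static final String PLAIN = "AEIOUaeiou";
    // Grave-accented counterparts, written as escapes so the source file stays ASCII.
    private static final String ACCENTED =
        "\u00c0\u00c8\u00cc\u00d2\u00d9\u00e0\u00e8\u00ec\u00f2\u00f9";

    static String pseudo(String source) {
        StringBuilder out = new StringBuilder("[");
        boolean inPlaceholder = false;
        for (char c : source.toCharArray()) {
            if (c == '{') {
                inPlaceholder = true;
            }
            // Leave placeholder contents such as {0} unchanged.
            int i = inPlaceholder ? -1 : PLAIN.indexOf(c);
            out.append(i >= 0 ? ACCENTED.charAt(i) : c);
            if (c == '}') {
                inPlaceholder = false;
            }
        }
        // Pad by about 30% of the source length, rounded up.
        int padding = (source.length() * 3 + 9) / 10;
        for (int i = 0; i < padding; i++) {
            out.append('~');
        }
        return out.append(']').toString();
    }
}
```

Any string that shows up in the running product without brackets or accents was never externalized, and any string missing its closing bracket was truncated by the layout.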

102 Pseudotranslation

103 Keyboards

104 Input Method Editors Some languages require software to assemble keystrokes into characters: Asian languages with very large character sets; complex scripts with vowel-killers and other contextual editing requirements. Applications that interact directly with key-pressed events can disable or disrupt IME input. On- and over-the-spot editing

105 Customization

106 When is it okay? Content should be highly localized or have locale-specific requirements: – customization lets you address this requirement in the most localized possible manner

107 Externalization again [Figure: “Your Application” with locale-dependent items pulled out of it: dates, numbers, images, colors, addresses, local rules, etc. Examples: local rules, regulatory requirements, postal addresses, default bookmark lists, your company’s customer service phone numbers.]

108 Externalization again Locale-independent global binary Locale-dependent resources (includes code)

109 Large Animal Pictures Software Component Output Global Code Resources I/O Input Code can be a resource!

110 Customization Examples Postal address validation. Postal code validation. Telephone number formatter. “Personality” questions: blood type vs. sun sign. Personal name formatter: first/last position, space, highlighting, formality, etc. Tax codes and shipping schedules. [Figure: Generic API; Generic Implementation; US Implementation; DE Implementation; ?? Implementation]

111 Example: Postal Addresses
Before:
address1 varchar(32)
address2 varchar(32)
city varchar(16)
state char(2)
zip char(5)
country char(2)
After i18n:
address1 varchar(64)
address2 varchar(64)
city varchar(64)
province varchar(64)
postcode varchar(64)
country=US, postcode=‘WC2 1GH’ // error
country=UK, postcode=‘95111’ // error
country=DE, postcode=‘1A4 喪’ // okay?
public interface Address {
public class USAddress extends genericAddress {
public class UKAddress extends genericAddress {
public class genericAddress implements Address {
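The generic-plus-specific structure can be sketched for postal code validation alone. Everything here is hypothetical scaffolding: a permissive generic rule, one country-specific rule (the US ZIP and ZIP+4 shapes), and a lookup that falls back to the generic rule for countries without their own:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

interface PostalCodeValidator {
    boolean isValid(String postcode);
}

class PostalCodes {
    // Generic rule: accept any non-blank value that fits the 64-character column.
    private static final PostalCodeValidator GENERIC =
        code -> code != null && !code.trim().isEmpty() && code.length() <= 64;

    // Country-specific rules are plugged in by ISO country code.
    private static final Map<String, PostalCodeValidator> BY_COUNTRY = new HashMap<>();
    static {
        Pattern us = Pattern.compile("\\d{5}(-\\d{4})?");
        BY_COUNTRY.put("US", code -> code != null && us.matcher(code).matches());
    }

    static boolean isValid(String country, String postcode) {
        return BY_COUNTRY.getOrDefault(country, GENERIC).isValid(postcode);
    }
}
```

Adding a market then means registering one more validator, with no change to the calling code or the schema.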

112 Building Global Software Beyond Just Coding: Localization, QA, and all that

113 The Internationalization Cycle Encompasses the full development cycle: – Requirements – Design – Development – QC – Release – Support

114 What is “internationalization QA”? Does the enabled product work correctly? – Non-English configurations – Non-ASCII data and encoding support – Cross time zone support – Market specific features or customizations Does localization appear correctly? – Is the product localizable? What makes this different from “regular” QA?

115 Growing (and Pruning) the Matrix Include non-English configurations in your test matrix; include non-ASCII data in your tests. Be prepared to prune the test matrix.

116 What to Test With – Test Non-English configurations: Non-English locales (lying to your machine); Native configurations (when does it make sense?) – Test Non-ASCII data: Encodings, encodings, everywhere; Non-ASCII character values – Test Across Time Zones: Two or more time zones; consider international date line (“it’s tomorrow in Japan”) and DST issues

117 Planning Testing Initially Get tools that are enabled! – Automation allows greater coverage, but only if it works. Plan encodings and locales as part of the test matrix. Acquire third-party products as necessary. Increasing Maturity Use test driven development practices. Get developers to write unit tests that are internationalized. Put the ‘i18n’ bugs into the regression suite.

118 Configuring Machines Create both native and simulated environments: – Native operating systems may have minor but sometimes critical differences (folder names, keywords, localized registry entries) – Most features don’t run into native differences (easier to work with English-localized machines) – Don’t buy physical keyboards (use software keyboards) unless your application relies on scan codes from keys

119 Localization

120 Incorporate Localization is part of the release process too. – Changes to the user interface cost the localization team time and money. – (Changes to the product cost the documentation and QA folks too) May need to institute change control or a UI freeze

121 Simultaneous Shipment (Simship) Ideally, to maximize opportunity, ship the target languages the same day as the source language. – It might not make sense for your product. – But it might not be as difficult as you think it is. It might even be good for you.

122 Distribution of Content How does the localized text get into the running product? – Satellite assemblies, DLLs, shared libraries – Message catalogs – Special directory – Database – Etc.

123 More Distribution “Specific Language” (per-language); “Language Included” (one or more languages); “Language Pack” (product plus something). [Figure: English, German, and French shown under each of the three options; the language pack option is drawn as Global Binary + languages]

124 Completing the Product Static content is often under source control and can be localized “normally” Dynamic content may include the initial set of data or other items which need to be localized beyond software. – Demos and Demo Data – Dictionary, Language add-ons – Local offers, links to Web store, etc. – Packaging – Regulatory

125 Quality Checking and Development Methodologies Translation is a human-oriented task. – Translation time lines are linear with volume. Localized product should be tested for functionality – translation can break things – usually the first language finds most of the bugs Translations should be checked for quality Development cycle has to include time for translators and quality assurance to catch up. – This does not mean “no agile” or “no changes” – Do pilot language(s) or moving-target translation; do better UI design and usability reviews; etc.

126 Summary

127 Internationalization … is a fundamental architectural approach: it is how software is built. – Design – Enabling – Externalization – Customization – Testing and Support – Lifecycle

128 Q&A Would you write the code for I18N on the whiteboard before you go? #define UNICODE #import I18N.h

