Unicode Mark Davis Unicode Consortium President IBM Chief SW Globalization Architect.

1 Unicode
- Mark Davis
- Unicode Consortium President
- IBM Chief SW Globalization Architect

2 Universal Character Encoding …
- Unique number for every character

3 Lingua Franca for Computers
- Developed & supported by industry leaders:
  - Apple, HP, IBM, JustSystem, Microsoft, Oracle, SAP, Sun, Sybase, Unisys, …
- Required by modern standards:
  - XML, HTML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, Perl, etc.
- Implemented in:
  - All modern operating systems, browsers, and other products

4 International Domain Names
- Draft – Unicode-Based
- Examples:

5 Standard Resources
- Online Standard
- Technical Reports
- FAQs
- General Information
- Discussion Forums, Conferences

6 Programming Resources
- System APIs:
  - Windows, Java, Unix, Oracle, DB2, Sybase, Mac, …
- Languages:
  - Java, JavaScript, Perl 5.6.0, C, C++, SQL, …
- Cross-platform libraries:
  - ICU, Rosette, …

7 Multiple Forms
- UTF-8: maximal compatibility with 8-bit systems
- UTF-16: good storage, interoperability with Windows/Java
- UTF-32: simplest processing
- Fast, lossless conversion
- See Forms of Unicode
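The three encoding forms on this slide can be compared with a short Python sketch. The helper name `encoded_sizes` and the sample string are illustrative assumptions, not part of the presentation; the sample mixes an ASCII letter, a Devanagari letter, and a character outside the Basic Multilingual Plane so that each form behaves differently.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in each Unicode encoding form."""
    sizes = {}
    for form in ("utf-8", "utf-16-be", "utf-32-be"):
        data = text.encode(form)
        # Conversion is lossless: decoding gives back the original string.
        assert data.decode(form) == text
        sizes[form] = len(data)
    return sizes


# Hypothetical sample: LATIN CAPITAL LETTER A, DEVANAGARI LETTER KA,
# and DESERET CAPITAL LETTER LONG I (U+10400, outside the BMP).
sample = "A\u0915\U00010400"
print(encoded_sizes(sample))
```

UTF-8 spends 1, 3, and 4 bytes on the three characters, UTF-16 spends 2, 2, and 4 (a surrogate pair), and UTF-32 always spends 4, which is why it is the simplest form to process.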

8 Stability
- Characters are never moved or deleted
- Ordering of characters is by collation, not binary order. See UTS #10: Unicode Collation Algorithm
- Characters may be deprecated (discouraged)
- Characters never change names
- Annotations are used to clarify usage
- See Unicode Policies
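The point about collation versus binary order can be illustrated with a rough Python sketch. The key function below is only a crude stand-in (strip accents, ignore case) and is not the Unicode Collation Algorithm of UTS #10; a real application would use a collation library such as ICU. The name `crude_sort_key` and the word list are assumptions made for illustration.

```python
import unicodedata


def crude_sort_key(word):
    """Approximate a collation key: compare base letters first, ignoring
    accents and case, then fall back to the original string as a tie-breaker."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)


words = ["zebra", "\u00e9clair", "Apple", "eclair"]

# Binary (code point) order: the accented word lands after "zebra".
binary_order = sorted(words)

# Collation-like order: the accented word sorts next to its unaccented form.
collated = sorted(words, key=crude_sort_key)

print(binary_order)
print(collated)
```

Because character numbers never move, sort order cannot be fixed by rearranging code points; it has to come from a separate collation step like the one sketched here.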

9 Indic Support in Unicode
- ISCII-1988 the basis for characters and allocation
- Consortium actively engaged with Indian Government, which is a member
- Welcomes addition of missing characters (e.g. Vedic), clarifications or corrections of usage

10 Structural Differences with ISCII
- Unicode is stateless:
  - No shifting to get different scripts
  - Each character has separate number
- Unicode is uniform:
  - No extension bytes necessary
  - All characters coded in the same space
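A small Python sketch of what "stateless" means in practice: in a mixed-script string, each character's identity follows from its own number, with no shift sequence or surrounding context needed. The function name `describe` and the sample string are illustrative assumptions.

```python
import unicodedata


def describe(text):
    """List each character's code point and standard name.

    No script-switching state is tracked: every lookup depends only on
    the character itself."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Hypothetical sample mixing Latin, Devanagari, and Tamil in one string.
for code, name in describe("a\u0915\u0b95"):
    print(code, name)
```

All three characters live in the same code space, so the Latin, Devanagari, and Tamil letters can sit side by side without any extension bytes or mode changes.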

11 Additional Characters
- Indian Government is developing proposals for additions of missing characters:
  - Vedic
  - Individual characters for certain scripts

12 Encouraging Indic Support
- Companies moving to support Indic
- OpenType fonts
- Font support for Indic
- Microsoft Windows
- Java (IBM contributed ICU Indic Layout)
- Etc.

13 Encouraging Development
- Resources!
- Descriptions of Character Shaping
- Transliteration Tables from Script to Script
- Collation Information
- OpenType fonts
- …
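As a rough idea of what a script-to-script transliteration table could look like, the Python sketch below builds a Devanagari-to-Gujarati mapping. It relies on the fact that the Indic blocks derived from ISCII-1988 place corresponding characters at the same offset within each block, and it keeps only pairs whose standard names agree. The names `build_table` and `to_gujarati` are hypothetical, and real transliteration tables handle many cases this sketch ignores.

```python
import unicodedata

DEVANAGARI_START = 0x0900
GUJARATI_START = 0x0A80
BLOCK_SIZE = 0x80


def build_table():
    """Map each Devanagari character to the Gujarati character at the same
    block offset, but only when both names match after the script prefix."""
    table = {}
    for offset in range(BLOCK_SIZE):
        src = chr(DEVANAGARI_START + offset)
        dst = chr(GUJARATI_START + offset)
        src_name = unicodedata.name(src, "")
        dst_name = unicodedata.name(dst, "")
        if (src_name.startswith("DEVANAGARI ")
                and dst_name.startswith("GUJARATI ")
                and src_name[len("DEVANAGARI "):] == dst_name[len("GUJARATI "):]):
            table[ord(src)] = dst
    return table


DEVANAGARI_TO_GUJARATI = build_table()


def to_gujarati(text):
    """Transliterate Devanagari text; unmapped characters pass through."""
    return text.translate(DEVANAGARI_TO_GUJARATI)


print(to_gujarati("\u0928\u092e\u0938\u094d\u0924\u0947"))
```

Characters with no same-named counterpart (for example, unassigned positions in the target block) are left unchanged rather than mapped to something invalid.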

14 The Future
- The world is moving rapidly to Unicode
- For interoperability, the best direction for India
- The Unicode Consortium welcomes and encourages active participation
- Public, accessible resources ensure that Indic support is done sooner and better
