Worldwide typography (and how to apply JIS-X-4051-1995 to Unicode) Michel Suignard Microsoft Corporation.


1 Worldwide typography (and how to apply JIS-X-4051-1995 to Unicode) Michel Suignard Microsoft Corporation

2 Objectives
• Worldwide single binary
• Multilingual
• DTP level on all writing systems
  – Line breaking
  – Font selection
  – Word breaking
  – Line justification

3 Challenges
• Asian typography is not as well known as Western typography
• Conflicting requirements
  – Vertical versus horizontal layout
  – Latin word wrap off
  – Ideographic word wrap on
• Size of the Unicode repertoire (35K and growing)

4 JIS-X-4051
• First published in March 1993
  – Does not address the Unicode repertoire
  – Limited description of character classification
• 2nd edition in October 1995
  – Based on JIS-X-0221 (ISO 10646-1)
  – More detailed character classification (20 classes)
  – Covers line breaking, line composition rules, Ruby positioning, horizontal-in-vertical text, …

5 Issues with JIS-X-4051
• Still a subset of Unicode
• Character class contents overlap (relying on contextual information not available to general-purpose software)
• Single behavior class
• Half/full width characters not covered (user-defined)
• Not aligned with most font designs (narrow versus wide symbols)
• Lacks some useful features (like line break analysis across white space)

6 Character classification
• Unicode space is decomposed into partitions (sets of character ranges)
• Each partition shares a common behavior across all covered typographic rules
• Partitions are mapped to classes specific to each rule (e.g. line breaking, font selection, etc.)
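The partition idea can be illustrated with a minimal Python sketch. The code point ranges are real Unicode ranges, but the partition names, the class names, and the two mapping tables are assumptions made for illustration; they are not taken from the presentation.

```python
from bisect import bisect_right

# Hypothetical partitions: (first code point, last code point, partition name).
# Ranges must be sorted and non-overlapping.
PARTITIONS = [
    (0x0041, 0x005A, "latin_upper"),
    (0x0061, 0x007A, "latin_lower"),
    (0x3041, 0x3096, "hiragana"),
    (0x4E00, 0x9FFF, "cjk_ideograph"),
]
STARTS = [first for first, _, _ in PARTITIONS]

# One table per typographic rule, each mapping a partition to a class
# for that rule only (illustrative values).
LINE_BREAK_CLASS = {
    "latin_upper": "alpha",
    "latin_lower": "alpha",
    "hiragana": "ideographic",
    "cjk_ideograph": "ideographic",
}
FONT_GROUP = {
    "latin_upper": "western",
    "latin_lower": "western",
    "hiragana": "japanese",
    "cjk_ideograph": "cjk",
}


def partition_of(ch):
    """Return the partition name for a character, or 'unassigned'."""
    cp = ord(ch)
    i = bisect_right(STARTS, cp) - 1  # last range starting at or before cp
    if i >= 0 and cp <= PARTITIONS[i][1]:
        return PARTITIONS[i][2]
    return "unassigned"


def class_for(ch, rule_table):
    """Look up the class of a character for one rule via its partition."""
    return rule_table.get(partition_of(ch), "unknown")
```

The classification is done once per character; each rule then needs only a small partition-to-class table, for example `class_for("あ", FONT_GROUP)`.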

7 Typical usage
[Figure not preserved; its labels were "Before behavior class" and "After behavior class".]

8 Line breaking
• Kinsoku rules, to avoid prohibited line starts and ends [example images not preserved]
  – Stricter rules for small kana [example not preserved]
• Keep numeric expressions together, including postfix and prefix symbols
• Allow French typography rules (no break between the last word and :;?!, even if separated by a space character)
• Disable Latin word wrap
• Keep ideographic characters together

9 Line breaking classes
Partitions mapped into 15 classes:
1. Opening characters
2. Closing characters
3. No start ideographic
4. Exclamation/interrogation
5. Inseparable
6. Prefix
7. Postfix
8. Ideographic
9. Numeral sequence
10. Alpha space
11. Alpha characters/symbols
12. Glue characters
13. Slash
14. Quotation characters
15. Numeric separators

10 Line breaking behavior table
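The table itself is not preserved in the transcript. As a rough illustration of how a before/after pair table can drive line breaking, here is a minimal Python sketch. It uses only five of the fifteen classes, and both the classifier and the table entries are assumptions chosen to mimic common Kinsoku behavior, not the values from the slide.

```python
# Five illustrative line-breaking classes (a subset of the fifteen).
OPENING, CLOSING, NO_START, IDEOGRAPHIC, ALPHA = range(5)

# Hypothetical classifier covering only a few characters.
SPECIAL = {
    "「": OPENING, "（": OPENING,
    "」": CLOSING, "）": CLOSING, "。": CLOSING,
    "っ": NO_START, "ゃ": NO_START, "ゅ": NO_START, "ょ": NO_START,
}


def classify(ch):
    if ch in SPECIAL:
        return SPECIAL[ch]
    if ch.isascii() and ch.isalnum():
        return ALPHA
    return IDEOGRAPHIC  # assumption: everything else behaves as ideographic


# BREAK_ALLOWED[before][after]: may a line break fall between the two?
BREAK_ALLOWED = [
    # after: OPENING CLOSING NO_START IDEOGRAPHIC ALPHA
    [False, False, False, False, False],  # before = OPENING
    [True,  False, False, True,  True],   # before = CLOSING
    [True,  False, False, True,  True],   # before = NO_START
    [True,  False, False, True,  True],   # before = IDEOGRAPHIC
    [True,  False, False, True,  False],  # before = ALPHA
]


def break_opportunities(text):
    """Return the indices i where a break is allowed before text[i]."""
    return [
        i
        for i in range(1, len(text))
        if BREAK_ALLOWED[classify(text[i - 1])][classify(text[i])]
    ]
```

Changing behavior (for example, keeping ideographic characters together) then means editing table entries rather than code.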

11 Width modification and auto-spacing
• Width modification (contextual kerning): ( (text) ) becomes ((text))
• Auto-spacing (add space between ideographic text and Western or numeric text) [before/after example not preserved]

12 Font selection scenario
A new font is applied to a large multilingual selection of text:
  Is that movie a Japanese movie? Yes, it is.
Assume we want to change the font of the English text while still selecting the whole text. When we apply the Haettenschweiler font to the selection, it is desirable to affect only the Latin text:
  Is that movie a Japanese movie? Yes, it is.
The situation is similar when we want to apply an Asian face (like HG) to the Japanese text:
  Is that movie a Japanese movie? Yes, it is.

13 Font selection based on character code point and context
• There are no global Unicode fonts (a font usually covers a group of writing systems)
• Language is an important context selector for determining the appropriate font (CJK context, ASCII symbols, narrow versus wide Greek and Cyrillic characters)
• Some writing systems require several glyphs per character and are better handled by specialized fonts (Arabic, Hindi)
• Many punctuation characters are shared among writing systems whose typefaces are not shareable (e.g. the period between Latin and Armenian)
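A minimal Python sketch of font selection by code point and context follows. The grouping function, the rule that shared ASCII punctuation inherits the group of the preceding character, and the per-character font list are all assumptions made for illustration; the presentation does not describe its actual data model.

```python
def script_group(ch):
    """Hypothetical grouping; returns None for shared ASCII punctuation."""
    cp = ord(ch)
    if ch.isascii() and not ch.isalnum():
        return None  # spaces and ASCII symbols are resolved from context
    if cp < 0x0250:
        return "latin"
    if 0x3000 <= cp <= 0x30FF or 0x4E00 <= cp <= 0x9FFF:
        return "cjk"
    return "other"


def resolve_groups(text, default="latin"):
    """Assign a group to every character; shared characters inherit
    the group of the nearest preceding non-shared character."""
    groups = []
    current = default
    for ch in text:
        group = script_group(ch)
        if group is not None:
            current = group
        groups.append(current)
    return groups


def apply_font(text, fonts, new_font, target_group):
    """Return a new per-character font list in which only characters
    resolved to `target_group` receive `new_font`."""
    groups = resolve_groups(text)
    return [new_font if g == target_group else f for f, g in zip(fonts, groups)]
```

Applying `"Haettenschweiler"` with `target_group="latin"` to a mixed selection leaves the fonts of the Japanese characters unchanged, which is the behavior the scenario on the previous slide asks for.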

14 Ruby overhanging
• Ruby is the commonly used name for pronunciation characters associated with base characters.
• The Ruby sequence may be allowed to overhang the characters preceding or following the base characters, as long as it does not introduce confusion.
• The classification determines the manner in which characters can be overhung:
  – No overhanging (e.g. CJK ideographs)
  – Allowed only Before (e.g. open quotes)
  – Allowed only After (e.g. close quotes)
  – Allowed in both cases (e.g. Hiragana)
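The four overhang categories map naturally onto a flag type. The Python sketch below is illustrative only: the character sets are small hypothetical samples based on the examples on the slide, the fallback for unlisted characters is an assumption, and `side` simply reuses the slide's "Before"/"After" labels without fixing their geometric meaning, which the transcript does not spell out.

```python
from enum import Flag


class Overhang(Flag):
    NONE = 0
    BEFORE = 1
    AFTER = 2
    BOTH = BEFORE | AFTER


OPEN_QUOTES = "「『（"   # sample characters only
CLOSE_QUOTES = "」』）"  # sample characters only


def overhang_class(ch):
    """Classify a character by how a neighboring Ruby may overhang it."""
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF:      # CJK ideographs: no overhanging
        return Overhang.NONE
    if ch in OPEN_QUOTES:           # allowed only Before
        return Overhang.BEFORE
    if ch in CLOSE_QUOTES:          # allowed only After
        return Overhang.AFTER
    if 0x3041 <= cp <= 0x3096:      # hiragana: allowed in both cases
        return Overhang.BOTH
    return Overhang.NONE            # assumption: conservative default


def may_overhang(ch, side):
    """True if overhanging is allowed for `ch` on the given side."""
    return bool(overhang_class(ch) & side)
```

A layout routine could call `may_overhang` for the characters adjacent to a Ruby base and widen the base spacing only when the answer is false.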

15 Conclusion / Findings
• A detailed analysis of the Unicode repertoire along common behavior is a powerful tool to construct sophisticated typographical effects.
• Typographic complexity should be expressed as much as possible in tables and properties, not in code.
• Many behaviors are correlated, allowing the usage of a limited number of Unicode partitions for many behavior descriptions.

