Review1 What is multilingual computing? Bilingual, trilingual, vs. Multilingual What are the fundamental issues in multi-lingual computing? –Representation.


1 Review1 What is multilingual computing? Bilingual, trilingual, vs. multilingual What are the fundamental issues in multilingual computing? –Representation of each language in a computer –Ways to distinguish different scripts –How can a system be designed so that it can be used with different languages with minimal changes? –How can a system be designed so that it can be used for multiple languages?

2 Review2 Characteristics of different scripts What is a script? What are the different types of scripts, and what are examples of each? –Token-based/alphabet-based scripts –Phonetic-based scripts –Ideographs What is a phonetic transcription system, and what are some examples? What is Romanization?

3 Review3 Characteristics of Chinese Graphemics Variant writing (e.g. 教, 都) Phonetics (the sound, 音) Types of phonemes Semantics (the meaning, 義) Independence of meaning

4 Review4 Computer representation of characters Selection of a finite set of characters → character set –Uniqueness → each character/symbol Design of a coded character set → codeset –Uniqueness → each codepoint assignment –Different coding length → different codesets What do the following terms mean? –Codepoint, length of a codepoint –Code space, size of a code space –Code range –Order of characters (in a character set vs. a codeset)

5 Review5 What are the different numerical notations? –Decimal notation –Binary notation –Hexadecimal notation –Scalar value Characteristics of the ASCII codeset What is the row-cell notation? What are character subsets, and why are they needed? Character set comparison operations Codeset comparison operations –Character set –Codepoint assignment Compatibility
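
As an illustration of the numerical notations listed on this slide, the following minimal sketch prints one codepoint in each notation. The helper name `notations` is hypothetical, and the `U+` form is used here as a stand-in for the scalar value notation.

```python
def notations(ch: str) -> dict:
    """Return the codepoint of a single character in several notations."""
    cp = ord(ch)  # the codepoint as an integer
    return {
        "decimal": str(cp),
        "binary": format(cp, "08b"),      # zero-padded to 8 bits
        "hexadecimal": format(cp, "02X"),
        "scalar": f"U+{cp:04X}",          # Unicode-style scalar value
    }

# ASCII 'A' is codepoint 65
print(notations("A"))
```

The same value is written four ways; only the notation changes, not the codepoint assignment.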

6 Review6 What is an encoding method, and why do we need it? What is the so-called high-bit-on scheme? What are the characteristics of GB-2312? –No. of rows, no. of columns → code space –Code range? –Major subsets? –Full characters vs. half characters What are the characteristics of Big5 and ETen Big5? –Rows, columns → code space –Major subsets? –What are UDAs and VDAs for? HKSCS
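
The relation between row-cell notation and the high-bit-on scheme can be sketched as follows, assuming the common GB-2312 encoding (EUC-CN) in which 0xA0 is added to both the row and the cell number. The function name `rowcell_to_bytes` is hypothetical.

```python
ROWS, CELLS = 94, 94  # GB-2312 code space: 94 x 94 = 8836 positions

def rowcell_to_bytes(row: int, cell: int) -> bytes:
    """Map a row-cell position to its two-byte high-bit-on encoding."""
    if not (1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("row and cell must each be in 1..94")
    # Adding 0xA0 sets the high bit of each byte, so neither byte
    # collides with the 7-bit ASCII range.
    return bytes([0xA0 + row, 0xA0 + cell])

encoded = rowcell_to_bytes(16, 1)  # row 16, cell 1: first Chinese character row
print(encoded.hex(), encoded.decode("gb2312"))
```

Because every byte of a two-byte character has its high bit set, a decoder can tell full (two-byte) characters from ASCII half characters by inspecting a single bit.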

7 Review7 Other codesets using high-bit-on schemes? Encodings using designation (指定)? –ISO 2022 –Extended Unix Code (EUC) What is a charset registry, and why is it needed? Problems with different codesets? –Compatibility → wrong interpretation of data –Solutions: codeset announcement (using designation) and conversion → conversion problems
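
The "wrong interpretation of data" problem and the conversion solution can be demonstrated with a short sketch. It assumes Python's built-in `gb2312` and `big5` codecs and converts through Unicode as the pivot; the sample text is chosen only for illustration.

```python
text = "中文"

# The same characters have different byte sequences in the two codesets.
gb_bytes = text.encode("gb2312")

# Wrong interpretation: GB-2312 bytes read as if they were Big5.
# errors="replace" keeps the sketch from failing on unmapped byte pairs.
misread = gb_bytes.decode("big5", errors="replace")

# Conversion: decode with the announced codeset, then re-encode.
big5_bytes = gb_bytes.decode("gb2312").encode("big5")

print("GB-2312 bytes:", gb_bytes.hex())
print("Misread as Big5:", misread)
print("Converted to Big5:", big5_bytes.hex())
```

Conversion only works when the source codeset is known, which is why codeset announcement matters; characters missing from the target codeset would still raise an error here.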

8 Review8 ISO 10646 and Unicode What are the design principles of ISO 10646? What are the different coding structures in ISO 10646? What is the structure of UCS-4? What are the characteristics of the BMP? What is the structure of the BMP? What is UCS-2? What is the compatibility zone for? What is the difference between ISO 10646 and Unicode? Big-endian vs. little-endian notation: FEFF vs. FFFE
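
The FEFF vs. FFFE distinction can be sketched as a byte-order check on the first two bytes of a UTF-16 stream. The function name `detect_utf16_byte_order` is hypothetical; the constants come from Python's standard `codecs` module.

```python
import codecs

def detect_utf16_byte_order(data: bytes) -> str:
    """Guess the byte order of UTF-16 data from its byte order mark."""
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return "little-endian"
    return "unknown"  # no byte order mark present

# U+FEFF written in each byte order, followed by one character
be = "\ufeff中".encode("utf-16-be")
le = "\ufeff中".encode("utf-16-le")
print(be.hex(), detect_utf16_byte_order(be))
print(le.hex(), detect_utf16_byte_order(le))
```

A reader that sees FF FE knows the writer used the opposite byte order from big-endian, because U+FFFE is not an assigned character.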

9 Review9 What are Extension A and Extension B? –Where were they coded? What are surrogate pairs, why are they needed, and how do they work? What is UTF, what is its purpose, and how does UTF-8 work? What is the difference between a character and a glyph? What is the difference between a multi-byte character and a wide character?
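
The mechanics of surrogate pairs and UTF-8 can be sketched with plain bit arithmetic. The function names `to_surrogates` and `utf8_encode` are hypothetical, and the sketch assumes its input is a valid scalar value (it does not reject the surrogate range itself).

```python
def to_surrogates(cp: int) -> tuple:
    """Split a codepoint above the BMP into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only codepoints above the BMP need surrogates")
    offset = cp - 0x10000             # 20 significant bits remain
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low

def utf8_encode(cp: int) -> bytes:
    """Encode one codepoint as 1 to 4 UTF-8 bytes."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

cp = 0x20000  # first codepoint of the Extension B block
print([hex(u) for u in to_surrogates(cp)])
print(utf8_encode(cp).hex())
```

The lead byte of a UTF-8 sequence announces its length, and every continuation byte starts with the bits 10, so a decoder can resynchronize from any position.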

10 Review10 Input Methods What is an input method, and why do we need it? What are the different types of input methods? What is a keyboard-based input method? How is an IM designed? –What is the basic requirement? –What are the limitations? –What information can be used in IM design? Who are the main users? Efficiency considerations? What are the two types of IM? –Applicability and limitations What is a keyboard arrangement, and why do we need it?

11 Review11 Software L10N and I18N What is L10N, and why do we need it? What is I18N, and why do we need it? What are the principles of I18N? How are I18N programs designed? What is POSIX, and what is its purpose? What is the name of the POSIX facility for a specific region? What are the components in a POSIX NLS package? What is a locale, and what are the classes in each locale?
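
As a rough illustration of locale classes, the sketch below queries the current setting of each standard POSIX locale category through Python's `locale` module. The helper name `current_locale_settings` is hypothetical, and the category list is the usual POSIX set rather than something taken from the slides; `LC_MESSAGES` is skipped on platforms that do not provide it.

```python
import locale

# The standard POSIX locale categories
CATEGORY_NAMES = [
    "LC_CTYPE",     # character classification and case conversion
    "LC_COLLATE",   # string collation order
    "LC_MONETARY",  # currency formatting
    "LC_NUMERIC",   # decimal point and digit grouping
    "LC_TIME",      # date and time formatting
    "LC_MESSAGES",  # language of program messages
]

def current_locale_settings() -> dict:
    """Return the locale currently selected for each available category."""
    settings = {}
    for name in CATEGORY_NAMES:
        category = getattr(locale, name, None)
        if category is None:
            continue  # category not available on this platform
        # Calling setlocale with no locale argument only queries the value.
        settings[name] = locale.setlocale(category)
    return settings

for name, value in current_locale_settings().items():
    print(f"{name:12} {value}")
```

Each category can be set independently, which is how one program can, for example, collate in one language while formatting numbers by another convention.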

12 Review12 POSIX provides a set of interface functions; how and where are their behaviors defined? What are the major files in each locale? If POSIX had never been developed, could you still develop an I18N program on top of an operating system? What is a symbolic name, and where are symbolic names used? How do we know the binary code of a symbolic name? Programming using the wide character data type vs. multi-byte characters What is collation, and how does it work?
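
The practical difference between multi-byte and wide character programming can be sketched by indexing the same text in both forms. This is an analogy only: it uses GB-2312 bytes as the multi-byte form and UTF-32 as a stand-in for a fixed-width wide character type, and both helper names are hypothetical.

```python
text = "A中B"

multibyte = text.encode("gb2312")  # 1 + 2 + 1 bytes, width varies
wide = text.encode("utf-32-be")    # 4 bytes for every character

def nth_wide_char(data: bytes, n: int) -> str:
    """Fixed width: the n-th character sits at a computable offset."""
    return data[4 * n: 4 * n + 4].decode("utf-32-be")

def nth_multibyte_char(data: bytes, n: int) -> str:
    """Variable width: scan from the start to find the n-th character."""
    i = 0
    for _ in range(n):
        # A byte with the high bit set begins a two-byte character.
        i += 2 if data[i] >= 0x80 else 1
    width = 2 if data[i] >= 0x80 else 1
    return data[i: i + width].decode("gb2312")

print(len(multibyte), len(wide))
print(nth_wide_char(wide, 1), nth_multibyte_char(multibyte, 1))
```

Wide characters trade storage for constant-time indexing, while multi-byte strings are compact but must be scanned character by character.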

13 Review13 Open systems What is an open system? Why do we want open systems? What are the measurements of an open system? What is an open specification? What are the two types of portability issues? What mechanisms can be used to improve portability or how can we write portable programs?

14 Review14

15 Review15 Output What are characters, glyphs and fonts? What are their relationships and differences? –Internal representation vs. external representation What is the difference between the character box and the bounding box? Why should there be space between the character box and the bounding box? What does rendering mean? What are the two different glyph/font representations?

16 Review16 What are the characteristics of bitmap fonts and outline fonts? –Representations, scaling (distortion), space requirement, compression How can distortion in the scaling of bitmap fonts be dealt with? –Ad hoc smoothing algorithms –Smoothing spline and interpolation Understanding of Bézier cubic curves –Control points and the equations Why is bitmap-to-outline conversion needed? How does erosion work?
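
The cubic Bézier equation, B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3 for t in [0, 1], can be evaluated directly. The sketch below is a minimal illustration; the function name `cubic_bezier` and the four control points are chosen only for this example.

```python
def cubic_bezier(p0, p1, p2, p3, t: float) -> tuple:
    """Evaluate a cubic Bezier curve at parameter t (0 <= t <= 1)."""
    u = 1.0 - t
    # Bernstein weights; they always sum to 1
    w0, w1, w2, w3 = u**3, 3 * u**2 * t, 3 * u * t**2, t**3
    x = w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0]
    y = w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1]
    return (x, y)

# Hypothetical control points: the curve starts at P0 and ends at P3,
# while P1 and P2 pull the curve toward themselves without lying on it.
P0, P1, P2, P3 = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)

for step in range(5):
    t = step / 4
    print(t, cubic_bezier(P0, P1, P2, P3, t))
```

Because the outline is stored as control points rather than pixels, scaling the glyph means scaling four points per segment and re-evaluating the curve, which avoids the distortion seen when bitmaps are enlarged.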

17 Review17 Unicode on different platforms On what platforms is Unicode supported, and in what forms? –Unix, Windows, Mac, Linux What is a code page? Can Unicode be used if the operating system is not coded using Unicode? Why would the encoding need to be specified when compiling a Java program? What are the data structures supporting multi-byte characters and Unicode in Java?

18 Review18 I18N vs. multilingual applications What is the difference between an I18N program and a multilingual application? Can a multilingual application be designed/implemented using I18N? What needs to be separately considered in the design of multilingual applications? What is the relationship between multilingual applications and Unicode?

19 Review19 IDCs and the IDS What are ideographic description characters (IDCs)? –Different types of IDCs Why were IDCs introduced? What is an ideographic description sequence (IDS)? How is an IDS expressed? For a given character, is its IDS unique? For a given IDS, does it uniquely define a character?
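
How an IDS is expressed can be sketched as a prefix-notation check: each IDC is followed by the components it combines. The sketch assumes the twelve original IDCs at U+2FF0 to U+2FFB, of which U+2FF2 and U+2FF3 take three operands and the rest take two; it ignores later additions and any length limits, and the names `parse_ids` and `is_valid_ids` are hypothetical.

```python
# Number of operands each IDC takes (original twelve IDCs only)
ARITY = {chr(cp): 2 for cp in range(0x2FF0, 0x2FFC)}
ARITY["\u2FF2"] = 3  # left, middle, right
ARITY["\u2FF3"] = 3  # above, middle, below

def parse_ids(s: str, pos: int = 0) -> int:
    """Consume one description starting at pos; return the next index."""
    if pos >= len(s):
        raise ValueError("sequence ended before all operands were given")
    ch = s[pos]
    pos += 1
    # An IDC is followed by its operands, each of which may itself be
    # a nested description; any other character is a leaf component.
    for _ in range(ARITY.get(ch, 0)):
        pos = parse_ids(s, pos)
    return pos

def is_valid_ids(s: str) -> bool:
    """True if s is exactly one well-formed description sequence."""
    try:
        return parse_ids(s) == len(s)
    except ValueError:
        return False

print(is_valid_ids("⿰女子"))  # left-to-right: 女 then 子
print(is_valid_ids("⿰女"))    # missing second operand
```

The check is purely structural: it confirms that a sequence is well formed, not that it describes an existing character or that it is the only possible description of one.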

20 Review20 Information retrieval Differences between an IRS and a database system Basic components of an IRS What is the purpose of the VSM? What data are associated with a VSM? What are the similarity functions for? What is term selection for, and what methods are used for term selection? What kinds of information can be used as weights for the VSM?
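
One common similarity function for the VSM is the cosine of the angle between two term vectors. The sketch below is a minimal illustration that assumes raw term frequency as the weight and whitespace splitting as the term selection step; both are simplifications, and the function names are hypothetical.

```python
import math
from collections import Counter

def tf_vector(text: str) -> Counter:
    """Represent a text as a vector of raw term frequencies."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)

doc = tf_vector("unicode encoding unicode")
query = tf_vector("unicode")
print(round(cosine_similarity(doc, query), 3))
```

Other weights, such as inverse document frequency, can replace the raw counts without changing the similarity function itself.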

