Health Information Standardization and Asian Languages Michio Kimura M.D. Ph.D. Director and Professor of Medical Informatics Department Hamamatsu University.


1 Health Information Standardization and Asian Languages
Michio Kimura, M.D., Ph.D.
Director and Professor, Medical Informatics Department, Hamamatsu University School of Medicine
Chair, HL7 Japan

2 Three types of representation: we have 2 patient names in HIS
- Alphabetic
- Ideographic
- Phonetic
  - Ideographic names
    - have many ways to pronounce
    - are difficult to sort
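A minimal sketch of why the phonetic representation matters for sorting. The `PatientName` record and the three sample names are hypothetical, and plain code-point ordering of katakana only approximates dictionary order (it does not handle voiced marks or long vowels properly).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientName:
    """Hypothetical record holding the three representations of one name."""
    ideographic: str
    phonetic: str
    alphabetic: str


names = [
    PatientName("山田", "ヤマダ", "Yamada"),
    PatientName("青木", "アオキ", "Aoki"),
    PatientName("木村", "キムラ", "Kimura"),
]

# Sorting on the ideographic form orders by code point, which has no
# relation to how the names are read aloud.
by_code_point = [n.alphabetic for n in sorted(names, key=lambda n: n.ideographic)]

# Sorting on the phonetic form follows the reading (a, ki, ya).
by_reading = [n.alphabetic for n in sorted(names, key=lambda n: n.phonetic)]

print(by_code_point)
print(by_reading)
```

Here the ideographic sort yields Yamada, Kimura, Aoki, while the phonetic sort yields Aoki, Kimura, Yamada, which is the order a clerk reading the list would expect.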

3 Multi-Byte Character Codes in Use in Asia
- Korea: KS X 1001, and 1001 annex 3
  - Hangul (phonetic) and ideographs
- China (PRC): GB 18030-2000
- Taiwan (ROC): CNS 11643, and Big-5
- Japan: JIS X 0208-1997
  - Katakana, Hiragana (phonetic) and ideographs
  - Junior school pupils must read/write 810 letters.
- Number of characters: 6,879 (JIS) to 48,711 (CNS)
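To illustrate that these national codes are mutually incompatible, the sketch below encodes one ideograph with four codecs from the Python standard library. The choice of character and of codecs is an assumption for illustration; `euc_jp` and `euc_kr` are used as carriers for JIS X 0208 and KS X 1001, and CNS 11643 is omitted because the standard library has no codec for it.

```python
# Codec names are Python standard-library names, picked for illustration.
CODECS = {
    "Japan (EUC-JP, carrying JIS X 0208)": "euc_jp",
    "China (GB 18030)": "gb18030",
    "Taiwan (Big5)": "big5",
    "Korea (EUC-KR, carrying KS X 1001)": "euc_kr",
}

ideograph = "山"  # "mountain", present in all four repertoires

# Encode the same character under each national code.
encoded = {label: ideograph.encode(codec) for label, codec in CODECS.items()}

for label, raw in encoded.items():
    print(f"{label}: {raw.hex()}")

# Decoding only works when the receiver knows which code was used.
decoded = {label: encoded[label].decode(codec) for label, codec in CODECS.items()}
```

The same two-byte slot means a different character in each code, so a message has to say which code it uses.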

4 ISO 2022-1983 Multi-Byte Extension Technique
- Base set is usually 1-byte ASCII (ISO 646)
- Defines ESCAPE sequences to designate a character set to G0 or G2
  - Not necessarily multi-byte; to set ISO 8859-1: ESC . A
  - If the set is 2-byte, the codes that follow are read 2 bytes at a time.
  - To set JIS X 0208: ESC $ B
  - To set KS C 5601: ESC $ ( C
  - To set GB 2312: ESC $ A
  - To come back to ASCII: ESC ( B
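The escape sequences above can be observed with Python's standard `iso2022_jp` codec. This is a small sketch; the sample text is made up, and the byte counts assume the codec designates JIS X 0208 with ESC $ B, as listed on the slide.

```python
ESC = b"\x1b"
TO_JIS_X_0208 = ESC + b"$B"  # designate the 2-byte set JIS X 0208
TO_ASCII = ESC + b"(B"       # come back to 1-byte ASCII

text = "Name: 木村"
encoded = text.encode("iso2022_jp")

# The ASCII prefix is emitted unchanged, then the codec switches sets.
ascii_prefix = b"Name: "
switch_at = encoded.index(TO_JIS_X_0208)

# Everything between the two escape sequences is read 2 bytes at a time.
two_byte_run = encoded[switch_at + len(TO_JIS_X_0208):-len(TO_ASCII)]

print(encoded)
print(len(two_byte_run) // 2, "two-byte characters")
```

Every byte stays below 0x80, which is what makes the technique usable on 7-bit channels.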

5 Byte-wise Representation of ISO 2022

6 RFC 1468: Japanese Character Encoding for Internet Messages
- ISO-2022-JP
- Stays within 7 bits, so it is safe for most nodes
- Every line starts and ends in ASCII
  - No shift state is carried over from one line to the next
- ISO-2022-KR is also used in Korea
- The same method is used in DICOM (Supplement 9) and HL7 v.2.3.1
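The "no carryover shifting" rule can be checked mechanically. The function below is a sketch, not a full ISO-2022-JP parser: `lines_end_single_byte` is a hypothetical helper that tracks only the four designations used by RFC 1468 and treats both ASCII and JIS-Roman as acceptable at end of line.

```python
ESC = 0x1B
LF = 0x0A
TWO_BYTE = (b"$@", b"$B")     # JIS C 6226-1978, JIS X 0208-1983
SINGLE_BYTE = (b"(B", b"(J")  # ASCII, JIS-Roman


def lines_end_single_byte(data: bytes) -> bool:
    """Return True if no line break occurs while a 2-byte set is active."""
    two_byte = False
    i = 0
    while i < len(data):
        if data[i] == ESC:
            designation = data[i + 1:i + 3]
            if designation in TWO_BYTE:
                two_byte = True
            elif designation in SINGLE_BYTE:
                two_byte = False
            else:
                raise ValueError(f"unexpected escape sequence at offset {i}")
            i += 3
            continue
        # ESC and LF never occur inside 2-byte data (0x21-0x7E only),
        # so stepping one byte at a time is safe for this check.
        if data[i] == LF and two_byte:
            return False
        i += 1
    return not two_byte


good = "患者: 木村\n住所: 半田".encode("iso2022_jp")
bad = b"\x1b$BLZB<\n\x1b(B"  # hand-built: line break before the shift back

print(lines_end_single_byte(good), lines_end_single_byte(bad))
```

Because each line is self-contained, a receiver can start decoding at any line boundary without knowing what came before.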

7 UNICODE: ISO 10646
- “By allocating 2 bytes for every character, UNICODE can represent every character in the world without any state or shifting technique.”
- 16 bits = 65,536
  - -> CJK unified ideographs

8 CJK Unified Ideographs

9 Why do we not use UNICODE as a message format? (I know it is used internally, but we do not want it to go outside as a message format.)
- If the Chinese “Bone” and our “Bone” are to be recognized as the same because of their symmetry, how about using these?
- The UNICODE consortium says “Introduction of Language information”.
  - We cannot write a “Chinese language textbook written in Japanese”.
  - We cannot properly accommodate Koreans living in Japan, whose names are in Korean letters but whose addresses are, of course, in Japanese.
- The original UNICODE dream is gone.

10 UTF-8: Transformation format of UNICODE
- UNICODE was originally 2 bytes for every character.
- 0000-007F: 0xxxxxxx
- 0080-07FF: 110xxxxx 10xxxxxx
- 0800-FFFF: 1110xxxx 10xxxxxx 10xxxxxx
- 1 byte: ASCII
- 2 bytes: Latin extensions, Greek, Russian, Arabic, etc.
- 3 bytes: Thai, Hangul, Katakana, Hiragana, CJK ideographs
- ASCII characters stay compatible with ASCII, so ASCII users can say “we are universal, because we use UNICODE,” to the disadvantage of ideographic users.
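The three bit patterns can be written out directly. The function below is an illustrative sketch limited to the 0000-FFFF range shown on the slide; `utf8_encode_bmp` is a hypothetical name, and real code would simply call `str.encode("utf-8")`.

```python
def utf8_encode_bmp(code_point: int) -> bytes:
    """Encode one code point in 0000-FFFF using the patterns from the slide."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only 0000-FFFF is covered by this sketch")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not characters")
    if code_point <= 0x7F:
        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    # 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


samples = ["A", "Ω", "骨"]  # ASCII letter, Greek letter, CJK ideograph
lengths = [len(utf8_encode_bmp(ord(ch))) for ch in samples]
print(dict(zip(samples, lengths)))
```

An ASCII letter costs one byte and an ideograph three, against two bytes in the national codes, which is the size penalty the slide refers to.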

14 HL7 Japan’s answer to HL7 v.3
- In XML, UNICODE will be the default in 2003.
- Even in UNICODE v3.1, the “over-unification” problem is not solved.
- But with XML Schema and XML namespaces, font information can be set in each tag.
  - In this way, a Korean name within a Japanese address can be described.
- The original UNICODE dream (all languages at the same time) is gone, but “many 1-byte languages + one 2-byte language” is not bad.
  - Pokémon
- Answer: “UNICODE can be the default, provided that we can continue to use each local practice now in use.”
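One way to picture per-tag language information is the standard `xml:lang` attribute. This is an assumption for illustration, not the mechanism HL7 Japan proposed: the element names `patient`, `name` and `address` are hypothetical, the sample values are made up, and `xml:lang` carries a language tag from which a renderer may choose a font, rather than font information itself.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the predefined xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

patient = ET.Element("patient")

# A Korean name, tagged as Korean ...
name = ET.SubElement(patient, "name", {XML_LANG: "ko"})
name.text = "김철수"

# ... inside a record whose address is tagged as Japanese.
address = ET.SubElement(patient, "address", {XML_LANG: "ja"})
address.text = "半田町 1-20-1"

document = ET.tostring(patient, encoding="unicode")
print(document)

# A receiver can recover the language of each field after parsing.
parsed = ET.fromstring(document)
languages = {child.tag: child.get(XML_LANG) for child in parsed}
```

With the language attached to each element, unified ideographs in the name and in the address can each be rendered with the appropriate national glyphs.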

15 Language representation is not the only issue
- Language used in:
  - Conversation with patients
  - School education
    - Medical, Nurse, Technicians
  - Medical record
    - Signs and symptoms
    - Reports
- Structure of data types
  - Address
    - 250 Wu-Hsing street
    - 1-20-1 Handa cho

16 Final Remarks
- Some operating systems (Windows NT 4.0 or later) use UNICODE internally.
- I do not blame their ignorance; maybe they just didn’t know.
- I oppose any proposal claiming that “UNICODE is the only way”.
- When using UNICODE, pay attention to each language’s proper fonts.
- Let’s collaborate and agree on an XML namespace for the language to be used, and submit it to the standards bodies.
- Please take part in the APAMI census for healthcare languages.

