
1 Japanese Records and Whether or not to Switch from MARC 8 to Unicode Storage (with an Innovative Interfaces Millennium local system): The University of Washington Law Library's Decision-making Process

2 Differences in Storage and/or Export Settings With Different Local Systems
Your Mileage May Vary
It's important to note that local systems vary widely in whether and how data is stored, imported, and exported. These differences will have a huge impact on the experience of librarians deciding whether or not to export records in Unicode from OCLC to the local system.
Innovative Interfaces Millennium Local Systems
Do not allow import of records encoded differently from the storage encoding. In other words, if III storage is set to Unicode, records must be imported from OCLC in Unicode. If storage is set to MARC 8, records must be imported in MARC 8.
Voyager Local Systems (CJK version)
Can be set to convert imported MARC 8 records to Unicode on the fly for storage. This makes the decision about exporting from OCLC Connexion in Unicode vs. MARC 8 less important (almost irrelevant).
Other Local Systems?
Local systems that store data in MARC 8 cannot import and display Unicode records unless they convert the records to MARC 8. Conversely, local systems storing data in Unicode cannot import MARC 8 records unless the data is converted to Unicode. Ask these questions about your local system:
What encoding is used for storage?
Is there a required encoding for imported records?
If not, are imported records automatically converted to the appropriate encoding for storage?
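The question of which encoding an exported record uses can be checked from the record itself: in MARC 21, Leader position 09 holds the character coding scheme (blank for MARC-8, "a" for UCS/Unicode). The following is a minimal Python sketch of that check. The function names, the sample leaders, and the `can_import` rule (which simply mirrors the "import encoding must match storage encoding" behaviour described for Millennium above) are illustrative assumptions, not part of any vendor's software.

```python
LEADER_LENGTH = 24  # a MARC 21 leader is always 24 characters long


def record_encoding(raw_record: bytes) -> str:
    """Return "MARC-8" or "Unicode" based on Leader/09 of a raw MARC 21 record."""
    if len(raw_record) < LEADER_LENGTH:
        raise ValueError("record is shorter than a MARC 21 leader")
    flag = raw_record[9:10]  # Leader/09: character coding scheme
    if flag == b"a":
        return "Unicode"
    if flag == b" ":
        return "MARC-8"
    raise ValueError(f"unexpected Leader/09 value: {flag!r}")


def can_import(raw_record: bytes, storage_encoding: str) -> bool:
    """Hypothetical rule: accept a record only if it matches the storage encoding."""
    return record_encoding(raw_record) == storage_encoding


# Made-up sample leaders that differ only in position 09.
marc8_leader = b"00714cam  2200205 a 4500"
unicode_leader = b"00714cam a2200205 a 4500"

print(record_encoding(marc8_leader), can_import(marc8_leader, "Unicode"))
print(record_encoding(unicode_leader), can_import(unicode_leader, "Unicode"))
```

A system that converts on import, as described for Voyager, would replace the strict comparison in `can_import` with a conversion step.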

3 Our Library is trying to decide… To switch, or not to switch…
MARC 8 or Unicode storage?
Diagram labels: OCLC Connexion; Japanese Records; Innovative Interfaces Millennium System; Marian Gould Gallagher Law Library

4 Unicode VS MARC 8 Basics
Computers store text as numeric codes. Unicode has become the standard for text storage worldwide. Its use facilitates the storage, transfer, and display of text in a wide range of computer software environments (the internet, databases, browsers, word processors, etc.).
What is MARC 8? MARC 8 has been the North American library community's text storage standard. (The group of 7/8-bit and 24-bit character sets used to encode MARC 21 records. These sets are specified in MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media, Character Sets, Part 1.) [1]
What is Unicode? Unicode has become the international standard for text storage. (The Universal Character Set (UCS) which is ISO 10646 and its industry counterpart Unicode.) [1]
[1] Source: LC's MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media: CHARACTER SETS http://www.loc.gov/marc/specifications/speccharintro.html
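To make "text as numeric codes" concrete, here is a minimal Python sketch. The sample string is an arbitrary illustration, not taken from the presentation. It shows that each character has one Unicode code point, while the bytes actually stored depend on the encoding form chosen. Python's standard library has no MARC 8 codec, so only Unicode encodings are shown.

```python
text = "日本語"  # illustrative sample: "Japanese language"

# Each character maps to exactly one Unicode code point.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# The same code points serialize to different byte sequences per encoding form.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")

print(code_points)       # ['U+65E5', 'U+672C', 'U+8A9E']
print(len(utf8_bytes))   # 9 bytes: three bytes per character in UTF-8
print(len(utf16_bytes))  # 6 bytes: two bytes per character in UTF-16
```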

5 What Problems are Specific to Japanese?
Q: Do some problems associated with Unicode vs. MARC 8 storage affect one language (such as Japanese) more than others?
A: Not really. Problems with character display for specific languages are more often an issue of font availability. Each application must have access to a font that will display the proper characters. Arial Unicode MS can display most Unicode characters. In library records, an additional issue is converting between MARC 8 and Unicode. But these issues can affect many languages and scripts, not just Japanese.

6 What Problems are Specific to Japanese?
Q: So are there any Japanese-specific problems?
A: Not when it comes to Unicode storage itself. But there are common problems with the display of kanji and Japanese romanization in library catalogs. These are mainly font-availability issues, not Unicode storage issues.
Examples of Font-based Problems Specific to Japanese
Romanization (diacritic problem): the alif, as in konʼin
Kanji: examples of Japanese kanji not in EACC (a different Unicode code point is required for a verified catalog record in OCLC)
MARC 8/EACC: 說 (U+8AAA) instead of 説 (U+8AAC)
MARC 8/EACC: 虛 (U+865B) instead of 虚 (U+865A)
MARC 8/EACC: 卷 (U+5377) instead of 巻 (U+5DFB)
MARC 8/EACC: 錄 (U+9304) instead of 録 (U+9332)
MARC 8/EACC: 查 (U+67E5) instead of 査 (U+67FB)
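The code-point pairs listed on this slide can be inspected with a short Python sketch that uses only the standard library. The names `EACC_TO_JAPANESE`, `to_japanese_forms`, and `merged_by_normalization` are hypothetical and are not part of OCLC Connexion, Millennium, or any MARC tool. The sketch suggests that Unicode normalization does not merge the two forms in each pair, so any conversion between them would need an explicit mapping table such as the illustrative one here.

```python
import unicodedata

# Pairs from the slide: the code point reached through MARC 8/EACC (key)
# and the code point of the form used in Japanese text (value).
EACC_TO_JAPANESE = {
    0x8AAA: 0x8AAC,
    0x865B: 0x865A,
    0x5377: 0x5DFB,
    0x9304: 0x9332,
    0x67E5: 0x67FB,
}


def to_japanese_forms(text: str) -> str:
    """Replace each listed EACC-side character with its Japanese counterpart."""
    return text.translate(EACC_TO_JAPANESE)


def merged_by_normalization(a: int, b: int, form: str = "NFKC") -> bool:
    """Return True if normalizing both code points yields the same string."""
    return unicodedata.normalize(form, chr(a)) == unicodedata.normalize(form, chr(b))


for src, dst in EACC_TO_JAPANESE.items():
    print(f"U+{src:04X} {chr(src)} -> U+{dst:04X} {chr(dst)}",
          "| merged by NFKC:", merged_by_normalization(src, dst))
```

Whether such a substitution is appropriate for a given record is a cataloging decision; the table only covers the five pairs named on the slide.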

7 What Problems are Specific to Japanese? Why Switch to Unicode Storage?
Q: If there are no problems with MARC 8 storage specific to Japanese, then why should our library switch to Unicode storage?
A: Consider this quote from Microsoft: "Deciding whether to store non-DBCS [double-byte character set] data as Unicode is generally determined by an awareness of the effects on storage, and about how much sorting, conversion, and possible data corruption might happen during client interactions with the data... However, for most applications the effect is negligible. Databases with well-designed indexes are especially unlikely to be affected…"

8 What Problems are Specific to Japanese? Why Switch to Unicode Storage?
A: (continued) "Most of the time, the decision to store character data, even non-DBCS data, in Unicode should be based more on business needs instead of performance. In a global economy that is encouraged by rapid growth in Internet traffic, it is becoming more important than ever to support client computers that are running different locales. Additionally, it is becoming increasingly difficult to pick a single code page that supports all the characters required by a worldwide audience." [2]
[2] See the Microsoft article Storage and Performance Effects of Unicode: http://msdn2.microsoft.com/en-us/library/ms189617.aspx

9 What are the Pros and Cons to Converting our Local System to Unicode Storage?
Advantages of Staying with MARC 8
It may not be possible to back out of a switch to Unicode if problems crop up
Your records have no risk of being damaged
Could be faster than Unicode (but probably is not)
In a phrase: If it ain't broke, don't fix it!
Advantages of Switching to Unicode
Could enhance data exchange capabilities: export/import, copy/paste between applications, network printing
Allows for display of your records in a wide variety of worldwide computing environments
May improve some long-standing problems with local system software (such as printing, display)
Supporting the international Unicode standard is one way of presenting your library catalog as a global resource
Nothing ventured, nothing gained!

10 Who decides whether to flip the switch to Unicode Storage?
In our library:
The Head of Technical Services: main contact with Innovative; requests information about successes/problems at other libraries
East Asian Law Department: responsible for Chinese, Japanese, and Korean records; works together with Tech Services
Diagram labels: OCLC Connexion; Gallagher Law Library Local System (an Innovative Interfaces, Inc. Millennium local system); MARC 8 Storage; Unicode Storage

11 What will our library do?
Undetermined! Our library is still in the decision process
We're considering all of the information noted in this presentation
We will probably decide soon!
University of Washington Marian Gould Gallagher Law Library

12 What sources of information are there?
Your Local System Guides
Library of Congress Guides, such as: LC's MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media: CHARACTER SETS http://www.loc.gov/marc/specifications/speccharintro.html
OCLC CJK Help
Microsoft Guides, such as: Storage and Performance Effects of Unicode: http://msdn2.microsoft.com/en-us/library/ms189617.aspx
Unicode Consortium http://www.unicode.org/
OCLC CJK listserv
Eastlib listserv

13 Flipping the switch… is up to you and your library…
Diagram labels: MARC 8 Storage; Unicode Storage

