1
Workshop on XML-Based Library Applications
5. Library Applications (Part One)
Hong Kong University of Science & Technology Library
updated :02
2
Outline
Part One
- Using XSLT (New Acquisitions List)
- Metadata Design (Electronic Journals)
- Multi-Script Considerations (Theses and Antique Maps)
Part Two
- XML Name Access Control Repository
3
New Acquisitions List (1)
Design considerations:
- No need to build a database
- Static files, one set for each week
- Web interface by Perl script
- Weekly static files generated by a Perl script as a nightly batch job
4
New Acquisitions List (2)
Weekly list generation (flow diagram):
1. Create a Review List in INNOPAC to obtain a list of III record numbers.
2. Retrieve the metadata for each record by sending xrecord requests (the xrecord= command) to INNOPAC.
3. INNOPAC sends back the metadata as IIIRECORDs.
4. Transform the IIIRECORDs with XSLT stylesheets into HTML pages and RSS files.
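The request step can be sketched as below. This is only an illustration: the host name, the record numbers, and the assumption that the xrecord= command is appended directly to the server address are hypothetical, and no request is actually sent.

```python
def xrecord_urls(base_url, record_numbers):
    """Build one xrecord request URL per III record number.

    Assumption: the command is issued as <server>/xrecord=<record number>.
    """
    base = base_url.rstrip("/")
    return [f"{base}/xrecord={number}" for number in record_numbers]


# Hypothetical server and record numbers taken from a review list.
urls = xrecord_urls("http://opac.example.org/", ["b1000001", "b1000002"])
for url in urls:
    print(url)
```

Each URL would then be fetched by the nightly batch job, and the returned IIIRECORD passed on to the transformation step.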
5
New Acquisitions List (3)
XSLT transformation of IIIRECORD to New Acquisitions Record:
- Requires a few passes of XSLT
- Locally developed tool converts EACC codes in "braced form" to UTF-8
- Sample IIIRECORD
- Resulting record after XSLT transformation
6
New Acquisitions List (4)
Conclusion: Using Perl scripts and XSLT stylesheets, a list of XML-formatted bibliographic records extracted from INNOPAC can be transformed into two completely different outputs (views): an HTML web page and an RSS news feed.
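The "one source, two views" idea can be illustrated with a small sketch. It does not use XSLT or Perl as the workshop did; it uses Python's standard xml.etree.ElementTree as a stand-in, and the record layout (records, record, title, author, link), the sample data, and the channel details are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified records; not the real New Acquisitions Record layout.
SAMPLE = """<records>
  <record><title>Sample Title One</title><author>Author A</author>
    <link>http://opac.example.org/record=b1000001</link></record>
  <record><title>Sample Title Two</title><author>Author B</author>
    <link>http://opac.example.org/record=b1000002</link></record>
</records>"""


def to_html(records):
    """Render the records as an HTML unordered list of links."""
    ul = ET.Element("ul")
    for rec in records.iter("record"):
        li = ET.SubElement(ul, "li")
        a = ET.SubElement(li, "a", href=rec.findtext("link").strip())
        a.text = rec.findtext("title")
        a.tail = " / " + rec.findtext("author")
    return ET.tostring(ul, encoding="unicode")


def to_rss(records):
    """Render the same records as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "New Acquisitions"
    ET.SubElement(channel, "link").text = "http://opac.example.org/"
    ET.SubElement(channel, "description").text = "Weekly list (example)"
    for rec in records.iter("record"):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec.findtext("title")
        ET.SubElement(item, "link").text = rec.findtext("link").strip()
    return ET.tostring(rss, encoding="unicode")


root = ET.fromstring(SAMPLE)
html_view = to_html(root)
rss_view = to_rss(root)
print(html_view)
print(rss_view)
```

Both functions read the same parsed records and differ only in the output vocabulary, which is the role the two XSLT stylesheets play in the workshop's setup.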
7
Electronic Journals Online (1)
Design considerations:
- Requires a database (on Tamino)
- Metadata schema design
- Indexing design
- Weekly updating by Perl script
- Decided to use the LibXML Perl module instead of XSLT stylesheets
8
Electronic Journals Online (2)
Weekly update (by Perl and LibXML2), flow diagram:
1. INNOPAC supplies the XML-formatted IIIRECORD.
2. Extract elements from the IIIRECORD.
3. Construct the EJ_RECORD.
4. Load the metadata into EJ Online.
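The extract-and-construct steps might look like the following sketch. The workshop used Perl with LibXML; Python's xml.etree.ElementTree is used here only as a stand-in. The field/subfield layout of the input, the TITLE and URL element names of the output, and the sample values are hypothetical and do not reproduce the real IIIRECORD or EJ_RECORD schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified input; the real IIIRECORD layout differs.
IIIRECORD = """<IIIRECORD>
  <field tag="245"><subfield code="a">Journal of Examples</subfield></field>
  <field tag="856"><subfield code="u">http://ej.example.org/joe</subfield></field>
  <field tag="856"><subfield code="u">http://mirror.example.org/joe</subfield></field>
</IIIRECORD>"""


def build_ej_record(iii):
    """Extract the title (tag 245) and links (tag 856) into a new EJ_RECORD."""
    ej = ET.Element("EJ_RECORD")
    title = iii.findtext("field[@tag='245']/subfield[@code='a']")
    ET.SubElement(ej, "TITLE").text = title
    # A record may carry several 856 fields, one per access link.
    for field in iii.findall("field[@tag='856']"):
        url = field.findtext("subfield[@code='u']")
        if url:
            ET.SubElement(ej, "URL").text = url
    return ej


ej_record = build_ej_record(ET.fromstring(IIIRECORD))
print(ET.tostring(ej_record, encoding="unicode"))
```

The resulting document is what would be loaded into the database in the final step; the loading call itself is omitted because it depends on the database interface.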
9
Electronic Journals Online (3)
Metadata design: decided not to use Dublin Core (DC)
- Internal metadata, not for exchange with external systems
- Programming overhead to incorporate DC
- DC would need to be extended to mark up MARC Tag 856, the hypertext link to the electronic resource
10
Electronic Journals Online (4)
Decided not to use RDF
- For the same reasons as above, although RDF can resolve the Tag 856 markup problem that DC has
- Sample abridged EJ_RECORD
11
Antique Maps and Theses (1)
HKUST Theses
Design considerations:
- Both databases are on Tamino
- Metadata as XML documents
- Hypertext links to PDF files
12
Antique Maps and Theses (2)
Multi-script considerations
Non-English characters:
- Diacritics
- Mathematical symbols and formulas
- Greek alphabet
- CJK
XML is UTF-8 by default.
Tamino stores XML documents in Unicode.
13
Antique Maps and Theses (3)
Unicode and UTF-8 explained:
- Unicode has been developed by the Unicode Consortium since 1991.
- It is a character coding system for written texts in diverse languages.
- The latest version is 4.0, released in 2003.
- It has 96,382 characters, 82,270 of which are CJK characters (including Hangul).
14
Antique Maps and Theses (4)
- Diacritics are combining characters, positioned relative to an associated base character. Example: a, ȧ
- UTF-8 transforms a Unicode scalar value into a sequence of 8-bit bytes.
- Letters of the English alphabet take one byte; CJK ideographs take three bytes.
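The relationship between a base character and a combining diacritic can be checked with Python's standard unicodedata module. This is a small illustration, not part of the original slides; it uses the letter a with a combining dot above (U+0307) as its example.

```python
import unicodedata

# Base letter "a" followed by U+0307 COMBINING DOT ABOVE.
decomposed = "a\u0307"
# NFC normalization composes the pair into the single character U+0227.
precomposed = unicodedata.normalize("NFC", decomposed)

# A non-zero combining class marks U+0307 as a combining character.
combining_class = unicodedata.combining("\u0307")

print(len(decomposed), decomposed.encode("utf-8"))
print(len(precomposed), precomposed.encode("utf-8"))
```

The two forms display the same but differ in length and in their UTF-8 bytes, which matters when metadata is indexed or compared.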
15
Antique Maps and Theses (5)
Examples of UTF-8 transformation:
- The Latin character A has a Unicode scalar value of U+0041. It is transformed to \x41.
- The Greek letter α has a Unicode scalar value of U+03B1. It is transformed to \xCE\xB1.
- The Chinese character 中 has a Unicode scalar value of U+4E2D. It is transformed to \xE4\xB8\xAD.
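The transformation can be written out by hand to show where the bytes come from. The function below is an illustrative sketch of the standard UTF-8 bit layout (the name utf8_encode is ours); in practice Python's built-in str.encode("utf-8") does the same job.

```python
def utf8_encode(scalar):
    """Encode one Unicode scalar value as UTF-8 bytes."""
    if scalar < 0 or scalar > 0x10FFFF or 0xD800 <= scalar <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if scalar < 0x80:            # 7 bits: one byte, 0xxxxxxx
        return bytes([scalar])
    if scalar < 0x800:           # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (scalar >> 6),
                      0x80 | (scalar & 0x3F)])
    if scalar < 0x10000:         # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (scalar >> 12),
                      0x80 | ((scalar >> 6) & 0x3F),
                      0x80 | (scalar & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (scalar >> 18),
                  0x80 | ((scalar >> 12) & 0x3F),
                  0x80 | ((scalar >> 6) & 0x3F),
                  0x80 | (scalar & 0x3F)])


for char in ["A", "α", "中"]:
    encoded = utf8_encode(ord(char))
    # The hand-written encoder agrees with Python's built-in codec.
    assert encoded == char.encode("utf-8")
    print(f"U+{ord(char):04X}", encoded)
```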
16
Antique Maps and Theses (6)
Demonstration: entering non-Latin characters into the metadata