OCLC Online Computer Library Center. Virtual International Authority File. Prepared by Ed O'Neill and Rick Bennett, OCLC. Presented by Alison Hall. International Association of Music Libraries, Oslo, Norway, August 2004
Background The IFLA Section on Cataloguing recognized the need for a virtual international authority file where authority records from the world's national bibliographic agencies could be linked and made available via the Internet. Such a file would: be a practical expansion of the concept of universal bibliographic control; build on the work done by each national bibliographic agency; allow national or regional variations in authorized form to coexist; support worldwide users' needs for variations in preferred language, script, and spelling.
Background The VIAF could be one of the basic building blocks of a semantic web when combined with other controlled vocabularies and authority files from such sources as abstracting and indexing services, archives, museums, publishers, etc. Libraries now have an opportunity to make a great contribution to this future and should help make this vision a reality. It is important to the development of this shared vision that the VIAF be made freely available to users worldwide on the web.
Joint Project A project to test the concept of a VIAF is being jointly undertaken by: Die Deutsche Bibliothek (DDB) The Library of Congress (LC) OCLC Online Computer Library Center (OCLC)
Project Goal Demonstrate the feasibility of VIAF by linking the personal names authority records between: Personennormdatei (PND) Library of Congress Name Authority File (LCNAF)
What is the VIAF? The VIAF will be a file of metadata to link users from records in one national bibliographic agency's personal name authority file to matching records in other national authority files. The VIAF will provide for access on the web through a specially designed user interface. The VIAF will provide for multi-lingual and multi-script capability. The VIAF will use Open Archives Initiative (OAI) protocols to harvest metadata from the agencies' authority files, which would then be added to the shared servers to keep the file updated.
Long Term Goal The system is being designed so that any number of authority files can be linked, not just the two being used initially.
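As a rough illustration of the harvesting step, the sketch below builds an OAI-PMH `ListRecords` request URL for an incremental harvest. The `verb`, `metadataPrefix`, and `from` parameters are part of the OAI-PMH protocol; the base URL and the helper name `list_records_url` are assumptions made for this example and are not taken from the VIAF project.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # 'from' limits the harvest to records changed on or after this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository address, used only for illustration.
url = list_records_url("https://example.org/oai", from_date="2004-08-01")
print(url)
```

Passing the date of the previous harvest as `from_date` is one way a harvester could pick up only new or changed authority records on each run.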
The Problem In the LCNAF and PND authority files: A particular person will have the same established form in both authority files (the ideal) Different people may be assigned the same established form Different forms of the name may be established for the same person
Two People – One Name Adams, Mike In the PND, the name is established for a golfer In LCNAF, the name is established for an author of a Beatles collector's guide
Two Names – One Person LC: Morel, Pierre PND: Morellus, Petrus
Brief Authority
010 n 84044261
040 DLC $c DLC $d DLC
100 1 Larson, Jack.
670 Thomson, V. The cat, c1982: $b t.p. (Jack Larson)
Information in Bibliographic Records From the bibliographic records we gain significant additional information about Jack Larson: he is a lyricist; his primary subject area is music (Subject No. 234); he was published in the 80s and 90s by G. Schirmer and Belwin Mills in New York; he worked with Virgil Thomson and Gerhard Samuel; Jack Larson is the only name he used on his publications; etc.
Project Phases Phase 1: Build enhanced authority files for both PND and LC person names Phase 2: Match PND and LC enhanced authority records to create the initial version of the VIAF Phase 3: Build OAI Server Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols Phase 5: Build end user interface with Unicode displays
Phase 1 Building the Enhanced Authority Files Authority records generally include very few, if any, details about the person and/or their publishing history. The information is rarely sufficient to determine if two different authority records represent the same person. To provide additional information to unambiguously match authority records for the same author, information from bibliographic records is used to enhance the authority record.
Enhancing the Authorities: Bibliographic Record -> Derived Authority Record -> Enhanced Authority
Mining the Bibliographic Record
LDR 00826ccm 2200289 a 4500
001 ocm10025532
005 20031229650847.0
008 840627s1982 nyuuua n eng
010 $a 84758340
040 $a DLC $c DLC
019 $a 17706440
020 $c $2.95
028 22 $a 48418 $b G. Schirmer
045 2 $b d198006 $b d198007
048 $b va01 $b ve01 $a ka01
050 00 $a M1529.3 $b.T
100 1 $a Thomson, Virgil, $d 1896-
245 14 $a The cat : $b duet for soprano and baritone / $c Virgil Thomson ; [words by Jack Larson].
260 $a New York : $b G. Schirmer, $c c1982.
300 $a 1 score (11 p.) ; $c 31 cm.
500 $a For soprano, baritone, and piano.
650 0 $a Vocal duets with piano.
600 10 $a Larson, Jack $x Musical settings.
700 1 $a Larson, Jack.
Elements highlighted: Authors; LC Control Number; LC Classification; Title; Material Type; Publisher; Place of Publication; Language; Date of Publication; Usage
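To make the mining step concrete, here is a minimal sketch that splits one field of the record above, written in the slide's `TAG indicators $code value` notation, into its subfields. It is an illustration only: the function name `parse_field` is hypothetical, it handles data fields with `$` subfields but not control fields such as the 008, and it is not the tooling used in the project.

```python
import re


def parse_field(line):
    """Split a 'TAG [indicators] $a value $b value' line into its parts."""
    tag, _, rest = line.partition(" ")
    # Whatever precedes the first subfield marker is treated as indicators.
    indicators, marker, body = rest.partition("$")
    body = marker + body
    # A subfield runs until the next ' $x ' marker or the end of the line.
    subfields = re.findall(r"\$(\w)\s+(.*?)(?=\s+\$\w\s|$)", body)
    return tag, indicators.strip(), [(c, v.strip()) for c, v in subfields]


tag, indicators, subfields = parse_field(
    "260 $a New York : $b G. Schirmer, $c c1982."
)
print(tag, dict(subfields))
```

From the parsed 260 field, subfield `b` gives the publisher and subfield `c` the publication date, two of the elements the slide highlights.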
Derived Authority Record
LDR 00525nz 2200229n 4500
001 xlc 1
003 OCoLC
005 20040721111415.0
008 040721nneanz||abbn n and d
040 $a OCoLC $b eng $c OCoLC $f viaf
100 1 $a Larson, Jack.
903 $a 84758340
910 14 $a the cat $b duet for soprano and baritone
921 $a g schirmer
922 $a nyu
930 $a jack larson
940 $a eng
942 $a 234
943 $a 198x
944 $a cm
950 1 $a thomson, virgil $d 1896
Callouts: All text is normalized. Subjects are grouped into approximately [number missing in source] subject groups. Material type is coded. Publication date is by decade. Coauthor.
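The two transformations named in the callouts, text normalization and reducing dates to decades, might be sketched as follows. The exact rules used in the project are not given in the slides, so both functions are assumptions that merely reproduce the examples shown (for instance "G. Schirmer," becoming "g schirmer" and "c1982." becoming "198x"). This version is ASCII-only and drops all punctuation, whereas the personal names in the slide keep their comma.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return cleaned.strip()


def decade(date_text):
    """Reduce the first four-digit year found to a decade such as '198x'."""
    match = re.search(r"\d{4}", date_text)
    return match.group(0)[:3] + "x" if match else None


print(normalize("G. Schirmer,"), "|", normalize("The cat :"), "|", decade("c1982."))
```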
Enhanced Authority Record
LDR 00824nz 2200301n 4500
001 oca01144962
005 19840809154202.7
008 840702n| acannaab| |n aaa |||
010 $a n 84044261
040 $a DLC $c DLC $d DLC
100 1 $a Larson, Jack.
670 $a Thomson, V. The cat, c1982: $b t.p. (Jack Larson)
903 $a 84758340 $9 1
903 $a 93710923 $9 1
910 11 $a the cat $b duet for soprano and baritone $9 1
910 11 $a sun like $b on a poem by jack larson $9 1
921 $a g schirmer $9 1
921 $a belwin mills publ corp $9 2
922 $a nyu $9 2
930 $a jack larson $9 1
940 $a eng $9 2
942 $a 234 $9 2
943 $a 198x $9 1
943 $a 197x $9 1
944 $a cm $9 2
950 11 $a thomson, virgil $d 1896 $9 1
950 11 $a samuel, gerhard $9 1
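One plausible way to build such a record is to tally how often each derived value occurs across all bibliographic records for the name. The sketch below assumes that `$9` is an occurrence count, which the slides do not state, and uses two hypothetical derived records loosely modeled on the Jack Larson example; the counts it produces are not meant to reproduce the slide exactly.

```python
from collections import Counter, defaultdict


def merge_derived(derived_records):
    """Tally each (tag, value) pair across a name's derived records."""
    counts = defaultdict(Counter)
    for record in derived_records:
        for tag, value in record:
            counts[tag][value] += 1
    return counts


# Hypothetical derived records; each is a list of (tag, value) pairs.
first = [("921", "g schirmer"), ("922", "nyu"),
         ("943", "198x"), ("950", "thomson, virgil")]
second = [("921", "belwin mills publ corp"), ("922", "nyu"),
          ("943", "197x"), ("950", "samuel, gerhard")]

enhanced = merge_derived([first, second])
for tag in sorted(enhanced):
    for value, count in enhanced[tag].items():
        print(tag, "$a", value, "$9", count)
```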
NACO Personal Name Authorities Differentiated names: 3,854,587 Undifferentiated names: 38,010 Total authority records: 3,892,597
LC Bibliographic Records Number of records: 6,118,657 Personal Names assigned: 6,569,957 Unique Personal Names: 2,674,687
PND Personal Name Authorities Total authority records: 2,498,071
DDB Bibliographic Records Die Deutsche Bibliothek (DDB): 6,316,675 Bibliotheksverbund Bayern (BVB): 5,022,316 Total number of records: 11,338,991 Number of assignments: 12,080,387 Number of unique names: 2,371,461
Matching Objectives Each distinct author should be uniquely identified. Author: An individual person responsible for the intellectual or artistic content of a work. Established Names: A symbol (character string) used to represent an author. Names will not necessarily be the same in the LCNAF and the PND authority files.
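The slides do not describe the matching rule itself, so the following is only a toy sketch of how evidence from enhanced records could separate the cases described earlier (same name for different people, different names for one person). The tags compared, the threshold, the function names, and the placeholder evidence values are all assumptions made for illustration.

```python
def shared_evidence(rec_a, rec_b, tags=("910", "950")):
    """Count values (titles, coauthors) that both records have in common."""
    return sum(
        len(set(rec_a.get(tag, ())) & set(rec_b.get(tag, ()))) for tag in tags
    )


def classify(name_a, rec_a, name_b, rec_b, threshold=1):
    """Label a pair of authority records using name equality plus evidence."""
    same_name = name_a == name_b
    same_person = shared_evidence(rec_a, rec_b) >= threshold
    if same_person:
        return "same person, same name" if same_name else "same person, different names"
    return "different people, same name" if same_name else "no match"


# Placeholder evidence, not real bibliographic data.
lc_morel = {"910": ["placeholder title"], "950": []}
pnd_morellus = {"910": ["placeholder title"], "950": []}
print(classify("Morel, Pierre", lc_morel, "Morellus, Petrus", pnd_morellus))

lc_adams = {"910": ["placeholder beatles guide"]}
pnd_adams = {"910": ["placeholder golf title"]}
print(classify("Adams, Mike", lc_adams, "Adams, Mike", pnd_adams))
```

A real matcher would need weighted evidence and cross-language title handling; the point here is only that identical name strings are neither necessary nor sufficient for a match.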
Future of VIAF? If the proof-of-concept is successful, the VIAF will be expanded: To include other authority files for personal names, To include other types of authorities – Corporate names, – Geographic names, – etc.
Phase 3: Build OAI Server [Diagram labels: LCNAF; DDB/PND; OAI Server(s)] Slide Courtesy of Barbara Tillett, Library of Congress
Phase 4: Ongoing maintenance and metadata harvesting using OAI protocols Slide Courtesy of Barbara Tillett, Library of Congress
Phase 5: Build End User Interface with Unicode displays. User's cookie specifies Hangul is preferred. Display 700 form, building on local system's authority structure. Slide Courtesy of Barbara Tillett, Library of Congress
Questions? Thank you. Rick_Bennett@oclc.org http://www.oclc.org/research/projects/viaf