1 Subject To Change: automatic catalog enrichment with subject headings and codes
10th IGeLU conference, Budapest, 3.9.2015
Marcus Zerbst, Zentralbibliothek Zürich, Systems Librarian

2 Overview
Background
→ Desire to align legacy data with GND, LCSH and DDC data
→ The Digital Assistant
Automatic catalog enrichment with subject headings and codes
→ Parameters, technical workflow, numbers and statistics
Prospect
→ Plan to use more subject cataloging data from external sources
Subject To Change, Marcus Zerbst, IGeLU conference, 3. Sept 2015

3 Excuse me, what is GND?

4 What's GND?
GND
→ The Integrated Authority File (GND) is an authority file for Persons, Corporate bodies, Conferences and Events, Geographic Information, Topics and Works. It is used above all for the cataloguing of literature by libraries […] It is operated cooperatively by the German National Library, all German-speaking library networks […]
→ In April 2012 the GND replaced [various] previously separate authority files.
(from http://www.dnb.de/EN/Standardisierung/GND/gnd_node.html)

5 Background

6 Desire: Subject Indexing of older records with GND/LCSH data
→ Since autumn 2012: homegrown subject indexing system discontinued in favour of GND
→ Standards of the German-speaking market instead of local peculiarities
→ Desire for better searchability of all titles through consistent indexing with GND data
→ Idea: enrich old records with GND data; manual cataloging is not an option

7 The Digital Assistant
→ In production for computer-aided subject indexing since October 2013
→ Helps subject librarians with indexing by suggesting subject entries according to GND
→ Generates suggestions based on external database entries, translation and statistical analysis of tables of contents
→ Is used through an intuitive web client
→ Daily import of fields for processed titles into Aleph

8 DA – Flow Chart
ILS Aleph
→ Import in DA: load TOC, load metadata (Z39.50)
→ Enrichment: subject headings from WorldCat,… and Autocat
→ Editing in DA: choice / addition of headings; suggestions, headings lists
→ Export from DA: daily in Aleph seq format; GND/S headings, scope 072, loc. fields
→ Import: daily by FTP; info eMail

9 ToC Technology Components DA

10 Project Rekat: automatic catalog enrichment with subject headings and codes

11 Project Rekat Parameters
→ Enrichment of all our records from publication years 1960-2012 (from autumn 2012 on: subject indexing with GND)
→ Around 1.8 million titles
→ Matching records against WorldCat
→ Enrichment with GND, LCSH and DDC

12 Retrospective Indexing – Flow Chart
Aleph / EBI01
→ Export: one-time, MARCXML
→ Match: WorldCat; GND, LCSH, DDC
→ Import: consortium cataloging rules, avoid duplicates

13 Technical Workflow

14 Find match in WorldCat
Order of precedence:
1. ISBN
2. Author & title: all terms from 245$$a/$$b and 100$$a; both must exist
3. As 2., but without stop words
4. Author a & title a/b/c in full record, without stop words; same number of terms
Search=Find
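The order of precedence above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record keys (`isbn`, `author`, `title_a`, `title_b`, `title_c`), the stop-word list and the `search` callable are hypothetical, and the slides do not describe how the WorldCat queries were actually issued. The "same number of terms" condition of step 4 is left to the `search` callable.

```python
import re

# Hypothetical stop-word list; the slides do not say which words were dropped.
STOP_WORDS = {"a", "an", "the", "of", "and", "der", "die", "das", "und"}


def terms(text, drop_stop_words=False):
    """Split a field value into lowercase search terms."""
    words = re.findall(r"\w+", text.lower())
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words


def build_search_steps(record):
    """Return (label, query terms) pairs in the order of precedence.

    `record` is a plain dict with hypothetical keys: isbn, author (100$$a),
    title_a / title_b / title_c (245$$a / $$b / $$c).
    """
    steps = []
    if record.get("isbn"):
        steps.append(("isbn", [record["isbn"]]))

    author = record.get("author", "")
    title_ab = " ".join(filter(None, [record.get("title_a"), record.get("title_b")]))
    # Author and title must both exist for the term-based steps.
    if author and title_ab:
        steps.append(("author+title", terms(author) + terms(title_ab)))
        steps.append(("author+title, no stop words",
                      terms(author, True) + terms(title_ab, True)))
        title_abc = " ".join(filter(None, [title_ab, record.get("title_c")]))
        steps.append(("full record, no stop words",
                      terms(author, True) + terms(title_abc, True)))
    return steps


def find_match(record, search):
    """Try each step in turn; stop at the first one that yields a hit.

    `search(label, query)` is a caller-supplied function (an assumption here)
    that returns a matched record or None.
    """
    for label, query in build_search_steps(record):
        hit = search(label, query)
        if hit is not None:
            return label, hit
    return None
```

Keeping the steps as an ordered list makes the precedence explicit and lets the match type be recorded for later statistics.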

15 Import of Data: Adaption I
→ Transfer of records found in WorldCat
→ Received: all subjects of all matches, all DDC codes
→ Avoid duplicates:
→ Check against data in NEBIS catalog
→ On text level, no Aleph routine

16 Import of Data: Adaption II
→ Local routine adjusts MARC21 to KIDS (Swiss MARC rules)
→ Avoid duplicates
Received:
000085853 0820_ $$a501
000085853 08204 $$a501
Adapted and deduplicated:
000085853 0820_ $$a501

17 Import of data: Adaption III
Technical difference after adaption
In 082 all 2nd indicators are stripped during import; afterwards identical fields are purged. But if the 1st, valid indicator differs, then fields with identical content are technically different and will be imported.
Received:
000077241 082_4 $$a709.493
000077241 08204 $$a709.493
000077241 08214 $$a709.493
Adapted:
000077241 082__ $$a709.493
000077241 0820_ $$a709.493
000077241 0821_ $$a709.493
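The indicator handling described on slides 16 and 17 can be illustrated with a minimal sketch. It assumes each line has the shape "system number, tag plus two indicators, content" as shown on the slides; the function name `adapt_082` is hypothetical and this is not the local routine itself.

```python
def adapt_082(lines):
    """Blank the 2nd indicator of 082 fields, then drop exact duplicates.

    Each line is assumed to look like '000085853 08204 $$a501':
    system number, 5-character tag+indicators, field content.
    """
    seen = set()
    result = []
    for line in lines:
        sysno, field, content = line.split(maxsplit=2)
        if field.startswith("082"):
            # Keep tag and 1st indicator, replace the 2nd indicator with '_'.
            field = field[:4] + "_"
        key = (sysno, field, content)
        if key not in seen:  # purge fields that are now identical
            seen.add(key)
            result.append(f"{sysno} {field} {content}")
    return result


# Slide 16: the two received fields collapse into one.
print(adapt_082(["000085853 0820_ $$a501", "000085853 08204 $$a501"]))

# Slide 17: the 1st indicators differ, so all three fields survive.
print(adapt_082([
    "000077241 082_4 $$a709.493",
    "000077241 08204 $$a709.493",
    "000077241 08214 $$a709.493",
]))
```

Because the comparison happens after the 2nd indicator is blanked but still includes the 1st indicator, fields with the same DDC number remain distinct whenever their 1st indicators differ.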

18 Testing

19 Testing
→ Every 5th year is checked by a subject cataloger (12 years).
Inspection of correct attribution of subject terms to title
Inspection of correct and complete import to Aleph
→ By these measures several problems could be detected and eliminated.

20 Problems identified
→ Form/type heading «Online Publication» might be wrong and so was eliminated in general
→ LCSH: 650_0 is not identical to 65000: import of non-LCSH subjects in various languages, which was not intended
→ Music material (notes etc.) without ISBN tends to bring bad results, so it was eliminated

21 Assumed issues
→ 000080919 650 0 L $$aPhysics$$xEarly works to 1800
→ 000080919 650 0 L $$aPhysics$$yEarly works to 1800
→ 000196066 651 0 L $$aSwitzerland$$vBibliography
→ 000196066 651 0 L $$aSwitzerland$$xBibliography
→ 000229817 650 0 L $$aAdministrative law$$vCases$$zSwitzerland
→ 000229817 650 0 L $$aAdministrative law$$zSwitzerland$$vCases
→ 000229817 650 0 L $$aAdministrative law$$zSwitzerland$$xCases
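The slide gives the examples without commentary. Reading them as the same heading arriving with different subfield codes or a different subfield order is an assumption, and under that assumption such variants could be flagged with a sketch like the one below. The function names are hypothetical and nothing here is taken from the project's own tooling.

```python
from collections import defaultdict


def parse_subfields(content):
    """'$$aPhysics$$xEarly works' -> [('a', 'Physics'), ('x', 'Early works')]"""
    return [(part[0], part[1:]) for part in content.split("$$") if part]


def find_variant_headings(fields):
    """Group headings whose subfield values match regardless of code or order.

    `fields` is an iterable of (system number, tag, content) tuples.
    Returns only the groups that contain more than one distinct spelling.
    """
    groups = defaultdict(list)
    for sysno, tag, content in fields:
        # Ignore subfield codes and order: compare the set of values only.
        values = frozenset(value.strip().lower() for _, value in parse_subfields(content))
        groups[(sysno, tag, values)].append(content)
    return [variants for variants in groups.values() if len(set(variants)) > 1]


fields = [
    ("000080919", "650", "$$aPhysics$$xEarly works to 1800"),
    ("000080919", "650", "$$aPhysics$$yEarly works to 1800"),
    ("000196066", "651", "$$aSwitzerland$$vBibliography"),
    ("000196066", "651", "$$aSwitzerland$$xBibliography"),
    ("000229817", "650", "$$aAdministrative law$$vCases$$zSwitzerland"),
    ("000229817", "650", "$$aAdministrative law$$zSwitzerland$$vCases"),
    ("000229817", "650", "$$aAdministrative law$$zSwitzerland$$xCases"),
]
for variants in find_variant_headings(fields):
    print(variants)
```

A plain text-level duplicate check would treat all seven fields as different, which is presumably why they end up side by side in the catalog.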

23 Numbers and Statistics

24 Statistics
→ Records sent: 2.5 million
→ Records enriched: 1.8 m
Matches by ISBN: 1.45 m
→ Enrichment:
GND: 1.12 m
DDC: 1.33 m
LCSH: 1.21 m
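For orientation, the figures above can be turned into shares. The sketch below assumes that the ISBN matches and the GND, DDC and LCSH counts are all subsets of the 1.8 million enriched records. The slide does not state this, and its numbers are rounded, so the percentages are approximate.

```python
# Rounded figures from the slide, written out as record counts.
SENT = 2_500_000
ENRICHED = 1_800_000
SUBSETS = {
    "Matches by ISBN": 1_450_000,
    "GND": 1_120_000,
    "DDC": 1_330_000,
    "LCSH": 1_210_000,
}


def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(f"Enriched of sent: {share(ENRICHED, SENT)} %")
for label, count in SUBSETS.items():
    # Assumption: each count refers to the enriched records.
    print(f"{label} of enriched: {share(count, ENRICHED)} %")
```

Under these assumptions roughly 72 % of the records sent were enriched, and about four out of five enriched records were matched by ISBN.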

25 Processed / Enriched Titles

26 Match Types from Total Match

27 Subject Types from Success

28 Prospect

29 Project FRED – automatic daily enrichment
→ Trigger: on item creation or on item arrival
→ Continuous search for external data until final edit by subject specialist or item status is «loanable»
→ Stop flag by subject specialists or during final item handling
→ Subjects are imported to Aleph, CAT field
→ Different start and stop flags for ebooks
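The start and stop conditions listed above can be read as a simple daily loop. The following sketch is an interpretation only: the `Item` fields, the status value "in process" and the `fetch_subjects` callable are hypothetical, and the separate rules for ebooks are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    item_id: str
    status: str = "in process"     # hypothetical status value
    stop_flag: bool = False        # set by a subject specialist or in final handling
    final_edit_done: bool = False  # final edit by a subject specialist
    subjects: list = field(default_factory=list)


def should_search(item):
    """Keep searching until a stop condition from the slide applies."""
    return not (item.stop_flag or item.final_edit_done or item.status == "loanable")


def daily_run(items, fetch_subjects):
    """One daily pass: look up external subjects for every item still open.

    `fetch_subjects(item_id)` is a caller-supplied lookup (an assumption here)
    returning a list of subject strings. Returns the ids of updated items.
    """
    updated = []
    for item in items:
        if not should_search(item):
            continue
        new = [s for s in fetch_subjects(item.item_id) if s not in item.subjects]
        if new:
            item.subjects.extend(new)
            updated.append(item.item_id)
    return updated
```

Running the same pass every day gives the continuous search described on the slide: an item drops out of the loop as soon as it is flagged, finally edited, or loanable.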

30 Conclusion
→ Just started to load enrichment
→ Success and acceptance yet to be discovered
→ High demand for automatic processes
→ Changes in tasks and function of subject specialists
→ Reduced demand expected in a larger community environment

