A case study in using the Connexion Digital Import tool to streamline metadata creation in a digital state documents collection, or,... Christy Allen &

1 A case study in using the Connexion Digital Import tool to streamline metadata creation in a digital state documents collection, or,...
Christy Allen & Amy Rudersdorf
State Library of North Carolina
Southeastern CONTENTdm Users Group Annual Meeting, Starkville, MS
July 31, 2008

2 The Good, the Bad, and the Ugly [Graphic Removed]

3 What is Connexion Digital Import?
New-ish* feature in Connexion that allows you to:
1. upload a digital object to a new or existing MARC record in WorldCat, and
2. automatically “dump” the record (mapped to Qualified Dublin Core) and object into a hosted instance of CONTENTdm (and into the Digital Archive, if you have a subscription to that, too),
3. using the OCLC number as the connection point.
*See OCLC’s announcement here:

4 What is required?
Connexion version 2.0 or higher
Full-level authorization status or higher in OCLC
A hosted version of CONTENTdm!
An OCLC authorization that includes CONTENTdm authorization
A WorldCat record to attach digital content to

5 Why did we use it?
State Library of N.C. is the mandated depository for state government documents in North Carolina:
– Need to provide access to all state documents
– Source for original cataloging of *most* state documents in MARC
– Depository Library survey indicated our clients want us to continue full MARC cataloging of documents -- let’s re-use that data!
– Pilot project started using already-cataloged paper docs that have electronic versions

6 How does it work?

7 How does it work? (cont.)




11 It’s Magic!!! completely gratuitous picture of Stonehenge taken by our Cataloging Branch Head

12 The Good…
Multiple access points: WorldCat, ILS, CONTENTdm, and Google
Reuses already-existing metadata (MARC records)
Files are automatically moved into the Digital Archive for those who subscribe to it
Fits into existing cataloging workflow
CONTENTdm support is responsive

13 The Good... (continued)
CONTENTdm is ready out of the box
Built-in functionalities: JPEG2000, full-text searchability, user-friendly interface
Compound object functionality:
– Easy-to-use compound object interface
– Builds compound objects on-the-fly from PDF files
Crosswalking does allow special characters/diacritics to come through from WorldCat (special characters/diacritics can’t be easily added to records created through the Acquisitions Station until the fall release of CONTENTdm)

14 OK, maybe it’s not all magic...
Stonehenge snow globe. Doesn’t have quite the same effect. [Graphic of Stonehenge snowglobe Removed]
Likewise, MARC and QDC are not quite the same…

15 ... The Bad and the Ugly: post-crosswalk editing At first you feel like this guy...... but after a while it’s not so bad [Graphics of sad pig balloon and girl saying “I don’t care what you say. I’m gonna be a horse when I grow up.” Removed]

16 Why edit, you say? Doesn’t the full-text document contain everything the user needs? Well...
– The mapping between MARC and QDC is defined by OCLC and is “fixed,” so you don’t get to pick which MARC fields map into which QDC fields!
– This means that you may have:
1. Data mapping to a field in which you don’t want it
2. Data you don’t want at all that maps anyway
3. Data you want that doesn’t map anywhere

17 Data mapping to a field in which you don’t want it
– Where is this a problem?
dc.subject – 099/092/096 fields and non-LCSH subject terms applied by other institutions
dc.language – we use ISO 639-2 codes as a controlled vocabulary, but the free-text note field in MARC (546) maps to dc.language!
dc.relation – OCLC URL maps to this field instead of to dc.identifier
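The dc.language and dc.relation fixes above are the kind of edit that could, in principle, be scripted against exported records. Below is a minimal sketch of that idea, not the workflow the presenters used. It assumes each record is a plain dict of field name to list of values, that an OCLC URL can be recognized by the string "worldcat.org", and that a small hand-built lookup table covers the languages in the collection. The function name, field names, and sample values are all hypothetical.

```python
# Hypothetical cleanup for one crosswalked record, modeled as a dict of
# field name -> list of values. Field names and the URL test are assumptions.

# Small illustrative lookup from language names to ISO 639-2 codes.
LANGUAGE_CODES = {"english": "eng", "spanish": "spa"}


def clean_record(record):
    """Return a cleaned copy of a crosswalked record; the input is not modified."""
    cleaned = {field: list(values) for field, values in record.items()}

    # Move OCLC/WorldCat URLs from dc.relation to dc.identifier.
    relations = cleaned.get("relation", [])
    urls = [value for value in relations if "worldcat.org" in value]
    cleaned["relation"] = [value for value in relations if value not in urls]
    cleaned.setdefault("identifier", []).extend(urls)

    # Replace free-text language notes with codes when a code is known;
    # keep the original note otherwise so a person can review it.
    languages = []
    for note in cleaned.get("language", []):
        matches = [code for name, code in LANGUAGE_CODES.items()
                   if name in note.lower()]
        for value in (matches or [note]):
            if value not in languages:
                languages.append(value)
    cleaned["language"] = languages
    return cleaned
```

Notes that match no known language are passed through unchanged rather than dropped, so nothing is lost silently.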

18 Data you don’t want at all that maps anyway (1/2)
MARC 099/092/096 fields (call & cutter numbers) map to dc.subject field in CONTENTdm

19 Data you don’t want at all that maps anyway (2/2)
– Issues:
CONTENTdm supplies a controlled vocabulary (TGM) for this field, or you can implement your own. However, the CV is difficult to apply because every record now contains a unique value that does not exist in the controlled vocabulary!
If you DO apply a controlled vocabulary to the dc.subject field and forget to remove the classification number while editing the record, the system will not let you save the record, and you may lose all your other edits to that record.
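One way to avoid the failed save described above would be to check a record's subject values against the vocabulary before editing begins. The sketch below assumes the vocabulary is available as a plain list of terms and that matching should ignore case and surrounding whitespace. The function name and the sample call number are hypothetical and are not part of CONTENTdm.

```python
def split_subjects(subjects, vocabulary):
    """Separate subject values into those found in the vocabulary and those not."""
    allowed = {term.casefold() for term in vocabulary}
    kept, rejected = [], []
    for value in subjects:
        # Compare case-insensitively, ignoring stray whitespace.
        target = kept if value.strip().casefold() in allowed else rejected
        target.append(value)
    return kept, rejected
```

For example, `split_subjects(["Roads", "HE356.N8 A3"], ["Roads", "Bridges"])` returns `(["Roads"], ["HE356.N8 A3"])`. The rejected list is where crosswalked call numbers would show up.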

20 Data you want that doesn’t map to any CONTENTdm field
– 041 (language codes)
– 780/785 (title replaces/replaced by fields): only certain indicator/subfield combos are crosswalked
– 260 $a (place of publication)
– 245 $c (statement of responsibility)
So, we manually add some of this information...

21 Fields that don’t exist in MARC
We repeatedly input the same data directly into multiple CONTENTdm records because...
1. the data simply doesn’t exist in the MARC record, and
2. we can’t apply a CONTENTdm template to a record dumped directly from Connexion
Examples: “Collection,” “Digital Format,” “Rights,” etc.
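If records can be worked on outside CONTENTdm, the repeated typing described above could be replaced by a small script that fills in constant values. This is only a sketch under that assumption: records are modeled as plain dicts, and the function name, field names, and sample values are hypothetical rather than taken from CONTENTdm.

```python
def apply_constant_data(record, constants):
    """Return a copy of record with constant values filled into empty fields."""
    filled = dict(record)
    for field, value in constants.items():
        # Only fill fields that are missing or empty; never overwrite
        # a value that a cataloger has already entered.
        if not filled.get(field):
            filled[field] = value
    return filled
```

For example, `apply_constant_data({"Title": "Annual report"}, {"Digital Format": "PDF"})` returns a record with both fields, while an existing "Digital Format" value would be left alone.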

22 Controlled vocabulary issues
We use LCSH and LC name authorities in various fields
Terms were loaded into CONTENTdm after pulling the data from our Voyager system
If the WorldCat record had authority headings that were added or changed before load, those terms aren’t in our CV
In Admin module: new controlled vocabulary terms can’t be added to the CV directly from the record (must be laboriously added before record is edited)
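Because new terms have to be added to the CV before a record is edited, it may help to list the missing terms for a whole batch up front. The sketch below counts how many records use each heading that is absent from the vocabulary. It assumes records are dicts of field name to list of values and that the vocabulary is a plain list of terms. The function name and field names are hypothetical.

```python
from collections import Counter


def count_missing_terms(records, vocabulary, fields=("subject", "creator")):
    """Count, across a batch, the headings that are not in the vocabulary."""
    allowed = set(vocabulary)
    counts = Counter()
    for record in records:
        for field in fields:
            for term in record.get(field, []):
                if term not in allowed:
                    counts[term] += 1
    return counts
```

Calling `count_missing_terms(batch, cv).most_common()` would list the most frequently used missing headings first, which suggests an order for adding them.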

23 MARC record authorization problems
Our OCLC authorization = “Enhance level”
Some of “our” MARC records have been upgraded to Elvl:[blank] (i.e., we can’t edit them anymore)
CDI process replaces record, but we no longer have authorization to do so!
OCLC has recommended we create a duplicate record
We are brainstorming other alternatives with OCLC

24 Workflow for Editing New Items
1. New items added through CDI appear in the live repository (not in approval queue)
– We don’t insert a collection name into these records until they are edited/approved, so that they don’t come up in a collection-specific search (the item will still come up in a repository search)

25 Workflow for Editing New Items
2. Newly imported records are batch-downloaded into the Acquisitions Station, edited, and re-uploaded with the Collection name
3. They then become accessible through collection and repository searches
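The workflow above uses the collection name as a marker for "edited and approved." A minimal sketch of that convention, assuming records are plain dicts and that the collection name lives in a field called "collection" (a hypothetical field name), might look like this:

```python
def split_by_edit_status(records, collection_field="collection"):
    """Split records into edited (collection name set) and unedited (not set)."""
    edited, unedited = [], []
    for record in records:
        # A record counts as edited only once its collection name is filled in.
        (edited if record.get(collection_field) else unedited).append(record)
    return edited, unedited
```

Under this convention, a collection-specific search sees only the first list, while a repository-wide search sees both.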

26 A search within the Publications Collection for “Dept. of Transportation” returns 7 hits (all edited records)

27 A search across the entire repository for the same phrase returns 12 hits (3 of the first 4 are unedited records)

28 Other Issues
Import isn’t always successful (sometimes, the digital object isn’t “there” when you index the collection)
Unspecified time lags may occur during digital import
Large bandwidth required for digital import to work consistently
Can’t export administrative fields auto-populated by OCLC (e.g., the OCLC number)
*Not really a CDI issue, but since we’re here...

29 Potential improvements? (1/2)
Use templates (or something) to apply “constant data” to imported records
Add controlled vocabulary terms directly from metadata record while working in Admin module
Attach digital content to ALL records (including CONSER/Elvl:[blank] records)
Suppress individual records in the “live” collection until ready to make them publicly available

30 Potential improvements? (2/2)
Let CONTENTdm talk to the WorldCat authority file for controlled vocabularies
Some kind of visible “required fields” indicator in the Admin interface (customizable on a collection basis), so that required fields are obvious during the creation, editing, and updating process
Export ALL fields (both administrative and Dublin Core) from CONTENTdm

31 The Digital Import Process: Sometimes Weird but Very Useful The Flowbee cuts and vacuums hair at the same time! [Graphic of “Flowbee” in use Removed]

32 State Library of North Carolina Check out our collections: Christy Allen (soon to be Amy Rudersdorf (soon to be
