ALA Annual June 2008 CONTENTdm in ConTEXT Geri Ingram OCLC Digital Collection Services Manager, Customer Services.


1 ALA Annual June 2008 CONTENTdm in ConTEXT Geri Ingram OCLC Digital Collection Services Manager, Customer Services

2 Who should attend this morning? To get the most from the next hour and a half, Either you have: Experience building CONTENTdm collections OR Attended CONTENTdm Training Hands-on: on-site or on-line Demonstration only: Basic Use Webinar

3 Outline Part One: Review Software architecture Collections and Projects Part Two: Demonstration Importing and searching full text Research papers Yearbooks Postcards Books

4 CONTENTdm Architecture
Acquisition Stations or “clients”: JPEG2000 Extension, OCR Extension
Administration tools: Statistics, Authorization settings, Exporting to WorldCat
Custom Web interfaces; Web-based ‘Add’; CONTENTdm site pages
CONTENTdm Server: Unix (Linux, Solaris) or Windows (2000, 2003)
Archival repository; OCLC Connexion ‘digital import’
Search engines (e.g., Google®); WorldCat.org; WorldCat Local

5 Configuring a collection What’s a Collection? A group of objects (items) that Share the same metadata schema Live on the same CONTENTdm server How many Collections can I have? Up to 200 collections per server How many items can be in a collection? 16 million items per collection

6 Populating a collection Through the use of a “Project” What’s a CONTENTdm Project? A workspace on your personal computer Into which you import up to 5000 items at a time Where items reside until you upload to the server A group of settings that are applied to the items E.g., image display resolution, file format, branding E.g., automatic metadata input How many Projects can I have at one time? Limited only by your disk space on the workstation
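The 5000-item import limit mentioned on this slide implies that larger scanning runs have to be split before import. A minimal Python sketch of that arithmetic follows; the function name and sample file names are hypothetical and are not part of CONTENTdm, and the Acquisition Station itself is not scripted here.

```python
from typing import Iterator, List

# Per the slide: a Project accepts up to 5000 items at a time.
MAX_ITEMS_PER_IMPORT = 5000


def split_into_batches(
    files: List[str], batch_size: int = MAX_ITEMS_PER_IMPORT
) -> Iterator[List[str]]:
    """Yield consecutive slices of `files`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical file names standing in for a folder of scans.
    scans = [f"scan_{i:05d}.tif" for i in range(12000)]
    for number, batch in enumerate(split_into_batches(scans), start=1):
        print(f"Import {number}: {len(batch)} items")
```

With 12,000 sample files this yields three imports of 5000, 5000, and 2000 items.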

7 RELATIONSHIP of Collection to Projects Collection A single Collection Many Projects Collection Project 1 Project 3 Project 2

8 What’s a CONTENTdm object or item? CONTENTdm can store/index/search items in various formats Display any file format: Viewed with a Web browser natively or viewed via a plug-in Including: JPEG, JPEG2000, TIFF, PDF, WAV or MP3 audio, AVI or MPEG video, html, MrSID ® Simple items—e.g., images, sound files, research papers (We’ll load papers today as PDF items.) Compound objects—multiple simple items assembled together

9 CONTENTdm Compound Objects CONTENTdm defined classes Documents We will load a section of a yearbook Postcards We will load a handwritten postcard with a typescript Monographs (Structured documents) We will load a book with chapters Picture Cube (six-sided views)

10 Dublin Core metadata element set

11 Review: Basics of CONTENTdm Simple and Qualified Dublin Core element sets offered 100 fields per collection Only DC.Title required to create a record Dublin Core is basis for cross-collection searching Text is stored in a metadata field 128,000 characters per “full text search” field 200 collections/server—i.e., 200 different metadata schema
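Because each “full text search” field holds 128,000 characters, it can help to check transcript length before import. The sketch below is a pre-check only, assuming the limit counts characters as Python counts them; the function name is hypothetical, and how CONTENTdm handles longer text is not stated on the slide.

```python
# Limit quoted on the slide: 128,000 characters per "Full text search" field.
FULL_TEXT_FIELD_LIMIT = 128_000


def overflow(text: str, limit: int = FULL_TEXT_FIELD_LIMIT) -> int:
    """Return how many characters `text` exceeds `limit` by (0 if it fits)."""
    return max(0, len(text) - limit)


if __name__ == "__main__":
    transcript = "word " * 30_000  # 150,000 characters of sample text
    extra = overflow(transcript)
    if extra:
        print(f"Transcript is {extra} characters over the limit")
    else:
        print("Transcript fits in one full-text field")
```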

12 Providing searchable text Remember: metadata fields can be made searchable. In addition, full text extracted from the digital object itself can be stored in a metadata field designated as “Full text search” data type, in any of three ways:
1. Extracted (by server) from PDFs (if embedded to begin with)
2. Imported as a .txt transcript: typescripted from handwritten material, or OCR’d in advance (external OCR engine)
3. Generated by OCR “on-the-fly” (integrated ABBYY FineReader®)

13 Review: Populating collections Acquisition Station Projects (PC client) Add from CONTENTdm Administration (Browser-based) Connexion digital import (WorldCat cataloging client function)

14 Review: 1. Acquisition Station—PC client Project workspace Project settings Tools to manage Image settings Metadata settings

15 Review: 2. Add (Web-based function) Platform independent. Simple item add function may be used for single import of: Images: .jpg, .jp2, .tif (if bandwidth allows); PDF: single and multi-page; Audio; Video

16 Review: 3. Connexion digital import function

17 Simple items—some examples that carry text Reformatted materials e.g., books, documents, posters, broadsides, memos—scans may all contain text Born digital files e.g., PDFs, single or multi-page Single-page PDFs viewed as items May opt for ‘in-line’ Adobe viewer Multi-page PDFs may be handled as if compound object of type “document” Server side conversion Import as simple item regardless of conversion choice

18 Excerpted from Creating and managing text collections using CONTENTdm

19 First things first. Recap: Prepare the Collection For importing searchable text items, whether singly or in batch, at minimum:
1. One empty, searchable field is configured as “Full text search” data type to hold text.
2. Collection is configured to treat PDFs as compound objects.
3. Collection is configured to provide Full Resolution file management.
4. Other fields are made searchable, hidden, moved, or added, as needed.
5. OPTIONAL: the Web templates are adjusted to suppress display of components of compound objects in search results.

20 Recap: Prepare the items These PDFs have been created with searchable text embedded. Beware: Not all PDFs are created equal!

21 Demonstration 1a--Simple items One simple item—PDF with ‘hidden’ text Acquisition Station Import file Web-based Add

22 Demonstration 1b--Multiple simple items (Acquisition Station) A batch of simple items, two ways: Method A: Import a batch of simple digital items stored in folders (where Template Creator only is used to automatically generate metadata) Method B: Import a tab-delimited text file naming and describing the digital items (where metadata also resides in imported tab-d file)

23 Recap: Behind the scenes: prepare the items, organize folders Method A: PDFs had been created with text (Adobe, Word conversion) For importing a batch of PDFs in one load, All PDFs were stored in one folder. Digitization Training

24 Recap: Behind the scenes: prepare the items, organize folders Method B: PDFs had been created with text (Adobe, Word conversion). For importing a batch of PDFs in one load, all PDFs were stored in one folder. For loading with tab-d files: prepare a .txt file of metadata and place it in a directory different from the .pdf files.
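For Method B, the tab-delimited .txt file could be drafted with a short script instead of by hand. The Python sketch below is illustrative only: the column headings, the derived titles, and the sample file names are assumptions, and the actual field names and mapping depend on how the collection is configured.

```python
import csv
import io
from pathlib import Path
from typing import Iterable, List, TextIO

# Hypothetical column headings; match them to your collection's fields.
COLUMNS = ["Title", "Creator", "Date", "File Name"]


def build_rows(pdf_names: Iterable[str]) -> List[List[str]]:
    """Make one metadata row per PDF, deriving a draft title from the file name."""
    rows = []
    for name in sorted(pdf_names):
        title = Path(name).stem.replace("_", " ")
        rows.append([title, "", "", name])  # Creator and Date left blank to fill in
    return rows


def write_tab_delimited(rows: List[List[str]], out: TextIO) -> None:
    """Write a header line and the rows, separated by tab characters."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    sample = ["annual_report_2007.pdf", "budget_memo.pdf"]  # hypothetical names
    write_tab_delimited(build_rows(sample), buffer)
    print(buffer.getvalue())
```

To follow the slide's advice, the resulting .txt file would be saved in a directory other than the one holding the PDFs.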

25 Demonstration 2—Single Compound objects Yearbook (OCR’d transcript produced on the fly) Handwritten Postcard (with a previously created typescript file) Book (Separate transcript produced in advance)

26 Text: Newspapers
Wissahickon Valley Public Library (PA). Ambler Gazette Collection. [AccessPA consortium] http://205.247.101.31:2005/cdm4/browse.php?CISOROOT=%2Fwivp-gazett
Freeport News (NY). [LILRC consortium] http://209.139.1.182/cdm4/search.php
Summit Memory (Ohio). “The Ohio Informer” http://www.summitmemory.org/cdm4/search.php
Lehigh University (PA). Brown and White newspaper. [AccessPA consortium] [article segmentation] http://digital.lib.lehigh.edu/cdm4/browse.php?CISOROOT=/bw

27 Text: PDF documents
Arizona Memory Project http://azmemory.lib.az.us/ [PDF document accessible via “abstract”; no full text within CONTENTdm; requires secondary search] (Search ‘arizona visitor industry’)
Duquesne University, PA http://cdm256101.cdmhost.com/cdm4/search.php [PDF full text all within CONTENTdm; single search of all full text] (Search ‘bermuda’)

28 Questions & Answers Getting help with Text User Support Center Downloading the appropriate Acquisition Station JPEG2000 Installing, activating the OCR extension Tutorials to study Help files related to text works Write contentdmsupport@oclc.org

29 Questions? ingramg@oclc.org

30 Collections of documents: Text-based letters, newspapers, diaries, yearbooks, PDFs, and more

31 60-Day Free CONTENTdm Evaluation https://www3.oclc.org/app/contentdm/evaluation/

32 Contact: Ron Gardner, OCLC gardnerr@oclc.org 1-800-848-5878 For more information about CONTENTdm: www.oclc.org/contentdm/

