Content Management at Grainger Engineering Library Case studies from various digital library research projects Tom Habing



Outline
Introduction
Case studies
–DeLIver project
–Open Archives Initiative (OAI) projects
–Simultaneous Search
–Institute of Physics (IoP) archive
Challenges / Conclusion

Intro: DL Research Focus
Engineering / Scientific Resources
–Access and Discovery
–Full-Text Rendering (especially mathematics, MathML)
  Markup-based (SGML / XML)
  Standard Tools: XSLT, XML Schema
–Metadata Schemas: MARC, RDF, DC
–Linking
  DOI, OpenURL
–Search
  Distributed databases (Grainger Search Aid)
  Aggregated databases (Open Archives Initiative)

Intro: The Digital Library
‘Digital’, ‘Virtual’, ‘Electronic’ Library as network-based library without regard to place and time.
Tendency to apply term to collections and resources.
Digital Collections vs. Digital Library.
Emphasis on the integration of collections and services (NSDL).
Application of standards and protocols is important.

DeLIver
Testbed funded under DLI-I by NSF, DARPA, and NASA; awards made to 6 universities.
Large-scale testbed, distributed repository models, evaluation, web software.
CNRI D-Lib Test Suite Program, 1998-2001.
Collaborating Partners Program.
AIP, APS, ASCE, IEE, NRL, ASM, ACM, NTT Learning Systems, Elsevier.

DeLIver - Testbed
American Institute of Physics: APL, JAP, RSI
–16,000+ articles
American Physical Society: PRL
–10,000+ articles, weekly updates
ASCE Journals (25 titles)
–9,000+ articles
IEE Proceedings and Electronics Letters
–8,500+ articles
ASM International (formerly the American Society for Metals) Handbook.
ACM (Association for Computing Machinery).
Elsevier Science.

DeLIver - Project Objectives
Construct large-scale, multipublisher, markup-based full-text journal testbed.
Investigate processing, indexing, normalization, retrieval, rendering and linking.
Study end-user searching behavior and needs.
Develop one-stop-shopping retrieval techniques (Aggregation, Resource Linking).
Identify models for effective retrieval in distributed repository environment.

DeLIver - Accomplishments
Process and retrieve from multiple publishers and heterogeneous DTDs.
Cross-repository searching.
SGML to XML conversion.
Metadata extraction, representation, merging.
Transformation and rendering technologies.
Dynamic linking: forward/backward, from/to A & I services.
End-user studies.

DeLIver - Workflow
SGML files from publishers (FTP, CD, tape)
Convert to XML
Extract and process metadata using custom scripts and XSLT
–Create reference links
–Normalize
Process mathematics
Build search indices
Move files to web server
Tape backups
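
The metadata extraction and normalization step can be illustrated with a minimal sketch. The slides say this step used custom scripts and XSLT; the sketch below swaps in Python's standard-library ElementTree instead, and the element names (`front`, `fname`, `surname`, `fpage`) and sample values are hypothetical, since each publisher's DTD differs.

```python
import xml.etree.ElementTree as ET

# Hypothetical publisher XML; real publisher DTDs differ from one another.
SAMPLE_ARTICLE = """<article>
  <front>
    <title>  Thermal   transport in thin films </title>
    <author><fname>A.</fname><surname>Smith</surname></author>
    <author><fname>B.</fname><surname>Jones</surname></author>
    <journal>J. Appl. Phys.</journal>
    <volume>85</volume>
    <fpage>1234</fpage>
  </front>
</article>"""


def normalize(text):
    """Collapse runs of whitespace and trim both ends; None becomes ''."""
    return " ".join((text or "").split())


def extract_metadata(xml_text):
    """Pull a flat, normalized metadata record out of one article."""
    front = ET.fromstring(xml_text).find("front")
    authors = [
        normalize(a.findtext("surname")) + ", " + normalize(a.findtext("fname"))
        for a in front.findall("author")
    ]
    return {
        "title": normalize(front.findtext("title")),
        "authors": authors,
        "journal": normalize(front.findtext("journal")),
        "volume": normalize(front.findtext("volume")),
        "first_page": normalize(front.findtext("fpage")),
    }


record = extract_metadata(SAMPLE_ARTICLE)
```

Producing one flat record shape regardless of the source DTD is what makes later steps, such as building a cross-publisher search index, straightforward.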

DeLIver - Demos

DeLIver - Details
Web Server
–Dell PowerEdge 4300, Dual Pentium II, 512 MB
–145 GB across 5 HDs
  ~80 GB used by DeLIver content
–Windows 2000 (just upgraded from NT)
–IIS 5.0 (Active Server Pages, VBScript)
–Access is controlled via campus Bluestem service

DeLIver - Details
Database Server
–HP 9000 J200, HP-UX
–OpenText LiveLink database for full-text search capability
–Older Netscape web server CGI application
–Also MS SQL Server for metadata-only search

Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
HTTP and XML based protocol
Data providers
–Share metadata about their collections
Service providers
–Harvest the metadata and use it to develop different services (e.g., a search portal)
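
As a rough illustration of the harvesting side, the sketch below builds a ListRecords request URL and parses a response with Python's standard library. It assumes the OAI-PMH 2.0 namespaces; the base URL, the sample response, and the record values are made up for the example, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces from OAI-PMH 2.0 and unqualified Dublin Core (assumed version).
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented response used in place of a live data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2002-06-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample technical report</dc:title>
          <dc:creator>Doe, J.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the HTTP GET URL for a ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text):
    """Return one dict per harvested record with its identifier and DC fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [t.text for t in rec.findall(".//dc:title", NS)],
            "creators": [c.text for c in rec.findall(".//dc:creator", NS)],
        })
    return records


url = list_records_url("http://example.org/oai")
records = parse_records(SAMPLE_RESPONSE)
```

A production harvester would also follow resumption tokens and handle protocol errors, both of which this sketch leaves out.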

OAI-PMH - Demos
Grainger Data Provider
–http://g118.grainger.uiuc.edu/engdocoai/oai.asp?verb=ListRecords&metadataPrefix=oai_dc
Service Providers
–
–

OAI-PMH - Details
Data Providers
–Many open source toolkits for various platforms
–We have developed both ASP and JSP implementations
–Metadata can reside in various databases, as XML files on a file system, or a combination
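
On the data provider side, the core task is turning whatever the local metadata store holds into the XML the protocol expects. The sketch below is one possible shape for that step in Python, not the ASP or JSP code the slides refer to: it maps a hypothetical database row (a plain dict with invented field values) to an `oai_dc` element. The full protocol envelope (request echo, response date, record header) is omitted.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# Subset of Dublin Core elements handled by this sketch.
FIELDS = ("title", "creator", "date", "identifier")


def to_oai_dc(row):
    """Serialize one metadata row (dict of str or list of str) as oai_dc XML."""
    root = ET.Element("{%s}dc" % OAI_DC)
    for field in FIELDS:
        values = row.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow a single value or a repeated field
        for value in values:
            ET.SubElement(root, "{%s}%s" % (DC, field)).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical row, as it might come from a database query or an XML file.
xml_out = to_oai_dc({
    "title": "Sample technical report",
    "creator": ["Doe, J.", "Roe, R."],
    "date": "2002",
})
```

Because the function only sees a dict, the same serializer works whether the row came from a relational database, an XML file on disk, or a mix of the two.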

OAI-PMH - Details
Service Providers
–Also various open source implementations of OAI harvesters
–Cultural Heritage search is running on a Dell PowerEdge 4600, 4 GB RAM, 180 GB disk, RedHat Linux 7.3, U. Michigan DLXS software
–Engineering search is running on a Dell PowerEdge 6300, Quad Pentium, IIS ASP application, MS SQL Server database

Grainger Search Aid
Distributed search across multiple resources with a common interface
–Google, Library Catalog, A & I databases
Integrating A & I services with full-text resources (OpenURL, DOI)
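
The distributed search idea, one query sent to several targets and presented through a common interface, can be sketched as a parallel fan-out. This is an illustrative pattern only, not the Grainger Search Aid implementation; the three backend functions are hypothetical stand-ins for real targets.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real targets (catalog, A & I database, and so on).
def search_catalog(query):
    return ["Catalog record matching '%s'" % query]


def search_ai_database(query):
    return ["A & I citation matching '%s'" % query]


def search_unavailable(query):
    raise ConnectionError("target did not respond")


def federated_search(query, backends, timeout=5.0):
    """Send one query to every backend in parallel and group hits by source."""
    hits, errors = {}, {}
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        for name, future in futures.items():
            try:
                hits[name] = future.result(timeout=timeout)
            except Exception as exc:
                # A failing or slow target must not sink the other results.
                hits[name] = []
                errors[name] = str(exc)
    return hits, errors


hits, errors = federated_search("MathML", {
    "catalog": search_catalog,
    "a_and_i": search_ai_database,
    "offline": search_unavailable,
})
```

Grouping hits by source keeps each resource's results distinct in the common interface and lets the page report which targets failed.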

Institute of Physics (IoP) Archive
Recently acquired a local copy of the full text of the IoP archive back to 1874
–PDF, XML metadata, GIF and JPEG images
–550,000 files in 160 GB
Integrated with the OAI search interface
How to integrate this with the DeLIver material?

Misc. Challenges
Full-text rendering across different browsers, especially SciTech material with math and special characters (MathML)
Integrating heterogeneous resources
Maintaining code across software and OS updates
Typical source code control issues, especially for research projects which are transitioned into production

Conclusion
For a digital library, the biggest challenge isn’t managing one’s own content (although this is still a big challenge), but integrating, managing, and making accessible different content from a wide variety of sources, many of which are outside one’s direct control.
XML and related standards are helping enormously.
Many other standards, such as DOI, OpenURL, and OAI, are also critical to solving this problem.