

Digital Data Preservation: a schema-driven model
Student: Stacy Kowalczyk
Co-Authors: Clare McInerney and Phil Mitchell

Digital Data Preservation – the problem

Librarians and archivists have long lamented that much current research is in danger of being lost because it sits on computers under desks in offices: professors’ completed research, librarians’ bibliographies, curators’ specialized collection data. Each of these files is different in data, in format, and in technology. Not only is there no access to the data, there is little hope that the data can be preserved for future generations.

The Harvard University Digital Library Initiative thought this problem was ripe for research. A small team from the Harvard University Library Office for Information Systems was assembled to develop a solution to this digital data preservation problem. The research question: Can a system be developed to take an arbitrary data layout, store it in a preservation-quality format, and provide access to the contents via the web without a programmer’s involvement?

Stacy Kowalczyk, Phil Mitchell, and Clare McInerney developed the prototype system. Using XML schema technology, TED (TEmplated Database) allows a data owner to create a standard database for a collection, define a structured data format, and customize screens and parameters for search and display with minimal effort by either the data owner or the systems office.

The Solution

To accept any arbitrary data structure, the system had to be data independent. The core of the solution was therefore to abstract metadata dependencies out of the system into a template layer. Using XML Schema, the TED system takes a formal description of the metadata as input and automatically creates the query and display interface.

XML Schema is the key. TED relies on multiple schemas: the TED schema and the application data schemas. The TED schema describes all of the function points of the system. When the TED schema is used as markup to the application data schema, it becomes the instructions to build the interfaces to the system.

TED has three components: a data loading system with schema-driven indexing; a schema-driven data maintenance system used by the data owner to create, update, and delete instance documents in the database; and a schema-driven web query interface (what librarians call an online catalog). This poster focuses only on the last of these, the web query interface.

The TED web query interface is a Java servlet that runs in Tomcat. It uses a Schema Object Model (SOM) as well as a Document Object Model (DOM). The TED servlet parses the schema to create the user interface in HTML with cascading style sheets. TED uses an XML DBMS, Software AG’s Tamino, for the datastore. Because the underlying datastore is XML, which is considered a preservation-quality format, the data preservation issues are resolved.

Future Research

TED currently has two very different application data models running in production and will soon have four. Despite all of the effort put into making the system “easy to customize”, creating the application data XML schema still requires a highly technical person. A toolkit needs to be developed so that data owners can create their own schemas.

WISP Competition 2005 – Technology Category

Figure panels: Milman Parry Collection; Biomedical Image Library (BIL) Collection; Milman Parry Schema with TED markup; BIL Schema with TED markup; The TED Schema.

The collection splash screen is generated from a set of files for the banner, the text, and the list of links. The data owner can update these at will.
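
The schema-driven interface generation described under The Solution can be illustrated with a minimal sketch. It is not TED's code: TED is a Java servlet that works through a Schema Object Model, whereas this sketch uses Python's standard library for brevity. The poster does not show how TED markup is attached to an application data schema, so the `urn:example:ted` namespace, the `ted:searchable` and `ted:label` attributes, and the field names below are all hypothetical. The point is only that a search form can be derived from a marked-up schema without code specific to any one collection.

```python
import xml.etree.ElementTree as ET
from html import escape

XS = "http://www.w3.org/2001/XMLSchema"
TED = "urn:example:ted"  # hypothetical namespace, not TED's real one

# Hypothetical application data schema. The ted:* attributes stand in
# for "TED markup"; their names are assumptions made for this sketch.
APPLICATION_SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ted="urn:example:ted">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"
                    ted:searchable="true" ted:label="Title"/>
        <xs:element name="creator" type="xs:string"
                    ted:searchable="true" ted:label="Creator"/>
        <xs:element name="notes" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def searchable_fields(schema_text):
    """Return (name, label) pairs for elements marked as searchable."""
    root = ET.fromstring(schema_text)
    fields = []
    for element in root.iter(f"{{{XS}}}element"):
        if element.get(f"{{{TED}}}searchable") == "true":
            name = element.get("name")
            # Fall back to the element name when no label is given.
            label = element.get(f"{{{TED}}}label", name)
            fields.append((name, label))
    return fields


def build_search_form(schema_text, action="/search"):
    """Render a plain HTML search form from the marked-up schema."""
    lines = [f'<form method="get" action="{escape(action)}">']
    for name, label in searchable_fields(schema_text):
        lines.append(
            f'  <label>{escape(label)} '
            f'<input type="text" name="{escape(name)}"></label>'
        )
    lines.append('  <input type="submit" value="Search">')
    lines.append("</form>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_search_form(APPLICATION_SCHEMA))
```

Because the form is computed from the schema, pointing the same functions at a different collection's schema would yield a different form with no code changes, which is the data independence the poster describes.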