Data Wrangling: Developing Local Best Practice for Born Digital Metadata Tracy Popp, Digital Preservation Coordinator Ayla Stein, Metadata Librarian University.


1 Data Wrangling: Developing Local Best Practice for Born Digital Metadata
Tracy Popp, Digital Preservation Coordinator
Ayla Stein, Metadata Librarian
University Library
University of Illinois Urbana-Champaign

2 Intro
What will be addressed:
  –Institutional context
  –Project needs
  –Challenges
  –Current progress
  –Future work

3 Institutional Context
University Library
  –Campus-wide network of libraries
  –Largest public university research library in U.S.
    13 million volumes
    24 million items and materials
    Over 12 million digital files
Main Library building, East Entrance
http://www.library.illinois.edu/bis/images/uiucmainlib.jpg

4 Institutional Context
Collaborative effort:
  –Content Access Management (Cataloging and Metadata)
    Ayla – Metadata Librarian
  –Preservation Unit
    Tracy – Born Digital Content Reformatting
  –Special Collections
    University Archives
    RBML, Sousa, etc.
  –Back to Preservation
    Kyle Rimkus – Preservation Librarian
  –Digital Content Long-term Preservation (Medusa)

5 Project Needs
Ayla (Metadata) and Tracy (Born Digital Content Reformatting)
Identify
  –Metadata currently captured
Make
  –Schema Recommendations
    Technical
    Administrative
    Descriptive
  –Controlled Vocabulary

6 Overview of Challenges
Behemoth spreadsheet
Various reports not in a schema
No controlled vocabulary
Redundant data entry
Ideally aligns with Medusa data

7 Born Digital Reformatting
Behemoth spreadsheet
  –Project tracking and data entry
Reports
  –Structured but not to a schema
    From FTK Imager:
      »Directory list of media structure (created at time of disk imaging); item level information
      »Hash list of exported files
    From TreeSize Pro:
      »Media group level reports
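
A hash list that is "structured but not to a schema" can still be read into uniform records before any schema mapping happens. The sketch below is only an illustration: the column names (`MD5`, `SHA1`, `FileName`), the sample rows, and the helper names `parse_hash_list` and `find_duplicates` are assumptions, not the actual FTK Imager export layout or the authors' workflow.

```python
import csv
import io


def parse_hash_list(text):
    """Parse a CSV hash list into dicts keyed by lowercase column name."""
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        # Normalize header case and strip stray whitespace from every value.
        records.append({key.strip().lower(): value.strip() for key, value in row.items()})
    return records


def find_duplicates(records, hash_field="md5"):
    """Group file names by hash and keep only hashes seen more than once."""
    by_hash = {}
    for record in records:
        by_hash.setdefault(record[hash_field].lower(), []).append(record["filename"])
    return {digest: names for digest, names in by_hash.items() if len(names) > 1}


# Hypothetical sample; both rows carry the hashes of an empty file.
SAMPLE = """MD5,SHA1,FileName
d41d8cd98f00b204e9800998ecf8427e,da39a3ee5e6b4b0d3255bfef95601890afd80709,disk01/empty.txt
D41D8CD98F00B204E9800998ECF8427E,da39a3ee5e6b4b0d3255bfef95601890afd80709,disk01/copy_of_empty.txt
"""

records = parse_hash_list(SAMPLE)
duplicates = find_duplicates(records)
```

Once every report is reduced to records like these, the same rows could feed both a tracking tool and a schema export, which is one way to cut down on redundant data entry.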

8 Challenges – Schema
No one schema appropriate
  –Many layers of transformation
  –Varying types of metadata
[Diagram: three stages]
  Born Digital Reformatting: Recover from obsolete media
  Collecting Unit: Arrangement, Description, Access
  Digital Preservation Repository: Medusa: Long term Preservation

9 Challenges – Controlled Vocabulary
Reformatting request form is paper
  –Project tracking system in works
No Controlled Vocabulary
Reviewed: MANY
Chose:
  –PBCore instantiationMediaType
  –PBCore instantiationPhysical
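
To show what a controlled vocabulary buys at data entry time, here is a minimal sketch that maps free-text entries from a request form to one preferred term and flags anything unrecognized for review. The mapping `PHYSICAL_FORMAT_TERMS`, its terms, and the function `normalize_term` are hypothetical placeholders; they are not the published PBCore instantiationPhysical value list.

```python
# Hypothetical local mapping from free-text entries to one preferred term.
# These terms are placeholders, not the published PBCore value list.
PHYSICAL_FORMAT_TERMS = {
    "cd-r": "CD-R",
    "cdr": "CD-R",
    "zip disk": "Zip disk",
    "zip": "Zip disk",
    "3.5 inch floppy": "3.5-inch floppy disk",
    "3.5 floppy": "3.5-inch floppy disk",
}


def normalize_term(entry, mapping=PHYSICAL_FORMAT_TERMS):
    """Return the preferred term for an entry, or None if it needs human review."""
    # Lowercase and collapse runs of whitespace so spelling variants share a key.
    key = " ".join(entry.lower().split())
    return mapping.get(key)


entries = ["  CD-R ", "Zip   Disk", "mystery cartridge"]
normalized = [normalize_term(entry) for entry in entries]
needs_review = [entry for entry, term in zip(entries, normalized) if term is None]
```

Returning `None` instead of guessing keeps unknown media types visible, so the vocabulary can be extended deliberately.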

10 Schema Choices
METS, MODS, and PREMIS
Why?
  –MODS and PREMIS align with Medusa terms

11 Schema Choices
PREMIS
  –Record technical info of item pre-reformatting
  –Encode actions and digital forensics reports as ‘events’
  –Can have full provenance of a digital object in a cohesive piece
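
As an illustration of encoding a reformatting action as a PREMIS event, the sketch below builds one event element with Python's standard library. It assumes the PREMIS 3 namespace; the identifier, event type string, timestamp, and detail text are invented placeholders, and the output is not validated against the PREMIS schema or taken from the authors' records.

```python
import xml.etree.ElementTree as ET

# Assumption: PREMIS 3 namespace. A local profile may target another version.
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def build_event(event_id, event_type, date_time, detail):
    """Build a minimal PREMIS event element (not schema-validated)."""
    event = ET.Element(q("event"))
    identifier = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(identifier, q("eventIdentifierType")).text = "local"
    ET.SubElement(identifier, q("eventIdentifierValue")).text = event_id
    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = date_time
    info = ET.SubElement(event, q("eventDetailInformation"))
    ET.SubElement(info, q("eventDetail")).text = detail
    return event


# Placeholder values for illustration only.
event = build_event(
    "evt-0001",
    "disk imaging",
    "2014-01-15T10:30:00",
    "Disk image created; directory list and hash list exported.",
)
xml_text = ET.tostring(event, encoding="unicode")
```

A sequence of such events, one per action or forensic report, is what lets the provenance of an object travel as a single record.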

12 Schema Choices
The Catch:
  –Medusa supports limited metadata
    Collection & file group level
    Event info does not pre-date ingest into repository
  –Metadata file as content
    METS wraps up MODS & PREMIS info
    Deposit METS record with content
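
The "METS wraps up MODS & PREMIS" idea can be sketched as a small wrapper document that travels with the content. This is an assumption-laden outline, not the Medusa deposit format: the section IDs, the title, the event type, and the helpers `wrap` and `build_mets` are hypothetical, the PREMIS event is deliberately incomplete, and nothing here is schema-validated.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
    "premis": "http://www.loc.gov/premis/v3",  # assumption: PREMIS 3
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Qualify a tag name with the namespace registered for prefix."""
    return f"{{{NS[prefix]}}}{tag}"


def wrap(parent, section_tag, section_id, mdtype, payload):
    """Put payload inside <section>/<mdWrap>/<xmlData> under parent."""
    section = ET.SubElement(parent, q("mets", section_tag), ID=section_id)
    md_wrap = ET.SubElement(section, q("mets", "mdWrap"), MDTYPE=mdtype)
    ET.SubElement(md_wrap, q("mets", "xmlData")).append(payload)
    return section


def build_mets(title, event_type):
    """Build a skeletal METS record carrying one MODS and one PREMIS payload."""
    mets = ET.Element(q("mets", "mets"))

    # Descriptive metadata: a MODS record with only a title.
    mods = ET.Element(q("mods", "mods"))
    title_info = ET.SubElement(mods, q("mods", "titleInfo"))
    ET.SubElement(title_info, q("mods", "title")).text = title
    wrap(mets, "dmdSec", "DMD1", "MODS", mods)

    # Provenance metadata: a stub PREMIS event (identifier and date omitted).
    amd = ET.SubElement(mets, q("mets", "amdSec"))
    event = ET.Element(q("premis", "event"))
    ET.SubElement(event, q("premis", "eventType")).text = event_type
    wrap(amd, "digiprovMD", "PROV1", "PREMIS:EVENT", event)

    # METS requires a structMap; a single div links the two sections.
    struct_map = ET.SubElement(mets, q("mets", "structMap"))
    ET.SubElement(struct_map, q("mets", "div"), DMDID="DMD1", ADMID="PROV1")
    return mets


mets = build_mets("Sample disk", "disk imaging")
xml_text = ET.tostring(mets, encoding="unicode")
```

Because the record is deposited as an ordinary file, event history from before ingest survives even when the repository itself only tracks events after ingest.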

13 Good Practice
Interoperability
Various levels that will assist in the digital preservation life cycle

14 Summary: Work In Progress
Schema Choice:
  –METS, MODS, and PREMIS
Controlled Vocabulary Choices:
  –Data Type: PBCore instantiationMediaType
  –Media Type: PBCore instantiationPhysical

15 Future Work
Creating centralized, web-based tracking tool
  –Allow curating units to add descriptive information
  –Avoid data duplication
Import metadata and reports
  –Structured in schema
More controlled vocabulary
  –Rights

16 Thank You!
Tracy Popp
Digital Preservation Coordinator
tpopp2@Illinois.edu
Ayla Stein
Metadata Librarian
astein@Illinois.edu
@TheStacksCat

