METS at UC Berkeley: Generating METS Objects
Background
Kinds of materials:
–Primarily imaged content and TEI-encoded content
–Archival materials: manuscripts and pictorial collections
–Oral histories
Kinds of metadata:
–Structural metadata: physical structure
–Descriptive metadata
–Basic technical metadata about digital files and how they were produced
Tools for Producing METS Objects
GenDB
–Gathers structural, descriptive, and technical metadata
GenX
–Generates METS objects from GenDB
GenDB
Consists of:
–Relational database (currently SQL Server)
–Locally developed software for gathering metadata and facilitating digital processing
GenDB Database Structure: Structural Metadata
[Diagram: the divs of each object (Object 1: Div 1 to Div 3; Object 2: Div 1 to Div 4) are rows in the Structural Md table. Each object has one root div, and every other div records its parent div, e.g. (parent = div 1).]
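The parent-pointer layout of the Structural Md table can be turned back into a tree in two passes over the rows. The following is a minimal sketch, not GenDB's own code: the field names (`object_id`, `div_id`, `parent_id`, `label`) and the sample rows are hypothetical, chosen only to mirror the diagram.

```python
# Hypothetical rows of a structural metadata table: one row per div.
# parent_id is None for the root div of an object.
ROWS = [
    {"object_id": 1, "div_id": 1, "parent_id": None, "label": "Div 1"},
    {"object_id": 1, "div_id": 2, "parent_id": 1, "label": "Div 2"},
    {"object_id": 1, "div_id": 3, "parent_id": 1, "label": "Div 3"},
    {"object_id": 2, "div_id": 4, "parent_id": None, "label": "Div 1"},
    {"object_id": 2, "div_id": 5, "parent_id": 4, "label": "Div 2"},
    {"object_id": 2, "div_id": 6, "parent_id": 5, "label": "Div 3"},
]


def build_tree(rows, object_id):
    """Return the root div of one object as nested dicts."""
    # First pass: create a node for every div that belongs to the object.
    nodes = {}
    for row in rows:
        if row["object_id"] == object_id:
            nodes[row["div_id"]] = {"label": row["label"], "children": []}

    # Second pass: attach each node to its parent; the parentless one is the root.
    root = None
    for row in rows:
        if row["object_id"] != object_id:
            continue
        node = nodes[row["div_id"]]
        if row["parent_id"] is None:
            root = node
        else:
            nodes[row["parent_id"]]["children"].append(node)
    return root


def outline(node, depth=0):
    """Flatten a tree into indented label lines, for display."""
    lines = ["  " * depth + node["label"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(build_tree(ROWS, 2))))
```

Storing only a parent reference per div keeps the table flat while still allowing hierarchies of any depth, which is the same shape a METS structMap needs on export.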
GenDB Database Structure: Descriptive Metadata
[Diagram: rows of the Structural Md table (the divs of Object 1 and Object 2) shown alongside Core Desc Md, the Name table (Name 1 to Name 3), and the Note tables (Note 1 to Note 3).]
GenDB Database Structure: Content File/Technical Md
[Diagram: divs of Object 1 in the Structural Md table shown alongside the Master Image table (Mstr 1, Mstr 2) and the Derivative Image table (Drv 1 to Drv 4), each with technical md.]
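One way to picture how content files hang off the structural table is a small relational sketch. Everything below is an assumption made for illustration: GenDB runs on SQL Server, whereas this uses an in-memory SQLite database, and the table names, column names, and the choice to link both image tables directly to a div are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one structural table plus two content-file tables
# that carry basic technical metadata (pixel dimensions) per file.
conn.executescript("""
CREATE TABLE structural_md (
    div_id        INTEGER PRIMARY KEY,
    object_id     INTEGER NOT NULL,
    parent_div_id INTEGER REFERENCES structural_md(div_id),
    label         TEXT
);
CREATE TABLE master_image (
    master_id INTEGER PRIMARY KEY,
    div_id    INTEGER NOT NULL REFERENCES structural_md(div_id),
    file_name TEXT,
    width_px  INTEGER,
    height_px INTEGER
);
CREATE TABLE derivative_image (
    derivative_id INTEGER PRIMARY KEY,
    div_id        INTEGER NOT NULL REFERENCES structural_md(div_id),
    file_name     TEXT,
    width_px      INTEGER,
    height_px     INTEGER
);
""")

# Made-up sample data: one object with a root div and one child div.
conn.execute("INSERT INTO structural_md VALUES (1, 1, NULL, 'Div 1')")
conn.execute("INSERT INTO structural_md VALUES (2, 1, 1, 'Div 2')")
conn.execute("INSERT INTO master_image VALUES (1, 2, 'mstr1.tif', 3000, 2400)")
conn.executemany(
    "INSERT INTO derivative_image VALUES (?, ?, ?, ?, ?)",
    [(1, 2, "drv1.jpg", 750, 600), (2, 2, "drv2.gif", 150, 120)],
)


def files_for_div(connection, div_id):
    """List every content file attached to a div, with its technical metadata."""
    return connection.execute(
        """
        SELECT 'master' AS kind, file_name, width_px, height_px
          FROM master_image WHERE div_id = ?
        UNION ALL
        SELECT 'derivative', file_name, width_px, height_px
          FROM derivative_image WHERE div_id = ?
        ORDER BY 1, 2
        """,
        (div_id, div_id),
    ).fetchall()


if __name__ == "__main__":
    for row in files_for_div(conn, 2):
        print(row)
```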
Populating the Database Tables
Web interface: manual input of structural and descriptive metadata
Digitization management modules
–Generate work orders to guide the digitization process
–Import content file information and technical metadata coming out of the digitization process
Batch loader: batch input based on TEI encodings and legacy metadata
Web Interface: WebGenDB
[Diagram: Web Interface, Java Servlet, Java Server, SQL Server Database, and XML Config Files, with connections labeled rmi and jdbc.]
Digitization Management Modules
[Diagram: Web Interface, Java Servlet, Java Server, and SQL Server Database, with Imaging/Transcription Work Orders and Vendor Technical MD Spreadsheets.]
Batch Loader
[Diagram: TEI Docs, XSLT, XML Batch Load File, Java Batch Loader, Web Interface, Java Servlet, Java Server, and SQL Server Database.]
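The idea behind batch loading from TEI, deriving structural rows from the nesting of an encoded text, can be sketched as follows. This is an illustration under several assumptions: it parses the TEI directly instead of going through XSLT and an intermediate batch load file (whose format the slides do not show), it assumes un-namespaced `div` and `head` elements, and the sample document and the `(div_id, parent_id, label)` row shape are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up TEI fragment; real encodings are far richer.
TEI_SAMPLE = """
<TEI>
  <text>
    <body>
      <div type="interview">
        <head>Interview 1</head>
        <div type="section"><head>Early years</head></div>
        <div type="section"><head>Career</head></div>
      </div>
    </body>
  </text>
</TEI>
"""


def tei_to_rows(tei_xml):
    """Return (div_id, parent_id, label) tuples, one per TEI div, in document order."""
    root = ET.fromstring(tei_xml)
    rows = []

    def walk(element, parent_id):
        for child in element:
            if child.tag == "div":
                div_id = len(rows) + 1  # sequential ids, assigned in document order
                head = child.find("head")
                label = head.text.strip() if head is not None and head.text else ""
                rows.append((div_id, parent_id, label))
                walk(child, div_id)
            else:
                # Pass through wrapper elements such as text and body.
                walk(child, parent_id)

    walk(root, None)
    return rows


if __name__ == "__main__":
    for row in tei_to_rows(TEI_SAMPLE):
        print(row)
```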
WebGenDB
The concepts that drove the design:
–Shielding the user from METS complexity
–Highly configurable
–Unicode support
–Access driven by login privileges
–Use of open source software and components
–Distributed approach
XML Configuration Files
Three levels:
–Elements common to all projects: define fields common to all projects
–Elements common to all screens in a project: define fields used in a specific project
–Elements specific to a screen in a project: define screens by project and object type
Relation among XML files
[Diagram: AlProjects.xml at the top; Proj1.xml and Proj2.xml below it; ObjectType1.xml and ObjectType2.xml below each project file.]
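The three configuration levels can be read as a lookup chain: a screen names the fields it shows, and each field definition comes from the project level if present, otherwise from the level shared by all projects. The sketch below models this with plain dictionaries instead of the actual XML files; the keys, field names, and override rule are assumptions for illustration, not WebGenDB's real configuration format.

```python
# Hypothetical stand-ins for the three configuration levels.
ALL_PROJECTS = {"fields": {"title": {"type": "text"}}}
PROJECT = {"fields": {"image": {"type": "checkbox"}, "text": {"type": "checkbox"}}}
SCREEN = {"object_type": "ObjectType1", "fields": ["title", "image"]}


def resolve_screen(all_projects, project, screen):
    """Return the full definitions of the fields a screen displays, in screen order."""
    # Project-level definitions are assumed to override the shared ones.
    available = dict(all_projects["fields"])
    available.update(project["fields"])
    # A field named by the screen but defined at no level raises KeyError.
    return [{"name": name, **available[name]} for name in screen["fields"]]


if __name__ == "__main__":
    for field in resolve_screen(ALL_PROJECTS, PROJECT, SCREEN):
        print(field)
```

Keeping shared definitions at the top level means a new project only has to declare what differs, which fits the "highly configurable" design goal.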
Project XML file example
workorder
/data/_w/GenDB/WEB-INF/classes/edu/berkeley/library/propertyFiles/CalCultureWorkOrderScreensFile.xml
Image checkbox Image 1
Text checkbox Text 1
Title text Title 60
Software used
–MSSQL running on NT
–Tomcat implementing servlets 2.3
–Jsdk 1.4
–Xalan 2.4
–Xerces
–FOP
–JDOM beta 8
–Opta 2000
Relationship of GenDB to METS
Metadata is not stored directly in METS, MODS, or MIX schema formats.
–Much of the database structure was developed before these standards emerged
–Database structure and content have been adjusted to be compatible with all of these formats
GenX: From GenDB to METS
Allows Digital Publishing Group staff to select the objects in the GenDB database that are ready for export and to export them as METS objects.
GenX Architecture
[Diagram: App Interface, Java Application, GenDB, and METS XML Repository, with a connection labeled JDBC.]
GenX Output
METS output corresponding to version 1.3
Descriptive metadata exported to METS descMD in MODS 2.0 format
Technical metadata exported to METS techMD in MIX format
Planned:
–Text technical md to METS descMD in NYU TextMD
–Rights to METS rightsMD in ODRL subset
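To make the shape of such an export concrete, here is a minimal sketch of assembling a METS document with a descriptive section wrapping MODS, a technical section marked for MIX, and a structMap of nested divs. It is not GenX's code and is not validated against the METS schema. Several details are assumptions: the descriptive section is written as the METS `dmdSec` element (the slides' "descMD"), the MODS namespace shown is the version 3 one, used only as a stand-in for the MODS 2.0 named above, the technical section is left as an empty `mdWrap`, and the function name, IDs, and sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
MODS_NS = "http://www.loc.gov/mods/v3"  # stand-in; the slides name MODS 2.0


def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return "{%s}%s" % (namespace, tag)


def build_mets(title, root_div):
    """Serialize one object (a title plus a nested div tree) as METS XML text."""
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("mods", MODS_NS)
    mets = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata: MODS wrapped inside dmdSec/mdWrap/xmlData.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), ID="DMD1")
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), MDTYPE="MODS")
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS_NS, "mods"))
    title_info = ET.SubElement(mods, q(MODS_NS, "titleInfo"))
    ET.SubElement(title_info, q(MODS_NS, "title")).text = title

    # Technical metadata: NISOIMG is the METS MDTYPE value used for MIX.
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"))
    tech = ET.SubElement(amd, q(METS_NS, "techMD"), ID="TECH1")
    ET.SubElement(tech, q(METS_NS, "mdWrap"), MDTYPE="NISOIMG")

    # Structural metadata: nested divs mirror the parent/child rows.
    struct_map = ET.SubElement(mets, q(METS_NS, "structMap"), TYPE="physical")

    def add_div(parent, div):
        element = ET.SubElement(parent, q(METS_NS, "div"), LABEL=div["label"])
        for child in div.get("children", []):
            add_div(element, child)
        return element

    add_div(struct_map, root_div).set("DMDID", "DMD1")
    return ET.tostring(mets, encoding="unicode")


if __name__ == "__main__":
    sample = {"label": "Div 1", "children": [{"label": "Div 2"}, {"label": "Div 3"}]}
    print(build_mets("Sample object", sample))
```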
Links
GenDB Web Interface Demo –
–login: demo
–password: demo
Developers: