Information modeling and infrastructures for metadata

1 Information modeling and infrastructures for metadata

2 The book catalog
In a paper-based catalog, descriptive information is listed for items.
Entries are typically structured according to a general syntax similar to ISBD.
Association with an item is via title or distinguishing mark.
Access is achieved by scanning the list of entries.

3 ISBD
ISBD specifies nine required "areas" of description and a syntax for combining the recorded information.
Areas:
0: Content form and media type area
1: Title and statement of responsibility area, consisting of
1.1 Title proper
1.2 Parallel title
1.3 Other title information
1.4 Statement of responsibility
2: Edition area
3: Material or type of resource specific area (e.g., the scale of a map or the numbering of a periodical)
4: Publication, production, distribution, etc., area
5: Material description area (e.g., number of pages in a book or number of CDs issued as a unit)
6: Series area
7: Notes area
8: Resource identifier and terms of availability area (e.g., ISBN, ISSN)
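To illustrate what "a syntax for combining the recorded information" can look like, here is a minimal Python sketch. It is a simplification, not the full ISBD rule set: it covers only the usual punctuation inside area 1 (" = ", " : ", " / ") and the separator between areas, typed here as ". -- ". The function names and the sample values are invented for illustration.

```python
# Simplified ISBD-style display builder (illustrative; not the complete ISBD rules).

def title_area(title_proper, parallel_title=None, other_title_info=None,
               responsibility=None):
    """Combine the elements of area 1 using the prescribed punctuation."""
    text = title_proper
    if parallel_title:
        text += " = " + parallel_title
    if other_title_info:
        text += " : " + other_title_info
    if responsibility:
        text += " / " + responsibility
    return text


def isbd_description(areas):
    """Join the non-empty areas in order with the area separator.

    If an area already ends with a full stop (e.g. an abbreviation),
    the separator's own point is dropped to avoid a doubled period.
    """
    parts = [area for area in areas if area]
    if not parts:
        return ""
    out = parts[0]
    for part in parts[1:]:
        out += (" -- " if out.endswith(".") else ". -- ") + part
    return out


description = isbd_description([
    title_area("Example title", other_title_info="a subtitle",
               responsibility="by A. Author"),   # area 1
    "2nd ed.",                                    # area 2
    "",                                           # area 3 not applicable here
    "Springfield : Example Press, 2001",          # area 4
])
print(description)
```

The point of the sketch is that the punctuation itself carries the structure: a reader (or a parser) can tell which element is which from the marks that precede it.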

4 The card catalog
Descriptive information about an item is recorded on cards.
Cards are organized into separate catalogs (file drawers) of different types to enable different access points, often Subject / Title / Author.
One card per item per catalog.
Association to items via a shelf numbering scheme.
Access via alphabetical search within the catalog corresponding to the access point.
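The card catalog can be thought of as one sorted list per access point. The sketch below models that idea in Python under stated assumptions: the items, shelf numbers, and function names are all hypothetical, and the binary search stands in for flipping to the right place in an alphabetized drawer.

```python
import bisect

# Hypothetical items; "shelf" plays the role of the shelf numbering scheme.
items = [
    {"shelf": "A-101", "title": "Maps of the North", "author": "Lee, Ann",
     "subject": "Cartography"},
    {"shelf": "A-205", "title": "Cataloging Basics", "author": "Ortiz, Sam",
     "subject": "Library science"},
    {"shelf": "B-017", "title": "Reading Maps", "author": "Lee, Ann",
     "subject": "Cartography"},
]


def build_catalog(items, access_point):
    """One card per item, filed alphabetically by the chosen access point."""
    return sorted((item[access_point].lower(), item["shelf"]) for item in items)


def look_up(catalog, heading):
    """Alphabetical (binary) search for a heading; returns matching shelf numbers."""
    key = heading.lower()
    i = bisect.bisect_left(catalog, (key, ""))
    shelves = []
    while i < len(catalog) and catalog[i][0] == key:
        shelves.append(catalog[i][1])
        i += 1
    return shelves


# Three separate catalogs, one per access point.
catalogs = {ap: build_catalog(items, ap) for ap in ("subject", "title", "author")}
print(look_up(catalogs["author"], "Lee, Ann"))
```

Each item appears once in every catalog, which is why adding a new access point to a physical card catalog means filing a whole new set of cards.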

5 The special record format approach
Specify a machine-readable record format to hold descriptive information.
MARC formats: Bibliographic, Authority, Holdings, Classification, Community.

6 Record format – MARC21 Bibliographic
This is a record format specifically designed to encode bibliographic data.
It requires specialized software to load, add, or update records and to index records for catalog search.
Typically this software will be an integrated library system (ILS), which integrates cataloging and maintenance, holdings and circulation, and patron information.

7 MARC approach
One bibliographic record per manifestation (roughly).
Access via an OPAC built from indexing MARC records.
Association to items via shelf numbers, barcode identifiers, URLs.
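As a rough picture of how an OPAC index is built from MARC records, the sketch below models a bibliographic record as a plain Python dictionary keyed by MARC field tags (100 main entry personal name, 245 title statement, 650 topical subject, 856 electronic location). This is an assumption made for readability: real MARC 21 records are exchanged in a dedicated binary or XML serialization, and the sample values and function names here are invented.

```python
from collections import defaultdict

# A simplified, hypothetical record: tag -> list of fields, each a dict of subfields.
record = {
    "100": [{"a": "Author, Example"}],
    "245": [{"a": "An example title", "c": "Example Author"}],
    "650": [{"a": "Metadata"}, {"a": "Cataloging"}],
    "856": [{"u": "https://example.org/objects/1"}],  # link to the item
}

# Which tag/subfield feeds which access point in the index.
ACCESS_POINTS = {
    "author": ("100", "a"),
    "title": ("245", "a"),
    "subject": ("650", "a"),
}


def index_records(records):
    """Build one lookup table per access point: heading -> record positions."""
    index = {name: defaultdict(list) for name in ACCESS_POINTS}
    for position, rec in enumerate(records):
        for name, (tag, code) in ACCESS_POINTS.items():
            for field in rec.get(tag, []):
                if code in field:
                    index[name][field[code].lower()].append(position)
    return index


opac_index = index_records([record])
print(opac_index["subject"]["metadata"])
```

The index plays the role the separate card drawers played earlier: one record, several access points.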

8 General database approach
Create a relational database to record descriptive information about objects.
Specify a schema using a general database model and infrastructure, e.g. create a schema to hold the values of Dublin Core elements.
The association between objects and the information structures in the database will depend on the schema design.
Access is achieved via database retrieval tools.
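A minimal sketch of this approach, using Python's built-in sqlite3 module. The table layout is only one possible design and is an assumption for illustration: a handful of Dublin Core elements become columns, and the repeatable creator element gets its own table. The sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per described object; the identifier column links back to the object.
conn.execute("""
    CREATE TABLE resource (
        id INTEGER PRIMARY KEY,
        identifier TEXT NOT NULL,   -- dc:identifier, here a URL for the object
        title TEXT,                 -- dc:title
        date TEXT                   -- dc:date
    )""")

# dc:creator is repeatable, so it is stored in a separate table.
conn.execute("""
    CREATE TABLE creator (
        resource_id INTEGER REFERENCES resource(id),
        name TEXT NOT NULL
    )""")

conn.execute(
    "INSERT INTO resource (id, identifier, title, date) VALUES (?, ?, ?, ?)",
    (1, "https://example.org/objects/1", "Example map", "1921"),
)
conn.executemany(
    "INSERT INTO creator (resource_id, name) VALUES (?, ?)",
    [(1, "Surveyor, First"), (1, "Engraver, Second")],
)

# Access via ordinary database retrieval: a join across the two tables.
rows = conn.execute("""
    SELECT r.title, c.name, r.identifier
    FROM resource r JOIN creator c ON c.resource_id = r.id
    ORDER BY c.name
""").fetchall()
print(rows)
```

How objects are associated with their descriptions is a design decision here (the identifier column), which is the point the slide makes about schema design.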

9 The relational database approach, intuitively…
The relational approach represents information as a table, that is, as rows and columns of values. The rows are considered to be without order. The columns are considered ordered. Each row typically corresponds to an entity, each column to a property, and each value to a specific instance of that property. A table is a collection of rows (of equal length). A relational database is a collection of tables.

10 Note
Schema information is embedded in a relational database.
Adding attributes (i.e. columns) requires adjusting the schema.
Association to digital objects can be achieved via URIs, URLs, or internal system file locators.
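The sketch below shows, with Python's sqlite3 module, what "adjusting the schema" means in practice: the table definition is stored in the database itself, and adding a column is an explicit schema change. The table name, column names, and URL are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO item (title) VALUES ('Example map')")


def column_names(connection, table):
    """Read the column names back from the schema stored in the database."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "item")

# Adding an attribute means changing the schema, not just writing new data.
conn.execute("ALTER TABLE item ADD COLUMN locator TEXT")
after = column_names(conn, "item")

# The new column can now hold a URL or internal file locator for the object.
conn.execute(
    "UPDATE item SET locator = ? WHERE id = 1",
    ("https://example.org/objects/1",),
)
stored = conn.execute("SELECT title, locator FROM item").fetchall()
print(before, after, stored)
```

Existing rows receive NULL for the new column until they are updated, which is one reason schema changes need planning in a production catalog.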

11 Document-oriented approaches
Using a general-purpose information modeling language, specify a structure for encoding descriptive information about objects.
For example, use XML to encode Dublin Core, or use JSON-LD to serialize data, as the DPLA does.
In these approaches you have one "document" per object, but you don't necessarily have one file per object.
Access is achieved by building an index across the document collection.
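As a small illustration, the sketch below holds two Dublin Core "documents" in a single XML file and builds a simple word index across them with Python's standard library. The wrapper element names (`records`, `record`), the `id` values, and the sample titles are assumptions made for the example; only the Dublin Core namespace URI is standard.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# One file, two documents: one <record> per described object.
collection_xml = """
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record id="obj-1">
    <dc:title>Harbor at dawn</dc:title>
    <dc:subject>Harbors</dc:subject>
  </record>
  <record id="obj-2">
    <dc:title>Harbor workers</dc:title>
    <dc:subject>Labor unions</dc:subject>
  </record>
</records>
""".strip()


def build_title_index(xml_text):
    """Map each lowercased title word to the ids of the records containing it."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        title = record.findtext("dc:title", default="", namespaces=NS)
        for word in title.lower().split():
            index[word].add(record.get("id"))
    return index


title_index = build_title_index(collection_xml)
print(sorted(title_index["harbor"]))
```

Unlike the relational case, nothing here had to be declared in advance: the index is derived from whatever elements the documents happen to contain.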

12 Note
For online resources (especially text), document-oriented metadata can be embedded in the object.
XML is a "meta-grammar" for defining schemas.
All well-formed XML documents have to meet certain requirements.
A schema is not required to process XML data, but it helps.
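The difference between well-formedness and schema validity can be shown with a short Python sketch. A standard XML parser checks only well-formedness (matched tags, a single root element, and so on); it does not need a schema to read the data. The helper name and the sample strings are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML; no schema is consulted."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = "<record><title>Example</title></record>"
mismatched = "<record><title>Example</record>"    # <title> is never closed
two_roots = "<record/><record/>"                   # more than one root element

print(is_well_formed(good), is_well_formed(mismatched), is_well_formed(two_roots))

# Without a schema the data can still be processed, but the program has to
# discover the element names for itself:
tags = [child.tag for child in ET.fromstring(good)]
print(tags)
```

A schema adds what the parser cannot know on its own: which elements are expected, which are required, and what their values should look like.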

13 The digital repository approach
Digital repository software (e.g. Fedora, DSpace) integrates digital asset management and description.
Uses a built-in data model to associate digital objects with descriptive information.
Provides standard functions to disseminate or export metadata, typically as OAI-PMH data.

14 Metadata Transport and Aggregation
The purpose is to maintain the integrity of a record across a change in technical environments. The object of attention is a metadata record (or a set of them). A system packages up metadata and sends it off to be received by another system, often using a standard "wrapper" that can be generated in response to a server request and operates on top of a transmission protocol.

15 OAI-PMH example
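The example shown on this slide is not preserved in the transcript. As a stand-in, the Python sketch below shows the general shape of an OAI-PMH exchange: a harvester sends an HTTP request with a `verb` parameter, and the repository answers with records wrapped in an OAI-PMH XML envelope. The base URL, record identifier, and title are hypothetical, and the response is abridged (a real response also carries elements such as the response date and an echo of the request).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for a repository's OAI-PMH base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Abridged, invented response: the "wrapper" around one Dublin Core record.
sample_response = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2001-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def harvest(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    results = []
    for rec in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        title = rec.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, title))
    return results


url = list_records_url("https://repository.example.org/oai")
harvested = harvest(sample_response)
print(url)
print(harvested)
```

The envelope (header, datestamp, identifier) belongs to the transport layer; the Dublin Core record inside it is the payload being moved between systems.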

16 Metadata Aggregation and Mapping

17 The IPL – LII Merger
Issues discussed by Khoo and Hall:
Unique or specialized metadata fields
Lack of item-level metadata
Collections stored in different databases
Lack of controlled subject vocabularies
Lack of hierarchy in browse structures
Incommensurate subject categories
Complex metadata workflow
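Several of these issues surface as soon as one tries to map records from two collections into a shared schema. The Python sketch below is a generic field crosswalk, not a description of how the IPL and LII collections were actually merged; the source names, field names, and sample record are all hypothetical.

```python
# Hypothetical crosswalks: source field name -> field name in the shared schema.
CROSSWALKS = {
    "collection_a": {"name": "title", "topic": "subject", "link": "identifier"},
    "collection_b": {"title": "title", "category": "subject", "url": "identifier"},
}


def map_record(source, record):
    """Map a source record into the shared schema.

    Returns (mapped, unmapped): fields with no crosswalk entry are kept
    aside rather than silently dropped, so they can be reviewed.
    """
    crosswalk = CROSSWALKS[source]
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target:
            mapped[target] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


mapped, unmapped = map_record(
    "collection_a",
    {
        "name": "Example resource",
        "topic": "Reference",
        "link": "https://example.org/r/1",
        "reviewer_initials": "XY",   # a specialized field with no counterpart
    },
)
print(mapped)
print(unmapped)
```

Unique or specialized fields show up in `unmapped`; incommensurate subject categories are harder, because the field maps cleanly while the values do not.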

18 Discussion
With a partner, choose one of the challenges discussed by Khoo and Hall.
What are the particular consequences of the issue in terms of aggregating metadata?
What kinds of processes might mitigate this challenge?
Khoo and Hall generally point to issues of "institutional knowledge", but how else could these kinds of issues be avoided?

19 Large-scale digital aggregations
Aggregate descriptions of objects, including some “view” of the objects.
Objects and descriptions are maintained by the providing institutions.
Cultural heritage aggregations: Europeana, Digital Public Library of America (DPLA).
Promise: increased exposure of institutions and objects.
Wide audience: scholars, teachers, general public.

20 Metadata Aggregation
Already existing metadata: harvested and aligned; available through a portal with links to the providing institution.
DPLA hub model: content hubs provide data directly; service hubs work with smaller institutions and coordinate digitization and aggregation into DPLA.

21 DPLA core data model

22 Item-centric model, Portal, Platform
Item-centric model: items as the focus of description and access; collection inclusion optional; collection description limited.
Portal: unified access to millions of items; humanities, arts, and sciences; faceted search and visualization.
Platform: JSON-LD; RESTful API.
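To make the "Platform" idea concrete, the Python sketch below builds an item search request and reads one item from a JSON response. It makes no network call. The endpoint, the `q`, `page_size`, and `api_key` parameters, and the `docs` / `sourceResource` / `isShownAt` field names reflect my understanding of the public DPLA API and should be treated as assumptions to verify against the current documentation; the sample response is invented and heavily abridged.

```python
import json
from urllib.parse import urlencode

API_BASE = "https://api.dp.la/v2/items"  # assumed endpoint; verify before use


def search_url(query, api_key, page_size=10):
    """Build an item search URL (parameter names are assumptions)."""
    params = {"q": query, "page_size": page_size, "api_key": api_key}
    return API_BASE + "?" + urlencode(params)


# Invented, abridged response in the assumed shape.
sample_response = json.loads("""
{
  "count": 1,
  "docs": [
    {
      "id": "example-item-id",
      "isShownAt": "https://provider.example.org/items/1",
      "sourceResource": {
        "title": ["Example photograph"],
        "date": {"displayDate": "1921"}
      }
    }
  ]
}
""")


def summarize(doc):
    """Pull the item-level description and the link back to the provider."""
    source = doc.get("sourceResource", {})
    title = source.get("title")
    if isinstance(title, list):          # titles may be a single value or a list
        title = title[0] if title else None
    return {
        "id": doc.get("id"),
        "title": title,
        "provider_page": doc.get("isShownAt"),
    }


summaries = [summarize(doc) for doc in sample_response["docs"]]
print(search_url("maps", "YOUR_API_KEY"))
print(summaries)
```

The item is the unit of description, and the link back to the providing institution travels with it; that is the item-centric model in miniature.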

23 Dates in DPLA
Date formats vary significantly.

24 Date Propagation
DPLA date parsing is unreliable but consistent.
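One way to see how a parser can be both "unreliable" and "consistent" is a rule-based normalizer that tries a fixed list of formats in order. The Python sketch below is a generic illustration and not DPLA's actual parsing code; the format list and sample strings are assumptions.

```python
from datetime import datetime

# (input pattern, output pattern), tried in this order every time.
FORMATS = [
    ("%Y-%m-%d", "%Y-%m-%d"),
    ("%m/%d/%Y", "%Y-%m-%d"),   # always read as month/day, even if the source meant day/month
    ("%B %d, %Y", "%Y-%m-%d"),
    ("%Y", "%Y"),
]


def normalize_date(raw):
    """Return an ISO-style date string, or None if no known format matches."""
    text = raw.strip()
    for in_fmt, out_fmt in FORMATS:
        try:
            return datetime.strptime(text, in_fmt).strftime(out_fmt)
        except ValueError:
            continue
    return None


samples = ["1921-03-04", "03/04/1921", "March 4, 1921", "1921", "circa 1920"]
normalized = {raw: normalize_date(raw) for raw in samples}
print(normalized)
```

The same input always produces the same output (consistent), but "03/04/1921" is resolved by rule rather than by knowledge of the source, and "circa 1920" is lost entirely (unreliable). Errors of this kind propagate into every system that reuses the normalized value.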

25 Subject - Phrases

26 Commonalities and Differences

27 Thresholded Boundaries
Variants: Ojibwe-Ojibway, GLBT-LGBT.
Hierarchies: Labor Unions, Minnesota, Minneapolis, Newspapers, Organizing.
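The slide does not say how variant terms were identified. Purely as an illustration of what a thresholded boundary between "same term" and "different term" could look like, the Python sketch below flags pairs of subject strings whose similarity ratio meets a cutoff. The use of `difflib`, the 0.7 threshold, and the function names are assumptions, not the method behind the slide.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def variant_pairs(terms, threshold=0.7):
    """Return pairs of terms whose similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(terms, 2)
        if similarity(a, b) >= threshold
    ]


terms = ["Ojibwe", "Ojibway", "GLBT", "LGBT", "Minnesota", "Minneapolis"]
pairs = variant_pairs(terms)
print(pairs)
```

With this cutoff the spelling variants are paired while "Minnesota" and "Minneapolis" are not, which is the desired behavior here: those two are related hierarchically, not as variants of one term. Moving the threshold moves the boundary, so any such cutoff needs checking against real subject data.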

28 Automatic? Descriptions

29 Automatic? Descriptions
