Georgia Institute of Technology Grace Agnew 1/15/2000 SCALABLE DURABLE METADATA: **A Tutorial**


1 Georgia Institute of Technology Grace Agnew 1/15/2000 SCALABLE DURABLE METADATA: **A Tutorial**

2 BUILDING A METADATA REPOSITORY
- Model
- Record Structure
- Repository Design
- Data Element Registration
- Database Population
- Dissemination to Users
- Data interchange (other repositories)

3 MODEL: Functional Component Model for an OAIS
[Figure: OAIS functional components. Labels: Producer, SIP, Ingest, Archival Storage, Data Management, DI, AIP, Access and Dissemination, Consumer, Requests, Other info, DIP]
CCSDS 650.0-R-1: Reference Model for an Open Archival Information System (OAIS). Red Book. Issue 1. May 1999. PDF. Available at: http://ssdoo.gsfc.nasa.gov/nost/isoas/overview.html

4 Repository Design:
- Scalable: Flexible, scalable metadata object and repository structure can serve an expanding domain.
- Standardized: Metadata object design and repository structure are shareable by other repositories within the domain and, optimally, by other domains.
- Unambiguous: Repository structure and metadata object design can be consistently interpreted and utilized by human and machine users.
- Effective: Data is well managed for persistence over space and time and is readily accessible to users at point of need.
- Integrated: Data repository integrates well with other data sources in the user information environment.

5 Metadata Repository: Structural Elements
Database Management System:
- File-based (hierarchies, drag-and-drop Microsoft model)
- Relational (Oracle, MySQL, MS Access)
- Object-oriented (CORBA)
Data Management:
- Import and export (direct input; file transfer; batch import and export)
- Data validation, deletion, modification, migration mechanisms
- Scalable, accessible data storage
Resource: Moore, Reagan, et al. Configuring and Tuning Archival Storage Systems. http://www.sdsc.edu/NARA/Publications/OTHER/HPSS-tuning/HPSS-tun.v3.html

6 Security:
- Access Control: Levels of authorization for management, input, search and retrieval, display and download
- Data Integrity
Search and Retrieval: XQL (XML Query Language)
XML Query Language (XQL) is a notation for addressing and filtering the elements and text of XML documents. XQL is a natural extension to the XSL pattern syntax.
From: Robie, Jonathan, Joe Lapp and David Schach. XML Query Language: XQL. http://www.cuesoft.com/xqlspec.htm
Display and Download:
- Options: XML documents (XSL stylesheets); HTML documents; fielded flat files; ASCII files; relational database; SAS file format, etc.
- Closely associated with access control
Tool: Zope: open source web application server (integration with MySQL; object-oriented databases). http://www.digicool.com
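The "addressing and filtering" idea behind XQL can be illustrated with the limited XPath support in Python's standard library. This is only a stand-in sketch: it does not use XQL syntax, and the sample records and element names (`record`, `title`, `creator`, `format`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata records; element names are illustrative only.
RECORDS = """
<records>
  <record>
    <title>Metadata Overview</title>
    <creator>Grace Agnew</creator>
    <format>text/html</format>
  </record>
  <record>
    <title>Sample Video</title>
    <creator>Unknown</creator>
    <format>video/mpeg</format>
  </record>
</records>
"""

root = ET.fromstring(RECORDS)

# Addressing: select the <title> child of every <record>.
titles = [t.text for t in root.findall("./record/title")]

# Filtering: keep only records whose <format> child has this exact text.
video_records = root.findall("./record[format='video/mpeg']")
video_titles = [r.findtext("title") for r in video_records]

print(titles)
print(video_titles)
```

The same path expression serves both search (which records match) and display (which fields to pull out), which is why the slide groups query with display and download.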

7 Data Exchange between Repositories: Z39.50
ANSI/NISO Z39.50-1995 (ISO 23950): Client/server computer-to-computer communications protocol that specifies query and retrieval of information (bibliographic data, full-text documents, images, and multimedia) in a distributed network environment, across disparate computer systems, databases and search engines.
Current version: 3 http://lcweb.loc.gov/z3950/agency/document.html
Profiles:
- Profile for Access to Online Thesauri: http://lcweb.loc.gov/z3950/agency/profiles/zthes-03.html
- Profile for Access to Digital Library Collections: http://lcweb.loc.gov/z3950/agency/profiles/collections.html
- CIMI Profile for Museum Collections: http://lcweb.loc.gov/z3950/agency/profiles/cimi2.html
- The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery: http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/draft/BathProfileRevisedPublicDraft10Jan2000.htm
  Conformance to this profile's specifications will improve international or extranational search and retrieval among library catalogues, union catalogues, and other electronic resource discovery services worldwide.

8 Data Exchange between Repositories: Z39.50
Variations:
- Z-SQL (SQL query language and generic record export in Z39.50) http://www.dstc.edu.au/Research/Projects/Z+SQL/
- ZORBA (CORBA object retrieval in Z39.50). Ward, Nigel, Michael Lawley and Sonya Finnegan. ZORBA: Information Retrieval Using Distributed Object Technologies. http://www.dstc.edu.au/Research/Resource_Discovery/publications/zorba_eogeo98/
Tools:
- LeVan, Ralph. Building a Z39.50 Client. OCLC Online Computer Library Center. (pdf file)
- Kunze, John A. Basic Z39.50 Server Concepts and Creation. University of California at Berkeley. (pdf file)
- List of commercial and shareware systems: http://www.cni.org/pub/NISO/docs/Z39.50-brochure/50.brochure.part09.html

9 Data Exchange between Repositories:
XMI: Open information interchange model for exchange of models and data over the Internet in a standardized manner.
Tool: XMI Toolkit. Available from IBM Alpha Works. 90-day cost-free testing period. http://www.alphaworks.ibm.com
Common Warehouse Metadata Interchange: Request for Proposal issued. Submissions due 9-17-1999. OMG Document ad/98-09-02. Available from http://www.omg.org
Objectives:
- Establish an industry standard specification for common warehouse metadata interchange
- Provide a generic mechanism that can be used to transfer warehouse metadata
- Leverage existing vendor-neutral interchange mechanisms

10 DATA EXCHANGE BETWEEN REPOSITORIES
BXXP Protocol: Interesting New Development!
- Multiplexes several generic application channels carrying XML (or other MIME-type data) on a single socket connection.
- Provides for segmented data, windowed flow control, user authentication, profile negotiation and secure transport.
References:
- Block Architectural Precepts. Marshall Rose and Carl Malamud. http://www.ietf.org/internet-drafts/draft-mrose-blocks-architecture-00.txt
- Blocks Simple Exchange Profile (M. Rose). http://www.ietf.org/internet-drafts/draft-mrose-blocks-exchange-00.txt
- Blocks eXtensible eXchange Protocol (M. Rose). http://www.ietf.org/internet-drafts/draft-mrose-blocks-protocol-00.txt

11 Models:
Standard Syntax: UML - Unified Modeling Language
Tool: Rational Rose http://www.rational.com/products/rose/index.jtmpl
Metamodel: Provides the conceptual schema for a repository. Uses standard object modeling concepts:
- Classes/Entities
- Relationships/Associations
- Attributes
Conceptual Data Model: How data is structured in the real world
Logical Data Model: How data is structured and processed by a computer system
Modeling Tool: http://www.isis.vanderbilt.edu/projects/gme/meta/default.html

12 Users
Primary Constraints:
- Domain (e.g. astrophysicists)
- Organization (e.g. at the University of X)
- Application needs (e.g. research, teaching)
Relationships:
- Related domains
- Other information sources within the information universe. Example: AGRIS and CARIS metadata records (FAO) are in MARC format
- Other user groups
Developing a metadata system depends on understanding the domain, the users, and how users interact with the domain. This conceptual framework is then mapped into a model for the repository.

13 MODEL: X3.285 - METAMODEL FOR THE MANAGEMENT OF SHAREABLE DATA
http://pueblo.lbl.gov/~olken/X3L8/drafts/Metamodel/MetaModel_ToC.html
Data Registry: A place to keep characteristics of data that are necessary to clearly describe, inventory, analyze and classify data.
- A data registry supports data sharing with cross-system and cross-organization descriptions of common units of data.
- A data registry allows users of shared data to have a common understanding of a unit of data's meaning, representation and identification.
X3.285 specifies the schema of a registry where descriptions of shareable data are stored. It defines relationships and constraints between components of the model:
- Data element (indivisible atomic unit of data)
- Data composite (collection of data elements treated as a unit)
- Property (distinguishes one object from another)
- Object class (set of concepts, abstractions or things in the universe that are bound or classed together)
- Representation (expression of the data element through permissible values, datatype and, as applicable, a unit of quantity)
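The component list above can be read as a small object model. The following is a minimal sketch of that reading, not the X3.285 schema itself: the class layout, field names and the sample element are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Representation:
    """How values are expressed: datatype, permissible values, optional unit."""
    datatype: str
    permissible_values: Tuple[str, ...] = ()  # empty tuple means "not enumerated"
    unit: Optional[str] = None


@dataclass(frozen=True)
class DataElement:
    """Atomic unit of data: an object class, one property, one representation."""
    object_class: str
    property_name: str
    representation: Representation

    @property
    def name(self) -> str:
        return f"{self.object_class} {self.property_name}"

    def permits(self, value: str) -> bool:
        # A non-enumerated domain accepts any value in this simplified sketch.
        allowed = self.representation.permissible_values
        return not allowed or value in allowed


@dataclass(frozen=True)
class DataComposite:
    """Collection of data elements treated as a unit."""
    name: str
    elements: Tuple[DataElement, ...]


# Hypothetical example element.
country_code = DataElement(
    object_class="Person",
    property_name="Country Code",
    representation=Representation("string", ("AFG", "BEL", "CHN")),
)
print(country_code.name, country_code.permits("BEL"))
```

Keeping the representation separate from the object class and property is what lets the same concept be re-registered later with a different set of permissible values.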

14 Domain Model: National Health Information Knowledgebase (Australia)
From: Australian Institute of Health and Welfare (AIHW): http://www.aihw.gov.au/services/health/nhik.html

15 Information Model Subtypes
Person characteristic:
- Accommodation characteristic
- Demographic characteristic
- Education characteristic
- Insurance / benefit characteristic
- Labour characteristic
- Legal characteristic
- Lifestyle characteristic
- Parenting characteristic
- Social characteristic
- Cultural characteristic
- Other person characteristic
- Physical characteristic
Accommodation characteristic: The living arrangements of a PERSON. For example, the type of dwelling, age of dwelling, number of bedrooms, modification of dwelling to account for restricted movement etc. In the National Health Information Model, ACCOMMODATION / HOUSING CHARACTERISTIC may relate to where a PERSON usually resides or it may be of interest at an instance in time - for example while a PERSON is in receipt of care.

16 Metadata: Definition and Rationale
Data or information which help us perform one or more of the following functions with respect to data and information resources:
- Finding
- Interpreting/evaluating
- Accessing
- Analyzing
- Managing
- Preserving
Boyko, Ernie. Statistical Metadata: A User Perspective. Open Forum on Metadata Registries. January 20, 2000.

17 Two Metadata Development Trends
Development of a metadata schema to precisely and unambiguously describe an information object, generally in a one-to-one metadata record to information object relationship.
- Intrinsic: Incorporated within the information object
- Extrinsic: Located in a separate metadatabase with a fielded link to the information object
Objectives for Standardized Metadata Schema:
- Provide description and management data at the information resource level.
- Provide standardized metadata record formats to facilitate data storage, indexing, querying, retrieval, exchange and display.

18 Major domain-neutral Standards:
- MARC (Machine-Readable Cataloging): primarily for library-based information resources. http://lcweb.loc.gov/marc
- Dublin Core: 15 standard elements with approved qualifiers. Primarily for web-based resources. http://purl.org/dc/
Drenth, B.D., et al. 1999. Guide to Best Practice: Dublin Core. Consortium for the Computer Interchange of Museum Information. http://www.cimi.org/documents/meta_bestprac_final_ann.html

19 RDF - Resource Description Framework.
Framing system, providing transparent transport for the metadata schemas defined and utilized within its wrapper. XML schema that is both human and machine interpretable. http://www.w3.org/TR/PR-rdf-syntax/
Key Concepts:
- Resource: Any object uniquely identifiable by a URI (uniform resource identifier)
- Property-type: Property associated with a resource
- Value: Associated with a property type. May be atomic (a string) or another resource, creating a new hierarchy

20 RDF Property types express the relationships of values associated with resources.
Famous Example: The Author of Metadata Overview is Grace Agnew
- Resource: Metadata Overview (http://www…….edu/meta)
- Property Type: Author
- Value: Grace Agnew
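The resource / property type / value statement can be sketched as a plain three-field record. The sketch below also shows a value that is itself a resource, which is how a hierarchy forms. Both URIs are placeholders (the slide's URL is elided), and the `Title` and `Name` property types are assumptions added for illustration.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str
    property_type: str
    value: str


# Placeholder URIs: the original slide elides the real address.
DOC = "http://example.org/meta"
AUTHOR = "http://example.org/people/agnew"  # hypothetical resource for the author

statements = [
    Statement(DOC, "Title", "Metadata Overview"),
    Statement(DOC, "Author", AUTHOR),       # value is another resource
    Statement(AUTHOR, "Name", "Grace Agnew"),  # value is an atomic string
]


def describe(resource, statements):
    """Collect a resource's properties, following values that are resources.

    Assumes the statements contain no cycles.
    """
    known_resources = {s.resource for s in statements}
    description = {}
    for s in statements:
        if s.resource == resource:
            if s.value in known_resources:
                description[s.property_type] = describe(s.value, statements)
            else:
                description[s.property_type] = s.value
    return description


print(describe(DOC, statements))
```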

21 Tools:
CORC: OCLC Cooperative Online Resource Catalog Project. http://purl.oclc.org/corc
- Information entered in a template is cross-cataloged in MARC, Dublin Core and RDF/Dublin Core.
- Membership is open to libraries of any description at no charge through July 1, 2000.
- Currently available for search and display to non-members. Use for MARC, Dublin Core and DC/RDF examples.
DC.dot: Generates records in Dublin Core and RDF/Dublin Core. http://www.ukoln.ac.uk/metadata/dcdot/

22 Tools (cont'd):
- Reggie: http://metadata.net Generates records in Dublin Core and RDF/Dublin Core. Provides a template for establishing a metadata registry.
- MetaWeb: Provides software for establishing a gateway to search distributed Dublin Core. http://www.dstc.edu.au/RDU/MetaWeb/broker/search.html
- Crosswalks between Formats: http://www.ukoln.ac.uk/metadata/interoperability/

23 Data Element Registration
ISO/IEC 11179 - Specification and Standardization of Data Elements
Establishes concise, unambiguous definitions and context for atomic data elements, as well as the structure and format for the values that represent the data element, for sharing data, primarily in large datasets or technical reports.

24 ISO 11179 Six Parts:
- 11179-1 Framework for the Specification and Standardization of Data Elements
- 11179-2 Classification for Data Elements
- 11179-3 Basic Attributes of Data Elements
- 11179-4 Rules and Guidelines for the Formulation of Data Definitions
- 11179-5 Naming and Identification Principles for Data Elements
- 11179-6 Registration of Data Elements

25 Metadata Registry - ISO/IEC 11179
Data elements within the described dataset are registered in an ISO 11179-compliant registry to:
- standardize representation of the data element to enable shareability and durability (reuse) of data
- establish context and meaning for intelligent retrieval and interpretation of data
A data element is the equivalent of an attribute in a data or object model: the representation of a single property of a class of objects in the natural world.
Draft Standards: http://pueblo.lbl.gov/~olken/X3L8/drafts/draft.docs.html

26 Class: Employee
Class attributes: Name, Identification Number, Address
Data elements: Employee Name, Employee ID No., Employee Address
From Framework for the Specification and Standardization of Data Elements (draft) p. B-3
Formal Definition: A unit of data for which the definition, identification, representation and permissible values are specified by means of a set of attributes.

27 Principles in the Application of 11179
- Each data element receives a unique, unintelligent number to create a reusable, international data element.
- Data elements are derived from understanding the domain data content and breaking it into meaningful atomic elements.
- Metadata registries consist of the data element and its attributes, which provide definition, meaning and precision of application.
- Metadata registries are populated in two ways:
  - Bottom up: Begin with the data element and its attributes
  - Top down: Develop a classification hierarchy and populate with data elements

28 Major Data Element Attributes - My Effort at Interpretation of the Standard
Name: SMPTE_Time_and_Control_Code
Definition: Time and control code for tracking playback of film, audio, and video established by the Society of Motion Picture and Television Engineers
Permissible value: example or formulation principle
  Value: hh:mm:ss;s
Value Domain: Set of permissible values (enumerated or unenumerated)
  Value: SMPTE 12M-1995
Type name: Determinant
Data Type: Numeric
Format: hh:mm:ss;s where hh = hour (00-24), mm = minute (00-59), ss = second (00-59) and s = scene (0-N)
Maximum character: unknown
Minimum character: unknown

29 Data Element ID: Unintelligent number identifying a reusable data element. May include version number. Will be combined with the registration authority number (and version, if not already included) for a composite ID number.
  Value: 54367
Version: Used to identify modification to the data element.
  Value: 1
Context: Designation or description of the application environment in which the name is applied or from which it originates.
  General: Time and control code used for tracking and editing audio, video and film media
  Registry: Dublin Core Coverage metadata elements for audio, video and film
  DTDs: DC:Coverage.t.min DC:Coverage.t.max

30 Example: 19:31:57;1 19:32:07;7
Data Element Concept: A concept represented by a data element, independent of any particular representation. Shared perception between two or more parties.
  Value: Audiovisual media time and control code
Conceptual Domain: A set of possible valid values of a data element concept expressed without representation.
  Value: time and control code for film, audio and video expressed in hours, minutes, seconds and subseconds
Classification Level: Taxonomic location within the context of the registry.
  Value: Structural Metadata. Audio and Video File Component Identification
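A registered format such as hh:mm:ss;s can be turned directly into a validation rule. The sketch below checks only the ranges stated in the example element (hh 00-24, mm 00-59, ss 00-59, s a non-negative integer); it is not an implementation of SMPTE 12M, and the function name is an assumption.

```python
import re

# Two digits each for hours, minutes, seconds; one or more digits after ';'.
_TIME_CODE = re.compile(r"(\d{2}):(\d{2}):(\d{2});(\d+)")


def parse_time_code(value):
    """Return (hh, mm, ss, s) or raise ValueError if the value is out of format."""
    match = _TIME_CODE.fullmatch(value)
    if match is None:
        raise ValueError(f"not in hh:mm:ss;s format: {value!r}")
    hh, mm, ss, s = (int(part) for part in match.groups())
    if hh > 24 or mm > 59 or ss > 59:
        raise ValueError(f"field out of range: {value!r}")
    return hh, mm, ss, s


print(parse_time_code("19:31:57;1"))
print(parse_time_code("19:32:07;7"))
```

Values that fail the check, such as `25:00:00;0` or `19:31:57`, raise `ValueError` rather than entering the data set.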

31 Level of Ambiguity: Precision of Data Element Attribution.
  Value: Generalization
Registration Status: incomplete, recorded, certified and registered. Recommendation: work through the registry in many iterations. Do not move to certified or registered until the taxonomy of the registry is largely populated and the data element has proven its durability and functionality through comment, review and use.
  Value: Incomplete. Because some elements are missing and because I don't know the max and min numbers of characters for representation!
Administrative Status: Designation of position of the data element in the registration process.
  Value: Awaiting Information
Caveat: The above example is intended to illustrate the decision-making process for a first iteration of a data element and not to serve as a model.
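The four registration statuses form an ordered progression, which can be sketched as an ordered enumeration. The one-step-at-a-time promotion rule below is an assumption made for illustration; the slides name the statuses but do not prescribe a transition policy.

```python
from enum import IntEnum


class RegistrationStatus(IntEnum):
    INCOMPLETE = 0
    RECORDED = 1
    CERTIFIED = 2
    REGISTERED = 3


def promote(status):
    """Move a data element to the next status (hypothetical one-step policy)."""
    if status is RegistrationStatus.REGISTERED:
        raise ValueError("already at the final status")
    return RegistrationStatus(status + 1)


status = RegistrationStatus.INCOMPLETE
status = promote(status)
print(status.name)
```

Because the enumeration is ordered, a registry can also ask questions such as `status >= RegistrationStatus.CERTIFIED` before exposing an element for reuse.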

32 Principles in the Application of 11179
Rigorous registration process encourages multiple iterations of data elements. Data element statuses: incomplete, recorded, certified and registered.
BENEFITS OF DATA ELEMENT APPROACH:
- Unambiguous, shareable data that can be evaluated, analyzed and utilized on its intrinsic merits.
- Any kind of data can be described, including time series data measurements.
- Precise value attributes result in population of data sets with authoritative, highly usable data.
- Versioning allows data analysts to track changes in naming, definition, etc. for accurate time series analyses.
DRAWBACKS:
- Context and description at the data set or information object level is lacking.
- Searching at the data element level does not provide sufficient description and meaning for document retrieval. Ex: data element "species" as used in Registration of Endangered Species or Catalog of Species of Northern Michigan.

33 ISO 11179 Implementation - Environmental Data Registry
Registry Name: Biological Class Name
Definition: The systematic name that represents the biological Class.
Example: Mammalia
Identifier: 20733
Version: 1
Administrative Status: Interim
Registration Status: Standard
Representation Class: Name
Unit of Measure:
Precision:
Submitting Organization: OIRM
Origin Description: Summary Report of Data Standards for Biological Taxonomy (Document)
Note Description: A Class is a major subdivision of a phylum or division, usually consisting of several orders.
Unresolved Issues:
DISA:
Create Date: 11/05/98
Change Date: 05/26/99
Value Domain Information
Definition: All names that represent the portion of a systematic name that is the biological Class.
Type Name: Determinant
Datatype: Alphanumeric
Format: A(50)
Determinant Type:
Minimum Character: 5
Maximum Character: 50
From: United States Environmental Protection Agency (EPA) Environmental Data Register http://www.epa.gov/edr
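The value domain attributes in this record (alphanumeric, minimum 5 and maximum 50 characters) are enough to check candidate values mechanically. The sketch below is one possible reading: treating "Alphanumeric" as "letters and digits, with spaces allowed" is an assumption, as are the class and method names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    min_chars: int
    max_chars: int

    def accepts(self, value: str) -> bool:
        # Length must fall inside the registered bounds.
        if not self.min_chars <= len(value) <= self.max_chars:
            return False
        # Assumed reading of "Alphanumeric": letters/digits, spaces ignored.
        return value.replace(" ", "").isalnum()


biological_class_name = ValueDomain(min_chars=5, max_chars=50)
print(biological_class_name.accepts("Mammalia"))
```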

34 ISO 11179 Metadata Registry: Implementations
[Figure: relationships among atomic object, data element concept, data element, conceptual domain and value domains. Relationship labels: specifies (1..1), represented_by (1..1), represents (1..1), represented_with]
Atomic Object: Country
Data Element Concept: A Country
Data Element: Country, represented with ISO 3166
Conceptual Domain: Country Code Domain. Identifier: Afghanistan, Belgium, China, ...
Value Domains:
- ISO 3166, format: Alpha-3. Items: AFG, BEL, CHN, ...
- ISO 3166, format: Number. Items: 004, 056, 156, ...
From: CBOP Consortium. Hajime Horiuchi hori@tiu.jc.jp http://www.cbop.gr.jp
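The point of separating the conceptual domain (the countries) from its value domains (the code lists) is that codes in one representation can be translated to another by going through the shared concept. A minimal sketch using only the three countries shown on the slide; the dictionary layout and function name are assumptions.

```python
# One conceptual domain (country names) with two ISO 3166 value domains.
VALUE_DOMAINS = {
    "alpha-3": {"Afghanistan": "AFG", "Belgium": "BEL", "China": "CHN"},
    "numeric": {"Afghanistan": "004", "Belgium": "056", "China": "156"},
}


def convert(code, source, target):
    """Translate a code between value domains via the shared concept."""
    # Invert the source domain to recover the concept behind the code.
    concept_for_code = {v: k for k, v in VALUE_DOMAINS[source].items()}
    concept = concept_for_code[code]
    return VALUE_DOMAINS[target][concept]


print(convert("BEL", "alpha-3", "numeric"))
print(convert("156", "numeric", "alpha-3"))
```

Numeric codes are kept as strings so that leading zeros survive.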

35 ISO 11179 Metadata Registry: Implementations
Traffic Management Data Dictionary. Section 3 Data Elements. Version: 1.4. February 5, 1999. Annex 3 - Traffic Modeling
Descriptive Name: PREDICTED_HovLaneVehicleCount_quantity
Descriptive Name Context: Manage Traffic
Definition: Predicted number of vehicles within a user-specified time period that legitimately are using High Occupancy Vehicle (HOV) lanes in the road and highway network.
Class Name: Traffic Modeling
Classification Scheme Name: IEEE P1489, Annex B
Classification Scheme Version: 19980706, V0.1.0
Keywords: HOV Lane Vehicle Count
Related Data Concept:
Relationship Type:
ASN1 Name: Predicted-HOV-lane-vehicle-count
ASN1 Data Type: Integer
Representation Class Term: Quantity
Value Domain: SI 10-1997; vehicles
Valid Value Range:
  Valid Value List:
  Valid Value Rule:
  Valid Value Range: VALUE (0 to 100000)
Internal Representation Layout: 9999999999
Internal Layout Maximum Size:
Internal Layout Minimum Size:
Remarks: V1.1 - New data element.
Data Concept Identifier: 3550
Data Concept Version: V1.5
Submitter Organization Name: TMDD
Last Change Date: 19990205
Joint Effort: Institute of Transportation Engineers (ITE), Federal Highway Administration (FHWA) and the American Association of State Highway and Transportation Officials (AASHTO) http://www.ite.org/tmdd/

36 ISO 11179 Variations: NASA Entity Dictionary Specification Language (DEDSL)
BEGIN_GROUP = MODULE_IDENTIFICATION ;
  DEDSL_VERSION = 0.1;
  MODULE_TITLE = "Global Change Master Directory dictionary" ;
  MODULE_ADID = Not yet registered ;
END_GROUP = MODULE_IDENTIFICATION ;
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Entry_ID ;
  MEANING = Unique identifier of the DIF ;
  SHORT_MEANING = Directory Entry Identifier ;
  VALUE_SYNTAX = STRING;
END_GROUP = ENTITY_DEFINITION ;
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Entry_Title ;
  MEANING = Title of the DIF ;
  SHORT_MEANING = Directory Entry Title ;
  VALUE_SYNTAX = STRING;
END_GROUP = ENTITY_DEFINITION ;
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Data_Set_Citation.Publication_Place ;
  MEANING = "The name of the city (and state or province and country if needed) where the data set was published or released." ;
  SHORT_MEANING = "Place where the data set was published or released." ;
  VALUE_SYNTAX = STRING ;
END_GROUP = ENTITY_DEFINITION ;
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Data_Set_Citation.URL ;
  MEANING = "The Internet Uniform Resource Locator(s) (URL) of the data set." ;
  SHORT_MEANING = "URL of the data set." ;
  VALUE_SYNTAX = STRING ;
END_GROUP = ENTITY_DEFINITION ;
Source: Lou Reich. NASA/CCSDS
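The listing above has a simple shape: `KEY = value ;` statements bracketed by `BEGIN_GROUP` and `END_GROUP`. The sketch below reads that shape into dictionaries. It handles only what appears in the listing (unnested groups, optional double-quoted values) and is not a full parser for the language; the function name and the `_group` key are assumptions.

```python
import re

SAMPLE = '''
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Entry_ID ;
  MEANING = Unique identifier of the DIF ;
  SHORT_MEANING = Directory Entry Identifier ;
  VALUE_SYNTAX = STRING;
END_GROUP = ENTITY_DEFINITION ;
BEGIN_GROUP = ENTITY_DEFINITION ;
  NAME = Data_Set_Citation.URL ;
  MEANING = "The Internet Uniform Resource Locator(s) (URL) of the data set." ;
  VALUE_SYNTAX = STRING ;
END_GROUP = ENTITY_DEFINITION ;
'''


def parse_groups(text):
    """Return one dict per BEGIN_GROUP/END_GROUP block."""
    groups = []
    current = None
    # A statement is a run of quoted strings or characters other than ';',
    # so a semicolon inside double quotes does not end the statement.
    for statement in re.findall(r'(?:"[^"]*"|[^;"])+', text):
        statement = statement.strip()
        if not statement:
            continue
        key, _, value = statement.partition("=")
        key = key.strip()
        value = value.strip().strip('"')
        if key == "BEGIN_GROUP":
            current = {"_group": value}
        elif key == "END_GROUP":
            if current is not None:
                groups.append(current)
            current = None
        elif current is not None:
            current[key] = value
    return groups


entities = parse_groups(SAMPLE)
print([entity["NAME"] for entity in entities])
```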

37 Elaboration in XML
Film_Video_Audio_Time_and_Control_Code
Time and control code for tracking playback of film, audio, and video
Incomplete
Awaiting review
…
Based on: J. McCarthy, et al. Using XML for Environmental Data Sharing. Open Forum on Metadata Registries. 1/20/2000

38 Metadata Schema Approach
Three Types of Metadata (Digital Library Federation Architecture Committee):
- Descriptive: Discovery and identification of an object (Dublin Core, MARC, EAD, etc.)
- Structural: Used to display and navigate an object. Provides information on the internal organization of an object.
- Administrative: Management information. Date created, modified, etc. Content file format (e.g. JPEG); rights information, etc.

39 OAIS Reference Model:
- Content Information: The data object and its representation that makes it understandable to the user (DLF: Structural)
- Preservation Description: Provenance, Context, Reference and Fixity (DLF: Structural and Administrative)
- Descriptive Information: (DLF: Descriptive Information)

40 Descriptive: Recommendations:
- In most cases, use standards-based Dublin Core as the base record, for interoperability.
- Add fields to serve your domain user group as needed.
- Document and register any added fields.
- Create an XML DTD.
- Distribute metadata creation responsibilities:
  - Administrative and Structural: Largely provided by content digitizers
  - Descriptive: Largely provided by domain specialists
- Use thesauri for controlled subject terminology.
Tool: Koch, Traugott, comp. Controlled vocabularies, thesauri and classification systems available in the WWW. DC Subject. http://www.lub.lu.se/metadata/subject-help.html

41 Structural: Identification
- URN (Uniform Resource Name): intended to persist. IETF RFC 1737
- URL: de facto web naming and addressing standard
- PURL: Persistent URL. Involves intermediate resolution by a third party.
- Handles: Developed by CNRI. URN proposal emphasizes persistent names. Names are maintained by the object publisher or author. The handle server reconciles the permanent name with address changes. http://www.handle.net/
See also: Library of Congress. National Digital Library Program. The Relationship between URNs, Handles and PURLs. http://lcweb2.loc.gov/ammem/award/docs/PURL-handle.html
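PURLs and handles share one mechanism: a resolver keeps the mapping from a permanent name to the current address, so the name in the metadata record never has to change. The sketch below illustrates only that indirection; it is not the Handle System or PURL protocol, and the class, names and addresses are hypothetical.

```python
class Resolver:
    """Toy name-to-address table standing in for a resolution service."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        # Registering an existing name again simply updates its address.
        self._locations[name] = url

    def resolve(self, name):
        return self._locations[name]


resolver = Resolver()
name = "urn:example:metadata-overview"  # hypothetical persistent name

resolver.register(name, "http://old.example.org/meta")
first = resolver.resolve(name)

# The object moves; only the resolver entry changes, not the stored name.
resolver.register(name, "http://new.example.org/docs/meta")
second = resolver.resolve(name)

print(first, second)
```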

42 Administrative Metadata Issues: Digital Persistence:
- Technology emulation (recreate the technology needed to open and display)
- Migration path/backward compatibility (standards backward compatible by one or more versions to allow migration of data)
- Interpolation (technology interpolates to retrieve or enhance obsolete data)

43 Managing Data for Digital Persistence:
Maintain information needed to create, retrieve and display each digital object. Include platform, processor, and version info for software and OS:
- digital creation hardware and software
- digital editing hardware and software
- digital viewing hardware and software
- calibration hardware and software
Visual Media: Color: color space (RGB, CMYK); color look-up table; color profile for digital camera or scanner; color chart used for calibration.

44 Compression:
Images:
- pixels: pixel array (ex: 2,000 x 3,000 ppi)
- bit depth (8-bit, 16-bit, 24-bit, etc.)
Video and Audio:
- If at all possible, save master file in uncompressed format, e.g. IEEE (Institute of Electrical & Electronics Engineers) CCIR 601 (Broadcast Digital Video): NTSC -- 720 x 480; PAL -- 720 x 576; 10-bit or 8-bit
- For MPEG1, 2 and 4 include level of service; frame rate (fps); frame size; bit depth. Consider IPB ratio.
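Recording pixel array and bit depth makes the storage cost of an uncompressed master easy to estimate. The sketch below works through the figures named on the slide. The video line assumes 8-bit samples with 4:2:2 chroma subsampling (an average of 16 bits per pixel), which is an assumption not stated on the slide.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Size of one uncompressed image or frame in bytes."""
    return width * height * bits_per_pixel // 8


# Still image: 2,000 x 3,000 pixel array at 24-bit depth.
image_size = uncompressed_bytes(2000, 3000, 24)

# Video frames, assuming 8-bit 4:2:2 sampling (16 bits per pixel on average).
ntsc_frame = uncompressed_bytes(720, 480, 16)
pal_frame = uncompressed_bytes(720, 576, 16)

print(image_size, ntsc_frame, pal_frame)
```

Under these assumptions the still image is 18,000,000 bytes, an NTSC frame is 691,200 bytes and a PAL frame is 829,440 bytes.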

45 Rights Management
- Management Restrictions
- Management Conditions
- Access Restrictions
- Access Conditions
- Use Restrictions
- Use Conditions
Define rights management in a format that maps to future use of a resolver (e.g. resolve to an address with copyright and use information as opposed to embedded use and access text).
DOI - Digital Object Identifier. Development of the commercial publishing domain. Uses handles technology to resolve access and use (and to support ecommerce applications). http://www.doi.org

46 Recommendation: Re-use metadata elements developed by respected early adopters:
- Making of America II White Paper: http://sunsite.berkeley.edu/moa2/wp-v2.html
- MOAII Document Type Definition: http://sunsite.berkeley.edu/MOA2/papers/DTD.html
- National Library of Australia: PANDORA (Preserving and Accessing Networked Documentary Resources of Australia) http://www.nla.gov.au/pandora
- Library of Congress: Structural Metadata Dictionary for LC Digital Objects http://lcweb.loc.gov:8081/ndlint/repository/attdefs.html
- UKOLN (UK Office for Library and Information Networking): http://www.ukoln.ac.uk/metadata/cld/

47 Rights Metadata:
- University of Pittsburgh. School of Information Sciences. Functional Requirements for Evidence in Recordkeeping. http://www.lis.pitt.edu/~nhprc
Video Metadata:
- Hunter, Jane and Liz Armstrong. A Comparison of Schemas for Video Metadata Representation. http://www8.org/w8-papers/3c-hypermedia-video/comparison/comparison.html
- Hunter, Jane and Jan Newmarch. An Indexing, Browsing, Search and Retrieval System for Audiovisual Libraries. http://link.springer.de/link/service/series/0558/bibs/1696/16960076.htm
Administrative Metadata:
- A-Core: IETF draft standard for metadata about descriptive metadata (documenting provenance, etc.). Iannella & Campbell. http://metadata.net/admin/draft-iannella-admin-01.txt

48 Recommendations:
* Create XML DTD for metadata records by format
* Use Dublin Core with approved qualifiers as the base record
* Document metadata elements in a metadata registry
* Use RDF as the export wrapper (report format for a relational database)
AlphaWorks Tools can assist: http://www.alphaworks.ibm.com
* DDbE: accepts well-formed XML documents and constructs a DTD
* XMI Toolkit: generate DTDs and share Java objects
* XML Parser for Java: validating parser
* XML Generator: generates instances of valid XML from a DTD
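The "RDF as export wrapper for a relational database" recommendation can be sketched with the standard library alone: rows come out of a table and are wrapped as RDF/XML descriptions carrying Dublin Core elements. The table name, column names, sample row and `export_rdf` function are all assumptions; the two namespace URIs are the published RDF and Dublin Core element namespaces.

```python
import sqlite3
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def export_rdf(conn):
    """Wrap every row of the (hypothetical) records table as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    rows = conn.execute("SELECT url, title, creator FROM records ORDER BY url")
    for url, title, creator in rows:
        # One rdf:Description per row, identified by the resource URL.
        desc = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": url}
        )
        ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
        ET.SubElement(desc, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")


# In-memory database with one illustrative record (placeholder URL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (url TEXT PRIMARY KEY, title TEXT, creator TEXT)")
conn.execute(
    "INSERT INTO records VALUES (?, ?, ?)",
    ("http://example.org/meta", "Metadata Overview", "Grace Agnew"),
)
rdf_xml = export_rdf(conn)
print(rdf_xml)
```

Keeping the relational table as the working store and generating RDF only on export means the interchange format can change without touching the database design.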

49 General References
- Metadata Resources. http://dewey.yonsei.ac.kr/metadata/links.htm
- IFLANET. Digital Libraries: Metadata Resources. http://www.ifla.org/II/metadata.htm
- UK Office of Library Networking. Metadata for Preservation: CEDARS Project Document AIW01. http://www.ukoln.ac.uk/metadata/cedars/AIW01.html
- National Library of Australia. PADI: Preserving Access to Digital Information. http://www.nla.gov.au/padi/
- National Archives of Australia. Designing and Implementing Recordkeeping Systems. http://www.naa.gov.au/Govserv/techpub/DIRKSman/dirks.html

