New Century, New Metadata
Thomas Krichel
University of Surrey, Hitotsubashi University and Long Island University
Why Metadata
  - Fun
  - Information retrieval
  - Support organization of social process
Crisis of Author Self-archiving
  - Formal archiving
      - Small
      - Metadata poor
  - Informal archiving
      - Information retrieval difficult
      - Lack of support infrastructure
Improving Formal Archiving
  - Strengthen the metadata provision
  - Broaden the mission of archiving
  - Allow usage of archived material in many user services
  - Better report on archive material usage
  - Strengthen the relationship with overlay services
Improving Informal Archiving
  - Build standardized metadata supply format
  - Harvest that metadata into larger digital libraries
  - Offer archival backup for papers
Metadata to Support Self-archiving
  - Simple to compose
      - Intuitive vocabulary that is specific to the academic process, e.g. author instead of creator
  - Widely applicable
      - All disciplines and publication forms
  - High quality
      - i.e. controlled
Metadata Control
  - Any processing that is done to the metadata before its inclusion in a user service.
  - Essential in a situation where metadata is harvested.
Types of Control
  - Syntactic control
  - Relational control
  - Retrieval control
  - Identity control
  - Verity control
  - Accession control
Basic Model
  - Four different record types
      - Document
      - Group
      - Person
      - Organization
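The four record types could be modelled roughly as follows. This is a minimal Python sketch, not the schema from the talk: the field names (`handle`, `title`, `name`) and the helper `record_type_name` are hypothetical, chosen only to show that each record type is a separate, identifiable kind of record.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Common base: every record carries an identifier (hypothetical field name)."""
    handle: str


@dataclass
class Document(Record):
    title: str  # hypothetical attribute


@dataclass
class Group(Record):
    name: str  # hypothetical attribute


@dataclass
class Person(Record):
    name: str  # hypothetical attribute


@dataclass
class Organization(Record):
    name: str  # hypothetical attribute


# The four record types named on the slide.
RECORD_TYPES = (Document, Group, Person, Organization)


def record_type_name(record: Record) -> str:
    """Return the lower-case name of the record's type, e.g. 'document'."""
    return type(record).__name__.lower()
```

Keeping the types separate, rather than folding author or series details into the document record, is what lets different parties maintain different records.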
Group and Document
  - There is only one document type.
  - Groups are used to refine the status of the document.
  - The group construct is meant to be defined by librarians, publishers and other intermediaries.
Person and Institution
  - Person and institution admit very similar attributes.
  - It is hoped that organizational information will be contributed by intermediaries.
Implementation of Basic Model
  - RePEc documents 100 groups (series) 500 authors 5000 institutions
  - Example
  - Possible to do the same thing for ReLIS
Basic Grammar
  - XML syntax
  - Three groups of XML elements
      - Nouns: elements for the items described
      - Adjectives: elements that describe nouns
      - Verbs: elements that relate nouns
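To make the noun/adjective/verb grammar concrete, here is a small hypothetical record parsed with Python's standard `xml.etree.ElementTree`. The element names (`collection`, `document`, `person`, `title`, `name`, `hasauthor`), the `ref` attribute and the identifiers are assumptions for illustration only; the slides do not give the actual vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two nouns (document, person), two adjectives
# (title, name) and one verb (hasauthor) relating the two nouns.
SAMPLE = """
<collection>
  <document id="example:kapital">
    <title>Das Kapital</title>
    <hasauthor id="example:rel1" ref="example:marx"/>
  </document>
  <person id="example:marx">
    <name>Karl Marx</name>
  </person>
</collection>
""".strip()

# Assumed vocabulary: nouns follow the four record types; the verb is invented.
NOUNS = {"document", "group", "person", "organization"}
VERBS = {"hasauthor"}


def classify(tag: str) -> str:
    """Classify an element name as 'noun', 'verb' or 'adjective'."""
    if tag in NOUNS:
        return "noun"
    if tag in VERBS:
        return "verb"
    return "adjective"


def describe(xml_text: str) -> list:
    """List (noun, child element, child's grammatical role) for each noun's children."""
    root = ET.fromstring(xml_text)
    return [(noun.tag, child.tag, classify(child.tag))
            for noun in root
            for child in noun]
```

In this sketch the verb carries a reference to another noun's id, so the author is described once in a person record instead of being repeated inside every document.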
Modular Design
Relational Design
  - Das Kapital
Other Features
  - All elements take a lang qualifier: a two-letter value is an ISO 639-1 code, and a three-letter value is the bibliographic variant of an ISO 639-2 code.
  - Nouns have an id.
  - Verbs have startdate and enddate qualifiers and, of course, an id.
  - Adjectives can have child elements.
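The qualifier rules above lend themselves to a simple mechanical check. The following is a sketch under stated assumptions: qualifiers are taken to be XML attributes, the function name `check_qualifiers` and the element names in the usage lines are hypothetical, and the check only tests the length of the language code, not whether the code is actually assigned in ISO 639.

```python
import xml.etree.ElementTree as ET


def check_qualifiers(elem: ET.Element, kind: str) -> list:
    """Return a list of problems for one element.

    kind is 'noun', 'verb' or 'adjective' (the caller decides which).
    """
    problems = []

    # lang, when present, must be a two- or three-letter code.
    lang = elem.get("lang")
    if lang is not None:
        if not (lang.isascii() and lang.isalpha() and len(lang) in (2, 3)):
            problems.append(f"{elem.tag}: lang must be a two- or three-letter code")

    # Nouns and verbs both carry an id.
    if kind in ("noun", "verb") and elem.get("id") is None:
        problems.append(f"{elem.tag}: missing id")

    return problems


# Usage with hypothetical elements.
good_verb = ET.fromstring('<hasauthor id="example:rel1" lang="de" startdate="1867"/>')
bad_noun = ET.fromstring('<document lang="german"/>')
```

Here `check_qualifiers(good_verb, "verb")` reports no problems, while `check_qualifiers(bad_noun, "noun")` reports both the over-long language code and the missing id. The startdate and enddate values are not checked, since the slides do not state a date format.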
Remaining Problems
  - Resolvability rules for identifiers
  - Dates and history
  - Subject classification using the group mechanism
  - Aliasing of element names
To be done…
  - Complete list of verbs and adjectives
  - Schema design
  - Parsing and validation software
  - Conversion with test collection ReLIS
Collaboration is welcome
  - Thanks for listening.
  - Have a happy New Year.