Optimising metadata workflows in a distributed information environment
R. John Robertson & Jane Barton
Centre for Digital Library Research, University of Strathclyde, UK
Overview
- Introductions & definitions:
- Metadata, workflow & optimisation
- Diversity & the distributed information environment
- Models and frameworks:
- Generic models: repositories, objects & metadata
- Existing models & frameworks
- Developing a metadata lifecycle model
- Using the metadata lifecycle model to optimise workflow
- Moving forward
Metadata, workflow & optimisation
- Metadata = good quality metadata = metadata that meets repository requirements
- Metadata workflow = quality assured metadata by design = metadata creation & QA processes designed to meet repository requirements with available resources
- Metadata workflow optimisation = refining metadata workflow to improve quality & enhance metadata
- Critical to functionality, interoperability & sustainability of repositories
Optimising metadata workflow
[Diagram labels: Determine required metadata quality; Determine target metadata quality; Design & implement workflow; Refine workflow; Review; Determine purpose of metadata; Local environment; Wider environment]
Barton, J. & Robertson, R.J. Designing workflows for quality assured metadata. CETIS Metadata & Digital Repositories SIG Meeting, Edinburgh, 10th March 2005.
Diversity & the distributed information environment (dIE)
- In the wider environment, there is considerable diversity:
  - of purpose
  - of metadata requirements
  - of metadata creation processes & priorities
- Diversity presents challenges for interoperability between repositories
- Diversity also offers potential for refinement of metadata workflow among repositories
- Assumes/requires persistent object identifiers
Optimising metadata workflow in the dIE
- Workflow optimisation requires a model of the dIE:
  - to facilitate strategic partnerships
  - to inform allocation of resources
  - to foster a holistic approach to creation, augmentation & enhancement of metadata
- To achieve this, two conditions must be met:
  - local workflow must be articulated
  - local workflow must be placed in the context of the wider environment
Reference models for workflow optimisation
- Ecology of repositories
  - provides a typology of repositories & associated services
  - models the relationships between them & between their domains
- Object lifecycle model
  - profiles objects within repositories & their movement, transformation & adaptation within the dIE
- Metadata lifecycle model
  - profiles metadata within repositories & its movement, augmentation & enhancement within the dIE
Existing models & frameworks
- Existing models that relate to (parts of) the reference models:
  - the E-Learning Framework
  - McLean & Blinco's cosmic view
  - the JISC Information Environment
  - CORDRA
  - the work of Gonçalves et al.
The E-Learning Framework (ELF)
- A common approach to service oriented architectures for education via:
  - a definitional model of service components
  - standards & tools to support their interoperability
- Addresses a specific domain & provides a typology of functions within that domain
(The E-Learning Framework.
McLean & Blinco's cosmic view
- A service domain typology of repositories:
  - more comprehensive than ELF but less detailed
  - highlights potential for cross-domain approach
  - identifies need for better articulation of context & methodologies to deal with complex contextual issues
(McLean, N. The ecology of repository services: a cosmic view. ECDL,
The JISC Information Environment
- Provides convenient access to a comprehensive collection of scholarly & educational materials:
  - can be viewed as a specific implementation of ELF
  - provides a superstructure to inform & co-ordinate technical infrastructure development
  - focuses on technical solutions to support structural & syntactical interoperability
  - taking a lead in addressing unresolved issues in the object lifecycle
(JISC. Strategic activities: Information Environment
CORDRA
- Enables access to a wide range of learning object repositories (LORs) through federated searching:
  - high common denominator for participating LORs
  - creates community of repositories behind interoperability boundary
  - assumes federation as method of interaction, with metadata integration rather than interoperability, so little potential for metadata workflow optimisation
(Kraan, W. & Mason, J. Issues in federating repositories: a report on the first International CORDRA Workshop. D-Lib Magazine, 11(3), 2005.)
Gonçalves et al.'s 5S
- Complex formal taxonomy of repositories:
  - comprehensively catalogues repositories from five perspectives
  - engages with all three reference models
  - but does not engage with interactions & offers only a static view
(Gonçalves, M.A. et al. Streams, structures, spaces, scenarios, societies (5S): a formal model for digital libraries. ACM Transactions on Information Systems, 22(2), 2004.)
Existing models & frameworks
- In general, existing models:
  - address structural & syntactic interactions to a degree but do not address semantic interactions
  - provide voices, vocabularies & grammar for repositories
  - could usefully be extended to profile not only what repositories do but how they might interact with each other
Developing a metadata lifecycle model
- A metadata lifecycle model (MLM) must:
  - include profiles of each repository's metadata, ideally at element level, more realistically in terms of structure, semantics & syntax
  - distinguish between local requirements & those of the wider community
  - enable clusters of similar repositories to be identified & relationships established
  - include processes carried out as a result of these relationships, formal or informal
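The profiling and clustering requirements above can be illustrated with a minimal Python sketch. It is not part of the presentation: the `MetadataProfile` class, its field names, the repository names and the use of Jaccard overlap between vocabulary sets as a stand-in for "similar semantic approach" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataProfile:
    """Hypothetical profile of one repository's metadata."""
    repository: str
    structure: str   # schema in use, e.g. "LOM"
    syntax: str      # binding, e.g. "XML"
    vocabularies: frozenset = field(default_factory=frozenset)    # semantics
    local_elements: frozenset = field(default_factory=frozenset)  # local-only requirements


def semantic_overlap(a, b):
    """Jaccard similarity of two profiles' vocabulary sets (0.0 to 1.0)."""
    union = a.vocabularies | b.vocabularies
    if not union:
        return 0.0
    return len(a.vocabularies & b.vocabularies) / len(union)


def cluster_candidates(profile, others, threshold=0.5):
    """Names of other repositories whose semantics are close to `profile`."""
    return [o.repository for o in others
            if o.repository != profile.repository
            and semantic_overlap(profile, o) >= threshold]


# Made-up repositories, for illustration only.
lor_a = MetadataProfile("LOR-A", "LOM", "XML", frozenset({"MeSH", "LCSH"}))
lor_b = MetadataProfile("LOR-B", "LOM", "XML", frozenset({"MeSH", "LCSH", "DDC"}))
ir_c = MetadataProfile("IR-C", "DC", "XML", frozenset({"DDC"}))

similar = cluster_candidates(lor_a, [lor_a, lor_b, ir_c])
print(similar)  # ['LOR-B']
```

A real MLM would need richer profiles (ideally at element level, as the slide notes); the vocabulary-set comparison here only shows how profiles make clusters discoverable.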
Components of the model
Using the MLM to optimise workflow
- MLM enables repositories to optimise workflow by:
  - exploiting known metadata sources elsewhere in the dIE via intelligent import or harvesting
  - exploiting formal metadata relationships between repositories & services via negotiation & establishment of minimum standards
- MLM also provides a framework for assessing the cost/benefit of, e.g., implementing particular metadata elements or participating in consortia
Using the MLM: example
- The NSDL is a centralised service harvesting metadata from multiple sources:
  - breaks harvested metadata into elements & assigns provenance metadata to them
  - creates optimum records by combining metadata elements from various sources
  - creates metadata profiles of sources to enable these processes to be automated
  - demonstrates that metadata workflow optimisation & intelligent harvesting can yield real benefits
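The element-level approach described above can be sketched roughly as follows. This is not the NSDL's implementation: the `Element` class, the `trust` scores keyed by source and element name, and the sample records are hypothetical, chosen only to show how provenance-tagged elements allow a combined record to be built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str    # element name, e.g. "title"
    value: str
    source: str  # provenance: which repository supplied the value


def split_record(record, source):
    """Break one harvested record (a dict) into provenance-tagged elements.

    Empty values are dropped.
    """
    return [Element(name, value, source) for name, value in record.items() if value]


def best_record(elements, trust):
    """For each element name, keep the value from the most trusted source.

    `trust` maps (source, element name) to a score; missing pairs score 0.
    On a tie, the element seen first is kept.
    """
    chosen = {}
    for el in elements:
        current = chosen.get(el.name)
        if current is None or (trust.get((el.source, el.name), 0)
                               > trust.get((current.source, current.name), 0)):
            chosen[el.name] = el
    return chosen


# Made-up harvested records and source profile, for illustration only.
harvested = {
    "repo_a": {"title": "Intro to Cells", "subject": "Biology"},
    "repo_b": {"title": "Introduction to cells", "subject": "Cytology", "description": ""},
}
trust = {
    ("repo_a", "title"): 2, ("repo_b", "title"): 1,
    ("repo_a", "subject"): 1, ("repo_b", "subject"): 3,
}

elements = [el for source, rec in harvested.items() for el in split_record(rec, source)]
merged = best_record(elements, trust)
for name, el in merged.items():
    print(name, "=", el.value, "(from", el.source + ")")
```

The `trust` table plays the role of the source profiles mentioned on the slide: once it exists, selection can run without manual review.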
Using the MLM: use cases
- LOR using LOM wants to harvest metadata records, has crosswalks & mappings for structure & syntax, seeks repositories with similar semantic approach
- federated search service wants to dynamically select search targets that can support MeSH
- departmental repository enhances its metadata by re-harvesting general subject terms from its IR & specialist subject terms from a subject repository
- centralised service augments metadata automatically & original source re-harvests improved record
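As a rough illustration of the federated search use case, the sketch below selects targets from a set of repository profiles. The `select_targets` function, the profile layout (repository name mapped to a set of supported vocabularies) and the repository names are assumptions, not something the presentation specifies.

```python
def select_targets(profiles, vocabulary):
    """Return, in name order, the repositories whose profile lists `vocabulary`."""
    return sorted(name for name, vocabularies in profiles.items()
                  if vocabulary in vocabularies)


# Illustrative profiles only; the repository names are made up.
profiles = {
    "medical-lor": {"MeSH", "LCSH"},
    "institutional-ir": {"LCSH"},
    "nursing-subject-repo": {"MeSH"},
}

targets = select_targets(profiles, "MeSH")
print(targets)  # ['medical-lor', 'nursing-subject-repo']
```

In practice the profiles would come from a registry rather than a hard-coded dictionary, which is why the closing slides call for integration with registry projects.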
Moving forward…
- In context of rapid repository development with limited resources, must use available resources as effectively as possible
- Optimising metadata workflow across the dIE can enable repositories to:
  - expand element sets without compromising on quality
  - expand functionality
  - improve ingest processes
  - support more automatic metadata transformation & enhancement
Moving forward…
- Development of the MLM to support metadata workflow optimisation requires:
  - standard way of profiling repositories at repository, object & metadata level
  - integration with registry projects for repositories, standards, application profiles & vocabularies
  - at individual repository level, a method for the design of metadata workflows that makes reference to & exploits workflows elsewhere in the dIE
Optimising metadata workflow
[Diagram labels: Determine required metadata quality; Determine target metadata quality; Design and implement workflow; Refine workflow; Review; Determine purpose of metadata; Local environment; Wider environment]
Barton, J. & Robertson, R.J. Designing workflows for quality assured metadata. CETIS Metadata & Digital Repositories SIG Meeting, Edinburgh, 10th March 2005.