How do you squeeze all of a research project into the repository? Michael Wood Institutional Repository Manager ARROW Community Day, Melbourne 27th September 2007
The promise of repositories Collection, organisation, preservation and dissemination of diverse digital objects Using computers in networks to aggregate, re-use and manipulate objects for a variety of purposes Build the global digital library
eScience and eHumanities Increased and improved: –Reworking data –Re-evaluating conclusions –Collaboration –Comparison Through: –Published data –Published methods as well as published conclusions
How do you deliver this? As a repository manager can you deliver this promise now? To investigate delivering part of that promise in practice with a real example at La Trobe How do you truly encapsulate a research project in a repository?
The research project Archaeology Cyprus 15 years Lots of research products to play with Can we encapsulate this project in the repository? How would we go about it? If it isn’t doable now, what are the issues?
Research products – Maps and plans
Divide the site into compounds
Drawings of the compounds
Detail from compound drawing
Photos of features Hearth
Photos of artefacts
Artefacts of diverse sources and uses and relations
Photographs containing data
Drawings of items
Field logs recording and linking
Field record sheets
Records that are indexes
Indexes to objects
Even journal articles!
All in a related collection
Making it meaningful We can easily add these objects to the repository and disseminate them How do we make them meaningful to users? How do we get them to reflect the full meaning of the research? We need to express the relationships of these objects
Why relationships Provide context to objects and hence meaning Find related objects Understand how the objects were created – provenance Manipulate and process related objects
Expressing relationships Some can be inferred from other metadata –Photos with time in the technical metadata could be related to a phase of the project Some may be expressed through an index object Some need direct expression
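The first case, inferring a relationship from existing technical metadata, can be sketched in a few lines. This is only an illustration under stated assumptions: the season dates, the object identifiers and the predicate name `wasTakenDuring` are all invented for the example and do not come from the project.

```python
from datetime import date

# Hypothetical field seasons: (label, first day, last day). These dates are
# invented for illustration and are not the project's real seasons.
SEASONS = [
    ("season-1", date(2000, 6, 1), date(2000, 7, 31)),
    ("season-2", date(2001, 6, 1), date(2001, 7, 31)),
]


def infer_season(taken_on):
    """Return the label of the season that contains the date, or None."""
    for label, start, end in SEASONS:
        if start <= taken_on <= end:
            return label
    return None


def infer_triples(photos):
    """Turn {photo_id: capture_date} into (subject, predicate, object) triples.

    Photos whose capture date falls outside every season are skipped, since
    no relationship can be inferred for them.
    """
    triples = []
    for photo_id, taken_on in sorted(photos.items()):
        season = infer_season(taken_on)
        if season is not None:
            # "wasTakenDuring" is a hypothetical predicate name.
            triples.append((photo_id, "wasTakenDuring", season))
    return triples
```

A photo dated inside a season yields one triple; a photo dated outside every season yields none and would need its relationship expressed directly or through an index object.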
Relationships - kinds Hierarchical Spatial Temporal Semantic
Hierarchy - collection isMemberOf Create a “home object” for the collection Unidirectional label Relationships could be reciprocal – an issue
Sub-hierarchies contains “contains” may mean different things – an issue
Spatial foundAdjacentTo
Temporal isOccupationPhaseOf
The importance of Time - 1
The importance of Time - 2 “The recording system is essentially the same as that used since 1992, with only a few modifications and slight variations from one season to another. There will probably be further changes in the future. These differences (the most important involve pottery recording, especially decoration) need to be borne in mind when working with material from different seasons.” Frankel and Webb, Marki Excavation Manual
Provenance isRecordedIn
Use – Publication or Teaching isPublishedIn
Implementing in Fedora Fedora can have: –Objects – the things themselves as a datastream –Metadata about the objects –Methods of disseminating or processing the objects –Identifiers attached to the objects
Implementing – RELS-EXT This is the recommended pathway Fedora objects have a metadata datastream called RELS-EXT Relationships are expressed in RDF –In this case subject hasRelationshipWith object The triples are indexed in a triple store
RELS-EXT example in RDF [slide listing not recovered; surviving fragments: "syntax-ns#", "s#"]
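As a rough sketch of what a RELS-EXT datastream for one object might contain, the following builds the RDF/XML with the Python standard library. The object identifiers (`demo:...`) and the project namespace are hypothetical; `isMemberOf` is taken from Fedora's relationship ontology, while `foundAdjacentTo` is assumed here to live in a project-specific namespace.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Namespace of Fedora's relationship ontology (where isMemberOf is defined).
FEDORA_REL_NS = "info:fedora/fedora-system:def/relations-external#"
# Hypothetical project namespace for excavation-specific relationships.
PROJECT_NS = "http://example.org/excavation/relations#"

# Readable prefixes in the serialised XML.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rel", FEDORA_REL_NS)
ET.register_namespace("dig", PROJECT_NS)


def build_rels_ext(subject_pid, relationships):
    """Build RELS-EXT style RDF/XML for one object.

    relationships is a list of (namespace, predicate, target_pid) tuples,
    each becoming one "subject predicate object" statement.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{subject_pid}"},
    )
    for namespace, predicate, target_pid in relationships:
        ET.SubElement(
            description,
            f"{{{namespace}}}{predicate}",
            {f"{{{RDF_NS}}}resource": f"info:fedora/{target_pid}"},
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifiers, for illustration only.
xml_text = build_rels_ext("demo:hearth-photo", [
    (FEDORA_REL_NS, "isMemberOf", "demo:collection"),
    (PROJECT_NS, "foundAdjacentTo", "demo:compound-drawing"),
])
```

Each child element of `rdf:Description` is one triple: the photo is the subject, the element name is the relationship, and `rdf:resource` points at the related object.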
All in a related collection
Issues – Proof of the software We need to see a large triple store indexed and working
Issues – Data and metadata capture Get the objects/metadata in efficiently –In the field –Assisted post-field entry –FieldHelper – USydney Indexing efficiency
Issues - Relationships Create a defined set of relationships specific to the project Relatable where possible to existing schemes Expanding on the existing Fedora set – hierarchy Need vocabularies with namespaces to carry meaning into the future
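One way to picture a defined, namespaced set of relationships is a small controlled vocabulary that reuses Fedora's own terms where they exist and adds project terms elsewhere. The sketch below is an assumption-laden illustration: the project namespace URI is hypothetical, and the assignment of each relationship to a kind is a guess based on the slide headings, not a published scheme.

```python
from dataclasses import dataclass

# Namespace of Fedora's relationship ontology (where isMemberOf is defined).
FEDORA_REL_NS = "info:fedora/fedora-system:def/relations-external#"
# Hypothetical namespace for relationships specific to the project.
PROJECT_NS = "http://example.org/excavation/relations#"


@dataclass(frozen=True)
class Relationship:
    namespace: str
    name: str
    kind: str  # e.g. hierarchical, spatial, temporal

    @property
    def uri(self):
        """Full predicate URI: the namespace carries the meaning forward."""
        return self.namespace + self.name


# Relationship names come from the slides; their grouping is illustrative.
VOCABULARY = {
    r.name: r
    for r in [
        Relationship(FEDORA_REL_NS, "isMemberOf", "hierarchical"),
        Relationship(PROJECT_NS, "foundAdjacentTo", "spatial"),
        Relationship(PROJECT_NS, "isOccupationPhaseOf", "temporal"),
        Relationship(PROJECT_NS, "isRecordedIn", "provenance"),
        Relationship(PROJECT_NS, "isPublishedIn", "use"),
    ]
}


def resolve(name):
    """Return the full predicate URI, rejecting names outside the vocabulary."""
    if name not in VOCABULARY:
        raise KeyError(f"{name!r} is not a defined relationship")
    return VOCABULARY[name].uri
```

Rejecting undefined names at entry time is a deliberate choice in this sketch: it keeps ad hoc labels out of the triple store, so every stored relationship can be traced to a namespaced definition.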
Issues - Retrieval Delivery mechanisms for diverse kinds of objects In combination
Issues - Search Search both ways (reciprocal relationships) In particular from the whole object to its parts and from any part back to the whole User interface that provides rich opportunities without getting the searcher lost
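Searching both ways over a unidirectional label such as isMemberOf does not necessarily require storing the reciprocal relationship; an index that can be read from either end may be enough. The following is a minimal in-memory sketch of that idea, not a description of how Fedora's own index works, and the identifiers are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """Tiny in-memory index that can be searched from either end of a triple."""

    def __init__(self):
        self._forward = defaultdict(set)  # (subject, predicate) -> objects
        self._reverse = defaultdict(set)  # (object, predicate) -> subjects

    def add(self, subject, predicate, obj):
        """Store one triple under both lookup directions."""
        self._forward[(subject, predicate)].add(obj)
        self._reverse[(obj, predicate)].add(subject)

    def objects(self, subject, predicate):
        """Part to whole: what is this subject related to?"""
        return sorted(self._forward.get((subject, predicate), ()))

    def subjects(self, predicate, obj):
        """Whole to parts: which subjects point at this object?"""
        return sorted(self._reverse.get((obj, predicate), ()))


# Hypothetical identifiers, for illustration only.
index = TripleIndex()
index.add("photo:hearth", "isMemberOf", "collection:home")
index.add("drawing:compound", "isMemberOf", "collection:home")

parts = index.subjects("isMemberOf", "collection:home")
whole = index.objects("photo:hearth", "isMemberOf")
```

Here `parts` lists every member of the collection and `whole` leads from a single photo back to its collection, both from the same stored triples.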
Only beginning Many groups working simultaneously on –Scientific models –Geographical expression –RDF triple stores and indexes –Automated field entry
A work in progress Michael Wood Institutional Repository Manager La Trobe University Library