JEREMY NORDMOE, SIL International. Presentation transcript:
It is crucial that the capture of metadata occur as close as possible to the collection of resources to ensure that vital information is collected accurately and in a timely manner.
The SIL International Language & Culture Archives: 42,000+ items, 1,500+ languages. Given this scope and diversity of languages, the archives must rely on field linguists to submit complete and accurate metadata.
Past Practice: Field linguists filled out a one-page metadata questionnaire to accompany physical material. Archives staff processed items into the database and onto the shelves.
Past Practice: Problems 1. Archiving physical material from the field was cumbersome and risky. 2. Missing or incomplete questionnaires required Archives staff to: o research the missing information, o settle for minimal description, or o postpone processing for lack of key metadata.
New Practice: Deployment of a DSpace Institutional Repository. Facilitates digital archiving. Empowers field linguists to engage directly with the archives in both data discovery and data submission.
Roadblocks Two issues hamper the submission of language resources to DSpace from the field: 1) a wide range of metadata options 2) limited internet connectivity
Challenge #1: So much metadata, so little time… Deploying a simplified interface that handles the intricacies of metadata schemas used for vastly different types of resources: language documentation, vernacular literacy products, translated texts, language & culture descriptions, training materials. Many of the metadata fields are relevant only for specific kinds of documents.
Challenge #2: A field linguist’s internet blues… Working in remote areas with unreliable or non-existent internet connections. Conventional web applications are ineffective. File uploads over HTTP are not resumable.
Building an On-RAMP to the Digital Repository: Resource And Metadata Packager (RAMP), a client-side application that assembles metadata and all relevant data files in a ‘package’ that the SWORD API decodes into a submission in the repository.
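The packaging idea can be illustrated with a minimal sketch. The transcript does not describe RAMP's actual package format, so the zip container, the `metadata.json` file name, the `data/` folder, the `build_package` function, and the sample values are all assumptions made for illustration.

```python
import io
import json
import zipfile


def build_package(metadata, files):
    """Bundle metadata and data files into one zip archive (as bytes).

    `metadata` is a dict of field names to values; `files` maps a file
    name to its raw bytes. The layout (metadata.json plus a data/ folder)
    is a hypothetical convention for this sketch, not RAMP's real format.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.json", json.dumps(metadata, indent=2))
        for name, content in files.items():
            archive.writestr("data/" + name, content)
    return buffer.getvalue()


# Hypothetical sample submission.
package = build_package(
    {"title": "Word list", "language": "xyz"},
    {"wordlist.txt": b"one\ntwo\n"},
)
```

Keeping the description and the files in a single archive means the package can be uploaded or copied to portable media as one unit.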
RAMP Initial Screen: Dots are labeled, show progress through the steps, and allow jumping between steps. Required metadata is denoted by an orange bar to the left of the input box. Brief help text appears below.
Metadata wizardry: Linguists proceed through a series of data entry screens, each addressing a small group of metadata elements, similar to a software installation wizard. As the user enters descriptive information, the selection and contents of subsequent data entry screens are affected. As a result, the user never encounters irrelevant questions.
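One way such conditional screens could work is sketched below. The screen names, field names, and the `show_if` rule format are hypothetical; the transcript does not show RAMP's actual rules.

```python
# Hypothetical screen definitions: each screen lists its fields and an
# optional condition on an earlier answer.
SCREENS = [
    {"name": "basics", "fields": ["title", "resource_type"]},
    {"name": "recording", "fields": ["duration", "speakers"],
     "show_if": {"resource_type": ["audio", "video"]}},
    {"name": "translation", "fields": ["source_language"],
     "show_if": {"resource_type": ["translated text"]}},
    {"name": "summary", "fields": []},
]


def visible_screens(answers):
    """Return the names of screens that apply to the answers so far."""
    names = []
    for screen in SCREENS:
        condition = screen.get("show_if", {})
        # A screen with no condition is always shown; otherwise every
        # listed field must hold one of the allowed values.
        if all(answers.get(field) in allowed
               for field, allowed in condition.items()):
            names.append(screen["name"])
    return names


print(visible_screens({"resource_type": "audio"}))
# prints ['basics', 'recording', 'summary']
```

Because the translation screen is filtered out for an audio resource, the user is never asked for a source language that does not apply.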
Users simply type ahead to select a language code and name
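Type-ahead selection can be approximated with a simple prefix match, as in this sketch. The `LANGUAGES` table and the `suggest` function are illustrative assumptions; the transcript does not say how RAMP stores or searches its language list.

```python
# A tiny illustrative table of (code, name) pairs; a real application
# would load a full language code list.
LANGUAGES = [
    ("eng", "English"),
    ("fra", "French"),
    ("spa", "Spanish"),
    ("swh", "Swahili"),
]


def suggest(typed, limit=10):
    """Return entries whose code or name starts with the typed text."""
    needle = typed.strip().lower()
    if not needle:
        return []
    matches = [
        f"{code} - {name}"
        for code, name in LANGUAGES
        if code.startswith(needle) or name.lower().startswith(needle)
    ]
    return matches[:limit]


print(suggest("s"))
# prints ['spa - Spanish', 'swh - Swahili']
```

Matching on both the code and the name lets a user who knows only one of the two still find the right entry.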
Before uploading, users have the opportunity to review all metadata on one convenient summary screen. Users can initiate an upload to the repository directly from this Summary screen. Users can also export the package to a portable media device if internet access is a problem.
Users are encouraged to maintain a library. A package may be duplicated as a template for future packages.
Uploading in chunks The package of files and metadata is broken down into chunks that are transmitted separately. If a portion fails to upload, that portion is attempted again, thus avoiding the need to re-send the entire package.
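The retry-per-chunk behavior can be sketched as follows. The chunk size, the retry limit, and the `send_chunk` callback are assumptions for illustration; the transcript does not specify RAMP's actual transfer protocol, and the flaky connection here is simulated.

```python
def split_into_chunks(data, chunk_size):
    """Cut `data` (bytes) into consecutive pieces of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_in_chunks(data, send_chunk, chunk_size=4, max_attempts=3):
    """Send `data` chunk by chunk, retrying only the chunk that failed.

    `send_chunk(index, chunk)` is a caller-supplied function that raises
    ConnectionError on failure. Returns the total number of send attempts.
    """
    attempts = 0
    for index, chunk in enumerate(split_into_chunks(data, chunk_size)):
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            try:
                send_chunk(index, chunk)
                break  # this chunk arrived; move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt


    return attempts


# Simulated flaky connection: the second chunk fails once, then succeeds.
received = {}
failures = {1: 1}


def flaky_send(index, chunk):
    if failures.get(index, 0) > 0:
        failures[index] -= 1
        raise ConnectionError("connection dropped")
    received[index] = chunk


total = upload_in_chunks(b"abcdefghij", flaky_send)
assembled = b"".join(received[i] for i in sorted(received))
```

In this run the ten bytes are split into three chunks and four sends are made in total: only the failed second chunk is repeated, not the whole package.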
How it works o The data entry screens, the rules (condition steps) for displaying them, and the context-sensitive help are dynamically generated from a conveniently editable YAML document maintained by the archiving staff. o Allows for quick and effortless changes without requiring program code to be written or modified. o Customizable to work in additional contexts.
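The definition-driven approach can be sketched as below. RAMP's real YAML schema is not shown in the transcript, so the keys (`screens`, `fields`, `label`, `required`, `help`) are hypothetical. To stay within the Python standard library, the sketch writes the definition as JSON text, which YAML 1.2 parsers also accept; a real deployment would read the YAML file with a YAML library.

```python
import json

# Stand-in for the archivists' editable definition document.
# All keys and values here are hypothetical.
DEFINITION_TEXT = """
{
  "screens": [
    {"title": "Basics",
     "fields": [
       {"id": "title", "label": "Title", "required": true,
        "help": "Name of the resource."},
       {"id": "notes", "label": "Notes", "required": false,
        "help": "Anything else reviewers should know."}
     ]}
  ]
}
"""


def render_screen(screen):
    """Turn one screen definition into plain-text lines for display."""
    lines = [screen["title"]]
    for field in screen["fields"]:
        # Required fields get a "*" marker, optional ones a blank.
        marker = "*" if field["required"] else " "
        lines.append(f"{marker} {field['label']}: ____")
        lines.append(f"    {field['help']}")
    return lines


definition = json.loads(DEFINITION_TEXT)
for screen in definition["screens"]:
    print("\n".join(render_screen(screen)))
```

Adding a field or rewording help text then means editing the definition document only; the rendering code stays unchanged.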
Future development o Integrate RAMP into other SIL open source tools. The first of these will be SayMore – the language documentation session organizer. http://saymore.palaso.org/ o Include a help menu providing more detailed assistance. o Add a feature allowing users to create custom templates or choose from a template library. o Add a mechanism for sorting and searching the RAMP library.
Implementation: RAMP launched alongside our institutional repository in January 2011. To date, early adopters have packaged and uploaded roughly 1,400 items representing a wide variety of resources. Feedback from linguists has confirmed that RAMP streamlined clunky DSpace features and simplified the descriptive process by limiting choices.
Ongoing Challenges: Insufficient Upload Capacity. At launch, RAMP handled files only as large as 250MB, but many users expect to upload considerably larger data sets, especially in audio and video formats. Recent development work has increased capacity to at least 1.25GB per upload.
Ongoing challenges: The SWORD API o lacks specific error messaging when an import fails, which frustrates users and makes troubleshooting difficult; o is unable to import several key pieces of metadata requested in DSpace’s ‘initial questions’ and ‘upload’ screens, requiring submission reviewers to insert this data manually.
Ongoing challenges Greater Simplification Feedback reveals that the descriptive process in RAMP may still be too cumbersome for some. Factors to investigate: 1. Generational 2. Cross-cultural 3. Work-load
Lessons learned Better communication and the enabling of the auto-update during field beta testing. Training plan lagged behind the launch. Integrating the manual, currently delivered via the corporate intranet, into the initial release in order to serve users who have subpar internet capability.
CONCLUSION Good metadata collection will vastly improve the preservation and discovery of language resources archived in both traditional and digital repositories. Linguists who are collecting, analyzing and publishing these resources are the experts when it comes to describing their work. By addressing the dual obstacles of a complex metadata schema and inadequate internet in the field, RAMP enables linguists to easily submit quality archive packages from the field.
AVAILABILITY SIL invites the linguistic community to examine, adapt and improve upon RAMP as a tool for increasing both the quality and quantity of documentation about the world’s languages. RAMP is built on Adobe AIR 2.7.1. Free download: http://get.adobe.com/air/ Download RAMP: http://ramp.leancoder.com/ Open source under the GNU General Public License v3.0.