Current design issues for digital archives. Robert Munro (presented by David Nathan), Endangered Languages Archive (ELAR), School of Oriental and African Studies, London.


1 Current design issues for digital archives Robert Munro (presented by David Nathan) Endangered Languages Archive (ELAR), School of Oriental and African Studies, London

2 Outline
1. Introduction
2. Archive architectures
3. Current issues
   1. value-adding interaction from end users
   2. flexibility in access to materials
   3. granularity of description of materials
4. Conclusions

3 Introduction – ELAR
- Part of the Hans Rausing Endangered Languages Project (HRELP).
- Open for deposits since October.
- In the process of designing and implementing key systems.

4 Introduction – ELAR
ELAR will be the first language archive that allows users to:
- add metadata in the language of their choice
- add new metadata (comments, descriptions, links) to existing materials
- translate metadata into a language of their choice
- select language preference(s) for viewing existing metadata
- add metadata to archived materials at different levels of granularity

5 Introduction – current issues
- End users adding value to archive materials: who will moderate such additions?
- Flexible support of access: can an archive explicitly support multilingual users?
- Metadata (comments / description of materials): should the granularity of description be at the level of files, collections of files, and/or subsections of a file?

6 Archive architectures
The classic silo view of an archive: little more than a disaster-proof backup.
[Diagram labels: Producers, Silo]

7 Archive architectures
The producers are not the only users: different dissemination formats are required…
[Diagram labels: Producers, Silo, Dissemination]

8 Archive architectures
The producers are not the only users: different dissemination formats are required…
…for different user communities.
[Diagram labels: Producers, Silo, Dissemination, Designated communities]

9 Archive architectures
Working formats are not preservation formats: materials may need to be transformed on ingest.
[Diagram labels: Producers, Ingestion, Silo, Dissemination, Designated communities]

10 Archive architectures
You cannot rigidly preserve digital data: files need to be refreshed and migrated to current formats.
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

11 Archive architectures
…but the objects, metadata and structures are still backed up in disaster-proof silos.
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

12 Archive architectures
Archives need to define three types of packages: ingestion, archive and dissemination.
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

13 Ingestion (Accession) packages
Formats & structures that can be converted to archive formats with minimal effort:
- open file formats
- well-documented structures: XML with schema ideal
The content needs to take into account the many potential uses of the materials:
- high-quality sound and video
- a variety of genres
- detailed metadata and structural information

14 Dissemination packages
Many potential users of archived materials:
- researchers
- speakers
- educators
- publishers
With many different requirements:
- access to materials by various methods
- archive services
- continuation of ownership of language materials

15 Current issues – value adding
The current model is fairly uni-directional, but users can/should add value to archive materials.
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

16 Current issues – value adding
Users should be able to add to existing materials:
- speakers' comments on content
- results of recent research
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

17 Current issues – value adding
The archive needs to trust certain users to add metadata to existing materials:
- should the identity of users be recorded / open?
- should users be able to challenge existing metadata?
Who to trust?
- depositors cannot moderate all comments on objects, especially if comments can be in any language
- but can an archive deny a speaker's request to add comments to a recording of them speaking?
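One possible way to frame the trust question, as a minimal sketch only: record who added what and in which language, publish additions from trusted users immediately, and hold everything else for moderation. The slides leave the policy open, so the names here (Addition, Status, submit, trusted_users) and the auto-approve rule are assumptions for illustration, not ELAR's design.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Status(Enum):
    PENDING = "pending"    # waiting for a moderator
    APPROVED = "approved"  # visible to other users


@dataclass
class Addition:
    """A user-contributed comment or description on an archived object."""
    object_id: str
    author_id: str   # identity is recorded; whether it is shown is a policy choice
    language: str    # comments may be in any language
    text: str
    status: Status = Status.PENDING


def submit(object_id: str, author_id: str, language: str, text: str,
           trusted_users: Set[str]) -> Addition:
    """Assumed policy: trusted users are auto-approved, others are queued."""
    status = Status.APPROVED if author_id in trusted_users else Status.PENDING
    return Addition(object_id, author_id, language, text, status)


# Hypothetical identifiers, for illustration only.
trusted = {"speaker-17"}
from_speaker = submit("rec-001", "speaker-17", "qu", "comment text", trusted)
from_visitor = submit("rec-001", "visitor-3", "es", "comment text", trusted)
```

Treating a recorded speaker as trusted for recordings of themselves is one answer to the last question on the slide; other policies would change only the rule inside submit.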

18 Current issues – flexibility of access
The archive cannot create different dissemination packages for every language and/or user.
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

19 Current issues – flexibility of access
Users should be able to personalize access:
- language preference(s) for metadata
- preference on type of materials
[Diagram labels: Producers, Ingestion, Archive, Dissemination, Designated communities]

20 Current issues – flexibility of access
Flexibility of search / browse:
- keyword search-engine type search
- rich relationships between objects for browsing
- geographic searches
- research-community-specific search

21 Current issues – flexibility of access
Flexibility of language:
- most metadata in most archives is in English
- should metadata be multilingual?

22 Current issues – flexibility of access
If a user prefers to speak Quechua, then Spanish, then English:
rather than accessing via one interface per language…
[Diagram: one interface OR another OR …]

23 Current issues – flexibility of access
If a user prefers to speak Quechua, then Spanish, then English:
…users should get all languages at once, according to availability of data and their preferences.
[Example: label in Quechua, photographer in Spanish, date in English]
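The per-field fallback described on this slide can be sketched as follows. This is an illustrative sketch, not the archive's implementation: the record layout, the function name resolve_fields, and the placeholder values are assumptions; "qu", "es" and "en" are the ISO 639-1 codes for Quechua, Spanish and English.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical metadata record: each field maps a language code to a value.
# The values are placeholders, not real archive data.
record: Dict[str, Dict[str, str]] = {
    "label": {"qu": "<label in Quechua>", "en": "<label in English>"},
    "photographer": {"es": "<photographer in Spanish>", "en": "<photographer in English>"},
    "date": {"en": "<date in English>"},
}


def resolve_fields(
    record: Dict[str, Dict[str, str]],
    preferences: List[str],
) -> Dict[str, Optional[Tuple[str, str]]]:
    """For each field, pick the value in the first preferred language that has data.

    Returns field -> (language, value), or None when no preferred language
    is available for that field.
    """
    resolved: Dict[str, Optional[Tuple[str, str]]] = {}
    for field_name, translations in record.items():
        resolved[field_name] = None
        for lang in preferences:
            if lang in translations:
                resolved[field_name] = (lang, translations[lang])
                break
    return resolved


# A user who prefers Quechua, then Spanish, then English.
view = resolve_fields(record, ["qu", "es", "en"])
```

With this record, the label resolves to Quechua, the photographer to Spanish and the date to English, so a single view mixes languages per field instead of forcing one interface per language.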

24 Current issues – granularity
Archives tend to treat archived files as atomic:
- metadata only refers to files as a whole
What about:
- a specific comment about a 20-second subsection of the file?
- a general comment applying to many files?

25 Current issues – granularity
For example, suppose we have an annotated sound recording of some event:

26 Current issues – granularity
Some metadata is about the file as a whole:
- date recorded, speakers, title

27 Current issues – granularity
Some metadata is about sub-segments:
- name of a significant person or place
- specific linguistic phenomena

28 Current issues – granularity
It is likely that users will want to:
- add comments to such subsections
- richly link subsections to other items
- make unambiguous reference to subsections
At the time of deposit, no one can predict which subsections of files will later be significant:
- users need to be able to explicitly define subsections of archive objects
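A minimal sketch of user-defined subsections, assuming time-based media addressed in seconds. The class and function names (SegmentRef, Comment, comments_for) and the object identifiers are hypothetical; the point is only that one reference type can address a whole file or a subsection of it, and that one comment can target several objects.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SegmentRef:
    """Unambiguous reference to a whole object or to a time span within it."""
    object_id: str
    start_s: Optional[float] = None  # None for both bounds means the whole file
    end_s: Optional[float] = None

    def __post_init__(self) -> None:
        if (self.start_s is None) != (self.end_s is None):
            raise ValueError("give both start_s and end_s, or neither")
        if self.start_s is not None and not (0 <= self.start_s < self.end_s):
            raise ValueError("need 0 <= start_s < end_s")

    def overlaps(self, other: "SegmentRef") -> bool:
        if self.object_id != other.object_id:
            return False
        if self.start_s is None or other.start_s is None:
            return True  # a whole-file reference covers every subsection
        return self.start_s < other.end_s and other.start_s < self.end_s


@dataclass(frozen=True)
class Comment:
    targets: Tuple[SegmentRef, ...]  # one comment may apply to many files
    text: str


def comments_for(comments: List[Comment], query: SegmentRef) -> List[Comment]:
    """Return the comments whose targets overlap the queried span."""
    return [c for c in comments if any(t.overlaps(query) for t in c.targets)]


general = Comment((SegmentRef("rec-001"), SegmentRef("rec-002")), "general comment")
specific = Comment((SegmentRef("rec-001", 60.0, 80.0),), "comment on a 20-second span")
hits = comments_for([general, specific], SegmentRef("rec-001", 70.0, 75.0))
```

Because subsections are defined by the reference rather than by the deposited file, users can mark up a new span later without the archive having predicted it at deposit time.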

29 Conclusions
- Archives are not static repositories: an archive supports materials for multiple different user communities in parallel.
- Value-adding interaction: archived materials can be further enriched by users.
- Flexibility in access to materials: personalizable interaction with archive materials.
- Granularity of description of materials: user-defined granularity of materials.

30 Thank you

