From Spectator to Annotator: Possibilities offered by User-generated Metadata for Digital Cultural Heritage Collections
Seth van Hooland, Université Libre de Bruxelles
Metadata creation for image collections
- Retrieval of high-level semantics within images relies entirely on human indexing
- Indexing of historical image collections is notoriously hard and extremely expensive
- Digital images are created on a large scale (> 10,000)
- The institution has no specifically trained staff to attribute metadata on an intensive basis
=> It is highly problematic to find enough in-house resources to index image collections
Possible solution: distributed indexing
- Development of web-based collection management software at the end of the 1990s
- Possibility of opening up access to the database to a larger number of indexers
- Cataloguing and indexing are no longer necessarily in-house activities
- Example: http://na.memorix.nl/
Distributed image indexing: Web 2.0 tools
- Passive consumer of information => active user who reorganizes, augments and distributes information (RSS, blogs, Wikipedia)
- Social / collaborative tagging
- P2P-based information retrieval
- Two emblematic applications: http://del.icio.us/ and http://flickr.com/
Differences from traditional indexing
- Neither the form nor the content of the metadata is controlled
- Produced by the user community: a fundamental change in the resource-user relation, in which the authority of the librarian/archivist/conservator is questioned
- Incorporates metadata that are intrinsically linked to the indexer
Possibilities for the cultural sector?
- Prototype: Steve project
- Advantage: "serendipity"
- Disadvantage: very low semantic value of the tags
- Alternative form of user-generated metadata: user comments
- Historical context
- Attempt to evaluate the quality of user-generated metadata and to draw up a typology of these comments
- Case study: image database of the National Archives of the Netherlands
Method
- Information quality is defined as "fitness for purpose": are the comments useful to the users?
- Query analysis: compare the content of the queries with the content of the comments
- Mapping of both onto the "Shatford-Panofsky" categories
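The mapping step can be sketched as a simple tally. This is a minimal illustration, assuming each query or comment has already been hand-labelled with one or more Shatford-Panofsky codes (S = specific, G = generic, A = abstract, each with facets 1 to 4). The function name `category_shares` and the sample data are hypothetical and do not come from the study.

```python
from collections import Counter

def category_shares(labelled_items):
    """Return the percentage of items that carry each category code.

    Each item is a set of codes, so one comment can count towards
    several categories and the shares may sum to more than 100%.
    """
    if not labelled_items:
        return {}
    counts = Counter(code for codes in labelled_items for code in set(codes))
    total = len(labelled_items)
    return {code: round(100 * n / total, 2) for code, n in sorted(counts.items())}

# Hypothetical hand-labelled sample, not data from the study.
sample_comments = [
    {"S1"},
    {"S1", "S3"},
    {"S3", "S4"},
    {"G1"},
]
shares = category_shares(sample_comments)
print(shares)
```

Running the same tally over the queries and over the comments gives two distributions that can be compared category by category.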
Results
- Categorisation of queries: S1 = 17.50%, S2 = 5.5%, S3 = 57%, S4 = 2.5%, G1 = 9%, G2 = 8.5%
- Categorisation of comments: S1 = 67.61%, S2 = 18.87%, S3 = 30.70%, S4 = 20.56%, G1 = 6.29%, G2 = 1.71%, G3 = 0.57%, G4 = 0.29%, A2 = 2.86%
- Queries and comments alike concentrate on specific notions, use few generic terms and virtually no abstract terms
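The concluding observation can be checked by summing the reported percentages per mode. The sketch below uses only the figures from the slide. The helper name `totals_by_mode` is hypothetical, and reading the comment figures as multi-label (one comment counted in several categories) is an assumption based on their sum exceeding 100%.

```python
# Percentages as reported on the slide (decimal commas converted to points).
queries = {"S1": 17.50, "S2": 5.5, "S3": 57.0, "S4": 2.5, "G1": 9.0, "G2": 8.5}
comments = {"S1": 67.61, "S2": 18.87, "S3": 30.70, "S4": 20.56,
            "G1": 6.29, "G2": 1.71, "G3": 0.57, "G4": 0.29, "A2": 2.86}

def totals_by_mode(shares):
    """Sum the category percentages per mode: S(pecific), G(eneric), A(bstract)."""
    totals = {"S": 0.0, "G": 0.0, "A": 0.0}
    for code, pct in shares.items():
        totals[code[0]] += pct  # the first letter of the code is the mode
    return {mode: round(total, 2) for mode, total in totals.items()}

query_totals = totals_by_mode(queries)
comment_totals = totals_by_mode(comments)
print(query_totals)
print(comment_totals)
```

In both distributions the specific categories dominate, the generic ones are marginal, and abstract categories are absent from the queries and appear only as A2 in the comments.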
Typology of the comments
- Corrections of the existing metadata: 34.13%
- Narrativity / context: 18.87%
- Personal experiences: 4.29%
- Opinion: 2.86%
- Dialogue / questions: 1.15%
Narrativity
- Certain comments put diverse and scattered information into a context
=> Lev Manovich, "Database as a cultural form" (The Language of New Media)
Personal experiences
- A small number of comments reflect on personal experiences regarding the image
- Of what interest are they to other users?
Personal opinions
- Very few comments express a personal opinion
- Again: of what interest are they to other users?
Dialogue
- A small number of users pose questions and interact with other users by sending comments
- The comments then work in a forum-like manner
- Helps in the creation of virtual communities around heritage institutions
Postmodern indexing?
- The role of each actor within the information chain is no longer strictly defined: user - librarian - indexer - editor - author
- Two index layers: the authority of the librarian/archivist/conservator is confronted with the informal and personal metadata of users
- How can these different layers be managed within a collection management system?
- "Narcissism of the viewer" should be avoided
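One conceivable way to keep the two index layers apart is to store user comments next to, rather than inside, the institutional record. The following is only a sketch under assumptions: the class names, field names and sample values are hypothetical and do not describe any existing collection management system. The `kind` values loosely follow the typology of the comments.

```python
from dataclasses import dataclass, field

@dataclass
class UserComment:
    author: str
    text: str
    kind: str  # e.g. "correction", "narrative", "personal", "opinion", "dialogue"

@dataclass
class ImageRecord:
    identifier: str
    authoritative: dict  # metadata maintained by the institution
    user_layer: list = field(default_factory=list)  # comments contributed by users

    def add_comment(self, comment):
        """Store a user comment without touching the authoritative layer."""
        self.user_layer.append(comment)

    def pending_corrections(self):
        """Return the comments that propose corrections, for staff review."""
        return [c for c in self.user_layer if c.kind == "correction"]

# Hypothetical record and comments, for illustration only.
record = ImageRecord("img-001", {"title": "Harbour view", "date": "1935"})
record.add_comment(UserComment("visitor1", "The photo dates from 1938.", "correction"))
record.add_comment(UserComment("visitor2", "My grandfather worked here.", "personal"))
```

Keeping the layers separate lets the institution decide case by case whether a proposed correction is promoted into the authoritative metadata, while personal comments stay visible without altering the catalogue record.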