
Interoperability in the Cultural Heritage Domain Lourens van der Meij VU Amsterdam – KB (part of sheets by A.Isaac) October 3rd, 2008.


1 Interoperability in the Cultural Heritage Domain Lourens van der Meij VU Amsterdam – KB (part of sheets by A.Isaac) October 3rd, 2008

2 Background: CATCH (NWO): Continuous Access To Cultural Heritage, computer science research projects applied to cultural heritage (libraries, museums). STITCH: SemanTic Interoperability To access Cultural Heritage. Interoperability covers exchanging metadata (standardization) and integrating metadata (translating, linking).

3 Intention: Show through example applications that the following are important in the cultural heritage domain. Integration of data, collections, and services. Interoperability: data standardized such that it can be used across different applications, and functionality reusable via services. Creating mappings, that is, semantic links between data from different sources.

4 First: Illustrate integrated access to collections in the CH domain by looking at a use case. Introduction of the use case; about vocabularies; the collections that will be integrated; faceted browsing; what we want -> demo; requirements and details.

5 (Integrated) access to collections: Collections are records of books, pieces of art, and so on, accessed electronically through a web portal. STITCH focuses on semantics: structured access using the available knowledge sources, not full-text search. Records hold metadata, information about the object such as author, date, and subject. CH institutes often maintain knowledge structures (KOS), or vocabularies, to facilitate storage, access, and maintenance. Subject metadata and access through KOS are the focus of STITCH.

6 Vocabularies (knowledge structures, KOS): Thesauri and classification systems structure collections and describe the content, form, and other aspects of collection elements. STITCH is a cooperation between VU Amsterdam (KRR group), the National Library (KB), and MPI Nijmegen. Within the KB alone there are many vocabularies: on the order of 10 are maintained internally, and 20 or more external vocabularies play a role. Why so many? History, specialized collections, particular views on the collection, and theories about how access should be provided. Examples of vocabularies appear in the demos.

7 Vocabularies: There are many different (kinds of) vocabularies, with many different representations, data formats, and methods of access. Integrated access requires: standardized representation of vocabularies and collections; standardized access (=> services); links between elements of vocabularies, that is, alignment of vocabularies. Next: an example of integration.

8 Illustration, STITCH use case: Integrated access to two collections. KB: geïllumineerde manuscripten (illuminated manuscripts). BnF: Mandragore, manuscrits enluminés. STITCH focus: integration (alignment, techniques and standards) and interoperability (RDF, SKOS). Those aspects are discussed after the first demo.

9 KB Illustrated Manuscripts

10 KB Illustrated Manuscripts: Iconclass

11 Mandragore

12 Mandragore

13 Faceted browsing: Access the collection using the structure of the vocabularies. Different dimensions: subject, author, and so on. Where a vocabulary has a hierarchy, use it to group objects together: lions, giraffes, zebras -> animals, so that they can be distinguished as a group.
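The grouping idea on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the vocabulary is available as a child-to-broader mapping; the `BROADER` table and the object identifiers are hypothetical and not taken from the KB or BnF data.

```python
from collections import defaultdict

# Hypothetical vocabulary fragment: each concept points to its broader concept.
BROADER = {
    "lions": "animals",
    "giraffes": "animals",
    "zebras": "animals",
    "animals": None,   # top of this small hierarchy
    "wheat": "plants",
    "plants": None,
}

# Hypothetical collection: object id -> subject concept it is indexed with.
OBJECTS = {
    "ms-001": "lions",
    "ms-002": "zebras",
    "ms-003": "wheat",
    "ms-004": "giraffes",
}


def ancestors(concept, broader=BROADER):
    """Return the concept followed by all of its broader concepts."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = broader.get(concept)
    return chain


def facet_groups(objects, broader=BROADER):
    """Group object ids under every concept on their subject's broader chain."""
    groups = defaultdict(set)
    for obj_id, subject in objects.items():
        for concept in ancestors(subject, broader):
            groups[concept].add(obj_id)
    return groups


groups = facet_groups(OBJECTS)
print(sorted(groups["animals"]))
```

Selecting the "animals" facet then returns the objects indexed with lions, zebras, or giraffes, even though none of them carries the term "animals" directly.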

14 What we have

15 What we want

16 Demo: KB Illuminated Manuscripts and BnF Mandragore manuscripts. Example queries: amphibians, Wheat. Demo server: http://galjas.cs.vu.nl:33333/MANDRA-SV-ICE- mandraNewNONE

17 Integrated access: Integrated semantic access requires standardized representation of vocabularies and collections, standardized access (=> services), and links between elements of vocabularies.

18 Standardized representation: Use of semantic web techniques. “Things” are represented as “resources” identified by URIs, across any application and data set. Values are simple strings and numbers (literals) or URIs. Properties are typed, named links between URIs and other URIs or literals. Theory and reasoning methods are available. => Interoperability, some standardization. => Standardization is still needed on how to represent CH objects (XML: Dublin Core), vocabularies (SKOS), and links between elements of vocabularies.
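A rough sketch of why URI-based statements help integration: if two sources describe the same resource with the same URIs, combining them is a plain set union. The Dublin Core element namespace below is the real one; the manuscript and concept URIs, the title, and the date are hypothetical, and plain Python tuples stand in for an RDF library.

```python
from collections import namedtuple

# A literal value with an optional language tag.
Literal = namedtuple("Literal", ["value", "lang"])

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

# Hypothetical URIs, used only for illustration.
MANUSCRIPT = "http://example.org/kb/manuscript/1"
CONCEPT = "http://example.org/vocab/lions"

# Each statement is a (subject, property, value) triple.
source_a = {
    (MANUSCRIPT, DC + "title", Literal("Bestiary", "en")),
    (MANUSCRIPT, DC + "subject", CONCEPT),
}
source_b = {
    (MANUSCRIPT, DC + "subject", CONCEPT),            # same statement again
    (MANUSCRIPT, DC + "date", Literal("1450", None)),
}

# Because both sources use the same URIs, merging is a set union
# and the duplicate statement collapses automatically.
merged = source_a | source_b


def values_of(graph, subject, prop):
    """Return all values of one property for one resource."""
    return {o for s, p, o in graph if s == subject and p == prop}


print(len(merged))
print(values_of(merged, MANUSCRIPT, DC + "subject"))
```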

19 SKOS: Example [diagram: a resource with rdf:type skos:Concept, skos:prefLabel “the Virgin…” and skos:prefLabel “la Vierge…”, a skos:broader link, and skos:inScheme pointing to a resource with rdf:type skos:ConceptScheme]

20 SKOS (Simple Knowledge Organization System): SKOS offers building blocks to represent KOSs in RDF. Objects: Concept and ConceptScheme. Lexical properties (multilingual): prefLabel, altLabel. Semantic relations: broader, narrower, related. Notes: scopeNote, definition, …
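The building blocks listed here can be illustrated with a small hand-built set of triples. The RDF and SKOS namespace URIs are the standard ones; everything under `http://example.org/` (including the broader concept), and the `en`/`fr` language tags on the two labels, are assumptions made for this sketch.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/vocab/"  # hypothetical vocabulary namespace

# Labels are (text, language tag) pairs; other values are URIs.
triples = [
    (EX + "scheme", RDF_TYPE, SKOS + "ConceptScheme"),
    (EX + "virgin", RDF_TYPE, SKOS + "Concept"),
    (EX + "virgin", SKOS + "inScheme", EX + "scheme"),
    (EX + "virgin", SKOS + "prefLabel", ("the Virgin", "en")),
    (EX + "virgin", SKOS + "prefLabel", ("la Vierge", "fr")),
    (EX + "virgin", SKOS + "broader", EX + "parent-concept"),  # hypothetical parent
]


def pref_label(graph, concept, lang):
    """Return the preferred label of a concept in the given language, or None."""
    for s, p, o in graph:
        if s == concept and p == SKOS + "prefLabel" and o[1] == lang:
            return o[0]
    return None


def narrower(graph, concept):
    """Concepts that name `concept` as their broader concept."""
    return sorted(s for s, p, o in graph if p == SKOS + "broader" and o == concept)


print(pref_label(triples, EX + "virgin", "fr"))
print(narrower(triples, EX + "parent-concept"))
```

The multilingual labels hang off a single concept URI, which is what lets a French and a Dutch or English interface share the same indexing.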

21 Vocabulary alignment: The aim is to find semantic correspondences between vocabulary elements, for example “klassieke ruïnes” ≈ “landschap met ruïnes” and “maagd Maria” = “Heilige Moeder”. It has to be done (semi-)automatically, because vocabularies are big (tens of thousands of concepts) and they change.

22 Automatic alignment techniques: Lexical: labels of entities and textual definitions. Structural: structure of the vocabularies. Background knowledge: using a shared conceptual reference to find links. Extensional: object information (e.g. book indexing). Example labels: “céréale, grain, blé” and “blé”.
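As an illustration of the lexical technique only, the sketch below compares normalized labels with Python's standard `difflib.SequenceMatcher`. The choice of similarity measure and the 0.8 threshold are assumptions for this example, not the method used in STITCH; the labels are Dutch term pairs of the kind shown on the results slide.

```python
from difflib import SequenceMatcher


def normalize(label):
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def label_similarity(a, b):
    """Character-based similarity between two labels, between 0.0 and 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def lexical_matches(vocab_a, vocab_b, threshold=0.8):
    """Return (label_a, label_b, score) for all pairs scoring at or above threshold."""
    matches = []
    for a in vocab_a:
        for b in vocab_b:
            score = label_similarity(a, b)
            if score >= threshold:
                matches.append((a, b, score))
    return sorted(matches, key=lambda m: -m[2])


vocab_1 = ["Beeldende kunsten", "Diabetes mellitus"]
vocab_2 = ["beeldende kunst", "suikerziekte"]

for a, b, score in lexical_matches(vocab_1, vocab_2):
    print(a, "~", b, round(score, 2))
```

The near-identical labels are found, but the synonym pair "Diabetes mellitus" and "suikerziekte" is not, which is one reason the structural, background-knowledge, and extensional techniques are needed alongside the lexical one.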


24 Extensional statistical alignment: Uses object information (e.g. book indexing). [Diagram: Thesaurus 1 and Thesaurus 2 both index a collection of books; the concept “Dutch Literature” in one is related to “Dutch” in the other.]
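One simple way to turn shared book indexing into an alignment score is set overlap. The sketch below uses the Jaccard coefficient; the slides do not say which statistical measure was actually used, so the measure, the book identifiers, and the second pair of concepts are all assumptions.

```python
def jaccard(books_a, books_b):
    """Overlap of two sets of books: size of intersection over size of union."""
    union = books_a | books_b
    if not union:
        return 0.0
    return len(books_a & books_b) / len(union)


# Hypothetical indexing data: concept -> ids of the books indexed with it.
thesaurus_1 = {
    "Dutch Literature": {"b1", "b2", "b3", "b4"},
    "History": {"b5", "b6"},
}
thesaurus_2 = {
    "Dutch": {"b2", "b3", "b4", "b7"},
    "Geography": {"b6", "b8", "b9"},
}


def best_matches(index_a, index_b):
    """For each concept in index_a, find the concept in index_b with the highest overlap."""
    result = {}
    for concept_a, books_a in index_a.items():
        scored = [(jaccard(books_a, books_b), concept_b)
                  for concept_b, books_b in index_b.items()]
        score, concept_b = max(scored)
        result[concept_a] = (concept_b, score)
    return result


print(best_matches(thesaurus_1, thesaurus_2))
```

No labels are compared at all here: two concepts are linked because the same books are indexed with both.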

25 Results: 1: ( ) Schilderijen - schilderkunst; 2: ( ) Kwaliteitszorg - kwaliteitsmanagement; 3: ( ) Personeelsmanagement - personeelsbeleid; 4: ( ) Beeldende kunsten - beeldende kunst; 5: ( ) Nederlands - Nederlandse taalkunde; 17: ( ) Diabetes mellitus - suikerziekte

26 Alignment, no trivial solution: Current techniques are not reliable as the sole source of knowledge. What is a good alignment? What are the evaluation criteria? => What will it be used for? Usage scenarios: integrated search, reindexing, thesaurus merging, navigation (=> faceted browsing).

27 What next: Evaluation and lessons learned. Second use case: reindexing (vocabulary service). Conclusion.

28 Why usage scenarios: The evaluation of an alignment depends on its use. Real-world applications test the quality of alignments. Requirements on alignments depend on their use: what kinds of links should be distinguished? Optional demo: evaluation. Next: reindexing, the scenario nearest to a real-world application.

29 Situation at Dutch libraries and the National Library (KB): The KB has two large collections: the Depot (deposit collection: all Dutch-language publications) and its own scientific collection. Subject indexing uses two completely different indexing systems: Brinkman and GOO. There is a common automation system for the Netherlands and Europe (OCLC-Pica). The metadata of a book contains many fields. A book or publication is given metadata by several libraries, using many different vocabularies.

30 Reindexing: The KB has about 20 people indexing books daily; about 20,000 books are indexed per year. Indexing means adding keywords and classification information to books. Even internally, indexing is done according to different vocabularies. Some books arrive with indexing done by other libraries (public libraries, Biblion). If Biblion index terms, or combinations of them, could be translated to KB index terms (Brinkman), that would mean less work for the KB.

31 WinIBW: OCLC (PICA) is the library automation system in the Netherlands, also used elsewhere in Europe. Online Public Access Catalogue (OPAC). WinIBW gives internet access to the Pica system (local and central): adding records, adding metadata, searching records. Demo, closest to a real-world application.

32 Reindexing Biblion -> Brinkman: Fietstochten, Kapellen, Beesel, Heiligenbeelden, … -> Brinkman? Use the alignment: Bibl:Fietstochten -> Brinkman? Bibl:Kapellen -> Brinkman? DEMO (example: z sel gd? 79)
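In its simplest form, reindexing with an alignment is a lookup from source terms to target terms. The sketch below assumes a one-to-one mapping table; the Brinkman identifiers are placeholders, since the slide leaves the actual targets open, and the real scenario may also translate combinations of Biblion terms.

```python
# Hypothetical alignment table: Biblion term -> Brinkman term.
# The Brinkman identifiers are placeholders, not real Brinkman subjects.
ALIGNMENT = {
    "Bibl:Fietstochten": "Brinkman:target-1",
    "Bibl:Kapellen": "Brinkman:target-2",
}


def reindex(biblion_terms, alignment):
    """Translate Biblion terms to Brinkman terms.

    Returns (translated, unmapped): the Brinkman terms found, without
    duplicates and in order of first appearance, and the Biblion terms
    for which the alignment has no entry.
    """
    translated, unmapped = [], []
    for term in biblion_terms:
        target = alignment.get(term)
        if target is None:
            unmapped.append(term)
        elif target not in translated:
            translated.append(target)
    return translated, unmapped


record_terms = ["Bibl:Fietstochten", "Bibl:Kapellen", "Bibl:Beesel"]
print(reindex(record_terms, ALIGNMENT))
```

Terms without a mapping are returned separately so that an indexer can still handle them by hand.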

33-40 [Slides 33 to 40: images only, no text]

41 Result

42 Reindexing: Under evaluation. Improvements: use other metadata; adapt the scenario (pass 95% confidence records). Many other uses.
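One possible reading of "pass 95% confidence records" is that a record is accepted automatically only when every suggested term reaches a confidence of at least 0.95, and is otherwise sent to a human indexer. The sketch below follows that reading; the function name, the term identifiers, and the confidence scores are hypothetical.

```python
def route_record(suggestions, threshold=0.95):
    """Decide what to do with one record.

    `suggestions` is a list of (target_term, confidence) pairs.
    Returns "accept" when there is at least one suggestion and all of them
    reach the threshold, otherwise "review" (hand over to an indexer).
    """
    if suggestions and all(conf >= threshold for _, conf in suggestions):
        return "accept"
    return "review"


print(route_record([("Brinkman:target-1", 0.97), ("Brinkman:target-2", 0.96)]))
print(route_record([("Brinkman:target-1", 0.97), ("Brinkman:target-2", 0.50)]))
```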

43 Sketch of the vocabularies relevant to the KB

44 Integrated access: Services through the internet. Protocols: SOAP, REST, … Collection access? Vocabulary access, alignment access.

45 Lessons: Semantic web techniques make interoperability and integration of collections easier. Aligning vocabularies is useful in different situations. Alignment methods need to be fine-tuned to the application they are meant for. When introducing new techniques, interaction between the CH field and scientific institutes is very valuable. Standardization of access to collections and vocabularies should be dealt with (a prototype has been developed).

46 Concepts: An ontology, in both computer science and information science, is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain. Metadata (meta data, or sometimes metainformation) is "data about data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, for example a database schema.

47 Concepts: A library classification is a system of coding and organizing library materials (books, serials, audiovisual materials, computer files, maps, manuscripts, realia) according to their subject and allocating a call number to that information resource. Similar to classification systems used in biology, bibliographic classification systems group similar entities together, typically arranged in a hierarchical tree structure. In information technology, a thesaurus represents a database or list of semantically orthogonal topical search keys. In the field of artificial intelligence, a thesaurus may sometimes be referred to as an ontology.

