Controlled Vocabularies in TELPlus Antoine ISAAC Vrije Universiteit Amsterdam EDLProject Workshop 22-23 November 2007.


1 Controlled Vocabularies in TELPlus
Antoine ISAAC, Vrije Universiteit Amsterdam
EDLProject Workshop, 22-23 November 2007

2 Agenda
TELPlus Context
Improving subject access
  – 3 sub-tasks
Services for TEL

3 TELPlus Context
Started October 2007
Running 27 months
Content WPs
  – OCRing previously digitised material
  – Improving the usability of TEL through OAI-PMH compliance
  – Improving Access
  – Integrating services with TEL portal
  – User personalisation services
  – Extending TEL to Bulgaria & Romania

4 WP3 – Improving Access
Task 1: Indexing for usability
  – Review/test state-of-the-art semantic search engines
    On content of documents
Task 2: Improving subject access
Task 3: FRBR aggregation, search and browsing
  – Create/exploit FRBR metadata repositories
Task 4: Focus on users
  – Focus groups on prototypes

5 WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Search through collections
  – Using metadata
  – In a controlled setting
Paving the way for enhanced usages
  – Advanced treatments mentioned in TELplus need conceptual structures and links between these structures
    E.g. clustering

6 WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Reference: MACS project
  – Manually-built semantic equivalences between Rameau, SWD & LCSH headings

7 MACS: Querying Collections

8 MACS: Query Reformulation Options

9 WP 3 Task 2 – Improving Subject Access
Improving subject access via semantic alignment between subjects
Reference: MACS project
  – Manual equivalences between Rameau, SWD, LCSH headings
Here: an experiment on deploying automatic alignment techniques
  – Determining possible strategies
  – Assessing feasibility and usefulness
  – MACS context

10 WP3.2 Sub-tasks
Converting the subjects to standard representation language
  – Semantic web format (SKOS)
Aligning the vocabularies
  – Semantic correspondences between subjects
Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other

11 Converting subjects to standard representation language
Goal: solving syntactic heterogeneity between vocabularies
Enabling the use of standard tools
  – E.g. for query (re)formulation
Paving the way for dealing with semantic heterogeneity
  – Definitions of concepts expressed according to a common model

12 Converting subjects to standard representation language
Approach: Semantic Web and SKOS
Semantic Web
  – Knowledge objects as web resources (URIs)
  – Description by linking resources (RDF)
  – Description using shared formal vocabularies (ontologies)
SKOS
  – A standard Semantic Web model (ontology)
  – For knowledge organization systems (thesauri, subject heading lists…)

13 SKOS: Example
[Diagram: an RDF graph with the labels rdf:type skos:Concept, skos:broader, skos:prefLabel "the Virgin", skos:prefLabel "la Vierge", skos:inScheme, and rdf:type skos:ConceptScheme]
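The labels in the diagram can be read as a small set of RDF statements. The following is a minimal sketch of that reading as plain Python tuples, not the presenter's data: the `example.org` URIs, the `parent` concept targeted by skos:broader, and the `en`/`fr` language tags are assumptions added for illustration.

```python
# Minimal sketch: one concept as (subject, predicate, object) triples.
# The SKOS and RDF namespaces are the standard ones; everything under
# example.org is a hypothetical placeholder.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/shl/"

triples = [
    (EX + "virgin", RDF + "type", SKOS + "Concept"),
    (EX + "virgin", SKOS + "prefLabel", ("the Virgin", "en")),  # language tag assumed
    (EX + "virgin", SKOS + "prefLabel", ("la Vierge", "fr")),   # language tag assumed
    (EX + "virgin", SKOS + "broader", EX + "parent"),           # hypothetical parent concept
    (EX + "virgin", SKOS + "inScheme", EX + "scheme"),
    (EX + "scheme", RDF + "type", SKOS + "ConceptScheme"),
]


def pref_labels(triples, concept):
    """Return {language: label} for the skos:prefLabel values of a concept."""
    labels = {}
    for subject, predicate, obj in triples:
        if subject == concept and predicate == SKOS + "prefLabel":
            text, lang = obj
            labels[lang] = text
    return labels
```

Calling `pref_labels(triples, EX + "virgin")` collects the two labels by language, which is the kind of lookup a multilingual search interface would need.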

14 Converting subjects to standard representation language - Process
Getting processable versions from owners
  – E.g. XML
Analyzing the models
Converting to SKOS
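The three steps above could look roughly like the sketch below. The XML layout (`heading`, `label`, `broader` elements) is invented for illustration; the real exports from vocabulary owners will differ and each needs its own model analysis. Predicates are written as prefixed names for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy export; element and attribute names are assumptions.
SAMPLE = """<headings>
  <heading id="h1"><label lang="en">Calendars</label><broader ref="h0"/></heading>
  <heading id="h0"><label lang="en">Time</label></heading>
</headings>"""


def xml_to_skos(xml_text, base="http://example.org/shl/"):
    """Convert the toy XML export into SKOS-style triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for heading in root.findall("heading"):
        uri = base + heading.get("id")
        triples.append((uri, "rdf:type", "skos:Concept"))
        for label in heading.findall("label"):
            triples.append((uri, "skos:prefLabel", (label.text, label.get("lang"))))
        for broader in heading.findall("broader"):
            triples.append((uri, "skos:broader", base + broader.get("ref")))
    return triples
```

Once every vocabulary is expressed in the same triple shape, the same tools can be applied to all of them, which is the point of removing syntactic heterogeneity first.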

15 WP3.2 Sub-tasks
Converting the subjects to standard representation language
  – Semantic web format (SKOS)
Aligning the vocabularies
  – Semantic correspondences between subjects
Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other

16 Vocabulary Alignment
Specifying required alignment format (links)
  – Type of mapping links: equivalence, broader
  – Cardinality: one-to-one, one-to-many
  – Taking application context (TEL) into account
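One possible way to pin down such a format is sketched below. The `Mapping` class and the `cardinality` helper are hypothetical names, not part of TEL or MACS; the helper only inspects the source side of the links, so it distinguishes just the two cardinalities named on the slide.

```python
from collections import Counter
from dataclasses import dataclass

ALLOWED_RELATIONS = {"equivalence", "broader"}  # the two link types on the slide


@dataclass(frozen=True)
class Mapping:
    source: str    # heading in the first subject heading list
    target: str    # heading in the second subject heading list
    relation: str  # one of ALLOWED_RELATIONS

    def __post_init__(self):
        if self.relation not in ALLOWED_RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")


def cardinality(mappings):
    """Return 'one-to-one' if no source heading has more than one target,
    otherwise 'one-to-many'. Only the source side is checked."""
    counts = Counter(m.source for m in mappings)
    return "one-to-one" if all(n == 1 for n in counts.values()) else "one-to-many"
```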

17 Vocabulary Alignment
Specifying required alignment format (links)
Selecting (& running) alignment techniques/tools
  – Inspired by semantic web approaches

18 Vocabulary Alignment Techniques
Similar to ontology alignment problem
Existing approaches for (semi-)automatic ontology alignment
  – Using techniques from linguistics, computer science, statistics
Problem: performance does not allow 100% automatic alignment
Problem: multilingual case
  – Some techniques cannot be used

19 Potential Technique: Using Background Knowledge
Using a shared conceptual reference to find links
[Diagram: SHL 1 and SHL 2 linked through background knowledge; example labels: Calendar, Publication]

20 Potential Technique: Statistical Alignment
Object information (book indexing)
[Diagram: SHL 1 and SHL 2 connected through dually-indexed books; labels as extracted: Dutch Literature, Dutch]
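The idea of statistical alignment is that two headings which index largely the same books are likely to be related. A minimal sketch follows, assuming Jaccard overlap as the similarity measure and a fixed threshold; the slide names neither, and the heading identifiers and book ids are toy data.

```python
def jaccard(books_a, books_b):
    """Share of books indexed by both headings among books indexed by either."""
    a, b = set(books_a), set(books_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_links(index_1, index_2, threshold=0.5):
    """index_1 and index_2 map a heading to the set of book ids it indexes.
    Returns (heading_1, heading_2, score) for pairs at or above the threshold,
    best score first."""
    links = []
    for h1, books1 in index_1.items():
        for h2, books2 in index_2.items():
            score = jaccard(books1, books2)
            if score >= threshold:
                links.append((h1, h2, score))
    return sorted(links, key=lambda link: -link[2])


# Toy data: hypothetical headings and dually-indexed book ids.
index_1 = {"shl1:dutch-literature": {"b1", "b2", "b3"}}
index_2 = {"shl2:dutch": {"b1", "b2", "b3", "b4"}, "shl2:calendar": {"b9"}}
```

Because the method relies on shared books rather than on labels, it remains usable in the multilingual case.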

21 Vocabulary Alignment
Specifying required alignment format (links)
Selection (& running) of tool/method
Evaluation (& cleaning)
  – Considering application

22 Evaluation of Alignments
MACS has produced mappings!
  – Possible gold standard
But: has MACS produced all mappings?
  – What proportion of the SHLs is covered?
  – Taking into account all indexing strings?
Are MACS mappings the only interesting ones?
  – Serendipity mappings
    Concepts that are not equivalent but could bring useful results when added to queries
  – Compensating for indexing variability
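If manual mappings are used as a gold standard, a first comparison could be the usual precision and recall, sketched below with hypothetical function and variable names. Given the caveats on this slide, a mapping missing from the gold standard is not necessarily wrong, so precision computed this way is at best a lower bound.

```python
def precision_recall(found, gold):
    """Compare automatically found mappings with a gold standard.
    Both arguments are collections of (source, target) pairs."""
    found, gold = set(found), set(gold)
    correct = found & gold
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall
```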

23 Evaluation of Alignments
Several scenarios for using and evaluating alignments
  – Concept-based search
  – Re-indexing
  – Integration of one SHL into the other
  – SHL Merging
  – Free-text search
  – Navigation

24 Evaluation of Alignments
Several scenarios for using and evaluating alignments
  – Concept-based search
    Retrieving books indexed by SHL1 using SHL2 concepts
  – Re-indexing
  – Integration of one SHL into the other
  – SHL Merging
  – Free-text search
    Matching user search terms to both SHL1 and SHL2 concepts
  – Navigation
    Browsing several collections using one SHL structure
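The concept-based search scenario can be sketched as a two-step lookup: reformulate the SHL2 query concept into mapped SHL1 concepts, then collect the books indexed with those. This is a minimal illustration with hypothetical function names and data shapes, not the TEL implementation.

```python
def reformulate(query_concept, mappings):
    """Return the SHL1 concepts mapped to an SHL2 query concept.
    mappings is a collection of (shl1_concept, shl2_concept) pairs."""
    return sorted(s1 for s1, s2 in mappings if s2 == query_concept)


def search(query_concept, mappings, shl1_index):
    """Retrieve books indexed with SHL1 concepts, starting from an SHL2 concept.
    shl1_index maps an SHL1 concept to the set of book ids it indexes."""
    books = set()
    for concept in reformulate(query_concept, mappings):
        books |= shl1_index.get(concept, set())
    return books
```

In an assisted setting, the output of `reformulate` would be shown to the user as candidates instead of being applied automatically.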

25 Evaluation of Alignments
Several settings for a single scenario
  – Fully automatic reformulation vs assisted reformulation (candidates)
Different evaluation measures
  – Good mappings vs acceptable ones
  – Number of candidates for reformulation
  – Semantic closeness to original query

26 Vocabulary Alignment
Specifying required alignment format (links)
Selection (& running) of tool/method
Evaluation (& cleaning)
Assessment of the approach
  – Efforts required, quality, extendibility

27 WP3.2 Sub-tasks
Converting the subjects to standard representation language
  – Semantic web format (SKOS)
Aligning the vocabularies
  – Semantic correspondences between subjects
Deploying the alignment knowledge obtained into TEL framework
  – E.g. using links to reformulate queries from one subject list to the other

28 Deploying the alignment knowledge obtained into TEL framework
Observing integration of MACS data into TEL
  – Conceptual input for alignment requirements
Integration of the obtained alignment in TEL
Assessment of the alignment integration
  – Technical aspects, usage aspects

29 Reminder
Alignment is a difficult problem
Application-specific alignment pretty much unexplored in Semantic Web research
More a feasibility study than a complete solution to the problem
Practical goal: investigate how automatic techniques could help MACS-like initiatives
Manual mapping is labour-intensive

30 Agenda
TELPlus Context
Improving subject access
  – 3 sub-tasks
Services for TEL

31 WP4 – Integrating services with the European Library portal
Theo van Veen (KB)
Tasks:
Identifying services that are going to give the user the greatest return
Creating new services
Integrating services within TEL
…

32 WP4 – Some Services Mentioned
Preliminary inventory: no official commitment!
Services based on controlled vocabularies:
Thesaurus and name authority service
  – Providing terms linked to query terms
Semantic enrichment service
  – Users can annotate search results with terms
Distance between terms and related terms

33 WP4 – Some Services Mentioned
Preliminary inventory: no official commitment!
Services based on controlled vocabularies:
Thesaurus and name authority service
Semantic enrichment service
Distance between terms and related terms
Adding more value from controlled vocabularies and alignments between them

34 Thanks!

