Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich.


1 Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich

2 Overview
- Demo
- Resources
- Search Procedure
- Added Value of Multilingual Semantic Search
- Architecture
- Functionality & Features
- User Feedback & Plans for Final Phase

3 Demo
Example Setting
The Dutch student Willem is an exchange student in Germany. For a course about the Semantic Web, every student has to give a presentation about a related topic. Willem's topic is markup languages. The teacher tells the students that they can use ILIAS to find learning materials on the topic in various languages.
Willem knows: 1. Dutch, 2. English, 3. German

4 Demo
Demo with ILIAS

5 Resources for semantic search
1. A multilingual document collection
2. An ontology, including a domain ontology on the domain of the documents
3. Concept lexicalisations in different languages
4. Concept annotation of the documents

6 Relation between resources
[Diagram: per-language Lexicons (Term → Concept), the Ontology, and per-language LOs; languages: BG, CS, DE, EN, MT, NL, PL, PT, RO]

7 Search Procedure
[Diagram: flow from query to results]
- Inputs: search terms, search languages, visualisation language, retrieval languages
- normalise & lexicon lookup → found concepts
- extract & visualise ontology fragments → ontology fragments (navigate and select)
- search → found LOs
- Two kinds of results: found LOs and ontology fragments
- user-selected concepts + retrieval languages → search → found LOs

8 Added Value
1. Better retrieval of LOs
   - Find LOs that would not be found by simple text search (where the exact search word must occur in the text)
   - Example: search for "screen", retrieve an LO that contains "monitor" but not "screen"
2. Multilinguality
   - One implementation applies to all languages in the project
3. Crosslinguality
   - Possible to find LOs in languages different from the search/interface language
   - No need to translate the search query
   - Search possible with passive foreign-language knowledge

9 Search architecture
[Diagram labels]
- Client: LMS / ILIAS / other system
- LT4eL Web-Services: Full-Text Search, Keyword Search, Crosslingual Semantic Search
- Components: Lexicon Lookup Component, Ontology Management System, Ontology Search Engine
- Data: Lexicons, Ontology, Learning Objects

10 Internal components
1. Query → Lexicon terms
2. Terms → Concepts
3. Concepts → Relevant documents
4. Ranking for found documents
5. Show concept neighbourhood

11 1: Query → Terms
Why start with a free-text query?
- User wants results fast (as in Google)
- Compete with full-text search and keyword search
- Find starting point for ontology browsing
Query → lexicon: adopted/implemented strategies for
- Tokenise → create combinations for multiword terms (e.g. "space bar")
- Diacritic/case-insensitive match (é → e; E → e)
- Use word forms from corpus for coverage
  - 450 new word forms, 170 lexicon entries (average per language)
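The query-to-term strategies listed above (tokenising, multiword combinations, diacritic- and case-insensitive matching) can be illustrated with a minimal Python sketch. It is not the LT4eL implementation: the function names, the whitespace tokeniser, and the three-token limit for multiword terms are assumptions made for illustration.

```python
import unicodedata


def normalise(text):
    """Lower-case the text and strip diacritics (é -> e, E -> e)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidate_terms(query, max_len=3):
    """Tokenise on whitespace and build contiguous token combinations.

    Longer combinations come first so that multiword terms such as
    "space bar" are tried before their parts. max_len is an assumed limit.
    """
    tokens = normalise(query).split()
    candidates = []
    for n in range(min(max_len, len(tokens)), 0, -1):
        for i in range(len(tokens) - n + 1):
            candidates.append(" ".join(tokens[i:i + n]))
    return candidates


def lookup_terms(query, lexicon_terms):
    """Return the query candidates that match a (normalised) lexicon term."""
    known = {normalise(term) for term in lexicon_terms}
    return [c for c in candidate_terms(query) if c in known]


# Hypothetical lexicon entries, for illustration only.
print(lookup_terms("Space Bar café", ["space bar", "Café", "bar"]))
```

Normalising both the query and the lexicon entries keeps the match symmetric, so a user who types "cafe" still reaches an entry stored as "Café".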

12 2: Term → Concept
Possible situations:
- Corresponding concept is missing from ontology
  - LT4eL: not in lexicon → no result
- Unique result: term is lexicalisation of one concept
- Multiple concepts from more domains:
  - Window (graphical representation on monitor)
  - Window (part of a building)
- Multiple concepts from one domain, e.g.:
  - Key (from keyboard)
  - Key (in database)
- Different concepts for different languages:
  - Kind (English: sort/type)
  - Kind (German: child)
By using ontology navigation, the user can choose
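The situations above can be made concrete with a small Python sketch of a term-to-concept lookup. The lexicon layout (a mapping from language and term to concept identifiers) and all concept names are hypothetical; the actual LT4eL lexicon format is not shown in the slides.

```python
# Hypothetical toy lexicon: (language, term) -> concept identifiers.
LEXICON = {
    ("en", "window"): ["GuiWindow", "BuildingWindow"],  # more domains
    ("en", "key"): ["KeyboardKey", "DatabaseKey"],      # one domain
    ("en", "kind"): ["Type"],                           # English: sort/type
    ("de", "kind"): ["Child"],                          # German: child
    ("en", "monitor"): ["Screen"],                      # unique result
}


def concepts_for(term, languages):
    """Collect the concepts a term denotes in any of the search languages."""
    found = []
    for lang in languages:
        for concept in LEXICON.get((lang, term.lower()), []):
            if concept not in found:
                found.append(concept)
    return found


def classify(term, languages):
    """Name the situation: no result, unique, or ambiguous."""
    count = len(concepts_for(term, languages))
    if count == 0:
        return "no result"
    return "unique" if count == 1 else "ambiguous"


print(concepts_for("Kind", ["en", "de"]))
print(classify("key", ["en"]))
```

In the ambiguous case the sketch returns every candidate concept, which matches the idea that the user resolves the ambiguity through ontology navigation rather than the system guessing.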

13 3: Concepts → Documents
Query expansion at concept level:
- Super concepts one level (not integrated)
- Direct sub concepts one level (not integrated)
- Direct + indirect concepts (integrated)
AND/OR
- At concept level (precise)
- At text input level:
  - AND-search: for every term, at least one denoted concept must be found and occur in the LO
  - OR-search: for at least one of the terms, a denoted concept must be found and occur in the LO
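A minimal Python sketch of the AND/OR rule at text input level, combined with expansion to direct and indirect subconcepts. The toy ontology, the concept names, and the choice to count a match on an expanded subconcept as a match for the term are assumptions, not details confirmed by the slides.

```python
# Hypothetical toy ontology: concept -> direct subconcepts.
SUBCONCEPTS = {
    "MarkupLanguage": ["XML", "HTML"],
    "XML": ["XHTML"],
}


def expand(concept):
    """Return the concept plus all its direct and indirect subconcepts."""
    result, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(SUBCONCEPTS.get(current, []))
    return result


def matches(lo_annotations, term_concepts, mode="AND"):
    """Decide whether an LO matches the query.

    lo_annotations: set of concepts annotated in the LO.
    term_concepts: one list of denoted concepts per query term.
    AND: every term needs at least one (expanded) concept in the LO.
    OR: at least one term needs such a concept.
    """
    hits = []
    for concepts in term_concepts:
        expanded = set()
        for concept in concepts:
            expanded |= expand(concept)
        hits.append(bool(expanded & lo_annotations))
    if not hits:
        return False
    return all(hits) if mode == "AND" else any(hits)


lo = {"XHTML", "Screen"}
print(matches(lo, [["MarkupLanguage"], ["Screen"]], mode="AND"))
print(matches(lo, [["MarkupLanguage"], ["Keyboard"]], mode="OR"))
```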

14 4: Ranking (ID list → ranked ID list)
- Number of different matching concepts
- Annotation frequency: number of times search concepts are annotated in the document
  - Normalise: divide by document length
- Superconcepts and subconcepts of search concepts have lower weight
  - A factor determines their weight (0.3)
- Language of document:
  - Sort per language (currently)
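The slides name the ranking criteria but not how they are combined. The Python sketch below shows one possible combination: group by language, then order by the number of distinct matching concepts, then by length-normalised annotation frequency with related concepts weighted by 0.3. The sort order, the document layout, and the decision to count only search concepts as "different matching concepts" are assumptions.

```python
RELATED_WEIGHT = 0.3  # weight for super-/subconcepts, as stated on the slide


def score(doc, search_concepts, related_concepts):
    """Return (distinct matches, normalised weighted annotation frequency)."""
    counts = doc["annotations"]  # concept -> number of annotations
    direct = sum(counts.get(c, 0) for c in search_concepts)
    related = sum(counts.get(c, 0) for c in related_concepts)
    distinct = sum(1 for c in search_concepts if counts.get(c, 0) > 0)
    frequency = (direct + RELATED_WEIGHT * related) / doc["length"]
    return distinct, frequency


def rank(docs, search_concepts, related_concepts):
    """Turn a list of documents into a ranked list of document IDs."""
    def key(doc):
        distinct, frequency = score(doc, search_concepts, related_concepts)
        # Assumed order: language first, then more matches, then frequency.
        return (doc["lang"], -distinct, -frequency)
    return [doc["id"] for doc in sorted(docs, key=key)]


# Hypothetical documents, for illustration only.
docs = [
    {"id": "a", "lang": "en", "length": 100, "annotations": {"Screen": 2}},
    {"id": "b", "lang": "en", "length": 100,
     "annotations": {"Screen": 2, "Keyboard": 1}},
    {"id": "c", "lang": "de", "length": 50, "annotations": {"Screen": 1}},
    {"id": "d", "lang": "en", "length": 100,
     "annotations": {"Screen": 2, "Display": 2}},
]
print(rank(docs, ["Screen", "Keyboard"], ["Display"]))
```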

15 5: Concept neighbourhood
Input: a set of concepts (+ visualisation languages)
- Concepts that have been found by lexicon lookup
- Concepts that have been selected in ontology browser
Output: XML document, containing for each of the specified concepts:
- Neighbourhood: all concepts that can be reached directly by a relation in any direction
  - Currently: always one super-concept + a set of sub-concepts
- For specified concepts + neighbourhood concepts:
  - Head term from lexicon (for each of the specified languages)
  - Concept definition (for each of the specified languages)
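A minimal Python sketch of what such a neighbourhood document could look like, built with the standard library's xml.etree.ElementTree. The element and attribute names, the toy ontology, and the lexicon entries are all hypothetical; the actual LT4eL XML schema is not given in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy data: one super-concept and a set of sub-concepts each.
ONTOLOGY = {
    "Screen": {"super": "OutputDevice", "subs": ["TouchScreen"]},
}
# concept -> language -> (head term, definition)
LEXICON = {
    "Screen": {"en": ("screen", "A display surface."),
               "de": ("Bildschirm", "Eine Anzeigefläche.")},
    "OutputDevice": {"en": ("output device", "A device that presents data.")},
    "TouchScreen": {"en": ("touch screen", "A screen that senses touch.")},
}


def add_labels(element, concept_id, languages):
    """Attach head term and definition for each requested language."""
    for lang in languages:
        entry = LEXICON.get(concept_id, {}).get(lang)
        if entry:
            head, definition = entry
            ET.SubElement(element, "term", lang=lang).text = head
            ET.SubElement(element, "definition", lang=lang).text = definition


def neighbourhood_xml(concepts, languages):
    """Build an XML string with each concept and its direct neighbours."""
    root = ET.Element("neighbourhoods")
    for concept_id in concepts:
        node = ONTOLOGY.get(concept_id, {})
        focus = ET.SubElement(root, "concept", id=concept_id)
        add_labels(focus, concept_id, languages)
        if node.get("super"):
            sup = ET.SubElement(focus, "superconcept", id=node["super"])
            add_labels(sup, node["super"], languages)
        for sub_id in node.get("subs", []):
            sub = ET.SubElement(focus, "subconcept", id=sub_id)
            add_labels(sub, sub_id, languages)
    return ET.tostring(root, encoding="unicode")


print(neighbourhood_xml(["Screen"], ["en", "de"]))
```

Languages without a lexicon entry for a concept are simply skipped, so a neighbour concept may carry fewer labels than the focus concept.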

16 Functionality & Features (1)
Search Process
- free-text query
- two modes of ontology usage:
  - fully automatic (simple for user)
  - select concepts (precise; get more insight)
- find ontology starting point without browsing
- orthographic variation
  - normalisation for upper/lowercase and diacritics
  - multi-word terms
- cross-lingual retrieval

17 Functionality & Features (2)
Internal
- query expansion with subconcepts
- word forms mapped to lexicon entries
Presentation & Results
- ontology visualisation in preferred language
  - best representing term
  - definition
- ranking of found LOs
- matching topics are displayed

18 Feedback from user scenario experiments
Semantic search & overview of related topics is nice
BUT:
- Not clear what the language of a found LO is
- Not clear how multi-word input is treated / AND-search is not possible
- Ontology fragments disappear after adding an LO to the personal desktop
- Difference between semantic search and ontology browsing is unclear
- For many concepts, no parent concept is shown (upper ontology not navigable)
- No good overview in ontology browsing: branches stay small (if you go deeper, higher ones disappear; no relation to other branches)
- Show the part of the LO where the concept occurs
- Combination of search methods would be good

19 Plans for the final phase
- Integrate AND/OR search
- Separate semantic search from ontology browsing
- Integrate improved ontology with relations
- Navigate entire ontology, including upper part
- Show Google-like snippet of LO
- Checkboxes instead of drop-down list

20 Conclusion
- Use a language-independent ontology for retrieval of LOs:
  - Automatic, with traditional text field for query
  - Explicit: visualise ontology in various languages for presentation to user
- Overall feedback from users was positive:
  - Users liked the possibility to find LOs in multiple languages
  - Users liked the related topics hierarchy
- Make even more useful: integrate improved ontology and functionalities
