Patrizia Paggio, Center for Sprogteknologi: A Modular and Scalable Environment for the Semantic Web

1 A Modular and Scalable Environment for the Semantic Web
Patrizia Paggio, Center for Sprogteknologi

2 Goals (IT-højskolen)
To develop an innovative methodological environment and software that will enable content providers to build a knowledge grid in which
- the content of Web pages can be managed in a modular and scalable way, and
- queries can be posed in natural language to extract relevant content from the grid, based on the underlying ontologies.
Testbed: a demonstrator for searching university sites.

3 Domain areas involved
- Semantic Web
- Ontology mapping
- Knowledge management
- Topic maps
- Text and data mining
- Intelligent agents

4 NL-based, intelligent content search

5 MOSES consortium
- FINSA, Italian software company (agent technology, requirements engineering)
- Mondeca, French software company (knowledge management and semantic markup, graph theory)
- Parabots, Dutch software company (text and data mining)
- Rome III Univ. (user partner, graph theory)
- Rome II Univ. (language technology, machine learning)
- CST (language technology, content-based search)

6 MOSES consortium

7 Planning

8 The semantic web
The present web is a collection of texts for humans to inspect and use. On the semantic web, texts are structured (marked up) so that programs (agents) can manipulate them. The semantic structure refers to common repositories, e.g. ontologies.

9 A scenario
A student or researcher is looking for information on university courses or research activities in Europe.
- “I need a list of institutes offering post-graduate courses in computational linguistics including corpus linguistics where the teaching language is...”
- “Which Danish university offers Danish language courses for foreign students?”

10 Our vision
- The content of web pages is structured according to relevant templates and ontologies (the project will create those relevant for the domain).
- The system helps to find the templates that best match the pages to be marked up.
- Search is based not on the words in the text, but on the semantic templates.
- A linguistic agent processes the results to generate relevant answers.

11 Our vision
Query: “I need a list of institutes offering...”
Answer: “The following institutes offer post-graduate courses in computational linguistics including corpus linguistics:...”
Query: “Which Danish university offers...”
Answer: “The University of Copenhagen offers Danish language courses for foreign students.”

12 Main work packages
1. Requirements and domain analysis
2. Architecture design
3. Semantic structure and tools
4. Implementation of agents
5. Content-based engine
6. Test and validation

13 Query analyser
- Investigate methods and develop tools to analyse user queries and convert them into semantic descriptions.
- The work is based on a realistic corpus of questions and queries.
- The analysis is shallow (no full parsing).
- It relies on specific linguistic items, e.g. interrogative pronouns.
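To make the idea of shallow analysis driven by interrogative pronouns concrete, here is a minimal sketch. It is not the MOSES query analyser: the `ANSWER_TYPE` table, the stopword list, and the function name `analyse_query` are all assumptions made up for illustration.

```python
import re

# Hypothetical table: interrogative word -> kind of answer the user expects.
ANSWER_TYPE = {
    "who": "person",
    "where": "place",
    "when": "time",
    "which": "selection",
    "what": "thing",
}

# Hypothetical stopword list, kept tiny on purpose.
STOPWORDS = {"a", "an", "the", "for", "in", "of", "is"}


def analyse_query(query):
    """Return a shallow description of the query: answer type plus keywords."""
    tokens = re.findall(r"\w+", query.lower())
    answer_type = "list"  # assumed default when no interrogative word is found
    keywords = []
    for token in tokens:
        if token in ANSWER_TYPE and answer_type == "list":
            # The first interrogative word decides the expected answer type.
            answer_type = ANSWER_TYPE[token]
        elif token not in STOPWORDS:
            keywords.append(token)
    return {"answer_type": answer_type, "keywords": keywords}
```

Calling `analyse_query("Which Danish university offers Danish language courses for foreign students?")` yields the answer type `"selection"` and the content words of the question; no syntactic parse is needed.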

14 Multilingual search as ontological mapping
[Diagram labels: analysis, mapping, search; da_query_1, da_analysed_query_1, da_ontology, it_ontology, it_analysed_query_1]

15 Topic maps: topics and associations
[Diagram with nodes labelled A1 and A2, T1 to T6, and R1 to R7]

16 Association example
“Bernard Vatant is instructor of a tutorial about Content Structure Engineering held at Extreme Markup Languages 2002”
This association represents an assertion about three topics:
- One person: “Bernard Vatant”
- One space-time event: “Extreme Markup Languages 2002”
- One concept: “Content Structure Engineering”
[Diagram labels: A, BV, R1, CS, R2, EM, R3]
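The association above can be sketched as a small data structure. This is only an illustration of the topic-map idea (topics playing roles in a typed association), not the format used by any particular topic-map tool; the class names, the association type `tutorial-assoc`, and the role names `instructor`, `subject`, and `event` are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topic:
    name: str
    kind: str  # e.g. "person", "event", "concept"


@dataclass
class Association:
    assoc_type: str
    roles: dict = field(default_factory=dict)  # role name -> Topic

    def add(self, role, topic):
        """Attach a topic to this association under the given role."""
        self.roles[role] = topic

    def player(self, role):
        """Return the name of the topic playing the given role."""
        return self.roles[role].name


# Build the association from the slide (role names are hypothetical).
tutorial = Association("tutorial-assoc")
tutorial.add("instructor", Topic("Bernard Vatant", "person"))
tutorial.add("subject", Topic("Content Structure Engineering", "concept"))
tutorial.add("event", Topic("Extreme Markup Languages 2002", "event"))
```

One association node ties the three topics together, and each topic is reached through the role it plays, which mirrors the A, R1 to R3, and BV/CS/EM labels in the diagram.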

17 Example
“List alle lektorerne i italiensk i efteråret 2003” (List all associate professors of Italian in autumn 2003)
list-all(x) [lektor(x), subject(italian), time(autumn-2003)]
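A toy analyser that turns the Danish query into the semantic description shown on this slide might look as follows. It is a sketch under strong assumptions: the lexicon, the season table, and the fallback operator `find` are invented for this one example and are not taken from the project.

```python
import re

# Hypothetical toy lexicon: Danish surface form -> semantic constraint.
LEXICON = {
    "lektorerne": "lektor(x)",
    "italiensk": "subject(italian)",
}

# Hypothetical season table used to build time(...) constraints.
SEASONS = {"efteråret": "autumn", "foråret": "spring"}


def analyse(query):
    """Convert a query into the 'operator(x) [constraints]' notation."""
    tokens = re.findall(r"\w+", query.lower())
    constraints = []
    for i, token in enumerate(tokens):
        if token in LEXICON:
            constraints.append(LEXICON[token])
        elif token in SEASONS and i + 1 < len(tokens) and tokens[i + 1].isdigit():
            # A season followed by a year becomes one time constraint.
            constraints.append(f"time({SEASONS[token]}-{tokens[i + 1]})")
    # "list alle" selects the list-all operator; "find" is an assumed fallback.
    operator = "list-all" if tokens[:2] == ["list", "alle"] else "find"
    return f"{operator}(x) [{', '.join(constraints)}]"
```

For the query on the slide, `analyse("List alle lektorerne i italiensk i efteråret 2003")` returns `list-all(x) [lektor(x), subject(italian), time(autumn-2003)]`. Function words such as "i" are simply skipped, which is what keeps the analysis shallow.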

18 Example, cont.
At = course-assoc
Rt1 = instructor
Rt2 = subject
Rt3 = institution
Rt4 = time
[Diagram labels under instructor: professore ordinario, professore associato, ricercatore, professor, lektor, UA]

19 Example, cont.
list-all(x) [lektor(x), subject(italian), time(autumn-2003)]
is mapped to
list-all(x) [instructor(x), subject(italian), time(autumn-2003)]
OR
list-all(x) [prof-associato(x), subject(italian), time(autumn-2003)]
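The rewriting on this slide can be read as query expansion through a concept mapping. The sketch below assumes a hypothetical `MAPPING` table in which the Danish concept `lektor` maps to the more general `instructor` and to the Italian `prof-associato`; how MOSES actually stores or computes such mappings is not stated in the slides.

```python
# Hypothetical mapping table: source concept -> concepts to try in the
# target ontology (a more general concept and a close equivalent).
MAPPING = {"lektor": ["instructor", "prof-associato"]}


def expand(query):
    """Expand a query, given as (predicate, argument) pairs, into alternatives."""
    alternatives = [[]]
    for predicate, argument in query:
        # Predicates without a mapping are kept unchanged.
        options = MAPPING.get(predicate, [predicate])
        alternatives = [
            alt + [(option, argument)] for alt in alternatives for option in options
        ]
    return alternatives


def render(constraints):
    """Render one alternative in the notation used on the slides."""
    body = ", ".join(f"{p}({a})" for p, a in constraints)
    return f"list-all(x) [{body}]"


danish_query = [("lektor", "x"), ("subject", "italian"), ("time", "autumn-2003")]
expanded = " OR ".join(render(alt) for alt in expand(danish_query))
```

Here `expanded` is the disjunction of the two rewritten queries, so a search over Italian university pages can match either the general role or the specific Italian job title.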

20 Answer generation: example
Query: “kurser i datalingvistik” (courses in computational linguistics)
...educational programme in computational linguistics, Göteborg University. A Swedish program offering bachelor's and master's degrees...
...Lund University’s curriculum 2001-2002. Computational linguistics deal with automatic analysis of texts and other linguistic material...
(Result of a Google search: the texts are not tagged with concepts! Boldface was added to the relevant information on the original slide.)

21 Answer generation, cont.
“I have found the following courses:”
...educational programme in computational linguistics, Göteborg University. A Swedish program offering bachelor's and master's degrees...
...Lund University’s curriculum 2001-2002. Computational linguistics deal with automatic analysis of texts and other linguistic material...
The introductory sentence should be in the language of the query!

22 Answer generation, cont.
I have found the following courses:
- Göteborg University, bachelor’s and master’s degrees + link
- Lund University + link
The introductory sentence and the relevant concepts (bachelor’s and master’s degrees) should be in the language of the query!
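A minimal sketch of this last step follows: the introductory sentence and the concept labels are looked up in the language of the query, while institution names and links are passed through unchanged. The Danish strings are illustrative translations, and the concept identifier, the result records, and the `example.org` links are placeholders, not project data.

```python
# Introductory sentence per query language (Danish text is an illustrative translation).
INTRO = {
    "en": "I have found the following courses:",
    "da": "Jeg har fundet følgende kurser:",
}

# Hypothetical concept identifier with a label per language.
CONCEPT_LABELS = {
    "bachelor-master": {
        "en": "bachelor's and master's degrees",
        "da": "bachelor- og kandidatgrader",
    },
}


def generate_answer(results, lang):
    """Build a short answer whose fixed wording is in the query language."""
    lines = [INTRO[lang]]
    for result in results:
        parts = [result["institution"]]
        parts += [CONCEPT_LABELS[c][lang] for c in result.get("concepts", [])]
        lines.append("- " + ", ".join(parts) + " + " + result["link"])
    return "\n".join(lines)


# Placeholder results; the links are not real course pages.
results = [
    {"institution": "Göteborg University", "concepts": ["bachelor-master"],
     "link": "https://example.org/goteborg"},
    {"institution": "Lund University", "link": "https://example.org/lund"},
]
```

Calling `generate_answer(results, "da")` produces the same two result lines as the English version, with only the introductory sentence and the degree label switched to Danish.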

23 More information
MOSES web site coming up soon. Link from www.cst.dk.
THANK YOU

