LT4EL - Integrating Language Technology and Semantic Web techniques in eLearning Lothar Lemnitzer GLDV AK eLearning, 11. September 2007.

LT4eL - Language Technology for eLearning
Start date: 1 December 2005
Duration: 30 months
Partners: 12
EU financing: 1.5 million euro
Project type: STREP IST

LT4eL - Partners
Utrecht University, The Netherlands (coordinator)
University of Hamburg, Germany
University Al.I.Cuza of Iasi, Romania
University of Lisbon, Portugal
Charles University Prague, Czech Republic
IPP, Bulgarian Academy of Sciences, Bulgaria
University of Tübingen, Germany
ICS, Polish Academy of Sciences, Poland
Zürich University of Applied Sciences Winterthur, Switzerland
University of Malta, Malta
Eidgenössische Hochschule Zürich
Open University, United Kingdom

LT4eL - Objectives (1)
Scientific and technological objectives
– Integration of language technology resources and tools in eLearning
– Integration of semantic knowledge in eLearning
– Improve (multilingual) retrieval of learning material

LT4eL - Languages
Bulgarian, Czech, Dutch, German, Maltese, Polish, Portuguese, Romanian, English

LT4eL - Objectives (2)
Political objectives
– Support multilinguality
– Knowledge transfer
– Awareness raising
– Exploitation of resources
– Facilitate access to education

Tasks
Creation of an archive of learning objects
Semi-automatic metadata generation driven by NLP tools:
– Keyword extractor
– Definition extractor
Enhancing eLearning with semantic knowledge:
– Ontologies
Integration of functionalities in the ILIAS Learning Management System
Validation of new functionalities in the ILIAS Learning Management System
Addressing multilinguality

[Figure: LT4eL architecture diagram. Components shown: user documents (PDF, DOC, HTML, SCORM, XML); Convertor 1 and Convertor 2; documents in SCORM, pseudo-structured, HTML and basic XML form; linguistic processor (lemmatizer, POS, partial parser); linguistically annotated XML; metadata (keywords); glossary; repository; lexicons for BG, CZ, DT, EN, GE, MT, PL, PT and RO; ontology; crosslingual retrieval; LMS; user profile.]

Creation of a learning objects archive
Collection of the learning material (uploads and updates, password protected)
IST domains for the LOs:
1. Use of computers in education, with sub-domains:
1.1 Teaching academic skills, with sub-domains:
– Academic skills
– Relevant computer skills for the above tasks (MS Word, Excel, PowerPoint, LaTeX, Web pages, XML)
– Basic computer skills (use of computers for beginners) (chats, Internet)
1.2 Impact of e-Learning on education
2. Calimera documents (parallel corpus developed in the Calimera FP5 project)

Collection of learning materials and linguistic tools
Normalization of the learning material
Convertors from HTML/TXT to basic XML format
Inventory and classification of existing tools relevant to:
– the integration of language technology resources in eLearning
– the integration of semantic knowledge
Inventory and classification of existing language resources: corpora and frequency lists, lexica

Semi-automatic metadata generation with LT and NLP
Aims:
– supporting authors in the generation of metadata for LOs
– improving keyword-driven search for LOs
– supporting the development of glossaries for learning material

Metadata
– Metadata is essential to make LOs visible to larger groups of users
– Authors are reluctant or not experienced enough to supply it
– NLP tools will help them in that task
– The project uses the LOM metadata schema as a blueprint

Identification of keywords
– Good keywords have a typical, non-random distribution in and across LOs
– Keywords tend to appear more often at certain places in texts (headings etc.)
– Keywords are often highlighted or emphasised by authors

Modelling Keywordiness
– Residual inverse document frequency is used to model the inter-text distribution of keywords
– Term burstiness is used to model the intra-text distribution of keywords
– Knowledge of text structure is used to identify salient regions (e.g. headings)
– Layout features of texts are used to identify emphasised words and weight them higher
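
The slide names the two distributional measures but gives no formulas. The sketch below assumes one common formulation: residual IDF as the observed IDF minus the IDF expected if occurrences were spread by a Poisson process, and burstiness as the mean number of occurrences per document that contains the term. The function name and the toy documents are illustrative only and are not taken from the project.

```python
import math
from collections import Counter

def keywordiness_stats(docs):
    """docs: list of token lists. Returns {term: (residual_idf, burstiness)}."""
    n_docs = len(docs)
    coll_freq = Counter()  # total occurrences of each term in the collection
    doc_freq = Counter()   # number of documents containing each term
    for tokens in docs:
        coll_freq.update(tokens)
        doc_freq.update(set(tokens))
    stats = {}
    for term, cf in coll_freq.items():
        df = doc_freq[term]
        observed_idf = -math.log2(df / n_docs)
        # IDF expected under a Poisson model with rate cf / n_docs (assumed formulation)
        expected_idf = -math.log2(1.0 - math.exp(-cf / n_docs))
        ridf = observed_idf - expected_idf
        burstiness = cf / df  # mean occurrences per document containing the term
        stats[term] = (ridf, burstiness)
    return stats

# Toy collection: "the" is spread evenly, "ontology" clusters in one document
docs = [
    "the ontology maps the ontology terms".split(),
    "the lexicon lists the words".split(),
    "the parser reads the text".split(),
    "the user opens the file".split(),
]
stats = keywordiness_stats(docs)
```

In this toy collection the evenly spread word "the" receives a negative residual IDF, while "ontology", which clusters in a single document, receives a positive one.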

Challenges
– Treating multi-word keywords (suffix arrays will be used to identify n-grams of arbitrary length)
– Assigning a combined weight which takes into account all the aforementioned factors
– Multilinguality
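
To illustrate how a suffix array exposes repeated n-grams of arbitrary length, here is a minimal token-level sketch. It is a naive construction (sorting whole suffixes), chosen for readability rather than efficiency, and it is an assumption about the general technique, not the project's implementation. The function name and `min_len` parameter are hypothetical.

```python
def repeated_ngrams(tokens, min_len=2):
    """Return {ngram: count} for every n-gram (n >= min_len) occurring at least twice."""
    # Suffix array: start positions ordered by the token sequence that begins there
    suffixes = sorted(range(len(tokens)), key=lambda i: tokens[i:])
    candidates = set()
    for a, b in zip(suffixes, suffixes[1:]):
        # Longest common prefix of two neighbouring suffixes
        lcp = 0
        while (max(a, b) + lcp < len(tokens)
               and tokens[a + lcp] == tokens[b + lcp]):
            lcp += 1
        # Every prefix of the shared part is an n-gram that occurs at least twice
        for n in range(min_len, lcp + 1):
            candidates.add(tuple(tokens[a:a + n]))
    counts = {}
    for gram in candidates:
        n = len(gram)
        counts[gram] = sum(
            1 for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == gram
        )
    return counts

tokens = "learning management system and learning management tools".split()
ngrams = repeated_ngrams(tokens)
```

Because all occurrences of a repeated n-gram sit next to each other in the sorted suffix list, comparing neighbours is enough to find them without fixing n in advance.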

Evaluation
– Manually assigned keywords will be used to measure precision and recall of the keyword extractor
– Human annotators will judge the results from the extractor and rate them
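
Precision and recall against manually assigned keywords can be computed as in the following sketch. The keyword lists are invented for illustration and do not come from the project data.

```python
def precision_recall(extracted, gold):
    """Compare extracted keywords with manually assigned (gold) keywords."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical extractor output and manual annotation for one learning object
extracted = ["ontology", "metadata", "slide", "lexicon"]
gold = ["ontology", "metadata", "keyword"]
precision, recall = precision_recall(extracted, gold)
```

Here two of the four extracted keywords are correct (precision 0.5), and two of the three manual keywords are found (recall about 0.67).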

Identification of definitory contexts
Empirical approach based on linguistic annotation of LOs
Identification of definitory contexts is language specific
Workflow
– Definitory contexts are searched and marked in LOs (manually)
– Local grammars are drafted on the basis of these examples
– Linguistic annotation is used for these grammars
– Grammars are applied to new LOs
– Extraction of definitory contexts is performed by lxtransduce (University of Edinburgh - LTG)
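
The idea of a local grammar over linguistic annotation can be sketched with a regular expression over part-of-speech tags. This is not lxtransduce and does not reproduce its grammar format; the tagset (DET, ADJ, NOUN, VERB), the single English pattern and the function name are assumptions made for illustration.

```python
import re

# Toy local grammar: a noun-phrase term, a form of "to be", then an article.
# Tag names are a hypothetical tagset, not the project's annotation scheme.
DEFINITION = re.compile(r"^(?:DET )?(?:ADJ )*(?:NOUN )+BE DET ")

def is_definitory(tagged_sentence):
    """tagged_sentence: list of (word, tag) pairs taken from the linguistic annotation."""
    symbols = []
    for word, tag in tagged_sentence:
        # Copula forms get their own symbol so the grammar can refer to them
        symbols.append("BE" if word.lower() in {"is", "are"} else tag)
    return bool(DEFINITION.match(" ".join(symbols) + " "))

definition = [("An", "DET"), ("ontology", "NOUN"), ("is", "VERB"),
              ("a", "DET"), ("formal", "ADJ"), ("specification", "NOUN")]
other = [("The", "DET"), ("user", "NOUN"), ("opens", "VERB"),
         ("a", "DET"), ("file", "NOUN")]
```

A real grammar would need one rule set per language, since the slide notes that definitory contexts are language specific.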

Ontology-based cross-lingual retrieval
– Metadata can also be represented by ontologies
– Creation of a domain ontology in the area of LOs
– For consistency reasons we also employ an upper ontology (DOLCE)
– Lexical material in all 9 languages is mapped onto the domain ontology and onto the upper ontology
– The ontology will allow for multilingual retrieval of LOs
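
The retrieval idea can be sketched as follows: a query term is looked up in the lexicon of the query language, mapped to an ontology concept, and that concept is matched against the concept annotations of learning objects in other languages. All lexicon entries, concept names and document identifiers below are hypothetical.

```python
# Hypothetical per-language lexicons mapping terms to ontology concept IDs
LEXICONS = {
    "en": {"word processor": "WordProcessor", "spreadsheet": "Spreadsheet"},
    "de": {"textverarbeitung": "WordProcessor", "tabellenkalkulation": "Spreadsheet"},
    "nl": {"tekstverwerker": "WordProcessor"},
}

# Hypothetical concept annotations of learning objects: id -> (language, concepts)
ANNOTATIONS = {
    "lo-en-01": ("en", {"WordProcessor"}),
    "lo-de-07": ("de", {"WordProcessor", "Spreadsheet"}),
    "lo-nl-03": ("nl", {"Spreadsheet"}),
}

def crosslingual_search(query, query_lang, target_langs):
    """Return IDs of learning objects in target_langs annotated with the query's concept."""
    concept = LEXICONS[query_lang].get(query.lower())
    if concept is None:
        return []  # term not covered by the lexicon
    return sorted(
        doc_id for doc_id, (lang, concepts) in ANNOTATIONS.items()
        if lang in target_langs and concept in concepts
    )

results = crosslingual_search("Tekstverwerker", "nl", {"en", "de"})
```

The Dutch query finds English and German learning objects without any translation of the documents, because the match happens at the concept level.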

Domain Ontology creation
– lexicon (vocabulary with natural language definitions)
– simple taxonomy
– thesaurus (taxonomy plus related terms)
– relational model (unconstrained use of arbitrary relations)
– fully axiomatized theory

Domain Ontology
– Terminological dictionary in the chosen domain: term in English, a short definition in English, translation of the term
– The definitions are formalized to reflect relations like is-a, part-of, used-for; the definitions are translated into OWL-DL
– The aim is not a fully axiomatized theory, but a relational model of the domain
– The connection to the upper ontology will enforce the inheritance of the axiomatization of the upper ontology by the concepts in the domain ontology
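
A relational model with an is-a relation can be illustrated with a small hierarchy in which each concept inherits from its chain of superconcepts. The concept names are invented, and a plain dictionary stands in for the OWL-DL representation mentioned on the slide.

```python
# Hypothetical is-a links: concept -> direct superconcept
IS_A = {
    "WordProcessor": "ApplicationProgram",
    "Spreadsheet": "ApplicationProgram",
    "ApplicationProgram": "Software",
}

def ancestors(concept):
    """Follow is-a links upwards; a concept inherits from every concept in this chain."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

def subconcepts(concept):
    """All concepts that are (transitively) a kind of the given concept."""
    return sorted(c for c in IS_A if concept in ancestors(c))
```

A search for the general concept can then be widened to its subconcepts, for example `subconcepts("Software")` yields the application program concepts below it.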

Upper Ontology: DOLCE
– The ontology should be constructed on a rigorous basis
– It should be easy to represent in an ontology language such as RDF or OWL
– There are domain ontologies constructed with respect to it
– It can be related to lexicons, either by definition or by an already existing mapping to some lexical resource

Integration in ILIAS
– Integration of LT4eL functionalities for semi-automated metadata generation, definitory context extraction and ontology-supported extended data retrieval into a learning management system (prototype based on the ILIAS LMS)
– Developing and providing documentation for a standard-technology-based interface between the language technology tools and learning management systems

Integration of functionalities
[Figure: integration architecture diagram. Components shown: ILIAS server (application logic, user interface); Java webserver (Tomcat); KW/DC/Onto Java classes and data; webservices (Axis, nuSoap); servlets/JSP; development server (CVS) with KW/DC code, ontology code and data; ILIAS content portal with LOs; migration tool; third-party tools. Labels: evaluate functionalities directly; evaluate functionalities in ILIAS; nightly updates; use functionalities through SOAP.]

Validation of enhanced LMS
The challenge is to answer these questions:
– How does this compare with what can already be done with existing systems?
– What added value is there?
– What is the educational / pedagogic value of these functionalities?
The problem is to evaluate the functionality separately from issues of usability or unfamiliarity with the LMS platform.

How can we expect users to identify any benefit?
Present them with tasks to complete using the LMS:
– With no project functionality
– With project functionality (partial, full)
Identify potential users:
– Course creators
– Content authors or providers
– Teachers
– Students (studying in their own language, studying in a second language)

Create outline user scenarios
We define a scenario, in this context, as a story focused on a user or group of users which provides information on the nature of the users, the goals they wish to achieve and the context in which the activities will take place.
– Scenarios are written in ordinary language, and are therefore understandable to various stakeholders, including users.
– They may also contain different degrees of detail.

Conclusions
– Improve retrieval of learning material
– Facilitate construction of user-specific courses
– Improve creation of personalized content
– Support decentralization of content management
– Allow for multilingual retrieval of content
