Related terms search based on WordNet / Wiktionary and its application in ontology matching RCDL'2009 St. Petersburg Institute for Informatics and Automation.

Presentation transcript:

Related terms search based on WordNet / Wiktionary and its application in ontology matching RCDL'2009 St. Petersburg Institute for Informatics and Automation of RAS Feiyu Lin, A. Krizhanovsky (andrew.krizhanovsky at gmail.com) Jönköping University, Sweden

Contents: Wiki and Wiktionary intro; MRD, parser and Wiktionaries comparison; Correlation of relatedness measures; Experiment scheme; Result and comparison; Results, applications and future

Goal: Is it possible to find related terms using the current version of Wiktionary as successfully as with WordNet (for ontology matching, for use in text search systems, etc.)? What are the advantages?

Wiki resources: distributed users and authors (who edit pages); centralized storage (e.g. MySQL, Apache, PHP); a set of hyperlinked articles; each article has one or more categories (forming a tree)

Wiktionary is a free-content multilingual dictionary

Wiktionary data: +, −, simplicity & complexity
− Different Wiktionaries have different levels of standardization
− Fast-growing data, but it is created by a huge community (so a parser has to be very stable)
+ Rich data
+ thesaurus (synonyms, antonyms)
+ phrase books
+ etymologies
+ pronunciations
+ sample quotations
+ translations
+ Fast-growing data
+ Interwiki (additional data)
+ GNU FDL

Wiktionary machine-readable dictionary database scheme

Size of Wiktionaries WordNet (2006): 150,000 words, 115,000 synsets

A shortest path in Russian Wiktionary
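
The figure for this slide is not preserved in the transcript. As a rough illustration only, a shortest path between two words can be found with breadth-first search over a graph whose edges are dictionary relations (for example, synonym links). The sketch below uses Java, the language named on the Implementation slide; the class name, method names, and sample words are hypothetical and are not taken from the authors' code or from Wiktionary data.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical helper: words are nodes, semantic relations are undirected edges.
final class RelatedTermsPath {

    /** Adds an undirected relation (e.g. a synonym link) between two words. */
    static void link(Map<String, List<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /**
     * Breadth-first search from source to target.
     * Returns the words on a shortest path (both ends included),
     * or an empty list when the target cannot be reached.
     */
    static List<String> shortestPath(Map<String, List<String>> graph,
                                     String source, String target) {
        Map<String, String> parent = new HashMap<>(); // visited word -> predecessor
        Queue<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);

        while (!queue.isEmpty()) {
            String word = queue.remove();
            if (word.equals(target)) {
                // Walk the predecessor links back to the source.
                List<String> path = new ArrayList<>();
                for (String w = word; w != null; w = parent.get(w)) {
                    path.add(w);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : graph.getOrDefault(word, Collections.emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, word);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

The number of edges on the returned path (its size minus one) is one simple candidate for a path-based relatedness score: the shorter the path, the more closely related the two words.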

Correlation of relatedness measures: correlation with human judgments (353-TC test collection) for relatedness measures based on WordNet, English Wikipedia, and Russian Wiktionary
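
The slide does not say which correlation coefficient was used. One common choice for comparing a relatedness measure against human ratings is Spearman's rank correlation, sketched below in Java as Pearson correlation computed on average ranks. The class and method names are hypothetical, and any scores passed in are the caller's own data, not results from the presentation.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for comparing a relatedness measure with human judgments.
final class RankCorrelation {

    /** 1-based ranks; tied values share the mean of the positions they occupy. */
    static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int k = 0; k < n; k++) {
            order[k] = k;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer idx) -> values[idx]));

        double[] rank = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            // Extend j over the run of values equal to values[order[i]].
            while (j + 1 < n && values[order[j + 1]] == values[order[i]]) {
                j++;
            }
            double mean = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) {
                rank[order[k]] = mean;
            }
            i = j + 1;
        }
        return rank;
    }

    /** Pearson correlation; the result is NaN if either input is constant. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    /** Spearman's rho: Pearson correlation of the two rank vectors. */
    static double spearman(double[] human, double[] measure) {
        if (human.length != measure.length) {
            throw new IllegalArgumentException("inputs must have the same length");
        }
        return pearson(ranks(human), ranks(measure));
    }
}
```

Because only ranks matter, a measure that orders word pairs the same way as the human raters scores 1 regardless of its scale, which makes path-based and probability-based measures directly comparable.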

Largest eight Wiktionary editions (March 2008)

Applications of a machine-readable dictionary (MRD). Thesaurus data: related terms search; search request extension (by synonyms) / request reformulation (in search systems); request recognition in question-answering systems; word sense disambiguation. Media data (audio + pictures): language learning
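
Search request extension by synonyms can be pictured as follows. This is a minimal Java sketch under stated assumptions: the synonym table is a plain in-memory map standing in for thesaurus data extracted from a dictionary, and the class name, method name, and example words are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: widens a search request with synonyms of its terms.
final class QueryExpansion {

    /**
     * Returns the original query terms plus their synonyms,
     * in first-seen order and without duplicates.
     */
    static Set<String> expand(List<String> queryTerms,
                              Map<String, List<String>> synonyms) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : queryTerms) {
            expanded.add(term);
            expanded.addAll(synonyms.getOrDefault(term, Collections.emptyList()));
        }
        return expanded;
    }
}
```

A search system would then match documents against any term in the expanded set, typically giving the original terms more weight than the added synonyms.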

Work plan: done and todo
Russian Wiktionary
Extraction (by RE)
– Definition
– Relations (synonyms…)
– Translation
– Audio
– Graphics
Database API
Visualization (MRD browser)
Quiz & tests (test application)
Russian Wiktionary
Database scheme
– Definition
– Relations (synonyms…)
– Translation
– Audio
– Graphics
Database API
English Wiktionary

Implementation: software based on Synarcher code; Java; MySQL or SQLite database; JUnit test framework

Results: the scheme of the experiment for calculating the semantic relatedness measure based on Russian Wiktionary data; the parser of Russian Wiktionary; database scheme designed; database API implemented in Java; compared the results of related terms search based on Wiktionary and WordNet; project site (Wiki tool kit)

Future work: finish creating the MRD database and software (Russian Wiktionary and English Wiktionary); visualization (JavaFX): MRD browser; quiz & tests (learning application); online application (Java Web Start)

Thank you!