5th-6th December 2002, 6th Meeting, Paris. WP2: NERC



D2.3
Currently a nearly-finished draft.
Reports on NERC v.2.
NERC v.2 still deals only with the 1st domain, but name matching and normalisation have now been added.
Contains system documentation and evaluation results for ENERC, HNERC and INERC so far; FNERC is still to come.
New in this version is the use of the NERC-based demarcator between NERC and FE, which affects evaluation.

D2.3: NERC Systems
Various improvements to ENERC, HNERC and INERC: tokenisation, lexical resources, etc.
Name matching and normalisation have been added, but an issue was raised about whether these are best done as part of NERC or FE.

Normalisation & Name Matching
Normalisation is best performed after FE:
– For efficiency: only normalise units that are part of facts.
– FE disambiguates certain entities (e.g. SPEED), and this helps normalisation.
Name matching could be done in NERC or FE:
– For co-referential entities within the same product description, name matching is needed before FE.
– For other entities, it can be done after FE, where it will be helped by FE disambiguation (e.g. SOFT_OS).
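
The efficiency argument for normalising after FE can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Entity` class, the `norm_mhz` attribute, the MHz/GHz rule and the sample strings are hypothetical and are not taken from the CROSSMARC systems. The point is only that entities which FE did not place in a fact are never normalised.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity record: an id, an entity type and the matched text."""
    id: int
    type: str
    text: str
    attrs: dict = field(default_factory=dict)


def normalise_speed(text):
    """Convert a string such as '2.0 GHz' to megahertz; return None if unparseable."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(MHz|GHz)\s*", text, flags=re.IGNORECASE)
    if m is None:
        return None
    value, unit = float(m.group(1)), m.group(2).lower()
    return value * 1000 if unit == "ghz" else value


def normalise_after_fe(entities, fact_entity_ids):
    """Normalise only the SPEED entities that FE has placed in a fact."""
    for e in entities:
        if e.id in fact_entity_ids and e.type == "SPEED":
            e.attrs["norm_mhz"] = normalise_speed(e.text)
    return entities


entities = [
    Entity(1, "SPEED", "2.0 GHz"),  # assumed to be part of a fact
    Entity(2, "SPEED", "800 MHz"),  # assumed to appear elsewhere on the page
]
normalise_after_fe(entities, fact_entity_ids={1})
```

After the call, only the first entity carries a `norm_mhz` attribute; the second was skipped because it belongs to no fact.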

Normalisation & Name Matching
HNERC: name matching of co-referential entities within the same product description happens after NERC and the Demarcator but before FE. All other name matching and normalisation happen after FE.
INERC: name matching is part of NERC, integrated into the ontology look-up process.
ENERC: both modules operate on entities and encode results as attributes on the entities (which can then be inherited by the facts). They can be run after NERC or FE (or both).
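
As a rough illustration of name matching restricted to a single product description, the Python sketch below groups mentions by a crude matching key. The key function (lowercase, alphanumerics only), the tuple layout and the product names are hypothetical; none of the NERC systems is known to match names this way.

```python
from collections import defaultdict


def match_key(text):
    """Crude matching key: lowercase, alphanumeric characters only (an assumption)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def match_names(mentions):
    """Group (description_id, entity_type, text) mentions that share a key
    within the same product description."""
    groups = defaultdict(list)
    for desc_id, etype, text in mentions:
        groups[(desc_id, etype, match_key(text))].append(text)
    return dict(groups)


mentions = [
    (1, "MODEL", "Acme X100"),
    (1, "MODEL", "ACME X-100"),  # same description: treated as co-referential
    (2, "MODEL", "Acme X100"),   # different description: kept separate
]
groups = match_names(mentions)
```

The two mentions in description 1 fall into one group, while the mention in description 2 stays in its own group, since matching here does not cross description boundaries.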

Evaluation
Annotators of the gold standard were instructed to annotate only entities that are part of product descriptions.
Demarcation now happens after NERC, so the NERC modules annotate entities throughout the page.
Evaluating NERC against the gold standard therefore gives a false measure of precision: correct entities outside product descriptions count as errors.
Evaluating NERC and Demarcation combined would give an accurate measure only if the Demarcator were totally accurate.
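
The effect on precision can be seen in a small Python sketch. The spans, entity labels and demarcated region below are invented for illustration, and exact-match scoring over `(start, end, type)` triples is an assumption rather than the project's evaluation procedure.

```python
def prf(predicted, gold):
    """Precision, recall and F-score over sets of (start, end, type) spans."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


def inside(span, regions):
    """True if the span lies wholly within one of the (start, end) regions."""
    return any(start <= span[0] and span[1] <= end for start, end in regions)


# Gold standard: only entities inside product descriptions were annotated.
gold = {(10, 15, "MANUF"), (16, 22, "MODEL")}

# NERC output: the same two entities plus two correct ones elsewhere on the page.
nerc_output = gold | {(200, 205, "MANUF"), (300, 306, "MODEL")}

# Hypothetical Demarcator output: a single product description region.
regions = [(0, 100)]

raw = prf(nerc_output, gold)
combined = prf({s for s in nerc_output if inside(s, regions)}, gold)
```

With these invented spans, `raw` has a precision of 0.5 although every NERC decision was correct, while `combined` scores 1.0 because the assumed Demarcator region is exactly right. A less accurate Demarcator would distort the combined figure in the same way.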

D2.3: NERC Evaluation
[Table: Recall, Precision and F-score for HNERC, INERC and ENERC, by entity type: MANUF, MODEL, PROC, SOFT_OS]

WP2 Tasks
Finish D2.3.
Finalise the DTD for the 2nd domain and start annotating as soon as the 2nd domain corpus is ready.
Build and evaluate NERC v.3, which deals with both domains.