SWG Strategy (C) Copyright IBM Corp. 2006, 2012. All Rights Reserved. International Technology Alliance Programme: Fact Extraction using a Controlled Natural Language

Presentation transcript:

SWG Strategy (C) Copyright IBM Corp. 2006, All Rights Reserved.
International Technology Alliance Programme: Fact Extraction using a Controlled Natural Language
David Mott, Dave Braines, ETS, Hursley, IBM UK

SWG Strategy – Emerging Technology Services, Hursley (C) Copyright IBM Corp. 2006, All Rights Reserved.
Team
Dave Braines, David Mott
–IBM, Hursley
Steve Poteet, Ping Xue, Anne Kao
–Boeing, Seattle
Paul Smart, Antonio Penta, Ron Tasker
–University of Southampton

International Technology Alliance (ITA) in network and information sciences
How can coalition operations be assisted by networks of computer systems?
US/UK Academic/Industry collaboration
10 year programme ending in May 2016
–Sponsored by UK MOD and US ARL
–Research must be scientific, fundamental, reviewed by academic peers, and published

ITA Consortium Members

Fundamental Research Issues
How do we assist people to create and use applications that reason?
–Modelling concepts, relationships and rules of inference
–Grasping the basic logic of the model and rules
–Understanding the reasoning performed by others
–Sharing understanding across the human team
–Sharing reasoning and artefacts across different systems

Supporting the "analyst"
[Diagram. Labels: doc27, CE Facts, Inference, Rationale, Argumentation, Query, Analysts, Conceptual Model, Assumptions, Uncertainty, CNL Tools, NLP, Requirements, Product]

Analyst's "Conceptual Model"
Analyst represents specialist knowledge as concepts, facts and rules for inference
–a conceptual model
–a common set of concepts
The system must "understand" the conceptual model
–assist analyst to search for patterns, deduce information
A language to build the conceptual model
–analyst: easy to understand
–system: readable, unambiguous and formal
We use Controlled English to express the model

Controlled English
A Controlled Natural Language, being a subset of English
–limited syntax, but still readable as English
–meanings of the expressions unambiguously defined
Avoids the complexity of a real Natural Language
–computer systems can read, interpret and apply it
Retains the appearance of a real language
–humans can naturally use it, without learning "computer speak"
The analyst may use Controlled English to construct their Conceptual Model
Example: the person John is married to the person Jane and has red as hair colour.
Based on work by John Sowa
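The example sentence on this slide can be read mechanically into facts. The following is a minimal sketch in Python, assuming a tiny hypothetical subset of CE (one subject, clauses joined by "and", and only two clause shapes). The patterns and the function name `parse_ce_sentence` are illustrative assumptions, not the CE grammar or any tool from the project.

```python
import re

# Hypothetical, heavily simplified patterns; real CE is much richer than this.
SUBJECT = re.compile(r"^the (\w+) (\w+) (.*)\.$")
HAS_AS = re.compile(r"^has (\w+) as (.+)$")
RELATION = re.compile(r"^(.+?) the (\w+) (\w+)$")


def parse_ce_sentence(sentence):
    """Turn one simplified CE sentence into (subject, relation, object) facts."""
    match = SUBJECT.match(sentence.strip())
    if match is None:
        raise ValueError("not in the supported CE subset: " + sentence)
    concept, name, rest = match.groups()
    facts = [(name, "is a", concept)]
    for clause in rest.split(" and "):
        clause = clause.strip()
        attribute = HAS_AS.match(clause)
        if attribute:
            # "has red as hair colour" gives (John, hair colour, red)
            value, prop = attribute.groups()
            facts.append((name, prop, value))
            continue
        relation = RELATION.match(clause)
        if relation is None:
            raise ValueError("unsupported clause: " + clause)
        # "is married to the person Jane" gives (John, is married to, Jane)
        verb, obj_concept, obj_name = relation.groups()
        facts.append((obj_name, "is a", obj_concept))
        facts.append((name, verb, obj_name))
    return facts


facts = parse_ce_sentence(
    "the person John is married to the person Jane and has red as hair colour."
)
```

Because the syntax is limited and each clause shape has one fixed meaning, the sentence yields the same facts every time it is read.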

CE for Reasoning
CE used to define:
–"propositions", facts, assumptions
–logical rules
–queries
–meta model of concepts
Inference engines constructed to apply logical rules
–Specific Prolog implementations
–CE Store based on Java and SQL
Rationale may be constructed:
–presented to users for hybrid man/machine reasoning
–to determine dependencies
Formal semantics for CE
–(partially defined) in FOPL
Applications
–analysis of information
–societal and open government data
–planning and resource allocation
–(in progress) NLP

Fact Extraction using Controlled Natural Language
As the target of the NL processing
–facts in documents can be used for further reasoning
As a means of describing the NL processing
–to share understanding of the linguistic processing
–to help configure NL tooling

Controlled English is "Curiously Useful" – Why?
Perhaps because humans are naturally good at using language to model, understand and reason
We can build upon "literary devices" already developed to solve problems in expressing knowledge

Conceptual Model(s)
Meta Model: Concept, Entity Concept, Relation Concept, Conceptual Model; relations: belongs to, has as domain
Semiotic Triangle: Thing, Meaning, Symbol; relations: stands for, expresses
General: Agent, Spatial Entity, Temporal Entity, Situation, Container; relations: has as agent role, is contained in
Linguistic: Sentence, Phrase, Word, Noun, Linguistic Category, Linguistic Frame; relations: has as dependent, is parsed from
ACM: Place, Church, Person, Village, IED, Facility, ....; relations: is located in
[Diagram: triangle with corners meaning, symbol, thing and edges conceptualises, stands for, expresses]
"Our" Semiotic Triangle, based on the original [Ogden, C. K. and Richards, I. A. (1923). ]

Current NL Processing
[Diagram. Labels: SYNCOIN Reports, Message PreProcessor, Stanford Parser, Entity Extractor, Situation Extractor, Names, CE Aggregator, CEStore, "Stylistic" CE, Conceptual Model (concepts, logical rules, linguistic expression), Proper Nouns (places, units), For Analysis]
Our focus is on the semantics of the conceptual model

General Semantics: Containers
Example: "the patrol in East Rashid discovers the facility."

if
  ( the prepositional phrase PP has the word '|in|' as head and has the noun phrase NP2 as object ) and
  ( the noun phrase NP2 stands for the thing T2 )
then
  ( the thing T2 is a container ).

if
  ( the noun phrase NP1 stands for the thing T1 and has the prepositional phrase PP as dependent ) and
  ( the prepositional phrase PP has the word '|in|' as head and has the noun phrase NP2 as object ) and
  ( the noun phrase NP2 stands for the container T2 )
then
  ( the thing T1 is contained in the container T2 ).

[Diagram of the example. Nodes: the noun phrase np1, the prepositional phrase pp1, the word |in|, the noun phrase np2, the thing t1, the thing t2, container. Links: has as dependent, has as head, has as object, stands for, is a, is contained in]
Least Commitment approach – don't say what sort of container
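The two container rules can be tried out on the parse of the example sentence. This is a rough sketch in Python, assuming the parse is available as (subject, relation, object) triples. The triple layout, the helper names and the function `infer_containers` are assumptions made for illustration and are not the project's Prolog or CE Store implementation. Type checks such as "is a noun phrase" are left out for brevity.

```python
def objects(facts, subject, relation):
    """All objects o such that (subject, relation, o) is a known fact."""
    return {o for s, r, o in facts if s == subject and r == relation}


def subjects(facts, relation, obj):
    """All subjects s such that (s, relation, obj) is a known fact."""
    return {s for s, r, o in facts if r == relation and o == obj}


def infer_containers(facts):
    """Apply the two container rules once and return the newly inferred facts."""
    facts = set(facts)
    inferred = set()
    for pp in subjects(facts, "has as head", "|in|"):
        for np2 in objects(facts, pp, "has as object"):
            for t2 in objects(facts, np2, "stands for"):
                # Rule 1: the object of an |in| phrase stands for a container.
                inferred.add((t2, "is a", "container"))
                # Rule 2: it needs T2 to be a container, which rule 1 has just
                # established, so it is applied straight away here.
                for np1 in subjects(facts, "has as dependent", pp):
                    for t1 in objects(facts, np1, "stands for"):
                        inferred.add((t1, "is contained in", t2))
    return inferred


# Facts for "the patrol in East Rashid discovers the facility."
parse_facts = {
    ("np1", "has as dependent", "pp1"),
    ("pp1", "has as head", "|in|"),
    ("pp1", "has as object", "np2"),
    ("np1", "stands for", "t1"),
    ("np2", "stands for", "t2"),
}
container_facts = infer_containers(parse_facts)
```

Nothing in the result says what kind of container t2 is, which matches the least commitment approach on the slide.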

Specific Semantics: Entities from Noun Phrases
Example: "the patrol in East Rashid discovers the facility."

if
  ( the noun phrase NP has the noun N as head and stands for the thing T ) and
  ( the noun N expresses the entity concept EC )
then
  ( the thing T realises the entity concept EC ).

[Diagram of the example. Nodes: the noun phrase np1, the noun |patrol|, the thing s1, the entity concept 'patrol unit', patrol unit, Analyst's helper. Links: has as head, stands for, expresses, realises, is a]
Requires "expresses" link between words and concepts
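The rule on this slide depends on a lookup from words to entity concepts. A minimal sketch in Python follows, assuming the "expresses" links are held in a plain dictionary and the parse is a set of triples. Both representations and the name `realised_concepts` are assumptions for illustration only.

```python
def realised_concepts(facts, expresses):
    """Infer (thing, 'realises', concept) from noun phrase heads.

    facts: (subject, relation, object) triples from the parse.
    expresses: hypothetical lexicon mapping a noun to an entity concept.
    """
    inferred = set()
    for noun_phrase, relation, noun in facts:
        if relation != "has as head" or noun not in expresses:
            continue
        for subject, relation_b, thing in facts:
            if subject == noun_phrase and relation_b == "stands for":
                inferred.add((thing, "realises", expresses[noun]))
    return inferred


np_facts = {
    ("np1", "has as head", "|patrol|"),
    ("np1", "stands for", "s1"),
}
expresses_links = {"|patrol|": "patrol unit"}
entity_facts = realised_concepts(np_facts, expresses_links)
```

With an empty lexicon the function infers nothing, which is why the slide notes that the rule requires the "expresses" link.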

"Analyst's Helper"
[Diagram. Labels: Analyst Helper, NL parser, "expresses", conceptual model, Proper Names, wordnet/etc, meta information, ITAnet, MetaModel generator, gazetteers etc, Analyst, translate, semantic rules]
CE shown to the analyst: the word |xxx| is an unrecognised word
CE supplied by the analyst: the word |www| expresses the concept yyy
Only the analyst knows what the concepts mean
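The exchange between the helper and the analyst can be sketched as two small functions. This Python sketch assumes the "expresses" links live in a dictionary and reuses the sentence templates shown on the slide. The function names and the sample words are hypothetical and do not describe the actual Analyst's Helper tool.

```python
def unrecognised_word_sentences(words, expresses):
    """CE-style sentences for words that have no 'expresses' link yet."""
    return [
        "the word |%s| is an unrecognised word." % word
        for word in words
        if word not in expresses
    ]


def add_expresses_link(expresses, word, concept):
    """Record the analyst's decision and return it as a CE-style sentence."""
    expresses[word] = concept
    return "the word |%s| expresses the concept %s." % (word, concept)


helper_lexicon = {"patrol": "patrol unit"}
helper_words = ["patrol", "facility"]

before = unrecognised_word_sentences(helper_words, helper_lexicon)
answer = add_expresses_link(helper_lexicon, "facility", "facility")
after = unrecognised_word_sentences(helper_words, helper_lexicon)
```

The helper only reports gaps. Choosing the concept stays with the analyst, in line with the closing remark on the slide.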

Current question
How should the "expresses" link be made more expressive?
–conditional rules to handle ambiguous words
–selectional constraints based on semantics of models?
–introduce verbnet, etc?
–...

The ambiguity barrier
We start from basic CE and move towards full English
Can we control the crossing of the ambiguity barrier?
CE needs to be enhanced
[Diagram. Labels: Basic CE, anaphoric reference, sub clauses, prepositional phrases, flexible identities, verb inflections, domain specific syntax, Ambiguity, Ambiguity Barrier, Full English]

"Identical" NL and CNL parsers
Increase stylistic expressibility of CE
Better understanding of linguistics
[Diagram. Labels: NL Parser, CNL Parser, lexicon, conceptual model, Reference English Grammar, Semantic Theory, stylistically expressive CE, basic CE or predicate logic or CE-in-Java, NLP]

Linguistic Frame for semantics

there is a linguistic frame named vp0 that
  has 'is the dog Fido' as example and
  defines the verb phrase VP_vp0 and
  has the sequence ( the copula BE_vp0, and the noun phrase OBJ_vp0 ) as syntactic pattern and
  is predicated on the thing T and
  has the statement that
    ( the noun phrase OBJ_vp0 is predicated on the thing OBJ ) and
    ( the thing T is the same as the thing OBJ )
  as semantic statement.

the word |is| belongs to the linguistic category 'copula'.
the word |dog| is a noun.
the entity concept ce:Dog is expressed by the word |dog| and has 'dog' as concept term.

[Diagram. Labels: Analyst's Conceptual Model, Linguistic Model; syntax: copula, noun phrase, verb phrase over "is the dog fido"; semantics: v(OBJ), dog(OBJ).. v(T) T=OBJ,...]
We want exactly the same logic here as in the real NL processing
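One way to picture how such a frame pairs a syntactic pattern with a semantic statement is the Python sketch below. It is an illustration under strong assumptions: the `LinguisticFrame` class, the fixed "the <noun> <name>" shape for the noun phrase, and the use of the name as the identifier for OBJ are all hypothetical simplifications, and the slide does not show how the frame is actually executed.

```python
from dataclasses import dataclass


@dataclass
class LinguisticFrame:
    name: str
    example: str
    syntactic_pattern: tuple  # sequence of linguistic category names


# Mirrors the CE definition of vp0: a copula followed by a noun phrase.
VP0 = LinguisticFrame(
    name="vp0",
    example="is the dog Fido",
    syntactic_pattern=("copula", "noun phrase"),
)

# Hypothetical stand-ins for the lexical CE sentences on the slide.
CATEGORIES = {"is": "copula"}
NOUN_CONCEPTS = {"dog": "ce:Dog"}


def apply_copula_frame(frame, words, subject):
    """Return semantic literals for 'copula + noun phrase', or None if no match."""
    if frame.syntactic_pattern != ("copula", "noun phrase"):
        raise ValueError("this sketch only handles the copula frame")
    if len(words) != 4 or CATEGORIES.get(words[0]) != "copula":
        return None
    determiner, noun, name = words[1:]
    if determiner != "the" or noun not in NOUN_CONCEPTS:
        return None
    obj = name  # assumption: the name identifies the thing OBJ
    return [
        (NOUN_CONCEPTS[noun], obj),  # the noun phrase is predicated on OBJ
        ("same as", subject, obj),   # the thing T is the same as the thing OBJ
    ]


literals = apply_copula_frame(VP0, VP0.example.split(), "T")
```

The two literals correspond to the two parts of the frame's semantic statement: the noun phrase contributes the concept for OBJ, and the copula contributes the identity between T and OBJ.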

Could we?
use LKB instead of the Stanford Parser?
use the ERG instead of WordNet etc?
–where does the Analyst's Helper fit in?
improve our linguistic model to take account of LKB semantic theory?
represent MRS in CE?
represent linguistic rules in CE?