SWG Strategy (C) Copyright IBM Corp. 2006, 2011. All Rights Reserved. v1 ACITA 2011 demonstration of ongoing NLP work Dave Braines, David Mott, ETS, Hursley,

1 ACITA 2011 demonstration of ongoing NLP work
Dave Braines, David Mott, ETS, Hursley, IBM UK
Steve Poteet, Boeing
SWG Strategy (C) Copyright IBM Corp. 2006, 2011. All Rights Reserved. v1

2 Supporting the analyst
[Diagram labels: doc27; CE Facts; Inference; Rationale; Argumentation; Search; Analyst's Conceptual Model; Assumptions; Uncertainty; CE Tools; NLP; Requirements; Product]
SWG Strategy – Emerging Technology Services, Hursley (C) Copyright IBM Corp. 2006, 2011. All Rights Reserved.

3 Controlled English
A Controlled Natural Language, being a subset of English
– limited syntax, but still readable as English
– meanings of the expressions unambiguously defined
Avoids the complexity of a real Natural Language
– computer systems can read, interpret and apply it
Retains the appearance of a real language
– humans can naturally use it, without learning "computer speak"
The analyst may use Controlled English to construct their Conceptual Model:
the person John is married to the person Jane and has red as hair colour.
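As a rough illustration of why a controlled syntax is easy for a computer to read, the sketch below pulls facts out of the example sentence on this slide. It is not the CE grammar or any IBM tool: the function name `parse_ce`, the triple format, and the two regular expressions are assumptions that cover only this one sentence shape.

```python
import re

SENTENCE = "the person John is married to the person Jane and has red as hair colour."


def parse_ce(sentence):
    """Extract (subject, relation, value) triples from one CE-style sentence.

    Hypothetical helper: handles only 'the <concept> <name> <clause> and <clause>'.
    """
    body = sentence.rstrip(".")
    concept, name, rest = re.match(r"the (\w+) (\w+) (.*)", body).groups()
    facts = [(name, "is a", concept)]
    for clause in rest.split(" and "):
        # Relationship clause: '<verb phrase> the <concept> <name>'
        relation = re.fullmatch(r"(.+) the (\w+) (\w+)", clause)
        # Attribute clause: 'has <value> as <property>'
        attribute = re.fullmatch(r"has (\w+) as (.+)", clause)
        if relation:
            verb, obj_concept, obj_name = relation.groups()
            facts.append((obj_name, "is a", obj_concept))
            facts.append((name, verb, obj_name))
        elif attribute:
            value, prop = attribute.groups()
            facts.append((name, prop, value))
    return facts


facts = parse_ce(SENTENCE)
for fact in facts:
    print(fact)
```

Because every clause follows one of a small number of fixed patterns, each pattern maps to exactly one kind of fact, which is the sense in which the meanings are "unambiguously defined".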

4 Current NLP Research Objectives
Improve Natural Language Processing of facts from documents
– analyst may utilise more information when inferencing
Allow humans to be part of the NL processing
– hybrid reasoning about ambiguities, incomplete parsing, etc.
Facilitate configuration of NLP tools in CE
Define a model of linguistics, grammar, semantics
Improve expressibility of CE
– much interest, but needs a more powerful grammar
How is the Analyst's Conceptual Model related to Natural Language?

5 We have used CE to model: [5]
Collaborative Planning
Analysis of IED activities and societal influences
Matching Sensors to Missions
Provenance
Social Networks (Twitter)
UK Government data (crimes, accidents, schools)
NL processing itself

6 Our design principles for CE enhancement
Retain existing principles of a CE conceptual model
Based on full English grammar
Chart parser for efficient syntax parsing
Formal semantics, based upon scientific theory
Higher level extensions handled in same theory
Parser configurable in CE, based on linguistic model
Modelling of Sentence Context
Aim to significantly enhance CE expressivity

7 Parallel NL and CNL parsers
[Diagram labels: NL Parser; CNL Parser; lexicon; conceptual model; Reference English Grammar; Semantic Theory; NLP; expressive CE; basic CE or predicate logic]
Increase expressibility of CE
Better understanding of linguistics

8 Control of ambiguity
We start from basic CE and move towards full English
How do we handle crossing the ambiguity barrier?
[Diagram labels: Basic CE; anaphoric reference; sub clauses; prepositional phrases; flexible identities; verb inflections; domain specific syntax; Ambiguity; Ambiguity Barrier; Full English]

9 Stanford parser as reference
But only provides syntax, what about semantics?
there is a person named Joe.
[Diagram labels: Stanford; CE parser]

10 Extended CE Parser
Full English Syntax
[Parse tree for "there is a person named Joe"; node labels: S; NP; VP; EX there; DT a; NN person; VBZ is; VBN named; NNP Joe; annotations: @copula; @be; @postmodifier; @nonfinite]
Semantics (based on Montague semantics)
person(Joe)
v(A), A=Joe, person(A)
v(A), A=Joe
exists(A)
v(A), person(A)

11 Linguistic Frame
there is a linguistic frame named vp0 that
  has 'is the dog Fido' as example and
  defines the verb phrase VP_vp0 and
  has the sequence ( the copula BE_vp0, and the noun phrase OBJ_vp0 ) as syntactic pattern and
  is predicated on the thing T and
  has the statement that
    ( the noun phrase OBJ_vp0 is predicated on the thing OBJ ) and
    ( the thing T is the same as the thing OBJ )
  as semantic statement.
the word |is| belongs to the linguistic category 'copula'.
the word |dog| is a noun.
the entity concept ce:Dog is expressed by the word |dog| and has 'dog' as concept term.
[Diagram labels: semantics; syntax; copula; noun phrase; verb phrase; is the dog fido; v(OBJ), dog(OBJ)..; v(T) T=OBJ,...; Analyst's Conceptual Model; Linguistic Model]
Makes explicit a semantic theory

12 Allowing analyst to define how words express concepts
[Diagram labels: Analyst Helper; Conceptual Model; wordnet; itanet; Entity Extractor; Stanford parser; Document]
the concept C has the same meaning as the synset S.
the noun phrase NP has the word W as head/modifier and stands for the thing T.
the thing T is categorised as the concept C.

13 Mapping CE concepts to words via WordNet synsets
[Diagram labels: meaning; synset; concept; word sense; word; lexicographer; analyst; meeting of minds; {tank, armoured combat vehicle}; armoured combat vehicle/1; tank/1; armoured combat vehicle; tank]
conceptualise a ~ tank ~ T.
the synset {tank, armoured combat vehicle} means the same as the concept tank.
the synset {tank, armoured combat vehicle} has the word sense tank/1 as component.
the word |tank| expresses the concept tank.
The Analyst STILL has to decide the lexical relations, since only he knows what his concept is

14 CE rules to use WordNet to relate words to concepts
if
  ( the synset S means the same as the concept C ) and
  ( the synset S has the word sense WS as component ) and
  ( the word sense WS has the word W as word )
then
  ( the word W expresses the concept C )
Analyst provides the link between his meaning and a standard meaning
Now the parser can link words to concepts
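The rule on this slide is a three-way join over fact tables, and a minimal sketch of that join may make the data flow easier to follow. The function name, the set-of-tuples representation and the toy data (taken from the tank example on slide 13) are illustrative assumptions, not the CE Store implementation.

```python
def words_expressing_concepts(means_same_as, has_component, has_word):
    """Derive (word, concept) pairs.

    means_same_as: set of (synset, concept)      -- supplied by the analyst
    has_component: set of (synset, word_sense)   -- from WordNet
    has_word:      set of (word_sense, word)     -- from WordNet
    """
    derived = set()
    for synset, concept in means_same_as:
        for other_synset, word_sense in has_component:
            if other_synset != synset:
                continue
            for other_sense, word in has_word:
                if other_sense == word_sense:
                    derived.add((word, concept))
    return derived


# Toy facts, assumed for illustration only.
TANK_SYNSET = "{tank, armoured combat vehicle}"
means_same_as = {(TANK_SYNSET, "tank")}
has_component = {
    (TANK_SYNSET, "tank/1"),
    (TANK_SYNSET, "armoured combat vehicle/1"),
}
has_word = {
    ("tank/1", "tank"),
    ("armoured combat vehicle/1", "armoured combat vehicle"),
}

expresses = words_expressing_concepts(means_same_as, has_component, has_word)
print(sorted(expresses))
```

The analyst contributes only the single `means_same_as` fact; every word in the synset is then linked to the concept by inference.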

15 Rationale for entity extraction
[Rationale graph; source labels: Stanford Parser; wordnet; Document; Entity Extractor; Analyst Helper; Wordnet Inference; (General Semantics)]
the concept C has the same meaning as the synset S.
the noun phrase NP has the word W as head/modifier
the word sense WS adds meaning to the wordnet synset S.
the thing T is categorised as the concept C
the noun phrase NP stands for the thing T.
the word W expresses the concept C.
the word W expresses the word sense WS
the word sense WS adds meaning to the ita synset S.
the word W expresses the word sense WS
there is an ita synset named S.

16 Hierarchy of linguistic frames
[Diagram labels: predicate CE; linguistic CE; domain CE; specialist CE (each with semantics and syntax); Predicate Logic]
the person John attends the meeting X. the person Jane attends the meeting X.
there is a situation X that is categorised as the concept meeting and has the person John as agent role and has the person Jane as patient role.
John attends a meeting with Jane.
the formula f3 has the statement that ( there is a meeting situation [123] that has the person Jane as patient role and has the person John as agent role ) as semantic expression

17 Combining Linguistic and Analytic Rationale
A fact extracted by a parser may lead to conclusions via the analyst's reasoning
– may include assumptions and uncertainty
The extraction of the fact may itself include assumptions and uncertainty
The total rationale graph of linguistic and analyst's reasoning shows all sources of uncertainty
– removing a linguistic assumption may lead to no support for the analyst's conclusions
Argumentation may need to occur at both the linguistic and analytic level
– but different skills (and people) needed for the different levels

18 CE Store and Agents
[Diagram labels: CE Store; pre-processing; Analyst's Model; Documents, Reports; Analysis product; dialog context; grammar parsing1; grammar parsing2; semantic1; semantic2; semantic3; semanticN; analyst's inference; semantic models; Metadata structure; Entities and relations; Lexicon/Grammar rules; Parses; Rules]

19 Extractor/Anaphor Agent
[Diagram labels: CE Store; Analyst's Model; Stanford Parser; Entity Extraction; Entities and "same as" relations; Parse Tree; Rules; SYNCOIN sentences; Anaphor Resolution; Java Agent; Linguistic Model; Linguistically Identified]
Stanford Parser reads SYNCOIN data and generates parse trees
Anaphor/Extractor Agent reads parse information and uses rules + models to:
– turn noun phrases into entities ("market")
– link noun phrases that are anaphoric references ("he")

20 Sample Entity Extraction Rules in CE
if
  ( the noun phrase NP stands for the thing T and has the noun N as head ) and
  ( the noun N expresses the concept C )
then
  ( the thing T is categorised as the concept C ).
if
  ( the noun phrase NP stands for the thing T and has the adjective A as modifier ) and
  ( the adjective A expresses the concept C )
then
  ( the thing T is categorised as the concept C ).
if
  ( the noun phrase NP stands for the thing T and has the personal pronoun |he| as head )
then
  ( the thing T is a man ).
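The three extraction rules on this slide can be sketched as one pass over noun phrases. This is only an illustration under assumptions: the `NounPhrase` class, its field names, the `expresses` lookup table and the sample phrases are hypothetical, and the real agent works over CE facts in the CE Store rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class NounPhrase:
    thing: str                  # identifier of the thing the phrase stands for
    head: str                   # head word of the phrase
    head_pos: str = "noun"      # "noun" or "pronoun" (assumed tag names)
    modifiers: list = field(default_factory=list)  # adjective modifiers


def categorise(noun_phrases, expresses):
    """Return {thing: set of concepts} using the three rules from the slide."""
    categories = {}
    for np in noun_phrases:
        found = categories.setdefault(np.thing, set())
        # Rule 1: the head noun expresses a concept.
        if np.head_pos == "noun" and np.head in expresses:
            found.add(expresses[np.head])
        # Rule 2: an adjective modifier expresses a concept.
        for adjective in np.modifiers:
            if adjective in expresses:
                found.add(expresses[adjective])
        # Rule 3: the personal pronoun |he| as head implies a man.
        if np.head_pos == "pronoun" and np.head == "he":
            found.add("man")
    return categories


# Sample data, assumed for illustration.
phrases = [
    NounPhrase(thing="t1", head="market", modifiers=["crowded"]),
    NounPhrase(thing="t2", head="he", head_pos="pronoun"),
]
expresses = {"market": "market", "crowded": "crowded place"}

categories = categorise(phrases, expresses)
print(categories)
```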

21 Simplistic Anaphor Rules in CE
if
  ( the noun phrase NP has the personal pronoun PRP as head )
then
  ( the noun phrase NP is an anaphor ).
if
  ( the noun phrase NPA is an anaphor ) and
  ( the noun phrase NPA follows the noun phrase NP ) and
  ( the noun phrase NP stands for the man T ) and
  ( the noun phrase NPA stands for the man TA )
then
  ( the noun phrase NPA is coreferent with the noun phrase NP ).
if
  ( the noun phrase NP1 is coreferent with the noun phrase NP2 ) and
  ( the noun phrase NP1 stands for the thing T1 ) and
  ( the noun phrase NP2 stands for the thing T2 )
then
  ( the thing T1 is the same as the thing T2 ).
Needs many more rules with selection constraints on the target NP
Needs to handle more categories
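A minimal sketch of how these three anaphor rules chain together is given below. The `NP` class, its fields, and the reading of "follows" as "appears at a later position" are assumptions made for illustration; they are not taken from the agent's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    position: int          # order of appearance in the text (assumed meaning of "follows")
    thing: str             # identifier of the thing the phrase stands for
    head: str              # head word
    head_is_pronoun: bool  # True if the head is a personal pronoun
    is_man: bool           # True if the thing is already categorised as a man


def same_as_pairs(noun_phrases):
    """Return (anaphor thing, antecedent thing) pairs derived by the rules."""
    pairs = set()
    for npa in noun_phrases:
        # Rule 1: a noun phrase headed by a personal pronoun is an anaphor.
        if not npa.head_is_pronoun:
            continue
        for np in noun_phrases:
            # Rule 2: the anaphor follows NP and both stand for a man.
            if npa.position > np.position and np.is_man and npa.is_man:
                # Rule 3: coreferent phrases stand for the same thing.
                pairs.add((npa.thing, np.thing))
    return pairs


# Sample phrases, assumed for illustration: "the man ... the market ... he".
noun_phrases = [
    NP(position=0, thing="t1", head="man", head_is_pronoun=False, is_man=True),
    NP(position=1, thing="t2", head="market", head_is_pronoun=False, is_man=False),
    NP(position=2, thing="t3", head="he", head_is_pronoun=True, is_man=True),
]

pairs = same_as_pairs(noun_phrases)
print(pairs)
```

With two earlier men in the text, the pronoun would be linked to both, which illustrates why the slide calls for selection constraints on the target noun phrase.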

22 Extended CE Parser Agent
[Diagram labels: CE Store; CE parser; CE semantics; semantic statement; Entities; Lexicon; SYNCOIN sentences; Grammar pattern; Linguistic Frame; mapping to concepts; Predicate Logic Model; SYNCOIN Model; Java Agent; Analyst's Model]
CE Parser agent reads SYNCOIN data and runs simple CE linguistic frames
Agent extracts "best" parse, turns it into low level CE
This is simple entity extraction when the noun phrase is at the start ("the man...")

23 Extended CE Parser
[Diagram labels: Chart Parser; Phrase structure grammar; lexical categories; annotations; lexicon of words, categories and syntactic features; Semantic processor; Semantic representation and combination; lock-step; Parse Trees; Logical Representation; Documents, Reports; CE; mapping to concepts; semantic statement (1-1); syntactic pattern; linguistic frame; Linguistic Model; Analyst's Conceptual Model; Predicate Logic Model]
Mapping assumes simple 1-1 word to concept

24 CE fact extraction framework
SYNCOIN sentence as parsed by Stanford Parser + CE semantic extraction rules
SYNCOIN sentence as parsed by CE Parser + CE semantic extraction rules
Basic syntactic parse tree information from Stanford Parser
Basic syntactic parse tree information from CE Parser
Semantic information more general than the ACM
Semantic information added from Analyst's Conceptual Model
CE facts extracted from sentence

25 Applying rules to find entities

26 Prepositional phrase "in" as a container

27 Backup

28 Using WordNet to extend the linguistic mappings
[Diagram labels: meaning; synset; concept; word sense; word; lexicographer; analyst; meeting of minds; {tank, armoured combat vehicle}; armoured combat vehicle/1; tank/1; armoured combat vehicle; tank; {military vehicle}; military vehicle]
conceptualise a ~ tank ~ T.
the synset {tank, armoured combat vehicle} means the same as the concept tank.
the synset {tank, armoured combat vehicle} has the word sense tank/1 as component.
the synset {tank, armoured combat vehicle} is a hyponym of the synset {military vehicle}.
the synset {military vehicle} means the same as the concept tank.
the word |military vehicle| expresses the concept tank.

29 CE rules to use WordNet to extend word-to-concept relations
if
  ( the synset S means the same as the concept C ) and
  ( the synset S is a hyponym of the synset Super )
then
  ( the synset Super means the same as the concept C ).
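The hyponym rule can be sketched as a small forward-chaining loop. The function name and the set-of-tuples representation are assumptions for illustration, and the data is the tank example from slide 28. The loop runs until no new facts appear, which is how a forward-chaining rule engine would typically apply such a rule, though the slides do not say how the CE Store evaluates it.

```python
def extend_meanings(means_same_as, hyponym_of):
    """Apply the hyponym rule repeatedly until nothing new is derived.

    means_same_as: set of (synset, concept)
    hyponym_of:    set of (synset, super_synset)
    """
    derived = set(means_same_as)
    changed = True
    while changed:
        changed = False
        for synset, concept in list(derived):
            for sub, sup in hyponym_of:
                if sub == synset and (sup, concept) not in derived:
                    derived.add((sup, concept))
                    changed = True
    return derived


# Toy facts, assumed for illustration only.
TANK_SYNSET = "{tank, armoured combat vehicle}"
VEHICLE_SYNSET = "{military vehicle}"
means_same_as = {(TANK_SYNSET, "tank")}
hyponym_of = {(TANK_SYNSET, VEHICLE_SYNSET)}

extended = extend_meanings(means_same_as, hyponym_of)
print(sorted(extended))
```

Applied this way, the rule carries the concept up every hypernym in the chain, so in practice it would likely need a limit on how far the generalisation is allowed to go.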

