Automated Question Answering
Motivation: support for students Demand is for 365 x 24 support – Students set aside time to complete task – If problem encountered immediate help required Majority of responses direct students to teaching materials; so not a case of “not there” Poor search forums – Search per forum - not course – Free-text search options fixed by RDBMS No explicit operators (AND, OR, NEAR)
Research questions Given the current level of development of natural language processing (NLP) tools, is it possible to: – Classify messages as question/non-question – Identify the topic of the question – Direct users to specific course resources
Natural Language Processing tools Tokenisation (words, numbers, punctuation, whitespace) Sentence detection Part of speech tagging (verbs, nouns, pronouns, etc.) Named entity recognition (names, locations, events, organisations) Chunking/Parsing (noun/verb phrases and relationships) Statistical modelling tools Dictionaries, word-lists, WordNet, VerbNet Corpora tools (Lucene, Lemur)
Question answering solutions Open domain – No restrictions on question topic – Typically answers from web resources – Extensive literature Closed domain – Restricted question topics – Typically answers from small corpus Company documents Structured data
Open domain QA research Well established over two decades TREC (Text REtrieval Conference) – funded by NIST/DARPA since 1992 – QA track 1999 – 2007, directed at ‘Factoids’ CLEF (Cross Language Evaluation Forum) – current – Information Retrieval, language resources NTCIR (NII Test Collection for IR Systems) – 1997 – current – IR, question answering, summarization, extraction
TREC Factoids Given a fact-based question: – How many calories in a Big Mac? – Who was the 16th President of the United States? – Where is the Taj Mahal? Return an exact answer in 50/250 bytes – 540 calories – Abraham Lincoln – Agra, India
Minimal factoid process Question analysis Normalisation (verbs, auxiliaries, modifiers) Identify entities (people, locations, events) Pattern detection (who was X?, how high is Y?) Query creation, expansion, and execution Ordered terms, combined terms, weighted terms Answer analysis Match answer type to question type
OpenEphyra: open source QA Source:
OpenEphyra: question analysis Question ‘who was the fourth president of the USA’ Normalization ‘who be fourth president of USA’ Answer type NEproperName->NEperson Interpretation property: NAME target:fourth president context:USA
OpenEphyra: query expansion
1. "fourth president USA"
2. (fourth OR 4th OR quaternary) president (USA OR US OR U.S.A. OR U.S. OR "United States" OR "United States of America" OR "the States" OR America)
3. "fourth president" "USA" fourth president USA
4. "was fourth president of USA"
5. "fourth president of USA was"
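The synonym-based expansion in query 2 can be sketched in a few lines. This is only an illustration: the `SYNONYMS` table is a hypothetical, hard-coded stand-in holding the alternatives listed on the slide, and the function names are invented here. It does not reproduce how OpenEphyra itself builds its queries.

```python
# Hypothetical synonym table; the entries are the alternatives shown on the slide.
SYNONYMS = {
    "fourth": ["fourth", "4th", "quaternary"],
    "USA": ["USA", "US", "U.S.A.", "U.S.", "United States",
            "United States of America", "the States", "America"],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def or_group(keyword):
    """Return the keyword, or an OR-group of its known alternatives."""
    alternatives = SYNONYMS.get(keyword, [keyword])
    if len(alternatives) == 1:
        return quote(alternatives[0])
    return "(" + " OR ".join(quote(a) for a in alternatives) + ")"


def expand(keywords):
    """Build an expanded query string from the question keywords."""
    return " ".join(or_group(k) for k in keywords)


def phrase_query(keywords):
    """Build the exact-phrase form of the query (as in query 1)."""
    return '"' + " ".join(keywords) + '"'


print(phrase_query(["fourth", "president", "USA"]))
print(expand(["fourth", "president", "USA"]))
```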
OpenEphyra: result answer: James Madison score: docid: Document content: James Madison - 4th President of USA James Madison (March 16, June 28, 1836) was fourth President of the United States ( ), and one of the Founding Fathers of the United States...
Shallow answer selection Answer based on reformulation of question – Who was the fourth president of the United States? – James Madison was the fourth president of the United States Students don’t ask questions and we don’t provide answers!
Importance of named entities Search results tagged with NEs Question processed for NEs Extracted NEs link question and answer Search engine Answer matching
Task list: the real work Create database of forum messages Adapt open source NLP tools – Tokenisation, sentence detection, Parts Of Speech, parsing Establish question patterns Create language analysis tools – Word frequency – Named-entities: define, build, and train models Prepare corpus – Format and tag documents (doc, html, pdf) – Build Indri catalogue and search interface Iterative process: build, test, refine
NLP tools Predominantly Java – Stanford, OpenNLP, Lingpipe – GATE: complete analysis + processing system – IKVM permits use with .NET framework Some C++, C# – WordNet, Lemur/Indri, Nooj, SharpNLP Python NLTK – Complete NLP toolset and corpus Lisp, Prolog
Message database MySQL database for FirstClass messages Extract: – Forum, Subject, Date, Author – Body Use subject to classify as Original or Reply No clean-up or filtering of message content undertaken at this stage
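The slide does not say how the subject line is used to separate originals from replies. One plausible rule, shown here purely as an assumption, is that replies carry a "Re:" or "Re(n):" prefix. The pattern and function name below are illustrative and not taken from the project.

```python
import re

# Assumption: a reply's subject starts with "Re:" or "Re(2):" (any case).
REPLY_PREFIX = re.compile(r"^\s*re(\(\d+\))?\s*:", re.IGNORECASE)


def classify_subject(subject):
    """Label a message 'Reply' or 'Original' from its subject line alone."""
    return "Reply" if REPLY_PREFIX.match(subject) else "Original"


print(classify_subject("Re(2): Help Please!!!? Urgent"))  # Reply
print(classify_subject("Help Please!!!? Urgent"))         # Original
```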
Raw forum message (Sample 1) Help Please!!!? Urgent T320 09B Eclipse Support I am trying to open an existing project but can't do it. It's driving me mad. I know the project folders are located in the workspaceblock4 folder. I have deleted all the open projects in the project explorer window (without deleting content). BUT how on earth do I know proceed to reload some of the projects without starting from scratch? When I select open file... it doesn't let me open any projects files - only the individual files in the project folder. In other words I cannot get any project files to appear in the project explorer window. Please can anyone help me as I have booked a lot of time off work to concentrate on the project, but I am a dead end.
Raw forum message (Sample 2) Block 4 Practical booklet 6 activity 4- Unable to get a fault! T320 09B Eclipse Support I have followed the set up and altered the fault to "none" and simulation to normal, but I do not get any faults at all or a listing that resembles the list on page 12, particularly line 12. I have attached my bpel file and my screenshot, any help appreciated.
Process bpelEcho3pScope: Instance 1 created.
Process bpelEcho3pScope: Executing [/process]
Process Suspended [/process]
Receive ClientRequestMessage: Executing Scope : Completed normally [/process/flow/scope]
Reply ClientResponseMessage: Executing Reply ClientResponseMessage: Completed normally Process bpelEcho3pScope: Completed normally [/process]
Eclipse console listing or XML
T320 09B database properties
Total messages: 4246
Non-replies: 1051
Manually tagged questions: 777
Average length (lines): 7.9
Containing XML: 17
Containing Eclipse content: 37
Creating question patterns Extract text from forum messages (non-replies) Create n-grams (‘n’ adjacent words) Perform frequency analysis of n-grams Manually review n-grams to create question patterns
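The n-gram step can be sketched as follows. This is a minimal illustration that assumes plain whitespace tokenisation, so punctuation stays attached to words; the function names are invented here and the project's own tokeniser is not shown in the slides.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n adjacent whitespace-separated words in text."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_frequencies(messages, n):
    """Count n-grams across a collection of message bodies."""
    counts = Counter()
    for body in messages:
        counts.update(ngrams(body, n))
    return counts


# Two invented message fragments, for illustration only.
messages = [
    "I get the following error message",
    "I get the following error again",
]
for gram, freq in ngram_frequencies(messages, 5).most_common(3):
    print(freq, gram)
```

The most frequent n-grams are then reviewed by hand, as the slide describes, to decide which ones become question patterns.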
N-gram results (chart: unique patterns by number of words)
5-word frequency analysis (top 20 results)
Frequency  N-gram
17         An unexpected error has occurred.
16         point me in the right
14         I get the following error
13         me in the right direction
12         unexpected error has occurred. UDDIException
9          does not seem to be
8          get the following error message
8          I get an error message
8          system cannot find the path
7          Any help would be appreciated.
7          I am not sure if
7          I can not seem to
7          I do not know what
6          A problem occured while running
6          but I get the following
6          cannot find the path specified
6          error has occurred. UDDIException java.
6          has occurred. UDDIException java. net.
6          I am not sure how
6          I do not seem to
Sliding window across message
Frequency  N-gram 1                        N-gram 2
1          am not that knowledgable Help   I am not that knowledgable
1          am not the early adopter        I am not the early
1          am not thinking straight today  I am not thinking straight
1          am not too far off              I am not too far
1          am not too sure if              I am not too sure
1          am not using the fault          I am not using the
1          am noticing in the console      I am noticing in the
1          am now a while later            I am now a while
1          am now adding my exception      I am now adding my
1          am now getting the following    I am now getting the
1          am now held up again            I am now held up
1          am now not sure if              I am now not sure
1          am now stuck on activity        I am now stuck on
1          am now trying not to            I am now trying not
1          am now trying to start          I am now trying to
1          am now willing to submit        I am now willing to
1          am obviously missing something here
Candidate question patterns
Class name    Pattern
#question     (a|my) question (about|on|for|is)
#appreciate   appreciate (.*) (advice|comment|guidance|help|direction)
#can/could    (can|could|will|would) (any|some)\s?(body|one) (.*) (explain|tell me)
#does         does (any|some)\s?(body|one) (have|know)
#having       (have|having) (.*) (problem|nightmare)s?
#how          how (best|can|does|do i|do you|do we)
#i am         i am not (really )?sure (if|how|what|when|whether|why)
#i cannot     i (can not|cannot|could not) find (.*) answer (.*) question
#just         just wonder(ed|ing)? (if|what)
#point me     point (me|one) (.*) right direction
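Applying such patterns to a message can be sketched as below. The dictionary holds a subset of the candidate patterns from the slide. The case-insensitive matching, the function names, and the rule that a single match is enough to call a message a question are assumptions made for this illustration.

```python
import re

# A subset of the candidate patterns listed on the slide.
QUESTION_PATTERNS = {
    "#question": r"(a|my) question (about|on|for|is)",
    "#appreciate": r"appreciate (.*) (advice|comment|guidance|help|direction)",
    "#does": r"does (any|some)\s?(body|one) (have|know)",
    "#how": r"how (best|can|does|do i|do you|do we)",
    "#i am": r"i am not (really )?sure (if|how|what|when|whether|why)",
    "#point me": r"point (me|one) (.*) right direction",
}

# Assumption: matching ignores case.
COMPILED = {name: re.compile(pattern, re.IGNORECASE)
            for name, pattern in QUESTION_PATTERNS.items()}


def matching_classes(message):
    """Return the class names of all patterns found anywhere in the message."""
    return [name for name, rx in COMPILED.items() if rx.search(message)]


def is_question(message):
    """Assumption: one or more pattern matches marks the message as a question."""
    return bool(matching_classes(message))


print(matching_classes("Can someone point me in the right direction?"))
print(is_question("Thanks, that fixed it."))
```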
Generalisation of patterns using POS
Question part             POS tag
any|some                  DT
advice|comment|guidance   NN
appreciated|welcomed      VB(N|D)../.
POS pattern matching failed due to errors in assigning tags:
Can/MD anyone/NN offer/VB some/DT help/NN ?/.
Can/MD someone/NN offer/VB some/DT help/NN ?/.
Can/MD anybody/RB give/VB some/DT guidance/NN ?/.
Could/MD somebody/RB give/VB some/DT direction/NN ?/.
XML within messages Detected as single sentence
Eclipse console listing within message Line breaks not recognised as end of sentence
Open-source NLP problems Sentence detection failures: – Bad style (capitalisation, punctuation) – Ellipsis (i tried... it failed... error message...) – XML, BPEL segments concatenated to single sentence Tokenisation failures: – Multiple punctuation ???, !!! (student emphasis) – Abbreviations (im, cant, doesnt, etc.) POS errors – Spelling, grammar
Purpose built tools Tokeniser – Re-coded for typical forum content/style Multiple punctuation Abbreviations Common contractions Sentence detector – New detector based on token sequences Pre-filter messages – Remove XML, console listing, error messages
Message pre-filters Short-forms – i’m, im, i m → i am – can’t, cant, can t → can not Line numbers Repeated punctuation (!!!, ???, ...) Smilies Salutations (Hi all, Hiya, etc.) Names, signature, course codes
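A few of these pre-filters can be sketched with regular expressions. The expressions, their order, and the function name are assumptions for illustration; the slides do not show the project's actual filters, and line numbers, names, signatures, and course codes are not handled here.

```python
import re

# Short forms listed on the slide, mapped to their expanded form.
SHORT_FORMS = [
    (re.compile(r"\b(?:i['’]m|im|i m)\b", re.IGNORECASE), "i am"),
    (re.compile(r"\b(?:can['’]t|cant|can t)\b", re.IGNORECASE), "can not"),
]
# Assumption: runs of the same mark (!!!, ???, ...) collapse to a single mark.
REPEATED_PUNCT = re.compile(r"([!?.])\1+")
# Assumption: a small set of text smilies, only when standing alone.
SMILEY = re.compile(r"(?<!\S)[:;]-?[()DPp](?!\S)")
# Assumption: salutations are removed only at the very start of the message.
SALUTATION = re.compile(r"^\s*(?:hi all|hiya|hello|hi)\b[\s,!.]*", re.IGNORECASE)


def prefilter(text):
    """Apply the sketch filters and normalise whitespace."""
    text = SALUTATION.sub("", text)
    text = SMILEY.sub("", text)
    for rx, replacement in SHORT_FORMS:
        text = rx.sub(replacement, text)
    text = REPEATED_PUNCT.sub(r"\1", text)
    return re.sub(r"\s+", " ", text).strip()


print(prefilter("Hi all, im stuck and cant open the project!!! :)"))
```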
Filtered message Raw message containing Eclipse console listing Filtered message ready to process
PRELIMINARY RESULTS: question classification
Message-set properties
Number of messages: 1051 (100%)
Number of questions (M): 777 (73.9%) (100%)
Number of questions (A): 756 (97.3%)
False positives (A not M): 58 (7.4%)
False negatives (M not A): 79 (10.2%)
M = manually annotated question, A = automatically annotated question
Approx. 90% success rate
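The counts above are internally consistent: 777 manual questions minus 79 false negatives plus 58 false positives gives the 756 automatic questions. The sketch below recomputes this and adds precision and recall, which are a framing introduced here rather than figures quoted on the slide. The function name is illustrative.

```python
def classification_metrics(manual, false_pos, false_neg):
    """Derive agreement figures from the manual count and the two error counts."""
    true_pos = manual - false_neg       # manual questions the patterns also found
    automatic = true_pos + false_pos    # everything the patterns flagged
    return {
        "true_positives": true_pos,
        "automatic": automatic,
        "precision": true_pos / automatic,
        "recall": true_pos / manual,
    }


m = classification_metrics(manual=777, false_pos=58, false_neg=79)
print(m["automatic"])              # 756, matching the slide
print(round(m["precision"], 3))    # 0.923
print(round(m["recall"], 3))       # 0.898
```

The recall of roughly 0.90 agrees with the "approx. 90% success rate" quoted above, although the slide does not state which measure that figure refers to.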
Message-set properties – cont.
Average # pattern matches:
Min # pattern matches: 1
Max # pattern matches: 12
Average # of lines (ASCII linefeed): 7.9
Min # lines in a message: 1
Max # lines in a message: 68
Average # of sentences: 5.0
Min # sentences in a message: 1
Max # sentences in a message: 89
Messages containing XML: 17
Messages containing BPEL: 37
Distribution of pattern match count (chart: number of messages by number of pattern matches)
Challenges: false positives
Challenges: false negatives
Challenges: detecting the question
Messages matching question pattern (chart: number of messages by pattern ID)
Common question patterns (10) any (advice|clarification|clue|comment|further thought|guidance|help|hint|idea|opinion|pointer|reason|suggestion|taker)(s)?.* appreciated|welcome|welcomed 216 matches Terms added over time to improve detection of questions
Sample question match (10)
Common question patterns (50) get|getting|gives|got|receive.* error(s)? 102 matches
Sample question match (50)
Discrimination vs Classification (chart: number of messages by pattern ID) Low discrimination >>> Increases successful classification at the risk of false-positives High discrimination >>> Reduces successful classification and risk of false-positives
Does process transfer? Tested against TT380 forums 04J – 07J – Preliminary results look promising – Need to manually tag >4000 messages – Review message pre-filters Need access to Humanities course material
Basic method Identify named entities – NEs are block-specific – Majority of questions linked to assignments Parse sentence for dependencies – Nouns (that are NEs) – Verbs
Named entities: inconsistent usage Message body Message subject Error handling Exception handling
Deep parsing: dependencies
How can I properly delete PLTs and PLs and roles from the project in order to have a clean sheet again.
advmod(delete-5, How-1)
aux(delete-5, can-2)
nsubj(delete-5, I-3)
advmod(delete-5, properly-4)
dobj(delete-5, PLTs-6)
conj_and(PLTs-6, PLs-8)
conj_and(PLTs-6, roles-10)
det(project-13, the-12)
prep_from(delete-5, project-13)
prep_in(delete-5, order-15)
aux(have-17, to-16)
xcomp(delete-5, have-17)
det(sheet-20, a-18)
amod(sheet-20, clean-19)
dobj(have-17, sheet-20)
advmod(have-17, again-21)
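Once a parser has produced dependencies in this textual form, the verb and noun candidates can be pulled out with a small amount of code. The sketch below only reads the printed relations; it does not run a parser. The function names and the rule that `conj_and` partners of a direct object share its verb are assumptions made for this illustration.

```python
import re

# One relation looks like: dobj(delete-5, PLTs-6)
DEP = re.compile(r"(\w+)\((\S+), (\S+)\)")

LISTING = (
    "advmod(delete-5, How-1) aux(delete-5, can-2) nsubj(delete-5, I-3) "
    "advmod(delete-5, properly-4) dobj(delete-5, PLTs-6) conj_and(PLTs-6, PLs-8) "
    "conj_and(PLTs-6, roles-10) det(project-13, the-12) prep_from(delete-5, project-13) "
    "prep_in(delete-5, order-15) aux(have-17, to-16) xcomp(delete-5, have-17) "
    "det(sheet-20, a-18) amod(sheet-20, clean-19) dobj(have-17, sheet-20) "
    "advmod(have-17, again-21)"
)


def parse(text):
    """Return (relation, governor, dependent) triples, tokens kept as 'word-index'."""
    return DEP.findall(text)


def word(token):
    """Strip the position suffix: 'delete-5' -> 'delete'."""
    return token.rsplit("-", 1)[0]


def direct_objects(deps):
    """Map each verb to its direct objects, including conj_and partners."""
    objects = {}
    for rel, gov, dep in deps:
        if rel == "dobj":
            objects.setdefault(gov, []).append(dep)
    for rel, gov, dep in deps:
        if rel == "conj_and":
            for objs in objects.values():
                if gov in objs:
                    objs.append(dep)
    return {word(verb): [word(o) for o in objs] for verb, objs in objects.items()}


print(direct_objects(parse(LISTING)))
```

The extracted nouns could then be checked against the list of named entities described under the basic method.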
Sentences per message (chart: number of messages by sentence count) Sentence counts under-estimated due to spelling/grammar errors. Of the 120 questions detected as a single sentence, more than 80% actually contain multiple sentences.
Guess the topic Excuse me for directing this question at you, but when I try to contact my tutor through my homepage i still go to the details for John Stephenson but I am sure that he is ill at the moment. My question refers to the entities described in ECA part2 page 2, it states that the term identifier must be unique within the UK business domain. I thought Buyers ID and Sellers ID could be their address, however, I am stuck on the Order ID which might refer to a depatch note as I do not know what standard these identifiers have to conform to in UK business. I would appreciate being directed as to where I can find this information.
Current status Unable to establish question topic for 95% of detected questions Current NLP techniques (anaphora and co-reference resolution) for multi-sentence questions not well established.
Pattern matching in console listing
Practical work: exact patterns Process|Assign|Invoke|Scope|Reply.* Completed with fault: invalidVariables|uninitializedVariable|joinFailure Provide direct link to FAQ or teaching materials
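The exact pattern above can be turned into a line-by-line scan of a console listing. In this sketch the grouping of the alternatives, the sample console lines, and the `FAQ_LINKS` placeholders are all assumptions for illustration; no real FAQ locations are known from the slides.

```python
import re

# Activity names and fault names are taken from the slide; the grouping is assumed.
FAULT = re.compile(
    r"^(Process|Assign|Invoke|Scope|Reply)\b.*"
    r"Completed with fault: (invalidVariables|uninitializedVariable|joinFailure)"
)

# Hypothetical placeholders standing in for links to FAQ or teaching materials.
FAQ_LINKS = {
    "invalidVariables": "FAQ entry for invalidVariables (placeholder)",
    "uninitializedVariable": "FAQ entry for uninitializedVariable (placeholder)",
    "joinFailure": "FAQ entry for joinFailure (placeholder)",
}


def find_faults(console_text):
    """Return (activity, fault) pairs for every faulting line in the listing."""
    hits = []
    for line in console_text.splitlines():
        match = FAULT.search(line.strip())
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


# Invented console lines, modelled loosely on the listing format shown earlier.
sample = """Process bpelEcho3pScope: Executing [/process]
Invoke callService: Completed with fault: joinFailure [/process/flow]
Reply ClientResponseMessage: Completed normally"""

for activity, fault in find_faults(sample):
    print(activity, fault, "->", FAQ_LINKS[fault])
```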
Future work Further work on sentence detection – Everything else depends on this Create patterns to identify content – “how do i (.*)” – “are you now saying (.*)” – “(.*) word count” Establish relationships between initial message and replies Build tool to process Eclipse console listings – Could address 5% of all ECA related questions