Watson and the Jeopardy! Challenge Michael Sanchez

Presentation transcript:

IBM DeepQA
Watson and the Jeopardy! Challenge
Michael Sanchez

What is DeepQA?
- QA: Question Answering
  - Systems designed to answer questions posed in natural language
- Goal: create a system capable of playing Jeopardy! at human championship level
  - In real time
- IBM's follow-up project to Deep Blue

Initial Performance
- Baseline: performance of PIQUANT (Practical Intelligent Question Answering Technology) adapted to the Jeopardy! challenge
  - PIQUANT had been in development for several years beforehand
  - Ranked among the top 3-5 systems in TREC (Text Retrieval Conference) QA evaluations
- Winners cloud: % of questions answered vs. % correct for game winners
  - Dark dots = Ken Jennings
- PIQUANT: on the 5% of questions it was most confident in, it was correct less than 50% of the time

DeepQA Architecture
- Major overhaul of the PIQUANT system
- Massively parallel, probabilistic, evidence-based
- Every step is done in multiple different ways, then scored and weighted

Content Acquisition
- Take an initial corpus of documents
  - Unstructured data
  - For Jeopardy!: roughly 400 TB of data, including all of Wikipedia
- Parsed into "Syntactic Frames"
  - Subject-Verb-Object
- Generalized into "Semantic Frames"
  - Each frame has an associated probability
- Frames form a "Semantic Net"
  - Inventors patent inventions (.8)
  - Fluid is a liquid (.6)
  - Liquid is a fluid (.5)
  - Vessels sink (.7)
  - People sink 8-balls (.5; .8 in the context of pool)
- 400 TB -> 20 TB for the Semantic Net
- The Semantic Net is kept for searches
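The semantic net described above can be pictured as a lookup table of subject-verb-object frames, each carrying a probability. The sketch below is only an illustration of that idea using the example frames from the slide; the `SemanticNet` class and its method names are hypothetical and are not taken from DeepQA. Context-dependent probabilities (such as the higher value for "sink 8-balls" in pool) are not modelled.

```python
class SemanticNet:
    """Toy store of (subject, relation, object) frames with a probability.

    Hypothetical structure for illustration; not DeepQA's actual design.
    """

    def __init__(self):
        # subject -> {(relation, object): probability}
        self._frames = {}

    def add(self, subject, relation, obj, probability):
        frames = self._frames.setdefault(subject.lower(), {})
        frames[(relation.lower(), obj.lower())] = probability

    def probability(self, subject, relation, obj):
        """Return the stored probability, or 0.0 for an unknown frame."""
        frames = self._frames.get(subject.lower(), {})
        return frames.get((relation.lower(), obj.lower()), 0.0)

    def about(self, subject):
        """List all frames for a subject, most probable first."""
        frames = self._frames.get(subject.lower(), {})
        return sorted(
            ((rel, obj, p) for (rel, obj), p in frames.items()),
            key=lambda frame: -frame[2],
        )


# Example frames and probabilities copied from the slide.
net = SemanticNet()
net.add("inventors", "patent", "inventions", 0.8)
net.add("fluid", "is a", "liquid", 0.6)
net.add("liquid", "is a", "fluid", 0.5)
net.add("people", "sink", "8-balls", 0.5)

print(net.probability("Fluid", "is a", "liquid"))
print(net.about("people"))
```

Note that the two "is a" frames are stored separately, which matches the slide: the relation is directional, so "fluid is a liquid" and "liquid is a fluid" can carry different probabilities.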

Question Analysis
- Attempt to understand what the question is asking
- Many different approaches are taken
  - The aim is to come up with all possible interpretations of the question
- Question Classification: what type of question is it?
  - Puzzle, math, definition
- Focus/LAT
  - Focus: what the question is asking about
  - LAT (lexical answer type): a single word that determines the type of the answer
- Relation Detection
- Decomposition: break the question into more easily answered subquestions
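To make the idea of a lexical answer type concrete, here is a deliberately naive sketch. It assumes that a clue refers to its answer with a phrase like "this president" or "these rivers" and takes the word after "this"/"these" as the LAT. That heuristic, the function name `guess_lat`, and the sample clues are assumptions made for illustration; the slides do not describe how DeepQA actually detects the LAT.

```python
import re

# Assumed heuristic: the word right after "this"/"these" names the answer type.
_LAT_PATTERN = re.compile(r"\b(?:this|these)\s+([a-z][a-z'-]*)", re.IGNORECASE)


def guess_lat(clue):
    """Return a guessed lexical answer type, or None if the pattern is absent."""
    match = _LAT_PATTERN.search(clue)
    return match.group(1).lower() if match else None


print(guess_lat("This president issued the Emancipation Proclamation"))
print(guess_lat("It's the capital of France"))
```

A single pattern like this breaks easily (an adjective after "this" would be returned instead of the noun), which is one way to see why the slide stresses taking many different approaches and keeping all plausible interpretations.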

Hypothesis Generation
- Primary Search: attempt to retrieve as much answer-bearing content as possible from the sources
  - Various search techniques: multiple text search engines, document search, passage search, etc.
  - Recall vs. precision: gather as many answers as possible, because if the answer isn't here, Watson can't get it!
  - 85% of the time, the correct answer is within the top 250 at this stage
- Candidate Answer Generation: use appropriate techniques to extract answers from the content
  - For example, for a "title-based search", extract the title as the answer
- Filter: lightweight scoring of candidate answers
  - ~100 answers are let through
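The generate-then-filter flow above can be sketched as follows. This is a minimal illustration under stated assumptions: each search engine is assumed to return (title, score) pairs, the title is used directly as the candidate answer, and the lightweight filter simply keeps the highest-scoring candidates. The function names and the sample hits are hypothetical.

```python
import heapq


def generate_candidates(search_results):
    """Pool titles from several searches, keeping each distinct candidate once.

    search_results is a list of hit lists, one per search technique; each hit
    is a (title, score) pair. The best score seen for a title is retained.
    """
    pooled = {}
    for hits in search_results:
        for title, score in hits:
            key = title.lower()
            pooled[key] = max(score, pooled.get(key, 0.0))
    return pooled


def lightweight_filter(candidates, keep=100):
    """Keep only the `keep` best candidates for the expensive scoring stage."""
    return heapq.nlargest(keep, candidates.items(), key=lambda item: item[1])


# Made-up hits from two hypothetical search techniques.
document_hits = [("Abraham Lincoln", 0.9), ("Ulysses S. Grant", 0.4)]
passage_hits = [("abraham lincoln", 0.7), ("Andrew Johnson", 0.5)]

pool = generate_candidates([document_hits, passage_hits])
shortlist = lightweight_filter(pool, keep=2)
print(shortlist)
```

Pooling favors recall (every engine's hits are kept), while the filter restores some precision before the costly evidence scoring begins.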

Hypothesis Scoring
- Retrieve additional evidence supporting each candidate answer that passed filtering
- Score the candidate answers based on the supporting evidence
  - More than 50 different types of scoring methods
  - Examples: Temporal, Geospatial, Popularity, Source Reliability
  - Temporal: time references
  - Geospatial: spatial relationships such as boundaries and relative locations
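One way to picture many independent scorers is a registry of small functions, each returning a score for a candidate, with the results collected into a feature dictionary. The two scorers below, their formulas, and the `page_views` numbers are invented purely for illustration; the slide names Temporal and Popularity scoring but does not say how they are computed.

```python
def temporal_score(candidate, clue):
    """1.0 if the clue's year falls within the candidate's active years."""
    start, end = candidate["active_years"]
    return 1.0 if start <= clue["year"] <= end else 0.0


def popularity_score(candidate, clue):
    """Hypothetical popularity measure, capped at 1.0."""
    return min(candidate["page_views"] / 1_000_000, 1.0)


# Registry of scorers; DeepQA is said to use more than 50 types.
SCORERS = {"temporal": temporal_score, "popularity": popularity_score}


def score_candidate(candidate, clue):
    """Run every registered scorer and collect the results by name."""
    return {name: scorer(candidate, clue) for name, scorer in SCORERS.items()}


clue = {"year": 1863}
lincoln = {"active_years": (1861, 1865), "page_views": 2_500_000}  # made-up views
grant = {"active_years": (1869, 1877), "page_views": 500_000}      # made-up views

print(score_candidate(lincoln, clue))
print(score_candidate(grant, clue))
```

Because each scorer only reads its inputs, new scorers can be added to the registry without touching the others, and all of them can run independently of one another.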

Result Merging
- Identify related answers and combine their scores
  - Example: "Abe Lincoln" and "Honest Abe" are the same or related answers, so their results are merged
- Generate a confidence estimate
  - Indicates how confident the system is in the answer
  - Confidence is based on how good the scores are
  - Probabilistic
- System training is important here
  - Different question types might weigh scores differently
  - Ability to learn dynamically what a category is asking for
- Results are then ranked by confidence
  - Highest confidence = best answer
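A rough sketch of merging and ranking follows. Several choices here are assumptions made for illustration and are not stated in the slides: the alias table, merging by taking the best value of each score, and turning weighted scores into a confidence with a logistic function. In a trained system the weights would be learned; here they are set by hand.

```python
import math

# Hypothetical alias table mapping variants to one canonical answer.
ALIASES = {"honest abe": "abe lincoln"}


def merge_related(scored):
    """Combine feature scores of answers that refer to the same entity."""
    merged = {}
    for answer, features in scored:
        key = ALIASES.get(answer.lower(), answer.lower())
        combined = merged.setdefault(key, {})
        for name, value in features.items():
            # Assumption: keep the strongest evidence seen for each feature.
            combined[name] = max(combined.get(name, 0.0), value)
    return merged


def confidence(features, weights, bias):
    """Map weighted feature scores to a value between 0 and 1."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(merged, weights, bias):
    """Return (confidence, answer) pairs, most confident first."""
    return sorted(
        ((confidence(f, weights, bias), answer) for answer, f in merged.items()),
        reverse=True,
    )


scored = [
    ("Abe Lincoln", {"temporal": 1.0, "popularity": 0.4}),
    ("Honest Abe", {"temporal": 0.0, "popularity": 0.9}),
    ("Ulysses Grant", {"temporal": 0.0, "popularity": 0.5}),
]
weights = {"temporal": 2.0, "popularity": 1.0}  # hand-picked, not learned
merged = merge_related(scored)
ranking = rank(merged, weights, bias=-1.5)
print(ranking)
```

Using a different `weights` dictionary per question type would be one simple way to express the point that different question types might weigh scores differently.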

DeepQA Performance
- Initial jump in performance when the DeepQA architecture was implemented
- Incremental improvements followed over time

DeepQA on Watson
- With a single CPU: ~2 hours to get an answer
  - Not fast enough for Jeopardy!, where questions take ~3 seconds on average to read
- Take advantage of the parallel capabilities of DeepQA
  - Embarrassingly parallel
- 90 Power 750 servers = 2,880 processor cores
  - POWER7 CPU = 8 cores / 32 threads
  - 16 TB RAM
  - 80 TFLOPS
  - Was ranked 94th in the Top 500 supercomputers
- Able to answer in 3-5 seconds
- 200M pages of data -> 4 TB of disk storage
  - RAM used for speed
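The numbers on this slide can be checked with a little arithmetic, treating the 2,880 figure as a count of processor cores. If a 2-hour single-CPU job split perfectly across 2,880 cores, it would take 2.5 seconds, which sits just under the 3-5 seconds quoted; perfect splitting is an idealization, so the real figure is naturally somewhat higher. The second half of the sketch shows what "embarrassingly parallel" means in practice: each candidate is scored independently, so the work can be mapped over a pool of workers. The `score` function is a stand-in, not a real evidence scorer.

```python
from concurrent.futures import ThreadPoolExecutor

SERVERS = 90
TOTAL_CORES = 2880
SINGLE_CPU_SECONDS = 2 * 60 * 60  # ~2 hours on one CPU, per the slide

cores_per_server = TOTAL_CORES // SERVERS         # cores in each server
ideal_seconds = SINGLE_CPU_SECONDS / TOTAL_CORES  # assumes perfect scaling

print(cores_per_server, ideal_seconds)


def score(candidate):
    """Stand-in for evidence scoring; each call is independent of the others."""
    return len(candidate)


candidates = ["Abe Lincoln", "Ulysses Grant", "Andrew Johnson"]

# Independent tasks can simply be mapped over a pool of workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_scores = list(pool.map(score, candidates))

print(parallel_scores)
```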

Jeopardy! Challenge
- In January 2011, Watson competed against two of the best Jeopardy! champions
  - Ken Jennings: $3,172,700 in winnings
  - Brad Rutter: $3,470,102 in winnings
- Two matches played
  - Air dates: February 14 and 15, 2011
- Questions chosen from unaired episodes
  - IBM worried that questions would be written to deliberately exploit Watson's weaknesses
  - Using unaired questions was the compromise

Outcome
- First game: Watson wins with $35,734; Rutter $10,400; Jennings $4,800
- Second game (cumulative two-game totals): Watson $77,147; Jennings $24,000; Rutter $21,600
- Watson only rang in if its confidence was high enough
  - The threshold defaults to 50% but can shift
- Demonstrated complicated betting strategies that depend on the game state
  - With a large lead, it played more conservatively
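The ring-in rule can be sketched as a confidence threshold that starts at 50% and moves with the game state. Only the 50% default and the idea of being more conservative with a large lead come from the slide; the specific rule below (a "large lead" meaning at least double the best opponent's score, a fixed step of 0.1, and lowering the bar when far behind) is invented for illustration.

```python
def adjusted_threshold(my_score, best_opponent_score, base=0.5, step=0.1):
    """Shift the ring-in threshold with the game state (hypothetical rule)."""
    if my_score > 0 and my_score >= 2 * best_opponent_score:
        # Large lead: be more conservative and require more confidence.
        return min(base + step, 1.0)
    if best_opponent_score > 0 and best_opponent_score >= 2 * my_score:
        # Far behind: accept more risk (an assumption, not from the slide).
        return max(base - step, 0.0)
    return base


def should_buzz(confidence, threshold=0.5):
    """Ring in only when confidence reaches the current threshold."""
    return confidence >= threshold


leading = adjusted_threshold(30_000, 10_000)
trailing = adjusted_threshold(5_000, 20_000)
print(leading, trailing)
print(should_buzz(0.55, leading), should_buzz(0.55, trailing))
```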

Future Applications
- Take advantage of DeepQA's ability to process large amounts of unstructured data
- Medicine
  - Amount of data doubling every 5 years
  - Almost entirely unstructured
- Finance
  - 5 documents from Wall Street every minute
  - Millions of transactions
- Organizations mentioned: Memorial Sloan-Kettering Cancer Center (medicine), Citi (finance)

Questions?
"I for one welcome our new computer overlords" - Ken Jennings
