Published by Graham Voshall
© 2012 The MITRE Corporation. All rights reserved. For Internal MITRE Use
Evaluation and STS
Workshop on a Pipeline for Semantic Text Similarity (STS)
March 12, 2012
Sherri Condon
The MITRE Corporation
Evaluation and STS Wish-list
■ Valid: measure what we think we’re measuring (definition)
■ Replicable: same results for same inputs (annotator agreement)
■ Objective: no confounding biases (from language or annotator)
■ Diagnostic
  – Not all evaluations achieve this
  – Understanding factors and relations
■ Generalizable
  – Makes true predictions about new cases
  – Functional: if not perfect, good enough
■ Understandable
  – Meaningful to stakeholders
  – Interpretable components
■ Cost effective
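The "Replicable" item ties replicability to annotator agreement. One common way to quantify agreement between two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a specific measure, so the sketch below is an illustration only; the function name `cohen_kappa` and the similarity labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently
    # according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical similarity labels from two annotators for six sentence pairs.
annotator_1 = [5, 4, 4, 2, 0, 3]
annotator_2 = [5, 4, 3, 2, 0, 3]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa treats labels as unordered categories. For graded similarity judgments, a correlation or a weighted variant may be a better fit; which one suits a given evaluation is a design decision the slides leave open.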
Quick Foray into Philosophy
■ Meaning as extension: same/similar denotation
  – Anaphora/coreference and time/date resolution
  – The evening star happens to be the morning star
  – “Real world” knowledge = true in this world
■ Meaning as intension
  – Truth (extension) in the same/similar possible worlds
  – Compositionality: inference and entailment
■ Meaning as use
  – Equivalence for all the same purposes in all the same contexts
  – “Committee on Foreign Affairs, Human Rights, Common Security and Defence Policy” vs “Committee on Foreign Affairs”
  – Salience, application specificity, implicature, register, metaphor
■ Yet remarkable agreement in intuitions about meaning
DARPA Mind’s Eye Evaluation
■ Computer vision through a human lens
  – Recognize events in video as verbs
  – Produce text descriptions of events in video
■ Comparing human descriptions to system descriptions raises all the STS issues
  – Salience/importance/focus (A gave B a package. They were standing.)
  – Granularity of description (car vs. red car, woman vs. person)
  – Knowledge and inference (standing with motorcycle vs. sitting on motorcycle: motorcycle is stopped)
  – Unequal text lengths
■ Demonstrates value of/need for understanding these factors
Mind’s Eye Text Similarity
■ Basic similarity scores based on dependency parses
  – Scores increase for matching predicates and arguments
  – Scores decrease for non-matching predicates and arguments
  – Accessible syn-sets and ontological relations expand matches
■ Salience/importance
  – Obtain many “reference” descriptions
  – Weight predicates and arguments based on frequency
■ Granularity of description
  – Demonstrates influence of application context
  – Program focus is on verbs, so nouns match loosely
■ Regularities in evaluation efforts that depend on semantic similarity promise common solutions
(This is work with Evelyne Tzoukermann and Dan Parvaz)
Test Sentences

        Predicate  Argument  Weight
dobj    close      can       1       x
dobj    close      it        3
dobj    hold       lid       6       x
dobj    place      it        1
dobj    place      lid       2
dobj    put        it        1       xxx
dobj    put        lid       5       x
dobj    take       lid       1
nsubj   close      woman     1
nsubj   hold       woman     2       x

Nominal errors
Verbal Errors      01116
Predicate errors
Total Score
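The scoring idea described on the Mind’s Eye Text Similarity slide can be sketched as follows: weight each (relation, predicate, argument) triple by how many reference descriptions contain it, then add weight for triples a candidate description matches and subtract for triples it does not. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the flat `penalty` value, and all triples are hypothetical, and the synset expansion and loose noun matching from the slide are omitted.

```python
from collections import Counter


def triple_weights(references):
    """Weight each triple by the number of reference descriptions containing it."""
    counts = Counter()
    for ref in references:
        # set() ensures a triple counts once per reference description.
        counts.update(set(ref))
    return counts


def similarity(candidate, weights, penalty=1.0):
    """Add the weight of each matched triple; subtract a flat penalty otherwise.

    The flat penalty is an assumption; the slides do not say how much a
    non-matching predicate or argument lowers the score.
    """
    score = 0.0
    for triple in set(candidate):
        if triple in weights:
            score += weights[triple]
        else:
            score -= penalty
    return score


# Illustrative reference descriptions as (relation, predicate, argument) triples.
references = [
    {("nsubj", "hold", "woman"), ("dobj", "hold", "lid")},
    {("dobj", "hold", "lid"), ("dobj", "put", "lid")},
    {("nsubj", "hold", "woman"), ("dobj", "put", "lid")},
]
weights = triple_weights(references)

# A system description with one matching and one non-matching triple.
candidate = {("dobj", "hold", "lid"), ("dobj", "kick", "lid")}
print(similarity(candidate, weights))
```

Counting each triple once per reference makes the weight reflect how many describers found the event worth mentioning, which is one simple reading of weighting by frequency as a proxy for salience.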