Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California Elena Messina.

Presentation transcript:

Pat Langley, Computational Learning Laboratory, Center for the Study of Language and Information, Stanford University, Stanford, California
Elena Messina, Intelligent Systems Division, National Institute of Standards and Technology, Gaithersburg, Maryland
Experimental Studies of Integrated Cognitive Systems
Thanks to David Aha, Michael Genesereth, and Barney Pell. This work was funded in part by DARPA IPTO, which is not responsible for the points made herein.

Experimentation in Artificial Intelligence
Controlled experiments are the primary evaluation tool in modern AI, including the subfields of supervised learning and reinforcement learning; generative planning and scheduling; and computational linguistics and text processing. They are not, however, the primary tool for work on integrated cognitive systems. Extending experimental methods to the latter is crucial, since it deals with the ultimate goals of artificial intelligence.

Challenges for Experimentation
The reasons that experiments with integrated cognitive systems have lagged behind are clear from the phrase itself: systems are harder to evaluate than component algorithms; cognitive methods involve complex, multi-step reasoning; and integrated software relies on interactions among components. Together, these factors have slowed the development and wide acceptance of an experimental framework. In this talk, we propose the key elements of an experimental method for the study of integrated cognitive systems.

Dependent Variables: Basic Measures
Dependent variables in an experiment measure system behavior. Some basic measures of integrated cognitive systems include: success or failure on a given problem; speed or efficiency of the system's response; and desirability or quality of the system's response. Such metrics provide the building blocks for more sophisticated and informative measures of behavior.

Dependent Variables: Combined Measures
Statistics tells us we should not draw conclusions from one case. Collecting multiple samples supports combined measures like: average behavior of the system; cumulative behavior of the system; and variance of the system's behavior. Combined measures also partly cancel variation due to unknown or uncontrolled factors. However, this requires some population from which samples are drawn, which one should always specify clearly.
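The combined measures above can be sketched in a few lines of Python. This is only an illustration: the function name `combined_measures` and the sample scores are assumptions, not part of the talk, and each trial is assumed to yield one numeric score drawn from a clearly specified population of problems.

```python
from itertools import accumulate
from statistics import mean, variance


def combined_measures(trial_scores):
    """Summarize per-trial scores sampled from one population of problems."""
    if len(trial_scores) < 2:
        # One case supports no conclusion, and sample variance needs two points.
        raise ValueError("need at least two trials")
    return {
        "average": mean(trial_scores),
        "cumulative": list(accumulate(trial_scores)),  # running total per trial
        "variance": variance(trial_scores),  # sample variance (divides by n - 1)
    }


# Hypothetical basic measures: 1 = success, 0 = failure on five problems.
summary = combined_measures([1, 0, 1, 1, 0])
```

Here `summary["cumulative"]` is the running count of solved problems, which is one simple way to read "cumulative behavior"; other definitions are possible.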

Dependent Variables: Higher-Order Metrics
Combined measures present only a small window on behavior. However, one can also derive higher-order measures such as: the slope and intercept with respect to a control system; and the intercept, rate, and asymptote of a learning curve. Such metrics let one summarize behavior even when variation across samples is not systematic. Conclusions about higher-order measures are more important than ones about basic or combined variables.
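One way to obtain the slope and intercept with respect to a control system is an ordinary least-squares line through paired measurements. The sketch below assumes both systems were run on the same problems; the function name and the numbers are hypothetical.

```python
def slope_intercept(control, system):
    """Least-squares line: system ~= slope * control + intercept."""
    n = len(control)
    if n != len(system) or n < 2:
        raise ValueError("need two or more paired measurements")
    mean_x = sum(control) / n
    mean_y = sum(system) / n
    sxx = sum((x - mean_x) ** 2 for x in control)
    if sxx == 0:
        raise ValueError("control measurements must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(control, system))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical solution times (seconds) on four problems of growing size.
control_times = [10, 20, 30, 40]
system_times = [8, 13, 18, 23]
slope, intercept = slope_intercept(control_times, system_times)
```

With these made-up times, a slope below 1 would suggest that the system's cost grows more slowly than the control's as problems get harder, which a single average would not show.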

Independent Variables: Task Characteristics
Independent variables in an experiment reflect factors thought to influence system behavior. An important class of factors is domain or task features like: the complexity of the environment; the difficulty of achieving a given task; and the resources available for pursuing the task. Experiments that vary these factors reveal how the intelligent system's behavior depends on them. Synthetic domains let one alter such variables systematically, but it is crucial that they be similar to natural domains.

Independent Variables: System Characteristics
Another important class of variables involves system features. Varying these factors leads to different types of experiments: parametric studies (altering system parameters); lesion studies (removing a system component); and replacement studies (replacing one module with another). Such experiments suggest ways that the intelligent system's behavior depends on its parameters and components. Studies that vary two or more factors can reveal interactions among them.
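A two-factor lesion study can be sketched as a full factorial design over component on/off settings. Everything here is assumed for illustration: `toy_system` stands in for running a real integrated system, and the component names `planner` and `memory` and its coefficients are invented.

```python
from itertools import product


def toy_system(planner, memory):
    """Hypothetical stand-in for a real system run; returns a success rate."""
    return 0.2 + 0.3 * planner + 0.1 * memory + 0.2 * planner * memory


def factorial_lesion_study(run, components):
    """Run every present (1) / lesioned (0) combination of the components."""
    results = {}
    for settings in product([0, 1], repeat=len(components)):
        results[settings] = run(**dict(zip(components, settings)))
    return results


def interaction(results):
    """Effect of the first component with the second present,
    minus its effect with the second lesioned."""
    with_second = results[(1, 1)] - results[(0, 1)]
    without_second = results[(1, 0)] - results[(0, 0)]
    return with_second - without_second


results = factorial_lesion_study(toy_system, ["planner", "memory"])
effect = interaction(results)
```

A nonzero `effect` indicates that the two components do not contribute independently, which is the kind of interaction a single-factor lesion study cannot reveal.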

Independent Variables: System Knowledge
A third class of factors concerns the knowledge and experience of the intelligent system. One can adapt lesion and replacement studies to examine: the presence or absence of types of knowledge; the amount of knowledge about a given subject; and the amount of experience with a class of tasks. Such experiments let one plot behavioral measures as a function of knowledge and experience (learning curves). They also let one compute higher-order measures such as rate of improvement and asymptotic performance.
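As a rough sketch of how intercept, rate, and asymptote might be estimated from a learning curve, the code below assumes an increasing exponential form y_t = A - (A - I) * exp(-r * t), which the talk does not prescribe. It tries a list of candidate asymptotes, fits a line to log(A - y_t) for each, and keeps the best. The function name and data are hypothetical.

```python
import math


def fit_learning_curve(scores, asymptote_candidates):
    """Fit y_t = A - (A - I) * exp(-r * t) for t = 0, 1, 2, ...

    For a fixed asymptote A, log(A - y_t) is linear in t, so a least-squares
    line yields the rate r and intercept I. Returns (I, r, A) for the
    candidate A with the smallest squared error.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two points on the curve")
    ts = range(n)
    mean_t = sum(ts) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    best = None
    for a in asymptote_candidates:
        if a <= max(scores):
            continue  # log(A - y) is undefined unless A exceeds every score
        zs = [math.log(a - y) for y in scores]
        mean_z = sum(zs) / n
        slope = sum((t - mean_t) * (z - mean_z) for t, z in zip(ts, zs)) / stt
        rate = -slope
        intercept = a - math.exp(mean_z - slope * mean_t)
        sse = sum(
            (a - (a - intercept) * math.exp(-rate * t) - y) ** 2
            for t, y in zip(ts, scores)
        )
        if best is None or sse < best[0]:
            best = (sse, intercept, rate, a)
    if best is None:
        raise ValueError("no candidate asymptote exceeds the observed scores")
    return best[1], best[2], best[3]


# Hypothetical noise-free accuracy after 0..9 units of experience.
scores = [0.9 - 0.7 * math.exp(-0.5 * t) for t in range(10)]
intercept, rate, asymptote = fit_learning_curve(scores, [0.90, 0.95, 1.00])
```

Real curves are noisy, so a finer grid of candidates or a proper nonlinear fit would be more appropriate in practice; the point is only that the three higher-order measures fall out of one fitted curve.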

Repositories for Cognitive Systems
Public repositories are now common among the AI subfields, and they offer clear advantages for research by: providing fast and cheap materials for experiments; and supporting replication and standards for comparison. However, they can also produce undesirable side effects by: focusing attention on a narrow class of problems; and encouraging a bake-off mentality among researchers. To support research on cognitive systems, we need testbeds and environments designed to evaluate general intelligence.

Desirable Characteristics of Testbeds
Testbeds that are designed to support research on integrated cognitive systems should: include a variety of domains to ensure generality; be well documented and simple for researchers to use; and have standard formats to ease interface with systems. However, many existing repositories already provide these features, and they are not sufficient on their own.

Desirable Characteristics of Testbeds
In addition, testbeds for integrated cognitive systems should: contain not data sets but task environments, which support agents that exist over time and at least some of which involve physical domains; and provide an infrastructure to ease experimentation, with external databases (e.g., geographic information systems), controlled capture, replay, and restart of scenarios, and methods for recording performance measures. Also, environments should have little or no dependence on sensory processing.
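Controlled capture, replay, and restart can be illustrated with a seeded scenario that logs every step. The `ScenarioRecorder` class, its toy reward rule, and the two agents are hypothetical and do not describe any existing testbed.

```python
import random


class ScenarioRecorder:
    """Hypothetical harness: runs a seeded scenario and logs each step."""

    def __init__(self, seed):
        self.seed = seed
        self.trace = []

    def run(self, agent, steps):
        """Restart the scenario from its seed and record
        (observation, action, reward) for every step."""
        rng = random.Random(self.seed)  # same seed gives the same events
        self.trace = []
        for _ in range(steps):
            observation = rng.randint(0, 9)
            action = agent(observation)
            reward = 1 if action == observation % 2 else 0  # toy task: report parity
            self.trace.append((observation, action, reward))
        return sum(reward for _, _, reward in self.trace)


def parity_agent(observation):
    return observation % 2


def always_zero_agent(observation):
    return 0


recorder = ScenarioRecorder(seed=42)
first_score = recorder.run(parity_agent, steps=20)
first_trace = list(recorder.trace)

# Replaying with a different agent presents exactly the same observations.
second_score = recorder.run(always_zero_agent, steps=20)
second_trace = list(recorder.trace)
```

Because both runs see identical events, any difference between `first_score` and `second_score` is due to the agents rather than to chance variation in the scenario.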

Physical vs. Simulated Environments
For domains that involve external settings, one can use either a physical or a simulated environment for evaluation. Simulated environments have many advantages, including: ability to vary domain parameters and physical layout; and ease of recording traces of behavior and cognitive state. One can make simulated environments more realistic by: using simulators that support kinematics and dynamics; and including data from real sensors in analogous locations. This approach combines the relevance of physical testbeds with the affordability of synthetic ones.

Some Promising Domains
A number of domains hold promise for the experimental study of integrated cognitive systems: urban search and rescue (Balakirsky & Messina, 2002); flying aircraft on military missions (Jones et al., 1999); driving a vehicle in a city (Choi et al., 2004); playing strategy games (Aha & Molineaux, 2004); and general game playing (Genesereth, 2004). Each requires the integration of cognition, perception, and action in a complex, dynamical setting.

Goals of Scientific Experimentation
Science aims not to show that one method is better than another, but to understand the reasons for complex behavior. This goal can best be achieved through experimental studies that: ask clear questions or test specific hypotheses; examine relations between behavior and independent factors; and move beyond descriptions to explanations of phenomena. Good experiments provide insight into the reasons that underlie system behavior. Also, whether or not they support a hypothesis, they do not end the story, but rather suggest ideas for further studies.

Concluding Remarks
In this talk, we considered the experimental study of integrated cognitive systems, including: challenges posed by their distinctive characteristics; dependent measures that describe their behavior; independent variables that influence this behavior; and the need for environments and testbeds that exercise the full capabilities of integrated agents, evaluate their behavior at the system level, and support studies of interactions among components. Taking these into account will transform the study of integrated cognitive systems into a well-balanced experimental science.

End of Presentation