A Value-Driven Architecture for Intelligent Behavior

Pat Langley and Dan Shapiro
Computational Learning Laboratory
Center for the Study of Language and Information
Stanford University, Stanford, California

This research was supported in part by Grant NCC from NASA Ames Research Center. Thanks to Meg Aycinena, Michael Siliski, Stephanie Sage, and David Nicholas.

Assumptions about Cognitive Architectures

1. We should move beyond isolated phenomena and capabilities to develop complete intelligent agents.
2. Artificial intelligence and cognitive psychology are close allies with distinct but related goals.
3. A cognitive architecture specifies the infrastructure that holds constant over domains, as opposed to knowledge, which varies.
4. We should model behavior at the level of functional structures and processes, not the knowledge or implementation levels.
5. A cognitive architecture should commit to representations and organizations of knowledge and the processes that operate on them.
6. An architecture should come with a programming language for encoding knowledge and constructing intelligent systems.
7. An architecture should demonstrate generality and flexibility rather than success on a single application domain.

Examples of Cognitive Architectures

Some of the cognitive architectures produced over the past 30 years include:

- ACTE through ACT-R (Anderson, 1976; Anderson, 1993)
- Soar (Laird, Rosenbloom, & Newell, 1984; Newell, 1990)
- Prodigy (Minton & Carbonell, 1986; Veloso et al., 1995)
- PRS (Georgeff & Lansky, 1987)
- 3T (Gat, 1991; Bonasso et al., 1997)
- EPIC (Kieras & Meyer, 1997)
- APEX (Freed et al., 1998)

However, these systems cover only a small region of the space of possible architectures.

Goals of the ICARUS Project

We are developing the ICARUS architecture to support the effective construction of intelligent autonomous agents that:

- integrate perception and action with cognition;
- combine symbolic structures with affective values;
- unify reactive behavior with deliberative problem solving; and
- learn from experience but benefit from domain knowledge.

In this talk, we report on our recent progress toward these goals.

Design Principles for ICARUS

Our designs for ICARUS have been guided by five principles:

1. Affective values pervade intelligent behavior;
2. Categorization has primacy over execution;
3. Execution has primacy over problem solving;
4. Tasks and intentions have internal origins; and
5. The agent's reward is determined internally.

These ideas distinguish ICARUS from most agent architectures.

A Cognitive Task for Physical Agents

(car self)      (#back self 225)      (#front self 235)      (#speed self 60)
(car brown-1)   (#back brown-1 208)   (#front brown-1 218)   (#speed brown-1 65)
(car green-2)   (#back green-2 251)   (#front green-2 261)   (#speed green-2 72)
(car orange-3)  (#back orange-3 239)  (#front orange-3 249)  (#speed orange-3 56)

(in-lane self B)  (in-lane brown-1 A)  (in-lane green-2 A)  (in-lane orange-3 C)

Overview of the ICARUS Architecture (without learning)

[Architecture diagram: the environment feeds a perceptual buffer via Perceive Environment; Categorize / Compute Reward matches long-term conceptual memory against the buffer to fill short-term conceptual memory; Nominate Skill Instance, Select / Execute Skill Instance, Repair Skill Conditions, and Abandon connect long-term skill memory, short-term skill memory, and the environment.]

Some Motivational Terminology

ICARUS relies on three quantitative measures related to motivation:

- Reward: the affective value produced on the current cycle.
- Past reward: the discounted sum of previous agent rewards.
- Expected reward: the predicted discounted future reward.

These let an ICARUS agent make decisions that take into account its past, present, and future affective responses.
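A minimal Python sketch of these three quantities; the discount factor gamma and all function names are illustrative assumptions, not part of the ICARUS specification:

```python
# Sketch (not the ICARUS implementation) of the motivational measures;
# `gamma` and the names are assumptions for illustration.

def discounted_past_reward(history, gamma=0.9):
    """Discounted sum of previous rewards (oldest discounted most)."""
    total = 0.0
    for r in history:              # history in chronological order
        total = r + gamma * total  # older terms shrink on each step
    return total

def expected_reward(predicted, gamma=0.9):
    """Predicted discounted future reward over a projected stream."""
    return sum((gamma ** t) * r for t, r in enumerate(predicted))

# Reward on the current cycle is just the value produced this cycle;
# past and expected reward summarize the stream behind and ahead of it.
print(discounted_past_reward([1.0, 2.0, 4.0]))  # 4 + 0.9*2 + 0.81*1 = 6.61
print(expected_reward([3.0, 3.0]))              # 3 + 0.9*3 = 5.7
```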

Long-Term Conceptual Memory

ICARUS includes a long-term conceptual memory that contains:

- Boolean concepts that are either True or False;
- numeric concepts that have quantitative measures.

These concepts may be either:

- primitive (corresponding to the results of sensory actions);
- defined as a conjunction of other concepts and predicates.

Each Boolean concept includes an associated reward function.

* ICARUS concept memory is distinct from, and more basic than, skill memory, and provides the ultimate source of motivation.

Examples of Long-Term Concepts

(ahead-of (?car1 ?car2)
  :defn    (car ?car1) (car ?car2)
           (#back ?car1 ?back1)
           (#front ?car2 ?front2)
           (> ?back1 ?front2)
  :reward  (#dist-ahead ?car1 ?car2 ?d)
           (#speed ?car2 ?s)
  :weights ( ))

(coming-from-behind (?car1 ?car2)
  :defn    (car ?car1) (car ?car2)
           (in-lane ?car1 ?lane1)
           (in-lane ?car2 ?lane2)
           (adjacent ?lane1 ?lane2)
           (faster-than ?car1 ?car2)
           (ahead-of ?car2 ?car1)
  :reward  (#dist-behind ?car1 ?car2 ?d)
           (#speed ?car2 ?s)
  :weights ( ))

(clear-for (?lane ?car)
  :defn     (lane ?lane ?left-line ?right-lane)
            (not (overlaps-and-adjacent ?car ?other))
            (not (coming-from-behind ?car ?other))
            (not (coming-from-behind ?other ?car))
  :constant 10.0)

A Sample Conceptual Hierarchy

[Concept hierarchy diagram: defined concepts clear-for, coming-from-behind, overlaps-and-adjacent, ahead-of, faster-than, and overlaps are built above the primitive concepts lane, in-lane, adjacent, car, #yback, #yfront, and #speed.]

Long-Term Skill Memory

ICARUS includes a long-term skill memory in which skills contain:

- an :objective field that encodes the skill's desired situation;
- a :start field that must hold for the skill to be initiated;
- a :requires field that must hold throughout the skill's execution;
- an :ordered or :unordered field referring to subskills or actions;
- a :values field with numeric concepts to predict expected value;
- a :weights field indicating the weight on each numeric concept.

These fields refer to terms stored in conceptual long-term memory.

* ICARUS skill memory encodes knowledge about how and why to act in the world, not about how to solve problems.

Examples of Long-Term Skills

(pass (?car1 ?car2 ?lane)
  :start     (ahead-of ?car2 ?car1)
             (in-same-lane ?car1 ?car2)
  :objective (ahead-of ?car1 ?car2)
             (in-same-lane ?car1 ?car2)
  :requires  (in-lane ?car2 ?lane)
             (adjacent ?lane ?to)
  :ordered   (speed&change ?car1 ?car2 ?lane ?to)
             (overtake ?car1 ?car2 ?lane)
             (change-lanes ?car1 ?to ?lane)
  :values    (#distance-ahead ?car1 ?car2 ?d)
             (#speed ?car2 ?s)
  :weights   ( ))

(change-lanes (?car ?from ?to)
  :start     (in-lane ?car ?from)
  :objective (in-lane ?car ?to)
  :requires  (lane ?from ?shared ?right)
             (lane ?to ?left ?shared)
             (clear-for ?to ?car)
  :ordered   (*shift-left)
  :constant  0.0)

ICARUS Short-Term Memories

Besides long-term memories, ICARUS stores dynamic structures in:

- a perceptual buffer with primitive Boolean and numeric concepts, e.g.
  (car car-06), (in-lane car-06 lane-a), (#speed car-06 37)
- a short-term conceptual memory with matched concept instances, e.g.
  (ahead-of car-06 self), (faster-than car-06 self), (clear-for lane-a self)
- a short-term skill memory with instances of skills that the agent intends to execute, e.g.
  (speed-up-faster-than self car-06), (change-lanes lane-a lane-b)

These encode temporary beliefs, intended actions, and their values.

* ICARUS short-term memories store specific, value-laden instances of long-term concepts and skills.

Categorization and Reward in ICARUS

[Diagram: Perceive Environment fills the perceptual buffer, and Categorize / Compute Reward matches long-term conceptual memory against it to produce short-term conceptual memory.]

- Categorization occurs in an automatic, bottom-up manner.
- A reward is calculated for every matched Boolean concept.
- This reward is a linear function of the associated numeric concepts.
- Total reward is the sum of rewards for all matched concepts.

* Categorization and reward calculation are inextricably linked.
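The reward computation these bullets describe can be sketched as follows; the weight and value shapes echo the :reward and :weights fields shown earlier, but the code itself is an assumption, not the authors' implementation:

```python
# Illustrative sketch: reward for a matched Boolean concept is a linear
# function of its numeric concepts; total reward sums over all matches.

def concept_reward(weights, values, constant=0.0):
    """Linear reward for one matched concept instance."""
    return constant + sum(w * v for w, v in zip(weights, values))

def total_reward(matches):
    """Sum of rewards over every concept matched on this cycle."""
    return sum(concept_reward(m["weights"], m["values"], m.get("constant", 0.0))
               for m in matches)

# e.g. (ahead-of self car-06) with #dist-ahead and #speed, plus a
# constant-reward concept such as (clear-for lane-a self).
matches = [
    {"weights": [0.5, 0.1], "values": [12.0, 65.0]},  # 0.5*12 + 0.1*65 = 12.5
    {"weights": [], "values": [], "constant": 10.0},
]
print(total_reward(matches))  # 22.5
```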

Skill Nomination and Abandonment

[Diagram: Nominate Skill Instance and Abandon mediate between long-term skill memory, short-term conceptual memory, and short-term skill memory.]

ICARUS adds skill instances to short-term skill memory that:

- refer to concept instances in short-term conceptual memory;
- have expected reward > the agent's discounted past reward.

ICARUS removes a skill when its expected reward << past reward.

* Nomination and abandonment create highly autonomous behavior that is motivated by the agent's internal reward.
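Read literally, the slide's two thresholds suggest a policy like the following sketch; the margin standing in for "<<" and the data shapes are assumptions:

```python
# Sketch of nomination and abandonment; `margin` approximates the
# slide's "<<" and is an assumption, not a documented parameter.

def update_intentions(intentions, candidates, past_reward, margin=5.0):
    # Abandon skills whose expected reward falls far below past reward.
    kept = [s for s in intentions if s["expected"] > past_reward - margin]
    # Nominate candidates whose expected reward exceeds past reward.
    for s in candidates:
        if s["expected"] > past_reward and s not in kept:
            kept.append(s)
    return kept

intentions = update_intentions(
    intentions=[{"skill": "overtake", "expected": 1.0}],
    candidates=[{"skill": "change-lanes", "expected": 9.0}],
    past_reward=6.6)
print([s["skill"] for s in intentions])  # ['change-lanes']
```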

Skill Selection and Execution

[Diagram: Select / Execute Skill Instance draws on short-term conceptual memory and short-term skill memory, and acts on the environment via the perceptual buffer.]

- On each cycle, ICARUS executes the skill with the highest expected reward.
- Selection invokes deep evaluation to find the action with the highest expected reward.
- Execution causes action, including sensing, which alters memory.

* ICARUS makes value-based choices among skills, and among the alternative subskills and actions in each skill.
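The selection step itself is a value-based choice; a one-function sketch, with the same assumed data shape as above:

```python
# Sketch of value-based selection over short-term skill memory.

def select_skill(intentions):
    """Return the intended skill instance with the highest expected reward."""
    return max(intentions, key=lambda s: s["expected"], default=None)

print(select_skill([{"skill": "overtake", "expected": 1.0},
                    {"skill": "change-lanes", "expected": 9.0}]))
# {'skill': 'change-lanes', 'expected': 9.0}
```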

ICARUS Interpreter for Skill Execution

Given that a skill's Start condition held: if its Objectives are not yet satisfied and its Requires condition holds, then choose among its unordered subskills, or consider its ordered subskills in turn.

(speed&change (?car1 ?car2 ?from ?to)
  :start     (ahead-of ?car1 ?car2)
             (same-lane ?car1 ?car2)
  :objective (faster-than ?car1 ?car2)
             (different-lane ?car1 ?car2)
  :requires  (in-lane ?car2 ?from)
             (adjacent ?from ?to)
  :unordered (*accelerate)
             (change-lanes ?car1 ?from ?to))

* ICARUS skills have hierarchical structure, and the interpreter uses a reactive control loop to identify the most valuable action.
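For concreteness, here is a compressed Python rendering of this control loop under stated assumptions: beliefs are a set of matched concept instances, subskills are tried first-applicable in order, and the value-based choice among unordered subskills is omitted. The Skill fields mirror the slide's notation, but the class is illustrative, not the ICARUS implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    start: list = field(default_factory=list)
    requires: list = field(default_factory=list)
    objectives: list = field(default_factory=list)
    subskills: list = field(default_factory=list)  # empty => primitive action

def execute(skill, beliefs):
    """Return the primitive action this skill reduces to, or None."""
    if skill.objectives and all(c in beliefs for c in skill.objectives):
        return None              # objectives already satisfied
    if not all(c in beliefs for c in skill.requires):
        return None              # requirements violated: candidate for repair
    if not skill.subskills:
        return skill.name        # primitive action: execute it
    for sub in skill.subskills:  # simplified: first applicable subskill
        if all(c in beliefs for c in sub.start):
            action = execute(sub, beliefs)
            if action is not None:
                return action
    return None

# Example: speed&change reduces to *accelerate when its start and
# requires conditions hold but its objectives do not.
accelerate = Skill("*accelerate")
speed_change = Skill("speed&change",
                     start=["(ahead-of car-06 self)"],
                     requires=["(in-lane car-06 lane-a)"],
                     objectives=["(faster-than self car-06)"],
                     subskills=[accelerate])
beliefs = {"(ahead-of car-06 self)", "(in-lane car-06 lane-a)"}
print(execute(speed_change, beliefs))  # *accelerate
```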

Cognitive Repair of Skill Conditions

[Diagram: Repair Skill Conditions connects long-term skill memory to short-term skill memory.]

ICARUS seeks to repair skills whose requirements do not hold by:

- finding concepts that, if true, would let execution continue;
- selecting the concept that is most important to repair; and
- nominating a skill with objectives that include the concept.

Repair takes one cycle and adds at most one skill instance to memory.

* This backward chaining is similar to means-ends analysis, but it supports execution rather than planning.
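A one-cycle sketch of this repair process, with the same hedges: skills are plain dicts here, and "most important concept" is simplified to the first unmet one, since the slide does not give the selection metric.

```python
# Sketch of cognitive repair: one cycle, at most one skill nominated.
# Skills are dicts with 'requires' and 'objectives' lists (assumed shape).

def repair(failed_skill, beliefs, skill_library):
    # 1. Find required concepts that do not currently hold.
    unmet = [c for c in failed_skill["requires"] if c not in beliefs]
    if not unmet:
        return None
    # 2. Select the concept to repair (simplified: the first unmet one).
    target = unmet[0]
    # 3. Nominate a skill whose objectives include that concept.
    for candidate in skill_library:
        if target in candidate["objectives"]:
            return candidate  # would be added to short-term skill memory
    return None

library = [{"name": "change-lanes", "requires": [],
            "objectives": ["(in-lane self lane-b)"]}]
failed = {"name": "pass", "requires": ["(in-lane self lane-b)"],
          "objectives": ["(ahead-of self car-06)"]}
print(repair(failed, beliefs=set(), skill_library=library)["name"])
# change-lanes
```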

Learning Hierarchical Control Policies

Induction over internal reward streams produces hierarchical skills with learned value functions:

(pass (?x)
  :start      (behind ?x) (same-lane ?x)
  :objective  (ahead ?x) (same-lane ?x)
  :requires   (lane ?x ?l)
  :components ((speed-up-faster-than ?x)
               (change-lanes ?l ?k)
               (overtake ?x)
               (change-lanes ?k ?l)))

(speed-up-faster-than (?x)
  :start      (slower-than ?x)
  :objective  (faster-than ?x)
  :requires   ( )
  :components ((accelerate)))

(change-lanes (?l ?k)
  :start      (lane self ?l)
  :objective  (lane self ?k)
  :requires   (left-of ?k ?l)
  :components ((shift-left)))

(overtake (?x)
  :start      (behind ?x) (different-lane ?x)
  :objective  (ahead ?x)
  :requires   (different-lane ?x) (faster-than ?x)
  :components ((shift-left)))

Revising Expected Reward Functions

ICARUS uses a hierarchical variant of Q learning to revise estimated reward functions based on internally computed rewards.

[Diagram: execution traces through the skill hierarchy for pass (speed&change with *accelerate, then change-lanes with *shift-left) generate an internal reward R(t), which updates Q(s) using R(t) and Q(s′).]

* This method learns 100 times faster than nonhierarchical ones.
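The update named in the diagram, Q(s) revised from R(t) and Q(s′), has the shape of a standard temporal-difference rule; this sketch assumes a learning rate alpha and discount gamma, and omits the hierarchical bookkeeping over the skill stack.

```python
# Temporal-difference update of the form suggested by the diagram:
#   Q(s) <- Q(s) + alpha * (R(t) + gamma * Q(s') - Q(s))
# alpha and gamma are illustrative assumptions.

def q_update(Q, s, s_next, reward, alpha=0.1, gamma=0.9):
    q_s = Q.get(s, 0.0)
    Q[s] = q_s + alpha * (reward + gamma * Q.get(s_next, 0.0) - q_s)
    return Q

Q = {}
q_update(Q, "overtake", "change-lanes", reward=2.5)
print(Q)  # {'overtake': 0.25}
```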

Intellectual Precursors

Our work on ICARUS has been influenced by many previous efforts:

- earlier research on integrated cognitive architectures, especially ACT, Soar, and Prodigy;
- earlier work on architectures for reactive control, especially universal plans and teleoreactive programs;
- research on learning value functions from delayed reward, especially hierarchical approaches to Q learning;
- decision theory and decision analysis; and
- previous versions of ICARUS (going back to 1988).

However, ICARUS combines and extends ideas from its various predecessors in novel ways.

Directions for Future Research

Future work on ICARUS should introduce additional methods for:

- forward chaining and mental simulation of skills;
- allocation of scarce resources and selective attention;
- probabilistic encoding and matching of Boolean concepts;
- flexible recognition of skills executed by other agents;
- caching of repairs to extend the skill hierarchy;
- revision of internal reward functions for concepts; and
- extension of short-term memory to store episodic traces.

Taken together, these features should make ICARUS a more general and powerful architecture for constructing intelligent agents.

Concluding Remarks

ICARUS is a novel integrated architecture for intelligent agents that:

- includes separate memories for concepts and skills;
- organizes concepts and skills in a hierarchical manner;
- associates affective values with all cognitive structures;
- calculates these affective values internally;
- combines reactive execution with cognitive repair; and
- uses expected values to nominate tasks and abandon them.

This constellation of concerns distinguishes ICARUS from other research on integrated architectures.