Challenges in Learning Plan Knowledge
Pat Langley
School of Computing and Informatics, Arizona State University, Tempe, Arizona USA
Institute for the Study of Learning and Expertise, Palo Alto, California USA
Thanks to D. Choi, T. Konik, U. Kuter, N. Li, D. Nau, N. Nejati, and D. Shapiro for their many contributions. This talk reports research funded by grants from DARPA IPTO, which is not responsible for its contents.

Outline of the Talk
1. Brief review of learning plan knowledge
2. Learning from different sources
3. Learning for new performance tasks
4. Learning in different scenarios
5. Learning with novel representations
6. Some responses to these challenges
7. Concluding remarks

The Problem: Learning Plan Knowledge
Given: Basic knowledge about some action-oriented domain (e.g., state/goal representation, operators).
Given: A set of training problems (e.g., initial states, goals, and possibly more).
Given: Some performance task that the system must carry out.
Given: A performance mechanism that can use knowledge to carry out that task.
Learn: Knowledge that will let the system improve its ability to perform new tasks from the same or similar domain.
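
As an aside, this Given/Learn specification can be restated as a typed interface. The sketch below is my own illustration in Python; the names (Problem, LearningTask, learn_plan_knowledge) are hypothetical and do not come from the talk.

from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Problem:
    initial_state: Any            # e.g., a set of ground literals
    goals: Any                    # e.g., a conjunction of literals

@dataclass
class LearningTask:
    domain_knowledge: Any                     # Given: state/goal representation, operators
    training_problems: List[Problem]          # Given: training problems
    perform: Callable[[Any, Problem], Any]    # Given: performance mechanism that uses knowledge

def learn_plan_knowledge(task: LearningTask) -> Any:
    # Learn: knowledge that improves performance on new problems from the
    # same or similar domain; left as a stub here, since later slides discuss
    # several candidate forms (control rules, macro-operators, HTNs).
    raise NotImplementedError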

Topics Not Covered
This talk will range widely, but I will not cover issues related to:
Learning with impoverished representations (I am interested in human-like, intelligent behavior, so most work on reinforcement learning is irrelevant here)
Acquiring basic knowledge about the domain (I am interested in building on such knowledge, so most work on learning action models is too basic)
Nonincremental learning from large data sets (I am interested in human-like incremental learning, which rules out most data-mining approaches)

Historical Topics
There has been a long history of work on learning plan knowledge:
Forming macro-operators: Fikes et al. (1972), Iba (1988), Mooney (1989), Botea et al. (2005)
Inducing forward-chaining control rules: Anzai & Simon (1978), Mitchell et al. (1981), Langley (1982)
Learning control rules analytically: Laird et al. (1986), Mitchell et al. (1986), Minton (1988)
Problem solving by analogy: Veloso (1994), Jones & Langley (1995), VanLehn & Jones (1994)
Inducing control rules for partial-order plans: Katukam & Kambhampati (1994), Estlin & Mooney (1997)

Historical Trends
Work on learning plan knowledge has seen many shifts in fashion:
Early hope for improving problem solvers/planners
Excitement/confusion introduced by the EBL movement
Some doubts raised by the utility problem
Mass migration to the reinforcement learning paradigm
Resurgence of interest in learning plan knowledge (2004 to present)
Throughout these changes, the problems and potential of learning plan knowledge have remained.

Traditional Sources of Information
Most research on learning for planning has assumed the system uses search to generate:
Successful paths that achieve the goals (positive instances)
Failed paths that do not achieve the goals (negative instances)
Alternative paths of different desirability (preferred instances)
But humans learn from other sources of information, and our AI systems should as well.

Challenge: Learn from Many Sources
There has been relatively little research on plan learning from:
Demonstrations of solved problems (Nejati et al., 2006)
Explicit instruction from a teacher (Blythe et al., 2007)
Advice or hints from a teacher (Mostow, 1983)
Mental simulations or daydreaming (Mueller, 1985)
Undesirable side effects during execution
Humans learn from all of these sources, and our learning systems should support the same capabilities. Moreover, we should develop single systems that integrate plan knowledge learned from all of them (Oblinger, 2006).

Traditional Performance Tasks
Most research on learning for planning has assumed the system aims to improve:
The efficiency of plan generation (nodes expanded, time)
The quality of generated plans (path length, utility)
The coverage of plan knowledge (problems solved)
But humans learn and use plan knowledge for other purposes that are just as valid.

Challenge: Learn for Plan Execution
Many important domains require executing plan knowledge in some environment that includes:
Operators with likely but nonguaranteed effects
External events not directly under the agent's control
Other agents that are pursuing their own goals
Urban driving is one setting that raises all three of these issues. Complex board games like chess, although deterministic, still require interleaving of planning and execution. We need more research on plan learning in contexts of this sort (e.g., Benson, 1995; Fern et al., 2004).

Challenge: Learn for Plan Understanding
Another understudied problem is learning for plan understanding:
Given: A partially observed sequence of states influenced by another agent's actions.
Given: Learned knowledge about how to achieve goals.
Find: The other agent's goals and the plans it is pursuing to achieve them.
Plan understanding is important not only in complex games, but also in military planning, politics, and other settings. This performance task suggests new learning problems, methods, and evaluation criteria.

Traditional Learning Scenarios
Most research on learning for planning has assumed the system:
Trains on problems from a given distribution/domain
Tests on problems from the same distribution/domain
Success depends on the extent to which the learner generalizes well to new problems from the same domain. But humans also use their learned plan knowledge in other, more flexible ways to improve performance.

Challenge: Cumulative Learning
In complex domains, humans learn plan knowledge gradually:
Starting with small, relatively easy problems
Moving to complex problems after mastering simpler ones
Later acquisitions build naturally on earlier experience, leading to cumulative learning. Our education system depends heavily on such vertical transfer of learned knowledge. We need more learning systems that demonstrate this form of cumulative improvement (e.g., Reddy & Tadepalli, 1997).

Challenge: Cross-Domain Transfer
In other cases, humans exhibit a form of transfer that involves:
Learning to solve problems in one domain
Reusing this knowledge to solve problems in another domain that is superficially quite different
Such cross-domain transfer is related to within-domain analogical reasoning, but it is far more challenging. In its extreme form, the two domains support similar solutions but have no shared symbols or predicates. We need more learning systems that demonstrate this radical form of knowledge reuse.

Traditional Learned Representations
Most research on learning for planning has focused on learning:
Control rules that reduce the effective branching factor
Macro-operators that reduce the effective solution depth
These grew naturally from representations used to create hand-crafted expert problem solvers. But now we have other representations of plan knowledge that suggest new learning tasks and methods, and I do not mean POMDPs, workflows, or other highly constrained formalisms.

Challenge: Learn HTNs
Hierarchical task networks (HTNs) offer the most effective planning available, but they are expensive to build manually.
HTNs provide an ideal target for learning because they have:
The modularity and flexibility of search-control rules
The large-scale structure of macro-operators
Machine learning has automated the creation of expert classifiers. We should do the same for HTNs, which are effectively expert planning systems.

Challenge: Learn HTNs
We can define the task of learning hierarchical task networks as:
Given: Basic knowledge about some action-oriented domain.
Given: A set of training problems (initial states and goals).
Given: Some performance task the system must carry out.
Given: Some module that uses HTNs to perform this task.
Learn: An HTN that lets the system improve its performance on new tasks from the same or similar domain.
We need more research on this important topic (e.g., Reddy & Tadepalli, 1997; Ilghami et al., 2005).

Some Responses
Our recent research attempts to respond to these challenges by developing methods that:
Acquire a constrained but important class of HTNs
That one can use for both planning and reactive control
From both successful problem solving and expert traces
That extend naturally to support cross-domain transfer
Moreover, these ideas are embedded in an integrated architecture that supports many capabilities: ICARUS (Langley, 2006).

Conceptual Knowledge in ICARUS
Conceptual knowledge is cast as Horn clauses that specify relevant relations in the environment.
Memory is organized hierarchically, divided into primitive and non-primitive predicates.
Primitive concept: (assigned-mission ?patient ?mission)
Nonprimitive concept: (patient-form-filled ?patient)
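
To make the Horn-clause reading concrete, here is a minimal sketch of my own (not ICARUS code, and with illustrative concept names) in which instances of a nonprimitive concept are inferred by matching its body against the current belief state.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def unify(pattern, fact, bindings):
    # Extend bindings so that pattern matches fact, or return None on failure.
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if p in new and new[p] != f:
                return None
            new[p] = f
        elif p != f:
            return None
    return new

def matches(body, beliefs, bindings=None):
    # Yield every binding under which all literals in body hold among beliefs.
    bindings = {} if bindings is None else bindings
    if not body:
        yield bindings
        return
    first, rest = body[0], body[1:]
    for fact in beliefs:
        b = unify(first, fact, bindings)
        if b is not None:
            yield from matches(rest, beliefs, b)

def infer(concepts, beliefs):
    # Add instances of nonprimitive concepts until the belief state stops growing.
    beliefs = set(beliefs)
    changed = True
    while changed:
        changed = False
        for head, body in concepts:
            new_facts = {tuple(b.get(t, t) for t in head)
                         for b in matches(body, beliefs)}
            added = new_facts - beliefs
            if added:
                beliefs |= added
                changed = True
    return beliefs

# One nonprimitive concept defined over two primitive relations.
concepts = [(("in-rightmost-lane", "?self", "?lane"),
             [("in-lane", "?self", "?lane"), ("last-lane", "?lane")])]
beliefs = {("in-lane", "me", "g599"), ("last-lane", "g599")}
print(infer(concepts, beliefs))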

HTN Methods in ICARUS
Similar to SHOP2, but methods are indexed by the goals they achieve.
Each method decomposes a goal into subgoals.
If a method's goal is active and its precondition is satisfied, then try to achieve its subgoals or apply its operators.
(Diagram labels: goal concept, precondition concept, HTN method, subgoal, operator.)
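
The following sketch is mine rather than the ICARUS implementation and uses made-up names, but it illustrates the control rule just stated: an unsatisfied goal triggers a method indexed by that goal, provided its precondition holds, and the method's steps are then subgoals to achieve or operators to apply.

from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass
class Method:
    goal: str                        # concept instance this method achieves
    precondition: Set[str]           # concepts that must already hold
    steps: List[Tuple[str, object]]  # ("goal", name) subgoals or ("op", fn) operators

def achieve(goal: str, methods: List[Method], state: Set[str]) -> bool:
    # If the goal already holds, do nothing; otherwise pick a method indexed
    # by this goal whose precondition is satisfied and work through its steps.
    if goal in state:
        return True
    for m in methods:
        if m.goal != goal or not m.precondition <= state:
            continue
        ok = True
        for kind, step in m.steps:
            if kind == "goal":
                ok = achieve(step, methods, state)
            else:                    # "op": a primitive operator that updates the state
                step(state)
            if not ok:
                break
        if ok and goal in state:
            return True
    return False

# Hypothetical example: the lamp must be plugged in before flipping the switch works.
def plug_in(state): state.add("plugged-in")
def flip_switch(state): state.add("lamp-on")

methods = [
    Method("lamp-on", {"plugged-in"}, [("op", flip_switch)]),
    Method("lamp-on", set(), [("goal", "plugged-in"), ("op", flip_switch)]),
    Method("plugged-in", set(), [("op", plug_in)]),
]
state = set()
print(achieve("lamp-on", methods, state), state)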

Operators in ICARUS
Operators describe low-level actions that agents can execute directly in the environment.
Precondition: legal conditions for action execution, e.g., the concept (patient ?p) and (travel-from ?p ?from) and (travel-to ?p ?to)
Action: (get-arrival-time ?patient ?from ?to)
Effects: expected changes when the action is executed, e.g., the concept (arrival-time ?patient)
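
As a rough analogue (a simplification of my own, not the ICARUS operator syntax), an operator with a precondition and expected effects can be modeled and applied to a belief state as below; the literal strings are hypothetical placeholders.

from dataclasses import dataclass, field
from typing import Set

@dataclass
class Operator:
    name: str
    precondition: Set[str]                              # legal conditions for execution
    add_effects: Set[str]                               # expected to become true
    del_effects: Set[str] = field(default_factory=set)  # expected to become false

def apply_operator(op: Operator, state: Set[str]) -> Set[str]:
    # Executing an operator is only legal when its precondition holds; its
    # expected effects then update the belief state.
    if not op.precondition <= state:
        raise ValueError(f"{op.name}: precondition not satisfied")
    return (state - op.del_effects) | op.add_effects

get_arrival_time = Operator(
    name="get-arrival-time",
    precondition={"(patient p1)", "(travel-from p1 sfo)", "(travel-to p1 jfk)"},
    add_effects={"(arrival-time p1)"})

state = {"(patient p1)", "(travel-from p1 sfo)", "(travel-to p1 jfk)"}
print(apply_operator(get_arrival_time, state))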

Training Input: Expert Traces and Goals
Expert demonstration traces: the operators the expert uses and the resulting belief states
State: a set of concept instances, e.g., (assigned-flight P1 M1)
Operator instance: e.g., (get-arrival-time P2)
Goal: a concept instance in the final state, e.g., (all-patients-arranged)
ICARUS learns generalized skills that achieve similar goals.
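
A minimal sketch (with hypothetical Python names) of this training input: a demonstration is a sequence of operator instances paired with the belief states they produce, plus a goal concept instance drawn from the final state.

from dataclasses import dataclass
from typing import List, Set, Tuple

Literal = Tuple[str, ...]           # e.g., ("assigned-flight", "P1", "M1")

@dataclass
class TraceStep:
    operator: Literal               # operator instance the expert executed
    state: Set[Literal]             # resulting belief state: a set of concept instances

@dataclass
class Demonstration:
    steps: List[TraceStep]
    goal: Literal                   # concept instance true in the final state

demo = Demonstration(
    steps=[TraceStep(operator=("get-arrival-time", "P2"),
                     state={("arrival-time", "P2", "1pm"),
                            ("assigned-flight", "P1", "M1")})],
    goal=("all-patients-arranged",))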

Learning Plan Knowledge from Demonstration
(System diagram with components: Expert, Demonstration Traces (states and actions), Background knowledge (concept definitions, operators), LIGHT, Learned plan knowledge (HTNs), Reactive Executor, Problem (initial state, goal), If Impasse, Plan Knowledge.)

Learning HTNs by Trace Analysis
(Diagram: an explanation trace over concepts and actions.)

Learning HTNs by Trace Analysis: Operator Chaining
(Diagram.)

Learning HTNs by Trace Analysis: Concept Chaining
(Diagram: an explanation trace over concepts and actions.)

Explanation Structure for Trace
(Diagram linking the following ground instances across time steps 1-3:)
(dest-airport patient1 SFO) (arrival-time NW32 1pm) (query-arrival-time) (scheduled NW32) (location patient1 SFO 1pm) (assigned patient1 NW32) (flight-available) (assign patient1 NW32) (transfer-hospital patient1 hospital2) (arrange-ground-transportation SFO hospital2 1pm) (close-airport hospital2 SFO)

Hierarchical Task Network Structure
(Diagram linking the corresponding generalized literals:)
(dest-airport ?patient ?loc) (arrival-time ?flight ?time) (query-arrival-time) (scheduled ?flight) (location ?patient ?loc ?time) (assigned ?patient ?flight) (flight-available) (assign ?patient ?flight) (transfer-hospital ?patient ?hospital) (arrange-ground-transportation ?loc ?hospital ?time) (close-airport ?hospital ?loc)
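
The step between the previous two slides, replacing the constants in a ground explanation with consistently named variables, can be sketched as follows. This is my own simplification (real trace analysis also decides which constants to generalize), with illustrative names.

def variablize(literals):
    # Replace each constant with a ?variable, reusing the same variable
    # whenever the same constant reappears, so argument identities are kept.
    table = {}
    def rename(const):
        if const not in table:
            table[const] = "?" + const.lower()
        return table[const]
    return [(pred, *[rename(arg) for arg in args]) for pred, *args in literals]

explanation = [
    ("dest-airport", "patient1", "SFO"),
    ("assigned", "patient1", "NW32"),
    ("arrange-ground-transportation", "SFO", "hospital2", "1pm"),
]
print(variablize(explanation))
# [('dest-airport', '?patient1', '?sfo'), ('assigned', '?patient1', '?nw32'),
#  ('arrange-ground-transportation', '?sfo', '?hospital2', '?1pm')]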

Transfer by Representation Mapping
(Diagram: predicate mappings linking concepts and actions in a source domain to those in a target domain.)
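
A minimal sketch of the idea (mine, not the project's actual mapping algorithm): once a correspondence between source and target predicates has been found, learned source-domain structures can be rewritten for the target domain. The predicate names here are hypothetical.

def map_structure(literals, predicate_map):
    # Rename each predicate according to the cross-domain mapping; literals
    # whose predicates have no counterpart in the target domain are dropped.
    mapped = []
    for pred, *args in literals:
        if pred in predicate_map:
            mapped.append((predicate_map[pred], *args))
    return mapped

predicate_map = {"in-lane": "in-corridor", "last-lane": "rightmost-corridor"}
source_body = [("in-lane", "?self", "?lane"), ("last-lane", "?lane")]
print(map_structure(source_body, predicate_map))
# [('in-corridor', '?self', '?lane'), ('rightmost-corridor', '?lane')]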

Challenge: Learn with Richer Goals
HTNs are more expressive than classical plans (Erol et al., 1994). Our approach loses this advantage because it assumes the head of each method is a goal it achieves, but we can:
Extend goal concepts to describe temporal behavior
Revise the execution module to handle these structures
Augment trace analysis to reason about temporal goals
Learn new methods with temporal goals in their heads
This scheme should acquire the full class of HTNs while still retaining the tractability of goal-directed learning.

Challenge: Extend Conceptual Vocabularies
Our approach to learning HTNs relies on the concept hierarchy used to explain solution traces. The method would be less dependent on that hierarchy if it could extend it:
Given: A set of concepts used in goals, states, and methods
Given: New methods acquired from sample solution traces
Find: New concepts that produce improved performance as the result of future method learning
This would support a bootstrapped learner that invents predicates to describe states, goals, and methods.

Challenge: Extend Conceptual Vocabularies
Our approach to utilizing predicate invention has three steps:
1. Define a new concept for the precondition of each method learned by chaining off a concept definition.
2. Check traces for states in which this concept becomes true and learn methods to achieve it.
3. During performance, treat each method's precondition as its first subgoal, which it can achieve if submethods are known.
This technique would make an HTN more complete by growing it downward, introducing nonterminal symbols as necessary. We have partially implemented this scheme and hope to report results at the next meeting.

Concluding Remarks: Research Style
Clearly, there remain many open problems to address in learning plan knowledge. These involve new abilities, not improvements on existing ones, which suggests that we:
Look at human behavior for ideas on how to proceed
Develop integrated systems rather than component algorithms
Demonstrate their behavior on challenging domains
These strategies will help us extend the reach of our learning systems, not just strengthen their grasp.

Concluding Remarks: Evaluation
We must evaluate our new plan learners, but this does not mean:
Measuring their speed in generating plans
Showing they run faster than existing systems
Entering them in planning competitions
More appropriate experiments would revolve around:
Demonstrating entirely new functionalities
Running lesion studies to show new features are required
Using performance measures appropriate to the task
These steps will produce conceptual advances and scientific understanding far more than will mindless bake-offs.

Concluding Remarks: Summary
Learning plan knowledge is a key area with many open problems:
Learning from traces, advice, and other sources
Transferring knowledge within and across domains
Learning and extending rich structures like HTNs
These challenges will benefit from earlier work on plan learning, but they also require new ideas. Together, they should lead us toward learning systems that rival humans in their flexibility and power.

End of Presentation

ICARUS Concepts for In-City Driving

((in-rightmost-lane ?self ?clane)
 :percepts ((self ?self) (segment ?seg) (line ?clane segment ?seg))
 :relations ((driving-well-in-segment ?self ?seg ?clane)
             (last-lane ?clane)
             (not (lane-to-right ?clane ?anylane))))

((driving-well-in-segment ?self ?seg ?lane)
 :percepts ((self ?self) (segment ?seg) (line ?lane segment ?seg))
 :relations ((in-segment ?self ?seg)
             (in-lane ?self ?lane)
             (aligned-with-lane-in-segment ?self ?seg ?lane)
             (centered-in-lane ?self ?seg ?lane)
             (steering-wheel-straight ?self)))

((in-lane ?self ?lane)
 :percepts ((self ?self segment ?seg) (line ?lane segment ?seg dist ?dist))
 :tests ((> ?dist -10) (<= ?dist 0)))

Representing Short-Term Beliefs/Goals
(current-street me A) (current-segment me g550)
(lane-to-right g599 g601) (first-lane g599)
(last-lane g599) (last-lane g601)
(at-speed-for-u-turn me) (slow-for-right-turn me)
(steering-wheel-not-straight me) (centered-in-lane me g550 g599)
(in-lane me g599) (in-segment me g550)
(on-right-side-in-segment me) (intersection-behind g550 g522)
(building-on-left g288) (building-on-left g425)
(building-on-left g427) (building-on-left g429)
(building-on-left g431) (building-on-left g433)
(building-on-right g287) (building-on-right g279)
(increasing-direction me) (buildings-on-right g287 g279)

ICARUS Skills for In-City Driving

((in-rightmost-lane ?self ?line)
 :percepts ((self ?self) (line ?line))
 :start ((last-lane ?line))
 :subgoals ((driving-well-in-segment ?self ?seg ?line)))

((driving-well-in-segment ?self ?seg ?line)
 :percepts ((segment ?seg) (line ?line) (self ?self))
 :start ((steering-wheel-straight ?self))
 :subgoals ((in-segment ?self ?seg)
            (centered-in-lane ?self ?seg ?line)
            (aligned-with-lane-in-segment ?self ?seg ?line)
            (steering-wheel-straight ?self)))

((in-segment ?self ?endsg)
 :percepts ((self ?self speed ?speed) (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start ((in-intersection-for-right-turn ?self ?int))
 :actions ((steer 1)))

ICARUS Interleaves Execution and Problem Solving
(Flow diagram with components: Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, impasse? yes/no, Problem Solving, Executed plan.)
This organization reflects the psychological distinction between automatized and controlled behavior.
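
A minimal sketch (my own abstraction of this diagram, with made-up names) of the control loop: execute a stored skill reactively when one covers the current goal, and fall back on problem solving at an impasse, caching the solution for future reactive use.

def run_cycle(goal, skills, solve, execute, state):
    # skills: dict mapping a goal to an executable skill or cached plan.
    # solve: problem solver over primitive skills, invoked only at an impasse.
    skill = skills.get(goal)
    if skill is not None:              # no impasse: reactive execution
        execute(skill, state)
        return
    plan = solve(goal, state)          # impasse: controlled problem solving
    if plan is not None:
        execute(plan, state)
        skills[goal] = plan            # cache the result so later cycles are reactive

# Hypothetical usage with stand-in solver and executor.
skills = {}
solve = lambda goal, state: ["step-1", "step-2"]
execute = lambda plan, state: state.append(plan)
state = []
run_cycle("deliver-patient", skills, solve, execute, state)   # triggers problem solving
run_cycle("deliver-patient", skills, solve, execute, state)   # now executes reactively
print(skills, state)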