Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California Learning Hierarchical Task Networks from Problem Solving Thanks to Dongkyu Choi, Kirstin Cummings, Seth Rogers, and Daniel Shapiro for contributions to this research, which was funded by Grant HR from DARPA IPTO and by Grant IIS from NSF.

Classical Approaches to Planning
[Figure: Problem + Operators → Classical Planning → Generated plan]

Planning with Hierarchical Task Networks
[Figure: Problem + Task Network → HTN Planning → Generated plan]

Classical and HTN Planning
Challenge: Can we unify classical and HTN planning in a single framework?
Challenge: Can we use learning to gain the advantage of HTNs while avoiding the cost of manual construction?
Hypothesis: The responses to these two challenges are closely intertwined.

Mixed Classical / HTN Planning
[Figure: Problem + Task Network + Operators → HTN Planning → impasse? If yes, fall back to Classical Planning; if no, emit the Generated plan]

Learning HTNs from Classical Planning
[Figure: Problem + Task Network + Operators → HTN Planning → impasse? If yes, fall back to Classical Planning, whose results feed HTN Learning back into the Task Network; if no, emit the Generated plan]

Four Contributions of the Research
Representation: A specialized class of hierarchical task nets.
Execution: A reactive controller that utilizes these structures.
Planning: A method for interleaving HTN execution with problem solving when impasses are encountered.
Learning: A method for creating new HTN methods from successful solutions to these impasses.

A New Representational Formalism
A teleoreactive logic program consists of three components:
Concepts: a set of conjunctive relational inference rules;
Primitive skills: a set of durative STRIPS operators;
Nonprimitive skills: a set of HTN methods, each of which specifies:
  a head that indicates the goal the method achieves;
  a single (typically defined) precondition;
  one or more ordered subskills for achieving the goal.
This special class of hierarchical task networks can be executed reactively but in a goal-directed manner (Nilsson, 1994).
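The three components above can be sketched as plain data structures. This is an illustrative encoding in Python, not the authors' implementation; all class and field names are assumptions chosen to mirror the slide's terminology.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A conjunctive relational inference rule."""
    head: tuple                 # e.g. ("clear", "?block")
    percepts: list              # perceptual patterns to match
    positives: list = field(default_factory=list)  # subconcepts that must hold
    negatives: list = field(default_factory=list)  # subconcepts that must not hold

@dataclass
class PrimitiveSkill:
    """A durative STRIPS-style operator."""
    head: tuple
    start: tuple                # single start condition (a defined concept)
    actions: list               # executable actions
    effects: list               # literals made true on completion

@dataclass
class Method:
    """A nonprimitive skill: an HTN method with a goal head."""
    head: tuple                 # the goal literal this method achieves
    start: tuple                # single (typically defined) precondition
    subskills: list             # ordered subskills for achieving the goal

# The first (clear ?C) clause from the nonprimitive-skill slide:
htn = Method(head=("clear", "?B"),
             start=("unstackable", "?C", "?B"),
             subskills=[("unstack", "?C", "?B")])
```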

Some Defined Concepts (Axioms)
(clear (?block) :percepts ((block ?block)) :negatives ((on ?other ?block)))
(hand-empty ( ) :percepts ((hand ?hand status ?status)) :tests ((eq ?status 'empty)))
(unstackable (?block ?from) :percepts ((block ?block) (block ?from)) :positives ((on ?block ?from) (clear ?block) (hand-empty)))
(pickupable (?block ?from) :percepts ((block ?block) (table ?from)) :positives ((ontable ?block ?from) (clear ?block) (hand-empty)))
(stackable (?block ?to) :percepts ((block ?block) (block ?to)) :positives ((clear ?to) (holding ?block)))
(putdownable (?block ?to) :percepts ((block ?block) (table ?to)) :positives ((holding ?block)))

Some Primitive Skills (Operators)
(unstack (?block ?from) :percepts ((block ?block ypos ?ypos) (block ?from)) :start (unstackable ?block ?from) :actions ((*grasp ?block) (*vertical-move ?block (+ ?ypos 10))) :effects ((clear ?from) (holding ?block)))
(pickup (?block ?from) :percepts ((block ?block) (table ?from height ?height)) :start (pickupable ?block ?from) :effects ((holding ?block)))
(stack (?block ?to) :percepts ((block ?block) (block ?to xpos ?xpos ypos ?ypos height ?height)) :start (stackable ?block ?to) :effects ((on ?block ?to) (hand-empty)))
(putdown (?block ?to) :percepts ((block ?block) (table ?to xpos ?xpos ypos ?ypos height ?height)) :start (putdownable ?block ?to) :effects ((ontable ?block ?to) (hand-empty)))

Some Nonprimitive Recursive Skills
(clear (?C) :percepts ((block ?D) (block ?C)) :start (unstackable ?D ?C) :skills ((unstack ?D ?C)))
(clear (?B) :percepts ((block ?C) (block ?B)) :start [(on ?C ?B) (hand-empty)] :skills ((unstackable ?C ?B) (unstack ?C ?B)))
(unstackable (?C ?B) :percepts ((block ?B) (block ?C)) :start [(on ?C ?B) (hand-empty)] :skills ((clear ?C) (hand-empty)))
(hand-empty ( ) :percepts ((block ?D) (table ?T1)) :start (putdownable ?D ?T1) :skills ((putdown ?D ?T1)))
[Expanded for readability]
Teleoreactive logic programs are executed in a top-down, left-to-right manner, much as in Prolog but extended over time, with a single path being selected on each time step.
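The top-down, left-to-right execution regime can be sketched as a small recursive selector. This is a hedged illustration (the function name, the flattened string goals, and the dict encoding are all assumptions, not the authors' code): on each time step the interpreter follows a single path from an intended goal, through the first clause whose start condition holds, into its leftmost unachieved subskill, until it reaches one primitive skill to execute.

```python
def select_path(goal, methods, primitives, holds, satisfied):
    """Return the path of skill heads from `goal` down to one primitive skill,
    or None if no applicable path exists on this time step."""
    if goal in primitives:
        return [goal]                         # reached an executable action
    for start, subskills in methods.get(goal, []):
        if holds(start):                      # first clause whose start holds wins
            for sub in subskills:
                if not satisfied(sub):        # descend into leftmost unachieved subskill
                    rest = select_path(sub, methods, primitives, holds, satisfied)
                    if rest is not None:
                        return [goal] + rest
    return None

# The first (clear ?C) clause above, with instantiated goals flattened to strings:
methods = {"clear-C": [("unstackable-D-C", ["unstack-D-C"])]}
primitives = {"unstack-D-C"}
path = select_path("clear-C", methods, primitives,
                   holds=lambda c: c == "unstackable-D-C",
                   satisfied=lambda s: False)
# path == ["clear-C", "unstack-D-C"]
```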

Interleaving HTN Execution and Classical Planning
Solve(G)
Push the goal literal G onto the empty goal stack GS.
On each cycle,
  If the top goal G of the goal stack GS is satisfied,
  Then pop GS.
  Else if the goal stack GS does not exceed the depth limit,
    Let S be the skill instances whose heads unify with G.
    If any applicable skill paths start from an instance in S,
    Then select one of these paths and execute it.
    Else let M be the set of primitive skill instances that have not already failed in which G is an effect.
      If the set M is nonempty,
      Then select a skill instance Q from M.
        Push the start condition C of Q onto goal stack GS.
      Else if G is a complex concept with the unsatisfied subconcepts H and with satisfied subconcepts F,
      Then if there is a subconcept I in H that has not yet failed,
        Then push I onto the goal stack GS.
        Else pop G from the goal stack GS and store information about failure with G's parent.
      Else pop G from the goal stack GS and store information about failure with G's parent.
This is traditional means-ends analysis, with three exceptions: (1) conjunctive goals must be defined concepts; (2) chaining occurs over both skills/operators and concepts/axioms; and (3) selected skills are executed whenever applicable.
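The means-ends core of this loop can be made runnable under strong simplifying assumptions: goals and conditions are ground atoms (strings), each operator carries a single set-valued precondition, and no failure bookkeeping or skill hierarchy is included. The operator names and dict encoding below are illustrative, not the authors' code.

```python
def solve(goal, operators, state, limit=50):
    """Minimal means-ends sketch: pop satisfied goals, execute an applicable
    operator, otherwise chain backward by pushing its unmet precondition."""
    stack, plan = [goal], []
    while stack:
        if limit == 0:
            return None                      # depth/effort limit exceeded
        limit -= 1
        g = stack[-1]
        if g in state:                       # top goal satisfied: pop it
            stack.pop()
            continue
        for op in operators:                 # chain off an operator achieving g
            if g in op["effects"]:
                if op["precond"] <= state:   # applicable: execute it
                    state = (state - op["deletes"]) | op["effects"]
                    plan.append(op["name"])
                else:                        # not yet applicable: subgoal
                    stack.extend(op["precond"] - state)
                break
        else:
            return None                      # no operator achieves g
    return plan

# A blocks-world fragment matching the trace on the next slide:
ops = [
    {"name": "unstack-C-B",
     "precond": {"on-C-B", "clear-C", "hand-empty"},
     "effects": {"holding-C", "clear-B"},
     "deletes": {"on-C-B", "hand-empty"}},
    {"name": "putdown-C",
     "precond": {"holding-C"},
     "effects": {"hand-empty", "ontable-C"},
     "deletes": {"holding-C"}},
    {"name": "unstack-B-A",
     "precond": {"on-B-A", "clear-B", "hand-empty"},
     "effects": {"holding-B", "clear-A"},
     "deletes": {"on-B-A", "hand-empty"}},
]
plan = solve("holding-B", ops,
             {"ontable-A", "on-B-A", "on-C-B", "clear-C", "hand-empty"})
# plan == ["unstack-C-B", "putdown-C", "unstack-B-A"]
```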

A Successful Planning Trace
Initial state: (ontable A T) (on B A) (on C B) (hand-empty); goal: (clear A).
[Figure: goal tree for the trace, linking (clear A) through (unstackable B A), (unstack B A), (clear B), (unstackable C B), (unstack C B), (holding C), (putdown C T), (hand-empty), and (holding B), with the initial and goal block configurations shown at either end.]

Three Questions about HTN Learning
What is the hierarchical structure of the network?
What are the heads of the learned clauses/methods?
What are the conditions on the learned clauses/methods?
The answers follow naturally from our representation and from our approach to plan generation.

Recording Results for Learning
Solve(G)
Push the goal literal G onto the empty goal stack GS.
On each cycle,
  If the top goal G of the goal stack GS is satisfied,
  Then pop GS and let New be Learn(G).
    If G's parent P involved skill chaining,
    Then store New as P's first subskill.
    Else if G's parent P involved concept chaining,
    Then store New as P's next subskill.
  Else if the goal stack GS does not exceed the depth limit,
    Let S be the skill instances whose heads unify with G.
    If any applicable skill paths start from an instance in S,
    Then select one of these paths and execute it.
    Else let M be the set of primitive skill instances that have not already failed in which G is an effect.
      If the set M is nonempty,
      Then select a skill instance Q from M, store Q with goal G as its last subskill,
        push the start condition C of Q onto goal stack GS, and mark goal G as involving skill chaining.
      Else if G is a complex concept with the unsatisfied subconcepts H and with satisfied subconcepts F,
      Then if there is a subconcept I in H that has not yet failed,
        Then push I onto the goal stack GS, store F with G as its initially true subconcepts, and mark goal G as involving concept chaining.
        Else pop G from the goal stack GS and store information about failure with G's parent.
      Else pop G from the goal stack GS and store information about failure with G's parent.
The extended problem solver calls on Learn to construct a new skill clause and stores the information it needs in the goal stack generated during search.

Three Questions about HTN Learning
What is the hierarchical structure of the network?
The structure is determined by the subproblems solved during planning, which, because both operator conditions and goals are single literals, form a semilattice.
What are the heads of the learned clauses/methods?
What are the conditions on the learned clauses/methods?

Constructing Skills from a Trace
[Figure, shown in four animation steps over the planning trace above: numbered subproblems are converted into skill clauses in order, the first two via skill chaining, then one via concept chaining, and the last via skill chaining.]

Learned Skills After Structure Determined
( (?C) :percepts ((block ?D) (block ?C)) :start :skills ((unstack ?D ?C)))
( (?B) :percepts ((block ?C) (block ?B)) :start :skills ((unstackable ?C ?B) (unstack ?C ?B)))
( (?C ?B) :percepts ((block ?B) (block ?C)) :start :skills ((clear ?C) (hand-empty)))
( ( ) :percepts ((block ?D) (table ?T1)) :start :skills ((putdown ?D ?T1)))

Three Questions about HTN Learning
What is the hierarchical structure of the network?
The structure is determined by the subproblems solved during planning, which, because both operator conditions and goals are single literals, form a semilattice.
What are the heads of the learned clauses/methods?
The head of a learned clause is the goal literal that the planner achieved for the subproblem that produced it.
What are the conditions on the learned clauses/methods?

Learned Skills After Heads Inserted
(clear (?C) :percepts ((block ?D) (block ?C)) :start :skills ((unstack ?D ?C)))
(clear (?B) :percepts ((block ?C) (block ?B)) :start :skills ((unstackable ?C ?B) (unstack ?C ?B)))
(unstackable (?C ?B) :percepts ((block ?B) (block ?C)) :start :skills ((clear ?C) (hand-empty)))
(hand-empty ( ) :percepts ((block ?D) (table ?T1)) :start :skills ((putdown ?D ?T1)))

Three Questions about HTN Learning
What is the hierarchical structure of the network?
The structure is determined by the subproblems solved during planning, which, because both operator conditions and goals are single literals, form a semilattice.
What are the heads of the learned clauses/methods?
The head of a learned clause is the goal literal that the planner achieved for the subproblem that produced it.
What are the conditions on the learned clauses/methods?
If the subproblem involved skill chaining, they are the conditions of the first subskill clause.
If the subproblem involved concept chaining, they are the subconcepts that held at the outset of the subproblem.

Learned Skills After Conditions Inferred
(clear (?C) :percepts ((block ?D) (block ?C)) :start (unstackable ?D ?C) :skills ((unstack ?D ?C)))
(clear (?B) :percepts ((block ?C) (block ?B)) :start [(on ?C ?B) (hand-empty)] :skills ((unstackable ?C ?B) (unstack ?C ?B)))
(unstackable (?C ?B) :percepts ((block ?B) (block ?C)) :start [(on ?C ?B) (hand-empty)] :skills ((clear ?C) (hand-empty)))
(hand-empty ( ) :percepts ((block ?D) (table ?T1)) :start (putdownable ?D ?T1) :skills ((putdown ?D ?T1)))

Learning an HTN Method from a Problem Solution
Learn(G)
If the goal G involves skill chaining,
Then let S1 and S2 be G's first and second subskills.
  If subskill S1 is empty,
  Then return the literal for clause S2.
  Else create a new skill clause N with head G, with S1 and S2 as ordered subskills, and with the same start condition as subskill S1.
    Return the literal for skill clause N.
Else if the goal G involves concept chaining,
Then let {Ck+1, ..., Cn} be G's initially satisfied subconcepts.
  Let {C1, ..., Ck} be G's stored subskills.
  Create a new skill clause N with head G, with {C1, ..., Ck} as ordered subskills, and with the conjunction of {Ck+1, ..., Cn} as start condition.
  Return the literal for skill clause N.
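The Learn step above can be sketched in a few lines. This is an illustrative encoding under stated assumptions (clauses are dicts with `head`, `start`, and `subskills` keys; the function and parameter names are hypothetical, not the authors' code): the head of the new clause is the achieved goal, and the start condition comes either from the first subskill (skill chaining) or from the subconcepts that already held (concept chaining).

```python
def learn(goal, chaining, first=None, second=None,
          achieved=None, initially_true=None):
    """Build a new HTN method from a solved subproblem."""
    if chaining == "skill":
        if first is None:                    # nothing preceded the operator
            return second
        return {"head": goal,
                "subskills": [first["head"], second["head"]],
                "start": first["start"]}     # same start as the first subskill
    # Concept chaining: achieved subconcepts become the ordered subskills;
    # the subconcepts true at the outset become the start condition.
    return {"head": goal,
            "subskills": list(achieved),
            "start": list(initially_true)}

# Reconstructing the learned (unstackable ?C ?B) clause from the slides:
unstackable = learn(("unstackable", "?C", "?B"), "concept",
                    achieved=[("clear", "?C"), ("hand-empty",)],
                    initially_true=[("on", "?C", "?B"), ("hand-empty",)])

# Then the learned (clear ?B) clause, via skill chaining off (unstack ?C ?B):
clear_b = learn(("clear", "?B"), "skill",
                first=unstackable,
                second={"head": ("unstack", "?C", "?B"),
                        "start": [("unstackable", "?C", "?B")]})
# clear_b inherits the start condition [(on ?C ?B), (hand-empty)] from
# its first subskill, matching the "Learned Skills After Conditions
# Inferred" slide.
```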

Creating a Clause from Skill Chaining
[Figure: a problem solution in which skill X is executed to enable skill Z, and the new method Y built from it, with X and Z as ordered subskills.]

Creating a Clause from Concept Chaining
[Figure: a problem solution that achieves the remaining subconcepts of a conjunctive goal, and the new method built from it, with the achieved subconcepts as ordered subskills.]

Important Features of the Learning Method
Our approach to learning HTNs has some important features:
it occurs incrementally, from one experience at a time;
it takes advantage of existing background knowledge;
it constructs the hierarchies in a cumulative manner.
In these ways, it is similar to explanation-based approaches to learning from problem solving. However, the method for finding conditions involves neither analytic nor inductive learning in their traditional senses.

An In-City Driving Environment Our focus on learning for reactive control comes from an interest in complex physical domains, such as driving a vehicle in a city. To study this problem, we have developed a realistic simulated environment that can support many different driving tasks.

Skill Clauses Learned for In-City Driving
(parked (?ME ?G1152) :percepts ((lane-line ?G1152) (self ?ME)) :start ( ) :skills ((in-rightmost-lane ?ME ?G1152) (stopped ?ME)))
(in-rightmost-lane (?ME ?G1152) :percepts ((self ?ME) (lane-line ?G1152)) :start ((last-lane ?G1152)) :skills ((driving-in-segment ?ME ?G1101 ?G1152)))
(driving-in-segment (?ME ?G1101 ?G1152) :percepts ((lane-line ?G1152) (segment ?G1101) (self ?ME)) :start ((steering-wheel-straight ?ME)) :skills ((in-lane ?ME ?G1152) (centered-in-lane ?ME ?G1101 ?G1152) (aligned-with-lane-in-segment ?ME ?G1101 ?G1152) (steering-wheel-straight ?ME)))

Learning Curves for In-City Driving

Transfer Studies of HTN Learning
Because we were interested in our method's ability to transfer its learned skills to harder problems, we:
created concepts and operators for the Blocks World and FreeCell;
let the system solve and learn from simple training problems;
asked the system to solve and learn from harder test tasks;
recorded the number of steps taken and the solution probability as a function of the number of transfer problems encountered;
averaged the results over many different problem orders.
The resulting transfer curves revealed the system's ability to take advantage of prior learning and generalize to new situations.

Transfer Effects in the Blocks World
On 20-block tasks, there is no difference in the number of problems solved. [Figure: 20 blocks]

Transfer Effects in the Blocks World
However, there is a difference in the effort needed to solve them. [Figure: 20 blocks]

FreeCell Solitaire FreeCell is a full-information card game that, in most cases, can be solved by planning; it also has a highly recursive structure.

Transfer Effects in FreeCell
On 16-card FreeCell tasks, prior training aids solution probability. [Figure: 16 cards]

Transfer Effects in FreeCell
However, it also lets the system solve problems with less effort. [Figure: 16 cards]

Transfer Effects in FreeCell
On 20-card tasks, the benefits of prior training are much stronger. [Figure: 20 cards]

Transfer Effects in FreeCell
However, it also lets the system solve problems with less effort. [Figure: 20 cards]

Where is the Utility Problem?
Many previous studies of learning and planning found that learned knowledge reduced problem-solving steps and search, but increased CPU time because it was specific and expensive.
We have not yet observed the utility problem, possibly because:
the problem solver does not chain off learned skill clauses;
our performance module does not attempt to eliminate search.
If we encounter it in future domains, we will collect statistics on clauses to bias selection, like Minton (1988) and others.

Related Work on Planning and Execution
Our approach to planning and execution bears similarities to:
problem-solving architectures like Soar and Prodigy;
Nilsson's (1994) notion of teleoreactive controllers;
execution architectures that use HTNs to structure knowledge;
Nau et al.'s encoding of HTNs for use in plan generation.
Other mappings between classical and HTN planning come from:
Erol et al.'s (1994) complexity analysis of HTN planning;
Barrett and Weld's (1994) use of HTNs for plan parsing.
These mappings are valid but provide no obvious approach to learning HTN structures from successful plans.

Related Research on Learning
Our learning mechanisms are similar to those in earlier work on:
production composition (e.g., Neves & Anderson, 1981);
macro-operator formation (e.g., Iba, 1985);
explanation-based learning (e.g., Mitchell et al., 1986);
chunking in Soar (Laird, Rosenbloom, & Newell, 1986).
But they do not rely on analytical schemes like goal regression, and their creation of hierarchical structures is closer to that by:
Marsella and Schmidt's (1993) REAPPR;
Ruby and Kibler's (1993) SteppingStone;
Reddy and Tadepalli's (1997) X-Learn;
which also learned decomposition rules from problem solutions.

The ICARUS Architecture
[Figure: architecture diagram showing Long-Term Conceptual Memory, Long-Term Skill Memory, Short-Term Conceptual Memory, a Goal/Skill Stack, a Perceptual Buffer, and a Motor Buffer, connected by the processes of Perception, Categorization and Inference, Skill Retrieval, Skill Execution, Problem Solving, and Skill Learning, all interacting with the Environment.]

Hierarchical Structure of Long-Term Memory
ICARUS organizes both concepts and skills in a hierarchical manner.
Each concept is defined in terms of other concepts and/or percepts.
Each skill is defined in terms of other skills, concepts, and percepts.
[Figure: concept and skill hierarchies]

Interleaved Nature of Long-Term Memory
ICARUS interleaves its long-term memories for concepts and skills.
For example, the skill highlighted here refers directly to the highlighted concepts.
[Figure: interleaved concept and skill hierarchies]

Recognizing Concepts and Selecting Skills
ICARUS matches patterns to recognize concepts and select skills.
Concepts are matched bottom up, starting from percepts.
Skill paths are matched top down, starting from intentions.
[Figure: bottom-up concept matching and top-down skill-path matching]

Directions for Future Work
Despite our initial progress on structure learning, we should still:
evaluate the approach on more complex planning domains;
extend the method to support HTN planning rather than execution;
generalize the technique to acquire partially ordered skills;
adapt the scheme to work with more powerful planners;
extend the method to chain backward off learned skill clauses;
add a technique to learn recursive concepts for preconditions;
examine and address the utility problem for skill learning.
These steps should make our approach a more effective tool for learning hierarchical task networks from classical planning.

Concluding Remarks
We have described an approach to planning and execution that:
relies on a new formalism – teleoreactive logic programs – that identifies heads with goals and has single preconditions;
executes stored HTN methods when they are applicable but resorts to classical planning when needed;
caches the results of successful plans in new HTN methods using a simple learning technique;
creates recursive, hierarchical structures from individual problems in an incremental and cumulative manner.
This approach holds promise for bridging the longstanding divide between two major paradigms for planning.

End of Presentation