Information Processing Technology Office Learning Workshop April 12, 2004 Seedling Overview Learning Hierarchical Reactive Skills from Reasoning and Experience.


1 Information Processing Technology Office Learning Workshop, April 12, 2004
Seedling Overview: Learning Hierarchical Reactive Skills from Reasoning and Experience
Institute for the Study of Learning and Expertise
PI: Pat Langley
Presenter: Ray Mooney

2 Learning Objective
Develop learning methods that operate over rich knowledge structures that:
– support both reactive control and problem solving
– are embedded in an integrated cognitive architecture that operates in complex physical environments
Learning mechanisms can acquire and revise such knowledge more rapidly and effectively than human programmers can create and debug it manually.

3 What is Being Learned?
ICARUS is an integrated cognitive architecture that learns:
– the logical structure of relational skills and concepts
– a hierarchical organization over these elements
– numeric utility functions attached to skills and concepts
  – that describe effective means for achieving goals
  – that support reactive control of physical agents
ICARUS learns these from background knowledge, experience with executing skills in the environment, and problem solving, in an incremental, cumulative manner that responds to changes in tasks and the environment.
This relies on the tight integration of execution, problem solving, and learning.

4 What is Being Learned?
For example, in a driving domain, ICARUS would learn:
– the structure of driving skills like turning and passing
– the structure of driving concepts (e.g., passable)
– hierarchical connections (e.g., pass and change-lanes)
– how to achieve high-level goals (e.g., package delivery)
– how to get from one place to another (route knowledge)
– the expected utility of driving skills and subskills
– the expected utility of driving concepts
This varied content is cast within a unified formalism that ICARUS provides for encoding knowledge.

5 How is Knowledge Being Learned?
The ICARUS architecture learns:
– value functions, using a hierarchical variant of model-based reinforcement learning;
– new skills and concepts, based on the cached results of means-ends problem solving.
Learning and reasoning are integrated in that:
– conceptual inference and hierarchical skills provide high-level descriptions for reinforcement learning;
– problem-solving traces form the basis of new skills and concepts.
Learning is automatic but could be adapted to benefit from advice and traces of expert behavior.
Structure learning occurs from single instances; value learning should be much faster than in typical methods.
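As a rough illustration of learning a linear value function from experience, the sketch below applies a plain temporal-difference (TD(0)) update to a weight vector. This is a swapped-in textbook technique, not the hierarchical model-based variant the slide refers to; the function names, feature names, and the `alpha` and `gamma` parameters are all assumptions made for the example.

```python
from typing import Dict

Weights = Dict[str, float]
Features = Dict[str, float]


def linear_value(weights: Weights, features: Features) -> float:
    """Weighted sum of feature values; missing weights count as 0."""
    return sum(weights.get(name, 0.0) * x for name, x in features.items())


def td_update(weights: Weights, features: Features, reward: float,
              next_value: float, alpha: float = 0.01,
              gamma: float = 0.9) -> Weights:
    """Return new weights after one TD(0) step (assumed update rule).

    The error is the gap between the one-step target
    (reward + gamma * next_value) and the current estimate.
    """
    error = reward + gamma * next_value - linear_value(weights, features)
    updated = dict(weights)
    for name, x in features.items():
        updated[name] = updated.get(name, 0.0) + alpha * error * x
    return updated


# Hypothetical usage: one feature, one rewarded step.
w = td_update({}, {"speed": 2.0}, reward=1.0, next_value=0.0, alpha=0.5)
```

Because the value function is linear, each update only touches the weights of features present in the current state, which is one reason high-level features from concepts and skills could speed up learning.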

6 How is Knowledge Being Learned?
[Figure: diagram with nodes labeled A, B, C, D, and E]
Means-ends analysis produces hierarchical skills.
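The idea of caching means-ends problem solving as hierarchical skills can be sketched as follows. This is a minimal toy, assuming propositional conditions and operators that only add facts (no deletions, no variables); the `Operator`, `Skill`, and `solve` names, and the operators in the usage example other than `*slow-down`, are hypothetical and not taken from ICARUS.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Sequence, Set


@dataclass(frozen=True)
class Operator:
    name: str
    requires: FrozenSet[str]  # conditions that must hold before execution
    achieves: FrozenSet[str]  # conditions made true by execution


@dataclass
class Skill:
    objective: str
    ordered: List[str]  # subgoal names, then the final primitive operator


def solve(goal: str, state: Set[str], operators: Sequence[Operator],
          learned: Dict[str, Skill], depth: int = 0,
          max_depth: int = 10) -> Optional[Set[str]]:
    """Achieve `goal` by backward chaining; cache each solution as a Skill.

    Returns the resulting state, or None if no operator chain is found.
    """
    if goal in state:
        return set(state)
    if depth >= max_depth:
        return None
    for op in operators:
        if goal not in op.achieves:
            continue
        current = set(state)
        steps: List[str] = []
        for condition in sorted(op.requires):
            if condition in current:
                continue
            result = solve(condition, current, operators, learned,
                           depth + 1, max_depth)
            if result is None:
                break  # this operator cannot be enabled; try the next one
            current = result
            steps.append(condition)  # refers to the skill cached for it
        else:
            current |= op.achieves
            learned[goal] = Skill(goal, steps + [op.name])
            return current
    return None


operators = [
    Operator("*slow-down", frozenset({"near-corner"}),
             frozenset({"at-turning-speed"})),
    Operator("*steer-right",
             frozenset({"at-turning-speed", "in-rightmost-lane"}),
             frozenset({"behind-right-corner"})),
    Operator("*change-lane-right", frozenset(),
             frozenset({"in-rightmost-lane"})),
]
learned: Dict[str, Skill] = {}
final_state = solve("behind-right-corner", {"near-corner"}, operators, learned)
```

Each solved subgoal leaves behind a skill indexed by its objective, so the top-level skill refers to lower-level ones by name, giving a hierarchy from a single problem-solving episode.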

7 How is the Knowledge Represented?
ICARUS casts both background and learned knowledge as:
– logical relational concepts with linear value functions
– logical relational skills with linear value functions
that are defined in terms of other skills and concepts.
Background knowledge constrains the learning of value functions and provides components for structure learning.
ICARUS provides a formalism for encoding knowledge:
– about physical domains with continuous attributes
– that includes probability of success, expected duration, and resource requirements
– described at multiple levels of abstraction over both state (with concepts) and time (with skills).
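One simple way to maintain the probability of success and expected duration mentioned above is an incremental running average per skill. The sketch below is only an assumption about how such estimates could be kept; the `SkillStats` class and its fields are hypothetical, and the slides do not say how ICARUS computes these quantities.

```python
from dataclasses import dataclass


@dataclass
class SkillStats:
    """Running estimates for one skill (hypothetical structure)."""
    attempts: int = 0
    success_rate: float = 0.0   # estimated probability of success
    mean_duration: float = 0.0  # estimated expected duration

    def record(self, succeeded: bool, duration: float) -> None:
        """Fold one execution outcome into the running averages."""
        self.attempts += 1
        step = 1.0 / self.attempts
        self.success_rate += step * (float(succeeded) - self.success_rate)
        self.mean_duration += step * (duration - self.mean_duration)


stats = SkillStats()
stats.record(True, 4.0)
stats.record(False, 6.0)
```

The incremental form avoids storing past episodes, which fits the incremental, cumulative style of learning the earlier slides describe.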

8 How is the Knowledge Represented?
Some ICARUS driving skills:

(make-right-turn (?self ?corner)
  :objective ((behind-right-corner ?corner))
  :start     ((in-rightmost-lane ?self)
              (ahead-right-corner ?corner)
              (at-turning-distance ?corner))
  :requires  ((near-block-corner ?corner)
              (at-turning-speed ?self))
  :ordered   ((begin-right-turn ?self ?corner)
              (end-right-turn ?self ?corner))
  :value     (30.0))

(slow-for-intersection (?self)
  :percepts  ((self ?self speed ?speed)
              (corner ?corner street-dist ?dist))
  :objective ((slow-enough-intersection ?self))
  :requires  ((near-block-corner ?corner))
  :actions   ((*slow-down))
  :value     (+ (× -5.2 ?dist) (× 20.3 ?speed)))
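The :value field of slow-for-intersection is a linear expression over percept variables. As a small sketch of how such an expression could be evaluated once the variables are bound, the code below walks a nested prefix expression. The tuple encoding, the `evaluate` function, and the sample bindings are assumptions for illustration; the slide's × is written as "*" here.

```python
from typing import Dict, Tuple, Union

Expr = Union[float, int, str, Tuple]


def evaluate(expr: Expr, bindings: Dict[str, float]) -> float:
    """Evaluate a prefix arithmetic expression with ?variables."""
    if isinstance(expr, (int, float)):
        return float(expr)
    if isinstance(expr, str):
        return float(bindings[expr])  # a variable such as "?dist"
    operator, *arguments = expr
    values = [evaluate(argument, bindings) for argument in arguments]
    if operator == "+":
        return sum(values)
    if operator == "*":
        product = 1.0
        for value in values:
            product *= value
        return product
    raise ValueError(f"unsupported operator: {operator!r}")


# Value expression copied from the slow-for-intersection skill.
slow_value = ("+", ("*", -5.2, "?dist"), ("*", 20.3, "?speed"))

# Hypothetical percept bindings: 10 units from the corner at speed 2.
value = evaluate(slow_value, {"?dist": 10.0, "?speed": 2.0})
```

With these bindings the result is approximately -11.4 (that is, -52.0 + 40.6), so under this expression the skill becomes more valuable as the distance shrinks and the speed rises.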

9 How is the Knowledge Represented?
Some ICARUS driving concepts:

(corner-ahead-left (?corner)
  :percepts  ((corner ?corner r ?r theta ?theta))
  :tests     ((< ?theta 0) (>= ?theta ))
  :value     (+ (× 5.6 ?r) (× 3.1 ?theta)))

(in-intersection (?self)
  :percepts  ((self ?self)
              (corner ?ncorner street-dist ?sdist))
  :positives ((near-block-corner ?ncorner)
              (corner-straight-ahead ?scorner))
  :negatives ((far-block-corner ?fcorner))
  :tests     ((< ?sdist 0.0))
  :value     (-10.0))
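A concept such as in-intersection combines positive conditions, negative conditions, and numeric tests. The sketch below shows one way that check could look in a simplified, propositional form: variables like ?ncorner are dropped, beliefs are plain strings, and the `Concept` class and `holds` function are hypothetical names rather than part of ICARUS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set


@dataclass(frozen=True)
class Concept:
    name: str
    positives: FrozenSet[str]              # beliefs that must be present
    negatives: FrozenSet[str]              # beliefs that must be absent
    test: Callable[[Dict[str, float]], bool]  # numeric test on percepts


def holds(concept: Concept, beliefs: Set[str],
          percepts: Dict[str, float]) -> bool:
    """True when all positives hold, no negative holds, and the test passes."""
    return (concept.positives <= beliefs
            and not (concept.negatives & beliefs)
            and concept.test(percepts))


# Ground version of the in-intersection concept from the slide.
in_intersection = Concept(
    name="in-intersection",
    positives=frozenset({"near-block-corner", "corner-straight-ahead"}),
    negatives=frozenset({"far-block-corner"}),
    test=lambda percepts: percepts["street-dist"] < 0.0,
)

beliefs = {"near-block-corner", "corner-straight-ahead"}
matched = holds(in_intersection, beliefs, {"street-dist": -1.5})
```

A full matcher would also bind variables across percepts and sub-concepts; this version only shows how the three kinds of conditions combine.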

10 What is the Domain?
Our initial studies of ICARUS have focused on a simulated in-city driving environment that:
– requires integration of perception, action, and cognition
– involves both reactive control and goal direction
– supports many distinct tasks of varying complexity
– provides clear opportunities for cumulative learning
The environment lets us vary domain characteristics systematically and record statistics on agent behavior.
However, ICARUS aims at broad generality and should support reasoning and learning in:
– both first-person and strategy games
– crisis-response tasks involving physical response
– intelligent assistants for office activities

11 How is Progress Being Measured?
Dependent variables:
– Efficiency of task execution (e.g., driving time)
– Quality of task execution (e.g., gas used, accidents)
Higher-order metrics:
– Rate and asymptote of learning curves
– Transfer to related tasks and altered environments
Independent variables:
– Inclusion or omission of learning methods
– Amount of background knowledge available
– Task difficulty and environmental complexity
– Amount of experience, task/environment similarity
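The rate and asymptote of a learning curve listed above can be summarized in many ways; the slides do not define them. The sketch below uses one simple convention as an assumption: the asymptote is the mean of the last few trials, and the rate is reported as the first trial that falls within a tolerance band around that asymptote. The function names, the `tail` and `tolerance` parameters, and the driving times are all made up for illustration.

```python
from statistics import mean
from typing import Sequence


def asymptote(scores: Sequence[float], tail: int = 5) -> float:
    """Estimate the asymptote as the mean of the last `tail` trials."""
    return mean(scores[-tail:])


def trials_to_criterion(scores: Sequence[float], tail: int = 5,
                        tolerance: float = 0.05) -> int:
    """First trial (1-based) within `tolerance` of the estimated asymptote."""
    target = asymptote(scores, tail)
    band = tolerance * abs(target)
    for trial, score in enumerate(scores, start=1):
        if abs(score - target) <= band:
            return trial
    return len(scores)


# Hypothetical driving times (lower is better) over ten trials.
driving_times = [100.0, 80.0, 65.0, 55.0, 51.0, 50.0, 50.0, 50.0, 50.0, 50.0]
level = asymptote(driving_times)
trials = trials_to_criterion(driving_times)
```

Comparing these two numbers across conditions (for example, with and without a learning method enabled) gives a compact view of how the independent variables affect learning.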

12 What are the Technical Milestones?
Year 1
– Learn estimates of driving skills' duration and success
– Learn higher-level driving skills via problem solving
– Demonstrate improvement on multi-package delivery
Year 2
– Learn value functions for driving concepts
– Learn trade-offs among many high-level tasks
– Demonstrate transfer and scaling to complex tasks
Year 3
– Acquire place and route knowledge
– Support episodic memory and perceptual attention
– Demonstrate cumulative learning and change resilience
We will also examine other domains to ensure generality.

