Presentation is loading. Please wait.

Presentation is loading. Please wait.

Learning To Use Memory Nick Gorski & John Laird Soar Workshop 2011.

Similar presentations


Presentation on theme: "Learning To Use Memory Nick Gorski & John Laird Soar Workshop 2011."— Presentation transcript:

1 Learning To Use Memory Nick Gorski & John Laird Soar Workshop 2011

2 Agent Memory & Learning Memory Environment action observations reward 2

3 Agent Actions, Internal & External Environment action observations reward {go left, go right, eat food, bid 5 5s, pick a flower} action {store, retrieve, maintain} 3 Memory

4 Internal Actions Over Memory Internal actions are deliberate or automatic Automatic actions are in the background – Architectural and always happen – Ex: storage to episodic memory Deliberate actions are in the foreground – Procedural knowledge and cognitive – Ex: storage to working memory 4

5 Agent Internal Reinforcement Learning Environment action observations reward 5 Memory Reinforcement Learning Reinforcement Learning Action Selection

6 Assumptions Custom framework, not using Soar Simple memory models Simple tasks 6

7 Learning to Use Memory Research Question: – When can agents learn to use memory? Idea: – Investigate dynamics of memory and environment independently Need: – Simple, parameterized task 7

8 An Interactive TMaze LEFT (observation) {forward} (avail. actions) 8

9 An Interactive TMaze DECIDE (observation) {left, right} (avail. actions) 9

10 An Interactive TMaze +1 (reward) 10

11 TMaze A/B C C Base TMaze 11 Question: how much knowledge is needed to perform this task?

12 Parameterized TMazes A/B C C Base TMazeTemporal Delay A/B C C C C # Dependent Actions D D D D A/B X/Y C C Concurrent Knowledge A B W X Y Z C C Amt. of Knowledge A/B C C 2 nd Order Knowledge 12

13 Two Working Memory Models Internal action toggles between memory states Less expressive Ungrounded knowledge Internal action stores current observation More expressive Grounded knowledge Bit memory toggle A/B Gated WM gate

14 TMaze 14 A/B C C Base TMaze

15 Bit Memory & TMaze Methodology: – Modify memory to attribute blame Interfering behavior in choice location Doesnt manifest with GWM 15

16 State Diagram: Bit Memory TMaze 16 L L L L 1 1 trueperceptmem L L L L 0 0 trueperceptmem R R R R 1 1 trueperceptmem R R R R 0 0 trueperceptmem L L C C 1 1 trueperceptmem R R C C 1 1 trueperceptmem R R C C 0 0 trueperceptmem L L C C 0 0 trueperceptmem toggle STARTING STATES up leftright leftright leftright toggle leftright

17 State Diagram: GWM TMaze 17 L L L L L L trueperceptmem L L L L trueperceptmem R R R R trueperceptmem R R R R R R trueperceptmem L L C C L L trueperceptmem L L C C trueperceptmem R R C C trueperceptmem R R C C R R trueperceptmem L L C C C C trueperceptmem R R C C C C trueperceptmem gate STARTING STATES gate up gate leftright leftrightleftright left right

18 Number of Dependent Actions 18 C C # Dependent Actions D D D D

19 What Weve Learned Our machine learning intuition is often wrong (and yours probably is, too!) Chicken & Egg Problem State ambiguity is very problematic to learning 19

20 Chicken & Egg Problem Prospective uses of memory are hard Case study: bit memory & TMazes 20 1/0 Bit memory Endemic across memory models A/B C C Base TMazeChicken & Egg Problem: Must learn an association between 1 & 0 and A & B Must learn an association between 1 & 0 and left & right To be effective, cant self- interfere with memory in C!

21 Implications for Soar Soar natively supports learning internal acts. Next step: learning to use Soars memories Learning alongside hand-coded procedural knowledge is potentially strong approach Soar got the WM model right RL will never be a magic bullet 21

22 Nuggets & Coal Nearly finished! Better understanding of RL + memory, and thus Soar 9 Parameterized, empirical evaluations of RL gaining traction Optimality not only metric of performance Not quite finished! Qualitative results, but no closed form results yet No recent results for long term memories Not immediately applicable to Soar 22


Download ppt "Learning To Use Memory Nick Gorski & John Laird Soar Workshop 2011."

Similar presentations


Ads by Google