Internal Actions Over Memory Internal actions are deliberate or automatic Automatic actions are in the background – Architectural and always happen – Ex: storage to episodic memory Deliberate actions are in the foreground – Procedural knowledge and cognitive – Ex: storage to working memory 4
TMaze A/B C C Base TMaze 11 Question: how much knowledge is needed to perform this task?
Parameterized TMazes A/B C C Base TMazeTemporal Delay A/B C C C C # Dependent Actions D D D D A/B X/Y C C Concurrent Knowledge A B W X Y Z C C Amt. of Knowledge A/B C C 2 nd Order Knowledge 12
Two Working Memory Models Internal action toggles between memory states Less expressive Ungrounded knowledge Internal action stores current observation More expressive Grounded knowledge 13 0 0 1 1 Bit memory toggle A/B Gated WM gate
Bit Memory & TMaze Methodology: – Modify memory to attribute blame Interfering behavior in choice location Doesnt manifest with GWM 15
State Diagram: Bit Memory TMaze 16 L L L L 1 1 trueperceptmem L L L L 0 0 trueperceptmem R R R R 1 1 trueperceptmem R R R R 0 0 trueperceptmem L L C C 1 1 trueperceptmem R R C C 1 1 trueperceptmem R R C C 0 0 trueperceptmem L L C C 0 0 trueperceptmem toggle STARTING STATES up leftright leftright leftright toggle leftright
State Diagram: GWM TMaze 17 L L L L L L trueperceptmem L L L L trueperceptmem R R R R trueperceptmem R R R R R R trueperceptmem L L C C L L trueperceptmem L L C C trueperceptmem R R C C trueperceptmem R R C C R R trueperceptmem L L C C C C trueperceptmem R R C C C C trueperceptmem gate STARTING STATES gate up gate leftright leftrightleftright left right
Number of Dependent Actions 18 C C # Dependent Actions D D D D
What Weve Learned Our machine learning intuition is often wrong (and yours probably is, too!) Chicken & Egg Problem State ambiguity is very problematic to learning 19
Chicken & Egg Problem Prospective uses of memory are hard Case study: bit memory & TMazes 20 1/0 Bit memory Endemic across memory models A/B C C Base TMazeChicken & Egg Problem: Must learn an association between 1 & 0 and A & B Must learn an association between 1 & 0 and left & right To be effective, cant self- interfere with memory in C!
Implications for Soar Soar natively supports learning internal acts. Next step: learning to use Soars memories Learning alongside hand-coded procedural knowledge is potentially strong approach Soar got the WM model right RL will never be a magic bullet 21
Nuggets & Coal Nearly finished! Better understanding of RL + memory, and thus Soar 9 Parameterized, empirical evaluations of RL gaining traction Optimality not only metric of performance Not quite finished! Qualitative results, but no closed form results yet No recent results for long term memories Not immediately applicable to Soar 22