Published by Presley Barlet. Modified over 2 years ago.

1
Emotion-Based Decision and Learning
Bruno Damas
14 February 2004, Instituto Sistemas e Robótica

2
Worst-case agent scenario
Complex world, with a large number of perceptions
Minimum a priori knowledge
Very limited computational power (both computation time and memory size)
Possibly non-stationary world

3
Discretization of the perception space leads to exponential growth in the computational resources needed as the number of perceptions increases.
Only the most important information must be preserved.
Solution: apply the concept of somatic markers to build an associative memory capable of dealing with such problems.

4
Emotions in human decision
Somatic markers store Situation/Connotation associations (feelings) in human memory.
When a decision has to be made, several possible scenarios are built in the mind, each associated with one of the behaviors the subject may adopt.
Somatic markers, according to their likeness to these hypothetical situations, induce a body response (the emotion) that corresponds to the desirability of the situation.

5
[Diagram: from the Present Situation, actions a1, a2 and a3 lead to Future Situations 1, 2 and 3. Somatic Markers attach desirabilities u1, u2 and u3 to them, which feed the Decision.]
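
The decision step in the diagram can be read as picking the action whose predicted future situation has the highest desirability. The following is only a minimal sketch under that reading: `choose_action` and the toy values in `toy_u` are hypothetical names and numbers, not taken from the slides.

```python
def choose_action(actions, desirability):
    """Return the action with the highest predicted desirability u.

    `desirability` is any callable mapping an action to a number; here it
    stands in for the somatic-marker evaluation (an assumption).
    """
    return max(actions, key=desirability)


# Toy desirabilities for the three actions of the diagram (made-up values).
toy_u = {"a1": 0.2, "a2": 0.7, "a3": -0.4}
best = choose_action(toy_u, toy_u.get)
```

This selection is purely greedy: it always takes the currently best-looking action and never explores.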

6
Decision and learning process
To implement such an emotion-based decision process in an artificial agent, at least three mechanisms are required:
  An associative memory
  A memory management system
  A connotation estimation procedure

7
Associative memory
What should be stored in associative memory?
  (Perception, Action) → C or dC
  Situation → Desirability
One must know where to find invariances.
Example: filling the tank vs. putting in only 5 l

8
Estimation procedure
Non-parametric regression problem with K samples (x_i, y_i).
x → y?
There is no reference model!

9
Proposed estimation procedure
Similarity measure
x = (P, A), y = u(P, A), y_i = u(P, A | dC_i)
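
The formula on this slide did not survive extraction, so the author's exact estimator is unknown. As an illustration of what a similarity-based non-parametric estimate over K stored samples can look like, here is a sketch using a standard kernel-weighted average (Nadaraya-Watson regression). The Gaussian similarity, the `bandwidth` parameter, the neutral default for an empty memory and all function names are assumptions introduced for this example.

```python
import math


def similarity(x, xi, bandwidth=1.0):
    """Gaussian similarity between two (perception, action) vectors.

    The Gaussian form is an assumed choice; the slides only say that a
    similarity measure is needed.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def estimate_desirability(x, memory, bandwidth=1.0):
    """Similarity-weighted average of the stored desirabilities y_i.

    `memory` is a list of (x_i, y_i) pairs. An empty memory (or all-zero
    weights) yields 0.0, a neutral default chosen for this sketch.
    """
    weights = [similarity(x, xi, bandwidth) for xi, _ in memory]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * yi for w, (_, yi) in zip(weights, memory)) / total


# Two stored records: one pleasant, one unpleasant (made-up values).
memory = [((0.0, 0.0), 1.0), ((1.0, 0.0), -1.0)]
midpoint = estimate_desirability((0.5, 0.0), memory)
```

Each estimate scans all K records once, so its cost grows linearly with the memory size.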

10
Relation to classical decision

11
Design issues
Associative memory
  Distance measure (similarity)
  Memory capacity
Continuous-time signal sampling and reconstruction
  Cutoff frequency of the low-pass filter
  Sampling rate

12
Finite Resources: Memory Management
The agent must start picking and discarding memory records when the associative memory reaches its full capacity.
The policy for choosing which record to discard is crucial:
  Agent performance should increase, i.e., estimation should improve in the long run.
  Discarding mechanisms must be fast and must have, in the worst case, the same computational complexity as the estimation mechanisms.

13
First Approach
Distribute the memory records as uniformly as possible in the perception space. Discarding records in crowded areas should do the trick.
Second Approach
Eliminate memory points that hardly make a difference in the estimation/interpolation process. Local variance could be a possible heuristic, but care must be taken, since the order in which memory points are acquired does matter.
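
One way to picture the first approach is to drop, whenever capacity is exceeded, the record that sits closest to another record. The sketch below is an assumption-laden illustration, not the author's policy: `most_crowded_index`, `insert_record`, the Euclidean distance and the tie-breaking rule are all hypothetical choices. The naive pairwise scan is quadratic in the memory size, so a practical version would need a cheaper crowding test to stay within the cost of the estimation step.

```python
import math


def most_crowded_index(memory):
    """Index of the record whose nearest neighbour is closest.

    `memory` is a list of (x_i, y_i) pairs with x_i a tuple of floats.
    Ties go to the lowest index, i.e. the older record (an arbitrary choice).
    """
    best_index, best_dist = None, math.inf
    for i, (xi, _) in enumerate(memory):
        for j, (xj, _) in enumerate(memory):
            if i == j:
                continue
            d = math.dist(xi, xj)  # Euclidean distance (Python 3.8+)
            if d < best_dist:
                best_index, best_dist = i, d
    return best_index


def insert_record(memory, record, capacity):
    """Append a record and, if over capacity, discard the most crowded one."""
    memory.append(record)
    if len(memory) > capacity:
        del memory[most_crowded_index(memory)]


# Made-up one-dimensional perceptions: two records crowd the region near 0.
records = [((0.0,), 0.1), ((0.1,), 0.2), ((5.0,), 0.3)]
insert_record(records, ((10.0,), 0.4), capacity=3)
remaining = [x for x, _ in records]
```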

14
Third Approach
Take non-stationary environments into account. This is the hardest case.
Time must then be considered in the interpolation function, and the removal policy must be reformulated (in the limit: FIFO).
Obtaining the environment change rate (is it slow-varying or fast-varying?) can become a major problem.
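
To illustrate what "considering time in the interpolation function" might mean, the sketch below multiplies a spatial similarity weight by an exponential age decay and uses a fixed-length queue for the FIFO limit case. Everything here is an assumption made for the example: the Gaussian kernel, the decay constant `tau` (which stands in for the unknown environment change rate) and the function name `time_weighted_estimate`.

```python
import math
from collections import deque


def time_weighted_estimate(x, memory, now, bandwidth=1.0, tau=10.0):
    """Similarity-weighted average in which older records count less.

    `memory` holds (x_i, y_i, t_i) triples. Each weight is a Gaussian
    similarity times exp(-(now - t_i) / tau); both factors are assumed forms.
    """
    num = den = 0.0
    for xi, yi, ti in memory:
        spatial = math.exp(-math.dist(x, xi) ** 2 / (2.0 * bandwidth ** 2))
        recency = math.exp(-(now - ti) / tau)
        num += spatial * recency * yi
        den += spatial * recency
    return num / den if den > 0.0 else 0.0


# FIFO limit case: a bounded deque silently drops the oldest record.
fifo_memory = deque(maxlen=3)
for t, y in enumerate([1.0, 1.0, -1.0, -1.0]):  # the world changes at t = 2
    fifo_memory.append(((0.0,), y, float(t)))

recent_view = time_weighted_estimate((0.0,), fifo_memory, now=3.0, tau=1.0)
```

A small `tau` tracks fast changes but forgets quickly, while a large `tau` behaves like the stationary estimator.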

15
Conclusions
Major advantages:
  No need for discretization of a continuous perception state (cf. Reinforcement Learning)
  Ability to deal with arbitrarily large environments under any computational/memory restrictions
  No need for previous world examples (cf. Neural Networks): the agent learns from the beginning.

16
Conclusions
Major drawbacks:
  A similarity measure is needed
  It is difficult to choose an appropriate memory size
  This is a greedy architecture.

17
Major Questions
Self-adjustment of the similarity measure (particular case: identification of irrelevant perception vector elements. There are statistical tools that do that, but...)
Choosing an adequate memory size, possibly based on:
  Perception vector dimension
  Bounds for each perception vector element
  Variability of the true unknown function we are trying to estimate (bandwidth)
Exploration vs. exploitation problem

18
Current Work
Sequences of actions
Application of this architecture to:
  Hidden Markov Chain
  Inverted Pendulum control
  Dynamic obstacle avoidance
