
1 NEW TIES WP2 Agent and learning mechanisms

2 Decision making and learning
Agents have a controller (a decision tree, DQT)
- Input: the situation (as perceived = seen/heard/interpreted)
- Output: an action
Decision making = using the DQT
Learning = modifying the DQT
Decisions also depend on inheritable "attitude genes" (learned through evolution)

3 Example of a DQT
[Diagram: an example DQT. Legend: B = bias node (a genetic bias, e.g. 0.2, sits on each edge), T = test node (Boolean YES/NO choice), A = action node. A root bias node (0.5 / 0.5) leads to two test nodes, "VISUAL: FRONT FOOD REACHABLE" and "BAG: FOOD"; one branch of each test leads to PICKUP resp. EAT (bias 1.0), the other branch to a choice among TURN LEFT / MOVE / TURN RIGHT (biases 0.2 / 0.6 / 0.2).]
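Below is a minimal Python sketch of such a controller, with the three node types from the legend and a decide() walk from the root to an action. The class names, the situation encoding, and the reconstructed example tree are illustrative assumptions, not the NEW TIES implementation.

```python
import random

class Action:
    """Leaf node: returns the action it represents."""
    def __init__(self, name):
        self.name = name
    def decide(self, situation):
        return self.name

class Test:
    """Boolean choice on a perceived concept, e.g. 'see_food'."""
    def __init__(self, concept, yes, no):
        self.concept, self.yes, self.no = concept, yes, no
    def decide(self, situation):
        branch = self.yes if situation.get(self.concept, False) else self.no
        return branch.decide(situation)

class Bias:
    """Stochastic choice among children, weighted by the bias on each edge."""
    def __init__(self, children, biases):
        self.children, self.biases = children, biases
    def decide(self, situation):
        child = random.choices(self.children, weights=self.biases)[0]
        return child.decide(situation)

# Rough reconstruction of the example tree on the slide (concept names are guesses).
walk = Bias([Action("TURN LEFT"), Action("MOVE"), Action("TURN RIGHT")],
            [0.2, 0.6, 0.2])
dqt = Bias([Test("visual_front_food_reachable", Action("PICKUP"), walk),
            Test("bag_food", Action("EAT"), walk)],
           [0.5, 0.5])

print(dqt.decide({"visual_front_food_reachable": True}))  # PICKUP or a movement, depending on the draw
```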

4 Interaction of evolution & individual learning
Bias node with n children, each child i with bias b_i
Bias ≠ probability
- Bias b_i is learned and changing (name: learned bias)
- Genetic bias g_i is inherited, part of the genome, constant
Actual probability of choosing child i: p(b_i, g_i) = b_i + (1 − b_i) · g_i
Learned and inherited behaviour are linked through this formula
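A small sketch of this combination rule, assuming the combined values p(b_i, g_i) are used as relative selection weights over the children (that weighting step is an assumption added here, not stated on the slide):

```python
import random

def combined_weight(learned_bias, genetic_bias):
    # p(b, g) = b + (1 - b) * g  (the slide's formula)
    return learned_bias + (1.0 - learned_bias) * genetic_bias

def choose_child(children, learned_biases, genetic_biases):
    # random.choices treats the combined values as relative weights
    weights = [combined_weight(b, g) for b, g in zip(learned_biases, genetic_biases)]
    return random.choices(children, weights=weights)[0]

# Example: three children with different learned biases but equal genetic biases.
print(choose_child(["TURN LEFT", "MOVE", "TURN RIGHT"],
                   [0.1, 0.7, 0.1],    # learned biases b_i
                   [0.2, 0.2, 0.2]))   # genetic biases g_i
```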

5 DQT nodes & parameters cont'd
Test node language: native concepts + emerging concepts
- Native: see_agent, see_mother, see_food, have_food, see_mate, …
- New concepts can emerge by categorisation (discrimination game)

6 Learning: the heart of the emergence engine
Evolutionary learning:
- not within an agent (not during its lifetime), but over generations
- by variation + selection
Individual learning:
- within one agent, during its lifetime
- by reinforcement learning
Social learning:
- during lifetime, between interacting agents
- by sending/receiving + adopting knowledge pieces

7 Types of learning: properties
Evolutionary learning:
- Agent does not create new knowledge during its lifetime
- Basic DQT + genetic biases are inheritable
- "Knowledge creator" = crossover and mutation
Individual learning:
- Agent does create new knowledge during its lifetime
- DQT + learned biases are modified
- "Knowledge creator" = reinforcement learning, driven by rewards (see the sketch below)
- Individually learnt knowledge dies with its host agent
Social learning:
- Agent imports knowledge already created elsewhere (new? not new?)
- Adoption of imported knowledge ≈ crossover
- Importing knowledge pieces can save effort for the recipient and can create novel combinations
- Exporting knowledge helps its preservation after the death of its host
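A hedged sketch of what reward-driven adjustment of a learned bias could look like; the update rule and learning rate are illustrative assumptions, not the NEW TIES reinforcement-learning algorithm:

```python
def reinforce_bias(learned_bias, reward, learning_rate=0.1):
    """Nudge the learned bias toward 1 on positive reward, toward 0 on negative."""
    target = 1.0 if reward > 0 else 0.0
    return learned_bias + learning_rate * (target - learned_bias)

b = 0.5
b = reinforce_bias(b, reward=+1)   # e.g. the chosen action gained energy
b = reinforce_bias(b, reward=-1)   # e.g. the chosen action lost energy
print(round(b, 3))
```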

8 Present status of the types of learning
Evolutionary learning:
- Demonstrated in 2 NT scenarios
- Autonomous selection/reproduction causes problems with population stability (implosion/explosion)
Individual learning:
- Code exists, but never demonstrated in NT scenarios
Social learning:
- Under construction/design, based on the "telepathy" approach
- Communication protocols + adoption mechanisms needed

9 Evolution: variation operators
Operators for the DQT:
- Crossover = subtree swap
- Mutation:
  - Substitute a subtree with a random subtree
  - Change concepts in test nodes
  - Change the bias on an edge
Operators for attitude genes:
- Crossover = full arithmetic crossover
- Mutation:
  - Add Gaussian noise
  - Replace with a random value
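The sketch below illustrates these operators under simplifying assumptions: DQT nodes are plain dicts, crossover grafts a random subtree of one parent onto a random node of a copy of the other, and attitude genes are a list of floats. None of this is the actual NEW TIES code.

```python
import copy
import random

def random_subtree():
    """Stand-in for generating a random DQT subtree (here: a single action node)."""
    return {"type": "action", "name": random.choice(["MOVE", "EAT", "PICKUP"]), "children": []}

def all_nodes(tree):
    yield tree
    for child in tree["children"]:
        yield from all_nodes(child)

def crossover(parent_a, parent_b):
    """Subtree swap: replace a random node of a copy of parent_a by a random subtree of parent_b."""
    child = copy.deepcopy(parent_a)
    target = random.choice(list(all_nodes(child)))
    donor = copy.deepcopy(random.choice(list(all_nodes(parent_b))))
    target.clear()
    target.update(donor)
    return child

def mutate(tree):
    """Replace a randomly chosen subtree with a freshly generated one."""
    mutant = copy.deepcopy(tree)
    target = random.choice(list(all_nodes(mutant)))
    target.clear()
    target.update(random_subtree())
    return mutant

def gene_crossover(genes_a, genes_b, alpha=0.5):
    """Full arithmetic crossover: per-gene weighted average of the parents."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(genes_a, genes_b)]

def gene_mutate(genes, sigma=0.05):
    """Add Gaussian noise to each attitude gene."""
    return [g + random.gauss(0, sigma) for g in genes]

# Tiny demo on two hand-made parents.
a = {"type": "bias", "name": None, "children": [random_subtree(), random_subtree()]}
b = {"type": "bias", "name": None, "children": [random_subtree()]}
print(crossover(a, b), gene_mutate([0.3, 0.8]))
```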

10 Evolution: selection operators
Mate selection:
- Mate action chosen by the DQT
- Propose – accept proposal
- Adulthood OK
Survivor selection:
- Dead if too old (≥ 80 years)
- Dead if zero energy
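A minimal sketch of the survivor-selection rule stated above (dead if ≥ 80 years old or out of energy); the Agent type and field names are illustrative assumptions:

```python
from dataclasses import dataclass

MAX_AGE = 80  # years

@dataclass
class Agent:
    age: float
    energy: float

def survives(agent: Agent) -> bool:
    # Agents die when too old or when their energy reaches zero.
    return agent.age < MAX_AGE and agent.energy > 0

population = [Agent(age=12, energy=35.0), Agent(age=81, energy=10.0), Agent(age=40, energy=0.0)]
population = [a for a in population if survives(a)]
print(len(population))  # only the first agent survives
```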

11 Experiment: Simple world – Setup: Environment
World size: 200 x 200 grid cells
Agents and food only (no tokens, roads, etc.); both are variable in number
Initial distribution of agents (500): in the upper-left corner
Initial distribution of food (10000): 5000 each in the upper-left and lower-right corners
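The same setup written as a configuration sketch; only the values come from the slide, the key names are illustrative:

```python
SIMPLE_WORLD = {
    "world_size": (200, 200),   # grid cells
    "initial_agents": 500,      # all placed in the upper-left corner
    "initial_food": 10_000,     # 5000 in the upper-left, 5000 in the lower-right corner
    "tokens": 0,                # no tokens, roads, etc.
    "roads": 0,
}
```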

12 Experiment: Simple world – Setup: Agents
Native knowledge (concepts and DQT subtrees):
- Navigating (random walk)
- Eating (identify, pick up, and eat plants)
- Mating (identify mates, propose/agree)
Random DQT branches:
- Differ per agent
- Based on the "pool" of native concepts

13 Experiment: Simple world
Simulation continued for 3 months of real time to test stability

14 Experiment: Poisonous Food – Setup: Environment
Two types of food: poisonous (decreases energy) and edible (increases energy)
World size: 200 x 200 grid cells
Agents and food only (no tokens, roads, etc.); both are variable in number
Initial distribution of agents (500): uniform random over the grid
Initial distribution of food (10000): 5000 of each type, uniform random over the same grid space as the agents

15 Experiment: Poisonous Food – Setup: Agents
Native knowledge:
- Identical to the simple-world experiment
Additional native knowledge:
- Can distinguish poisonous from edible plants
- The relation with eating/picking up is not present
No random DQT branches

16 Experiment: Poisonous Food – Measures
- Population size
- Welfare (energy)
- Number of poisonous and edible plants
- Complexity of the controller (number of nodes)
- Age

17 Experiment: Poisonous Food Demo

18 Experiment: Poisonous Food Results

