Design of Multi-Agent Systems Teacher Bart Verheij Student assistants Albert Hankel Elske van der Vaart Web site


1 Design of Multi-Agent Systems Teacher Bart Verheij Student assistants Albert Hankel Elske van der Vaart Web site http://www.ai.rug.nl/~verheij/teaching/dmas/ (Nestor contains a link)

2 Student presentations
Week 37
* C. Jonker et al. (2002). BDI-Modelling of Intracellular Dynamics. Joris Ijsselmuiden
* R. Wulfhorst et al. (2003). A Multiagent Approach for Musical Interactive Systems. Rosemarijn Looije
* M. Dastani, J. Hulstijn, F. Dignum, J.-J.Ch. Meyer (2004). Issues in Multiagent System Development. Sander van Dijk

3 Student presentations
Week 38
* W. C. Stirling, M. A. Goodrich and D. J. Packard (2002). Satisficing Equilibria: A Non-Classical Theory of Games and Decisions. Dimitri Vrehen
* A. Bazzan and R.H. Bordini (2001). A framework for the simulation of agents with emotions. Report on Experiments with the Iterated Prisoner's Dilemma. Stijn Colen
* I. Dickinson and M. Wooldridge (2003). Towards Practical Reasoning Agents for the Semantic Web.
* E. Norling (2004). Folk Psychology for Human Modelling: Extending the BDI Paradigm.

4 Some practical matters
* Please submit exercises to designofmas@gmail.com.
* Please use naming conventions for file names and message subjects.
* Please read your student mail.

5 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies Not or different in the book

6 Typical structure of a multi-agent system

7 Interactions
* Communication
* Influence on environment ('spheres of influence')
* Organizations, communities, coalitions
* Hierarchical relations
* Cooperation, competition

8 Utilities & preferences
How do we measure the results of a multi-agent system? In terms of preferences and utilities.
Some notation:
* Ω = {ω1, ω2, …}: 'outcomes', future environmental states
* group preferences (assumes cooperation)
* individual preferences

9 Preferences Strict preferences Properties Reflexive: Transitive: Comparable:
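The formulas on this slide did not survive in the transcript. As a sketch only, the usual utility-based definitions are given below. They assume each agent i has a utility function u_i from Ω to the real numbers; that notation is an assumption here and may differ from what the slide showed.

```latex
\begin{align*}
\omega \succeq_i \omega' &\iff u_i(\omega) \ge u_i(\omega') && \text{(preference)}\\
\omega \succ_i \omega' &\iff u_i(\omega) > u_i(\omega') && \text{(strict preference)}\\
\text{Reflexive:}\quad & \forall \omega \in \Omega:\ \omega \succeq_i \omega \\
\text{Transitive:}\quad & \omega \succeq_i \omega' \ \text{and}\ \omega' \succeq_i \omega'' \implies \omega \succeq_i \omega'' \\
\text{Comparable:}\quad & \forall \omega, \omega' \in \Omega:\ \omega \succeq_i \omega' \ \text{or}\ \omega' \succeq_i \omega
\end{align*}
```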

10 Utilities According to utility theory, preferences can be measured in terms of real numbers Example: money But money isn’t always the right measure: think of the subjective value of a million dollars when you have nothing or when you are Bill Gates.

11 Utility & money

12 Zero-sum & constant-sum games
Simplification: two agents
Constant-sum games: the sum of all players' payoffs is the same for any outcome: u_i(ω) + u_j(ω) = C for all ω ∈ Ω
Zero-sum games: all outcomes involve a sum of the players' payoffs of 0: u_i(ω) + u_j(ω) = 0 for all ω ∈ Ω
Example: chess, with payoffs 0, ½, 1 (constant-sum) or -½, 0, ½ (zero-sum)
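The two definitions can be checked mechanically. The sketch below assumes a game is stored as a dictionary from outcome names to payoff pairs (u_i, u_j). The outcome names and the assignment of the chess payoffs to win, draw and loss are illustrative assumptions, not taken from the slide.

```python
def payoff_sums(game):
    """Return the set of distinct values of u_i + u_j over all outcomes."""
    return {u_i + u_j for (u_i, u_j) in game.values()}


def is_constant_sum(game):
    """True if every outcome has the same payoff sum C."""
    return len(payoff_sums(game)) == 1


def is_zero_sum(game):
    """True if every outcome has payoff sum 0."""
    return payoff_sums(game) == {0}


# Hypothetical encoding of chess: (payoff for white, payoff for black).
chess_constant = {"white wins": (1, 0), "draw": (0.5, 0.5), "black wins": (0, 1)}
chess_zero = {"white wins": (0.5, -0.5), "draw": (0, 0), "black wins": (-0.5, 0.5)}

print(is_constant_sum(chess_constant), is_zero_sum(chess_constant))
print(is_constant_sum(chess_zero), is_zero_sum(chess_zero))
```

Subtracting ½ from every payoff turns the first encoding into the second, which is why the two describe the same game.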

13 Zero-sum & constant-sum games In a zero-sum game, one agent's gain is another agent's loss, so zero-sum games are strictly competitive. But there are many non-zero-sum situations.

14 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies

15 Kinds of evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibrium

16 Social welfare Social welfare measures the sum of all agents' utilities for an outcome. Optimal social welfare may not be achievable when individuals are self-interested: each agent follows its own (different) utility function.

17 Example 1
              a2
           s2,1   s2,2
a1  s1,1   (5,6)  (4,3)
    s1,2   (1,2)  (6,4)
Highest social welfare: (5,6)
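As a small illustration, the social welfare of each outcome in Example 1 can be computed directly. The sketch assumes each payoff pair lists agent a1's utility first and a2's second; the dictionary layout and function names are illustrative choices.

```python
# Example 1: (strategy of a1, strategy of a2) -> (utility a1, utility a2).
# Assumption: the first number in each pair belongs to a1.
EXAMPLE_1 = {
    ("s1,1", "s2,1"): (5, 6),
    ("s1,1", "s2,2"): (4, 3),
    ("s1,2", "s2,1"): (1, 2),
    ("s1,2", "s2,2"): (6, 4),
}


def social_welfare(payoffs):
    """Sum of the individual utilities of one outcome."""
    return sum(payoffs)


def best_by_social_welfare(game):
    """Strategy profile whose outcome has the highest utility sum."""
    return max(game, key=lambda profile: social_welfare(game[profile]))


best = best_by_social_welfare(EXAMPLE_1)
print(best, social_welfare(EXAMPLE_1[best]))
```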

18 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies

19 Pareto efficiency or optimality An outcome is Pareto optimal if a better outcome for one agent always results in a worse outcome for some other agent When all agents pursue social welfare, highest social welfare is Pareto optimal. However, a Pareto optimal outcome need not be desirable. E.g., dictatorship Pareto improvement: change that is an improvement for someone without hurting anyone

20 Example 1
              a2
           s2,1   s2,2
a1  s1,1   (5,6)  (4,3)
    s1,2   (1,2)  (6,4)
Pareto efficient: (5,6) and (6,4)
Pareto improvements: e.g. from (1,2) or (4,3) to (5,6)
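The Pareto-efficient outcomes of Example 1 can be found by testing every outcome against every other. This is a minimal sketch; the helper names are hypothetical and the payoff pairs are assumed to be ordered (a1, a2).

```python
EXAMPLE_1 = {
    ("s1,1", "s2,1"): (5, 6),
    ("s1,1", "s2,2"): (4, 3),
    ("s1,2", "s2,1"): (1, 2),
    ("s1,2", "s2,2"): (6, 4),
}


def is_pareto_improvement(new, old):
    """True if `new` makes nobody worse off and somebody strictly better off."""
    no_one_hurt = all(n >= o for n, o in zip(new, old))
    someone_gains = any(n > o for n, o in zip(new, old))
    return no_one_hurt and someone_gains


def pareto_efficient(game):
    """Profiles whose outcome admits no Pareto improvement within the game."""
    return [
        profile
        for profile, payoffs in game.items()
        if not any(is_pareto_improvement(other, payoffs) for other in game.values())
    ]


print(pareto_efficient(EXAMPLE_1))
```

Note that (6,4) is Pareto efficient although its social welfare (10) is lower than that of (5,6): any change away from it costs a1 something.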

21 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies

22 Nash equilibrium
Two strategies s1 and s2 are in Nash equilibrium if:
1. under the assumption that agent i plays s1, agent j can do no better than play s2; and
2. under the assumption that agent j plays s2, agent i can do no better than play s1.
No individual has an incentive to unilaterally change strategy.
Example: driving on the right side of the road
Nash equilibria (in pure strategies) do not always exist and are not always unique.

23 Example 1
              a2
           s2,1   s2,2
a1  s1,1   (5,6)  (4,3)
    s1,2   (1,2)  (6,4)
Nash equilibria: (5,6) and (6,4)
(Arrows between cells on the slide mark 'Nash incentives': an agent's incentive to switch strategy.)

24 Example 1
              a2
           s2,1   s2,2
a1  s1,1   (5,6)  (4,3)
    s1,2   (1,2)  (6,4)
Outcomes corresponding to strategies in Nash equilibrium: (5,6) and (6,4)

25 Example 2
              a2
           s2,1   s2,2
a1  s1,1   (3,6)  (5,3)
    s1,2   (6,2)  (2,5)
No Nash equilibrium

26 Example 3
              a2
           s2,1   s2,2
a1  s1,1   (1,1)  (5,0)
    s1,2   (0,5)  (3,3)
Unique Nash equilibrium: (1,1)

27 Example 3
              a2
           s2,1   s2,2
a1  s1,1   (1,1)  (5,0)
    s1,2   (0,5)  (3,3)
Unique Nash equilibrium: (1,1)
Highest social welfare & Pareto efficient: (3,3)
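The claims about Examples 1 to 3 can be checked with a brute-force search for pure-strategy Nash equilibria. The sketch below assumes payoff pairs are ordered (a1, a2), with a1 choosing the row and a2 the column. The function names are hypothetical, and the search covers pure strategies only.

```python
ROWS = ["s1,1", "s1,2"]  # strategies of a1
COLS = ["s2,1", "s2,2"]  # strategies of a2


def make_game(table):
    """Build {(row, col): (u1, u2)} from a 2x2 list of payoff pairs."""
    return {(r, c): table[i][j] for i, r in enumerate(ROWS) for j, c in enumerate(COLS)}


def pure_nash_equilibria(game):
    """Profiles where neither agent gains by deviating unilaterally."""
    equilibria = []
    for r in ROWS:
        for c in COLS:
            u1, u2 = game[(r, c)]
            a1_content = all(game[(other, c)][0] <= u1 for other in ROWS)
            a2_content = all(game[(r, other)][1] <= u2 for other in COLS)
            if a1_content and a2_content:
                equilibria.append((r, c))
    return equilibria


example_1 = make_game([[(5, 6), (4, 3)], [(1, 2), (6, 4)]])
example_2 = make_game([[(3, 6), (5, 3)], [(6, 2), (2, 5)]])
example_3 = make_game([[(1, 1), (5, 0)], [(0, 5), (3, 3)]])

for name, game in [("Example 1", example_1), ("Example 2", example_2), ("Example 3", example_3)]:
    print(name, pure_nash_equilibria(game))
```

Example 1 yields two equilibria, Example 2 none, and Example 3 exactly one, matching the slide annotations.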

28 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies

29 The Prisoner’s Dilemma
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating. They are told that:
– if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years
– if both confess, then each will be jailed for two years
Both prisoners know that if neither confesses, then they will each be jailed for one year.

30 The Prisoner’s Dilemma The prisoners can either defect (confess) or cooperate (stay silent). The rational action for each individual prisoner is to defect. Example 3 is a prisoner’s dilemma (but note that it tabulates utilities, not prison years: fewer years in prison means a higher utility). Real life: nuclear arms reduction, free riders
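To see why defecting is the rational choice, the prison terms from the story can be turned into utilities and each prisoner's best reply computed. The sketch assumes utility is simply minus the number of years in prison, and uses "C" for cooperating (staying silent) and "D" for defecting (confessing); both are illustrative choices.

```python
# (action of prisoner 1, action of prisoner 2) -> (years for 1, years for 2)
YEARS = {
    ("C", "C"): (1, 1),  # neither confesses
    ("C", "D"): (3, 0),  # only prisoner 2 confesses and is freed
    ("D", "C"): (0, 3),  # only prisoner 1 confesses and is freed
    ("D", "D"): (2, 2),  # both confess
}

# Assumption: utility = -years, so fewer years means higher utility.
UTILITY = {profile: (-y1, -y2) for profile, (y1, y2) in YEARS.items()}


def best_reply_of_prisoner_1(action_of_2):
    """Action of prisoner 1 that maximises his utility against a fixed opponent action."""
    return max("CD", key=lambda mine: UTILITY[(mine, action_of_2)][0])


for other in "CD":
    print("if the other plays", other, "the best reply is", best_reply_of_prisoner_1(other))
```

Defecting is the best reply to both actions, and by symmetry the same holds for prisoner 2, so both defect and serve two years each although mutual cooperation would cost only one year each.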

31 The Prisoner’s Dilemma The Prisoner’s Dilemma is the fundamental problem of multi-agent interactions. It appears to imply that cooperation will not occur in societies of self-interested agents.

32 Recovering cooperation... Conclusions that some have drawn from this analysis: – the game theory notion of rational action is wrong! – somehow the dilemma is being formulated wrongly Arguments to recover cooperation: – We are not all Machiavelli! – The other prisoner is my twin! – The shadow of the future…

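33 The Iterated Prisoner’s Dilemma One answer: play the game more than once. If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate. But when you know how many times you’ll meet your opponent, defection is again rational: in the last round there is no future to protect, so defecting is best, and the same argument then applies to each earlier round in turn.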

34 Axelrod’s tournament Suppose you play iterated prisoner’s dilemma against a range of opponents… What strategy should you choose, so as to maximize your overall payoff? Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner’s dilemma

35 Strategies in Axelrod’s tournament
ALL-D: Always defect.
TIT-FOR-TAT: At the first meeting of an opponent: cooperate. Then do what your opponent did on the previous meeting.
TESTER: First: defect. If the opponent retaliates, play TIT-FOR-TAT. Otherwise intersperse cooperation and defection.
JOSS: As TIT-FOR-TAT, except periodically defect.
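A few of these strategies are easy to try out. The sketch below is not Axelrod's tournament code: it assumes the Example 3 payoffs (3 each for mutual cooperation, 1 each for mutual defection, 5 and 0 when only one side defects), a fixed number of rounds, and a JOSS variant that defects on every fifth round. TESTER is left out for brevity.

```python
# (my move, opponent's move) -> (my payoff, opponent's payoff); "C" cooperate, "D" defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def all_d(my_history, opp_history):
    """ALL-D: always defect."""
    return "D"


def tit_for_tat(my_history, opp_history):
    """TIT-FOR-TAT: cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def make_joss(period=5):
    """JOSS-like strategy: TIT-FOR-TAT, but defect every `period`-th round (assumed schedule)."""
    def joss(my_history, opp_history):
        if (len(my_history) + 1) % period == 0:
            return "D"
        return tit_for_tat(my_history, opp_history)
    return joss


def play(strategy_a, strategy_b, rounds=10):
    """Play the iterated game and return the total payoffs (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print("TFT vs TFT  ", play(tit_for_tat, tit_for_tat))
print("TFT vs ALL-D", play(tit_for_tat, all_d))
print("TFT vs JOSS ", play(tit_for_tat, make_joss()))
```

TIT-FOR-TAT never beats its direct opponent in a single match; it loses by at most one round's difference against ALL-D but does well against cooperative opponents.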

36 Reasons for TIT-FOR-TAT’s success
– Don’t be envious: Don’t play as if it were zero sum!
– Be nice: Start by cooperating, and reciprocate cooperation
– Retaliate appropriately: Always punish defection immediately, but use “measured” force; don’t overdo it
– Don’t hold grudges: Always reciprocate cooperation immediately

37 Overview Introduction Evaluation criteria & equilibria Social welfare Pareto efficiency Nash equilibria The Prisoner’s Dilemma Loose end: dominant strategies

38 Dominant strategy A strategy is dominant for an agent if it is the best choice whatever strategies the other agents play. Dominant strategy equilibrium: each agent uses a dominant strategy. A dominant strategy equilibrium is always a Nash equilibrium (but not every Nash equilibrium is a dominant strategy equilibrium).

39 Example 4
              a2
           s2,1   s2,2
a1  s1,1   (2,3)  (4,5)
    s1,2   (1,2)  (2,3)
Dominant for a1: s1,1
Dominant for a2: s2,2
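Dominance in Example 4 can be verified by comparing each strategy with every alternative against every choice of the other agent. In the sketch below the cell layout is a reconstruction from the flattened slide text, payoff pairs are assumed to be ordered (a1, a2), and the helper names are hypothetical.

```python
# Example 4 (reconstructed layout): (strategy a1, strategy a2) -> (utility a1, utility a2).
EXAMPLE_4 = {
    ("s1,1", "s2,1"): (2, 3),
    ("s1,1", "s2,2"): (4, 5),
    ("s1,2", "s2,1"): (1, 2),
    ("s1,2", "s2,2"): (2, 3),
}


def strategies(game, player):
    """Sorted strategy names of player 0 (a1) or player 1 (a2)."""
    return sorted({profile[player] for profile in game})


def payoff(game, player, mine, theirs):
    """Utility of `player` when playing `mine` against `theirs`."""
    profile = (mine, theirs) if player == 0 else (theirs, mine)
    return game[profile][player]


def dominant_strategies(game, player):
    """Strategies that are a best choice whatever the other agent plays."""
    own = strategies(game, player)
    other = strategies(game, 1 - player)
    return [
        s for s in own
        if all(payoff(game, player, s, o) >= payoff(game, player, t, o)
               for o in other for t in own)
    ]


print("a1:", dominant_strategies(EXAMPLE_4, 0))
print("a2:", dominant_strategies(EXAMPLE_4, 1))
```

Both agents have a dominant strategy, so (s1,1, s2,2) with outcome (4,5) is a dominant strategy equilibrium and therefore also a Nash equilibrium.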

40 Just to play with: new roads
- There are 6 cars going from A to D each day.
- (A,B) and (C,D) are highways: time(c) = 5 + 2c, where c is the number of cars
- (B,D) and (A,C) are local roads: time(c) = 20 + c
(Figure: road map with nodes A, B, C and D)
What will happen when a new highway is made between B and C?
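One way to explore the puzzle is to enumerate every way the 6 cars can split over the routes and keep the splits where no single driver can shorten his trip by switching. The sketch makes two assumptions not stated on the slide: the new road is driven from B to C, and it uses the same highway formula time(c) = 5 + 2c. Route and function names are illustrative.

```python
from itertools import product


def highway(cars):
    return 5 + 2 * cars


def local(cars):
    return 20 + cars


# Assumption: the new road BC is a highway driven from B to C.
ROAD_TIME = {"AB": highway, "CD": highway, "BD": local, "AC": local, "BC": highway}
ROUTES_BEFORE = {"A-B-D": ["AB", "BD"], "A-C-D": ["AC", "CD"]}
ROUTES_AFTER = {**ROUTES_BEFORE, "A-B-C-D": ["AB", "BC", "CD"]}


def travel_times(routes, counts):
    """Travel time of every route, given how many cars take each route."""
    load = {}
    for name, cars in counts.items():
        for road in routes[name]:
            load[road] = load.get(road, 0) + cars
    return {name: sum(ROAD_TIME[r](load.get(r, 0)) for r in roads)
            for name, roads in routes.items()}


def is_equilibrium(routes, counts):
    """True if no driver gets a strictly shorter trip by switching route alone."""
    now = travel_times(routes, counts)
    for src in routes:
        if counts[src] == 0:
            continue
        for dst in routes:
            if dst == src:
                continue
            moved = dict(counts)
            moved[src] -= 1
            moved[dst] += 1
            if travel_times(routes, moved)[dst] < now[src]:
                return False
    return True


def equilibria(routes, cars=6):
    """All equilibrium splits of the cars, each with its route travel times."""
    names = list(routes)
    found = []
    for split in product(range(cars + 1), repeat=len(names)):
        counts = dict(zip(names, split))
        if sum(split) == cars and is_equilibrium(routes, counts):
            found.append((counts, travel_times(routes, counts)))
    return found


print("before:", equilibria(ROUTES_BEFORE))
print("after: ", equilibria(ROUTES_AFTER))
```

Under these assumptions every driver needs 34 minutes before the new highway and 35 minutes after it, although each driver is acting rationally. This effect is known as Braess's paradox.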

