History-Dependent Graphical Multiagent Models Quang Duong Michael P. Wellman Satinder Singh Computer Science and Engineering University of Michigan, USA.


1 History-Dependent Graphical Multiagent Models
Quang Duong, Michael P. Wellman, Satinder Singh (Computer Science and Engineering, University of Michigan, USA)
Yevgeniy Vorobeychik (Computer and Information Sciences, University of Pennsylvania, USA)

2 Modeling Dynamic Multiagent Behavior
Design a representation that:
–expresses a joint probability distribution over agent actions over time
–supports inference (e.g., prediction)
–exploits locality of interaction
Our solution: history-dependent graphical multiagent models (hGMMs)

3 Example
Consensus voting [Kearns et al. '09], shown from agent 1's perspective.
[Figure: a 6-agent interaction graph and a reward table listing each agent's payoff for blue consensus, red consensus, or no consensus, with a deadline at t = 10s]

4 Graphical Representations
Exploit locality in agent interactions:
–MAIDs [Koller & Milch '01], NIDs [Gal & Pfeffer '08], action-graph games [Jiang et al. '08]
–Graphical games [Kearns et al. '01] and Markov random fields for graphical games [Daskalakis & Papadimitriou '06]

5 Graphical Multiagent Models (GMMs) [Duong, Wellman, and Singh UAI-08]
–Nodes: agents
–Edges: dependencies between agents
–Neighborhood N_i includes i and its neighbors
GMMs accommodate multiple sources of belief about agent behavior for static (one-shot) scenarios.
The joint probability distribution of the system's actions factors over neighborhoods: Pr(a) = (1/Z) Π_i π_i(a_N_i), where π_i is the potential of neighborhood N_i's joint actions and Z is the normalization constant.
[Figure: example 6-agent interaction graph]
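To make the neighborhood factorization concrete, here is a minimal sketch (not taken from the slides) of a static GMM over a hypothetical 3-agent chain graph. The agreement-rewarding potential, the graph, and all names are assumptions chosen for illustration:

```python
import itertools
import math

# Hypothetical 3-agent chain graph: neighborhood N_i contains i and its neighbors.
neighborhoods = {0: (0, 1), 1: (0, 1, 2), 2: (1, 2)}

def potential(local_actions):
    # Illustrative potential: weight a neighborhood higher the more it agrees.
    agreements = sum(a == local_actions[0] for a in local_actions)
    return math.exp(agreements)

def unnormalized(actions):
    # Product of neighborhood potentials pi_i(a_N_i).
    return math.prod(potential(tuple(actions[j] for j in members))
                     for members in neighborhoods.values())

# Normalize over the full joint action space (binary actions here).
space = list(itertools.product([0, 1], repeat=3))
Z = sum(unnormalized(a) for a in space)
probs = {a: unnormalized(a) / Z for a in space}
```

With this potential, unanimous joint actions receive the highest probability, which is the kind of correlation a product of independent per-agent distributions cannot express as directly.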

6 Contribution
Extend the static GMM to model dynamic joint behavior by conditioning on local history.

7 History-Dependent GMM (hGMM)
Extends the static GMM by conditioning joint agent behavior on an abstracted history of actions; the model directly captures joint behavior using limited action history.
The joint probability distribution of the system's actions at time t: Pr(a^t | H^t) = (1/Z(H^t)) Π_i π_i(a^t_N_i | H^t_N_i), where π_i is the potential of neighborhood N_i's joint actions at t, H^t is the abstracted history, H^t_N_i is its neighborhood-relevant portion, and Z(H^t) is the normalization constant.

8 Joint vs. Individual Behavior Models
Autonomous agents' behaviors are independent given complete history: agent i's actions depend on past observations, as specified by the strategy function σ_i(H^t).
–Individual behavior models (IBMMs): conditional independence of agent behavior given complete history, Pr(a^t | H^t) = Π_i σ_i(H^t)
History is often abstracted or summarized (limited horizon h, frequency function f, etc.), which induces correlations in observed behavior.
–Joint behavior models (hGMMs): no independence assumption
[Figure: three agents choosing actions via individual strategy functions σ_1(H^t_1), σ_2(H^t_2), σ_3(H^t_3)]
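The IBMM factorization above can be sketched in a few lines: the joint action probability is just the product of each agent's strategy function applied to the shared history. The strategy functions below (majority-following with a fixed bias) are invented for illustration:

```python
def ibmm_joint_prob(actions, history, sigmas):
    # Pr(a | H) = product over i of sigma_i(H)[a_i]: conditional independence
    # of agents' actions given the (complete) history.
    p = 1.0
    for i, a in enumerate(actions):
        p *= sigmas[i](history)[a]  # sigma_i: history -> distribution over actions
    return p

def make_sigma(bias=0.7):
    # Illustrative strategy: mix toward the majority action in a binary-action history.
    def sigma(history):
        ones = sum(history)
        if 2 * ones >= len(history):
            return {1: bias, 0: 1 - bias}
        return {1: 1 - bias, 0: bias}
    return sigma

sigmas = [make_sigma(), make_sigma()]
p = ibmm_joint_prob((1, 1), [1, 1, 0], sigmas)  # 0.7 * 0.7 = 0.49
```

Once the history is abstracted, this product form can no longer represent the correlations the abstraction induces, which is the opening for the joint (hGMM) model.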

9 Voting Consensus Simulation
Simulation (treated as the true model): smooth fictitious play [Camerer and Ho '99]
–agents respond probabilistically in proportion to expected rewards (given the reward function and beliefs about others' behavior)
Note:
–this generative model is an individual behavior model
–given abstracted history, joint behavior models may better capture behavior even when it is generated by an individual behavior model
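A rough sketch of a smooth fictitious play responder in the spirit of the simulation: beliefs are empirical frequencies of others' past actions, and the response is probabilistic in the expected rewards. The logit (softmax) form, the temperature parameter, and the consensus-flavored reward are assumptions for illustration, not details taken from the slides:

```python
import math

def smooth_fp_response(neighbor_counts, reward, temperature=1.0):
    # Empirical beliefs about others' actions from observed counts.
    total = sum(neighbor_counts.values())
    beliefs = {a: c / total for a, c in neighbor_counts.items()}
    # Expected reward of each own action against those beliefs.
    exp_reward = {own: sum(p * reward(own, other) for other, p in beliefs.items())
                  for own in (0, 1)}
    # Smoothed (logit) response: probabilities proportional to exp(reward / T).
    weights = {a: math.exp(r / temperature) for a, r in exp_reward.items()}
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}

# Illustrative consensus reward: 1 if the two actions match, 0 otherwise.
dist = smooth_fp_response({0: 1, 1: 3}, lambda own, other: float(own == other))
```

Here the agent has seen action 1 three times and action 0 once, so it leans toward 1 without committing to it deterministically.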

10 Voting Consensus Models
Individual Behavior Multiagent Model (IBMM): each agent's action probability combines the frequency with which action a_i was previously chosen by each of i's neighbors and the reward for a_i, regardless of neighbors' actions.
Joint Behavior Multiagent Model (hGMM): each neighborhood potential combines the expected reward for a_N_i, discounted by the number of dissenting neighbors, and the frequency with which a_N_i was previously chosen by neighborhood N_i, with a normalization term.

11 Model Learning and Evaluation
Given a sequence of joint actions over m time periods, X = {a^0, …, a^m}, the log likelihood induced by model M is L_M(X; θ), where θ denotes the model's parameters.
Potential function learning:
–assumes a known graphical structure
–employs gradient descent
Evaluation: compute L_M(X; θ) to evaluate M.
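A minimal sketch of this learning/evaluation loop: L_M(X; θ) as a sum of conditional log probabilities over the sequence, with a finite-difference gradient-ascent step standing in for the slide's gradient-descent procedure. The toy conditional model (a single Bernoulli parameter that ignores history) is invented purely so the sketch runs end to end:

```python
import math

def log_likelihood(sequence, cond_prob, theta):
    # L_M(X; theta) = sum over t of log Pr(a^t | H^t; theta),
    # where H^t is the history a^0, ..., a^(t-1).
    return sum(math.log(cond_prob(a, sequence[:t], theta))
               for t, a in enumerate(sequence))

def gradient_step(sequence, cond_prob, theta, lr=0.1, eps=1e-5):
    # One finite-difference gradient-ascent step on the log likelihood.
    grad = []
    for k in range(len(theta)):
        bumped = list(theta)
        bumped[k] += eps
        grad.append((log_likelihood(sequence, cond_prob, bumped)
                     - log_likelihood(sequence, cond_prob, theta)) / eps)
    return [th + lr * g for th, g in zip(theta, grad)]

# Toy conditional model (invented): Bernoulli(sigmoid(theta[0])), history ignored.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cond(a, history, theta):
    return sigmoid(theta[0]) if a == 1 else 1.0 - sigmoid(theta[0])

data = [1, 1, 1, 0]
theta = gradient_step(data, cond, [0.0])
```

Since the data favors action 1, a single step moves θ in the direction that raises the likelihood; in the actual models, θ parameterizes the neighborhood potentials rather than a single Bernoulli.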

12 Experiments
–10 agents
–i.i.d. payoffs for red-consensus and blue-consensus outcomes (between 0 and 1); 0 otherwise
–maximum node degree d
–runs last T = 100 periods, or end earlier when the vote converges
–20 smooth fictitious play game runs generated for each game configuration (10 for training, 10 for testing)

13 Results
Evaluation metric: (log likelihood for hGMM) / (log likelihood for IBMM).
1. hGMMs outperform IBMMs in predicting outcomes for shorter history lengths.
2. Shorter history horizon → more abstraction of history → more induced behavior correlation → hGMM > IBMM.
3. hGMMs outperform IBMMs in predicting outcomes across different values of d.
[Table: the ratio for history lengths h = 1–8 and maximum degrees d = 3 and d = 6; green cells mark hGMM > IBMM, yellow cells mark hGMM < IBMM]

14 Asynchronous Belief Updates
hGMMs outperform IBMMs by a wider margin for longer summarization intervals v, which induce more behavior correlation.

15 Direct Sampling
Compute the joint distribution of actions as the empirical distribution of the training data.
Evaluation metric: (log likelihood for hGMM) / (log likelihood for direct sampling).
Direct sampling is computationally more expensive but less powerful.
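The direct-sampling baseline amounts to counting: a short sketch (the data below is made up) of estimating the joint action distribution as empirical frequencies of observed joint actions:

```python
from collections import Counter

def empirical_distribution(joint_actions):
    # Direct sampling: Pr(a) estimated as the empirical frequency of each
    # joint action observed in the training data.
    counts = Counter(joint_actions)
    n = len(joint_actions)
    return {a: c / n for a, c in counts.items()}

dist = empirical_distribution([(0, 0), (1, 1), (1, 1), (0, 1)])
```

Note that any joint action unseen in training gets probability zero, and the joint action space grows exponentially in the number of agents, so the estimate needs far more data than a factored model to cover the same space.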

16 Conclusions
–hGMMs support efficient and effective inference about system dynamics, using abstracted history, for scenarios exhibiting locality.
–hGMMs provide better predictions of dynamic behavior than IBMMs and direct sampling of fictitious play runs.
–Approximation does not degrade performance.
Future work:
–more domain applications: authentic voting experiment results, other scenarios
–a (fully) dynamic GMM that allows reasoning about unobserved past states

