History-Dependent Graphical Multiagent Models
Quang Duong, Michael P. Wellman, Satinder Singh
Computer Science and Engineering, University of Michigan, USA

Presentation transcript:

History-Dependent Graphical Multiagent Models
Quang Duong, Michael P. Wellman, Satinder Singh (Computer Science and Engineering, University of Michigan, USA)
Yevgeniy Vorobeychik (Computer and Information Sciences, University of Pennsylvania, USA)

Modeling Dynamic Multiagent Behavior
Design a representation that:
–expresses a joint probability distribution over agent actions over time
–supports inference (e.g., prediction)
–exploits locality of interaction
Our solution: history-dependent graphical multiagent models (hGMMs)

Example: Consensus Voting [Kearns et al. '09]
[Figure: a voting run shown from agent 1's perspective; the plot tracks each agent's state over time (t = 10 s shown), converging to blue consensus, red consensus, or neither.]

Graphical Representations
Exploit locality in agent interactions:
–MAIDs [Koller & Milch '01], NIDs [Gal & Pfeffer '08], action-graph games [Jiang et al. '08]
–Graphical games [Kearns et al. '01] and Markov random fields for graphical games [Daskalakis & Papadimitriou '06]

Graphical Multiagent Models (GMMs) [Duong, Wellman, and Singh UAI-08]
–Nodes: agents
–Edges: dependencies between agents
–Neighborhood N_i includes i and its neighbors
GMMs accommodate multiple sources of belief about agent behavior for static (one-shot) scenarios.
The joint probability distribution of the system's actions is a normalized product of neighborhood potentials:
Pr(a) = (1/Z) Π_i π_i(a_{N_i})
where π_i is the potential over neighborhood N_i's joint actions and Z is the normalization constant.
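The normalized product of neighborhood potentials can be sketched in a few lines. This is a minimal illustration with a hypothetical agreement-favoring potential and brute-force enumeration over binary actions, not the paper's model or experiment setup:

```python
import itertools

# A minimal sketch of a static GMM (hypothetical potential): the joint action
# distribution is the normalized product of per-neighborhood potentials,
# Pr(a) = (1/Z) * prod_i pi_i(a_{N_i}).

def gmm_distribution(n_agents, neighborhoods, potential):
    """Enumerate all binary joint actions; normalize the product of potentials."""
    joint_actions = list(itertools.product([0, 1], repeat=n_agents))
    weights = []
    for a in joint_actions:
        w = 1.0
        for nbhd in neighborhoods:
            w *= potential(tuple(a[j] for j in nbhd))
        weights.append(w)
    z = sum(weights)  # normalization constant Z
    return {a: w / z for a, w in zip(joint_actions, weights)}

# Example: 3 agents on a line; the potential favors agreement in a neighborhood.
neighborhoods = [(0, 1), (0, 1, 2), (1, 2)]  # N_i includes i and its neighbors
agree = lambda acts: 2.0 if len(set(acts)) == 1 else 1.0
dist = gmm_distribution(3, neighborhoods, agree)
```

Brute-force enumeration is exponential in the number of agents; it is only meant to make the definition concrete.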

Contribution
Extend the static GMM to model dynamic joint behaviors by conditioning on local history.

History-Dependent GMM (hGMM)
Extend the static GMM by conditioning joint agent behavior on an abstracted history of actions; the model directly captures joint behavior using limited action history.
The joint probability distribution of the system's actions at time t conditions each neighborhood potential on the neighborhood-relevant abstracted history:
Pr(a^t | H^t) = (1/Z) Π_i π_i(a^t_{N_i} | H^t_{N_i})
where H^t is the abstracted history and Z is the normalization constant.
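The conditional form can be sketched the same way as the static case, with each potential scored against an abstracted history. The frequency-based abstraction and the particular potential below are hypothetical stand-ins for the paper's definitions:

```python
import itertools
import math
from collections import Counter

# A minimal hGMM sketch (hypothetical potential form): Pr(a^t | H^t) is a
# normalized product over neighborhoods, each scored against an abstracted
# history -- here, action frequencies over the last h periods (a limited
# horizon plus a frequency abstraction).

def neighborhood_freq(history, nbhd, h):
    """Frequency of each action among nbhd's members over the last h periods."""
    recent = [step[j] for step in history[-h:] for j in nbhd]
    counts = Counter(recent)
    total = len(recent) or 1
    return {act: c / total for act, c in counts.items()}

def hgmm_conditional(n_agents, neighborhoods, history, h):
    """Distribution over all binary joint actions given the abstracted history."""
    joint_actions = list(itertools.product([0, 1], repeat=n_agents))
    weights = []
    for a in joint_actions:
        score = 0.0
        for nbhd in neighborhoods:
            freqs = neighborhood_freq(history, nbhd, h)
            # hypothetical potential: reward joint actions that match the
            # frequencies observed in the neighborhood's recent history
            score += sum(freqs.get(a[j], 0.0) for j in nbhd)
        weights.append(math.exp(score))
    z = sum(weights)  # normalization constant Z
    return {a: w / z for a, w in zip(joint_actions, weights)}
```

The key structural point survives the simplification: the potential sees only the abstraction f(H^t_{N_i}), never the complete history.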

Joint vs. Individual Behavior Models
Autonomous agents' behaviors are independent given complete history: agent i's actions depend on past observations, as specified by a strategy function σ_i(H^t).
–Individual behavior models (IBMMs): conditional independence of agent behavior given complete history: Pr(a^t | H^t) = Π_i σ_i(H^t)
History is often abstracted or summarized (limited horizon h, frequency function f, etc.), which induces correlations in observed behavior.
–Joint behavior models (hGMMs) make no independence assumption.
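The IBMM factorization is just a product of per-agent strategy probabilities. A sketch, with the strategy functions left as caller-supplied placeholders:

```python
# An IBMM in one function: given the complete history, agents act
# independently, so Pr(a^t | H^t) = prod_i sigma_i(H^t)[a_i]. Each strategy
# sigma_i maps a history to a distribution over that agent's actions.

def ibmm_joint_prob(action_profile, strategies, history):
    """Product of each agent's strategy probability for its own action."""
    p = 1.0
    for sigma_i, a_i in zip(strategies, action_profile):
        p *= sigma_i(history)[a_i]
    return p
```

Once the history fed to each σ_i is abstracted rather than complete, this product form can no longer represent the correlations the abstraction induces, which is exactly the gap the hGMM fills.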

Voting Consensus Simulation
Simulation (treated as the true model): smooth fictitious play [Camerer and Ho '99]
–agents respond probabilistically in proportion to expected rewards (given the reward function and beliefs about others' behavior)
Note:
–this generative model is an individual behavior model
–given abstracted history, joint behavior models may better capture behavior even when it is generated by an individual behavior model
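The probabilistic response can be sketched as a logit (softmax) choice over expected rewards. The consensus payoff structure below is a simplification for illustration, not the simulator's exact reward function:

```python
import math

# A sketch of the smooth fictitious play response: an agent forms beliefs from
# the empirical frequency of neighbors' past votes and chooses an action with
# probability proportional to exp(expected reward / temperature).

def expected_utilities(p_neighbors_blue, r_blue, r_red):
    """Hypothetical consensus payoffs: an action pays off only if neighbors agree."""
    return r_blue * p_neighbors_blue, r_red * (1.0 - p_neighbors_blue)

def logit_choice_prob(eu_blue, eu_red, temperature=1.0):
    """Probability of voting blue under a logit (softmax) response."""
    wb = math.exp(eu_blue / temperature)
    wr = math.exp(eu_red / temperature)
    return wb / (wb + wr)
```

The temperature parameter controls how sharply the agent exploits reward differences; in the limit of low temperature this approaches best response.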

Voting Consensus Models
Individual Behavior Multiagent Model (IBMM): each agent's normalized potential combines the reward for action a_i (regardless of neighbors' actions) with the frequency with which a_i was previously chosen by each of i's neighbors.
Joint Behavior Multiagent Model (hGMM): each neighborhood's potential combines the expected reward for a_{N_i}, discounted by the number of dissenting neighbors, with the frequency with which a_{N_i} was previously chosen by neighborhood N_i.

Model Learning and Evaluation
Given a sequence of joint actions over m time periods, X = {a^0, …, a^m}, the log likelihood induced by model M is L_M(X; θ), where θ denotes the model's parameters.
Potential-function learning:
–assumes a known graphical structure
–employs gradient descent
Evaluation: compute L_M(X; θ) to evaluate M.
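The fitting loop can be illustrated with a toy one-parameter model. The parameterization and data here are hypothetical and far simpler than the paper's neighborhood potentials, but the gradient-ascent-on-log-likelihood idea is the same:

```python
import math

# Toy sketch of potential-function learning: a single parameter theta scores
# action 1 against action 0 (softmax over two actions), fit by gradient
# ascent on the data log-likelihood sum_t [theta * a_t - log(e^theta + 1)].

def fit(data, lr=0.1, steps=200):
    """Gradient ascent on the log-likelihood of binary action data."""
    theta = 0.0
    for _ in range(steps):
        p1 = math.exp(theta) / (math.exp(theta) + 1.0)  # model prob. of action 1
        grad = sum(data) - len(data) * p1               # d/dtheta log-likelihood
        theta += lr * grad
    return theta
```

At convergence the model's probability of action 1 matches the empirical frequency in the data, which is the analogue of the moment-matching condition for the real potential parameters.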

Experiments
–10 agents
–i.i.d. payoffs for red and blue consensus outcomes (drawn between 0 and 1); payoff 0 otherwise
–maximum node degree d
–T = 100 periods, or until the vote converges
–20 smooth fictitious play game runs generated for each game configuration (10 for training, 10 for testing)

Results
Evaluation metric: log likelihood for hGMM / log likelihood for IBMM, tabulated by maximum degree d and history length h (green cells: hGMM > IBMM; yellow cells: hGMM < IBMM).
1. hGMMs outperform IBMMs in predicting outcomes for shorter history lengths.
2. A shorter history horizon means more abstraction of history, which induces more behavior correlation, so hGMM > IBMM.
3. hGMMs outperform IBMMs in predicting outcomes across different values of d.

Asynchronous Belief Updates
hGMMs outperform IBMMs by a wider margin for longer summarization intervals v, which induce more behavior correlation.

Direct Sampling
Compute the joint distribution of actions as the empirical distribution of the training data.
Evaluation: log likelihood for hGMM / log likelihood for direct sampling.
Direct sampling is computationally more expensive but less powerful.
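The baseline itself is a one-liner over the observed runs, sketched here for concreteness:

```python
from collections import Counter

# The direct-sampling baseline: estimate the joint action distribution as the
# empirical frequencies of joint actions seen in the training data.

def empirical_distribution(observed_joint_actions):
    """Map each observed joint action to its relative frequency."""
    counts = Counter(observed_joint_actions)
    n = len(observed_joint_actions)
    return {a: c / n for a, c in counts.items()}
```

The weakness is visible immediately: any joint action not seen in training gets probability zero, so the estimator needs exponentially many samples to cover the joint action space, where the hGMM generalizes through its local potentials.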

Conclusions
–hGMMs support efficient and effective inference about system dynamics, using abstracted history, for scenarios exhibiting locality
–hGMMs provide better predictions of dynamic behavior than IBMMs and direct fictitious play sampling
–approximation does not degrade performance
Future work:
–more domain applications: real voting experiment results, other scenarios
–a (fully) dynamic GMM that allows reasoning about unobserved past states