CASE − Cognitive Agents for Social Environments

Presentation transcript:

CASE − Cognitive Agents for Social Environments Yu Zhang Trinity University | Laboratory for Distributed Intelligent Agent Systems

Outline: Introduction · CASE — Agent-Level Solution · CASE — Society-Level Solution · Experiment · A Case Study · Conclusion and Future Work

Introduction: Multi-Agent Systems · MAS for Social Simulation · Research Goal · Existing Approaches · Our Approach

Multi-Agent Systems
[Diagram: a society of agents, each with its own knowledge base (KB); interactions are high-frequency and decentralized.]

Simulating Social Environments
1998: Journal of Artificial Societies and Social Simulation first published.
1997: First international conference on computer simulation and the social sciences. The hope is that computer simulation will achieve a disciplinary synthesis among the social sciences.
1996: The Santa Fe Institute becomes well known for developing ideas about complexity and studying them using computer simulations of real-world phenomena.
1995: Series of workshops held in Italy and the USA; the field becomes more theoretically and methodologically grounded.
1992: First 'Simulating Societies' workshop held.

Research Goal: understanding how the decentralized interactions of agents can generate social conventions.

Current Approaches
Society level: focuses on static social structures.
Agent level: focuses on self-interested agents.

Our Approach: Cognitive Agents for Social Environments
[Diagram: agents in a network embedded in an environment; society level: network and social convention; agent level: bounded rationality, perception, and action.]

Related Work
[Chart plotting systems by agent complexity along a top-down/bottom-up spectrum: COGENT, SOAR, ACT-R, CLARION, Sugarscape, and Schelling's segregation model. CASE targets the meso level: agent behavior realistic but not too computationally complex.]

Outline: Background and Objective · CASE — Agent-Level Solution · CASE — Society-Level Solution · Experiment · A Case Study · Conclusion and Future Work

Our Approach: Cognitive Agents for Social Environments
[The architecture diagram again, now introducing the agent-level solution: bounded rationality.]

Rationality vs. Bounded Rationality
Rationality means that agents calculate a utility value for the outcome of every action.
Bounded rationality means that agents use intuition and heuristics to determine whether one action is better than another.

[Photo: Daniel Kahneman. Courtesy Google Images.]

Two-Phase Decision Model
[Diagram: Phase I (Editing) builds the evaluation criteria: framing, anchoring (selective attention), and accessibility (state similarity). Phase II (Evaluation) picks a decision mode between two modes of function, intuition and deliberation, and outputs an action.]

Phase I - Editing
Framing: decide evaluation criteria based on one's attitude toward potential risk and reward.
Anchoring: build selective attention on information. Each piece of information i ∈ I has a salience in the context c of the current decision; the anchored information is I* = {i | salience(i) > threshold}.
Accessibility: determine state similarity (s_t ~ s_m) using only I*.
(Notation: i is a piece of information, c the context of the current decision, I the set of all information, ~ the accessibility relation, s_t the current state, and s_m a memory state.)
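To make the editing phase concrete, here is a minimal Python sketch. The key-value state representation, the precomputed salience scores, and the names used are illustrative assumptions, not taken from the CASE paper:

```python
# Sketch of Phase I (editing), under assumed data structures:
# states are dicts, and salience scores are precomputed per information item.

def anchor(salience, threshold):
    """Anchoring: keep only salient information, I* = {i | salience(i) > threshold}."""
    return {i for i, s in salience.items() if s > threshold}

def accessible(state_t, state_m, anchored):
    """Accessibility: s_t ~ s_m iff the states agree on every anchored item."""
    return all(state_t.get(i) == state_m.get(i) for i in anchored)

# Example: the two states differ only on un-anchored items, so they are similar.
salience = {"offer": 0.9, "round": 0.2, "opponent_id": 0.1}
I_star = anchor(salience, threshold=0.5)          # -> {"offer"}
s_t = {"offer": 4, "round": 7, "opponent_id": 3}  # current state
s_m = {"offer": 4, "round": 2, "opponent_id": 9}  # a memory state
print(accessible(s_t, s_m, I_star))               # True
```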

Phase II - Evaluation
Intuition: if s_t ~ s_m, the optimal decision policies π*(s_t) and π*(s_m) should be close too.
Deliberation: optimize π*(s_t) directly, maximizing the expected value function V under a given policy π with time discount factor γ.
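A minimal sketch of how the two modes might fit together, reusing accessible() from the editing-phase sketch above; the memory layout and the deliberate() stub are assumptions made here for illustration:

```python
# Sketch of Phase II (evaluation): intuition reuses a remembered decision
# for a similar state; deliberation optimizes a fresh policy for s_t.

def evaluate(state_t, memory, anchored, deliberate):
    # Intuition: if s_t ~ s_m, pi*(s_t) should be close to pi*(s_m),
    # so reuse the action remembered for the similar state.
    for state_m, action_m in memory:
        if accessible(state_t, state_m, anchored):
            return action_m
    # Deliberation: no accessible memory state, so optimize pi*(s_t)
    # directly (e.g., by value iteration with discount factor gamma)
    # and remember the result for future intuitive reuse.
    action = deliberate(state_t)
    memory.append((state_t, action))
    return action
```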

Outline: Background and Objective · CASE — Agent-Level Solution · CASE — Society-Level Solution · Experiment · A Case Study · Conclusion and Future Work

Our Approach: Cognitive Agents for Social Environments
[The architecture diagram again, now introducing the society-level solution: the network and social conventions.]

Social Convention
A social law is a restriction on the set of actions available to agents.
A social convention is a social law that restricts the agent's behavior to one particular action.

Hard-Wired Design vs. Emergent Design
Hard-wired design means that social conventions are given to agents offline, before the simulation.
Emergent design is a run-time solution in which agents decide the most suitable conventions given the current state of the system.

Generating Social Conventions: Existing Rules
Highest Cumulative Reward (HCR): an agent switches to a new action if the total payoff from that action is higher than the payoff obtained from the currently chosen action. It does not rely on global statistics about the system, and convergence is guaranteed in a 2-person 2-choice symmetric coordination game.
Simple Majority: an agent switches to a new action if it has observed more instances of it in other agents than of its present action. It relies on global statistics about the system, and its convergence has not been proved.
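A sketch of both rules in Python; the bookkeeping (per-action payoff totals, a list of observed actions) is an assumed representation, not the paper's:

```python
from collections import Counter

def hcr_update(current, cumulative_payoff):
    """Highest Cumulative Reward: switch to the action whose total payoff
    so far exceeds that of the currently chosen action (local statistics only)."""
    best = max(cumulative_payoff, key=cumulative_payoff.get)
    return best if cumulative_payoff[best] > cumulative_payoff[current] else current

def simple_majority_update(current, observed_actions):
    """Simple Majority: switch if some action has been observed in other
    agents more often than the present action (needs global statistics)."""
    counts = Counter(observed_actions)
    best, n = counts.most_common(1)[0]
    return best if n > counts[current] else current

print(hcr_update("B", {"A": 12.0, "B": 7.5}))        # -> "A"
print(simple_majority_update("B", ["A", "A", "B"]))  # -> "A"
```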

Generating Social Conventions: Our Rule
Generalized Simple Majority (GSM)
Definition. Assume an agent has K neighbors and that K_A of them are in state A. If the agent is in state B, it will change to state A with a probability that increases with K_A.
Theorem. In the limit, the agent changes to state A exactly when more than K/2 of its neighbors are in state A, in a 2-person 2-choice symmetric coordination game.
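The slide's exact probability expression is not reproduced here, so the sketch below assumes a logistic (Glauber-style) form with a sharpness parameter beta. It does have the limit behavior the theorem states: as beta → ∞ the rule becomes a deterministic majority vote at K_A > K/2.

```python
import math
import random

def gsm_switch_probability(k_a, k, beta):
    """Assumed logistic form: the probability of switching B -> A grows with
    the number of neighbors in A; it tends to a step at k_a > k/2 as beta -> inf."""
    return 1.0 / (1.0 + math.exp(-beta * (2 * k_a - k)))

def gsm_update(state, k_a, k, beta=10.0):
    """An agent in state B flips to A with the probability above."""
    if state == "B" and random.random() < gsm_switch_probability(k_a, k, beta):
        return "A"
    return state
```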

Outline: Background and Objective · CASE — Agent-Level Solution · CASE — Society-Level Solution · Experiment (Evaluating the Agent-Level Solution · Evaluating the Society-Level Solution) · A Case Study · Conclusion and Future Work

Evaluating the Agent-Level Solution: The Ultimatum Game and the Bargaining Game
Ultimatum game: agent a proposes "I'll take x, you get 10 − x". Agent b either accepts (a gets x, b gets 10 − x) or rejects (both get 0).
Bargaining game: agent a proposes "I'll take x, you get 10 − x". Agent b accepts, rejects (both get 0), or negotiates with a counter-proposal "I'll take y, you get 10 − y", and so on; if the agents run out of steps, both get 0.
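For concreteness, a minimal sketch of one ultimatum round over a pot of 10; the fixed-threshold responder is a placeholder, not a CASE agent:

```python
def ultimatum_round(x, accepts):
    """Agent a proposes 'I take x, you get 10 - x'; agent b accepts or rejects.
    Returns the payoffs (a, b): (x, 10 - x) on acceptance, (0, 0) on rejection."""
    return (x, 10 - x) if accepts(x) else (0, 0)

# Example: a responder that accepts whenever its share is at least 3.
responder = lambda x: 10 - x >= 3
print(ultimatum_round(7, responder))  # (7, 3)
print(ultimatum_round(9, responder))  # (0, 0)
```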

Phase I - Editing (in the games)
Framing: 11 states: $0, $1, …, $10.
Anchoring: use 500 iterations of Q-learning to develop the anchored states.
Accessibility: s_t ~ s_m if the two states match on the anchored information I*.

Q-Learning
A well-studied reinforcement learning algorithm: it converges to the optimal decision policy, works in unknown environments, and estimates long-term (expected discounted) reward from experience. The update moves the old value toward the observed reward plus the discounted maximum future value:
Q(s,a) ← Q(s,a) + α [r + γ max_a′ Q(s′,a′) − Q(s,a)]
where α is the learning rate and γ the discount factor.
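The rule annotated on the slide is the standard tabular Q-learning step; a minimal sketch (the table layout is an implementation choice made here):

```python
from collections import defaultdict

Q = defaultdict(float)  # Q-table: (state, action) -> estimated value

def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s,a) <- old value + alpha * (r + gamma * max future value - old value)."""
    max_future = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * max_future - Q[(s, a)])

# Example: one update after taking action "accept" in state 4.
q_update(s=4, a="accept", r=6.0, s_next=5, actions=["accept", "reject"])
print(Q[(4, "accept")])  # 0.6
```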

Phase II - Evaluation: 1000 iterations of play with intuitive or deliberative decisions.

Results of the Ultimatum Game
[Charts: number of times each split value was accepted, comparing human players, rational players, and CASE agents (two-phase, intuition only, deliberation only).]

Results of the Bargaining Game
[Charts: split value and negotiation size over iterations, for human players and for CASE agents.]
Results of the bargaining game by human players are reproduced with kind permission of Springer Science.

Outline: Background and Objective · CASE — Agent-Level Solution · CASE — Society-Level Solution · Experiment (Evaluating the Agent-Level Solution · Evaluating the Society-Level Solution) · A Case Study · Conclusion and Future Work

Evaluating the Society-Level Solution
A 2-person 2-choice symmetric coordination game with payoff matrix:
        A    B
    A   1   −1
    B  −1    1
Two optimal decisions: (A,A) and (B,B).

Evaluating the Society-Level Solution
Intuitive and deliberative decisions.
N agents (N ≥ 2) with random initial state, A or B, each with probability 50%.
Agents connected by classic networks or complex networks.
Two rules evaluated: highest cumulative reward (HCR) and generalized simple majority (GSM).
Performance measure: T90%, the time it takes for 90% of the agents to use the same convention.
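A compact sketch of the whole society-level experiment on a lattice ring, reusing the assumed logistic GSM rule from above; synchronous updates and beta = 10 are choices made here for illustration, not details from the slides:

```python
import math
import random

def t90_on_ring(n=100, k=6, beta=10.0, max_steps=100_000):
    """GSM dynamics on a lattice ring C_{N,K}; returns T90%, the first step
    at which at least 90% of the agents share the same convention."""
    states = [random.choice("AB") for _ in range(n)]  # random initial state, 50/50
    half = k // 2
    for t in range(max_steps):
        if max(states.count("A"), states.count("B")) >= 0.9 * n:
            return t
        new_states = []
        for i, s in enumerate(states):
            neighbors = [states[(i + d) % n] for d in range(-half, half + 1) if d != 0]
            other = "A" if s == "B" else "B"
            k_other = neighbors.count(other)
            # assumed logistic switch rule; step at k_other > k/2 as beta -> inf
            p = 1.0 / (1.0 + math.exp(-beta * (2 * k_other - k)))
            new_states.append(other if random.random() < p else s)
        states = new_states
    return max_steps  # did not converge within the step budget

print(t90_on_ring())
```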

Classic Networks
Complete network K_N: every node connected to every other node (example: N=8).
Lattice ring C_{N,K}: every node connected to its K nearest neighbors; local clustering (example: N=100, K=6).
Random network R_{N,P}: node pairs connected with equal probability P (example: N=100, P=5%).

Complex Networks
Small-world network W_{N,K,P}: start with a C_{N,K} graph and rewire every link at random with probability P; local clustering plus randomness (example: N=100, K=6, P=5%).
Scale-free network S_{N,K,γ}: the degree distribution is a power law, P(k) ~ k^−γ; large networks can self-organize into a scale-free state, independent of the agents (example: N=100, K=6, γ=2.5).
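For reference, all four topologies can be generated with networkx (a tooling choice made here, not something the slides specify); note that barabasi_albert_graph yields a power-law exponent near 3 rather than the γ = 2.5 used on the slides:

```python
import networkx as nx

N, K, P = 100, 6, 0.05
complete    = nx.complete_graph(8)                 # K_N with N = 8
lattice     = nx.watts_strogatz_graph(N, K, 0.0)   # C_{N,K}: ring lattice, no rewiring
random_net  = nx.gnp_random_graph(N, P)            # R_{N,P}: edges with equal probability P
small_world = nx.watts_strogatz_graph(N, K, P)     # W_{N,K,P}: ring with random rewiring
scale_free  = nx.barabasi_albert_graph(N, K // 2)  # power-law degrees, exponent ~ 3
```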

Evaluating Highest Cumulative Reward

Name  Network topology     Size ×10^3 (N)              Parameters
SN    Scale-free network   1, 2.5, 5, 7.5, 10, 25, 50  γ=2.5, <K>=12
CN    Lattice ring         0.1, 0.25, 0.5, 0.75, 1     K=12
KN    Complete network     —                           none needed
WN    Small-world network  1, 2.5, 5, 7.5, 10          P=0.1, <K>=12

Results: lattice ring T90% = O(N^2.5); small-world T90% = O(N^1.5); scale-free and complete T90% = O(N log N).

Evaluating Generalized Simple Majority

Name  Network topology     Size ×10^3 (N)                       Parameters
CN    Lattice ring         0.1, 0.25, 0.5, 0.75                 K=12
KN    Complete network     1, 2.5, 5, 7.5, 10, 25, 50, 75, 100  none needed
SN    Scale-free network   —                                    γ=2.5, <K>=12; γ=3.0, <K>=12
WN    Small-world network  1, 2.5, 5, 7.5, 10, 25, 50           P=0.1, <K>=12

Results: lattice ring T90% = O(N^2.5); small-world T90% = O(N^1.5); scale-free and complete T90% = O(N).

Evaluating HCR vs. GSM

Network topology     N     <K>  P
Small-world network  10^4  12   0.05, 0.09, …, 0.9

[Chart: results shown at P=0.09, on the spectrum between a lattice ring and a random network.]