Modeling the Process of Collaboration and Negotiation with Incomplete Information
Katia Sycara, Praveen Paruchuri, Nilanjan Chakraborty
Collaborators: Roie Zivan, Laurie Weingart, Geoff Gordon, Miro Dudik

MURI 14 Program Review -- September 10, 2009
(Diagram: program overview)
Theory Formation / Identify Cultural Factors: CUNY, Georgetown, CMU
Computational Models: CMU, USC
Virtual Humans: USC
Implementation: CMU
Surveys & Interviews: CUNY, CMU, U Mich, Georgetown
Cross-Cultural Interactions: U Pitt, CMU
Data Analysis: CUNY, Georgetown, U Pitt, CMU
Research products (after validation): Validated Theories, Models, Modeling Tools, Briefing Materials, Scenarios, Training Simulations
Common task / Subgroup task

Problem
Computational model of reasoning in Cooperation and Negotiation (C&N)
Capture the rich process of C&N
–Not just outcome
–Not just offer-counteroffer, but additional communications
Account for cultural, social factors
Rewards of other agents not known
Uncertain and dynamic environment

Contributions
Created an initial model from real human data. The model:
–Is applicable in a uniform way to both collaboration and negotiation
–Derives sequences of actions for an agent from real transcripts, as opposed to state-of-the-art work where action selection is constructed heuristically
–Adapts its beliefs during the course of the interaction
–Learns elements of the negotiation (e.g., the other party's type) as the interaction proceeds
–Produces optimal activity sequences that also take the other agents into account
–Has only incomplete information about others

POMDP: Partially Observable Markov Decision Process
Agent has initial beliefs
Agent takes an action
Gets an observation
Interprets the observation
Updates beliefs
Decides on an action
Repeats
Agent takes the optimal action considering the world/other agents
Elements: { States, Actions, Transitions, Rewards, Observations }
(Diagram: Agent ↔ The World (other agents), exchanging Actions and Observations)
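As a rough illustration of the observe-update-act loop above, here is a minimal sketch of a discrete POMDP belief update (a standard Bayes-filter step). The state space, matrices, and numbers are placeholders for illustration, not the model built in this work.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step of a discrete POMDP belief update.
    belief: current belief over states, shape (S,)
    T[a]:   transition matrix for action a, T[a][s, s'] = P(s' | s, a)
    O[a]:   observation matrix for action a, O[a][s', o] = P(o | s', a)"""
    predicted = belief @ T[action]                    # predict next-state distribution
    updated = predicted * O[action][:, observation]   # weight by observation likelihood
    return updated / updated.sum()                    # normalize

# Toy example: two hidden "other-player" types (cooperative, non-cooperative).
T = {0: np.eye(2)}                      # the other's type does not change over time
O = {0: np.array([[0.8, 0.2],           # cooperative type emits "agreeable" codes often
                  [0.3, 0.7]])}         # non-cooperative type emits them rarely
b = np.array([0.5, 0.5])                # initial uncertainty about the other's type
b = belief_update(b, action=0, observation=0, T=T, O=O)
print(b)                                # belief shifts toward the cooperative type
```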

Why POMDP-based modeling?
–Decentralized algorithm
–Incorporated in an agent that interacts with others
–Can represent communication (arguments, offers, preferences, etc.)
–Many conversational turns
–Learns, e.g., the model of the other player
–Adaptive best response
–Computationally efficient for realistic interactions
–Extendable to more than two agents
Natural way to represent cultural and social factors in C&N

Output of POMDP
The output is a policy matrix
Policy: optimal action to take, given current state (observations and other's model)
At run-time, the agent consults the matrix and takes the appropriate action
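A minimal sketch of the run-time lookup described above, assuming one common policy representation for POMDPs (a set of alpha-vectors, one per candidate action); this representation and the numbers are illustrative assumptions, not necessarily the form used in this work.

```python
import numpy as np

def select_action(belief, alpha_vectors, actions):
    """Pick the action whose alpha-vector maximizes expected value under the belief."""
    values = [float(belief @ alpha) for alpha in alpha_vectors]
    return actions[int(np.argmax(values))]

# Illustrative only: belief over two buyer types (coop, non-coop), three actions.
actions = ["Concede 1", "Concede 0", "Accept"]
alpha_vectors = [np.array([4.0, 1.0]),    # expected value of "Concede 1" per type
                 np.array([2.0, 3.0]),
                 np.array([5.0, 0.0])]
print(select_action(np.array([0.7, 0.3]), alpha_vectors, actions))   # -> "Accept"
```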

Simplified Example
Two agents negotiating
–Seller S (POMDP agent)
–Buyer B (other player)
Single-item negotiation
Initially, buyer at price 0 and seller at max = 10

Example: State Space
State composed of two parts:
–Seller type, Buyer type
–Negotiation status: current offers
Agent types: cooperative or non-cooperative
Negotiation modeled from the Seller's perspective
–Initially high uncertainty about the Buyer's type
The Seller's belief about the Buyer and the state of the negotiation are dynamic

Example: POMDP State
Agent type: cooperative vs. non-cooperative
–0 = cooperative, 1 = non-cooperative
–Discretized to {0, 0.5, 1}
Price discretized to the set {0, 1, ..., 9, 10}
Sample state: Me (Seller) type = Coop; You (Buyer) = Unknown; Negotiation status: (current offers)
State space = number of Buyer types × negotiation states = 3 × (11 × 11) = 363
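A quick sanity check of the state count above, sketched with an assumed (buyer type, seller offer, buyer offer) encoding in which only the Buyer's type is unknown:

```python
from itertools import product

buyer_types = [0.0, 0.5, 1.0]          # cooperative ... non-cooperative, discretized
prices = range(0, 11)                  # offers discretized to {0, 1, ..., 10}

# The Seller's own type is known, so states vary over Buyer type and the two offers.
states = list(product(buyer_types, prices, prices))
print(len(states))                     # 3 buyer types * 11 * 11 offer pairs = 363
```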

Example: Action & Transition
Action set: {Concede 2, Concede 1, Concede 0, Accept, Reject}
Transition: probability of ending in some state if the agent takes a particular action in the current state

(Diagram: a fragment of the negotiation transition structure from the Seller's perspective. Root state: Me = Coop, You = Unknown, my price = $10, your price = $0. Concede 2 / Concede 1 / Concede 0 actions lead to successor states in which the Buyer is Coop or Ncoop with offer pairs such as ($9, $0), ($9, $1), ($9, $2), ($8, $0) ... ($7, $2); further concessions eventually reach agreement states around ($6, $4), ($5, $5), ($4, $6).)
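To make the transition idea concrete, here is a small hypothetical sketch consistent with the tree above; the response probabilities and the rule for the Buyer's counter-move are invented placeholders, since the actual model estimates these quantities from data.

```python
def transitions(state, action):
    """Successor-state distribution for a Seller 'Concede k' action (illustrative only).
    state = (buyer_type, seller_price, buyer_price); probabilities are placeholders."""
    buyer_type, seller_price, buyer_price = state
    k = {"Concede 2": 2, "Concede 1": 1, "Concede 0": 0}[action]
    new_seller = max(seller_price - k, buyer_price)   # seller lowers own asking price by k
    # The Buyer's counter-move is uncertain; assume a cooperative buyer concedes
    # more readily than a non-cooperative one (placeholder numbers).
    p_concede = 0.7 if buyer_type == "Coop" else 0.3
    return [
        ((buyer_type, new_seller, min(buyer_price + 1, new_seller)), p_concede),
        ((buyer_type, new_seller, buyer_price), 1.0 - p_concede),
    ]

for successor, prob in transitions(("Coop", 10, 0), "Concede 1"):
    print(successor, prob)   # ('Coop', 9, 1) 0.7 and ('Coop', 9, 0) 0.3
```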

Building the Initial Simplified POMDP
Human negotiation transcripts
–2 players (Grocer and Florist) with 4 issues
Mapped dialogues to 14 base codes (actions)
Other player's type known for each transcript
–Used for training and validation of the model
Transition: frequency of reaching some state, given a code
Observation: frequency of observing a code, given some negotiation state
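A sketch of how such frequency-based estimates could be computed from coded transcripts. The input format (a list of (state, code, next_state) triples) and the absence of smoothing are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

def estimate_probs(coded_transcripts):
    """Estimate P(next_state | state, code) and P(code | state) by frequency counting.
    coded_transcripts: iterable of (state, code, next_state) triples from labeled dialogues."""
    trans_counts = defaultdict(Counter)   # (state, code) -> Counter over next_state
    obs_counts = defaultdict(Counter)     # state -> Counter over codes
    for state, code, next_state in coded_transcripts:
        trans_counts[(state, code)][next_state] += 1
        obs_counts[state][code] += 1
    transition = {key: {s2: n / sum(ctr.values()) for s2, n in ctr.items()}
                  for key, ctr in trans_counts.items()}
    observation = {s: {code: n / sum(ctr.values()) for code, n in ctr.items()}
                   for s, ctr in obs_counts.items()}
    return transition, observation

# Tiny illustrative input: (negotiation state, observed code, next state)
data = [("start", "PC", "topic_set"), ("topic_set", "OS", "offer_on_table"),
        ("offer_on_table", "RPS", "offer_on_table"), ("offer_on_table", "RPO", "agreed")]
T, O = estimate_probs(data)
print(T[("offer_on_table", "RPO")])   # {'agreed': 1.0}
```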

POMDP Construction
(Diagram: pipeline from the Grocer-Florist transcripts into a Model Generator; the generated model, which starts empty and learns, is then used for reasoning over the model and for prescribing optimal actions given the state of the interaction.)

Codes used (courtesy of Laurie Weingart)
OFFER: OS = Single-Issue; OM = Multi-Issue
PROVIDE INFORMATION: IP = Issue Preferences; IR = Priorities; IB = Bottom-line
REACTIONS: RPO = Agreement to offer made; RPS = Agreement with statement; RNO = Disagreement with offer; RNS = Disagreement with statement; TP = Threat/Power
OTHER: Misc = Miscellaneous; SBF = Substantiation; Q = Question; PC = Procedural Comment; INT = Summarizing

Sample Grocer-Florist Transcript (Speaker | Code | Unit)
Florist | PC | So let's start with temperature
Grocer | RPS | Okay
Florist | OS | So I would suggest a temperature of 64 degrees
Grocer | RPS | Okay
Florist | Q | How does that work for you?
Grocer | IP | Well personally for the grocery I think it is better to have a higher temperature
Grocer | SBF | Just because I want the customers to feel comfortable
Grocer | SBF | And if it is too cold that might turn the customers away a little bit
Florist | RPS | Okay
Grocer | SBF | And also if it is warm, people are more apt to buy cold drinks to keep themselves comfortable and cool
Florist | RPS | That's true.
Grocer | OS | I think 66 would be good.
Grocer | SBF | That way it is not too cold and it is not too hot as well.
Grocer | SBF | And it's good for the customers.
Florist | RPO | Okay, yeah
Assumed the Florist is cooperative

Grocer POMDP generated
(Diagram: a trace through the generated POMDP. States move from (Me = Coop, You = Coop, 70F, 62F) to (70F, 64F) to (66F, 64F) to (66F, 66F). Annotated transitions: the Florist proposes 64F without committing; the parties discuss preferences and support their positions; the Grocer proposes 66F and substantiates his offer; the Florist agrees to 66F. Reward: 60 points for both Grocer and Florist.)

Negotiation Game
(Diagram: the agent (Grocer), driven by the optimal POMDP policy, exchanges Grocer actions and Florist actions with a human playing the Florist.)
Sequential
Process-oriented
Blends computational and social science results

Initial results – Classification of the Florist
10 transcripts for training: 4 cooperative, 6 non-cooperative
5 transcripts for testing
–Average of correctly classified
(Plot: uncertainty of the Grocer's belief about the Florist (Y axis) vs. number of communications (X axis))
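The plot above tracks how the Grocer's uncertainty about the Florist's type decreases as coded communications accumulate. A minimal sketch of computing such a curve, assuming the type is static and using Shannon entropy of the belief as the uncertainty measure (both assumptions for illustration):

```python
import numpy as np

def entropy(belief):
    """Shannon entropy of a belief vector, used here as the uncertainty measure."""
    p = belief[belief > 0]
    return float(-(p * np.log2(p)).sum())

def uncertainty_curve(belief, observation_likelihoods):
    """Uncertainty about the other's type after each observed communication.
    observation_likelihoods: for each communication, P(observed code | type) over types."""
    curve = [entropy(belief)]
    for likelihood in observation_likelihoods:
        belief = belief * likelihood      # Bayes update; the type is assumed static
        belief = belief / belief.sum()
        curve.append(entropy(belief))
    return curve

# Illustrative: two types (coop, non-coop); each observed code mildly favors "coop".
b0 = np.array([0.5, 0.5])
codes = [np.array([0.8, 0.3])] * 5
print(uncertainty_curve(b0, codes))      # entropy drops toward 0 as evidence accumulates
```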

Modeling Cultural Factors
How do we model cultural factors for C&N in a POMDP?
How do we validate the model?
Is the model general enough to exhibit plausible culturally-specific human behavior?

Culture and POMDP
Initial beliefs about others' social value orientation and behavior usually reflect one's own culture's beliefs about the interaction
Culture influences the frequency of particular actions and communications
Interpretation of each observation refines the agent's model of others
Interpretation is influenced by culture
–The model can capture cultural misinterpretations and their consequences in terms of strategy and outcomes
Agents from different cultures can have different rewards for the same actions
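One plausible way to reflect these points in POMDP terms (an assumption for illustration, not something stated on the slide) is to make both the initial belief and the observation model depend on the agent's own culture, so that the same observed behavior is interpreted differently by agents from different cultures:

```python
import numpy as np

# Hypothetical, culture-indexed priors over the other party's type (coop, non-coop)
# and culture-indexed likelihoods of one behavior: a "direct disagreement" code.
CULTURAL_PRIOR = {
    "culture_A": np.array([0.7, 0.3]),   # tends to assume the other is cooperative
    "culture_B": np.array([0.4, 0.6]),
}
CULTURAL_OBS_MODEL = {
    # P(direct-disagreement code | other's type), as interpreted by each culture
    "culture_A": np.array([0.1, 0.6]),   # reads direct disagreement as non-cooperative
    "culture_B": np.array([0.4, 0.5]),   # treats it as fairly normal for either type
}

def interpret_disagreement(own_culture):
    """Update a culture-dependent prior after observing one direct disagreement,
    showing how the same behavior can be read differently across cultures."""
    posterior = CULTURAL_PRIOR[own_culture] * CULTURAL_OBS_MODEL[own_culture]
    return posterior / posterior.sum()

print(interpret_disagreement("culture_A"))   # shifts strongly toward "non-cooperative"
print(interpret_disagreement("culture_B"))   # shifts only mildly
```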

Other's Type
Includes factors such as:
–Social Value Orientation: pro-social/cooperative, individualistic, competitive, altruistic
–Trust, Reputation, etc.
–Cultural factors: individualist vs. collectivist, egalitarian vs. hierarchical, direct vs. indirect communication

Cognitive Schema of A → POMDP
(Diagram: A's culture, A's history with B, and context form A's schema; B's culture, B's history with A, and context form B's schema. Each party's real intent produces behavior, which the other party interprets through its own schema. The schema maps onto the POMDP elements: State Space, Initial Beliefs, Actions, Observations, Transition, Reward.)

Capturing the Initial State of the Model
(Diagram: the same cognitive-schema-to-POMDP mapping; the initial state of the model is captured via survey experiments and Observer Experiments.)

Capturing Model Dynamics
(Diagram: the same cognitive-schema-to-POMDP mapping; the model dynamics are captured from intercultural transcripts.)

Plans for Next Year
Initial beliefs from the Observer Experiment and from surveys (US, Turkey, Egypt, Qatar)
Collect intra-cultural negotiation transcripts
–US, Turkey, Egypt
Build POMDPs from intra-cultural negotiation transcripts
–US, Turkey, Egypt
Build POMDPs from inter-cultural negotiation transcripts
–US-Hong Kong, US-German, US-Israeli (have) (courtesy of Wendi Adair and Jeanne Brett)
–US-Turkish, US-Egyptian, US-Qatari (to collect)

Plans for Next Year
Validate the predictive behavior of the models
–Using the transcripts for training and testing
Use the models in negotiation with humans
Use the models in what-if scenarios
Use the models to generate hypotheses to test with human subjects
Initial models for collaboration scenarios using POMDPs

Thank You
Any questions?