Automated Negotiation Agents Sarit Kraus Dept. of Computer Science Bar-Ilan University
Negotiation A discussion in which interested parties exchange information and come to an agreement. Davis and Smith, 1977
What is an Agent?
PROPERTY: MEANING
Situated: Sense and act in dynamic/uncertain environments
Flexible: Reactive (responds to changes in the environment); Pro-active (acting ahead of time)
Autonomous: Exercises control over its own actions
Goal-oriented: Purposeful
Persistent: Continuously running process
Social: Interacts with other agents/people
Learning: Adaptive
Mobile: Able to transport itself
Personality: Character, Emotional state
No Agent is an Island: automated agents negotiate with other automated agents Monitoring electricity networks (Jennings) Distributed design and engineering (Petrie et al.) Distributed meeting scheduling (Sen & Durfee, Tambe) Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe, Kaminka) Electronic commerce (Kraus et al.) Collaborative Internet-agents (Etzioni & Weld, Weiss) Collaborative interfaces (Grosz & Ortiz, Andre) Information agent on the Internet (Klusch, Kraus et al.) Cooperative transportation scheduling (Fischer) Supporting hospital patient scheduling (Decker & Jin)
Agents negotiate with humans Training people in negotiations Trade agents for the Web Elves agents– representing people
Plan of talk: agents negotiate with humans. Automated agent for bilateral negotiations with complete information: the fishing dispute (collaborators: Penina Hoz-Weiss, Jon Wilkenfeld). Automated agent for multi-party negotiations: the Diplomacy game (collaborators: Daniel Lehmann and Eitan Ephrati). Ongoing work: learning, incomplete information; mediation (collaborators: Dudi Sarne, Barbara Grosz, Lin Raz, Michal Halamish).
Fishing Dispute Negotiators: Canada and Spain. Canada's stock of flatfish has been decreasing over the years. Spain has fished this same stock of flatfish for many years, but outside the Canadian exclusive economic zone (EEZ). Canada would like Spain to restrict its fishing near her EEZ. Spain is dependent on fishing in the area outside the EEZ for employment and trade purposes.
Possible Outcomes An agreement on Total Allowable Catch (TAC). An agreement on limiting the length of the fishing season. Canada enforces conservation measures with military forces against Spain. Spain enforces its right to fish throughout the fishery with military force against Canada. If the negotiation has not ended prior to the deadline, then it terminates with a status quo outcome.
World State Parameters World state parameters are also negotiable and affect the utility of players: Canada subsidizes removal of Spain's ships (0, 5, 10, 15, 20 ships). Spain reduces the amount of pollution caused by the fishing fleet (0%, 15%, 25%, 50%). Canada imposes trade sanctions on Spain. Spain imposes trade sanctions on Canada
Fishing Dispute [Diagram. Outcomes: TAC; Limit Season; Opt Out; Status Quo. World State Parameters: Canada subsidizes ships; Spain reduces pollution; Canada imposes trade sanctions; Spain imposes trade sanctions.]
Negotiation Process Each of the parties can make requests, threats, offers, conditional offers and counteroffers, as well as comment on the negotiation. The utility of each ending is affected by the period in which the negotiation ends. Canada loses over time since Spain continues to fish while negotiating. Spain gains over time for the same reason. [Figure labels: Spain, Thule, Canada, Ultima]
Negotiations in the Fishing Dispute S: Spain offers to set TAC at 44 thousand tons. C: Canada offers to set TAC at 18 thousand tons. S: Spain asks that Canada compensate Spain for Spain's restricted fishing practices by replacing the income of twenty ships.
Other Negotiation Games Team games (SPIRE): negotiations on coordination; exchange of information; finding solutions is complex. Competitive games: when agents can benefit from reaching an agreement (also in bilateral games). Trade games: Monopoly, Traders of Genoa, Kohle, Kies & Knete, Treasure game. War games: Diplomacy, Risk. Crisis games: Hostage Crisis. Semi-cooperative games: Color Trail, Majority Game.
Chess Programs play chess as well as people. Programs play chess in a way much different from people: they mainly search the game tree.
[Figure: Search tree for Tic-Tac-Toe. Levels alternate between players A and B; final states carry evaluation values (0, +1).]
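The game-tree search mentioned on the chess slide can be sketched on the Tic-Tac-Toe example. This is a minimal minimax sketch, not the method of any particular chess program. The board encoding (a 9-character string with "." for empty cells), the function names, and the use of -1 for a loss by player A are assumptions made for illustration.

```python
from functools import lru_cache

# Index triples that form a line on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'A' or 'B' if that player has three in a row, else None."""
    for i, j, k in LINES:
        if board[i] != "." and board[i] == board[j] == board[k]:
            return board[i]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` for player A (+1 win, 0 draw, -1 loss), `player` to move."""
    w = winner(board)
    if w is not None:
        return 1 if w == "A" else -1
    if "." not in board:
        return 0  # final state with no winner: a draw
    other = "B" if player == "A" else "A"
    # Expand every legal move and evaluate the resulting subtree.
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."]
    # A maximizes the evaluation, B minimizes it.
    return max(values) if player == "A" else min(values)
```

Calling `minimax("." * 9, "A")` searches the whole tree from the empty board. Chess programs cannot search to the final states, so they cut the search off and apply a heuristic evaluation instead.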
Fishing dispute vs. Chess Type of game: crisis game vs. war game; coordination game vs. zero-sum game. Number of players: 2. Moves: simultaneous + negotiations vs. sequential; need to reach an agreement. Number of pieces to move: no pieces vs. one piece at a time. Information: complete information. Needed capabilities: negotiation skills vs. strategic skills.
Playing Techniques NEGOTIATIONS Game theory techniques: formalize the game; find an equilibrium; follow the equilibrium strategy. Market techniques. Appropriate for games of many players that can exchange similar items. Heuristics: domain specific; advice books; human like strategies Markov Decision Processes. Modeling the opponent Learning from DB Learning from experience CHESS Heuristic Search
The Automated Negotiator Agent (fishing dispute) The agent plays the role of one of the countries. During the negotiation the agent receives messages, analyzes them and responds. It also initiates a discussion on one or more parameters of the agreement. It takes actions when needed.
Nash Equilibrium An action profile is an ordered set a=(a1,…,aN) of one action for each of the N players in the game. An action profile a is a Nash Equilibrium (Nash 53) of a strategic game if no agent j has a different action yielding an outcome that it prefers to the one generated when it chooses aj, given that every other player i chooses ai.
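The definition can be checked mechanically for small games. The sketch below enumerates the pure-strategy action profiles of a finite strategic game and keeps those from which no player gains by deviating alone. The payoff-table representation, the function name, and the example game are hypothetical, chosen only to illustrate the definition.

```python
from itertools import product


def pure_nash_equilibria(payoffs, n_actions):
    """Return all pure-strategy Nash equilibria.

    payoffs maps each action profile (a tuple of action indices, one per
    player) to a tuple of utilities, one per player.
    n_actions gives the number of actions available to each player.
    """
    equilibria = []
    for profile in product(*(range(n) for n in n_actions)):
        stable = True
        for j, n in enumerate(n_actions):
            for alt in range(n):
                # Player j deviates to `alt`; everyone else keeps their action.
                deviation = profile[:j] + (alt,) + profile[j + 1:]
                if payoffs[deviation][j] > payoffs[profile][j]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical two-player coordination game with two equilibria.
coordination = {
    (0, 0): (2, 1), (0, 1): (0, 0),
    (1, 0): (0, 0), (1, 1): (1, 2),
}
print(pure_nash_equilibria(coordination, (2, 2)))  # [(0, 0), (1, 1)]
```

Enumeration is only feasible for small action sets, and it finds pure-strategy equilibria only; a game such as matching pennies has none.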
Strategy of Negotiation Formal strategic negotiation theory: The agent is based on a bargaining model. By backward induction the agent builds the strategy to be reached at each time period according to the sequential equilibrium (Kraus, Strategic Negotiation in Multiagent Environments, MIT Press 2001). When the agent plays against humans this is not enough: heuristics are needed.
Automated agent: Using equilibrium strategy when playing against humans Human negotiators do not use equilibrium strategies even though the game is not complex and the automated agent finds the equilibrium fast. Not surprising: Kahneman & Tversky showed that humans do not use decision theory. The agent using the equilibrium did not reach beneficial agreements.
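Backward induction can be illustrated on a deliberately simplified bargaining game. The sketch below is not the fishing-dispute model: it assumes a finite alternating-offers game over a pie of size 1, with per-period discount factors `delta_a` and `delta_b` and a payoff of zero for both sides if the deadline passes. All names and parameters are hypothetical.

```python
def backward_induction(periods, delta_a, delta_b):
    """Return, for each period, (proposer, proposer_share, responder_share).

    Shares are expressed in that period's own terms. A proposes in even
    periods and B in odd periods; no agreement by the deadline yields 0.
    """
    delta = {"A": delta_a, "B": delta_b}
    plan = [None] * periods
    next_proposer_share = 0.0  # after the deadline nobody gets anything
    for t in reversed(range(periods)):
        proposer = "A" if t % 2 == 0 else "B"
        responder = "B" if proposer == "A" else "A"
        # The responder would propose in the next period, so it accepts any
        # share worth at least its discounted continuation value.
        responder_share = delta[responder] * next_proposer_share
        proposer_share = 1.0 - responder_share
        plan[t] = (proposer, proposer_share, responder_share)
        next_proposer_share = proposer_share
    return plan


for step in backward_induction(3, 0.5, 0.5):
    print(step)
```

The plan is built from the last period backward, so the offer at period 0 already accounts for everything that could happen later. In the fishing dispute the time effects are asymmetric (Canada loses over time while Spain gains), which this sketch does not capture.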
Heuristics Negotiation tactics Attributes Risk Attitude Opting out Fine tuning
Attributes Number of points lower than the equilibrium utility value that the agent will agree to. The number of tons of fish (TAC) by which the agent will increase/decrease its offer. Sending the first message / waiting to receive a message. Full offer message or not.
Modeling the risk attitude of the opponent The agent is always neutral toward risk, but is sensitive to the risk level of its human opponent and will change its view of the human's utility function accordingly. Risk attitude influences the agreement an opponent is willing to accept. The agent begins with the assumption that its opponent is risk neutral. It uses a heuristic method to decide whether to change the estimation of the risk attitude of the opponent. When the agent decides that its opponent is risk prone, it changes the opponent's utility function. This leads the agent to a recalculation of its strategy.
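The slide does not say which heuristic the agent uses, so the following is only an illustration of the control flow: start from a risk-neutral model, watch the opponent's behavior, and switch the utility function when the evidence points to a risk-prone opponent. The trigger rule (a fixed number of rejections that a risk-neutral opponent would not have made), the convex power transform, and every name and default value here are assumptions.

```python
class OpponentModel:
    """Hypothetical model of the opponent's risk attitude."""

    def __init__(self, base_utility, threshold=2, risk_prone_exponent=2.0):
        self.base_utility = base_utility  # maps an offer to a utility in [0, 1]
        self.threshold = threshold        # surprising rejections before switching
        self.risk_prone_exponent = risk_prone_exponent
        self.risk_attitude = "neutral"    # initial assumption
        self.surprising_rejections = 0

    def utility(self, offer):
        """Current estimate of the opponent's utility for `offer`."""
        u = self.base_utility(offer)
        if self.risk_attitude == "prone":
            # Assumed convex transform: a risk-prone opponent values
            # mid-range agreements less than a risk-neutral one would.
            return u ** self.risk_prone_exponent
        return u

    def observe_rejection(self, offer, reservation):
        """Record a rejected offer; return True if the strategy needs recalculation."""
        if (self.risk_attitude == "neutral"
                and self.base_utility(offer) >= reservation):
            # A risk-neutral opponent should have accepted this offer.
            self.surprising_rejections += 1
            if self.surprising_rejections >= self.threshold:
                self.risk_attitude = "prone"
                return True
        return False
```

A caller would recompute its negotiation strategy whenever `observe_rejection` returns True, using `utility` as the updated view of the opponent.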
Fishing Dispute: Conclusions We developed an agent that can play well against a human player. The agent was tested on students in their third year of computer science studies. The results of the experiments implied that the agent plays well and fairly. It raised the sum of the utilities in the simulations it was involved in. The agent played as Spain significantly better than a human did, and just as well as a human Canada player.
Diplomacy's Rules Each player represents one of seven European powers: England, Germany, Russia, Turkey, Austria-Hungary, Italy and France.
Diplomacy's Rules (Cont.) Winner: The power that gains control over the majority of the board. Beginning: 1901; two seasons a year. A season: consists of a negotiations stage and a move stage. Moves: All players secretly write the orders for all of their units simultaneously. Negotiations: Coalitions and agreements among the players reached in the negotiations stage significantly affect the course of the game. The rules of the game do not bind a player to anything she says. Deciding whom to trust as situations arise is part of the game.
Negotiations in Diplomacy R: If you support my attack on Vienna I will support your attack on Rumania. G: I know that Italy is going to attack Trieste. F: Don't trust Germany. E: If you will not help me I will attack you.
Moves in Diplomacy Only one unit may be in any space at one time. A unit can be ordered to: move, support, hold (convoy). An army or a fleet may support the move of another unit of that country or any other country in making a move. Support can also be given on a defensive basis. Opposing units with equal support do not move. An advantage of only one support is sufficient to win.
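The support rule can be sketched as a comparison of support counts for the units contesting one space. This is a minimal illustration of the two sentences on equal and unequal support, not a full Diplomacy adjudicator: it ignores convoys, cut supports and dislodgement details. The function name and the unit labels are hypothetical.

```python
def resolve_contest(supports):
    """Decide who wins a contested space.

    supports maps each unit contesting the space to its number of supports.
    Returns the winning unit, or None when the strongest units are tied
    (a standoff, so nobody moves).
    """
    best = max(supports.values())
    leaders = [unit for unit, count in supports.items() if count == best]
    # An advantage of a single support is enough; a tie means no movement.
    return leaders[0] if len(leaders) == 1 else None


print(resolve_contest({"A Venice": 1, "A Trieste": 0}))  # A Venice
print(resolve_contest({"A Venice": 1, "A Trieste": 1}))  # None
```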
The Need for Negotiations in Diplomacy Moves require close cooperation between various allied powers. Incomplete information: communications between players are done secretly. The game is complex: 8^34 possible moves in each step of the game (without negotiation moves). Negotiation is used to obtain information about the goals of the other players. Others negotiate.
Diplomacy vs. Chess Type of game: war games. Number of players: 7 vs. 2 Moves: simultaneous vs. sequential. Number of pieces to move: all pieces vs. one piece. Information: uncertainty about messages exchanged between other players vs. full information Needed capabilities: negotiation skills vs. strategic skills.
Playing Techniques NEGOTIATIONS Game theory techniques: formalize the game; find an equilibrium; follow the equilibrium strategy. Impossible in Diplomacy because of complexity. Market techniques. Appropriate for games of many players that can exchange similar items. Heuristics: domain specific; advice books; human like strategies Markov Decision Processes. Learning from DB Learning from experience CHESS Heuristic Search
Diplomat: an Automated Diplomacy Player [Architecture diagram. Components: Analysis & Strategies Finder; Negotiations; Moves; Analysis. Data: Previous Agreements; Beliefs on other players; Board Status; Agreements; Detailed plans and their estimated value for possible coalitions; Others' Moves.]
Diplomacy Structure [Organization diagram: Prime Minister; Secretary; Foreign Office; Ministry of Defense; Military Headquarters; Intelligence; Strategies Finder; Front 1, Front 2, Front 3; Desk 10, Desk 11, Desk 12; Analyzer 13, Analyzer 14; Write orders 15.]
Strategies Finder (SF) Front: possible enemies and possible allies, e.g., Russia and Italy against Austria and Germany. Diplomat's strategy for a given front includes: A list of orders associated with their purpose. The expected average profit from carrying out the strategy for each power that participates in the strategy and the common expected profit for all of the powers. Example: A Venice (I) moves to Trieste in order to attack Trieste. A Vienna (R) supports A Venice to Trieste in order to attack Trieste. ……… Expected outcome: Aver: 10617 Min: 5002 Max: 20862 Russia: 3358 Italy: 18117
Strategies Finder (SF) (Cont) Diplomat identifies possible fronts based on ongoing agreements, beliefs about other agents and their relations. SF finds some strategies for each front using domain-specific heuristics. The value of each strategy is computed by finding strategies for the enemies of the front. The negotiation is done based on the identified strategies. Question: What is the best strategy?
Diplomat's negotiation Exchange information; decide what kind of agreement to try to achieve. Find common enemies. Negotiating about the general purposes of an agreement: spaces on the board to attack, to defend, to leave or to enter. Deciding on the specific movements in order to achieve the purposes from the previous stage. Signing the final agreement; deciding whether to keep it.
Diplomat's behavior is not deterministic Diplomat has special "personality" traits that affect its behavior and may be varied easily from game to game. Diplomat "flips coins" in the following cases: To decide whether to pretend to keep an agreement or to tell the other partner that it will break the agreement (the probability depends on the personality traits). To decide whether to give more details about a suggestion. To decide which opening to use. When SF searches for possible strategies, for example, to decide which units will participate in the attack or defense of a given location and to guess which of the enemy's units will participate in the battle of that location.
Diplomat's Evaluation Diplomat was evaluated and consistently played better than human players. It did not play enough games to yield statistically significant results. It was hard to evaluate what contributed to its success.
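A "coin flip" whose bias depends on a personality trait can be sketched as a Bernoulli draw. The trait name `candor`, its meaning, and the function below are hypothetical; the slides do not list Diplomat's actual traits or probabilities.

```python
import random
from dataclasses import dataclass


@dataclass
class Personality:
    """Hypothetical trait set; values may be varied from game to game."""
    candor: float  # probability of announcing a planned breach, in [0, 1]


def announces_breach(personality, rng):
    """Flip a biased coin: True means tell the partner the agreement will be broken."""
    return rng.random() < personality.candor


rng = random.Random(1901)  # seeded only to make a run repeatable
blunt = Personality(candor=0.8)
decisions = [announces_breach(blunt, rng) for _ in range(5)]
```

Passing the random generator in explicitly keeps the behavior non-deterministic in play while still allowing a game to be replayed from a known seed.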
Conclusions It is possible to develop automated negotiators!! Is it possible to develop standard methods for playing negotiation games (as in Chess)? Ongoing work: incomplete information; modeling the opponent's preferences; learning to negotiate.
Learning to negotiate: 3-player majority game You are one of 3 players: you need to divide the rights for a goldmine. [Figure: You, Player 1, Player 2]
Simple Game Protocol (cont.) In each game round one player is selected randomly, and he/she gets to make a division proposal. [Figure: You, Player 1, Player 2]
Simple Game Protocol (cont.) Based on the proposals the players vote. It takes a majority to make a decision: the proposer and one other player. [Figure: You, Player 1, Player 2]
Simple Game Protocol (cont.) Once a majority is reached the game ends and each player gets his/her share. Otherwise (no agreement), a new proposer is selected and an additional round is played. [Figure: You, Player 1, Player 2]
Simple Game Protocol (cont.) However, it is not certain that a new round will take place!!! There is a continuation probability: if no agreement was reached, there is a possibility that the game will suddenly end and all players will get zero. No Agreement: P(New Turn) = 0.9, P(End Game) = 0.1
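The protocol described on these slides can be sketched as a short simulation loop. The callback interface (`propose`, `accepts`), the representation of a division as a dictionary of shares, and the function name are assumptions for illustration; only the player names and the 0.9 continuation probability come from the slides.

```python
import random


def play_majority_game(propose, accepts,
                       players=("You", "Player 1", "Player 2"),
                       p_continue=0.9, rng=None):
    """Run one game and return the final payoff of each player.

    propose(proposer) returns a division: a dict mapping player -> share.
    accepts(player, division) returns True if that player votes in favor.
    """
    rng = rng or random.Random()
    while True:
        proposer = rng.choice(players)  # one player is selected randomly
        division = propose(proposer)
        # The proposer always backs its own proposal.
        votes = sum(1 for p in players
                    if p == proposer or accepts(p, division))
        if votes >= 2:  # majority: the proposer and at least one other player
            return division
        if rng.random() >= p_continue:  # the game suddenly ends
            return {p: 0 for p in players}
```

With `p_continue=0.9`, each failed round carries a 0.1 chance that everyone ends with zero, which is what gives responders a reason to accept less than an equal share.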
Agent Design Collect and manage a DB of previous games. Given a new game, find similar situations in the DB. Maximize utility given previous behaviour.
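One way to read this design is as a lookup of similar past situations followed by an expected-utility choice. The sketch below assumes a much simplified DB in which each record is a pair (share offered to a responder, whether it was accepted), treats two situations as similar when the offered shares differ by at most `tolerance`, and ignores the continuation probability and the third player. None of these details come from the slides.

```python
def acceptance_rate(db, share, tolerance=0.05):
    """Fraction of similar past offers that were accepted (0.0 if none are similar)."""
    similar = [accepted for offered, accepted in db
               if abs(offered - share) <= tolerance]
    return sum(similar) / len(similar) if similar else 0.0


def best_offer(db, candidates, tolerance=0.05):
    """Pick the candidate share with the highest expected payoff for the proposer.

    Expected payoff = share the proposer keeps * estimated acceptance rate.
    """
    return max(candidates,
               key=lambda s: (1 - s) * acceptance_rate(db, s, tolerance))


# Hypothetical records of previous games: (share offered, accepted?)
db = [(0.2, False), (0.2, False), (0.4, True), (0.4, True), (0.5, True)]
print(best_offer(db, [0.2, 0.4, 0.5]))  # 0.4
```

Here the agent offers 0.4 rather than 0.5 because past responders accepted both, and it keeps more at 0.4.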
Color Trail Game Co-developer: Barbara Grosz Harvard University