A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST)

A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST) Dept. of Computer Science Texas A&M University

2 Motivation
[Figure labels: Agent; Multi-Agent Team]
• Agents share a large amount of knowledge about the teamwork.
• Hard-coded interactions among participants.
• High-frequency message exchange.
• Communication risk.

3 Challenging Issues in Designing Communication Protocols
• Each agent has incomplete information, from which uncertainties arise.
• Each agent has different problem-solving capabilities.
• Data are decentralized and the system lacks global control.
• Excessive or unrestricted communication leads to a lack of scalability.

4 Our Approach and Its Contributions
Proactive Communication
• OBPC: Reduction of communication load through OBservations.
• DIP: Dynamic estimation of the probability distribution of Information Production and need.
• DTPC: Decision-Theoretic determination of communication strategies.

5 Background
• CAST (Collaborative Agents for Simulating Teamwork)
• MALLET (Multi-Agent Logic-based Language for Encoding Teamwork)

(team-plan killwumpus (?w)
  (process
    (seq
      (agent-bind ?ca (constraint (play-role ?ca scout)))
      (DO ?ca (findwumpus ?w))
      (agent-bind ?fi (constraint ((play-role ?fi fighter)
                                   (closest-to-wumpus ?fi ?w))))
      (DO ?fi (movetowumpus ?w))
      (DO ?fi (shootwumpus ?w)))))

(ioper shootwumpus (?w)
  (pre-cond (wumpus ?w) (location ?w ?x ?y) (dead ?w false))
  (effect (dead ?w true)))

6 Overview
[Diagram labels: CAST; KB; Team Structure & Teamwork Procedure; Proactive Communication (OBPC, DIP, DTPC); Optimal Communication Strategy]

7 Agent Execution Cycle
[Diagram: Execution Cycle; each stage is shown with its label]
• Observe: Sense
• Predict: Info. need and production
• Decide: Strategy
• Communicate: Information
• Act: Effect
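As a rough illustration of the cycle on this slide, the sketch below runs the five stages in order and threads a shared agent state through them. The stage names come from the slide; `run_cycle`, the handler signature, and the dictionary-based state are hypothetical and are not CAST's actual interface.

```python
from typing import Callable, Dict, List

# Stage order as listed on the slide.
STAGES: List[str] = ["observe", "predict", "decide", "communicate", "act"]

# Hypothetical handler type: takes the agent state and returns the updated state.
Handler = Callable[[dict], dict]


def run_cycle(handlers: Dict[str, Handler], state: dict) -> dict:
    """Run one pass of the execution cycle, one stage after another."""
    for stage in STAGES:
        state = handlers[stage](state)
    return state


def make_tracer(name: str) -> Handler:
    """Build a placeholder handler that only records that its stage ran."""
    def handler(state: dict) -> dict:
        return {**state, "trace": state.get("trace", []) + [name]}
    return handler


if __name__ == "__main__":
    handlers = {name: make_tracer(name) for name in STAGES}
    print(run_cycle(handlers, {})["trace"])
```

In a real agent each handler would be replaced by sensing, prediction of information need and production, strategy selection, message exchange, and action execution.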

8 Syntax of Observability ::= (CanSee )* (BelieveCanSee )* ::= ::= | ::= ( )* ::= ( ) ::= (DO ( )) ::= | ::=

9 Example Observability Rules

(CanSee ca (location ?o ?x ?y)
  (location ca ?xc ?yc)
  (location ?o ?x ?y)
  (inradius ?x ?y ?xc ?yc rca))
// The carrier can see the location property of an object.

(CanSee ca (DO ?fi (shootwumpus ?w))
  (play-role fighter ?fi)
  (location ca ?xc ?yc)
  (location ?fi ?x ?y)
  (adjacent ?xc ?yc ?x ?y))
// The carrier can see the shootwumpus action of a fighter.

(BelieveCanSee ca fi (location ?o ?x ?y)
  (location fi ?xi ?yi)
  (location ?o ?x ?y)
  (inradius ?x ?y ?xi ?yi rfi))
// The carrier believes the fighter is able to see the location property of an object.

(BelieveCanSee ca fi (DO ?f (shootwumpus ?w))
  (play-role fighter ?f)
  (≠ ?f fi)
  (location ca ?xc ?yc)
  (location fi ?xi ?yi)
  (location ?f ?x ?y)
  (inradius ?xi ?yi ?xc ?yc rca)
  (inradius ?x ?y ?xc ?yc rca)
  (adjacent ?x ?y ?xi ?yi))
// The carrier believes the fighter is able to see the shootwumpus action of another fighter.
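The two CanSee rules can be read as plain predicates over grid positions. The sketch below is one possible reading, not the CAST implementation: `Pos`, `can_see_location`, and `can_see_shoot` are hypothetical names, and the slides do not define `inradius` or `adjacent`, so the distance measures used here (square neighbourhood and 4-neighbourhood) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pos:
    x: int
    y: int


def inradius(x: int, y: int, xc: int, yc: int, r: int) -> bool:
    # Assumption: "in radius" means within r cells along both axes.
    return max(abs(x - xc), abs(y - yc)) <= r


def adjacent(xc: int, yc: int, x: int, y: int) -> bool:
    # Assumption: adjacency is the 4-neighbourhood (no diagonals).
    return abs(xc - x) + abs(yc - y) == 1


def can_see_location(observer: Pos, obj: Pos, radius: int) -> bool:
    """First rule: the observer sees an object's location if it is in radius."""
    return inradius(obj.x, obj.y, observer.x, observer.y, radius)


def can_see_shoot(observer: Pos, fighter: Pos) -> bool:
    """Second rule: the observer sees a shoot action of an adjacent fighter."""
    return adjacent(observer.x, observer.y, fighter.x, fighter.y)


if __name__ == "__main__":
    carrier = Pos(5, 5)
    print(can_see_location(carrier, Pos(7, 6), radius=2))
    print(can_see_shoot(carrier, Pos(5, 6)))
```

The BelieveCanSee rules would apply the same predicates from the carrier's point of view, using the carrier's beliefs about the fighter's position rather than its own.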

10 Proactive Communication Based on Observation
• ProactiveTell
  – A provider reasons about what information it will have.
  – A provider reasons about whether to deliver a piece of information when it has that information.
• ActiveAsk
  – A needer reasons about what information it will need.
  – A needer reasons about whether to ask for a piece of information when it needs that information.
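One way observability could prune messages in the two protocols is sketched below. The rule used (stay silent when the other party is believed to observe the information itself) is an assumed reading of the slides, and both function names and their boolean inputs are hypothetical.

```python
def should_proactive_tell(has_info: bool,
                          needer_needs_info: bool,
                          believes_needer_can_see: bool) -> bool:
    """Provider side: tell only if the needer needs the information
    and is not believed to be able to observe it directly."""
    return has_info and needer_needs_info and not believes_needer_can_see


def should_active_ask(needs_info: bool,
                      can_see_info: bool,
                      believes_provider_can_see: bool) -> bool:
    """Needer side: ask only if the information cannot be observed directly
    and the provider is believed to be able to observe it."""
    return needs_info and not can_see_info and believes_provider_can_see


if __name__ == "__main__":
    # The fighter is believed to see the wumpus itself, so the carrier stays silent.
    print(should_proactive_tell(True, True, believes_needer_can_see=True))
    # The carrier cannot see the wumpus but believes the fighter can, so it asks.
    print(should_active_ask(True, can_see_info=False, believes_provider_can_see=True))
```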

11 Evaluation
Multi-Agent Wumpus World
• 20 wumpuses, 8 pits, and 20 piles of gold per world.
• 1 carrier and 3 fighters compose a team.
• The team goal is to kill wumpuses and get the gold without being killed.
• 5 randomly generated worlds with 20×20 cells.

12 Decision-Theoretic Proactive Communication
• Strategies
• Utility Function
• Cost Function
• Value Function
• Decision-Making

13 Decision-Making on Situation PA
Situation PA: Provider produces a new piece of information.
[Decision diagram; nodes: 0, 1, 2, e; edge labels: a-b: ProactiveTell, a-b: Silence, b-a: Accept, b-a: Wait, b-a: Silence, b-a: ActiveAsk]
Legend: a: provider; b: needer; e: end

14 DM on Situation PB
Situation PB: Provider receives a request for a piece of information.
[Decision diagram; nodes: 0, e; edge labels: a-b: Reply, a-b: WaitUntilNext]

15 DM on Situation NA
Situation NA: Needer needs a piece of information.
[Decision diagram; nodes: 0, 1, t, e; edge labels: b-a: ActiveAsk, b-a: Silence, b-a: Wait, a-b: Reply, a-b: WaitUntilNext, a-b: Silence, a-b: ProactiveTell]
Legend: t: transfer

16 DM on Situation NB
Situation NB: Needer receives a piece of information.
[Decision diagram; nodes: t, 0, e; edge label: b-a: Accept]

17 Utility Function
• Parameters in the utility function:
  – I: information about which communication occurs
  – t: time of decision-making
  – t_1: time at which I is needed
  – t_2: time at which the value of I that is used was produced
  – SU: situation at t
  – S: strategy available at SU
  – M: the set of messages involved in obtaining I
  – E: environment state at t

U(I, t, t_1, t_2, SU, S, M, E) = V(I, t, t_1, t_2, SU, S) - C(M)

18 Value Function
V(I, t, t_1, t_2, SU, S) = T(I, t, t_1, t_2, SU, S)    // Timeliness
                         + R(I, t, t_1, t_2, SU, S)    // Relevance

19 Timeliness Function
• Timeliness
  – Whether agents use a value that can be produced in time when they need I.

d(I, t, t_1, t_2, SU, S) = max(0, t_2 - t_1)
ft(d(I, t, t_1, t_2, SU, S)), s.t. ft(x) < ft(y) if y < x
T(I, t, t_1, t_2, SU, S) = ft(d(I, t, t_1, t_2, SU, S))
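A minimal sketch of the timeliness term: the delay d is zero when the value is produced by the time it is needed, and ft is any strictly decreasing function of d. The slide only constrains ft to be decreasing, so the exponential form and the `scale` parameter below are assumptions chosen for illustration.

```python
import math


def delay(t1: float, t2: float) -> float:
    """d = max(0, t2 - t1): how long after the need time t1 the value was produced."""
    return max(0.0, t2 - t1)


def timeliness(t1: float, t2: float, scale: float = 1.0) -> float:
    """T = ft(d), with ft(x) = scale * exp(-x) as an assumed decreasing function."""
    return scale * math.exp(-delay(t1, t2))


if __name__ == "__main__":
    # Value produced before it is needed: no delay, full timeliness.
    print(timeliness(t1=5.0, t2=3.0))
    # Value produced after it is needed: timeliness drops as the delay grows.
    print(timeliness(t1=5.0, t2=5.5), timeliness(t1=5.0, t2=6.0))
```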

20 Relevance Function
• Relevance
  – Unprocessed, most recent, important

P(I, t, t_1, t_2, SU, S) = Pr(I ∧ t ∧ t_1 ∧ t_2 ∧ no other value for I was produced in Int[t_1, t_2] | S ∧ SU)
fr_I(P(I, t, t_1, t_2, SU, S)), s.t. fr_I(x) < fr_I(y) if x < y
R(I, t, t_1, t_2, SU, S) = fr_I(P(I, t, t_1, t_2, SU, S))

21 Cost Function
C(M_i) = 0                         if M_i = ∅
C(M_i) = k_1 + k_2 × len(M_i)      otherwise
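The piecewise cost can be sketched directly: an empty message is free, and any other message costs a fixed overhead k_1 plus k_2 per unit of length. The default values for `k1` and `k2`, the use of string length for len(M_i), and summing per-message costs to get the cost of a message set are assumptions made for this example.

```python
from typing import Iterable, Optional


def message_cost(message: Optional[str], k1: float = 1.0, k2: float = 0.1) -> float:
    """C(M_i): 0 for an empty message, otherwise k1 + k2 * len(M_i)."""
    if not message:
        return 0.0
    return k1 + k2 * len(message)


def total_cost(messages: Iterable[Optional[str]], k1: float = 1.0, k2: float = 0.1) -> float:
    """Assumed aggregate C(M): the sum of the per-message costs."""
    return sum(message_cost(m, k1, k2) for m in messages)


if __name__ == "__main__":
    print(message_cost(None))
    print(message_cost("abcde", k1=1.0, k2=0.5))
    print(total_cost(["ab", None, "abcd"], k1=1.0, k2=0.5))
```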

22 Expected Utility
E(U) =
[Table: Strategy against the times t_1 and t_2; only the row labels and the entries below survive]
• P.ProactiveTell
• P.Silence: +T
• P.Reply
• P.WaitUntilNext
• N.ActiveAsk: if a Reply; if a WaitUntilNext
• N.Silence
• N.Wait: if a ProactiveTell; +T if a Silence
• N.Accept
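A hedged sketch of how an expected-utility comparison between strategies might look, assuming the usual decision-theoretic rule of picking the strategy with the highest probability-weighted utility, with utility taken as timeliness plus relevance minus cost. The `Outcome` tuple layout, both function names, and every number in the usage example are illustrative placeholders, not values from the presentation.

```python
from typing import Dict, List, Tuple

# Hypothetical outcome record: (probability, timeliness, relevance, cost).
Outcome = Tuple[float, float, float, float]


def expected_utility(outcomes: List[Outcome]) -> float:
    """Sum of p * (T + R - C) over the possible outcomes of one strategy."""
    return sum(p * (t + r - c) for p, t, r, c in outcomes)


def best_strategy(options: Dict[str, List[Outcome]]) -> str:
    """Return the name of the strategy with the highest expected utility."""
    return max(options, key=lambda name: expected_utility(options[name]))


if __name__ == "__main__":
    # Situation PA with made-up numbers: telling always pays a message cost,
    # while silence risks a later ask/reply exchange that arrives late.
    options = {
        "ProactiveTell": [(1.0, 1.0, 0.8, 0.5)],
        "Silence": [(0.6, 1.0, 0.8, 0.0), (0.4, 0.2, 0.8, 0.7)],
    }
    for name, outcomes in options.items():
        print(name, round(expected_utility(outcomes), 3))
    print(best_strategy(options))
```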

23 Strategies
Situation PA: provider produces I
Candidate strategies: ProactiveTell? Silence?
[Timeline over t; labels: Current time, Known, Unknown, Last sent, Last not sent, Last need aware of, Unfulfilled need, Next production]

24 Strategies
Situation PB: provider receives a request for I
Candidate strategies: Reply? WaitUntilNext?
[Timeline over t; labels: Current time, Known, Unknown, Last production, Next production]

25 Strategies
Situation NA: needer needs I
Candidate strategies: ActiveAsk? Wait? Silence?
[Timeline over t; labels: Current time, Known, Unknown, Last I received, Most recent production, Next production]

26 Strategies
Situation NB: needer receives I
Strategy: Accept

27 Summary
Advantages of the approach: it allows agents to make intelligent choices of communication policy based on:
  – frequencies: of needs, of sensing, of information change
  – costs: of messages, plus penalties for delays in action or for acting with incorrect information

28 Criteria for Applicable Domains
• There are information needs among the team.
• Agents can communicate.
• There is uncertainty in the environment.
  – Stochastic properties of the teamwork process.
  – Agents have incomplete/disjoint knowledge about the world.
• The team acts under critical time constraints, so proactive assistance becomes important.
