Distributed Algorithms for DCOP: A Graphical-Game-Based Approach


1

2 Distributed Algorithms for DCOP: A Graphical-Game-Based Approach
In MGM, each agent broadcasts a gain message to all of its neighbors. The gain is the maximum change in the agent's local utility if it alone were allowed to act in the current context. An agent is then allowed to act only if its gain is larger than every gain message it receives from its neighbors (ties can be broken through variable ordering or another method). So what?
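The gain-message rule described above can be sketched as one synchronous round. This is a minimal illustration, not the presenters' implementation: the function name `mgm_round`, the dictionary layout, and the `utility(a, va, b, vb)` callback are assumptions made for the example, and ties are broken by sorted agent name as one possible variable ordering.

```python
def mgm_round(values, domains, neighbors, utility):
    """Run one synchronous MGM round and return the new assignment.

    values:    dict agent -> current value
    domains:   dict agent -> list of candidate values
    neighbors: dict agent -> list of neighboring agents
    utility:   hypothetical callback utility(a, va, b, vb) -> reward on edge (a, b)
    """
    def local_utility(agent, value):
        # Sum of rewards on the agent's constraints, neighbors held fixed.
        return sum(utility(agent, value, n, values[n]) for n in neighbors[agent])

    # Assumed tie-break: earlier agents in sorted order win ties.
    rank = {agent: i for i, agent in enumerate(sorted(values))}

    gains, best = {}, {}
    for a in values:
        best[a] = max(domains[a], key=lambda v: local_utility(a, v))
        gains[a] = local_utility(a, best[a]) - local_utility(a, values[a])

    new_values = dict(values)
    for a in values:
        if gains[a] <= 0:
            continue  # nothing to gain by moving
        # Act only if our (gain, tie-break) beats every neighbor's message.
        if all((gains[a], -rank[a]) > (gains[n], -rank[n]) for n in neighbors[a]):
            new_values[a] = best[a]
    return new_values


# Usage: a three-agent chain where a constraint pays 1 when the two values differ.
differ = lambda a, va, b, vb: 1 if va != vb else 0
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
doms = {agent: [0, 1] for agent in chain}
step1 = mgm_round({"a": 0, "b": 0, "c": 0}, doms, chain, differ)
```

Because only the agent with the locally largest gain moves, no two neighbors change at once, which is why the total reward never decreases from one round to the next.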

3 Collaborators Manish Jain Prateek Tandon

4 Sample coordination results
Figure panels: Full Graph, Chain Graph

5 Regular Graphs

6 Team Coordination Penalty
Increased coordination hurts in some graphs (low density) and for some algorithms (SE-Optimistic & BE-Rebid).
Intuition: with few neighbors, agents are less selective; with many neighbors, an agent won’t move unless the gain is high.

7 Average # Constraints Changed
k=1 and k=2 increase over time

8 Average Reward Improvement
k=2 improves relative to k=1

9 Low density: k=1 is better. At higher density, more constraints are changed for k=2, so k=2 is relatively better.

10 BE-Rebid-2
Low bids have less gain. Low density does even worse.

11 SE-Optimistic-2
Low bid -> low gain.
Lower-density graphs have lower bids (higher chance of a mistake).

12 Sanity Check

13 In Contrast
Fewer constraints changed, but higher improvement.
Little change in # constraints between k=1 and k=2.

14 Summary of Team Uncertainty Penalty
Relative performance of k=1 and k=2 changes with density: higher density favors k=2.
Few neighbors curtails k=2 performance.
Low bids in low-density graphs hurt.
Low density: wider range, and can have larger mistakes.
In contrast, Mean and Stay are conservative: fewer constraints changed, worse overall.

15 Solutions
Discourage joint actions: SE-Threshold-2 & BE-Threshold-2. Only form a pair if bid > (t × #constraints); unless the bid is high, “play it safe” with k=1.
Discount all bids: SE-i-2 generalizes SE-Optimistic and SE-Mean; in BE-i-2 the explore utility is discounted (bias towards stay / backtrack).
Both have an extra parameter to tune (t or i).
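The threshold rule for pairing can be written as a one-line test. The sketch below is only an illustration of the inequality on the slide: the function names are hypothetical, and it assumes that "#constraints" means the number of constraints the candidate pair is involved in, which the slide does not spell out.

```python
def should_form_pair(bid, num_constraints, t):
    """Threshold test from the slide: pair only if bid > t * #constraints.

    num_constraints is assumed to be the number of constraints touched by
    the candidate pair; t is the tunable threshold parameter.
    """
    return bid > t * num_constraints


def coordination_level(bid, num_constraints, t):
    """Return 2 for a joint (paired) move, otherwise fall back to k=1."""
    return 2 if should_form_pair(bid, num_constraints, t) else 1


# Usage: with t = 5 and 4 constraints, a bid must exceed 20 to justify pairing.
low_bid_k = coordination_level(bid=12, num_constraints=4, t=5)
high_bid_k = coordination_level(bid=25, num_constraints=4, t=5)
```

Scaling the threshold by the number of constraints makes the test stricter for pairs that can disturb more of the graph, which matches the "play it safe" intent.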

16 Improved SE Algorithms

17 Improved BE Algorithms

18 (Selected) Open Questions
Different amounts of coordination: often increasing coordination helps, but it can hurt due to uncertainty in the environment.
Many open theoretical questions (current work with Scot Alfeld): How close are we to optimal? Can we predict how well an algorithm will perform? Is multi-armed bandit to MDP as DCEE is to MMDP?
TODO: backup slides on this!

19 Exploration vs. Exploitation
Multi-armed bandit: how to choose? ε-greedy; confidence intervals.
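The two arm-selection strategies named on the slide can be sketched as follows. The slide gives no formulas, so this is an assumption-laden illustration: the function names are hypothetical, and UCB1 is used here as one standard example of a confidence-interval rule, not necessarily the one the presenters had in mind.

```python
import math
import random


def epsilon_greedy(means, epsilon, rng):
    """With probability epsilon explore a random arm, else exploit the best mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return max(range(len(means)), key=lambda i: means[i])


def ucb1(means, counts, total_pulls):
    """Pick the arm with the highest upper confidence bound (UCB1 rule)."""
    for arm, count in enumerate(counts):
        if count == 0:
            return arm  # try every arm once before trusting the bounds
    return max(
        range(len(means)),
        key=lambda i: means[i] + math.sqrt(2 * math.log(total_pulls) / counts[i]),
    )


def update(means, counts, arm, reward):
    """Incrementally update the running mean reward of the pulled arm."""
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]


# Usage: with epsilon = 0 the choice is purely greedy.
greedy_arm = epsilon_greedy([0.1, 0.9, 0.5], epsilon=0.0, rng=random.Random(0))
# A rarely pulled arm has a wide interval, so UCB1 prefers it despite a lower mean.
optimistic_arm = ucb1([0.5, 0.4], counts=[100, 1], total_pulls=101)
```

ε-greedy explores at a fixed rate regardless of what has been learned, while the confidence-interval rule directs exploration towards arms whose estimates are still uncertain.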

20 Possible Class Projects
New algorithms.
General k implementation.
Enhance simulator: not just small-scale fading.
Incorporate prior knowledge.
How to set parameter i in SE-i-2 and BE-i-2.
Learn to change graph topologies.
Different objectives: maximize the minimum threshold; get all constraints above a threshold.

21 (Quick) Simulator Demo

22
1-3: Get neighbors’ info
4-11: Make pair
12-14: No pair
15-26: Can I/we move?
27-30: Move, if able

