Distributed Algorithms for DCOP: A Graphical-Game-Based Approach

In MGM, each agent broadcasts to all its neighbors a gain message representing the maximum change in its local utility if it were allowed to act under the current context. An agent is then allowed to act if its gain is larger than every gain message it receives from its neighbors (ties can be broken through variable ordering or another method). So what?
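The MGM selection rule described above can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: the function name `mgm_movers`, the dictionary-based graph, the alphabetical tie-break (standing in for variable ordering), and the requirement that a mover's gain be positive are all assumptions made for the example.

```python
from typing import Dict, List


def mgm_movers(gains: Dict[str, float], neighbors: Dict[str, List[str]]) -> List[str]:
    """Return the agents allowed to act in one round of the gain-message rule.

    gains[a]     : gain that agent a broadcast this round (hypothetical input).
    neighbors[a] : agents that share a constraint with a.
    """

    def beats(a: str, b: str) -> bool:
        # Larger gain wins; on a tie, the alphabetically smaller name wins.
        # The tie-break is an assumed stand-in for "variable ordering".
        if gains[a] != gains[b]:
            return gains[a] > gains[b]
        return a < b

    movers = []
    for agent in gains:
        # Assumption: an agent with no positive gain has no reason to move.
        if gains[agent] <= 0:
            continue
        if all(beats(agent, other) for other in neighbors[agent]):
            movers.append(agent)
    return movers


# Chain graph a - b - c - d
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(mgm_movers({"a": 3.0, "b": 1.0, "c": 4.0, "d": 0.0}, chain))
```

Because an agent acts only when it beats every neighbor, no two neighboring agents ever move in the same round.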

Collaborators: Manish Jain, Prateek Tandon

Sample coordination results
Full Graph; Chain Graph

Regular Graphs

Team Coordination Penalty
Increased coordination hurts:
In some graphs (low density)
For some algorithms (SE-Optimistic & BE-Rebid)
Intuition:
If there are few neighbors, agents are less selective
With many neighbors, agents won’t move unless the gain is high

Average # Constraints Changed
k=1 and k=2 increase over time

Average Reward Improvement
k=2 improves relative to k=1

Low density: k=1 is better
At higher density, more constraints are changed for k=2: k=2 does relatively better

BE-Rebid-2: low bids have less gain; low density does even worse

SE-Optimistic-2: low bid -> low gain
Lower-density graphs have lower bids (higher chance of mistake)

Sanity Check

In contrast: fewer constraints changed, but higher improvement
Little change in # constraints between k=1 and k=2

Summary of Team Uncertainty Penalty
Relative performance of k=1 and k=2 changes with density: k=2 is favored as density increases
Few neighbors curtail k=2 performance
Low bids in low-density graphs hurt
Low density: wider range, and can have larger mistakes
In contrast, Mean and Stay are conservative: fewer constraints changed, worse overall

Solutions: discourage joint actions; discount all bids
Discourage joint actions: SE-Threshold-2 & BE-Threshold-2
  Only form a pair if bid > (t × #constraints)
  Unless the bid is high, “play it safe” with k=1
Discount all bids:
  SE-i-2: generalizes SE-Optimistic and SE-Mean
  BE-i-2: explore utility is discounted (bias towards stay / backtrack)
Both have an extra parameter to tune (t or i)
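The threshold test stated above (form a pair only if bid > t × #constraints, otherwise fall back to k=1) can be written as a one-line decision. The sketch below is illustrative only: the function name `choose_k` and its argument names are hypothetical, and exactly which constraints are counted in `num_constraints` is not specified on the slide, so the caller is assumed to supply that count.

```python
def choose_k(pair_bid: float, t: float, num_constraints: int) -> int:
    """Return 2 if the joint (pair) move passes the threshold test, else 1.

    pair_bid        : bid for the joint move (hypothetical input).
    t               : tunable threshold parameter from the slide.
    num_constraints : constraint count used to scale the threshold (assumed
                      to be provided by the caller).
    """
    # Strict inequality, as on the slide: bid > t * #constraints.
    return 2 if pair_bid > t * num_constraints else 1


print(choose_k(pair_bid=10.0, t=2.0, num_constraints=4))  # 10 > 8, so pair up
print(choose_k(pair_bid=8.0, t=2.0, num_constraints=4))   # 8 is not > 8, so k=1
```

Raising `t` makes pairing rarer, which is the "play it safe" behavior the slide describes; the cost is one more parameter to tune.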

Improved SE Algorithms

Improved BE Algorithms

(Selected) Open Questions
Different amounts of coordination
  Often increasing coordination helps
  But it can hurt due to uncertainty in the environment
Many open theoretical questions (current work with Scot Alfeld)
  How close are we to optimal?
  Can we predict how well an algorithm will perform?
  Multi-armed bandit is to MDP as DCEE is to MMDP?
TODO: backup slides on this!

Exploration vs. Exploitation
Multi-armed Bandit: how to choose?
ε-greedy
Confidence intervals
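As a reminder of what ε-greedy selection looks like for a multi-armed bandit, here is a minimal sketch. It is the textbook rule, not code from the simulator: the function name `epsilon_greedy`, the list of per-arm reward estimates, and the optional `rng` argument are all assumptions made for the example.

```python
import random
from typing import List, Optional


def epsilon_greedy(estimates: List[float], epsilon: float,
                   rng: Optional[random.Random] = None) -> int:
    """Pick an arm index: explore with probability epsilon, else exploit.

    estimates : current estimated reward of each arm (hypothetical input).
    epsilon   : exploration probability in [0, 1].
    rng       : optional random generator, so runs can be made repeatable.
    """
    if rng is None:
        rng = random.Random()
    if rng.random() < epsilon:
        # Explore: choose any arm uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: choose the arm with the highest current estimate.
    return max(range(len(estimates)), key=lambda i: estimates[i])


arm = epsilon_greedy([0.1, 0.9, 0.5], epsilon=0.1, rng=random.Random(0))
print(arm)
```

With `epsilon=0` the rule is purely greedy, and with `epsilon=1` it is purely random; values in between trade exploration against exploitation.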

Possible Class Projects
New algorithms
General k implementation
Enhance simulator: not just small-scale fading
Incorporate prior knowledge
How to set parameter i in SE-i-2 and BE-i-2
Learn to change graph topologies
Different objectives
  Maximize the minimum threshold
  Get all constraints above a threshold

(Quick) Simulator Demo

1-3: Get neighbors’ info
4-11: Make pair
12-14: No pair
15-26: Can I/we move?
27-30: Move, if able
