Meta-Level Control in Multi-Agent Systems Anita Raja and Victor Lesser Department of Computer Science University of Massachusetts Amherst, MA 01002
2 Bounded Rationality
“A theory of rationality that does not give an account of problem-solving in the face of complexity is sadly incomplete. It is worse than incomplete; it can be seriously misleading by providing solutions that are without operational significance.” (Herb Simon, 1958)
Basic insight: computations are actions with costs
3 Motivation
Control actions like scheduling and coordination can be expensive
Current multi-agent systems do not explicitly reason about these costs
Costs must be accounted for at all levels of reasoning to produce accurate solutions
Goal: build a low-overhead meta-level control framework that reasons about the cost of different control actions
4 Assumptions
An agent can pursue multiple tasks simultaneously
An agent can partially fulfill or omit tasks
An agent can coordinate with other agents to complete tasks
Tasks have varying arrival times, deadlines, and associated utilities
Tasks have alternative ways of being achieved
Objective function: maximize utility over a fixed time horizon
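The assumptions above suggest a simple task model. The sketch below is a minimal illustration, not the authors' implementation: the `Task` fields and the proportional-credit treatment of partially fulfilled tasks are our own assumptions.

```python
from dataclasses import dataclass

# Hypothetical task model matching the assumptions above: tasks arrive
# over time, carry deadlines and utilities, and may be partially fulfilled.
@dataclass
class Task:
    name: str
    arrival: float           # arrival time
    deadline: float          # absolute deadline
    utility: float           # utility if completed by the deadline
    completion: float = 0.0  # fraction completed (partial fulfillment allowed)

def total_utility(tasks, horizon):
    """Objective: accumulated utility over a fixed time horizon.
    Partial completions contribute proportionally (one simple choice)."""
    return sum(t.utility * t.completion
               for t in tasks if t.arrival <= t.deadline <= horizon)

tasks = [Task("T1", arrival=0, deadline=5, utility=10, completion=1.0),
         Task("T2", arrival=2, deadline=8, utility=6, completion=0.5)]
print(total_utility(tasks, horizon=10))  # 10 + 3 = 13.0
```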
6 Meta-level Decision Taxonomy
Whether to accept, delay, or reject an incoming new task
How much effort to put into reasoning about a new task
Whether to negotiate with another agent about task transfer
Whether to renegotiate if a previous negotiation fails
Whether to re-evaluate the current plan when a task completes
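One way to make the taxonomy concrete is to enumerate the decision types a meta-level controller chooses among. The names below are our own hypothetical encoding, not terminology from the slides.

```python
from enum import Enum, auto

# Hypothetical encoding of the meta-level decision taxonomy above:
# each meta-level event maps to one of these decision types.
class MetaDecision(Enum):
    ACCEPT_TASK = auto()               # accept an incoming new task
    DELAY_TASK = auto()                # defer the decision on a new task
    REJECT_TASK = auto()               # reject a new task outright
    CHOOSE_REASONING_EFFORT = auto()   # how hard to reason about a task
    NEGOTIATE_TRANSFER = auto()        # open negotiation over task transfer
    RENEGOTIATE_AFTER_FAILURE = auto() # retry a failed negotiation
    REEVALUATE_PLAN = auto()           # revisit the plan on task completion

print(len(MetaDecision))  # 7
```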
8 Some State Features
Name  Description                                                  Value         Complexity
F0    Relative utility of new task                                 High/Med/Low  Simple
F1    Relative deadline of new task                                              Simple
F2    Relative utility of current schedule                                       Simple
F8    Relation of slack fragments to current schedule                            Complex
F9    Relation of other agent's slack fragments to non-local task  High/Med/Low  Complex
9 Some Heuristic Decisions
If the current schedule has low priority (low expected quality) and the incoming task has high priority (high expected quality with a tight deadline), then drop the current schedule and schedule the new task immediately.
If the current schedule has very high priority and the new task has low expected utility and a tight deadline, then drop the new task.
If the task to be scheduled has high execution uncertainty and a deadline that is not tight, then introduce high slack into the schedule and use medium scheduling effort.
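The heuristics above can be sketched as a simple rule cascade. This is an illustrative simplification under our own assumptions: priorities are collapsed to string labels, the three rules are folded into one function, and the fallback action is invented for completeness.

```python
# A minimal sketch of the heuristic rules above; labels and the
# fallback branch are assumptions, not the authors' exact encoding.
def meta_decision(current_priority, new_priority,
                  new_deadline_tight, high_exec_uncertainty=False):
    # Rule 1: low-priority schedule, high-priority urgent arrival
    if current_priority == "low" and new_priority == "high" and new_deadline_tight:
        return "drop current schedule; schedule new task immediately"
    # Rule 2: very-high-priority schedule, low-utility urgent arrival
    if current_priority == "high" and new_priority == "low" and new_deadline_tight:
        return "drop new task"
    # Rule 3: uncertain execution, loose deadline
    if high_exec_uncertainty and not new_deadline_tight:
        return "schedule with high slack, medium scheduling effort"
    # Otherwise, hand off to the detailed scheduler (hypothetical fallback)
    return "defer to detailed scheduler"

print(meta_decision("low", "high", True))
```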
10 Related Work
Monitoring Progress of Anytime Algorithms (Hansen & Zilberstein)
–Uses dynamic programming to compute a non-myopic stopping rule
Predictability versus Responsiveness (Durfee & Lesser)
–Controls the amount of coordination using a user-specified buffer
Meta-level Control of Coordination Protocols (Kuwabara)
–Detects and handles exceptions by switching between protocols
–Does not account for the overhead of the reasoning process
11 Evaluation
Compare a system using hand-generated MLC heuristics to:
–A naïve multi-agent system with no explicit MLC
–Deterministic-choice MLC
–Random-choice MLC
–MLC with knowledge of environment characteristics, including the arrival model
Environments are characterized by the following parameters:
–Type of tasks: Simple (S), Complex (C), Combination (A)
–Frequency of arrivals: High (H), Medium (M), Low (L)
–Deadline tightness: High (H), Medium (M), Low (L)
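The three three-valued parameters imply a grid of 27 environment configurations. A small sketch, assuming the obvious cross-product (the three-letter codes are our own shorthand):

```python
import itertools

# The 3x3x3 environment grid implied by the evaluation parameters above.
task_types = ["S", "C", "A"]         # Simple, Complex, Combination
arrival_freq = ["H", "M", "L"]       # High, Medium, Low
deadline_tightness = ["H", "M", "L"] # High, Medium, Low

environments = ["".join(cfg) for cfg in
                itertools.product(task_types, arrival_freq, deadline_tightness)]
print(len(environments))  # 27 distinct environment configurations
print(environments[:3])   # ['SHH', 'SHM', 'SHL']
```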
15 Contributions
Meta-level control in a complex environment
An agent architecture that reasons about overhead at all levels of the decision process
A parametric control algorithm that reasons about effort and slack
State features identified for control using reinforcement learning
16 Future Work
Implement a reinforcement-learning-based control algorithm
–Function approximation (Sarsa(λ) with linear tile coding)
–MDP states will be abstractions of the actual system state
–Study the effectiveness of the RL algorithm on a complex domain
Compare the performance of the heuristic approach to the RL approach
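To illustrate the proposed approach, here is a minimal linear Sarsa(λ) sketch. It is not the authors' system: the toy chain MDP, the one-active-feature coder (a degenerate single-tiling stand-in for real tile coding), and all hyperparameters are our own assumptions.

```python
import numpy as np

# Minimal linear Sarsa(lambda) on a toy 5-state chain: action 1 moves
# right, action 0 stays; reaching the rightmost state pays reward 1.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, LAMBDA, EPS = 0.1, 0.9, 0.8, 0.1

def features(s, a):
    """One active feature per (state, action) pair: a single 'tiling'."""
    x = np.zeros(N_STATES * N_ACTIONS)
    x[s * N_ACTIONS + a] = 1.0
    return x

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else s
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

rng = np.random.default_rng(0)
w = np.zeros(N_STATES * N_ACTIONS)

def greedy_eps(s):
    if rng.random() < EPS:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax([w @ features(s, a) for a in range(N_ACTIONS)]))

for _ in range(200):                 # episodes
    s = 0
    a = greedy_eps(s)
    z = np.zeros_like(w)             # eligibility traces
    for _ in range(50):              # cap episode length
        s2, r, done = step(s, a)
        a2 = greedy_eps(s2)
        target = r if done else r + GAMMA * (w @ features(s2, a2))
        delta = target - w @ features(s, a)
        z = GAMMA * LAMBDA * z + features(s, a)
        w = w + ALPHA * delta * z
        if done:
            break
        s, a = s2, a2

# After learning, moving right should have higher value in the start state
print(w @ features(0, 1) > w @ features(0, 0))
```

In the proposed system, the chain states would be replaced by abstractions of the actual system state, and a proper multi-tiling tile coder would replace the one-hot features.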
17 Research Questions
What are the major obstacles to efficient meta-level control?
How can costs be accurately included at all levels of reasoning?
How do we deal with the huge, complex state space?
Is reinforcement learning a feasible approach to learning good meta-level control policies?