Meta-Level Control in Multi-Agent Systems
Anita Raja and Victor Lesser
Department of Computer Science, University of Massachusetts, Amherst, MA 01002

2 Bounded Rationality
"A theory of rationality that does not give an account of problem-solving in the face of complexity is sadly incomplete. It is worse than incomplete; it can be seriously misleading by providing solutions that are without operational significance." – Herb Simon, 1958
Basic insight: computations are actions with costs.

3 Motivation
– Control actions like scheduling and coordination can be expensive
– Current multi-agent systems do not explicitly reason about these costs
– Costs must be accounted for at all levels of reasoning to produce accurate solutions
– Goal: build a meta-level control framework that itself has minimal cost and reasons about the cost of different control actions

4 Assumptions
– An agent can pursue multiple tasks simultaneously
– An agent can partially fulfill or omit tasks
– An agent can coordinate with other agents to complete tasks
– Tasks have varying arrival times, deadlines, and associated utilities
– Tasks have alternative ways of being achieved
– Objective function: maximize utility over a fixed time horizon (a rough formalization is sketched below)
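One plausible formalization of this objective (the symbols U_i, d_i, t_i, and H are illustrative assumptions, not notation from the slides): maximize the expected total utility of tasks that finish by their deadlines within the horizon H.

```latex
\max \;\; \mathbb{E}\left[ \sum_{i \,:\, t_i^{\mathrm{finish}} \le \min(d_i,\, H)} U_i \right]
```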

5 Agent Architecture

6 Meta-Level Decision Taxonomy
– Whether to accept, delay, or reject an incoming new task
– How much effort to put into reasoning about a new task
– Whether to negotiate with another agent about transferring a task
– Whether to renegotiate after a previous negotiation has failed
– Whether to re-evaluate the current plan when a task completes
(One possible encoding of these choice points is sketched below.)
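Read one way, the taxonomy is a small set of discrete choice points faced by the meta-level controller. A minimal sketch, with hypothetical names that are not from the paper:

```python
from enum import Enum, auto

# Hypothetical encoding of the decision taxonomy; names are illustrative only.

class NewTaskDecision(Enum):
    ACCEPT = auto()        # schedule the incoming task now
    DELAY = auto()         # defer detailed reasoning to a later point
    REJECT = auto()        # drop the task outright

class ReasoningEffort(Enum):
    LOW = auto()           # cheap, approximate scheduling
    MEDIUM = auto()
    HIGH = auto()          # detailed scheduling, higher control cost

class NegotiationDecision(Enum):
    NEGOTIATE = auto()     # attempt to transfer the task to another agent
    RENEGOTIATE = auto()   # retry after a failed negotiation
    DECLINE = auto()

class PlanReevaluation(Enum):
    REEVALUATE = auto()    # rebuild the current plan when a task completes
    KEEP_CURRENT = auto()
```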

7 Decision Tree for the New Task Arrival Event

8 Some State Features

Name | Description                                                  | Value        | Complexity
F0   | Relative utility of new task                                 | High/Med/Low | Simple
F1   | Relative deadline of new task                                | –            | Simple
F2   | Relative utility of current schedule                         | –            | Simple
F8   | Relation of slack fragments to current schedule              | –            | Complex
F9   | Relation of other agent's slack fragments to non-local task  | High/Med/Low | Complex

(One possible representation of these features is sketched below.)
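A minimal sketch of how a meta-level state built from these features might be represented; the field names and the uniform High/Med/Low discretization are assumptions based on the table, not the authors' exact state encoding:

```python
from dataclasses import dataclass
from enum import Enum

class Level(Enum):
    HIGH = "high"
    MED = "med"
    LOW = "low"

@dataclass
class MetaLevelState:
    """Abstract state seen by the meta-level controller (illustrative only)."""
    f0_new_task_utility: Level        # relative utility of the new task (simple feature)
    f1_new_task_deadline: Level       # relative deadline of the new task (simple feature)
    f2_schedule_utility: Level        # relative utility of the current schedule (simple feature)
    f8_local_slack_relation: Level    # relation of slack fragments to current schedule (complex feature)
    f9_remote_slack_relation: Level   # relation of other agents' slack to the non-local task (complex feature)
```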

9 Some Heuristic Decisions
– If the current schedule has low priority (expected quality is low) and the incoming task has high priority (high expected quality with a tight deadline), drop the current schedule and schedule the new task immediately.
– If the current schedule has very high priority and the new task has low expected utility and a tight deadline, drop the new task.
– If the task to be scheduled has high execution uncertainty and its deadline is not tight, introduce high slack into the schedule and use medium scheduling effort.
(A rough encoding of these rules is sketched below.)
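The rules above could be encoded roughly as follows; the argument names, value labels, and returned action strings are all illustrative, not the authors' implementation:

```python
def decide_on_new_task(schedule_priority, task_priority,
                       task_deadline_tight, task_exec_uncertain):
    """Illustrative encoding of the three heuristic rules on this slide.

    schedule_priority / task_priority are coarse labels such as "low",
    "high", or "very_high"; the remaining arguments are booleans.
    """
    # Rule 1: low-priority schedule, high-priority incoming task.
    if schedule_priority == "low" and task_priority == "high":
        return "drop_schedule_and_schedule_new_task_now"
    # Rule 2: very-high-priority schedule, low-utility task with a tight deadline.
    if schedule_priority == "very_high" and task_priority == "low" and task_deadline_tight:
        return "drop_new_task"
    # Rule 3: uncertain execution, deadline not tight.
    if task_exec_uncertain and not task_deadline_tight:
        return "schedule_with_high_slack_and_medium_effort"
    return "use_default_scheduling_policy"
```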

10 Related Work
– Monitoring progress of anytime algorithms (Hansen & Zilberstein): uses dynamic programming to compute a non-myopic stopping rule
– Predictability versus responsiveness (Durfee & Lesser): controls the amount of coordination using a user-specified buffer
– Meta-level control of coordination protocols (Kuwabara): detects and handles exceptions by switching between protocols; does not account for the overhead of the reasoning process

11 Evaluation
Compare a system using hand-generated MLC heuristics to:
– A naïve multi-agent system with no explicit MLC
– A deterministic-choice MLC
– A random-choice MLC
– An MLC with knowledge of environment characteristics, including the arrival model
Environments are characterized by the following parameters (all 27 combinations are enumerated in the sketch below):
– Type of tasks: Simple (S), Complex (C), Combination (A)
– Frequency of arrivals: High (H), Medium (M), Low (L)
– Deadline tightness: High (H), Medium (M), Low (L)
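The environment space is the cross product of these three parameters, giving 27 configurations. A quick sketch of enumerating them (labels taken from the slide; variable names are illustrative):

```python
from itertools import product

task_types = ["S", "C", "A"]         # Simple, Complex, Combination
arrival_freqs = ["H", "M", "L"]      # High, Medium, Low
deadline_tightness = ["H", "M", "L"]

# 3 x 3 x 3 = 27 environment configurations, e.g. ("C", "H", "M")
environments = list(product(task_types, arrival_freqs, deadline_tightness))
print(len(environments))  # 27
```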

12 An Example

13 Evaluation, Continued

14 Evaluation, Continued

15 Contributions
– Meta-level control in a complex environment
– Designed an agent architecture that reasons about overhead at all levels of the decision process
– Parametric control algorithm that reasons about effort and slack
– Identified state features for control using reinforcement learning

16 Future Work
– Implement a reinforcement-learning-based control algorithm (see the sketch below)
  – Function approximation: Sarsa(λ) with linear tile coding
  – MDP states will be abstractions of the actual system state
  – Study the effectiveness of the RL algorithm on a complex domain
– Compare the performance of the heuristic approach to the RL approach
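A minimal sketch of a Sarsa(λ) update with binary tile-coded features and linear function approximation, as named on the slide. The tile coder itself, the feature-vector size, and the hyperparameters are placeholders, not the authors' settings:

```python
import numpy as np

N_FEATURES = 4096          # size of the tile-coded feature space (assumption)
ALPHA = 0.1                # step size
GAMMA = 0.95               # discount factor
LAMBDA = 0.9               # eligibility-trace decay

w = np.zeros(N_FEATURES)   # one weight per (state, action) tile
z = np.zeros(N_FEATURES)   # eligibility traces

def q_value(active_tiles):
    """Q(s, a) is the sum of weights of the tiles active for that state-action pair."""
    return w[active_tiles].sum()

def sarsa_lambda_step(active_tiles, reward, next_active_tiles, done):
    """One on-policy TD update with replacing traces."""
    global w, z
    delta = reward - q_value(active_tiles)
    if not done:
        delta += GAMMA * q_value(next_active_tiles)
    z *= GAMMA * LAMBDA
    z[active_tiles] = 1.0              # replacing traces for the active tiles
    w += ALPHA * delta * z
    if done:
        z[:] = 0.0                     # clear traces at episode boundaries
```

With binary tile features, the gradient of Q with respect to the weights is simply the indicator vector of the active tiles, which is why the update only touches those entries of z.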

17 Research Questions
– What are the major obstacles to efficient meta-level control?
– How can costs be accurately included at all levels of reasoning?
– How can the huge, complex state space be handled?
– Is reinforcement learning a feasible approach to learning good meta-level control policies?