© 2007 SRI International 1 Dan’s Multi-Option Talk
Option 1: HUMIDRIDE: Dan’s Trip to the East Coast – Whining: High, Duration: Med, Viruses: Low
Option 2: T-Cell: Attacking Dan’s Cold Virus – Whining: Med, Duration: Low, Viruses: High
Option 3: Model-Lite Planning: Diverse Multi-Option Plans and Dynamic Objectives – Whining: Low, Duration: High, Viruses: Low
© 2007 SRI International Model-Lite Planning: Diverse Multi-Option Plans and Dynamic Objectives
Daniel Bryce, William Cushing, Subbarao Kambhampati
© 2007 SRI International 3 Questions
When must the plan executor decide on their planning objective?
– Before synthesis? The traditional model
– Before execution? Similar to the IR model: select a plan from a set of diverse, but relevant plans
– During execution? Multi-Option Plans (subsumes the previous)
– At all? “Keep your options open”
Can the executor change their planning objective without replanning?
Can the executor start acting without committing to an objective?
© 2007 SRI International 4 Overview Diverse Multi-Option Plans –Diversity –Representation –Connection to Conditional Plans –Execution Synthesizing Multi-Option Plans –Example –Speed-ups Analysis –Synthesis –Execution Conclusion
© 2007 SRI International 5 Diverse Multi-Option Plans
Each plan step presents several diverse choices:
– Option 1: Train(MP, SFO), Fly(SFO, BOS), Car(BOS, Prov.)
– Option 1a: Train(MP, SFO), Fly(SFO, BOS), Fly(BOS, PVD), Cab(PVD, Prov.)
– Option 2: Shuttle(MP, SFO), Fly(SFO, BOS), Car(BOS, Prov.)
– Option 2a: Shuttle(MP, SFO), Fly(SFO, BOS), Fly(BOS, PVD), Cab(PVD, Prov.)
Diversity relies on Pareto optimality:
– Each option is non-dominated
– Diversity through a Pareto front with high spread
[Figure: plan graph of the four options and their Pareto front over Duration and Cost]
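The non-domination requirement above can be sketched as a small filter. This is a minimal sketch, not the paper’s implementation; the (duration, cost) values attached to the four travel options are illustrative assumptions.

```python
def dominates(a, b):
    """True if option a is at least as good as b on every objective
    (lower is better here) and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(options):
    """Keep only the non-dominated options."""
    return {name: v for name, v in options.items()
            if not any(dominates(w, v)
                       for other, w in options.items() if other != name)}

# Hypothetical (duration, cost) values for the four travel options:
# each trades duration against cost, so all four are non-dominated.
options = {
    "O1":  (9.0, 350.0),   # Train, Fly, Car
    "O1a": (7.5, 420.0),   # Train, Fly, Fly, Cab
    "O2":  (8.0, 380.0),   # Shuttle, Fly, Car
    "O2a": (6.5, 450.0),   # Shuttle, Fly, Fly, Cab
}

front = pareto_front(options)
```

With these numbers every option survives the filter; adding, say, a slower and more expensive option would see it dropped.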
© 2007 SRI International 6 Dynamic Objectives
Multi-Option Plans are a type of Conditional Plan:
– Conditional on the user’s objective function
– Allow the objective function to change
– Ensure that, irrespective of their objective function, the executor always has non-dominated options
[Figure: plan graph of options O1, O1a, O2, O2a over the Train/Shuttle, Fly, Car/Cab actions]
© 2007 SRI International 7 Executing Multi-Option Plans
[Figure: plan graph with per-step Pareto fronts over Duration and Cost]
– A local action choice corresponds to multiple options
– Option values change at each step
© 2007 SRI International 8 Multi-Option Conditional Probabilistic Planning
(PO)MDP setting: (belief) state space search
– Stochastic actions, observations, uncertain initial state, loops
– Two objectives: expected plan cost, probability of plan success
– Traditional reward functions are a linear combination of the above, and assume the objective function is known
Extend LAO* to multiple objectives (Multi-Option LAO*):
– Each generated (belief) state has an associated Pareto set of “best” sub-plans
– Dynamic programming (state backup) combines successor-state Pareto sets
– Yes, it’s exponential time per backup per state ♦ There are approximations
– Basic algorithm: while we don’t have a good plan ♦ ExpandPlan ♦ RevisePlan
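The state backup described above can be sketched as follows. This is a hedged sketch, not the paper’s code: each state carries a Pareto set of (expected cost, Pr(goal)) points, and a backup forms the cross-product of successor Pareto sets through each action before re-filtering, which is why the slide notes exponential time per backup. The action and successor structure in the example is assumed.

```python
from itertools import product

def dominates(a, b):
    """a, b are (cost, pr_goal): lower cost and higher Pr(goal) are better."""
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])

def filter_nondominated(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

def backup(actions):
    """actions: list of (action_cost, [(prob, successor_pareto_set), ...]).
    Returns the recomputed Pareto set for the state.  The cross-product
    over successor sets picks one sub-plan per successor branch."""
    candidates = []
    for cost_a, successors in actions:
        probs = [p for p, _ in successors]
        sets = [s for _, s in successors]
        for choice in product(*sets):  # one (cost, pr) point per successor
            exp_cost = cost_a + sum(p * c for p, (c, _) in zip(probs, choice))
            pr_goal = sum(p * g for p, (_, g) in zip(probs, choice))
            candidates.append((exp_cost, pr_goal))
    return filter_nondominated(candidates)
```

For one action of cost 1 with two equally likely successors, where the second successor’s Pareto set offers either a free failing sub-plan or a cost-2 succeeding one, the backup yields the two-point front {(1.0, 0.5), (2.0, 1.0)}.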
© 2007 SRI International 9 Example of State Backup
© 2007 SRI International 10 Search Example – Initially
[Figure: root node with its Pareto set over C and Pr(G)]
Initialize the root Pareto set with the null plan and a heuristic estimate.
© 2007 SRI International 11 Search Example – 1st Expansion
[Figure: root expanded through actions a1 and a2; child Pareto sets over C and Pr(G)]
Expand the root node and initialize the Pareto sets of its children with the null plan and a heuristic estimate.
© 2007 SRI International 12 Search Example – 1st Revision
[Figure: recomputed root Pareto set]
Recompute the Pareto set for the root; the best heuristic point is through a1.
© 2007 SRI International 13 Search Example – 2nd Expansion
[Figure: children of a1 expanded through a3 and a4]
Expand the children of a1 and initialize their Pareto sets with the null plan and a heuristic estimate. Both children satisfy the goal with non-zero probability.
© 2007 SRI International 14 Search Example – 2nd Revision
[Figure: recomputed Pareto sets over C and Pr(G)]
Recompute the Pareto sets of both expanded nodes and the root node. There is a feasible plan a1, [a4|a3] that satisfies the goal with 0.66 probability and cost 2. The heuristic estimate indicates that extending a1, [a4|a3] will lead to a plan that satisfies the goal with 1.0 probability.
© 2007 SRI International 15 Search Example – 3rd Expansion
[Figure: plan expanded through a7]
Expand the plan to include a7. There is no applicable action after a3.
© 2007 SRI International 16 Search Example – 3rd Revision
[Figure: recomputed Pareto sets]
Recompute all Pareto sets that are ancestors of expanded nodes. The heuristic for plans extended through a3 is higher because there is no applicable action. The heuristic at the root node changes to plans extended through a2.
© 2007 SRI International 17 Search Example – 4th Expansion
[Figure: plan expanded through a2, via a5 and a6]
Expand the plan through a2; one expanded child satisfies the goal with 0.1 probability.
© 2007 SRI International 18 Search Example – 4th Revision
[Figure: recomputed Pareto sets]
Recompute the Pareto sets of expanded ancestors. Plan a2, a5 is dominated at the root.
© 2007 SRI International 19 Search Example – 5th Expansion
[Figure: plan expanded through a6 to a8]
Expand the plan through a6.
© 2007 SRI International 20 Search Example – 5th Revision
[Figure: recomputed Pareto sets]
Recompute the Pareto sets. Plans a2, a6, a8 and a2, a5 are dominated at the root.
© 2007 SRI International 21 Search Example – Final
[Figure: final search graph and the surviving Pareto sets over C and Pr(G)]
© 2007 SRI International 22 Speed-ups
– ε-domination [Papadimitriou & Yannakakis, 2003]
– Randomized node expansions: simulate the partial plan to expand a single node
– Reachability heuristics: use the McLUG (CSSAG)
© 2007 SRI International 23 ε-domination
[Figure: objective space over Cost and 1-Pr(G), showing points x and x’ with x’/x = 1+ε, and dominated vs. non-dominated regions]
Multiply each objective by (1+ε), then check domination. Each hyper-rectangle contains a single point.
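The ε-domination check above can be sketched as follows. This is a minimal sketch under the slide’s “multiply each objective by (1+ε), check domination” recipe, with both objectives minimized; the sample points are illustrative, not from the paper.

```python
def eps_dominates(q, p, eps):
    """True if q ε-dominates p: q is within a (1+eps) factor of p on
    every (minimized) objective, i.e. q_i <= (1+eps) * p_i for all i."""
    return all(a <= (1 + eps) * b for a, b in zip(q, p))

def eps_filter(points, eps):
    """Keep a reduced set in which every dropped point is ε-dominated by
    a kept one: roughly one representative per (1+eps) hyper-rectangle."""
    kept = []
    for p in sorted(points):
        if not any(eps_dominates(k, p, eps) for k in kept):
            kept.append(p)
    return kept
```

On a front like [(1.0, 5.0), (1.05, 4.8), (2.0, 3.0)] with eps = 0.1, the middle point is within a factor 1.1 of the first on both objectives and is pruned, shrinking the Pareto sets each backup must combine.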
© 2007 SRI International 24 Synthesis Results
© 2007 SRI International 25 Execution Results
Random Option: sample an option, execute its action
Keep Options Open
– Most Options: execute the action that appears in the most options
– Diverse Options: execute the action in the most diverse set of options
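The first two execution strategies compared above can be sketched as follows. This is a hedged sketch: each option in the current state’s Pareto set is tagged with the action it starts with, and the option names and actions are illustrative. The Diverse Options strategy would additionally score each action by the diversity of the options it keeps open, which is omitted here.

```python
import random
from collections import Counter

def random_option(options, rng=random):
    """Random Option: sample one option, execute its first action."""
    return rng.choice(list(options.values()))

def most_options(options):
    """Most Options: execute the action shared by the most options,
    keeping as many of them open as possible."""
    counts = Counter(options.values())
    return counts.most_common(1)[0][0]

# First action of each current option (illustrative): O1/O1a/O3 start
# with Train, O2/O2a with Shuttle, so Most Options keeps three open.
options = {"O1": "Train", "O1a": "Train", "O2": "Shuttle",
           "O2a": "Shuttle", "O3": "Train"}
```

After executing the chosen action, the surviving options’ Pareto sets are recomputed and the choice repeats at the next step.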
© 2007 SRI International 26 Summary & Future Work Summary –Multi-Option Plans let executor delay/change commitments to objective functions –Multi-Option Plans help executor understand alternatives –Multi-Option Plans passively enforce diversity through Pareto set approximation Future Work –Synthesis Proactive Diversity: Guide search to broaden Pareto set Speedups: Alternative Pareto set representation, standard MDP tricks –Execution Option Lookahead: how will set of options change? Meta-Objectives: Diversity, Decision Delay –Model-Lite Planning Unspecified objectives (not just unspecified objective function) Objective Function preference elicitation
© 2007 SRI International 27 Final Options Option 1: Questions Option 2: Criticisms Option 3: Next Talk!
© 2007 SRI International 28 Overview
Traditional planning assumes the objective is given a priori
– Con: users must know exactly what they want
– Pro: can synthesize on the fly
Information Retrieval (IR) assumes the user’s keywords constrain the objective
– Con: relies on an existing term-frequency index, and more…
– Pro: deals with human imprecision
We want planners to:
– Handle underspecified (model-lite) problems, like IR
– Generate diverse, but relevant plans
– The objective can be one of many underspecified aspects (actions, state, etc.); we care only about the objective here
It’s not so easy:
– Users can change/refine the objective continually, as in IR: ♦ “decision making” ♦ “multi criteria decision making” ♦ “multi criteria decision making tutorial” ♦ “Providence weather”
Solution at a glance:
– If the objective is undefined, compute a Pareto set
– If execution must start, keep options open
– If the objective is known, follow the best-fitting plan in the Pareto set
– If the objective changes, follow the best-fitting plan in the Pareto set
The trick: capture structure in the Pareto set
The case study: conditional probabilistic planning, with Objective = f(Plan Cost, Plan Success)
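The “follow the best-fitting plan in the Pareto set” steps above amount to re-ranking a fixed set of options whenever the objective function changes, with no replanning. This is a minimal sketch; the Pareto points and the two example objective functions are illustrative assumptions.

```python
def best_option(pareto_set, objective):
    """pareto_set: {name: (cost, pr_success)}.  objective scores a point
    (lower is better).  Returns the name of the best-fitting option."""
    return min(pareto_set, key=lambda name: objective(*pareto_set[name]))

# Two non-dominated options: cheap-but-risky vs. costly-but-sure.
pareto_set = {"O1": (2.0, 0.66), "O2": (5.0, 1.0)}

# A frugal user weighs cost heavily ...
frugal = best_option(pareto_set, lambda c, p: c - 1.0 * p)
# ... and can switch mid-execution to caring mostly about success,
# which simply re-selects from the same Pareto set.
cautious = best_option(pareto_set, lambda c, p: c - 10.0 * p)
```

Here the frugal objective selects O1 and the cautious one selects O2, so an objective change during execution is just a new lookup over the options still open.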