Deterministic Techniques for Stochastic Planning No longer the Rodney Dangerfield of Stochastic Planning?


Solving stochastic planning problems via determinizations
– Quite an old idea (e.g., envelope extension methods)
– What is new: an increasing realization that determinizing approaches provide state-of-the-art performance
  – Even for probabilistically interesting domains
– Should be a happy occasion...
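To make the determinization idea concrete, here is a minimal sketch of the "all-outcomes" compilation used by FF-Replan: each probabilistic outcome of an action becomes its own deterministic action, and the resulting classical problem is handed to a deterministic planner. The names (`ProbAction`, `determinize`) and the outcome representation are illustrative, not any planner's actual API.

```python
from dataclasses import dataclass

@dataclass
class ProbAction:
    name: str
    precond: frozenset   # facts required to apply the action
    outcomes: list       # list of (probability, add_set, delete_set)

def determinize(actions):
    """All-outcomes determinization: one deterministic action per outcome.
    Probabilities are dropped; only reachability structure is kept."""
    det = []
    for a in actions:
        for i, (p, adds, dels) in enumerate(a.outcomes):
            det.append((f"{a.name}_o{i}", a.precond, adds, dels))
    return det

# Example: a pick-up action that succeeds with probability 0.8
pickup = ProbAction(
    name="pickup",
    precond=frozenset({"hand-empty", "on-table"}),
    outcomes=[(0.8, {"holding"}, {"hand-empty", "on-table"}),
              (0.2, set(), set())],   # failure outcome: nothing changes
)

det_actions = determinize([pickup])
# Yields two deterministic actions: pickup_o0 (success) and pickup_o1 (failure)
```

Dropping the probabilities is exactly what makes this compilation blind to how likely each branch is, which is the standard criticism of single-determinization replanning in probabilistically interesting domains.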

Ways of using deterministic planning
– To compute conditional branches (Robinson et al.)
– To seed/approximate the value function (ReTrASE, Peng Dai, McLUG/POND, FF-Hop)
– Use a single determinization
  – FF-Replan
  – ReTrASE (uses diverse plans for a single determinization)
– Use sampled determinizations
  – FF-Hop [AAAI 2008; with Yoon et al.]
  – Use relaxed solutions for the sampled determinizations: Peng Dai's paper; McLUG [AIJ 2008; with Bryce et al.]
It would be good to understand the tradeoffs...
Determinization = sampling an evolution of the world
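The sampled-determinization branch can be sketched as hindsight optimization, the idea behind FF-Hop: estimate the value of a state by sampling several deterministic futures (each fixes the outcome of every action in advance) and averaging the cost of solving each resulting deterministic problem. The helpers `sample_future`, `solve_deterministic`, and `apply_action` are hypothetical stand-ins for an outcome sampler, a classical planner, and a transition function; they are not FF-Hop's actual interfaces.

```python
import random

def hindsight_value(state, actions, sample_future, solve_deterministic,
                    num_futures=30, horizon=20, rng=random):
    """Average deterministic plan cost over sampled futures (lower is better)."""
    total = 0.0
    for _ in range(num_futures):
        # A "future" fixes every probabilistic outcome up to the horizon,
        # turning the problem into a deterministic one.
        future = sample_future(state, horizon, rng)
        total += solve_deterministic(state, actions, future)
    return total / num_futures

def choose_action(state, actions, apply_action, **kw):
    """One-step lookahead: pick the action whose successor state has the
    best hindsight estimate."""
    return min(actions,
               key=lambda a: hindsight_value(apply_action(state, a),
                                             actions, **kw))
```

Note the well-known caveat: because each sampled future may be solved by a different plan, the hindsight estimate is optimistic — it lets the "policy" peek at the future, so it upper-bounds the true value rather than equaling it.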

Comparing approaches...
ReTrASE and FF-Hop seem closely related
– ReTrASE uses diverse deterministic plans for a single determinization; FF-Hop computes deterministic plans for sampled determinizations
– Is there any guarantee that syntactic (action) diversity is actually related to likely sampled worlds?
The cost of generating deterministic plans isn't exactly cheap
– Relaxed-reachability-style approaches can compute multiple plans (for samples of the worlds)
– Would relaxing the samples' plans be better or worse in convergence terms...?
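The "relaxed reachability" alternative mentioned above rests on the delete relaxation: ignore delete effects, so the fact set only grows and reachability can be decided by a cheap fixed-point computation instead of a full deterministic plan. A minimal sketch, with actions as illustrative `(name, precondition, add_effects)` triples rather than any planner's representation:

```python
def relaxed_reachable(init, goal, actions):
    """Decide goal reachability under the delete relaxation.
    Facts accumulate monotonically, so the loop reaches a fixed point
    in at most |facts| iterations over the action set."""
    facts = set(init)
    changed = True
    while changed:
        changed = False
        for name, precond, adds in actions:
            if precond <= facts and not adds <= facts:
                facts |= adds
                changed = True
    return goal <= facts
```

This is why relaxed solutions are cheap enough to compute per sampled world: the fixed point is low-order polynomial, whereas each exact deterministic plan may require a full heuristic search.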

Science may never fully explain who killed JFK, but any explanation must pass scientific judgment. Likewise, we may never generate MDP policies fully efficiently, but any approach that does generate them must pass MDP judgment.