Learning Control Knowledge for Planning Yi-Cheng Huang

Outline I.Brief overview of planning II.Planning with Control knowledge III.Learning control knowledge IV.Conclusion

I. Overview of Planning. Planning is a very general framework for many applications: robot control; airline scheduling; Hubble Space Telescope control. Planning: find a sequence of actions that leads from an initial state to a goal state.

Planning Is Difficult – Abundance of Negative Complexity Results Domain-independent planning: PSPACE- complete or worse (Chapman 1987; Bylander 1991; Backstrom 1993). Domain-dependent planning: NP-complete or worse (Chenoweth 1991; Gupta and Nau 1992). Approximate planning: NP-complete or worse (Selman 1994).

Recent State-of-the-art Planners. Constraint-based planners: Graphplan, Blackbox. Heuristic search planners: HSP, FF. Both kinds of planners can solve problems in seconds or minutes that would take traditional planners hours or days.

Graphplan (Blum & Furst, 1995). Build a planning graph of alternating fact and action layers (time i, time i+1, ...), then search on the planning graph to find a plan.
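As a rough illustration of the graph-expansion idea (not Graphplan's actual data structures, and omitting mutex propagation), a minimal sketch in Python:

```python
# Minimal planning-graph expansion sketch: grow alternating fact/action layers
# until the goal facts appear. The STRIPS-style Action class and the example
# action below are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset       # precondition facts
    add: frozenset       # add effects

def expand_graph(init_facts, actions, goal, max_levels=10):
    """Return the first level at which all goal facts appear, or None."""
    facts = set(init_facts)
    for level in range(max_levels):
        if goal <= facts:
            return level                      # goals reachable at this level
        applicable = [a for a in actions if a.pre <= facts]
        new_facts = facts | {f for a in applicable for f in a.add}
        if new_facts == facts:                # graph has leveled off
            return None
        facts = new_facts
    return None

# Example: a single truck move
acts = [Action("drive(A,B)", frozenset({"at(truck,A)"}),
               frozenset({"at(truck,B)"}))]
print(expand_graph({"at(truck,A)"}, acts, {"at(truck,B)"}))  # -> 1
```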

Blackbox (Kautz & Selman, 1999). Translate the planning problem into a satisfiability problem, run a satisfiability tester (Chaff, WalkSat, Satz, RelSat, ...), and decode the satisfying assignment into a plan.
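A toy sketch of the planning-as-satisfiability loop that Blackbox embodies; the encode() stub and the brute-force SAT check below are placeholders for illustration only, not Blackbox's real encoding or solvers:

```python
# Planning as satisfiability, schematically: encode the problem for horizon k
# as CNF, hand it to a SAT procedure, and increase k until a model is found.
from itertools import product

def sat(clauses, num_vars):
    """Brute-force SAT: returns a model as a dict, or None."""
    for bits in product([False, True], repeat=num_vars):
        model = {i + 1: b for i, b in enumerate(bits)}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None

def encode(horizon):
    """Hypothetical encoder: returns (clauses, num_vars) for a k-step plan."""
    # e.g. variable 1 = "action A happens at step 1", etc.
    return [[1], [-1, 2]], 2

for k in range(1, 5):
    clauses, n = encode(k)
    model = sat(clauses, n)
    if model:
        print(f"plan found at horizon {k}: {model}")
        break
```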

Heuristic Search Based Planning (Bonet & Geffner, ‘97) Use various heuristic functions to approximate the distance from the current state to the goal state based on the planning graph. Use Best-First Search or A* search to find plans.
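A minimal sketch of greedy best-first search over STRIPS-like states; the heuristic used here (count of unsatisfied goal facts) is only a crude stand-in for the planning-graph-based heuristics that HSP and FF actually compute:

```python
# Greedy best-first search over states represented as sets of facts.
import heapq

def best_first(init, goal, actions):
    h = lambda s: len(goal - s)                    # unsatisfied goal facts
    frontier = [(h(init), 0, frozenset(init), [])] # (h, tie-break, state, plan)
    seen, tie = {frozenset(init)}, 0
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if goal <= state:
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    tie += 1
                    heapq.heappush(frontier, (h(nxt), tie, nxt, plan + [name]))
    return None

# Illustrative one-truck delivery actions: (name, preconditions, adds, deletes)
acts = [("load",   {"at(p,A)", "at(truck,A)"}, {"in(p,truck)"}, {"at(p,A)"}),
        ("drive",  {"at(truck,A)"},            {"at(truck,B)"}, {"at(truck,A)"}),
        ("unload", {"in(p,truck)", "at(truck,B)"}, {"at(p,B)"}, {"in(p,truck)"})]
print(best_first({"at(p,A)", "at(truck,A)"}, {"at(p,B)"}, acts))
```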

II. Planning With Control. General focus in planning: avoid search as much as possible. Many real-world applications are tailored and simplified by domain-specific knowledge. TLPlan is an efficient planner that uses control knowledge to guide a forward-chaining search (Bacchus & Kabanza 2000).

TLPlan Temporal Logic Control Formula

A Simple Control Rule Example: do NOT move an object that is already at its goal location. As a temporal logic formula (□ = "always", ○ = "next"): □ [ goal(at(obj loc)) ^ at(obj loc) → ○ at(obj loc) ]
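One way such a rule can act during forward search is as a pruning test on successor states; the sketch below hard-codes just this one rule (TLPlan evaluates arbitrary temporal formulas), with facts represented as plain strings:

```python
# Prune any successor state in which an object sitting at its goal location
# has been moved away -- the "always(goal(at) and at -> next at)" rule above.
def violates_rule(state, next_state, goal):
    """True if some object at its goal location is no longer there next."""
    for fact in state & goal:              # at(obj,loc) facts that are goals
        if fact.startswith("at(") and fact not in next_state:
            return True
    return False

state      = {"at(pkg1,BOS)", "at(truck,BOS)"}
goal       = {"at(pkg1,BOS)"}
next_state = {"at(truck,SFO)", "in(pkg1,truck)"}   # a move that picks pkg1 up
print(violates_rule(state, next_state, goal))       # True -> prune successor
```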

Question: can the same level of control be effectively incorporated into a constraint-based planner?

Control Rule Categories. I. Rules that involve only static information. II. Rules that depend on the current state. III. Rules that depend on the current state and require dynamic user-defined predicates.

Category I Control Rules (depend only on the goal; toy example): Do NOT unload a package from an airplane if the current location is not the package's goal location.

Pruning the Planning Graph with Category I Rules: action nodes that violate the rule are removed from the fact/action layers of the planning graph.
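A sketch of what this pruning might look like in code, assuming an illustrative (name, plane, package, airport) action-node representation; since the rule depends only on static goal information, offending nodes can be deleted from every layer before search:

```python
# Drop UnloadAirplane(pln, pkg, apt) nodes whose airport is not pkg's goal.
def prune_unloads(action_layer, goal_airport):
    kept = []
    for act in action_layer:                 # act = (name, pln, pkg, apt)
        name, _, pkg, apt = act
        if name == "UnloadAirplane" and goal_airport.get(pkg) != apt:
            continue                         # pruned by the Category I rule
        kept.append(act)
    return kept

layer = [("UnloadAirplane", "P", "a", "BOS"),
         ("UnloadAirplane", "P", "a", "SFO"),
         ("FlyAirplane",    "P", "BOS", "SFO")]
print(prune_unloads(layer, {"a": "SFO"}))
# keeps the SFO unload and the fly action, drops the BOS unload
```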

Effect of Graph Pruning

Category II Control Rules: Do NOT move an airplane if there is an object in the airplane that needs to be unloaded at that location.

Control by Adding Constraints: temporal logic control rules are compiled into constraint clauses that are added to the planning formula.

Rules Without Compact Encoding: Do NOT move a vehicle unless (a) there is an object that needs to be picked up, or (b) there is an object in the vehicle that needs to be unloaded.

Complex Encoding for Category III Rules. Need to define extra predicates: need_to_move_by_airplane, need_to_unload_by_airplane. These introduce extra literals and clauses: O(mn) ground literals and O(mn + km²) clauses at each time step (m: #cities, n: #objects, k: #airports). There is no easy encoding for Category III rules; however, Category I & II rules appear to do most of the work.
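By contrast, a Category II rule compiles compactly; the sketch below generates the extra clauses added at each time step for the "do NOT fly a plane away from a location if it carries a package whose goal is that location" rule. The variable naming and clause shape are illustrative, not Blackbox's exact encoding:

```python
# Generate constraint clauses (lists of signed literal strings) per time step.
def no_fly_clauses(planes, packages, locations, goal_loc, horizon):
    clauses = []
    for t in range(horizon):
        for p in planes:
            for k in packages:
                l = goal_loc[k]                 # static goal location of k
                for l2 in locations:
                    if l2 == l:
                        continue
                    # in(k,p,t) AND at(p,l,t) -> NOT fly(p,l,l2,t)
                    clauses.append([f"-in({k},{p},{t})",
                                    f"-at({p},{l},{t})",
                                    f"-fly({p},{l},{l2},{t})"])
    return clauses

for c in no_fly_clauses(["P"], ["a"], ["BOS", "SFO"], {"a": "SFO"}, 2):
    print(c)
```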

Blackbox with Control Knowledge (Logistics domain with hand-coded rules) Note: Logarithmic time scale

Comparison of Blackbox and TLPlan (Run Time)

Comparison of Blackbox and TLPlan (parallel plan length; “plan quality”)

Summary: Adding Control Knowledge. We have shown how to add declarative control knowledge to a constraint-based planner by using temporal logic statements. Adding such knowledge gives significant speedups (up to two orders of magnitude). Pure heuristic search with control can still be faster, but with much lower plan quality.

III. Can we learn domain knowledge from example plans?

Motivation. Control rules used in TLPlan and Blackbox are hand-coded. Idea: learn control rules from a sequence of small problems solved by the planner.

Learning System Framework: Problem → Blackbox Planner → Plan Justification / Type Inference → ILP Learning Module / Verification → Control Rules.

Target Concepts for Actions. Action select rule: indicates conditions under which the action can be performed immediately. Action reject rule: indicates conditions under which it must not be performed.

Basic Assumptions of Learning Control. Plans found by the planner on simple problems are optimal or near-optimal. Actions that appear in an optimal plan must be selected. Actions that can be executed but do not appear in the plan must be rejected.

Definitions. Real action: an action that appears in the plan. Virtual action: an action whose preconditions hold but which does not appear in the plan.

A Toy Planning Example. [Figure: initial and goal states over airports BOS, SFO, NYC with packages a and b.]

Real & Virtual Actions for UnloadAirplane.
Time 1: LoadAirplane(P a BOS) [real]
Time 2: FlyAirplane(P SFO NYC) [real]; UnloadAirplane(P a BOS) [virtual]
Time 3: LoadAirplane(P b NYC) [real]; UnloadAirplane(P a NYC) [virtual]
Time 4: FlyAirplane(P NYC SFO) [real]; UnloadAirplane(P a NYC) [virtual]; UnloadAirplane(P b NYC) [virtual]
Time 5: UnloadAirplane(P a SFO) [real]; UnloadAirplane(P b SFO) [real]
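A sketch of how such labeled examples might be extracted automatically, assuming the plan and the applicable actions per time step are given as Python dictionaries (the data below mirrors part of the listing above and is illustrative only):

```python
# Actions in the plan are "real" (negative examples for the reject rule);
# applicable-but-unused actions are "virtual" (positive examples).
def extract_examples(plan, applicable, action_name):
    examples = []   # (label, time, args): '+' = reject, '-' = keep
    for t, acts in applicable.items():
        in_plan = set(plan.get(t, []))
        for act in acts:
            if act[0] != action_name:
                continue
            label = "-" if act in in_plan else "+"
            examples.append((label, t, act[1:]))
    return examples

plan = {1: [("LoadAirplane", "P", "a", "BOS")],
        5: [("UnloadAirplane", "P", "a", "SFO"), ("UnloadAirplane", "P", "b", "SFO")]}
applicable = {2: [("UnloadAirplane", "P", "a", "BOS")],
              3: [("UnloadAirplane", "P", "a", "NYC")],
              5: [("UnloadAirplane", "P", "a", "SFO"), ("UnloadAirplane", "P", "b", "SFO")]}
for ex in extract_examples(plan, applicable, "UnloadAirplane"):
    print(ex)
```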

Heuristics for Extracting Examples

Rule Induction (based on Quinlan's FOIL; Quinlan 1990, 1996). Candidate literals: Xi = Xj (e.g., loc1 = loc2); P(X1, …, Xn) (e.g., at(pkg loc)); goal(P(X1, …, Xn)) (e.g., goal(at(pkg loc))); negations of the above.

Reject Rule: UnloadAirplane(pln pkg apt)
       time  pln  pkg  apt
   +    2    P    a    BOS
   +    3    P    a    NYC
   +    4    P    a    NYC
   +    4    P    b    NYC
   -    5    P    a    SFO
   -    5    P    b    SFO

Reject Rule: UnloadAirplane(pln pkg apt) ^ goal(at(pkg loc))
       time  pln  pkg  apt  loc
   +    2    P    a    BOS  SFO
   +    3    P    a    NYC  SFO
   +    4    P    a    NYC  SFO
   +    4    P    b    NYC  SFO
   -    5    P    a    SFO  SFO
   -    5    P    b    SFO  SFO

Reject Rule: UnloadAirplane(pln pkg apt) ^ goal(at(pkg loc)) ^ (apt != loc)
       time  pln  pkg  apt  loc
   +    2    P    a    BOS  SFO
   +    3    P    a    NYC  SFO
   +    4    P    a    NYC  SFO
   +    4    P    b    NYC  SFO
   -    5    P    a    SFO  SFO
   -    5    P    b    SFO  SFO
The added literal (apt != loc) covers exactly the "+" examples and excludes the "-" ones.
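A much-simplified sketch of the greedy, FOIL-style clause construction that yields this rule; real FOIL scores literals by information gain and introduces variables more generally, and the goal facts below are assumed for this toy example:

```python
# Greedily add literals until no "-" (keep) example is still covered.
GOALS = {"a": "SFO", "b": "SFO"}          # assumed goal(at(pkg loc)) facts

examples = [                               # (label, bindings) for UnloadAirplane
    ("+", {"pln": "P", "pkg": "a", "apt": "BOS"}),
    ("+", {"pln": "P", "pkg": "a", "apt": "NYC"}),
    ("+", {"pln": "P", "pkg": "b", "apt": "NYC"}),
    ("-", {"pln": "P", "pkg": "a", "apt": "SFO"}),
    ("-", {"pln": "P", "pkg": "b", "apt": "SFO"}),
]

def lit_goal_at(row):                      # goal(at(pkg loc)): binds loc
    loc = GOALS.get(row["pkg"])
    return None if loc is None else {**row, "loc": loc}

def lit_apt_ne_loc(row):                   # (apt != loc): needs loc bound
    if "loc" not in row or row["apt"] == row["loc"]:
        return None
    return row

candidates = [("goal(at(pkg loc))", lit_goal_at), ("(apt != loc)", lit_apt_ne_loc)]

body, covered = [], examples
while any(lbl == "-" for lbl, _ in covered):
    # pick the literal that keeps the most "+" and the fewest "-" examples
    def apply(lit, rows):
        return [(l, lit(r)) for l, r in rows if lit(r) is not None]
    name, lit = max(candidates, key=lambda c: sum(
        (+1 if l == "+" else -1) for l, _ in apply(c[1], covered)))
    body.append(name)
    covered = apply(lit, covered)
    candidates = [c for c in candidates if c[0] != name]

print("reject UnloadAirplane(pln pkg apt) if", " ^ ".join(body))
# -> goal(at(pkg loc)) ^ (apt != loc)
```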

Learning Time

Logistics Domain

Learned Logistics Control Rules. If an object's goal location is in a different city, do NOT unload the object from an airplane. Unload an object from a truck if the current location is an airport and is not in the same city as the package's goal location.

Briefcase Domain

Grid Domain

Gripper Domain

Mystery Domain

Tireworld Domain

Summary of Learning for Planning. Introduced an inductive logic programming methodology into the constraint-based planning framework to obtain a "trainable" planner. Demonstrated clear practical speedups on a range of benchmark problems.

IV. Single-agent vs. Multi-agent Planning.  Observation: heuristic planners degrade rapidly in multi-agent settings; they tend to assign all work to a single agent.  We studied this phenomenon by exploring different workload distributions.

Forcing the Planners. There is no easy way to modify the heuristic search planners to find better-quality plans. Instead, limit the number of actions an agent can perform, forcing the planners to find plans with the same level of participation from all agents.

Sokoban Domain

Restricted Sokoban Domain

Complexity Analysis on Restricted Domains (C.B.P. = constraint-based planner, H.P. = heuristic search planner; ✓ marks the planner that dominates on that domain):
Domain    Complexity                                     C.B.P.  H.P.
Sokoban   PSPACE-complete (Culberson, 1997)                ✓
Rocket    NP-complete (reduction from vertex feedback)     ✓
Grid      Polynomial-time solvable                                 ✓
Elevator  Polynomial-time solvable                                 ✓

Conclusions (a). Demonstrated how the performance of state-of-the-art general-purpose planning systems can be boosted by incorporating control knowledge. Knowledge is encoded in purely declarative form using temporal logic formulas. Obtained up to two orders of magnitude speedup on a series of benchmarks.

Conclusions (b). Demonstrated the feasibility of a "trainable" planning system: the system learns domain / control knowledge from many small example plans. Based on concepts from inductive logic programming. Learned knowledge is in temporal logic form. First demonstration of practical speedups using learning in a planning system on realistic benchmarks. The approach avoids learning "accidental truths" that can hurt system performance (a problem in earlier systems).

Conclusions (c). Uncovered a link between the performance of planners and the inherent complexity of the planning task. Heuristic search planners work well on problems solvable in polynomial time with specialized algorithms. Constraint-based planners dominate on NP-complete planning tasks.

Conclusion. Comparison of constraint-based and heuristic search planners shows that they complement each other on different domains. Hand-coded control knowledge can be effectively applied in constraint-based planners.

Conclusion (cont.). Our learning system is simple and modular, and learning time is short. Learned rules are on par with hand-coded ones and are shown to improve performance by over two orders of magnitude. Learned rules are in logic form and can be used in other planning systems.

Demonstrated a way to effectively learn domain knowledge from small example plans. Learned control knowledge boosts performance on larger problems. First clear demonstration of boosting planning system performance through learning. The declarative, logic-based approach is general and fits a wide range of planning applications.

The End