Presentation is loading. Please wait.

Presentation is loading. Please wait.

Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena.

Similar presentations


Presentation on theme: "Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena."— Presentation transcript:

1 Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena

2 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

3 Structured To Unstructured Environments [Images from Google] Kuka arm Kiva Beam Baxter PR2 Robot Nurse Jain, Sharma, Joachims, Saxena

4 Path Planning High DoF manipulators Continuous high dimensional space Obstacles BASE 7 DoF arm Joint Link End-Effector A B Geometric criteria's Collision free Shortest path Least time Minimum energy Kavraki et. al. PRM LaValle et. al. RRT Ratliff et. al. CHOMP Karaman et. al. RRT* Schulman et. al. TrajOpt Jain, Sharma, Joachims, Saxena

5 Context Rich Environment Jain, Sharma, Joachims, Saxena

6 Context Rich Environment Jain, Sharma, Joachims, Saxena http://www.youtube.com/watch?v=uLktpkd7ojA Video [14 sec to 18 sec]

7 What went wrong? Robot not modeling the context Does not understand the preferences Jain, Sharma, Joachims, Saxena

8 Does Existing Works Address This? Inverse Reinforcement Learning (Kober and Peters 2011, Abbeel et. al. 2010, Ziebrat et. al. 2008, Ratliff et. al. 2006) Context is not important, focuses on specific trajectory – Modeling human navigation patterns Kitani et. al. ECCV 2012 Optimal Demonstrations: Requires an expert Abbeel et. al.Ratliff et. al.Kober et. al. Jain, Sharma, Joachims, Saxena

9 Our Goal Model Context Generate multiple trajectories for a task User preferences Learn from non-experts Jain, Sharma, Joachims, Saxena

10 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

11 Learning Setting UserRobot 1.Online learning system 2.Learns from user feedback 3.Sub-optimal feedback Goal: Learn user preferences Jain, Sharma, Joachims, Saxena

12 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

13 Example of Preferences Move a glass of water Upright Context Contorted Arm Preferences varies with users, tasks and environments Jain, Sharma, Joachims, Saxena

14 Score function Robot configuration and Environment Interactions Context Trajectory Jain, Sharma, Joachims, Saxena Object-object Interactions Connecting waypoints to neighboring objects Trajectory graph

15 Score function Jain, Sharma, Joachims, Saxena Trajectory graph Object attributes: {electronic, fragile, sharp, liquid, hot, …} E.g.Laptop: {electronic, fragile} Knife: {sharp} ….. Hermans et. al. ICRA w/s 2011 Koppula et. al. NIPS 2011 Distance features Object-object Interactions

16 Score function Object-object Interactions Robot configuration and Environment Interactions Bad Good Jain, Sharma, Joachims, Saxena Features 1.Spectrogram 2.Objects distance from horizontal and vertical surfaces 3.Objects angle with vertical axis 4.Robots wrist and elbow configuration in cylindrical co-ordinate Cakmak et. al. IROS 2011

17 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

18 User Feedback Intuitive feedback mechanisms Re-rank Interactive Zero-G Jain, Sharma, Joachims, Saxena

19 1. Re-rank Robot ranks trajectories and user selects one Top three trajectories User feedback User observing top three trajectories Jain, Sharma, Joachims, Saxena

20 2. Zero-G User corrects trajectory waypoints Bad waypoint in redHolding wrist activates zero-G mode Jain, Sharma, Joachims, Saxena

21 Bad waypoint in redHolding wrist activates zero-G mode 2. Zero-G User corrects trajectory waypoints Jain, Sharma, Joachims, Saxena

22 3. Interactive Not all robots support zero-G feedback Jain, Sharma, Joachims, Saxena

23 3. Interactive Not all robots support zero-G feedback Jain, Sharma, Joachims, Saxena

24 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

25 Coactive Learning UserRobot Goal: Learn user preferences Shivaswamy & Joachims, ICML 2012 Learn from sub- optimal feedback Jain, Sharma, Joachims, Saxena

26 Trajectory Preference Perceptron Regret bound Shivaswamy & Joachims, ICML 2012 Jain, Sharma, Joachims, Saxena

27 Outline Motivation Approach – Context-based score – Feedback mechanism – Learning algorithm Results Jain, Sharma, Joachims, Saxena

28 Experimental Setup Two robots: Baxter and PR2 35 tasks in household setting – 2100 expert labeled trajectories 16 tasks in grocery store checkout settings – 1300 expert labeled trajectories 14 objects – Bowl, Knife, Laptop, Metal box, Fruits, Egg cartons etc. 7 users Jain, Sharma, Joachims, Saxena

29 Experimental Setting 1 Household environment on PR2 Pouring Cleaning the table Setting up table 35 tasks Variation in objects and environment Experts label on 2100 trajectories on a scale of 1 to 5 Jain, Sharma, Joachims, Saxena

30 Experimental Setting 2 Grocery store checkout on Baxter Cereal box Egg carton Knife in human vicinity 16 tasks Variations in objects and their placement Experts label on 1300 trajectories on a scale of 1 to 5 Jain, Sharma, Joachims, Saxena

31 Generalization #Feedback nDCG@3 Ours w/o pre-training Ours pre-trained SVM-rank MMP-online Household setting Testing on a new environment Higher nDCG w/o feedback SVM-rank trained on experts labels MMP-online is an IRL technique Jain, Sharma, Joachims, Saxena

32 User Study 10 tasks per user – 7 users – Total 7 hours worth robot interaction – Users interacts until satisfied Jain, Sharma, Joachims, Saxena

33 User Study Task No. Time (min) #Feedback Increasing difficulty Grocery setting Baxter Re-rank popular for easier tasks Increase in zero- G for hard tasks #Feedback Time Re-rank Zero-G Jain, Sharma, Joachims, Saxena

34 User# Re-rank# Zero-G Time (min) Self Score Cross Score 15.43.37.83.84.0 21.81.74.64.33.6 32.92.05.04.43.2 4 1.55.33.03.7 53.61.95.03.53.3 63.12.4-3.53.6 72.31.8-4.1 User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. Jain, Sharma, Joachims, Saxena

35 User# Re-rank# Zero-G Time (min) Self Score Cross Score 15.43.37.83.84.0 21.81.74.64.33.6 32.92.05.04.43.2 4 1.55.33.03.7 53.61.95.03.53.3 63.12.4-3.53.6 72.31.8-4.1 User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. 5 Feedback 3 Re-rank 2 Zero-G Jain, Sharma, Joachims, Saxena

36 User# Re-rank# Zero-G Time (min) Self Score Cross Score 15.43.37.83.84.0 21.81.74.64.33.6 32.92.05.04.43.2 4 1.55.33.03.7 53.61.95.03.53.3 63.12.4-3.53.6 72.31.8-4.1 User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. 5 to 6 min. per task Jain, Sharma, Joachims, Saxena

37 User# Re-rank# Zero-G Time (min) Self Score Cross Score 15.43.37.83.84.0 21.81.74.64.33.6 32.92.05.04.43.2 4 1.55.33.03.7 53.61.95.03.53.3 63.12.4-3.53.6 72.31.8-4.1 User Study 3.2 (1.1)2.1 (0.6)5.5 (1.3)3.8 (0.5)3.6 (0.3) Avg. Similar preferences Jain, Sharma, Joachims, Saxena

38 Robot Demonstration Jain, Sharma, Joachims, Saxena http://www.youtube.com/watch?v=uLktpkd7ojA Video [full video]

39 Conclusion Challenges of Unstructured Environment Geometric approaches are not enough Modeling context is crucial Learning from users and not experts Jain, Sharma, Joachims, Saxena

40 Thank You For more details visit http://pr.cs.cornell.edu/coactive


Download ppt "Beyond Geometric Path Planning: When Context Matters Ashesh Jain, Shikhar Sharma Thorsten Joachims and Ashutosh Saxena."

Similar presentations


Ads by Google