
1 Using OpenRDK to learn walk parameters for the Humanoid Robot NAO. A. Cherubini, L. Iocchi, F. Giannone, M. Lombardo, G. Oriolo

2 Overview: environment. Robotic agent: NAO, a humanoid robot produced by Aldebaran. Application: robotic soccer. SDK: simulator.

3 Overview: (sub)tasks. Starting from the environment, the agent must:
- process raw data from the environment (Vision Module)
- elaborate the raw data to obtain more reliable information (Modelling Module)
- decide the best behaviour to accomplish the agent goal (Behaviour Control Module)
- actuate the robot motors accordingly (Motion Control Module)

4 Make Nao walk… how? Nao is equipped with a set of motion utilities, including a walk implementation that can be called through an interface (the NaoQi Motion Proxy) and partially customized by tuning some parameters. Main advantage: it is ready to use (…once tuned). Drawback: it is based on an unknown walk model, so there is no flexibility at all. For these reasons we decided to develop our own walk model and to tune it using machine learning techniques.

5 SPQR Walking Library development workflow:
1. Develop the walk model (SPQR Walk Model) using Matlab.
2. Test the walk model on the Webots simulator.
3. Design and implement a C++ library (the SPQR Walking Library) for our RDK Soccer Agent.
4. Test our walking RDK agent on the Webots simulator and on the real NAO robot.
5. Finally, tune the walk parameters (on the Webots simulator and on NAO).

6 A simple walking RAgent for Nao. Two scenarios must be possible: 1. an RAgent running on Webots; 2. an RAgent running on the NAO. Show how much the two have in common, and explain the advantages of using RDK (the possibility of developing and testing the walking library orthogonally to the development of the rest of the code, into which it can nevertheless be easily integrated).

7 A simple walking RAgent for Nao. The Motion Control Module uses the SPQR Walking Library, and a Simple Behaviour Module switches between two states: walk and stand. The modules communicate through Smemy. To drive the real robot, the agent goes through the NaoQi adaptor to NAO (NaoQi); to drive the simulation, it goes through a Webots client connected to WEBOTS over a TCP channel.
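The transcript does not show OpenRDK's actual module API; below is a minimal hedged sketch of the dual-target idea from this slide and the previous one, where the same motion code drives either the real NAO or Webots through an abstract actuator. All names (ActuatorInterface, NaoQiAdaptor, WebotsClient, sendJoints) are illustrative assumptions, not the real OpenRDK, NaoQi, or Webots APIs.

```cpp
#include <iostream>
#include <memory>

// Hedged sketch of the dual-target design: the Motion Control Module
// talks to an abstract actuator, so the same walking code drives either
// the real NAO (via NaoQi) or Webots (via a TCP client). All names here
// are illustrative, not the real OpenRDK, NaoQi, or Webots APIs.
struct JointCommand { double headYaw; /* ... 21 joints in total ... */ };

class ActuatorInterface {
public:
    virtual ~ActuatorInterface() = default;
    virtual void sendJoints(const JointCommand& cmd) = 0;
};

class NaoQiAdaptor : public ActuatorInterface {
public:
    void sendJoints(const JointCommand&) override {
        std::cout << "sending joints through the NaoQi motion interface\n";
    }
};

class WebotsClient : public ActuatorInterface {
public:
    void sendJoints(const JointCommand&) override {
        std::cout << "sending joints over the TCP channel to Webots\n";
    }
};

int main() {
    // The rest of the agent is identical in both scenarios: only the
    // concrete actuator changes.
    std::unique_ptr<ActuatorInterface> actuator = std::make_unique<WebotsClient>();
    actuator->sendJoints(JointCommand{0.0});
}
```

Swapping the concrete actuator is the only difference between the two scenarios, which is exactly the orthogonality advantage mentioned on the previous slide.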

8 SPQR Walking Engine Model. NAO model characteristics: 21 degrees of freedom, no actuated trunk, no dynamic model available. Input: velocity commands (v, ω), where v is the linear velocity and ω is the angular velocity. We follow the "static walking pattern": an a-priori definition of the desired trajectories, obtained by choosing a set of output variables (the 3D coordinates of selected points of the robot) and by choosing and parametrizing the desired trajectories for these variables at each phase of the gait.
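The transcript does not give the actual SPQR trajectory functions; as a hedged illustration of what "choosing and parametrizing a desired trajectory" means, here is a sketch of a swing-foot trajectory in the xz plane. The half-sine lift is an assumption, and the parameter names are borrowed from the parameter list on slide 10.

```cpp
#include <cstdio>

// Hedged illustration of a statically parametrized swing-foot
// trajectory in the xz plane. The half-sine shape is an assumption;
// the parameter names mirror slide 10, but the real SPQR trajectory
// functions are not given in the transcript.
#include <cmath>

const double kPi = 3.14159265358979323846;

struct FootPoint { double x, z; };

// s in [0,1] is the normalized phase of the swing.
FootPoint swingFoot(double s, double xTot, double zSw) {
    FootPoint p;
    p.x = xTot * s;                  // advance linearly along x
    p.z = zSw * std::sin(kPi * s);   // lift and lower the foot along z
    return p;
}

int main() {
    for (double s = 0.0; s <= 1.0; s += 0.25) {
        FootPoint p = swingFoot(s, 0.04 /* m */, 0.02 /* m */);
        std::printf("s=%.2f  x=%.3f  z=%.3f\n", s, p.x, p.z);
    }
}
```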

9 SPQR velocity commands. The Behavior Control Module sends velocity commands (v, ω) to the Motion Control Module, which produces the joints matrix. The command selects the gait primitive: (0,0) stand position, (v,0) rectilinear walk swing, (0,ω) turn step, (v,ω) curvilinear walk swing, plus an initial half step when starting and a final half step when stopping.
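A small sketch of the dispatch this slide describes, mapping a velocity command (v, ω) to a gait primitive. The exact-zero tests are illustrative; a real controller would use dead zones.

```cpp
#include <cstdio>

// Map a velocity command (v, w) to the gait primitive it selects,
// following the table on this slide. Exact-zero comparisons are for
// illustration only.
enum class Gait { StandPosition, RectilinearWalkSwing, TurnStep, CurvilinearWalkSwing };

Gait selectGait(double v, double w) {
    if (v == 0.0 && w == 0.0) return Gait::StandPosition;        // (0,0)
    if (w == 0.0)             return Gait::RectilinearWalkSwing; // (v,0)
    if (v == 0.0)             return Gait::TurnStep;             // (0,w)
    return Gait::CurvilinearWalkSwing;                           // (v,w)
}

int main() {
    std::printf("gait=%d\n", static_cast<int>(selectGait(0.05, 0.0)));
}
```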

10 SPQR walking subtasks and parameters. Biped walking alternates a double support phase and a swing phase; SS% parametrizes the ratio between the two phases. The SPQR walk subtasks and their parameters:
- foot trajectories in the xz plane: X_tot, X_sw0, X_ds, Z_st, Z_sw
- center of mass trajectory in lateral direction: Y_ft, Y_ss, Y_ds, K_r
- hip yaw/pitch control (turn): H_yp
- arm control: K_s
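For reference, the parameters listed above can be collected in a single struct. The names follow the slide, and the grouping comments restate each subtask; this is a sketch, not the actual SPQR library type.

```cpp
// The walk parameters listed on this slide, gathered in one struct.
// A sketch for reference, not the actual SPQR Walking Library type.
struct SpqrWalkParams {
    // Foot trajectories in the xz plane
    double xTot, xSw0, xDs, zSt, zSw;
    // Center of mass trajectory in lateral direction
    double yFt, ySs, yDs, kR;
    // Hip yaw/pitch control (turn)
    double hYp;
    // Arm control
    double kS;
    // Ratio between the double support and swing phases
    double ssPercent;
};

int main() {
    SpqrWalkParams params{};   // zero-initialize; real values come from tuning
    params.zSw = 0.02;         // e.g., swing foot lift height in meters
    (void)params;
}
```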

11 SPQR Walking Library Class Diagram

12 Walk tuning: main issues. Possible choices: by hand, or by using machine learning techniques. Machine learning seems the best solution: less human interaction, and it explores the search space in a more systematic way. …but take care of some aspects: you need to define an effective fitness function; you need to choose the right algorithm to explore the parameter space; and only a limited number of experiments can be done on a real robot.

13 SPQR Learning System Architecture. The Learner (which uses the learning library) sends the experiments of each iteration to the RAgent (which uses the walking library), running either on the real NAO or on Webots. The RAgent returns the data needed to evaluate the fitness (GPS), and the resulting fitness value is fed back to the Learner.

14 SPQR Learner. On each call, the Learner checks: first iteration? If yes, it returns the initial iteration and the iteration information; if no, it applies the chosen algorithm (strategy) and returns the next iteration and the iteration information. Available strategies: Policy Gradient (e.g., PGPR), the Nelder-Mead simplex method, and genetic algorithms.
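The word "strategy" on this slide suggests the classic strategy pattern; below is a hedged C++ sketch of that structure. All names are illustrative, not the actual SPQR learning library API.

```cpp
#include <memory>
#include <vector>

// Sketch of the "strategy" structure this slide describes: the Learner
// delegates the computation of the next iteration to an interchangeable
// algorithm (policy gradient, Nelder-Mead, genetic algorithm).
using ParamVector = std::vector<double>;

class LearningStrategy {
public:
    virtual ~LearningStrategy() = default;
    // Given the current point and the fitnesses of the evaluated
    // policies, return the next point in parameter space.
    virtual ParamVector nextIteration(const ParamVector& current,
                                      const std::vector<double>& fitness) = 0;
};

class Learner {
public:
    Learner(std::unique_ptr<LearningStrategy> s, ParamVector initial)
        : strategy_(std::move(s)), current_(std::move(initial)), first_(true) {}

    ParamVector step(const std::vector<double>& fitness) {
        if (first_) {              // first iteration: return the initial point
            first_ = false;
            return current_;
        }
        current_ = strategy_->nextIteration(current_, fitness);
        return current_;
    }
private:
    std::unique_ptr<LearningStrategy> strategy_;
    ParamVector current_;
    bool first_;
};

// Trivial placeholder strategy: keeps the point unchanged.
class NullStrategy : public LearningStrategy {
public:
    ParamVector nextIteration(const ParamVector& current,
                              const std::vector<double>&) override {
        return current;
    }
};

int main() {
    Learner learner(std::make_unique<NullStrategy>(), ParamVector{0.1, 0.2});
    ParamVector p = learner.step({});   // first call: initial point
    (void)p;
}
```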

15 Policy Gradient (PG) iteration. Given a point p in the parameter space ⊆ ℝ^K:
1. Generate n (n = mK) policies from p: each component of p is perturbed to p_i, p_i + ε, or p_i − ε.
2. Evaluate the policies.
3. For each k ∈ {1, …, K}, compute the average fitnesses F_k^+, F_k^0, F_k^−.
4. For each k ∈ {1, …, K}: if F_k^0 > F_k^+ and F_k^0 > F_k^−, then δ_k = 0; else δ_k = F_k^+ − F_k^−.
5. δ* = η · normalized(δ), and p' = p + δ*.
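A hedged sketch of one such iteration in C++; the perturbation scheme and the update rule follow the steps above, while the step sizes ε and η, the random perturbation draw, and the evaluation callback are assumptions not fixed by the slide.

```cpp
#include <cmath>
#include <cstdlib>
#include <functional>
#include <vector>

// One Policy Gradient iteration as described on this slide. The fitness
// evaluation is passed in as a callback; epsilon and eta are assumed
// step sizes not given in the transcript.
std::vector<double> pgIteration(
    std::vector<double> p,
    int m,                                   // policies per parameter (n = m*K)
    double epsilon, double eta,
    const std::function<double(const std::vector<double>&)>& evaluate) {

    const std::size_t K = p.size();
    const std::size_t n = static_cast<std::size_t>(m) * K;
    std::vector<double> fPlus(K, 0), fZero(K, 0), fMinus(K, 0);
    std::vector<int> nPlus(K, 0), nZero(K, 0), nMinus(K, 0);

    // Generate and evaluate n = m*K policies: each perturbs every
    // component of p by -eps, 0, or +eps.
    for (std::size_t i = 0; i < n; ++i) {
        std::vector<double> policy = p;
        std::vector<int> sign(K);
        for (std::size_t k = 0; k < K; ++k) {
            sign[k] = std::rand() % 3 - 1;           // -1, 0, or +1
            policy[k] += sign[k] * epsilon;
        }
        double f = evaluate(policy);
        for (std::size_t k = 0; k < K; ++k) {
            if (sign[k] > 0)      { fPlus[k] += f;  ++nPlus[k]; }
            else if (sign[k] < 0) { fMinus[k] += f; ++nMinus[k]; }
            else                  { fZero[k] += f;  ++nZero[k]; }
        }
    }

    // Average fitnesses F+, F0, F- per component and build the gradient.
    std::vector<double> delta(K, 0);
    double norm = 0;
    for (std::size_t k = 0; k < K; ++k) {
        double Fp = nPlus[k]  ? fPlus[k]  / nPlus[k]  : 0;
        double F0 = nZero[k]  ? fZero[k]  / nZero[k]  : 0;
        double Fm = nMinus[k] ? fMinus[k] / nMinus[k] : 0;
        delta[k] = (F0 > Fp && F0 > Fm) ? 0.0 : (Fp - Fm);
        norm += delta[k] * delta[k];
    }

    // p' = p + eta * normalized(delta)
    norm = std::sqrt(norm);
    if (norm > 0)
        for (std::size_t k = 0; k < K; ++k) p[k] += eta * delta[k] / norm;
    return p;
}

int main() {
    auto evaluate = [](const std::vector<double>& q) {
        // Dummy fitness for illustration: prefer parameters near 0.5.
        double f = 0;
        for (double x : q) f -= (x - 0.5) * (x - 0.5);
        return f;
    };
    std::vector<double> p = pgIteration({0.1, 0.9}, 4, 0.05, 0.02, evaluate);
    (void)p;
}
```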

16 Enhancing PG: PGPR. At each iteration i, the gradient estimate δ(i) can be used to obtain a metric measuring the relevance of the parameters, accumulated over the iterations with a forgetting factor. Given the relevance and a threshold T, PGPR prunes the less relevant parameters in the next iterations.
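The exact relevance formula is not in the transcript; a minimal sketch, assuming an exponentially weighted accumulation of |δ_k| with forgetting factor λ (that specific form is an assumption, the slide only says a forgetting factor is involved):

```cpp
#include <cmath>
#include <vector>

// Hedged sketch of PGPR-style pruning. The relevance update below (an
// exponentially weighted sum of |delta_k| with forgetting factor
// lambda) is an ASSUMPTION; the transcript only mentions a forgetting
// factor. Parameters whose relevance drops below the threshold T are
// frozen in subsequent iterations.
void updateRelevanceAndPrune(const std::vector<double>& delta,
                             std::vector<double>& relevance,
                             std::vector<bool>& active,
                             double lambda, double T) {
    for (std::size_t k = 0; k < delta.size(); ++k) {
        relevance[k] = lambda * relevance[k] + std::fabs(delta[k]);
        if (relevance[k] < T)
            active[k] = false;   // prune: stop perturbing this parameter
    }
}

int main() {
    std::vector<double> delta = {0.4, 0.01};
    std::vector<double> relevance(2, 0.0);
    std::vector<bool> active(2, true);
    updateRelevanceAndPrune(delta, relevance, active, 0.9, 0.05);
}
```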

17 Curvilinear biped walking experiment. The robot moves along a curve with radius R for a time t. The fitness function (its formula is not preserved in this transcript) combines a radial error term and a path length term.

18 Simulators in learning tasks. Advantages: you can test the gait model and the learning algorithm without being biased by noise. Limits: the results of experiments on the simulator can be ported to the real robot, but solutions specialized for the simulated model may not be as effective on the real robot (e.g., the simulation does not take asymmetries into account, and the models are not very accurate).

19 Results (1). Five sessions of PG, 20 iterations each, all starting from the same initial configuration; SS%, K_s, and Y_ft were set to hand-tuned values, with 16 policies for each iteration. The fitness increases in a regular way, with low variance among the five simulations.

20 Results (2). Final parameter sets for the five PG runs, and five runs of PGPR (plots over the parameters Z_sw, X_s, K_r, and X_sw0; figures not preserved in this transcript).

21 Bibliography
A. Cherubini, F. Giannone, L. Iocchi, M. Lombardo, G. Oriolo. "Policy Gradient Learning for a Humanoid Soccer Robot". Accepted for the Journal of Robotics and Autonomous Systems.
A. Cherubini, F. Giannone, L. Iocchi, and P. F. Palamara. "An extended policy gradient algorithm for robot task learning". Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2007.
A. Cherubini, F. Giannone, and L. Iocchi. "Layered learning for a soccer legged robot helped with a 3D simulator". Proc. of the 11th International RoboCup Symposium, 2007.
http://openrdk.sourceforge.net
http://www.aldebaran-robotics.com/
http://spqr.dis.uniroma1.it

22 Any Questions?

