Muhammad Al-Nasser Mohammad Shahab Stochastic Optimization of Bipedal Walking using Gyro Feedback and Phase Resetting King Fahd University of Petroleum.

Muhammad Al-Nasser Mohammad Shahab. Stochastic Optimization of Bipedal Walking using Gyro Feedback and Phase Resetting. King Fahd University of Petroleum and Minerals, March 2008. COE 584/484: Robotics

Outline: 1. Problem Definition 2. Physical Description 3. Humanoid Walking System 4. Feedback (4.1 Gyroscope, 4.2 Phase Resetting) 5. Stochastic Optimization (5.1 PGRL) 6. Experimentation 7. Comments

Problem Definition. Authors: Felix Faber & Sven Behnke, Univ. of Freiburg, Germany. Problem Statement: “to optimize the walking pattern of a humanoid robot for forward speed using suitable metaheuristics”

First Humanoid Robot! 1206 AD, Ibn Ismail Ibn al-Razzaz Al-Jazari: a boat with four programmable automatic musicians that floated on a lake to entertain guests at royal drinking parties!

Problem Definition. Problems? Nonlinear dynamics: i.e. a complex system to control. Sensor noise: camera, gyroscope, ultrasonic, force, … Environment disturbances: unknown surface, … Inaccurate actuators: motors, …

Physical Description. Jupp, team NimbRo: 60 cm, 2.3 kg, Pocket PC.

Physical Description. Pitch joint to bend the trunk. Each leg: 3-DOF hip, knee, 2-DOF ankle. Each arm: 2-DOF shoulder, elbow.

Humanoid Walking System. One approach: model-based (geometric model). It needs an accurate model and solves the motion equations for all joints (offline). Difficulties: 19 degrees of freedom, nonlinear model equations, computational complexity. Diagram: controller, leg motion trajectory, joint motor positions. Robot walks!

Humanoid Walking System. 2nd approach: Central Pattern Generators (CPG). Sinusoid joint trajectories are generated. Bio-inspired; no need for a model. Diagram: controller, joint motor positions.

Humanoid Walking System. Open-loop (no feedback) gait mechanism: 1. Shifting weight from one leg to the other 2. Shortening the leg not needed for support 3. Leg motion in forward direction

Humanoid Walking System. Open-loop gait: clock-driven, with the trunk phase as the central clock. The trunk phase advances with the ‘foot step frequency’. Right leg motion phase = trunk phase + π/2. Left leg motion phase = trunk phase − π/2. (Plot: phase versus time.)
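The clock-driven scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the linear phase advance `2*pi*f*t`, and the wrapping of phases into [-pi, pi) are all assumptions made for the example.

```python
import math


def wrap_phase(phi):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def leg_phases(t, step_frequency_hz):
    """Return (trunk, right, left) phases at time t in seconds.

    Assumption: the trunk phase advances linearly at the step frequency.
    The +/- pi/2 leg offsets are the ones stated on the slide.
    """
    trunk = wrap_phase(2.0 * math.pi * step_frequency_hz * t)
    right = wrap_phase(trunk + math.pi / 2.0)
    left = wrap_phase(trunk - math.pi / 2.0)
    return trunk, right, left


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.2, 0.3):
        print(t, leg_phases(t, step_frequency_hz=1.0))
```

With these offsets the two legs are always half a cycle (π) apart, which is what makes them alternate.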

Humanoid Walking System (continued). Kinematic mapping: for the left and right leg, leg and foot angles with components r: roll, p: pitch, y: yaw. Source: “Human-Like Walking using Toes Joint and Straight Stance Leg” by Behnke. Quantities named on the slide: the leg swing amplitude (subscript Swing) and the leg extension.

Feedback. Overall control system: controller, mapping, joint motor positions. Sensors: 1. Gyroscope: the gyro reading gives inclination (balance) and angular velocity. 2. Force sensing resistors: foot-touches-ground trigger (‘High’ or ‘Low’).

Feedback. Gyroscope: a device for measuring orientation, based on the principle of conservation of angular momentum. Remember Physics 101!

Feedback: P-Control. An increasing gyro reading means the robot is falling. Proportional control: reactive action proportional to the ‘error’ (error = sensor value − desired value). Desired values = zero (i.e. no inclination). Other: proportional-integral control, with action proportional to the ‘error’ and to the accumulated ‘error’. Diagram: gyro reading, joint motor positions.
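The two control laws named on this slide can be written down as a short Python sketch. The gains, the negative sign of the corrective action, and the names `p_control` and `PIController` are illustrative assumptions. Only the error definition (sensor value minus desired value, with a desired value of zero) comes from the slide.

```python
def p_control(sensor_value, desired_value=0.0, gain=0.5):
    """Proportional control: action proportional to the error.

    Assumption: the correction opposes the error, hence the minus sign.
    """
    error = sensor_value - desired_value
    return -gain * error


class PIController:
    """Proportional-integral control: also reacts to the accumulated error."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0  # running sum of error * dt

    def update(self, sensor_value, desired_value, dt):
        error = sensor_value - desired_value
        self.integral += error * dt
        return -(self.kp * error + self.ki * self.integral)


if __name__ == "__main__":
    # A positive gyro reading (robot tipping) yields a negative correction.
    print(p_control(0.2, gain=2.0))
    pi = PIController(kp=2.0, ki=1.0)
    print(pi.update(0.2, 0.0, dt=0.01))
```

In a walking controller the returned value would be added as an offset to one or more joint targets. Which joints receive it is not stated on the slide.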

Feedback. Overall system: P-control, mapping, joint motor positions.

Feedback. Overall system: controller, joint motor positions, online adaptation (stochastic optimization). Adaptive control: online tuning of the ‘parameters’ of the controller.

Stochastic Optimization Approach. Goal: adjust the parameters to achieve a faster and more stable walk. A fitness function (cost function) is used to express the optimization goals (i.e. speed & robustness): $f(\cdot): \mathbb{R}^N \to \mathbb{R}$, where $N$ is the number of parameters of interest.

Stochastic Optimization Approach. The parameters are those of the kinematic mapping (Behnke paper).

Stochastic Optimization Approach. We evaluate $f$ at a given set of parameters $x = [x_1, x_2, \ldots, x_N]$ (Table 1). Now, how do we find the values of the parameters that will result in the highest fitness value? Use a metaheuristic method called PGRL. [Slide figure: +1, d < d_exp]

Policy Gradient Reinforcement Learning (PGRL). An optimization method used here to maximize the walking speed. It automatically searches a set of possible parameters, aiming to find the fastest walk that can be achieved.

Policy Gradient Reinforcement Learning. How does PGRL work? 1st: randomly generate $B$ test policies $\{x^1, x^2, \ldots, x^B\}$ around an initially given parameter vector $x^\pi$ (where $x = [x_1, x_2, \ldots, x_N]$). Each parameter of a given test policy $x^i$ is randomly set to $x^i_j = x^\pi_j + \delta^i_j$, with $\delta^i_j$ drawn at random from $\{-\varepsilon, 0, +\varepsilon\}$, where $1 \le i \le B$, $1 \le j \le N$, and $\varepsilon$ is a small constant value.
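The first step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `generate_test_policies`, the uniform choice among the three offsets, and the use of a single ε for all parameters are choices made for the example, not details confirmed by the slide.

```python
import random


def generate_test_policies(x_pi, num_policies, epsilon, rng=None):
    """Return (policies, offsets) for B = num_policies test policies.

    Each parameter of each test policy is x_pi[j] plus an offset drawn
    from {-epsilon, 0, +epsilon} (assumed uniform).
    """
    rng = rng or random.Random()
    policies, offsets = [], []
    for _ in range(num_policies):
        delta = [rng.choice((-epsilon, 0.0, epsilon)) for _ in x_pi]
        offsets.append(delta)
        policies.append([xj + dj for xj, dj in zip(x_pi, delta)])
    return policies, offsets


if __name__ == "__main__":
    policies, offsets = generate_test_policies(
        [1.0, 2.0, 3.0], num_policies=4, epsilon=0.1, rng=random.Random(0)
    )
    for p, d in zip(policies, offsets):
        print(p, d)
```

Returning the offsets alongside the policies keeps the information needed later to sort each policy into the −ε, 0, or +ε category per parameter.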

Policy Gradient Reinforcement Learning. 2nd: each test policy is evaluated by the ‘fitness function’. For each parameter j, the test policies are grouped into 3 categories, depending on whether the jth parameter was modified by −ε, 0, or +ε.

Policy Gradient Reinforcement Learning. Next, 3rd: construct the vector $a = [a_1, a_2, \ldots, a_N]$, where each $a_j$ is computed from the average fitness of each of the three categories for parameter j.
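The slide's formula for the vector a did not survive extraction, so the sketch below swaps in the rule commonly used in policy gradient search over parameter perturbations: set a_j to zero when the unperturbed category scores best, otherwise use the difference between the +ε and −ε averages. That rule, the function name `estimate_direction`, and the handling of empty categories are assumptions, not the presenters' stated method.

```python
def _mean(values):
    """Average of a list, or None when the list is empty."""
    return sum(values) / len(values) if values else None


def estimate_direction(signs, fitness):
    """Build a = [a_1, ..., a_N] from per-category average fitness.

    signs[i][j] is -1, 0 or +1: how parameter j was perturbed in policy i.
    fitness[i] is the fitness value measured for test policy i.
    """
    num_params = len(signs[0])
    a = []
    for j in range(num_params):
        groups = {-1: [], 0: [], 1: []}
        for row, f in zip(signs, fitness):
            groups[row[j]].append(f)
        plus, zero, minus = _mean(groups[1]), _mean(groups[0]), _mean(groups[-1])
        if plus is None or minus is None:
            a.append(0.0)  # assumption: no evidence in one direction, do not move
        elif zero is not None and zero > plus and zero > minus:
            a.append(0.0)  # leaving the parameter unchanged scored best
        else:
            a.append(plus - minus)
    return a


if __name__ == "__main__":
    signs = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    fitness = [2.0, 1.0, 1.5, 1.5]
    print(estimate_direction(signs, fitness))
```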

Policy Gradient Reinforcement Learning. Then, 4th (finally): adjust $x^\pi$ using the vector $a$, where $\eta$ is a scalar step size.
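The update equation itself is missing from the transcript. A common form, assumed here purely for illustration, moves x^π a fixed distance η along the normalized direction a. The function name `update_policy` and the normalization are assumptions, so the presenters' exact rule may differ.

```python
import math


def update_policy(x_pi, a, eta):
    """Return x_pi shifted by eta along a / |a| (assumed update form).

    If a is the zero vector there is no preferred direction, so the
    parameters are returned unchanged.
    """
    norm = math.sqrt(sum(v * v for v in a))
    if norm == 0.0:
        return list(x_pi)
    return [xj + eta * aj / norm for xj, aj in zip(x_pi, a)]


if __name__ == "__main__":
    print(update_policy([0.0, 0.0], [3.0, 4.0], eta=0.5))
```

Normalizing a makes η the actual length of each step in parameter space, which is convenient when the step size is later adapted.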

Extension to PGRL. Adaptive step size after g steps, where s is the number of fitness function evaluations so far and S is the maximum allowed value of s.

Overall System: controller, joint motor positions, PGRL, $x^\pi$.

Experiment

Results

Initial: speed is 21.3 cm/s, fitness is 1.36. After 1000 iterations: speed is 34.0 cm/s, fitness is 1.52. Speed increase: 60%.

Parameters

Glossary. Stance leg: the leg which is on the floor during the walk. Swing leg: the leg which is moving during the walk. Single support: the case where the robot is touching the floor with one leg. Double support: the case where the robot is touching the floor with both legs.
