Published by Diana Morgan. Modified over 9 years ago.
1
Stochastic Optimization of Bipedal Walking using Gyro Feedback and Phase Resetting
Muhammad Al-Nasser, Mohammad Shahab
King Fahd University of Petroleum and Minerals, March 2008
COE 584/484: Robotics
2
Outline
1. Problem Definition
2. Physical Description
3. Humanoid Walking System
4. Feedback
   1. Gyroscope
   2. Phase Resetting
5. Stochastic Optimization
   1. PGRL
6. Experimentation
7. Comments
3
Problem Definition
Authors: Felix Faber & Sven Behnke, University of Freiburg, Germany
Problem Statement: “to optimize the walking pattern of a humanoid robot for forward speed using suitable metaheuristics”
4
First Humanoid Robot!
1206 AD, Ibn Ismail Ibn al-Razzaz Al-Jazari: a boat with four programmable automatic musicians that floated on a lake to entertain guests at royal drinking parties!
5
Problem Definition: Problems?
– Nonlinear dynamics: a complex system to control
– Sensor noise: camera, gyroscope, ultrasonic, force, …
– Environment disturbances: unknown surface, …
– Inaccurate actuators: motors, …
6
Physical Description
Jupp, team NimbRo: 60 cm, 2.3 kg, Pocket PC
7
Physical Description
– Trunk: pitch joint to bend the trunk
– Each leg: 3-DOF hip, knee, 2-DOF ankle
– Each arm: 2-DOF shoulder, elbow
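The joint list above can be tallied to check it against the 19 degrees of freedom quoted for the model-based approach. This is a minimal sketch, assuming the knee and elbow are single-DOF joints (the slide does not give their counts); the names `TRUNK_DOF`, `LEG_DOF`, `ARM_DOF` and `total_dof` are illustrative only.

```python
# Joint counts as listed on the slide.
# Assumption: knee and elbow contribute 1 DOF each (not stated on the slide).
TRUNK_DOF = 1  # pitch joint to bend the trunk
LEG_DOF = {"hip": 3, "knee": 1, "ankle": 2}
ARM_DOF = {"shoulder": 2, "elbow": 1}


def total_dof() -> int:
    """Sum the degrees of freedom over the trunk, two legs, and two arms."""
    return TRUNK_DOF + 2 * sum(LEG_DOF.values()) + 2 * sum(ARM_DOF.values())


print(total_dof())  # 1 + 2*6 + 2*3 = 19
```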
8
Humanoid Walking System: One Approach
Model-based (geometric model)
– Requires an accurate model
– Motion equations are solved for all joints (offline)
– Difficulties: 19 degrees of freedom, nonlinear model equations, computational complexity
Diagram: Controller, leg motion trajectory, joint motor positions. Robot walks!
9
Humanoid Walking System: 2nd Approach
Central Pattern Generators (CPG)
– Sinusoidal joint trajectories are generated
– Bio-inspired
– No need for a model
Diagram: Controller, joint motor positions
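As a rough illustration of a sinusoidal joint trajectory, the sketch below turns a gait phase into a joint target angle. It is not the robot's actual generator: the function name `joint_angle`, the `amplitude` and `offset` parameters, and the sample values are assumptions made for the example.

```python
import math


def joint_angle(phase: float, amplitude: float, offset: float = 0.0) -> float:
    """Hypothetical sinusoidal joint target (radians) for a gait phase (radians)."""
    return offset + amplitude * math.sin(phase)


# Sample one gait cycle at 8 evenly spaced phases (illustrative values only).
samples = [joint_angle(2 * math.pi * k / 8, amplitude=0.3) for k in range(8)]
print([round(s, 3) for s in samples])
```

No dynamic model of the robot appears anywhere in the function, which is the point of the approach: the trajectory is shaped only by a few parameters such as amplitude and offset.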
10
Humanoid Walking System: Open-loop (no feedback) Gait Mechanism
1. Shifting weight from one leg to the other
2. Shortening the leg not needed for support
3. Leg motion in forward direction
11
Humanoid Walking System: Open-loop Gait
Clock-driven, with the trunk phase as the central clock
– Trunk phase (advances with the ‘foot step frequency’)
– Right leg motion phase = Trunk phase + π/2
– Left leg motion phase = Trunk phase − π/2
Figure: phase over time
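The clock relationship above can be sketched as follows. The ±π/2 leg offsets come from the slide; how the trunk phase advances (here, by 2π times a step frequency in Hz times the time step) and the wrapping into [-π, π) are assumptions, as are all function names.

```python
import math


def wrap_phase(phi: float) -> float:
    """Wrap an angle into the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi


def advance_trunk_phase(phi_trunk: float, step_freq_hz: float, dt: float) -> float:
    """Advance the central clock by one time step (assumed update rule)."""
    return wrap_phase(phi_trunk + 2 * math.pi * step_freq_hz * dt)


def leg_phases(phi_trunk: float) -> tuple[float, float]:
    """Return (right, left) leg motion phases derived from the trunk phase."""
    right = wrap_phase(phi_trunk + math.pi / 2)
    left = wrap_phase(phi_trunk - math.pi / 2)
    return right, left


right, left = leg_phases(0.0)
print(right, left)  # the two legs are half a cycle apart
```

Because both leg phases are derived from one clock, the legs stay half a cycle apart regardless of the step frequency.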
12
Humanoid Walking System (continued): Kinematic Mapping
Figure: left and right leg and foot angles (r: roll, p: pitch, y: yaw), from “Human-Like Walking using Toes Joint and Straight Stance Leg” by Behnke
Quantities used in the mapping: leg swing amplitude, leg extension
13
Feedback: Overall Control System
Diagram: Controller, Mapping, joint motor positions
1. Gyroscope: inclination (balance), angular velocity
2. Force sensing resistors: trigger when a foot touches the ground (‘High’ or ‘Low’)
14
Feedback: Gyroscope
– A device for measuring orientation, based on the principle of conservation of angular momentum
– Remember Physics 101!
15
Feedback: P-Control
– An increasing gyro reading means the robot is falling
– Proportional control: a reactive action proportional to the ‘error’ (error = sensor value − desired value)
– Desired value = zero (i.e. no inclination)
– Other option: proportional-integral control, where the action is proportional to the ‘error’ and to the accumulated ‘error’
Diagram: Gyro, joint motor positions
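The two control laws named above can be sketched in a few lines. This is a generic illustration, not the robot's controller: the gains `kp` and `ki`, the sign convention (the action opposes the error), and the names `p_control` and `PIController` are assumptions.

```python
def p_control(sensor_value: float, desired_value: float = 0.0, kp: float = 1.0) -> float:
    """Proportional control: corrective action proportional to the error."""
    error = sensor_value - desired_value
    return -kp * error  # assumed convention: the action opposes the error


class PIController:
    """Proportional-integral control: also reacts to the accumulated error."""

    def __init__(self, kp: float, ki: float, desired_value: float = 0.0):
        self.kp = kp
        self.ki = ki
        self.desired_value = desired_value
        self.integral = 0.0  # running sum of error * dt

    def update(self, sensor_value: float, dt: float) -> float:
        error = sensor_value - self.desired_value
        self.integral += error * dt
        return -(self.kp * error + self.ki * self.integral)


# Desired gyro value is zero (no inclination); a reading of 0.2 is the error.
print(p_control(0.2, kp=0.5))
pi = PIController(kp=0.5, ki=1.0)
print(pi.update(0.2, dt=0.1), pi.update(0.2, dt=0.1))
```

With a constant error, the P action stays fixed while the PI action keeps growing, which is what lets the integral term remove a persistent lean.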
16
Feedback: Overall System
Diagram: P-Control, Mapping, joint motor positions
17
Feedback: Overall System
Diagram: Controller, joint motor positions, Online Adaptation (Stochastic Optimization)
Adaptive control: online tuning of the ‘parameters’ of the controller
18
Stochastic Optimization Approach
Goal:
– Adjust the parameters to achieve a faster and more stable walk.
A fitness function (cost function) expresses the optimization goals (i.e. speed and robustness):
$f(\cdot): \mathbb{R}^N \to \mathbb{R}$, where $N$ is the number of parameters of interest
19
Stochastic Optimization Approach
The parameters are those of the kinematic mapping (Behnke paper)
20
Stochastic Optimization Approach
We evaluate $f$ at a given set of parameters $x = [x_1, x_2, \dots, x_N]$ (Table 1).
Now, how do we find the parameter values that result in the highest fitness value?
– Use a metaheuristic method called PGRL
[Formula fragment: ? + 1, $d < d_{exp}$]
21
Policy Gradient Reinforcement Learning (PGRL)
– An optimization method used to maximize the walking speed
– It automatically searches a set of possible parameters, aiming to find the fastest walk that can be achieved
22
Policy Gradient Reinforcement Learning
How does PGRL work?
1st: randomly generate $B$ test policies $\{x^1, x^2, \dots, x^B\}$ around an initially given parameter vector $x^\pi$ (where $x = [x_1, x_2, \dots, x_N]$)
– Each parameter $x^i_j$ of a test policy $x^i$ is randomly set to $x^\pi_j - \epsilon$, $x^\pi_j$, or $x^\pi_j + \epsilon$, where $1 \le i \le B$ and $1 \le j \le N$
– $\epsilon$ is a small constant value
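The first step can be sketched as below. It assumes each parameter is perturbed independently and uniformly by one of -ε, 0, +ε, which matches the three categories used in the next step; the function name, argument names, and numeric values are illustrative.

```python
import random


def generate_test_policies(x_pi, num_policies, epsilon, rng=None):
    """Return num_policies random perturbations of the parameter vector x_pi.

    Each component is shifted by -epsilon, 0, or +epsilon, chosen uniformly
    at random (an assumption about the sampling distribution).
    """
    rng = rng or random.Random()
    return [
        [x_j + rng.choice((-epsilon, 0.0, epsilon)) for x_j in x_pi]
        for _ in range(num_policies)
    ]


# Hypothetical starting parameters; a seeded generator keeps the run repeatable.
policies = generate_test_policies([1.0, 2.0, 3.0], num_policies=5, epsilon=0.1,
                                  rng=random.Random(0))
for policy in policies:
    print(policy)
```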
23
Policy Gradient Reinforcement Learning
2nd:
– Each test policy is evaluated with the ‘fitness function’. For each parameter $j$, the test policies are grouped into 3 categories, depending on whether the $j$th parameter was modified by $-\epsilon$, $0$, or $+\epsilon$
24
Policy Gradient Reinforcement Learning
3rd: construct the vector $a = [a_1, a_2, \dots, a_N]$ from the average fitness of each category
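One way to build the vector from the category averages is sketched below. The rule used here (set a_j to 0 when the unperturbed category scores best, otherwise use the difference between the +ε and -ε averages) is the commonly used PGRL formulation and is an assumption about what the slide showed; the handling of empty categories and all names are also assumptions.

```python
def _mean(values):
    """Average of a list, or None when the category is empty."""
    return sum(values) / len(values) if values else None


def estimate_gradient(x_pi, policies, fitnesses, epsilon):
    """Estimate a = [a_1, ..., a_N] from evaluated test policies."""
    a = []
    for j in range(len(x_pi)):
        minus, zero, plus = [], [], []
        for x_i, f_i in zip(policies, fitnesses):
            delta = x_i[j] - x_pi[j]
            if delta > epsilon / 2:
                plus.append(f_i)
            elif delta < -epsilon / 2:
                minus.append(f_i)
            else:
                zero.append(f_i)
        avg_minus, avg_zero, avg_plus = _mean(minus), _mean(zero), _mean(plus)
        if avg_minus is None or avg_plus is None:
            a.append(0.0)  # no evidence for a direction (assumed fallback)
        elif avg_zero is not None and avg_zero > avg_plus and avg_zero > avg_minus:
            a.append(0.0)  # leaving the parameter unchanged scored best
        else:
            a.append(avg_plus - avg_minus)
    return a


# Toy data: fitness rises with parameter 0 and peaks at the current parameter 1.
x_pi = [0.0, 0.0]
test_policies = [[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]]
test_fitnesses = [1.1, 0.9, 0.5, 0.5]
print(estimate_gradient(x_pi, test_policies, test_fitnesses, epsilon=0.1))
```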
25
Policy Gradient Reinforcement Learning
4th (finally): adjust $x^\pi$ along the direction of $a$, where $\eta$ is a scalar step size
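A sketch of the final step, assuming the common PGRL update in which the vector a is normalized to unit length and scaled by η before being added to the current parameters. The zero-vector guard and the name `update_policy` are assumptions for the example.

```python
import math


def update_policy(x_pi, a, eta):
    """Move x_pi a distance eta along the direction of a (assumed update rule)."""
    norm = math.sqrt(sum(a_j * a_j for a_j in a))
    if norm == 0.0:
        return list(x_pi)  # no direction to follow, keep the parameters
    return [x_j + eta * a_j / norm for x_j, a_j in zip(x_pi, a)]


# Hypothetical values: a = [3, 4] has length 5, so the step is eta * [0.6, 0.8].
print(update_policy([1.0, 2.0], [3.0, 4.0], eta=0.5))
```

Normalizing means η alone sets how far the parameters move per iteration, independent of how large the fitness differences happen to be.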
26
Extension to PGRL
Adaptive step size after $g$ steps, where
– $s$: the number of fitness function evaluations
– $S$: the maximum allowed value of $s$
27
Overall System
Diagram: PGRL supplies the parameter vector $x^\pi$ to the Controller, which outputs the joint motor positions
28
Experiment
29
Results
30
Initial: speed is 21.3 cm/s, fitness is 1.36
After 1000 iterations: speed is 34.0 cm/s, fitness is 1.52
Speed improvement: 60%
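The 60% figure can be checked from the two reported speeds. The helper below is only an arithmetic check; the function name is illustrative.

```python
def relative_improvement(initial: float, final: float) -> float:
    """Percentage increase from initial to final."""
    return (final - initial) / initial * 100.0


# Speeds reported on the slide, in cm/s.
improvement = relative_improvement(21.3, 34.0)
print(f"{improvement:.1f}%")  # about 59.6%, i.e. roughly 60%
```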
31
Parameters
32
Glossary
Stance leg:
– The leg which is on the floor during the walk.
Swing leg:
– The leg which is moving during the walk.
Single support:
– The case where the robot is touching the floor with one leg.
Double support:
– The case where the robot is touching the floor with both legs.