Presentation on theme: "Goal Finding Robot using Fuzzy Logic and Approximate Q-Learning"— Presentation transcript:

1 Goal Finding Robot using Fuzzy Logic and Approximate Q-Learning
CAP 5636 Project Presentation by S SANTHOSH CHAITANYA

2 Background
Significant usage in indoor environments: hospitals, shops, hotels.
Robot navigation in a semi-structured environment has three main problems: cognition by sensors, navigation, and obstacle avoidance.
Fuzzy logic algorithms can solve complex navigation problems with reasonable accuracy.
Q-learning helps the robot understand the environment, so the navigation strategy learned during the training phase can be adapted to a new environment.

3 Program Design using Fuzzy Logic
Robot uses:
- the goal location
- direction θ = the robot's heading angle from odometry
- φ = arctan((Z_g − Z_r) / (X_g − X_r)), the bearing from the robot position (X_r, Z_r) to the goal (X_g, Z_g)
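A minimal sketch of how these quantities might be computed in the robot controller. The variable names, the use of atan2, and the heading error β = φ − θ fed to the fuzzy controller are assumptions for illustration, not taken from the project code.

    #include <math.h>

    // Hypothetical globals holding the goal and the robot pose (assumed names).
    double goal_x, goal_z;    // goal location (X_g, Z_g)
    double robot_x, robot_z;  // robot location (X_r, Z_r)
    double robot_theta;       // theta: heading angle from odometry

    // Returns beta, the signed difference between the bearing to the goal (phi)
    // and the robot's current heading (theta), wrapped into [-pi, pi].
    double compute_orientation_angle(void) {
        // atan2 handles all quadrants, unlike the plain arctan on the slide
        double phi = atan2(goal_z - robot_z, goal_x - robot_x);
        double beta = phi - robot_theta;
        while (beta > M_PI)  beta -= 2.0 * M_PI;
        while (beta < -M_PI) beta += 2.0 * M_PI;
        return beta;
    }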

4 Continued…
[Figure: membership function of ω (angular velocity), defined over the values −100·β, −25·β, 0·β, 25·β, 100·β]
Z: zero, SN: small negative, SP: small positive, BP: big positive
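The exact membership shapes appear only in the slide figure. Below is a rough sketch of one way the mapping from the orientation error β to the angular velocity ω could be implemented, roughly what compute_membership() in the main loop on the next slide might do. The triangular shapes, the breakpoints, and the weighted-average defuzzification are assumptions, not the project's actual design.

    // Triangular membership: 0 outside [a, c], peaks at b (assumed shape).
    double tri(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;
        return (x < b) ? (x - a) / (b - a) : (c - x) / (c - b);
    }

    // Maps the orientation error beta to an angular velocity omega using the
    // four labelled sets (SN, Z, SP, BP) and weighted-average defuzzification.
    // The breakpoints below are placeholders, not the project's actual values.
    double fuzzy_angular_speed(double beta) {
        double mu_sn = tri(beta, -1.0, -0.5,  0.0);  // small negative
        double mu_z  = tri(beta, -0.5,  0.0,  0.5);  // zero
        double mu_sp = tri(beta,  0.0,  0.5,  1.0);  // small positive
        double mu_bp = tri(beta,  0.5,  1.0,  2.0);  // big positive
        double num = mu_sn * (-25.0 * beta) + mu_z * (0.0 * beta)
                   + mu_sp * ( 25.0 * beta) + mu_bp * (100.0 * beta);
        double den = mu_sn + mu_z + mu_sp + mu_bp;
        return (den > 0.0) ? num / den : 0.0;  // omega
    }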

5 Continued…
    // initialize odometry readings of the robot
    while (d_goal > distance_threshold) {
        get_sensor_data();
        compute_distance_to_goal();
        compute_orientation_angle();            // beta
        angular_speed1 = compute_membership();  // fuzzy controller output
        angular_speed2 = oam();                 // obstacle avoidance module
        if (obstacle_exists) {
            // robot is guided by the oam module
        } else {
            // robot is guided by the membership function
        }
    }

6 Obstacle avoidance module
    void oam(void) {
        double oam_delta = 0;
        oam_speed[LEFT] = oam_speed[RIGHT] = 0;
        if (obstacle[PS_1] || obstacle[PS_2]) {
            // obstacle on the right-hand sensors: turn left
            // (the right-side proximity values are presumably intended here)
            oam_delta += (int)(k_2 * ps_value[PS_RIGHT_90]);
            oam_delta += (int)(k_1 * ps_value[PS_RIGHT_45]);
        } else if (obstacle[PS_5] || obstacle[PS_6]) {
            // obstacle on the left-hand sensors: turn right
            oam_delta -= (int)(k_2 * ps_value[PS_LEFT_90]);
            oam_delta -= (int)(k_1 * ps_value[PS_LEFT_45]);
        }
        oam_speed[LEFT]  -= oam_delta;
        oam_speed[RIGHT] += oam_delta;
    }

7 Program Design for Approximate Q-learning
Actions:
- Forward
- Forward Left
- Forward Right
- Circular Left
- Circular Right
For each state, the feature vector is built from combinations of the following 3 values:
- distance to obstacle: val = 1.0 if an obstacle is present, else val = 0.0
- distance to goal: val / 10
- difference angle between the goal and the bot (d_angle), binned by thresholds, e.g. forward = 1 when −0.1 < d_angle < 0.1
A sketch of this feature computation is shown below.
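A minimal sketch of how this 3-value feature vector might be computed; the State struct, its field names, and compute_features() are illustrative assumptions, not the project's actual code.

    #define NUM_FEATURES 3

    // Hypothetical state description (assumed field names).
    typedef struct {
        int    obstacle_present;  // from the forward IR sensors
        double distance_to_goal;  // distance from the robot to the goal
        double d_angle;           // angle between goal direction and heading
    } State;

    // Fills f[] with the 3 features listed on the slide.
    void compute_features(const State *s, double f[NUM_FEATURES]) {
        // feature 1: obstacle indicator
        f[0] = s->obstacle_present ? 1.0 : 0.0;
        // feature 2: scaled distance to goal (val / 10, as on the slide)
        f[1] = s->distance_to_goal / 10.0;
        // feature 3: heading bin, 1 when the robot roughly faces the goal
        f[2] = (s->d_angle > -0.1 && s->d_angle < 0.1) ? 1.0 : 0.0;
    }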

8 Main algorithm flow
For i in range(1, no_of_training_episodes):
    Until the goal state is reached or the robot hits a wall:
        For each state, compute Q(s,a) = Σ_i w_i · f_i(s,a) (dot product of weights and feature vector)
        Get the action with the maximum Q value and take that action
        Update the reward using the reward function for the current action
        Update the weights: w_i = w_i + α · difference · f_i(s,a)
        where difference = (r + γ · max_a' Q(s',a')) − Q(s,a)
γ (gamma) = 0.8, α (alpha) = 0.2, epsilon =
A sketch of this update is shown below.
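A minimal sketch of the Q-value computation and weight update described above; γ = 0.8 and α = 0.2 come from the slide, while the function and array names are assumptions made for illustration.

    #define NUM_FEATURES 3
    #define ALPHA 0.2
    #define GAMMA 0.8

    double weights[NUM_FEATURES];

    // Q(s,a): dot product of the weight vector and the feature vector f(s,a).
    double compute_q(const double f[NUM_FEATURES]) {
        double q = 0.0;
        for (int i = 0; i < NUM_FEATURES; i++)
            q += weights[i] * f[i];
        return q;
    }

    // One update after taking an action and observing reward r:
    //   w_i <- w_i + alpha * difference * f_i(s,a)
    //   difference = (r + gamma * max_a' Q(s',a')) - Q(s,a)
    void update_weights(const double f[NUM_FEATURES], double r,
                        double q_sa, double max_q_next) {
        double difference = (r + GAMMA * max_q_next) - q_sa;
        for (int i = 0; i < NUM_FEATURES; i++)
            weights[i] += ALPHA * difference * f[i];
    }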

9 Rewards
    // if there is no obstacle, the reward is calculated as below
    if (action == FORWARD)
        reward += 0.3;
    else if (action == LEFT || action == RIGHT || action == FORWARDLEFT || action == FORWARDRIGHT)
        reward += 0.07;

    // if there is an obstacle, the reward is calculated this way
    if (obstacle == FORWARD) {
        reward += -0.15;
    } else if (obstacle == LEFT || obstacle == RIGHT || obstacle == FORWARDLEFT || obstacle == FORWARDRIGHT) {
        reward += -0.05;
    }

    // reward for orientation toward the goal direction:
    // the more the robot faces the goal, the larger the reward
    reward += 1 / (fabsf(d_angle) * 10);

10 Implementation
Both algorithms are implemented on the e-puck robot in the Webots simulator.
The e-puck has 8 IR sensors; the 6 forward-facing sensors are used for obstacle sensing.
During the learning phase, the supervisor controller sends the weight values to the robot controller, and saves the weight values in a text file before restarting the simulation.
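A minimal sketch of how the learned weights might be persisted to a text file between simulation runs; the file name, function names, and the use of NUM_FEATURES = 3 are assumptions, and the supervisor-to-controller exchange itself is not shown.

    #include <stdio.h>

    #define NUM_FEATURES 3

    // Save the learned weights so they survive a simulation restart.
    // "weights.txt" is a placeholder file name.
    int save_weights(const double w[NUM_FEATURES]) {
        FILE *fp = fopen("weights.txt", "w");
        if (!fp) return -1;
        for (int i = 0; i < NUM_FEATURES; i++)
            fprintf(fp, "%f\n", w[i]);
        fclose(fp);
        return 0;
    }

    // Reload the weights at the start of the next training run.
    int load_weights(double w[NUM_FEATURES]) {
        FILE *fp = fopen("weights.txt", "r");
        if (!fp) return -1;
        for (int i = 0; i < NUM_FEATURES; i++) {
            if (fscanf(fp, "%lf", &w[i]) != 1) {
                fclose(fp);
                return -1;
            }
        }
        fclose(fp);
        return 0;
    }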

11 Continued…
Assumptions:
- The map contains only the bot and obstacles
- Only static obstacles are considered in the simulation

12 Analysis
Fuzzy logic appears to provide a solution in any episode and across multiple scenarios.
Q-learning requires a lot of feature tuning to learn the environment (the algorithm took around 1.0 hour to learn the simulation arena).
Number of episodes used for training:

13 Future Goals
Implement the Q-learning algorithm in more complex scenarios, such as environments with moving obstacles, by fine-tuning the feature vectors.


15 Questions?
