Integrating Neural Network and Genetic Algorithm to Solve Function Approximation Combined with Optimization Problem. Term presentation for CSC7333 Machine Learning. Xiaoxi Xu, May 3, 2006.

Outline Problem description & analysis Neural Network Genetic Algorithm Implementations & experiments Results Remarks Conclusion

Problem Description We have plentiful data gathered over time; we are not aware of the underlying relationship between the data inputs (some of which are human-controllable) and the output; we want to minimize or maximize the output in the future; we hope to learn what kind of input would generate a minimum or maximum output, so that we can adjust the inputs to achieve that goal.

Problem Analysis The characteristics of this problem are: a. The exact nature of the relationship between input and output is unknown and likely non-linear; b. Inputs are likely N-dimensional (N > 10). In addition… d. The global optimum is expected to be obtained.

Problem Break Up 1. Function Approximation 2. Optimization Problem

Solution for Function Approximation This solution should meet the following requirements: a. Have a parallel structure to handle N-dimensional variables; b. Be able to model the nonlinear relation between variables and their responses; c. Preferably be fault-tolerant (noisy data can appear when the data set is large).

Solution for Function Approximation (Cont'd) A Neural Network could be one. Why? (Any continuous function can be approximated to arbitrary accuracy by a three-layer network with a linear transfer function in the output layer and sigmoid functions in the hidden layer.) We want to train an NN as follows: Topology: multiple-layer network; Connection type: feed-forward (from a mathematical point of view, a feed-forward neural network is a function: it takes an input and produces an output); Transfer functions: log-sigmoid (hidden), linear (output); Training algorithm: back-propagation. A minimal sketch of such a network's forward pass follows.
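The sketch below shows the forward pass of such a network in Python/NumPy, purely for illustration; the function and variable names are mine, not from the slides, and they mirror the weight_oh/weight_hi/bias notation used later for the GA fitness function.

```python
import numpy as np

def sigmoid(z):
    """Log-sigmoid transfer function used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weight_hi, bias_hi, weight_oh, bias_oh):
    """Single-hidden-layer feed-forward pass: sigmoid hidden units, linear output.
    x is a column vector of inputs; weight/bias names follow the slides' notation."""
    hidden = sigmoid(weight_hi @ x + bias_hi)
    return weight_oh @ hidden + bias_oh
```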

Problem Break Up 1. Function Approximation 2. Optimization Problem

Solution for Optimization Problem Can we solve it mathematically? Classical line-search methods such as conjugate gradient descent and quasi-Newton only guarantee LOCAL optima; the global optimum is only reached for functions with good properties such as convexity or unimodality. This solution should meet the following requirements: a. Be able to handle the expression of the objective function; b. Be able to optimize that function; c. Have a good chance of finding a global optimum.

Solution for Optimization Problem (Cont'd) A Genetic Algorithm could be one. Why? (GAs have mostly been applied to optimization problems.) We can use a GA as follows: a. Representation (any real number can be encoded as a string of base-10 digits); b. Fitness function (the trained neural net); c. Genetic operators (crossover, selection, mutation).
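As an illustration of point (a), one possible base-10 digit-string encoding is sketched below. The helpers (encode, decode, random_chromosome) and the fixed-point mapping are my own assumptions; the slides do not give the exact encoding (they also track a sign separately, which this sketch folds into the [lo, hi] range).

```python
import random

def encode(x, lo, hi, length=6):
    """Encode a real number in [lo, hi] as a fixed-length string of base-10 digits."""
    frac = (x - lo) / (hi - lo)              # map to [0, 1]
    as_int = round(frac * (10**length - 1))  # scale to an integer
    return f"{as_int:0{length}d}"            # fixed-width digit string

def decode(chrom, lo, hi):
    """Map a digit-string chromosome back to a real number in [lo, hi]."""
    frac = int(chrom) / (10**len(chrom) - 1)
    return lo + frac * (hi - lo)

def random_chromosome(length=6):
    """Randomly initialize one individual."""
    return "".join(random.choice("0123456789") for _ in range(length))
```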

Implementation & Experiment of NN for 2D Function Approximation Initialization with randomly selected weights; multiple-layer network with one hidden layer of 20 hidden nodes; transfer functions: sigmoid (hidden layer), purelin (output layer); back-propagation algorithm; learning rate: 0.05; momentum: 0.1; stop criteria: 1. MSE below 0.01%, or 2. epochs (training passes) exceed 100; test function: tan(sin(x)) - sin(tan(x)); training data: x in [-4, 4] at -4, -3.6, -3.2, -2.8, … (spacing 0.4).
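Below is a minimal sketch of batch back-propagation with momentum under these settings, in Python/NumPy. The original work appears to use the MATLAB Neural Network Toolbox (hence "purelin"), so this is only an illustrative reimplementation; the random seed, initial weight range, and variable names are assumptions.

```python
import numpy as np

# Test function and training grid from the slides: x in [-4, 4], spacing 0.4.
f = lambda x: np.tan(np.sin(x)) - np.sin(np.tan(x))
X = np.arange(-4.0, 4.0 + 1e-9, 0.4).reshape(1, -1)   # shape (1, n_samples)
T = f(X)

n_hidden, lr, mom = 20, 0.05, 0.1
rng = np.random.default_rng(0)
W1, b1 = rng.uniform(-1, 1, (n_hidden, 1)), rng.uniform(-1, 1, (n_hidden, 1))
W2, b2 = rng.uniform(-1, 1, (1, n_hidden)), rng.uniform(-1, 1, (1, 1))
dW1 = db1 = dW2 = db2 = 0.0                            # previous updates (momentum)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(100):                               # stop criterion 2: 100 epochs
    H = sigmoid(W1 @ X + b1)                           # sigmoid hidden layer
    Y = W2 @ H + b2                                    # linear (purelin) output
    E = Y - T
    if np.mean(E ** 2) < 1e-4:                         # stop criterion 1: MSE < 0.01%
        break
    gY = 2 * E / E.size                                # dMSE/dY
    gW2, gb2 = gY @ H.T, gY.sum(axis=1, keepdims=True)
    gZ = (W2.T @ gY) * H * (1 - H)                     # back-prop through the sigmoid
    gW1, gb1 = gZ @ X.T, gZ.sum(axis=1, keepdims=True)
    dW2 = mom * dW2 - lr * gW2; W2 += dW2              # gradient descent with momentum
    db2 = mom * db2 - lr * gb2; b2 += db2
    dW1 = mom * dW1 - lr * gW1; W1 += dW1
    db1 = mom * db1 - lr * gb1; b1 += db1
```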

Implementation & Experiment of Genetic Algorithm for 2D Function Optimization Representation: a string of base-10 digits representing a real number and its sign; random initialization; range: [-4, 4]; population size: 25; chromosome length: 6; one-point crossover probability: 0.7; mutation probability: 0.3; roulette-wheel selection (preference to best-fit individuals); fitness function: the trained neural network, evaluated from the inputs, weights, and biases as weight_oh*sigmoid(weight_hi*input + bias_hi*1) + bias_oh*1; elitism: the best-fit individual goes to the next generation; stop criteria: 1. the fitness value changes by less than 0.01 over 10 consecutive generations, or 2. the maximum of 30 generations is reached.
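A compact sketch of this GA loop is shown below, reusing the decode/random_chromosome helpers from the earlier encoding sketch. It is my own illustration, not the author's code: it assumes the fitness is to be maximized (negate the network output to minimize), and it omits the "fitness change < 0.01 over 10 generations" stopping test for brevity.

```python
import random

def ga_optimize(fitness, lo=-4.0, hi=4.0, pop_size=25, length=6,
                p_cross=0.7, p_mut=0.3, max_gen=30):
    """Roulette-wheel selection, one-point crossover, digit mutation, elitism.
    fitness(x) would be the trained network's prediction at x."""
    pop = [random_chromosome(length) for _ in range(pop_size)]   # random init
    for gen in range(max_gen):
        scores = [fitness(decode(c, lo, hi)) for c in pop]
        best = pop[scores.index(max(scores))]
        # Roulette-wheel selection (scores shifted so all weights are positive).
        weights = [s - min(scores) + 1e-9 for s in scores]
        parents = random.choices(pop, weights=weights, k=pop_size)
        children = [best]                                        # elitism
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            if random.random() < p_cross:                        # one-point crossover
                cut = random.randrange(1, length)
                a = a[:cut] + b[cut:]
            if random.random() < p_mut:                          # mutate one digit
                i = random.randrange(length)
                a = a[:i] + random.choice("0123456789") + a[i + 1:]
            children.append(a)
        pop = children
    scores = [fitness(decode(c, lo, hi)) for c in pop]
    return decode(pop[scores.index(max(scores))], lo, hi)
```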

Experiment Result (2D) Real function: red solid line; approximation by NN: blue dashed line; optimum found by the GA on the approximated function: magenta star.

Implementation of NN for 3D Function Approximation One hidden layer of 30 hidden nodes; learning rate: 0.06; momentum: 0.5; stop criteria: 1. MSE below 0.01%, or 2. epochs (training passes) exceed 250; test function: 1.85*sin(x)*exp(-3*(y-1.5)^2) + 0.7*x*exp(-4*(x-1.2)^2) - 1.4*cos(x+y)*exp(-5*(y+1.3)^2) - 1.9*exp(-8*(x+0.5)^2); training data: x, y in [-1, 3] on a grid with spacing 0.17.
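For illustration, the test surface and training grid could be set up as follows (a sketch under the same assumptions as the 2D training code; with two inputs, the first weight matrix there would have shape (30, 2)).

```python
import numpy as np

# 3D test surface from the slides.
def g(x, y):
    return (1.85 * np.sin(x) * np.exp(-3 * (y - 1.5)**2)
            + 0.7 * x * np.exp(-4 * (x - 1.2)**2)
            - 1.4 * np.cos(x + y) * np.exp(-5 * (y + 1.3)**2)
            - 1.9 * np.exp(-8 * (x + 0.5)**2))

# Training grid on [-1, 3] with spacing 0.17; inputs stacked as a 2 x n_samples
# matrix so the same single-hidden-layer training code applies.
xs = np.arange(-1.0, 3.0 + 1e-9, 0.17)
XX, YY = np.meshgrid(xs, xs)
X = np.vstack([XX.ravel(), YY.ravel()])   # shape (2, n_samples)
T = g(XX, YY).ravel()[None, :]            # shape (1, n_samples)
```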

Implementation & Experiment of Genetic Algorithm for 3D Function Optimization Random initialization; range: [-1, 3]; population size: 40; chromosome length: 12; one-point crossover probability: 0.8; mutation probability: 0.6; stop criteria: 1. the fitness value changes by less than 0.01 over 10 consecutive generations, or 2. the maximum of 100 generations is reached.

Experiment Result- Mesh

Experiment Result - Contour Map (optimum marked with a black circle)

One More Experiment sin(0.007*x^5)/cos(exp(0.0009*y^7))

Remarks How should NN parameters be adjusted? It is really application-dependent. With regard to my experiments: Learning rate: too small and the MSE shows no apparent decrease for a long while; too large and the MSE oscillates between decreasing and increasing. Momentum: a larger value modeled the data better than a smaller one, but it should be kept within a reasonable range. Hidden nodes: keep the number moderate, otherwise the network overfits; more nodes gave better performance, but our computer gave out before we could see how good the performance would get. Epochs: more epochs give better performance, but overfitting must be avoided; there is a trade-off between accuracy and training time.

Remarks (Cont'd) How should GA parameters be adjusted? Population size: bigger is better, but keep it within a reasonable range, otherwise overfitting occurs. Crossover probability: typically in [0.6, 0.9]; this range worked in my experiment. Mutation probability: typically in [1/pop_size, 1/chromosome_length]; a larger value was used in my experiment. Number of generations: larger gives better performance, but overfitting must be avoided; there is a trade-off between accuracy and time. Random initialization influences success: we used randomly selected weights when training the NN and randomly selected individuals for the first GA generation, and we found that the random initialization sometimes determines whether the run succeeds.

Room for Improvements Train the neural network on random data with noise added; test the NN's performance on more complex examples; build up more knowledge on adjusting NN & GA parameters; show the error surface; analyze the time complexity.

Conclusion Integrating GA and NN by using the NN as the GA's fitness function can find a global optimum close to the one found by a GA using the real function. The GA performs well in searching for the global optimum regardless of what the fitness function is. In practice, multi-layer NNs with one hidden layer perform well overall in function approximation, but they still sometimes have difficulty. The random initialization values can sometimes determine whether the run succeeds.

Thanks! Questions & Comments ??