Optimization and Learning via Genetic Programming


1 Optimization and Learning via Genetic Programming
AI Project #2
Cho, Dong-Yeon, Biointelligence Lab

2 Optimization of Boolean Functions (1/2)
Parity Functions
Even-2 parity function
Function set F = {AND, OR, NAND, NOR}
Terminal set T = {D0, D1}
Fitness function: the number of fitness cases for which the individual is incorrect
[Figure: truth table for even-2 parity (D0, D1, Out) and an example solution tree, (OR (NOR D1 D0) (AND D0 D1))]
© 2005 SNU CSE Biointelligence Lab
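The fitness evaluation described on this slide can be sketched in Python: enumerate all 2^n fitness cases and count the wrong answers. This is only a sketch; `candidate` is a hypothetical hand-coded individual corresponding to the example tree.

```python
from itertools import product

# Hypothetical candidate corresponding to the tree (OR (NOR D1 D0) (AND D0 D1)),
# which happens to compute even-2 parity exactly.
def candidate(d0, d1):
    return (not (d1 or d0)) or (d0 and d1)

def even_parity_fitness(individual, n_inputs=2):
    """Fitness = number of fitness cases the individual answers incorrectly
    (0 is a perfect score)."""
    errors = 0
    for bits in product([False, True], repeat=n_inputs):
        target = (sum(bits) % 2 == 0)          # even parity of the inputs
        if bool(individual(*bits)) != target:
            errors += 1
    return errors
```

Raising `n_inputs` gives the even-3, even-4, … variants listed later on the Experiments slide.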

3 Optimization of Boolean Functions (2/2)
6-Multiplexer
2 address bits, 4 data bits
Function set F = {AND, OR, NOT, IF}
Terminal set T = {A0, A1, D0, D1, D2, D3}
Fitness function: the number of fitness cases for which the individual is incorrect
[Figure: truth-table excerpt and an example solution tree built from IF, AND, and the address/data terminals]
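The 6-multiplexer fitness works the same way over all 2^6 = 64 fitness cases. A sketch; treating A1 as the high address bit is an assumption, not something the slide specifies:

```python
from itertools import product

def mux6_target(a0, a1, d0, d1, d2, d3):
    # The two address bits select one of the four data lines
    # (a1 as the high bit is an assumed convention).
    return (d0, d1, d2, d3)[a1 * 2 + a0]

def mux6_fitness(individual):
    """Number of the 64 fitness cases the individual answers incorrectly."""
    errors = 0
    for bits in product([0, 1], repeat=6):
        if individual(*bits) != mux6_target(*bits):
            errors += 1
    return errors
```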

4 Learning a Classifier
Pima Indian Diabetes
Functions: numerical and conditional operators {+, -, *, /, exp, log, sin, cos, sqrt, iflte, ifltz, …}
Some operators must be protected against illegal operations (e.g., division by zero).
Terminals: input features and constants {x0, x1, …, x7, R}, where R ∈ [a, b]
Additional parameters: the threshold value and the normalization settings
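The "protected" operators mentioned above are conventionally implemented by intercepting illegal arguments. A sketch; the fallback values (1.0 for division, 0.0 for log) are a common GP convention, not something mandated by the slides:

```python
import math

def protected_div(a, b):
    # Return 1.0 when dividing by (near) zero, a common GP convention.
    return a / b if abs(b) > 1e-9 else 1.0

def protected_log(a):
    # Log of the absolute value; 0.0 for (near) zero arguments.
    return math.log(abs(a)) if abs(a) > 1e-9 else 0.0

def protected_sqrt(a):
    # Square root of the absolute value, so negative inputs are legal.
    return math.sqrt(abs(a))

def iflte(a, b, x, y):
    # Conditional operator: x if a <= b, else y.
    return x if a <= b else y
```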

5 Cross Validation (1/3)
K-fold Cross Validation
The data set is randomly divided into k subsets.
One of the k subsets is used as the test set, and the other k-1 subsets are put together to form the training set.
[Figure: the 768 samples split into six folds D1-D6 of 128 samples each; in each round a different fold serves as the test set.]
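A minimal sketch of the k-fold split described above. Fold sizes come out equal only when k divides the sample count, as with 768 = 6 × 128 here; any leftover samples are silently dropped in this sketch.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition sample indices into k folds and yield
    (train_indices, test_indices) pairs, each fold serving once as the test set."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    fold_size = n_samples // k   # samples beyond k * fold_size are dropped
    folds = [idx[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```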

6 Cross Validation (2/3)
Calculation of the error
Confusion matrix (rows: true class, columns: predicted class):
                  Predicted Positive   Predicted Negative
True Positive            TP                   FN
True Negative            FP                   TN
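From the confusion matrix, the error is the fraction of off-diagonal cases (false positives plus false negatives). A sketch for binary 0/1 labels:

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) counts for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def error_rate(y_true, y_pred):
    """Fraction of misclassified cases: (FP + FN) / total."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (fp + fn) / (tp + fp + fn + tn)
```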

7 Cross Validation (3/3)
Cross validation and the confusion matrix
Perform at least 10 runs for your k value.
Show the confusion matrix for the best result of your experiments.
[Table template: test error for runs 1, 2, …, 10, plus the average.]

8 Initialization
Maximum initial depth of trees Dmax is set.
Full method (each branch has depth = Dmax):
nodes at depth d < Dmax randomly chosen from function set F
nodes at depth d = Dmax randomly chosen from terminal set T
Grow method (each branch has depth <= Dmax):
nodes at depth d < Dmax randomly chosen from F ∪ T
nodes at depth d = Dmax randomly chosen from T
Common GP initialisation: ramped half-and-half, where the grow and full methods each deliver half of the initial population
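The full, grow, and ramped half-and-half procedures above can be sketched as follows. The function and terminal sets are taken from the parity problem; encoding trees as nested tuples is an implementation choice, and the ramp over depths 2..Dmax follows the usual convention.

```python
import random

# Function arities and terminals assumed from the parity problem (slide 2).
FUNCTIONS = {'AND': 2, 'OR': 2, 'NAND': 2, 'NOR': 2}
TERMINALS = ['D0', 'D1']

def gen_tree(max_depth, method, rng):
    """Generate one tree as nested tuples, e.g. ('AND', 'D0', ('OR', 'D0', 'D1'))."""
    # At the depth limit always emit a terminal; under 'grow', nodes above the
    # limit are drawn uniformly from F ∪ T, so a terminal may appear early.
    p_terminal = len(TERMINALS) / (len(FUNCTIONS) + len(TERMINALS))
    if max_depth == 0 or (method == 'grow' and rng.random() < p_terminal):
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    return (name,) + tuple(gen_tree(max_depth - 1, method, rng)
                           for _ in range(FUNCTIONS[name]))

def ramped_half_and_half(pop_size, max_depth, seed=0):
    """Half the population from the full method, half from grow,
    with depth limits ramped from 2 up to max_depth."""
    rng = random.Random(seed)
    return [gen_tree(2 + i % (max_depth - 1),
                     'full' if i % 2 == 0 else 'grow', rng)
            for i in range(pop_size)]
```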

9 Selection (1/2)
Fitness-proportional (roulette wheel) selection
The roulette wheel can be constructed as follows:
1. Calculate the total fitness of the population.
2. Calculate the selection probability pk for each chromosome vk.
3. Calculate the cumulative probability qk for each chromosome vk.

10 Procedure: Proportional_Selection
Generate a random number r from the range [0, 1].
If r <= q1, select the first chromosome v1; otherwise, select the k-th chromosome vk (2 <= k <= pop_size) such that qk-1 < r <= qk.
[Table template: pk and qk for chromosomes 1 through 10.]
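The procedure above, sketched in Python. Note that for this project the raw fitness is an error count (lower is better), so in practice one would first convert it to a "bigger is better" score before building the wheel; the sketch assumes the fitnesses passed in are already of that kind.

```python
import random
from itertools import accumulate

def roulette_select(population, fitnesses, rng):
    """Fitness-proportional selection: build the cumulative probabilities
    q_k and return the first chromosome v_k with q_{k-1} < r <= q_k."""
    total = sum(fitnesses)
    q = list(accumulate(f / total for f in fitnesses))   # cumulative p_k
    r = rng.random()
    for individual, qk in zip(population, q):
        if r <= qk:
            return individual
    return population[-1]        # guard against floating-point round-off
```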

11 Selection (2/2)
Tournament selection
Tournament size q, with 2 <= q <= POP_SIZE
Ranking-based selection
Linear ranking with expected offspring counts η+ (best) and η- (worst), where 1 <= η+ <= 2 and η- = 2 - η+
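Tournament selection is simple to implement and works directly with error values (lower is better), so no fitness transformation is needed. A sketch:

```python
import random

def tournament_select(population, errors, q, rng):
    """Draw q distinct individuals uniformly at random and return the one
    with the lowest error (fitness here is an error count: lower is better)."""
    contestants = rng.sample(range(len(population)), q)
    best = min(contestants, key=lambda i: errors[i])
    return population[best]
```

Larger q means stronger selection pressure: with q = POP_SIZE the best individual always wins, while q = 2 gives the mildest pressure the slide's constraint allows.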

12 GP Flowchart
[Figure: flowchart of the run, marking the generic GA loop and the GP-specific loop.]

13 Bloat
Bloat = "survival of the fattest": the tree sizes in the population increase over time.
There is ongoing research and debate about the reasons.
Countermeasures are needed, e.g.:
Prohibiting variation operators that would deliver "too big" children
Parsimony pressure: a fitness penalty for being oversized
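Parsimony pressure can be as simple as adding a size-proportional penalty to the raw error. A sketch over trees encoded as nested tuples; the penalty weight `alpha` is an assumed parameter, not given by the slides:

```python
def tree_size(tree):
    """Node count of a tree encoded as nested tuples, e.g. ('AND', 'D0', 'D1')."""
    if isinstance(tree, str):        # terminal node
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])

def penalized_fitness(raw_error, tree, alpha=0.01):
    """Parsimony pressure: raw error plus a size-proportional penalty, so an
    oversized tree scores worse than an equally accurate smaller one.
    The weight alpha is an assumed tuning parameter."""
    return raw_error + alpha * tree_size(tree)
```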

15 Experiments
Two optimization problems
Parity: even(odd)-3, even(odd)-4, even(odd)-5, …
11-multiplexer: 3 address bits, 8 data bits
One classification problem
Pima Indian diabetes, with cross validation
Various experimental setups
Termination condition: maximum_generation
At least 10 runs per setting
Effects of the penalty term
Selection methods and their parameters
Different function and terminal sets
Crossover and mutation probabilities
…

16 Results
For each problem:
Result table and your analysis
Present the optimal function.
Compare with the results of neural networks.
Draw a learning curve for the run where the best solution was found.
[Table template: training (optimization) and test (classification) results for Settings 1-3, reporting Average ± SD, Best, and Worst.]

17 [Figure: learning-curve template plotting fitness (error) against generation.]

18 References
Source codes: GP libraries (C, C++, Java, …), MATLAB toolboxes
Web sites

19 Pay Attention!
Due: Nov. 1, 2005
Submission
Source code and executable file(s)
Proper comments in the source code
Report: hardcopy!!
Running environments
Results for many experiments with various parameter settings
Analysis and explanation of the results in your own way

