Genetic Algorithms Overview Genetic Algorithms: a gentle introduction –What are GAs –How do they work/ Why? –Critical issues Use in Data Mining –GAs.



2 Genetic Algorithms

3 Overview Genetic Algorithms: a gentle introduction –What are GAs –How do they work/ Why? –Critical issues Use in Data Mining –GAs and statistics –decile performance maximization –multi-objective models

4 Natural Genetics to AI Computational models inspired by biological evolution –survival of the fittest –reproduction through cross-breeding

5 Genetic Algorithms Population-based search (parallel) –simultaneous search from multiple points in search space –useful in complex, unstructured search spaces (less prone to local failures) Population members: potential solutions Population of solutions evolves from one generation to the next

6 Genetic Algorithms Search objective –Fitness score for population members (fitness function) Survival of the fittest –selection Generating new solutions –“Mating” and reproduction of individuals (crossover, mutation)

7 Basic Operation [Figure: Generation t passes through Selection and Recombination (Crossover, Mutation) to produce Generation t+1]
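The generation-to-generation cycle on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: the `one_max` fitness, the one-point crossover, and all parameter defaults are assumptions chosen only to keep the example small.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): the number of 1 bits."""
    return sum(bits)

def run_ga(fitness, n_genes=20, pop_size=30, generations=40,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Evolve bit strings; pop_size is assumed to be even."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Selection: fitness-proportionate sampling with replacement.
        # The small constant keeps the weights valid if every score is zero.
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        next_pop = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]  # copy so parents are never modified in place
            # Recombination: one-point crossover with some probability.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Mutation: flip each gene with a small probability.
            for child in (a, b):
                for i in range(n_genes):
                    if rng.random() < mutation_rate:
                        child[i] ^= 1
                next_pop.append(child)
        pop = next_pop  # generation t+1 replaces generation t
        best = max(pop + [best], key=fitness)  # remember the best seen so far
    return best
```

Calling `run_ga(one_max)` returns the best bit string seen over the run; a different `seed` gives a different random stream.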

8 GAs: Parallel Search [Figure: fitness plotted against x, showing GA population points (X) and a hill climber]

9 GAs: Basic Principles Representation of individuals –String of parameters (genes): chromosome, e.g., to optimize a function F(p,q,r,s,t), population members are strings p q r s t –genotype and phenotype

10 Binary representation? Population members as bit strings F(p,q,r,s,t) as: 1001 1010 1101 1001 1010 (one group each for p q r s t) –early theory in terms of binary strings (schema theorem) –unnecessary perversity?
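To make the genotype-to-phenotype step concrete, the sketch below decodes the slide's 20-bit string into five parameter values. The slide does not state the encoding, so the even split into 4-bit unsigned integers is an assumption, and `decode` is a hypothetical helper name.

```python
def decode(bits, n_params=5):
    """Split a bit list into n_params equal chunks and read each as an
    unsigned integer (assumed encoding; the slide does not specify one)."""
    width = len(bits) // n_params
    values = []
    for k in range(n_params):
        chunk = bits[k * width:(k + 1) * width]
        values.append(int("".join(str(b) for b in chunk), 2))
    return values

# The bit string shown on the slide (genotype).
chromosome = [1, 0, 0, 1,  1, 0, 1, 0,  1, 1, 0, 1,  1, 0, 0, 1,  1, 0, 1, 0]

# Under the assumed encoding, the phenotype is the argument list of F.
p, q, r, s, t = decode(chromosome)
print(p, q, r, s, t)  # 9 10 13 9 10
```

A real application would usually map each integer onto the parameter's actual range before evaluating F.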

11 GAs: Basic Principles Survival of the fittest (Fitness function) –numerical “figure of merit”/utility measure of an individual –tradeoff among multiple evaluation criteria –efficient evaluation

12 GAs: Basic Principles Iterative search –population evolves over generations Convergence –progression towards uniformity in population –premature convergence? (local optima)

13 Typical GA Run [Figure: best and average fitness plotted against generations]

14 Operators: Selection Fitness proportionate selection: the number of reproductive trials for individual i is f_i / f_avg, its fitness divided by the population's average fitness

15 Selection Roulette-wheel selection (stochastic sampling with replacement) – wheel spaced in proportion to fitness values –N (pop size) spins of the wheel Stochastic universal sampling –N equally spaced pins on wheel –single turn of the wheel
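The two sampling schemes on this slide can be compared side by side. The sketch below is illustrative only: the function names are hypothetical, fitness values are assumed to be non-negative with a positive sum, and each function returns the indices of the selected individuals.

```python
import random

def roulette_wheel(fitnesses, n, rng):
    """N independent spins: stochastic sampling with replacement."""
    total = sum(fitnesses)
    picks = []
    for _ in range(n):
        spin = rng.uniform(0, total)
        running = 0.0
        for i, f in enumerate(fitnesses):
            running += f
            if spin <= running:
                picks.append(i)
                break
        else:
            picks.append(len(fitnesses) - 1)  # guard against rounding error
    return picks

def stochastic_universal_sampling(fitnesses, n, rng):
    """One spin with N equally spaced pointers on the wheel."""
    total = sum(fitnesses)
    step = total / n
    start = rng.uniform(0, step)  # the single random turn of the wheel
    picks, i, running = [], 0, fitnesses[0]
    for k in range(n):
        pointer = start + k * step
        # Advance to the wheel slice that contains this pointer.
        while running < pointer and i < len(fitnesses) - 1:
            i += 1
            running += fitnesses[i]
        picks.append(i)
    return picks

rng = random.Random(1)
fitnesses = [1.0, 2.0, 3.0, 4.0]
rw = roulette_wheel(fitnesses, 10, rng)
sus = stochastic_universal_sampling(fitnesses, 10, rng)
```

With these fitness values and ten selections, the expected counts are 1, 2, 3, and 4. Stochastic universal sampling hits those counts on every run, while the roulette wheel only matches them on average.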

16 Selection Premature convergence –Fitness scaling: f' = f - (2*avg - max) –Ranked fitness –Elitism –Steady-state selection –Demetic grouping
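The scaling rule on this slide subtracts (2*avg - max) from every raw fitness, which leaves the best individual with exactly twice the average scaled fitness. A small sketch follows; clipping negative results to zero is an added assumption, not something the slide states, and `scale_fitness` is a hypothetical name.

```python
def scale_fitness(fitnesses):
    """Apply f' = f - (2*avg - max) to each raw fitness value.

    Negative results are clipped to zero (an assumption) so the scaled
    values can still be used as selection weights.
    """
    avg = sum(fitnesses) / len(fitnesses)
    best = max(fitnesses)
    shift = 2 * avg - best
    return [max(0.0, f - shift) for f in fitnesses]

raw = [10.0, 11.0, 12.0, 15.0]   # avg = 12, max = 15, shift = 9
scaled = scale_fitness(raw)
print(scaled)                    # [1.0, 2.0, 3.0, 6.0]
```

Here the scaled average is 3 and the scaled maximum is 6, so under fitness proportionate selection the best individual expects two reproductive trials however close the raw values are.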

17 Operators: Crossover Parent 1: axpsqvqbtpihd Parent 2: qzxxaycgbtphw crossover sites Offspring 1: azpsavcbtpphd Offspring 2: qxxxqyqgbtihw (Uniform crossover) combining good building blocks
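The offspring on this slide can be reproduced with a swap mask: at each position, the two children either keep their own parent's gene or exchange genes. The mask below was read off the slide's strings by comparing parents and offspring; `uniform_crossover` and `random_mask` are hypothetical helper names.

```python
import random

def uniform_crossover(p1, p2, swap_mask):
    """Exchange genes between two parents wherever the mask is set."""
    c1, c2 = [], []
    for g1, g2, swap in zip(p1, p2, swap_mask):
        if swap:
            g1, g2 = g2, g1
        c1.append(g1)
        c2.append(g2)
    return "".join(c1), "".join(c2)

def random_mask(length, rng, p_swap=0.5):
    """In a real run the mask is drawn at random for each mating."""
    return [rng.random() < p_swap for _ in range(length)]

parent1 = "axpsqvqbtpihd"
parent2 = "qzxxaycgbtphw"
# Crossover sites inferred from the slide: positions 2, 5, 7 and 11 (1-based).
mask = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0]

child1, child2 = uniform_crossover(parent1, parent2, mask)
print(child1)  # azpsavcbtpphd
print(child2)  # qxxxqyqgbtihw
```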

18 Operators: Mutation alters each gene with small probability Before: x 1 y x 0 y 0 y y 0 x y x y After: x 1 y x 0 y 1 y y 0 x x x y
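Per-gene mutation is easiest to show on a binary chromosome. The sketch below assumes bit genes and a flip as the mutation; the slide's example uses a mixed alphabet whose replacement rule is not stated, so it is not reproduced here.

```python
import random

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

rng = random.Random(42)
original = [1, 0, 0, 1, 1, 0, 1, 0]
mutant = mutate(original, rate=0.1, rng=rng)
```

Because every gene is tested separately, a mutant may differ from its parent in zero, one, or several positions.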

19 Non-Binary Representations Integer, real-number, order-based, rules,... Binary or Real-valued? real representations give faster, more consistent, more accurate results High-level representation –intuitive, can utilize specialized operators –effective search over complex spaces

20 Real-valued representation Parent1: 3.45 0.56 6.78 0.976 2.5 Parent2: 0.98 1.06 4.20 0.34 1.8 Offspring1: 3.22 0.56 6.78 0.65 2.12 Offspring2: 1.43 1.06 4.20 0.41 1.93 (Arithmetic crossover)
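One common form of arithmetic crossover takes a weighted average of the two parents, gene by gene. The sketch below uses that whole-vector form with a single weight `alpha`, which is an assumption: the slide's offspring appear to blend only some genes and copy others, so this code does not reproduce those exact numbers.

```python
def arithmetic_crossover(p1, p2, alpha):
    """Blend two real-valued parents with weight alpha in [0, 1]."""
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    c2 = [(1 - alpha) * a + alpha * b for a, b in zip(p1, p2)]
    return c1, c2

parent1 = [3.45, 0.56, 6.78, 0.976, 2.5]
parent2 = [0.98, 1.06, 4.20, 0.34, 1.8]
child1, child2 = arithmetic_crossover(parent1, parent2, alpha=0.7)
```

Each child gene lies between the corresponding parent genes, and the two children together preserve the parents' gene-wise sum.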

21 High-level representation [Figure: Parent1, Parent2, Offspring1, Offspring2]

22 High-level representation Generalize/Specialize

23 Tree-structured representation (GP) Example: (x * log(y))/5 as a parse tree with root /, whose children are * and 5; * has children x and log; log has child y. Automated learning of programs (originally) –parse tree expressions –non-linear interaction terms Function set: internal nodes {+,-,*,/,log} Terminal set: leaf nodes {constants, variables}
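A parse tree like the one on this slide can be held as nested tuples and evaluated recursively. This is a minimal sketch: the tuple layout, the `FUNCTIONS` table, and the `evaluate` name are illustrative assumptions, and practical GP systems usually add protected versions of `/` and `log` to avoid runtime errors.

```python
import math
import operator

# Function set: internal nodes of the tree.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "log": math.log,
}

def evaluate(node, env):
    """Evaluate a tree of (function, child, ...) tuples.

    Terminal set: strings are variables looked up in env, and anything
    else is treated as a constant.
    """
    if isinstance(node, tuple):
        op, *children = node
        return FUNCTIONS[op](*(evaluate(child, env) for child in children))
    if isinstance(node, str):
        return env[node]
    return node

# (x * log(y)) / 5, the expression from the slide.
tree = ("/", ("*", "x", ("log", "y")), 5)
value = evaluate(tree, {"x": 10.0, "y": math.e})
```

With x = 10 and y = e, the tree evaluates to 10 * 1 / 5 = 2.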

24 Tree-structured representation Representing complex patterns [Figure: parse tree with nodes if, AND, <, >, +, *, x, y, 0, 2, 7] If (y 2) then 0 else 2x+y

25 Genetic search: Issues Coding scheme, fitness function critical –the “art” in GA design! –General mechanism so robust that, within reasonable margins, parameter settings are not critical. Representation to match problem, domain –utilizing domain knowledge problem-specific crossover, mutation, selection Flexibility in fitness function formulation –modeling business objectives

26 Genetic search: Issues Stochastic search –initial populations, probabilistic operators –multiple runs with different random streams –Initializing population with known solutions –seeding initial population with solutions from multiple, independent runs
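Running a stochastic search several times with different random streams is simple to script. In the sketch below, `noisy_search` is a hypothetical stand-in (a bit-flip hill climber on a toy count-the-ones objective) used only so the example runs on its own; any GA that accepts a seed and returns a (solution, score) pair could take its place.

```python
import random

def noisy_search(seed, n_bits=16, steps=30):
    """Stand-in stochastic search (assumed for illustration only)."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_bits)]
    for _ in range(steps):
        candidate = current[:]
        candidate[rng.randrange(n_bits)] ^= 1  # flip one random bit
        if sum(candidate) >= sum(current):
            current = candidate
    return current, sum(current)

def best_of_runs(search, seeds):
    """Run the search once per seed and keep the best-scoring result."""
    results = [search(seed) for seed in seeds]
    scores = [score for _, score in results]
    return max(results, key=lambda result: result[1]), scores

(best_solution, best_score), all_scores = best_of_runs(noisy_search, range(5))
```

The spread of `all_scores` gives a rough sense of how sensitive the search is to its random stream, and the best solutions from such runs are natural candidates for seeding a later population.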

27 Genetic search: Issues Guarantees optimality? –But... GAs and traditional techniques –especially useful where traditional approaches fail –in conjunction with traditional techniques Parallelizable for large data –multi-processor, networked machines

28 Using GAs –When to use a GA? –GA and traditional techniques –How long does it take? –Will it perform better?

29 Using GAs –population size –mutation, crossover rates –how many generations –multiple runs

30 Is it a “black-box”? Huh? –Data characteristics –Fitness function –GA parameters

31 GA Application Examples Function optimizers –difficult, discontinuous, multi-modal, noisy functions Combinatorial optimization –layout of VLSI circuits, factory scheduling, traveling salesman problem Design and Control –bridge structures, neural networks, communication networks design; control of chemical plants, pipelines

32 GA Application Examples Machine learning –classification rules, economic modeling, scheduling strategies Portfolio design, optimized trading models, direct marketing models, sequencing of TV advertisements, adaptive agents, data mining, etc.

