Automated discovery in math

Presentation transcript:

Automated discovery in math
- Machine learning techniques such as genetic programming (GP) and inductive logic programming (ILP) have been successfully applied in science
- How about mathematics? Can they be used to discover interesting relationships in mathematical "data"?
- This is an exploration of using GP for that purpose
- Specifically, using GP to automatically discover Euler's identity for polyhedra (V - E + F = 2) from a fairly limited amount of data

Cubes
V = 8, E = 12, F = 6
V - E + F = 8 - 12 + 6 = 2

Tetrahedra
V = 4, E = 6, F = 4
V - E + F = 4 - 6 + 4 = 2

Octahedra
V = 6, E = 12, F = 8
V - E + F = 6 - 12 + 8 = 2

Data for Euler's identity

   Polyhedron           V   E   F
1  Cube                 8  12   6
2  Triangular prism     6   9   5
3  Pentagonal prism    10  15   7
4  Square pyramid       5   8   5
5  Triangular pyramid   4   6   4
6  Pentagonal pyramid   6  10   6
7  Octahedron           6  12   8
8  Tower                ?   ?   ?
9  Truncated cube      10  15   7

At a glance
- 50 generations
- Population: 4000 ASTs
- Individuals created per generation (G): 3600 (90% of population)
- Maximum AST depth: 13
- Ramped half-and-half initialization
- 3 non-terminals: +, -, *
- 12 terminals: V, E, F, 1, 2, …, 9
- Crossover, no mutation

Genetic algorithms (GA)
- Search a space of solution attempts ("individuals")
- Use natural selection to guide the search
- Must have a fitness function that can evaluate any given individual
- Individuals procreate by exchanging (recombining) "genetic material"

Example: SAT solving
- Problem: given a CNF formula P over n variables x_1, …, x_n, find a satisfying assignment
- Search space: all n-bit strings
- Fitness measure for a given individual b_1 … b_n: number of satisfied clauses in P
- Genetic operations: crossover and mutation
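As a minimal sketch of the fitness measure on this slide, the function below counts the clauses satisfied by an n-bit individual. The clause encoding (signed integers in the DIMACS style) and the names `sat_fitness` and `formula` are assumptions made for illustration; the slides do not specify a representation.

```python
# A CNF formula is a list of clauses; each clause is a list of non-zero ints.
# Literal k stands for x_k and -k for NOT x_k (assumed, DIMACS-style encoding).
def sat_fitness(clauses, bits):
    """Count the clauses satisfied by the assignment where x_k = bits[k - 1]."""
    def literal_true(lit):
        value = bits[abs(lit) - 1]
        return value if lit > 0 else not value
    return sum(any(literal_true(lit) for lit in clause) for clause in clauses)

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(sat_fitness(formula, [True, True, False]))   # 3: every clause is satisfied
print(sat_fitness(formula, [True, False, True]))   # 2: the last clause fails
```

An individual whose fitness equals the number of clauses is a satisfying assignment, which is the halting condition of the search.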

Crossover (at position j):
  Parents:  a_1 … a_{j-1} | a_j … a_n   +   b_1 … b_{j-1} | b_j … b_n
  Children: a_1 … a_{j-1} | b_j … b_n   and   b_1 … b_{j-1} | a_j … a_n
Mutation: [figure]
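The two genetic operations on bit strings can be sketched as follows. One-point crossover follows the parent/child pattern shown on the slide. The mutation operator is an assumption: independent bit flips are a common choice, but the slide's mutation diagram is not available here. The function names and the `rate` parameter are hypothetical.

```python
import random

def one_point_crossover(a, b, j):
    """Swap the tails of two equal-length bit strings at 0-based cut index j."""
    return a[:j] + b[j:], b[:j] + a[j:]

def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate` (assumed operator)."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]

a = [1, 1, 1, 1, 1]
b = [0, 0, 0, 0, 0]
child1, child2 = one_point_crossover(a, b, 2)
print(child1, child2)   # [1, 1, 0, 0, 0] [0, 0, 1, 1, 1]

mutated = bit_flip_mutation(a, 0.2, random.Random(0))
```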

Generic GA algorithm
1. Construct a random initial population
2. Set i := 1
3. If i > N then halt
4. Compute the fitness of each individual; if the fittest solves the problem, halt
5. Create a new population:
   1. Pick P - G individuals and copy them
   2. Create G new individuals by repeated applications of genetic operations
6. Set i := i + 1 and go to step 3
Parameterized over: N (number of generations), P (population size), G (new individuals per generation)
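The loop above can be sketched as a higher-order function. This is an illustration under stated assumptions rather than the presenter's implementation: the callback names (`init`, `fitness`, `solved`, `breed`, `select`) are hypothetical, and returning the best individual when the generation limit is reached is an added convenience, since the slide only says "halt".

```python
import random

def run_ga(init, fitness, solved, breed, select, N, P, G, rng):
    """Generic GA: N generations, population size P, G new individuals per generation."""
    population = [init(rng) for _ in range(P)]                    # step 1
    for _ in range(N):                                            # steps 2, 3 and 6
        scored = [(fitness(ind), ind) for ind in population]      # step 4
        best_score, best = max(scored, key=lambda pair: pair[0])
        if solved(best_score):
            return best
        copies = [select(scored, rng) for _ in range(P - G)]      # step 5.1
        children = []
        while len(children) < G:                                  # step 5.2
            children.extend(breed(select(scored, rng), select(scored, rng), rng))
        population = copies + children[:G]
    return max(population, key=fitness)

# Toy usage: evolve a 12-bit string with as many 1s as possible.
def tournament(scored, rng, k=3):
    return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]

def crossover(a, b, rng):
    j = rng.randrange(1, len(a))
    return [a[:j] + b[j:], b[:j] + a[j:]]

best = run_ga(
    init=lambda r: [r.randint(0, 1) for _ in range(12)],
    fitness=sum,
    solved=lambda score: score == 12,
    breed=crossover, select=tournament,
    N=50, P=40, G=36, rng=random.Random(0))
```

The toy run uses G = 36 out of P = 40, the same 90% ratio as the experiment's 3600 out of 4000.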

Selection
- How is an individual "picked" for reproduction or copying?
- Main idea: the probability that an individual is selected should be proportional to the individual's fitness
- Many ways to ensure that. One method is tournament selection:
  - Pick k individuals randomly, where 0 < k <= P
  - Select the fittest of the k
- When k = 1: no selection pressure
- When k = P: too much selection pressure
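Tournament selection as described above fits in a few lines. The sketch assumes the k contestants are drawn without replacement, which the slide does not specify; the name `tournament_select` and the toy population are hypothetical.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Pick k distinct individuals uniformly at random and return the fittest of them."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

rng = random.Random(1)
population = [3, 9, 4, 7, 1]   # toy individuals whose fitness is their own value

# k = P: the overall fittest individual always wins (maximum selection pressure).
print(tournament_select(population, lambda x: x, len(population), rng))   # 9

# k = 1: a uniformly random individual is returned (no selection pressure).
anyone = tournament_select(population, lambda x: x, 1, rng)
```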

Genetic Programming (GP)
- An instance of the generic GA scheme
- Individuals are now programs, i.e., syntactic objects
- Search space is kept finite by bounding program size
- Programs are represented as ASTs (abstract syntax trees)

Programs as ASTs
Program text:
  if x > 0 then
    y := x * x
  else
    y := z + 1
Parsing yields the AST (written here in prefix form):
  if(>(x, 0), :=(y, *(x, x)), :=(y, +(z, 1)))

Program structure in GP
- Programs are usually simple Herbrand terms, i.e., functional expressions
- AST leaves are called terminals
- Internal nodes are non-terminals
- Non-terminals are function symbols (e.g. +)
- Terminals are constants and variables
- Terminals + non-terminals must be sufficient for expressing solutions

Viewing a functional AST as a "program"
AST: +(*(x, 2), y), i.e. (x * 2) + y
- The program has two "inputs", x and y. Given specific values for these, it produces a unique result as output
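To make the "AST as program" view concrete, here is a small evaluator. The tuple-based tree representation and the names `OPS` and `evaluate` are assumptions for illustration; the example tree is read here as (x * 2) + y.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(ast, env):
    """Evaluate a functional AST.
    Tuples are (op, left, right), strings are variables, ints are constants."""
    if isinstance(ast, tuple):
        op, left, right = ast
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(ast, str):
        return env[ast]
    return ast

program = ("+", ("*", "x", 2), "y")
print(evaluate(program, {"x": 3, "y": 4}))   # 10
```

Supplying different values for x and y runs the same "program" on different inputs.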

AST crossover
[Figure: two parent ASTs built from subterms T1, …, T6, with crossover points 1 and 2; swapping the subtrees at those points yields the two children]

Initial population
- Built randomly
- Two methods for building a random AST:
  - Full method: all branches are equally long
  - Grow method: different subtrees can have different sizes (up to the maximum)
- More usual: ramped half-and-half initialization: half of the trees are built with one method, the other half with the other method
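A sketch of the two construction methods and of ramped half-and-half, using the function and terminal sets of this experiment. Several details are assumptions: the tuple representation, the 50% chance of stopping early in the grow method, and the "ramping" of depth limits over 1..max_depth, which is the usual meaning of the term but is not spelled out on the slide.

```python
import random

FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["V", "E", "F"] + list(range(1, 10))

def random_tree(depth, full, rng):
    """Build a random AST with at most `depth` levels of function nodes.
    full=True: every branch reaches `depth`. full=False (grow): branches may stop early."""
    stop_early = not full and rng.random() < 0.5   # assumed 50% chance of a terminal
    if depth == 0 or stop_early:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            random_tree(depth - 1, full, rng),
            random_tree(depth - 1, full, rng))

def ramped_half_and_half(size, max_depth, rng):
    """Alternate full and grow trees while cycling the depth limit over 1..max_depth."""
    return [random_tree(1 + (i // 2) % max_depth, full=(i % 2 == 0), rng=rng)
            for i in range(size)]

def depth(ast):
    """Number of function-node levels in the tree (0 for a bare terminal)."""
    return 1 + max(depth(ast[1]), depth(ast[2])) if isinstance(ast, tuple) else 0

population = ramped_half_and_half(20, 4, random.Random(0))
```

Mixing both methods over a range of depths gives the initial population a variety of tree shapes and sizes.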

Problem formulation
- Can cast it as a standard symbolic regression problem
- View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth)
- Error function: difference between actual # of faces and the result produced by the program
- Optimization: minimize the error
- Quick convergence
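The error function for the regression formulation might look like the sketch below. The data are the three polyhedra worked out on the earlier slides; summing absolute differences over the data is an assumption, since the slide only says "difference". The names `regression_error`, `exact` and `poor` are hypothetical.

```python
# (V, E, F) for the cube, tetrahedron and octahedron from the earlier slides.
DATA = [(8, 12, 6), (4, 6, 4), (6, 12, 8)]

def regression_error(program, data):
    """Sum over the data of |actual F - program(V, E)| (assumed aggregation)."""
    return sum(abs(f - program(v, e)) for v, e, f in data)

exact = lambda v, e: e - v + 2   # F = E - V + 2, a rearrangement of V - E + F = 2
poor = lambda v, e: v + e        # an arbitrary bad candidate

print(regression_error(exact, DATA))   # 0
print(regression_error(poor, DATA))    # 30
```

A candidate with zero error on every data point is a perfect fit, so the search minimizes this value.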

Another approach
- Search space of all identities
- Generated as follows:
    I → T1 = T2
    T → L | T1 + T2 | T1 - T2 | T1 * T2
    L → V | E | F | 1 | 2 | … | 9
- Any other integer can be built from 1, …, 9 and the given non-terminals
- Identity is not a non-terminal; it can only appear at the root of an AST

Details
- Generate P identities randomly (using ramped half-and-half initialization)
- Crossover on two identities S1 = S2 and T1 = T2:
  - Mate two random subterms Si and Tj, one from each identity, producing two new subterms Si' and Tj'
  - If either new term is deeper than the max depth, then use one of the original parents
  - Replace Si and Tj in the identities by Si' and Tj'
- No mutation
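The crossover step on identities can be sketched as follows. Identities are modelled as (lhs, rhs) pairs of tuple-based ASTs, which is an assumed representation, and all function names are hypothetical. The depth rule is interpreted as "a child that is too deep is replaced by its own parent"; the slide's wording admits other readings.

```python
import random

def paths(ast, prefix=()):
    """Yield the path (tuple of child indices) to every node of the AST."""
    yield prefix
    if isinstance(ast, tuple):
        for i in (1, 2):
            yield from paths(ast[i], prefix + (i,))

def get(ast, path):
    for i in path:
        ast = ast[i]
    return ast

def put(ast, path, new):
    """Return a copy of ast with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    node = list(ast)
    node[path[0]] = put(ast[path[0]], path[1:], new)
    return tuple(node)

def depth(ast):
    return 1 + max(depth(ast[1]), depth(ast[2])) if isinstance(ast, tuple) else 0

def mate(s, t, max_depth, rng):
    """Swap one random subtree of s with one of t; keep the parent if a child is too deep."""
    ps, pt = rng.choice(list(paths(s))), rng.choice(list(paths(t)))
    s_new, t_new = put(s, ps, get(t, pt)), put(t, pt, get(s, ps))
    return (s_new if depth(s_new) <= max_depth else s,
            t_new if depth(t_new) <= max_depth else t)

def cross_identities(id1, id2, max_depth, rng):
    """Mate one randomly chosen side of each identity and put the results back."""
    i, j = rng.randrange(2), rng.randrange(2)
    s_new, t_new = mate(id1[i], id2[j], max_depth, rng)
    child1, child2 = list(id1), list(id2)
    child1[i], child2[j] = s_new, t_new
    return tuple(child1), tuple(child2)

id1 = (("-", "V", "E"), 2)      # V - E = 2
id2 = (("+", "F", 1), "V")      # F + 1 = V
child1, child2 = cross_identities(id1, id2, 13, random.Random(0))
```

Because "=" never takes part in the swap, it stays at the root of every child, as the grammar requires.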

Fitness
- An identity is evaluated on a given triple of values for V, E, and F
- Computing the fitness of an identity S = T:
  - For each of the k data triples p: if S = T holds for p, give the identity a point
- Higher score, greater fitness
- Maximum fitness: 9, minimum: 0

Problem
- Trivially true identities can get perfect scores, e.g.:
  - V = V
  - 2 = 5 - 3
  - E - E + E = E
- Solution: negative triples, e.g.:
  - V = 0, E = 0, F = 1
- Trivial identities will hold for such negative triples, but plausible identities will not
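The problem is easy to reproduce. In the sketch below, identities are (lhs, rhs) pairs of tuple-based ASTs (an assumed representation with hypothetical helper names), the data are the three polyhedra worked out on the earlier slides, and the negative triple is the one given above.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(term, v, e, f):
    """Evaluate a term for the given values of V, E and F."""
    if isinstance(term, tuple):
        op, left, right = term
        return OPS[op](evaluate(left, v, e, f), evaluate(right, v, e, f))
    return {"V": v, "E": e, "F": f}.get(term, term)   # variable lookup, else constant

def holds(identity, triple):
    lhs, rhs = identity
    return evaluate(lhs, *triple) == evaluate(rhs, *triple)

DATA = [(8, 12, 6), (4, 6, 4), (6, 12, 8)]   # cube, tetrahedron, octahedron
NEGATIVE = (0, 0, 1)                          # not a real polyhedron

trivial = ("V", "V")                          # V = V
euler = (("+", ("-", "V", "E"), "F"), 2)      # V - E + F = 2

def basic_score(identity):
    """One point per data triple on which the identity holds."""
    return sum(holds(identity, p) for p in DATA)

print(basic_score(trivial), basic_score(euler))            # 3 3
print(holds(trivial, NEGATIVE), holds(euler, NEGATIVE))    # True False
```

Both identities score perfectly on the real data, but only the trivial one still holds on the negative triple, which is what lets the fitness function tell them apart.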

Fitness computation
- To evaluate an identity S = T, for each of the k data triples p:
  - Allocate a point if S = T holds for p
  - Allocate a second point if S = T does not hold for the negative triple
- Maximum score: 18, minimum: 0
- Also impose a penalty of ⌊n/20⌋ points for an identity of length n (to discourage excessively long expressions)
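Putting the pieces together, the fitness computation might be sketched as below. Several points are assumptions and not taken from the slides: each data triple is paired with its own negative triple, the second and third negative triples are made up for illustration (only V = 0, E = 0, F = 1 comes from the slides), and the length n is taken to be the number of AST nodes including the "=" at the root.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(term, v, e, f):
    if isinstance(term, tuple):
        op, left, right = term
        return OPS[op](evaluate(left, v, e, f), evaluate(right, v, e, f))
    return {"V": v, "E": e, "F": f}.get(term, term)

def holds(identity, triple):
    lhs, rhs = identity
    return evaluate(lhs, *triple) == evaluate(rhs, *triple)

def size(term):
    """Number of nodes in a term."""
    return 1 + size(term[1]) + size(term[2]) if isinstance(term, tuple) else 1

def fitness(identity, data, negatives):
    lhs, rhs = identity
    score = 0
    for p, neg in zip(data, negatives):
        score += holds(identity, p)          # first point: holds on real data
        score += not holds(identity, neg)    # second point: fails on the negative triple
    n = size(lhs) + size(rhs) + 1            # assumed length: all nodes plus the "=" root
    return score - n // 20                   # penalty of floor(n / 20)

DATA = [(8, 12, 6), (4, 6, 4), (6, 12, 8)]        # cube, tetrahedron, octahedron
NEGATIVES = [(0, 0, 1), (8, 12, 7), (4, 6, 5)]    # last two are hypothetical

euler = (("+", ("-", "V", "E"), "F"), 2)
trivial = ("V", "V")
print(fitness(euler, DATA, NEGATIVES))     # 6: the maximum for three data triples
print(fitness(trivial, DATA, NEGATIVES))   # 3: no points from the negative triples
```

With the nine data triples of the experiment, the same scheme gives the stated maximum of 18.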