
1 Automated discovery in math
- Machine learning techniques (GP, ILP, etc.) have been successfully applied in science
- How about mathematics? Can they be used to discover interesting relationships in mathematical “data”?
- This is an exploration of using GP for that purpose
- Specifically, using GP to automatically discover Euler’s identity (V – E + F = 2) from a fairly limited amount of data

2 Cubes
V = 8, E = 12, F = 6
V – E + F = 8 – 12 + 6 = 2

3 Tetrahedra
V = 4, E = 6, F = 4
V – E + F = 4 – 6 + 4 = 2

4 Octahedra
V = 6, E = 12, F = 8
V – E + F = 6 – 12 + 8 = 2

5 Data for Euler’s identity
   Polyhedron            V    E    F
1  Cube                  8   12    6
2  Triangular prism      6    9    5
3  Pentagonal prism     10   15    7
4  Square pyramid        5    8    5
5  Triangular pyramid    4    6    4
6  Pentagonal pyramid    6   10    6
7  Octahedron            6   12    8
8  Tower                 9   16    9
9  Truncated cube       10   15    7
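
The rows of the table can be checked mechanically. The following Python sketch is not part of the original slides; it encodes the nine rows and collects any row that violates V - E + F = 2. The names `POLYHEDRA` and `satisfies_euler` are illustrative.

```python
# Rows of the data table: (polyhedron, V, E, F).
POLYHEDRA = [
    ("Cube", 8, 12, 6),
    ("Triangular prism", 6, 9, 5),
    ("Pentagonal prism", 10, 15, 7),
    ("Square pyramid", 5, 8, 5),
    ("Triangular pyramid", 4, 6, 4),
    ("Pentagonal pyramid", 6, 10, 6),
    ("Octahedron", 6, 12, 8),
    ("Tower", 9, 16, 9),
    ("Truncated cube", 10, 15, 7),
]


def satisfies_euler(v, e, f):
    """Return True if the counts satisfy Euler's identity V - E + F = 2."""
    return v - e + f == 2


# Names of all rows that violate the identity (empty if every row satisfies it).
violations = [name for name, v, e, f in POLYHEDRA if not satisfies_euler(v, e, f)]
print(violations)
```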

6 At a glance
- 50 generations
- Population: 4000 ASTs
- New individuals per generation: 3600 (90% of population)
- Maximum AST depth: 13
- Ramped half-and-half initialization
- 3 non-terminals: +, -, *
- 12 terminals: V, E, F, 1, 2, …, 9
- Crossover, no mutation

7 Genetic algorithms (GA)
- Search a space of solution attempts (“individuals”)
- Use natural selection to guide the search
- Must have a fitness function that can evaluate any given individual
- Individuals procreate by exchanging (recombining) “genetic material”

8 Example: SAT solving
- Problem: Given a CNF formula P over n variables x_1, …, x_n, find a satisfying assignment
- Search space: all n-bit strings
- Fitness measure for a given individual b_1 … b_n: # of satisfied clauses in P
- Genetic operations: crossover and mutation
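
The fitness measure for the SAT example can be sketched as follows. The clause encoding is an assumption, not something the slides specify: each clause is a list of nonzero integers, where `k` stands for x_k and `-k` for its negation. The function name `sat_fitness` is illustrative.

```python
def sat_fitness(clauses, bits):
    """Count the clauses of a CNF formula satisfied by a bit string.

    clauses: list of clauses; each clause is a list of nonzero ints
             (assumed encoding: k means x_k, -k means NOT x_k).
    bits:    list of 0/1 values; bits[k - 1] is the value of x_k.
    """
    def literal_is_true(literal):
        value = bits[abs(literal) - 1] == 1
        return value if literal > 0 else not value

    # A clause is satisfied if at least one of its literals is true.
    return sum(1 for clause in clauses if any(literal_is_true(l) for l in clause))


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
print(sat_fitness(formula, [1, 0, 0]))
print(sat_fitness(formula, [1, 1, 0]))
```

An individual solves the problem exactly when its fitness equals the number of clauses.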

9 Crossover and mutation
Crossover:
  Parents:   a_1 … a_(j-1) | a_j … a_n   +   b_1 … b_(j-1) | b_j … b_n
  Children:  a_1 … a_(j-1) | b_j … b_n       b_1 … b_(j-1) | a_j … a_n
Mutation:
  0 1 1 0 1 0 0 1  →  0 1 1 0 0 0 0 1
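
For bit strings, both operations take only a few lines. This is a minimal sketch with illustrative names; the per-bit mutation `rate` is an assumed parameter, since the slide shows only a single flipped bit.

```python
import random


def one_point_crossover(a, b, j):
    """Exchange the tails of two equal-length bit lists, starting at index j (0-based)."""
    return a[:j] + b[j:], b[:j] + a[j:]


def mutate(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


child_1, child_2 = one_point_crossover([0, 1, 1, 0], [1, 0, 0, 1], 2)
print(child_1, child_2)
print(mutate([0, 1, 1, 0, 1, 0, 0, 1], rate=0.1, rng=random.Random(0)))
```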

10 Generic GA algorithm
1. Construct a random initial population
2. Set i := 1
3. If i > N then halt
4. Compute the fitness of each individual; if the fittest solves the problem, halt.
5. Create a new population:
   1. Pick P – G individuals and copy them
   2. Create G new individuals by repeated applications of genetic operations
6. Set i := i + 1 and go to step 3
Parameterized over: N (maximum number of generations), P (population size), G (number of new individuals per generation)
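
The numbered steps map directly onto a loop. The sketch below is one possible rendering, not the presenter's implementation: the problem-specific pieces (`new_individual`, `fitness`, `is_solution`, `select`, `breed`) are hypothetical callables supplied by the caller, and the toy problem at the end (maximize the number of ones in an 8-bit string) is chosen only for illustration.

```python
import random


def run_ga(new_individual, fitness, is_solution, select, breed, N, P, G, rng):
    """Generic GA loop; N, P and G are the parameters named on the slide."""
    population = [new_individual(rng) for _ in range(P)]          # step 1
    for _ in range(N):                                            # steps 2, 3 and 6
        best = max(population, key=fitness)                       # step 4
        if is_solution(best):
            return best
        copies = [select(population, rng) for _ in range(P - G)]  # step 5.1
        offspring = []
        while len(offspring) < G:                                 # step 5.2
            parent_a = select(population, rng)
            parent_b = select(population, rng)
            offspring.extend(breed(parent_a, parent_b, rng))
        population = copies + offspring[:G]
    # Generation limit reached: return the fittest individual found at the end.
    return max(population, key=fitness)


def count_ones(bits):
    return sum(bits)


best = run_ga(
    new_individual=lambda r: [r.randint(0, 1) for _ in range(8)],
    fitness=count_ones,
    is_solution=lambda bits: count_ones(bits) == 8,
    select=lambda pop, r: max(r.sample(pop, 3), key=count_ones),
    breed=lambda a, b, r: [a[:4] + b[4:], b[:4] + a[4:]],
    N=20, P=30, G=27, rng=random.Random(0),
)
print(best, count_ones(best))
```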

11 Selection
- How is an individual “picked” for reproduction or copying?
- Main idea: the probability that an individual is selected should be proportional to the individual’s fitness
- Many ways to ensure that. One method is tournament selection:
  - Pick k individuals at random, where 0 < k <= P
  - Select the fittest of the k
- When k = 1: No selection pressure
- When k = P: Too much selection pressure
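
Tournament selection can be sketched in a few lines. One assumption is made here: the k contestants are drawn without replacement, which the slide does not specify. The function name is illustrative.

```python
import random


def tournament_select(population, fitness, k, rng=random):
    """Pick k individuals at random (without replacement) and return the fittest."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)


population = [3, 1, 4, 1, 5]

# k = P: the fittest individual always wins, so selection pressure is maximal.
print(tournament_select(population, fitness=lambda x: x, k=len(population)))

# k = 1: the winner is a uniformly random individual, so there is no pressure.
print(tournament_select(population, fitness=lambda x: x, k=1, rng=random.Random(0)))
```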

12 Genetic Programming (GP)
- An instance of the generic GA scheme
- Individuals are now programs, i.e., syntactic objects
- Search space is kept finite by bounding program size
- Programs are represented as ASTs (abstract syntax trees)

13 Programs as ASTs
Program text:
    if x > 0 then
        y := x * x
    else
        y := z + 1
Parsing yields the AST (written as a nested term):
    if(>(x, 0), :=(y, *(x, x)), :=(y, +(z, 1)))

14 Program structure in GP
- Programs are usually simple Herbrand terms, i.e., functional expressions
- AST leaves are called terminals
- Internal nodes are non-terminals
- Non-terminals are function symbols (e.g. +)
- Terminals are constants and variables
- Terminals + non-terminals must be sufficient for expressing solutions

15 Viewing a functional AST as a “program”
[Figure: an AST with nodes +, *, x, 2, and y]
The program has two “inputs”, x and y. Given specific values for these, it produces a unique result as output
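
Evaluating such an AST is a short recursion. In this sketch an AST is a nested tuple `(operator, left, right)`, variables are strings, and constants are integers; that representation is an assumption made for illustration. The example tree reads the slide's figure as (x * 2) + y, which is also an assumption because the figure itself is not reproduced in this transcript.

```python
import operator

# The three function symbols used in this presentation.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(ast, env):
    """Evaluate a functional AST given a dict that maps variable names to values."""
    if isinstance(ast, tuple):        # internal node (non-terminal)
        op, left, right = ast
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(ast, str):          # variable terminal
        return env[ast]
    return ast                        # constant terminal


program = ("+", ("*", "x", 2), "y")
print(evaluate(program, {"x": 3, "y": 4}))
```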

16 AST crossover
[Figure: two parent ASTs, with crossover points 1 and 2 marked, and the two children obtained by exchanging the subtrees at those points. Node labels: *, +, -, T1 to T6]
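
Subtree crossover on ASTs can be sketched as follows, assuming binary trees stored as nested tuples `(operator, left, right)` with plain values as leaves. The helper names (`paths`, `get`, `put`, `crossover`) and the two example parents are illustrative and are not read off the slide's figure.

```python
import random


def paths(ast, prefix=()):
    """Yield the path to every node; a path is a tuple of child indices (1 or 2)."""
    yield prefix
    if isinstance(ast, tuple):
        for i in (1, 2):
            yield from paths(ast[i], prefix + (i,))


def get(ast, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        ast = ast[i]
    return ast


def put(ast, path, new):
    """Return a copy of `ast` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    children = list(ast)
    children[path[0]] = put(ast[path[0]], path[1:], new)
    return tuple(children)


def crossover(a, b, rng=random):
    """Pick one crossover point in each parent and exchange the subtrees there."""
    point_a = rng.choice(list(paths(a)))
    point_b = rng.choice(list(paths(b)))
    return put(a, point_a, get(b, point_b)), put(b, point_b, get(a, point_a))


parent_a = ("*", "T3", ("+", "T1", "T2"))
parent_b = ("-", "T4", ("+", "T5", "T6"))
print(crossover(parent_a, parent_b, random.Random(0)))
```

Because the two subtrees are exchanged rather than copied, the children together contain exactly the nodes of the two parents.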

17 Initial population
- Built randomly
- Two methods for building a random AST:
  - Full method: All branches are equally long
  - Grow method: Different subtrees can have different sizes (but less than the maximum)
- More usual: ramped half-and-half initialization: half of the trees are built with one method, the other half with the other method
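
The two tree-building methods and their combination can be sketched as below. Several details are assumptions rather than facts from the slides: the tuple representation of ASTs, the probability with which `grow` stops early, and the way the depth limit is "ramped" across a range from `min_depth` to `max_depth`.

```python
import random

FUNCTIONS = ["+", "-", "*"]                        # the 3 non-terminals
TERMINALS = ["V", "E", "F"] + list(range(1, 10))   # the 12 terminals


def full(depth, rng):
    """Full method: every branch has exactly the given depth."""
    if depth == 0:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), full(depth - 1, rng), full(depth - 1, rng))


def grow(depth, rng):
    """Grow method: a branch may stop early, so subtrees can differ in size."""
    stop_probability = len(TERMINALS) / (len(TERMINALS) + len(FUNCTIONS))
    if depth == 0 or rng.random() < stop_probability:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), grow(depth - 1, rng), grow(depth - 1, rng))


def ramped_half_and_half(size, min_depth, max_depth, rng):
    """Alternate between full and grow while cycling through the depth limits."""
    population = []
    for i in range(size):
        depth = min_depth + (i // 2) % (max_depth - min_depth + 1)
        build = full if i % 2 == 0 else grow
        population.append(build(depth, rng))
    return population


print(ramped_half_and_half(4, 1, 2, random.Random(0)))
```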

18 Problem formulation
- Can cast it as a standard symbolic regression problem
- View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth)
- Error function: difference between actual # of faces and the result produced by the program
- Optimization: minimize the error
- Quick convergence
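
The error function for the regression formulation can be sketched as follows. Summing absolute differences over the nine data rows is an assumption, since the slide says only "difference"; candidate programs are written as ordinary Python functions of V and E to keep the example short.

```python
# Data triples (V, E, F) from the table of polyhedra.
DATA = [(8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
        (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7)]


def regression_error(program, data=DATA):
    """Sum of absolute differences between the actual F and the predicted F."""
    return sum(abs(f - program(v, e)) for v, e, f in data)


# The target relationship F = E - V + 2 has zero error.
print(regression_error(lambda v, e: e - v + 2))
# A near miss, F = E - V, is off by 2 on every row.
print(regression_error(lambda v, e: e - v))
```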

19 Another approach
- Search space of all identities
- Generated as follows:
    I → T_1 = T_2
    T → L | T_1 + T_2 | T_1 – T_2 | T_1 * T_2
    L → V | E | F | 1 | 2 | … | 9
- Any other integer can be built from 1, …, 9 and the given non-terminals
- Identity is not a non-terminal; it can only appear at the root of an AST

20 Details
- Generate P identities randomly (using ramped half-and-half initialization)
- Crossover on two identities S_1 = S_2 and T_1 = T_2:
  - Mate two random subterms S_i and T_j from each identity, producing two new subterms S_i’ and T_j’
  - If either new term is deeper than the max depth, then use one of the original parents
  - Replace S_i and T_j in the identities by S_i’ and T_j’
- No mutation
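
One possible reading of this crossover scheme is sketched below. An identity is stored as a pair of terms `(left, right)`, and terms are nested tuples. Two points are assumptions: a child term that exceeds the depth limit is replaced by the parent term it was derived from, and the term-level operator `mate` is supplied by the caller. `swap_terms` is a stand-in used only so that the example runs; a real run would use a subtree crossover instead.

```python
import random


def depth(term):
    """Depth of a term; a terminal has depth 0."""
    if isinstance(term, tuple):
        return 1 + max(depth(term[1]), depth(term[2]))
    return 0


def cross_identities(ident_a, ident_b, mate, max_depth, rng):
    """Cross one randomly chosen side of each identity, respecting the depth limit."""
    i, j = rng.randrange(2), rng.randrange(2)
    s, t = ident_a[i], ident_b[j]
    s_new, t_new = mate(s, t, rng)
    if depth(s_new) > max_depth:   # too deep: fall back to the parent term
        s_new = s
    if depth(t_new) > max_depth:
        t_new = t
    child_a = tuple(s_new if k == i else ident_a[k] for k in range(2))
    child_b = tuple(t_new if k == j else ident_b[k] for k in range(2))
    return child_a, child_b


def swap_terms(s, t, rng):
    """Stand-in for a real term crossover: exchange the two terms wholesale."""
    return t, s


a = (("+", "V", "F"), "E")                         # V + F = E
b = (("*", 2, "E"), ("+", ("+", "V", 1), 3))       # 2 * E = (V + 1) + 3
print(cross_identities(a, b, swap_terms, max_depth=13, rng=random.Random(0)))
```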

21 Fitness
- An identity is evaluated on a given triple of values for V, E, and F
- Computing the fitness of an identity S = T:
  - For each of the k data triples p:
    - If S = T holds for p, then give the identity a point
- Higher score, greater fitness
- Maximum fitness: 9, minimum: 0

22 Problem
- Trivially true identities can get perfect scores, e.g.:
  - V = V
  - 1 + 2 = 5 – 2
  - E – E + E = E
- Solution: negative triples, e.g.:
  - V = 0, E = 0, F = 1
- Trivial identities will hold for such negative triples, but plausible identities will not

23 Fitness computation
- To evaluate an identity S = T:
  - For each of the k data triples p:
    - Allocate a point if S = T holds for p
    - Allocate a second point if S = T does not hold for the negative triple
- Maximum score: 18, minimum: 0
- Also impose a penalty of ⌊n/20⌋ points for an identity of length n (to discourage excessively long expressions)
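
The scoring rule can be sketched as follows. The slides do not say how the negative triples are chosen or how the length n is measured, so three assumptions are made here: each data triple is paired with a hypothetical negative triple obtained by adding 1 to F, the two points are awarded independently of each other, and n is the total number of nodes on both sides of the identity. Terms are nested tuples `(operator, left, right)`.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Data triples (V, E, F) from the table of polyhedra.
DATA = [(8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
        (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7)]
# Hypothetical negative triples (not from the slides): F is increased by 1.
NEGATIVES = [(v, e, f + 1) for v, e, f in DATA]


def evaluate(term, triple):
    """Evaluate a term on a (V, E, F) triple."""
    if isinstance(term, tuple):
        op, left, right = term
        return OPS[op](evaluate(left, triple), evaluate(right, triple))
    if isinstance(term, str):
        return triple["VEF".index(term)]
    return term


def length(term):
    """Number of nodes in a term (assumed meaning of an identity's length)."""
    if isinstance(term, tuple):
        return 1 + length(term[1]) + length(term[2])
    return 1


def fitness(identity, data=DATA, negatives=NEGATIVES):
    s, t = identity
    score = 0
    for p, negative in zip(data, negatives):
        if evaluate(s, p) == evaluate(t, p):                 # holds on the data triple
            score += 1
        if evaluate(s, negative) != evaluate(t, negative):   # fails on the negative triple
            score += 1
    return score - (length(s) + length(t)) // 20             # length penalty


euler = (("+", ("-", "V", "E"), "F"), 2)    # V - E + F = 2
trivial = ("V", "V")                        # V = V
print(fitness(euler), fitness(trivial))
```

Under these assumptions the trivial identity collects points only from the data triples, because it also holds on every negative triple.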

