Genetic Algorithms for Finite State Machine Inference. Thesis advisor: Assoc. Prof. Dr. Prabhas Chongstitvatana. Committee chair: Prof. Dr. Chidchanok Lursinsap. Committee members: Asst. Prof. Dr. Boonserm Kijsirikul and Asst. Prof. Dr. Nachol Chaiyaratana. Presented by Mr. Nattee Niparnan, student ID 403 02410 21.
Introduction
Inference of an FSM from observed input/output: the learning method must produce a hypothesis machine that mimics the target machine.
[Diagram: Target Machine → INPUT/OUTPUT sequences → Learning Method → Hypothesis Machine]
Presentation Outline
Claim
Some details of claim
Conviction: Experiment, Analysis, Conclusion
Extras: Summary
Legal stuff
Hypothesis A new genetic algorithm proposed in this thesis is a better way to solve the problem of finite state machine inference than the former genetic algorithm
Legal stuff: Objective To develop a better genetic algorithm for the problem
Legal Stuff: Scope
Compare the new method with the reference genetic algorithm.
The new method must be shown to be better than the reference method.
The solutions from the new method must be shown to be consistent.
Former GA (REF)
Encode δ and λ in a bit string.
Single-point crossover.
Evaluate by counting matching output bits.
[Diagram: for each state (State 0 … State N), the chromosome stores the Next State and Output of the 0-transition and the 1-transition; the hypothesis machine's output sequence on the input sequence is compared with the target's output sequence.]
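The REF representation and evaluation can be sketched as follows. This is a minimal illustrative Python sketch, not the thesis code: the list-of-tuples encoding (rather than a literal bit string) and the binary input/output alphabet are assumptions for clarity.

```python
# Sketch of the REF representation: for each state, the chromosome stores
# (next_state, output) for the 0-transition and the 1-transition.
# A binary input/output alphabet is assumed for illustration.

def run_fsm(chromosome, inputs, start=0):
    """Run the encoded machine; chromosome[s][i] = (next_state, output)."""
    state, outputs = start, []
    for symbol in inputs:
        state, out = chromosome[state][symbol]
        outputs.append(out)
    return outputs

def fitness(chromosome, inputs, target_outputs):
    """REF evaluation: count output bits matching the observed sequence."""
    produced = run_fsm(chromosome, inputs)
    return sum(p == t for p, t in zip(produced, target_outputs))

# Example: a 2-state machine that echoes its input one step later.
machine = [
    [(0, 0), (1, 0)],   # state 0: last input was 0, so emit 0
    [(0, 1), (1, 1)],   # state 1: last input was 1, so emit 1
]
print(fitness(machine, [0, 1, 1, 0], [0, 0, 1, 1]))  # 4: perfect match
```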
New GA New evaluation & encoding New crossover operator NEW1 method
New Evaluation
The old evaluation can mislead the search: a correct δ under a wrong λ receives a totally wrong score.
[Diagram: a Target Machine over states A and B, and a Hypothesis Machine with the same transition structure δ but with the outputs on the 0- and 1-transitions swapped.]
New Evaluation
Main idea: perform a local search on each output value; each transition is evaluated by the I/O pairs that exercise it.
Why make-then-ask? Example: evaluate a particular state X by the I/O pairs (0,B), (0,B), (0,A). The old method fixes the output (to A), which is only 1/3 correct; the new method adjusts the output according to the I/O, choosing B to get 2/3 correct.
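The make-then-ask evaluation can be sketched like this. A hypothetical Python sketch, not the thesis implementation: it scores only δ (the transition function) and, for each exercised transition, picks the output value that agrees with the most observed I/O pairs.

```python
from collections import Counter

def evaluate_with_local_search(delta, inputs, target_outputs, start=0):
    """Score a transition function (delta[s][i] = next state) by choosing,
    for every exercised transition, the output value that matches the most
    observed outputs -- instead of scoring a fixed, encoded lambda."""
    observed = {}  # (state, symbol) -> Counter of target outputs seen there
    state = start
    for symbol, target in zip(inputs, target_outputs):
        observed.setdefault((state, symbol), Counter())[target] += 1
        state = delta[state][symbol]
    # Each transition contributes its best achievable match count.
    score = sum(counts.most_common(1)[0][1] for counts in observed.values())
    best_outputs = {t: c.most_common(1)[0][0] for t, c in observed.items()}
    return score, best_outputs

# The slide's example for one transition: observations (0,B), (0,B), (0,A).
# Fixing output A scores 1 of 3; the local search picks B and scores 2 of 3.
delta = [[0, 0]]  # one state, both symbols loop back to it
print(evaluate_with_local_search(delta, [0, 0, 0], ['B', 'B', 'A']))
# (2, {(0, 0): 'B'})
```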
New Evaluation: Example
Input:  0 0 1 0 1 0 1
Output: 0 1 1 0 0 0 0
State sequence: a c d a d a d
Evaluation value = 3 + 0 + 1 + 2 = 6
[Diagram of the hypothesis machine (states a–d, with labels X and Y) omitted]
Output Definition
A transition that is never exercised by the training sequence has output N/A (any arbitrary value).
[Diagram: two machines with labels X and Y illustrating the N/A output on an unexercised transition]
New Encoding
Encoding λ is futile, so it is omitted: for each state (State 0 … State N), the chromosome stores only the Next State of the 0-transition and the 1-transition.
New Crossover
The encoding scheme introduces a chance of loose linkage (related genes lying far apart on the chromosome), so crossover has a highly destructive effect.
[Diagram: states A–G scattered along the chromosome]
New Crossover
Choose two parents and find the better one.
Rearrange the states according to DFS order, discarding inaccessible states.
Perform single-point crossover on the new list of states.
New Crossover: Example
[Diagram: parent states B C D E F G A are reordered by DFS to G C E D B; inaccessible states are discarded.]
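The reordering step can be sketched as follows. An illustrative Python sketch under stated assumptions: binary input alphabet, states numbered from 0, and transitions into discarded (inaccessible) states redirected to state 0, which is a choice made here for the sketch, not taken from the thesis.

```python
def dfs_order(delta, start=0):
    """Return states in depth-first order from the start state;
    inaccessible states are never visited and are thereby discarded."""
    order, seen, stack = [], set(), [start]
    while stack:
        s = stack.pop()
        if s in seen:
            continue
        seen.add(s)
        order.append(s)
        # Push the 1-transition first so the 0-transition is explored first.
        stack.extend(delta[s][::-1])
    return order

def renumber(delta, order):
    """Rewrite delta so that state k is the k-th state in DFS order."""
    rank = {s: k for k, s in enumerate(order)}
    # Transitions into discarded states fall back to state 0 (assumption).
    return [[rank.get(t, 0) for t in delta[s]] for s in order]

def crossover(parent_a, parent_b, cut):
    """Single-point crossover on the DFS-renumbered state lists."""
    a = renumber(parent_a, dfs_order(parent_a))
    b = renumber(parent_b, dfs_order(parent_b))
    child = a[:cut] + b[cut:]
    n = len(child)
    # Redirect transitions that point past the child's last state to state 0.
    return [[t if t < n else 0 for t in row] for row in child]

# Parent with states 0..3 where state 3 is inaccessible from state 0.
parent = [[1, 1], [0, 2], [2, 2], [0, 0]]
print(dfs_order(parent))  # [0, 1, 2] -- state 3 is discarded
```

Because both parents are renumbered by DFS rank before the cut, states that play the same structural role tend to align, which is the tighter linkage the slide argues for.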
Experiment
To compare the performance of REF, NEW1, and NEW2.
Measure the number of generations used, the time used, and the number of successful runs.
Setup of the experiment
An FSM Generator produces target FSMs; an IO Sequence Generator produces an I/O sequence set from each target FSM; each algorithm (REF, NEW1, NEW2) takes the I/O sequence set as input and outputs a hypothesis machine.
[Data-flow diagram omitted]
Experimental Result
[Result charts omitted]
Experimental Result: Summary

Generations:
                   REF       NEW1      NEW2
  Total            749,246   527,488   505,780
  Relative         100%      70.40%    67.51%
  Best                       31        35
  Successful runs  461       561       585

Time:
                   REF       NEW1      NEW2
  Total            64,350    46,393    51,801
  Relative         100%      72.10%    80.50%
  Best             2         52        26
Analysis
New evaluation & encoding = local search; search space reduction.
Schema preservation: introns from inaccessible states; nullifying introns from unexercised states.
Search Space Reduction
[Charts omitted]
Search Space Reduction: Summary

  Experiment                 A         A1        A2
  Avg. generations used      70.40%    61.60%    54.12%
  Avg. time used             72.09%    55.16%    49.43%
  Number of problems solved  121.69%   135.62%   150.64%
Additional Experiment
Comparison between NEW2 and a heuristic-based method, red-blue.
Compare the correctness of the results when the size of the training set is reduced, using the cross-validation method; correctness = the proportion of correctly identified data in the test set.
Training set: I/O sequence length 5 to 35 (step 2), number of sequences 6 to 36 (step 6).
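The correctness measure can be sketched as follows. A hypothetical Python helper, not the thesis code: `delta` maps states to next states, and `lam` holds the learned output per transition, `None`-like (absent) for transitions never exercised during training.

```python
def correctness(delta, lam, test_pairs, start=0):
    """Proportion of test outputs the hypothesis reproduces correctly.
    delta[s][i] = next state; lam[(s, i)] = learned output for that
    transition (absent if it was never exercised during training)."""
    correct = total = 0
    for inputs, outputs in test_pairs:
        state = start
        for symbol, expected in zip(inputs, outputs):
            produced = lam.get((state, symbol))  # arbitrary if unexercised
            correct += produced == expected
            total += 1
            state = delta[state][symbol]
    return correct / total

# One-state hypothesis that learned output 'B' on its only transition,
# scored against a held-out sequence whose outputs are B, B, A.
print(correctness([[0, 0]], {(0, 0): 'B'},
                  [([0, 0, 0], ['B', 'B', 'A'])]))  # 0.666...
```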
Heuristic Method: red-blue
Uses heuristics in the search; fast and scalable.
No restriction on the size of the hypothesis.
Correctness vs. Sample Size
[Chart omitted]
Additional Experiment: Analysis
A shorter description of the hypothesis is preferable (Occam's Razor).
Size of Hypothesis vs. Sample Size
[Chart omitted]
What has been done?
A genetic algorithm for the finite state machine inference problem was presented.
It was empirically shown that the proposed method is better than former methods.
What can be extended by others?
Practical issues: better linkage awareness; chromosome representation.
Theoretical issues: the effect of introns; a formal analysis of the preference for short hypotheses.
What would you like to ask?