Learning Classifier Systems (LCS)
The system has three layers:
– A performance system that interacts with the environment,
– An apportionment-of-credit algorithm that rates rules by their usefulness,
– A rule discovery algorithm that generates plausible new rules to replace less useful ones.
Performance System Cycles
1. A message from the input interface is posted to the message list.
2. Each rule is matched against the message list.
3. All matching rules compete, through a bidding process, to post to the next message list; the winning rule posts its message there.
4. The output interface checks the new message and produces an effector action.
5. The new message list replaces the previous one.
6. Repeat from step 1.
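The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The rule dictionaries, the `EFFECTORS` table, the extra rule r2, the input message `0100`, and the choice to post the winner's action string verbatim are all assumptions made for the example. Rules r1 and r3 and β = 0.2 are borrowed from the Bidding Process slide. Strength updates (apportionment of credit) are not modeled.

```python
BETA = 0.2  # bid coefficient (value taken from the bidding example)

# r1 and r3 come from the bidding example; r2 is a hypothetical extra rule.
RULES = [
    {"id": "r1", "condition": "#1##", "action": "#010", "strength": 100.0},
    {"id": "r2", "condition": "1###", "action": "1111", "strength": 100.0},
    {"id": "r3", "condition": "0#0#", "action": "0011", "strength": 100.0},
]

# Hypothetical output interface: maps a posted message to an effector action.
EFFECTORS = {"0011": "effector-A", "#010": "effector-B"}


def matches(condition, message):
    """True if every non-# symbol of the condition equals the message symbol."""
    return len(condition) == len(message) and all(
        c == "#" or c == m for c, m in zip(condition, message)
    )


def bid(rule):
    """Bid = beta * specificity * strength."""
    non_wild = sum(1 for c in rule["condition"] if c != "#")
    return BETA * (non_wild / len(rule["condition"])) * rule["strength"]


def run_cycle(rules, message_list):
    """One performance cycle; returns (new_message_list, effector_action)."""
    match_set = [
        r for r in rules if any(matches(r["condition"], m) for m in message_list)
    ]
    if not match_set:
        return [], None
    winner = max(match_set, key=bid)          # highest bidder wins
    new_message_list = [winner["action"]]     # winner posts its message
    action = EFFECTORS.get(winner["action"])  # output interface picks an action
    return new_message_list, action


message_list = ["0100"]  # message posted by the input interface (assumed)
message_list, action = run_cycle(RULES, message_list)
print(message_list, action)
```

Here r1 and r3 both match `0100`, r2 does not, and r3 wins because its condition is more specific. Calling `run_cycle` again with the returned list would model the "repeat" step.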
Bidding Process
Match set [M]:

Rule id   Condition   Action   Strength
r1        #1##        #010     100
r3        0#0#        0011     100

Bid(R,t) = β × specificity(R) × Strength(R,t)
Specificity(R) = (number of non-# symbols in the condition of R) / k, where k is the condition length.

With β = 0.2:
Bid(r1) = 0.2 × ¼ × 100 = 5
Bid(r3) = 0.2 × ½ × 100 = 10
r3 bids higher, so it posts its message in the new message list.
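The bid arithmetic above can be checked with a short Python sketch. The `Rule` dataclass and the function names are hypothetical, and the split of each rule string into a 4-symbol condition, a 4-symbol action, and a strength of 100 is an assumption inferred from the stated specificities of ¼ and ½.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    condition: str   # string over {0, 1, #}; '#' is the wildcard
    action: str
    strength: float


def specificity(rule):
    """Fraction of condition positions that are not the wildcard '#'."""
    non_wild = sum(1 for symbol in rule.condition if symbol != "#")
    return non_wild / len(rule.condition)


def bid(rule, beta):
    """Bid(R,t) = beta * specificity(R) * Strength(R,t)."""
    return beta * specificity(rule) * rule.strength


# Match set [M] from the example (field boundaries are an assumption).
match_set = [
    Rule("r1", "#1##", "#010", 100.0),
    Rule("r3", "0#0#", "0011", 100.0),
]

beta = 0.2
bids = {rule.rule_id: bid(rule, beta) for rule in match_set}
winner = max(match_set, key=lambda rule: bid(rule, beta))

for rule_id, value in bids.items():
    print(f"Bid({rule_id}) = {value:.1f}")
print("winner:", winner.rule_id)
```

Because the bid scales with specificity, two rules of equal strength are separated purely by how many positions their conditions pin down, which is why r3 outbids r1 here.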
Maze Environment
Environment message fields: (Signal, smell-ahead, bump, heading, score, location)

Message List:
  40 5 f N 5 (1,2)
  GF

Condition               Action   Strength
# >0 # # # #            GF       1000
# <0 # # # # ∧ TL       TL       1000
# <0 # # # # ∧ TR       TR       1000
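As a rough illustration of how such rules could be matched against the environment message, the sketch below treats each condition as one test per message field, with `None` standing in for `#`. The data layout and helper names are hypothetical, and the `∧ TL` / `∧ TR` qualifiers on the last two rules are left out because their exact meaning is not recoverable from the slide.

```python
# Field order follows the slide:
# (signal, smell_ahead, bump, heading, score, location)
message = (40, 5, "f", "N", 5, (1, 2))


def greater_than_zero(value):
    return value > 0


def less_than_zero(value):
    return value < 0


# None plays the role of '#': the field is not tested.
# The "∧ TL" / "∧ TR" qualifiers from the slide are NOT modeled here.
RULES = [
    {"condition": (None, greater_than_zero, None, None, None, None),
     "action": "GF", "strength": 1000},
    {"condition": (None, less_than_zero, None, None, None, None),
     "action": "TL", "strength": 1000},
    {"condition": (None, less_than_zero, None, None, None, None),
     "action": "TR", "strength": 1000},
]


def matches(condition, msg):
    """A rule matches when every non-wildcard field test passes."""
    return all(test is None or test(value)
               for test, value in zip(condition, msg))


matched_actions = [rule["action"] for rule in RULES
                   if matches(rule["condition"], message)]
print(matched_actions)
```

With smell-ahead equal to 5, only the first rule's `>0` test passes, which is consistent with GF appearing in the message list.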
References
– John H. Holland, "A Mathematical Framework for Studying Learning in Classifier Systems", Physica D, Vol. 22, No. 1-3, 1986, pp. 307-317.
– Drew Mellor, "A First Order Logic Classifier System", GECCO '05.