The Greedy Prepend Algorithm for Decision List Induction Deniz Yuret Michael de la Maza.


1 The Greedy Prepend Algorithm for Decision List Induction Deniz Yuret Michael de la Maza

2 Overview
–Decision Lists
–Greedy Prepend Algorithm
–Opus search and UCI problems
–Version space search and secondary structure prediction
–Limited look-ahead search and Turkish morphology disambiguation

3 Introduction to Decision Lists
Prototypical machine learning problem:
–Decide democrat or republican for 435 representatives based on 16 votes.
Class Name: 2 (democrat, republican)
1. handicapped-infants: 2 (y,n)
2. water-project-cost-sharing: 2 (y,n)
3. adoption-of-the-budget-resolution: 2 (y,n)
4. physician-fee-freeze: 2 (y,n)
5. el-salvador-aid: 2 (y,n)
6. religious-groups-in-schools: 2 (y,n)
…
16. export-administration-act-south-africa: 2 (y,n)

4 Introduction to Decision Lists
Prototypical machine learning problem:
–Decide democrat or republican for 435 representatives based on 16 votes.
1. If adoption-of-the-budget-resolution = y and anti-satellite-test-ban = n and water-project-cost-sharing = y then democrat
2. If physician-fee-freeze = y then republican
3. If TRUE then democrat
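A decision list like the one above is read top-down and the first matching rule wins. A minimal Python sketch of this evaluation (our own illustration, not the authors' code), assuming each voting record is a dict mapping vote names to 'y'/'n':

```python
def classify(dlist, record):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, label in dlist:
        if all(record.get(attr) == val for attr, val in conditions.items()):
            return label
    raise ValueError("decision list has no default rule")

# The three rules from the slide, in order; an empty condition set
# is the catch-all default rule.
dlist = [
    ({"adoption-of-the-budget-resolution": "y",
      "anti-satellite-test-ban": "n",
      "water-project-cost-sharing": "y"}, "democrat"),
    ({"physician-fee-freeze": "y"}, "republican"),
    ({}, "democrat"),
]

print(classify(dlist, {"physician-fee-freeze": "y"}))  # republican
```

A record that matches no earlier rule falls through to the default and is labeled democrat.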

5 Alternative Representations Decision trees:

6 Alternative Representations CNF: DNF:

7 Alternative Representations (Rivest 1987)
For 0 ≤ k ≤ n:
–k-CNF(n) ∪ k-DNF(n) ⊆ k-DL(n)
–k-DT(n) ⊆ k-CNF(n) ∩ k-DNF(n) ⊆ k-DL(n)

8 Overview
–Decision Lists
–Greedy Prepend Algorithm
–Opus search and UCI problems
–Version space search and secondary structure prediction
–Limited look-ahead search and Turkish morphology disambiguation

9 Decision List Induction Start with an empty decision list or a default rule. Keep adding the best rule that covers the unclassified and misclassified cases. Design Decisions: Where to add the new rules (front, back) Criteria for best rule Search algorithm for best rule

10 The Greedy Prepend Algorithm
GPA(data)
  dlist = NIL
  default-class = most-common-class(data)
  rule = [if true then default-class]
  while gain(rule, dlist, data) > 0
    do dlist = prepend(rule, dlist)
       rule = max-gain-rule(dlist, data)
  return dlist

11 The Greedy Prepend Algorithm Starts with a default rule that picks the most common class Prepends subsequent rules to the front of the decision list The best rule is the one with maximum gain (increase in number of correctly classified instances) Several search algorithms implemented
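The pseudocode above can be rendered in runnable form. This is our own minimal sketch, assuming rules are (conditions, label) pairs and instances are (attributes, label) pairs; the candidate-rule search, which GPA leaves open to different algorithms, is stood in for by a fixed pool of candidate rules:

```python
from collections import Counter

def classify(dlist, x):
    """Return the label of the first matching rule (first match wins)."""
    for conds, label in dlist:
        if all(x.get(a) == v for a, v in conds.items()):
            return label

def accuracy(dlist, data):
    """Number of correctly classified instances."""
    return sum(classify(dlist, x) == y for x, y in data)

def gpa(data, candidates):
    """Greedy Prepend Algorithm over a fixed pool of candidate rules."""
    default = Counter(y for _, y in data).most_common(1)[0][0]
    dlist = [({}, default)]                    # if TRUE then default-class
    while True:
        # max-gain-rule: gain = increase in correctly classified instances
        rule = max(candidates, key=lambda r: accuracy([r] + dlist, data))
        if accuracy([rule] + dlist, data) <= accuracy(dlist, data):
            return dlist                       # gain <= 0: stop
        dlist = [rule] + dlist                 # prepend to the front
```

This mirrors the slide's pseudocode except that the default rule is installed directly rather than prepended to an empty list, which has the same effect.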

12 Rule Search
The default rule predicts all instances to belong to the most common category.
(Figure: the training set partitioned by the base rule into correct and false assignments, marked + and –.)

13 Rule Search
At each step add the maximum gain rule.
(Figure: partition of the training set with respect to the decision list vs. with respect to the next rule.)

14 Overview
–Decision Lists
–Greedy Prepend Algorithm
–Opus search and UCI problems
–Version space search and secondary structure prediction
–Limited look-ahead search and Turkish morphology disambiguation

15 Opus Search: Simple tree

16 Opus Search: Fixed order tree

17 Opus Search: Optimal pruning

18 GPA-Opus on UCI Problems

19 Overview
–Decision Lists
–Greedy Prepend Algorithm
–Opus search and UCI problems
–Version space search and secondary structure prediction
–Limited look-ahead search and Turkish morphology disambiguation

20 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ??????????????????????????????????????

21 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD -?????????????????????????????????????

22 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD -?????????????????????????????????????

23 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD --????????????????????????????????????

24 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD --????????????????????????????????????

25 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ---???????????????????????????????????

26 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----????????????????????????????

27 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----H???????????????????????????

28 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----H???????????????????????????

29 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----HH??????????????????????????

30 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----HHHHHHHHHH------EEEEE------?

31 A Generic Prediction Algorithm: Sequence to Structure MRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGD ----H-----HHHHHHHHHH------EEEEE

32 GPA Rules
The first three rules of the sequence-to-structure decision list account for 58.86% accuracy (of the full list's 66.36%).

33 GPA Rule 1 Everything => Loop

34 GPA Rule 2: HELIX
Window positions: L4 L3 L2 L1 0 R1 R2 R3 R4
Constraints: ** !GLY !ASN !GLY !PRO !GLY !PRO !SER (non-polar or large)

35 GPA Rule 3: STRAND
Window positions: L4 L3 L2 L1 0 R1 R2 R3 R4
!LEU !ALA !ASP !ALA CYS !PRO !ARG !LEU
!GLN !ASP ILE !GLN !MET
!GLU !GLY LEU !GLU
!PRO PHE !LYS
TRP !PRO
TYR (non-polar and not charged)
VAL (non-polar)

36 GPA-Opus not feasible for secondary structure prediction
9 positions, 20 possible amino acids per position.
Size of rule space:
–With only pos=val type attributes (one value or a wildcard per position): 21^9
–If we include disjunctions over values: 2^180
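The two counts follow directly: with pos=val tests each of the 9 positions takes one of 20 amino acids or a wildcard, giving 21^9 conjunctive rules; allowing an arbitrary subset (disjunction) of the 20 values at each position gives (2^20)^9 = 2^180.

```python
# Checking the rule-space sizes quoted on the slide.
positions, amino_acids = 9, 20

# conjunctive pos=val rules: one of 20 residues or a wildcard per position
conjunctive = (amino_acids + 1) ** positions        # 21^9

# with disjunctions: any subset of the 20 residues at each position
disjunctive = 2 ** (amino_acids * positions)        # 2^180

print(conjunctive)   # 794280046581
```

Even the smaller conjunctive space is far too large for exhaustive Opus-style search.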

37 GPA Version Space Search
Searching for a candidate rule: pick a random instance.
–If the instance is currently misclassified and the candidate rule corrects it: generalize the candidate rule to include the instance.
–If the instance is currently correct and the candidate rule changes its classification: specialize the candidate rule to exclude the instance.
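A sketch of this randomized search in our own notation (not the authors' exact code): following the rule format of the previous slides, a candidate rule is a per-position set of forbidden amino acids, and an instance matches the rule if no position holds a forbidden value.

```python
import random

def matches(forbidden, x):
    """x matches if no position holds a forbidden residue."""
    return all(x[p] not in forbidden.get(p, set()) for p in x)

def refine(forbidden, label, data, predict):
    """One sampling pass over the data, generalizing or specializing
    the candidate rule as described on the slide."""
    for x, y in random.sample(data, len(data)):
        if predict(x) != y and y == label and not matches(forbidden, x):
            # currently misclassified and the rule would correct it:
            # generalize the rule to include x
            for p in x:
                forbidden.get(p, set()).discard(x[p])
        elif predict(x) == y and y != label and matches(forbidden, x):
            # currently correct but the rule would change the answer:
            # specialize the rule to exclude x (position choice is a
            # simplification of ours)
            p = random.choice(list(x))
            forbidden.setdefault(p, set()).add(x[p])
    return forbidden
```

Because each step only inspects one sampled instance, the search never enumerates the 21^9 rule space.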

38 GPA Secondary Structure Prediction Results
PhD      72.3
NNSSP    71.7
GPA      69.2
DSC      69.1
Predator 69.0

39 Overview
–Decision Lists
–Greedy Prepend Algorithm
–Opus search and UCI problems
–Version space search and secondary structure prediction
–Limited look-ahead search and Turkish morphology disambiguation

40 Morphological Analyzer for Turkish
masalı
–masal+Noun+A3sg+Pnon+Acc (= the story)
–masal+Noun+A3sg+P3sg+Nom (= his story)
–masa+Noun+A3sg+Pnon+Nom^DB+Adj+With (= with tables)
Oflazer, K. (1994). Two-level description of Turkish morphology. Literary and Linguistic Computing.
Oflazer, K., Hakkani-Tür, D. Z., and Tür, G. (1999). Design for a Turkish treebank. EACL99.
Beesley, K. R. and Karttunen, L. (2003). Finite State Morphology. CSLI Publications.

41 Features, IGs and Tags
126 unique features
9129 unique IGs
Unique tags: distinct tags observed in 1M word training corpus
masa+Noun+A3sg+Pnon+Nom^DB+Adj+With
The stem is followed by features grouped into inflectional groups (IGs); IGs are separated by derivational boundaries (^DB); the full analysis string is the tag.

42 Morphological disambiguation
Task: pick the correct parse given the context.
1. masal+Noun+A3sg+Pnon+Acc
2. masal+Noun+A3sg+P3sg+Nom
3. masa+Noun+A3sg+Pnon+Nom^DB+Adj+With
–Uzun masalı anlat (Tell the long story)
–Uzun masalı bitti (His long story ended)
–Uzun masalı oda (Room with long table)

43 Morphological disambiguation
Task: pick the correct parse given the context.
1. masal+Noun+A3sg+Pnon+Acc
2. masal+Noun+A3sg+P3sg+Nom
3. masa+Noun+A3sg+Pnon+Nom^DB+Adj+With
Key idea: build a separate classifier for each feature.

44 GPA on Morphological Disambiguation
1. If (W = çok) and (R1 = +DA) Then W has +Det
2. If (L1 = pek) Then W has +Det
3. If (W = +AzI) Then W does not have +Det
4. If (W = çok) Then W does not have +Det
5. If TRUE Then W has +Det
Examples: pek çok alanda (rule 1), pek çok insan (rule 2), insan çok daha (rule 4)

45 GPA-Opus not feasible
Attributes for a five word window:
–The exact word string (e.g. W=Ali'nin)
–The lowercase version (e.g. W=ali'nin)
–All suffixes (e.g. W=+n, W=+In, W=+nIn, W=+'nIn, etc.)
–Character types (e.g. Ali'nin would be described with W=UPPER-FIRST, W=LOWER-MID, W=APOS-MID, W=LOWER-LAST)
Average 40 features per instance.
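The attribute types above might be extracted as follows. This is an illustrative sketch; the helper name and exact feature strings are our assumptions, not the paper's:

```python
def word_features(w):
    """Attribute strings for one word of the five-word window."""
    feats = [f"W={w}", f"W={w.lower()}"]                  # exact + lowercase
    lw = w.lower()
    feats += [f"W=+{lw[i:]}" for i in range(1, len(lw))]  # all suffixes
    if w[0].isupper():                                    # character types
        feats.append("W=UPPER-FIRST")
    if "'" in w[1:-1]:
        feats.append("W=APOS-MID")
    return feats

print(word_features("Ali'nin"))
```

With several feature types per word and five words per window, the roughly 40 features per instance quoted above is plausible, and the combinatorial space of conjunctions over them is what makes exhaustive Opus search infeasible.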

46 GPA limited look-ahead search New rules are restricted to adding one new feature to existing rules in the decision list
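The restriction can be sketched as a candidate generator (function and label names are illustrative, not from the paper): each candidate extends a rule already in the decision list by exactly one new (attribute, value) test, so the search frontier stays linear in the list size rather than exponential in the feature set.

```python
def lookahead_candidates(dlist, features, labels):
    """Yield every rule in dlist extended by one new (attr, val) test,
    paired with each possible label."""
    for conds, _ in dlist:
        for attr, val in features:
            if attr not in conds:               # only genuinely new features
                for label in labels:
                    yield ({**conds, attr: val}, label)

# Toy example in the style of the +Det rules on the previous slide.
dlist = [({}, "has +Det")]
features = [("W", "çok"), ("L1", "pek"), ("R1", "+DA")]
cands = list(lookahead_candidates(dlist, features, ["has +Det", "no +Det"]))
print(len(cands))   # 3 features x 2 labels = 6
```

GPA would then score these few candidates by gain and prepend the best one, exactly as in the unrestricted algorithm.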

47 GPA Turkish morphological disambiguation results
Test corpus: 1000 words, hand tagged.
Accuracy: 95.87% (conf. int: )
Better than the training data!?

48 Contributions and Future Work
–Established GPA as a competitive alternative to SVMs, C4.5, etc.
–Need theory on why the best-gain rule does well.
–Need to study robustness to irrelevant or redundant attributes.
–Need to speed up the application of the resulting decision lists (convert to FSM?).

