
1 Shared Ensemble Learning using Multi-trees
Department of Electronic, Electrical and Computer Engineering, G201249003, 김영제, Database Lab

2 Introduction
What is a decision tree?
- Each internal node specifies a test on some attribute of the instance
- Each branch corresponds to one attribute value
- Each leaf node assigns a classification
Example: the decision tree for PlayTennis (see the sketch below)
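For concreteness, here is a minimal sketch of how such a tree can be represented and applied. The nested-dict encoding and the classify helper are illustrative choices, not the paper's implementation; the tree itself is the standard PlayTennis example.

```python
# The PlayTennis tree as nested dicts: an internal node tests an attribute,
# each branch is one attribute value, and each leaf is a class label.
play_tennis_tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(tree, instance):
    """Follow the branches matching the instance's attribute values to a leaf."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))                  # attribute tested at this node
        tree = tree[attribute][instance[attribute]]   # take the matching branch
    return tree                                       # leaf: the class label

print(classify(play_tennis_tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # Yes
```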

3 Costs Associated with Machine Learning
Generation costs
- Computational costs, i.e. computer resource consumption
- The aim is to give better solutions for the resources provided

4 Costs Associated with Machine Learning
Application costs, first:
- Models are accurate on average, but this does not mean they are uniformly reliable
- A model can be highly accurate for frequent cases yet extremely inaccurate for infrequent, critical situations, e.g. diagnosis or fault detection

5 Costs Associated with Machine Learning
Application costs, second:
- Even accurate models can be useless if the purpose is to obtain some new knowledge
- When the model is not expressed in the form of rules, or the number of rules is too high, interpreting the results carries significant costs and may even be impossible

6 Construction of a Decision Tree
Tree construction (sketched below)
- Driven by a splitting criterion that selects the best split
- The selected split is applied to generate new branches; the remaining splits are discarded
- The algorithm stops when the examples that fall into a branch all belong to the same class
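A minimal sketch of this greedy loop, assuming examples are (attribute-dict, class) pairs and that score is some splitting criterion such as information gain; both interfaces are assumptions, not the paper's code.

```python
from collections import Counter

def build_tree(examples, attributes, score):
    """Greedy construction: expand only the best split; the rest are discarded."""
    classes = [c for _, c in examples]
    if len(set(classes)) == 1 or not attributes:       # stopping criterion
        return Counter(classes).most_common(1)[0][0]   # leaf: majority class
    best = max(attributes, key=lambda a: score(examples, a))  # splitting criterion
    rest = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in {x[best] for x, _ in examples}:       # one new branch per value
        subset = [(x, c) for x, c in examples if x[best] == value]
        tree[best][value] = build_tree(subset, rest, score)
    return tree
```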

7 Construction of a Decision Tree
Pruning
- Removal of the parts of the tree that are not useful, in order to avoid over-fitting
- Pre-pruning: performed during the construction of the tree
- Post-pruning: performed by analyzing the leaves once the tree has been built
A sketch of both follows.
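A minimal sketch of both ideas, continuing the nested-dict representation from above. The thresholds in prepruned are arbitrary, and collapse is only a simplified structural stand-in for real error-based post-pruning.

```python
def prepruned(examples, depth, max_depth=5, min_examples=4):
    """Pre-pruning check, applied while building: stop splitting and make a
    leaf when the node is already deep or supports too few examples."""
    return depth >= max_depth or len(examples) < min_examples

def collapse(tree):
    """Simplified post-pruning pass: replace any subtree whose branches all
    lead to the same class label by a single leaf with that label."""
    if not isinstance(tree, dict):
        return tree                                    # already a leaf
    attribute, branches = next(iter(tree.items()))
    pruned = {v: collapse(sub) for v, sub in branches.items()}
    outcomes = set(pruned.values()) if all(
        not isinstance(s, dict) for s in pruned.values()) else set()
    if len(outcomes) == 1:
        return outcomes.pop()                          # subtree agrees: prune it
    return {attribute: pruned}
```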

8 Merits and Demerits of Decision Trees
Merit
- Allows the quick construction of a model, because decision trees are built in an eager (greedy) way
Demerit
- It may produce bad models, because early bad decisions are never revisited

9 Multi-tree Structure
- Rejected splits are not removed, but stored as suspended nodes
- Two new criteria are required beyond those of a single decision tree:
  1. Suspended-node selection: to populate the multi-tree, a criterion must select which suspended node to explore next
  2. Model selection: select one or more comprehensible models according to a selection criterion
A sketch of the suspended-node bookkeeping follows.
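A minimal sketch of that bookkeeping. The MultiTree class, its method names, and the highest-score-first reopening rule are illustrative assumptions; other selection criteria are possible.

```python
import heapq, itertools

class MultiTree:
    """Multi-tree bookkeeping: the best split is expanded, and the rejected
    splits are suspended rather than discarded."""

    def __init__(self):
        self.suspended = []            # max-heap of suspended splits (negated score)
        self._tie = itertools.count()  # tiebreaker so equal scores never compare nodes

    def expand(self, node, candidate_splits, score):
        ranked = sorted(candidate_splits, key=score, reverse=True)
        for split in ranked[1:]:       # rejected splits: suspended, not deleted
            heapq.heappush(self.suspended, (-score(split), next(self._tie), node, split))
        return ranked[0]               # only the best split grows the tree for now

    def resume(self):
        """One possible suspended-node selection criterion: reopen the
        highest-scoring suspended split."""
        if not self.suspended:
            return None
        _, _, node, split = heapq.heappop(self.suspended)
        return node, split
```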

10 Multi-tree Structure

11 Shared Ensembles
Combination
- Combining a set of classifiers improves on the accuracy of the individual classifiers
- Combination methods: Boosting, Bagging, Randomization, Stacking, Windowing
- The drawback is the large amount of memory required to store the components of the ensemble
Shared ensembles
- Share the common parts of the components of the ensemble by using the multi-tree (see the sketch below)
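A minimal sketch of why the multi-tree acts as a shared ensemble, using hypothetical Leaf/Split/Node helpers: every choice of one alternative split per node denotes one ordinary decision tree, so many trees are encoded while their common parts are stored only once.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Leaf:
    label: str

@dataclass(frozen=True)
class Split:
    attribute: str
    children: tuple          # one multi-tree node per attribute value

@dataclass(frozen=True)
class Node:
    splits: tuple            # several alternative splits kept at the same node

def enumerate_trees(node):
    """Yield every decision tree embedded in the multi-tree."""
    if isinstance(node, Leaf):
        yield node.label
        return
    for split in node.splits:
        variants = [list(enumerate_trees(child)) for child in split.children]
        for combo in product(*variants):   # one subtree choice per child
            yield (split.attribute, list(combo))

# Two alternative splits at the root already encode two shared trees:
root = Node(splits=(
    Split("Outlook", (Leaf("Yes"), Leaf("No"))),
    Split("Wind",    (Leaf("No"),  Leaf("Yes"))),
))
print(len(list(enumerate_trees(root))))    # 2
```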

12 Shared Ensembles Combination

13 (figure only)

14 Vector transformations (examples)

Original       Good loser   Bad loser    Majority    Difference
{40, 10, 30}   {80, 0, 0}   {40, 0, 0}   {1, 0, 0}   {0, -60, -20}
{7, 2, 10}     {0, 0, 19}   {0, 0, 10}   {0, 0, 1}   {-5, -15, 1}
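Reading the two example rows off the table, each transformation rewrites a vector of class counts around its winner (its largest component). A minimal sketch reproducing the first row; the function and method names are illustrative.

```python
def transform(v, method):
    """Rewrite a vector of class counts around its winner (largest component)."""
    total, winner = sum(v), v.index(max(v))
    if method == "original":   return list(v)
    if method == "good_loser": return [total if i == winner else 0 for i in range(len(v))]
    if method == "bad_loser":  return [x if i == winner else 0 for i, x in enumerate(v)]
    if method == "majority":   return [1 if i == winner else 0 for i in range(len(v))]
    if method == "difference": return [2 * x - total for x in v]  # x minus the others' sum

for m in ("original", "good_loser", "bad_loser", "majority", "difference"):
    print(m, transform([40, 10, 30], m))
# original   -> [40, 10, 30]
# good_loser -> [80, 0, 0]     winner takes the whole mass
# bad_loser  -> [40, 0, 0]     losers are zeroed
# majority   -> [1, 0, 0]      a plain vote
# difference -> [0, -60, -20]  each component minus the sum of the others
```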

15 Experiments

 #  Dataset         Size  Classes  Nom. Attr.  Num. Attr.
 1  Balance-scale    625     3         0           4
 2  Cars            1728     4         5           0
 3  Dermatology      358     6        33           1
 4  Ecoli            336     8         0           7
 5  Iris             150     3         0           4
 6  House-votes      435     2        16           0
 7  Monks1           566     2         6           0
 8  Monks2           601     2         6           0
 9  Monks3           554     2         6           0
10  New-thyroid      215     3         0           5
11  Post-operative    87     3         7           1
12  Soybean-small     35     4        35           0
13  Tae              151     3         2           3
14  Tic-tac          958     2         8           0
15  Wine             178     3         0          13

Information about the datasets used in the experiments.

16 Experiments

         Arit.          Sum.           Prod.          Max.           Min.
 #       Acc.    Dev.   Acc.    Dev.   Acc.    Dev.   Acc.    Dev.   Acc.    Dev.
 1       80.69   5.01   81.24   4.66   76.61   5.04   83.02   4.76   76.61   5.04
 2       91.22   2.25   91.25   2.26   83.38   3.65   90.9    2.09   83.38   3.65
 3       94.17   4.06   94.34   3.87   89.06   5.19   94      4.05   89.06   5.19
 4       80.09   6.26   79.91   6.13   76.97   7.14   80.09   6.11   76.97   7.14
 5       95.63   3.19   95.77   3.18   93.28   3.71   95.93   2.81   93.28   3.71
 6       94.53   5.39   94.2    5.66   94      5.34   94.47   5.45   94.4    5.34
 7       99.67   1.3    99.71   1.18   81      8.6    99.89   0.51   81      8.6
 8       73.35   5.86   73.73   5.82   74.53   5.25   77.15   5.88   74.53   5.25
 9       97.87   2      97.91   1.8    97.58   2.45   97.62   1.93   97.58   2.45
10       94.52   4.25   93.76   5.1    92.05   5.71   92.57   5.43   92.05   5.71
11       62.5   16.76   63.25  16.93   61.63  17.61   67.13  14.61   61.63  17.61
12       97.5    8.33   97.5    9.06   97.75   8.02   94.75  11.94   97.75   8.02
13       63.6   12.59   64.33  11.74   62     12.26   63.93  12.03   62     12.26
14       81.73   3.82   82.04   3.78   78.93   3.73   82.68   3.97   78.93   3.73
15       94.06   6      93.88   6.42   91.47   7.11   92.53   6.99   91.47   7.11
Geomean  85.83   4.72   85.99   4.71   82.53   5.93   86.4    4.52   82.55   5.93

Comparison between fusion techniques.
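The five fusion techniques in the table combine, component-wise, the class vectors returned by the ensemble members. A minimal sketch under that reading (function and method names are illustrative); note that the arithmetic mean and the sum always pick the same winning class.

```python
from math import prod

def fuse(vectors, method):
    """Combine the per-tree class vectors component-wise (one column per class)."""
    columns = list(zip(*vectors))
    if method == "arithmetic": return [sum(c) / len(c) for c in columns]
    if method == "sum":        return [sum(c) for c in columns]
    if method == "product":    return [prod(c) for c in columns]
    if method == "maximum":    return [max(c) for c in columns]
    if method == "minimum":    return [min(c) for c in columns]

votes = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]   # two ensemble members, three classes
print(fuse(votes, "maximum"))                 # [0.6, 0.5, 0.3] -> class 0 wins
```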

17 Experiments

       Max+Orig       Max+Good       Max+Bad        Max+Majo.      Max+Diff.
 #     Acc.    Dev.   Acc.    Dev.   Acc.    Dev.   Acc.    Dev.   Acc.    Dev.
 1     83.02   4.76   83.02   4.76   83.02   4.76   67.84   6.61   83.02   4.76
 2     90.9    2.09   90.9    2.09   90.9    2.09   81.48   3.22   90.9    2.09
 3     94      4.05   94      4.05   94      4.05   79.97   7.98   94      4.05
 4     80.09   6.11   80.09   6.11   80.09   6.11   78.21   6.07   80.09   6.11
 5     95.93   2.81   95.93   2.81   95.93   2.81   89.44   4.84   95.93   2.81
 6     94.47   5.45   94.47   5.45   94.47   5.45   91.47   6.9    94.47   5.45
 7     99.89   0.51   99.89   0.51   99.89   0.51   77.58   6.29   99.89   0.51
 8     77.15   5.88   77.15   5.88   77.15   5.88   83.42   5.06   77.15   5.88
 9     97.62   1.93   97.62   1.93   97.62   1.93   90.4    4.02   97.62   1.93
10     92.57   5.43   92.57   5.43   92.57   5.43   89.14   6.74   92.57   5.43
11     67.13  14.61   67.13  14.61   67.13  14.61   68.25  15.33   67     14.6
12     94.75  11.94   94.75  11.94   94.75  11.94   50.75  28.08   94.75  11.94
13     63.93  12.03   63.87  12.14   63.93  12.03   60.93  11.45   65.13  12.53
14     82.68   3.97   82.68   3.97   82.68   3.97   68.26   4.35   82.68   3.97
15     92.53   6.99   92.53   6.99   92.53   6.99   78.41  11.25   92.53   6.99
Gmean  86.4    4.52   86.39   4.53   86.4    4.52   76.11   7.19   86.49   4.54

Comparison between vector transformation methods.

18 Experiments

       1              10             100            1000
 #     Acc.    Dev.   Acc.    Dev.   Acc.    Dev.   Acc.    Dev.
 1     76.82   4.99   77.89   5.18   83.02   4.76   87.68   4.14
 2     89.01   2.02   89.34   2.2    90.9    2.09   91.53   2.08
 3     90      4.72   91.43   4.67   94      4.05   94      4.05
 4     77.55   6.96   78.58   6.84   80.09   6.11   80.09   6.11
 5     93.63   3.57   94.56   3.41   95.93   2.81   95.56   2.83
 6     94.67   5.84   94.27   5.69   94.47   5.45   95      5.14
 7     92.25   6.27   96.45   4.15   99.89   0.51  100      0.01
 8     74.83   5.17   75.33   5.11   77.15   5.88   82.4    4.52
 9     97.55   1.89   97.84   1.86   97.62   1.93   97.75   1.92
10     92.62   5.22   93.43   5.05   92.57   5.43   90.76   5.89
11     60.88  17.91   63     15.88   67     14.6    68.13  15.11
12     97.25   9.33   96     10.49   94.75  11.94   95.5   10.88
13     62.93  12.51   65     12.19   65.13  12.53   65.33  12.92
14     78.22   4.25   79.23   4.03   82.68   3.97   84.65   3.34
15     93.12   6.95   93.29   6.31   92.53   6.99   92.99   5
Gmean  83.88   5.52   84.91   5.3    86.49   4.54   87.47   4.47

Influence of the size of the multi-tree.

19 Experiments

20 (figure only)

21 References
- V. Estruch, C. Ferri, J. Hernandez-Orallo, M.J. Ramirez-Quintana. Shared Ensemble Learning using Multi-trees. http://www.lsi.us.es/iberamia2002/confman/SUBMISSIONS/254-escicucrri.pdf
- Wikipedia
- http://ai-times.tistory.com/77

22 Thank you for listening

