# Experience with Simple Approaches


Experience with Simple Approaches
Wei Fan, Erheng Zhong, Sihong Xie, Yuzhao Huang, Kun Zhang \$, Jing Peng #, Jiangtao Ren
IBM T. J. Watson Research Center · Sun Yat-sen University · \$ Xavier University of Louisiana · # Montclair State University

RDT: Random Decision Tree (Fan et al., 2003)

Encodes data in trees. At each node, an unused feature is chosen randomly:

- A discrete feature is unused if it has never been chosen previously on the decision path from the root to the current node.
- A continuous feature can be chosen multiple times on the same decision path, but each time a different threshold value is chosen.

Tree growth stops when one of the following happens:

- a node becomes too small or all of its examples belong to the same class, or
- the total height of the tree exceeds some limit.
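The node-construction rules above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the dict-based tree representation, and the binary split on a random value for discrete features are all assumptions made for brevity.

```python
import random

def grow_rdt(X, y, features, depth=0, max_depth=8, min_size=4):
    """Grow one random decision tree (sketch). `features` maps a feature
    name to the string "continuous" or a list of discrete values; X is a
    list of dicts, y a parallel list of class labels."""
    # Stop: node too small, pure, no features left, or depth limit reached.
    if len(y) <= min_size or len(set(y)) == 1 or not features or depth >= max_depth:
        return {"leaf": {c: y.count(c) / len(y) for c in set(y)}}
    f = random.choice(list(features))  # feature chosen at random, not by a purity criterion
    if features[f] == "continuous":
        lo, hi = min(x[f] for x in X), max(x[f] for x in X)
        t = random.uniform(lo, hi)     # a fresh random threshold each time
        rest = features                # continuous features may be chosen again deeper down
        split = lambda x: x[f] < t
        node = {"feature": f, "threshold": t}
    else:
        v = random.choice(features[f])
        rest = {k: d for k, d in features.items() if k != f}  # discrete: once per path
        split = lambda x: x[f] == v
        node = {"feature": f, "value": v}
    left = [i for i, x in enumerate(X) if split(x)]
    right = [i for i, x in enumerate(X) if not split(x)]
    if not left or not right:          # degenerate split: make this node a leaf
        return {"leaf": {c: y.count(c) / len(y) for c in set(y)}}
    node["left"] = grow_rdt([X[i] for i in left], [y[i] for i in left],
                            rest, depth + 1, max_depth, min_size)
    node["right"] = grow_rdt([X[i] for i in right], [y[i] for i in right],
                             rest, depth + 1, max_depth, min_size)
    return node

def predict_rdt(tree, x):
    """Walk one tree and return the class distribution stored at the leaf."""
    while "leaf" not in tree:
        if "threshold" in tree:
            tree = tree["left"] if x[tree["feature"]] < tree["threshold"] else tree["right"]
        else:
            tree = tree["left"] if x[tree["feature"]] == tree["value"] else tree["right"]
    return tree["leaf"]
```

In the full RDT method, many such trees are grown and their leaf distributions are averaged to obtain the final P(y|x).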

Illustration of RDT: a tree over three features, B1: {0,1} and B2: {0,1} (binary) and B3 (continuous). B1, B2, and B3 are chosen randomly at successive nodes; B3 can reappear on the same decision path, each time with a new random threshold (e.g. 0.3 at one node, 0.6 at a deeper one).

Probabilistic view of decision trees - PETs (e.g. C4.5, CART)

Given an example x, a probability estimation tree reports confidences P(y|x,θ) in the predicted labels; the dependence of P(y|x,θ) on θ is non-trivial. For example, on the iris data, a tree splitting on Petal.Length < 2.45 and Petal.Width < 1.75 produces leaves with class counts setosa 50/0/0, versicolor 0/49/5, virginica 0/1/45. At the versicolor leaf: P(setosa|x,θ) = 0, P(versicolor|x,θ) = 49/54, P(virginica|x,θ) = 5/54.

Problems of probability estimation via conventional decision trees:

1. Probability estimates tend to approach the extremes of 1 and 0.
2. Additional inaccuracies result from the small number of examples at a leaf.
3. The same probability is assigned to the entire region of space defined by a given leaf.

Remedies: C4.4 (Provost, 2003), BC44 (Zhang, 2006), RDT (Fan, 2003).

Popular PET Algorithms

| Algorithm | Single or Multiple Model(s) | Splitting Criterion | Probability Estimation Method | Pruning Strategy | Diversity Acquisition |
| --- | --- | --- | --- | --- | --- |
| C4.5 (Quinlan, 1993) | Single | Gain Ratio | Frequency Estimation | Error-based Pruning | N/A |
| C4.4 (Provost, 2003) | Single | Gain Ratio | Laplace Correction | No | N/A |
| RDT (Fan, 2003) | Multiple | Randomly Chosen | Bayesian Averaging | No or Depth Constraint | Random manipulation of feature set |
| BaggingPET (Breiman, 1996) | Multiple | Gain Ratio | Bayesian Averaging | No | Random manipulation of training set |

bRDT is the averaging of RDT and BC44, where RDT is Random Decision Tree and BC44 is bagged C4.4.
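Taken at face value, "averaging" the two ensembles means averaging their class-probability outputs with equal weight; the helper below is a hypothetical sketch under that assumption, with probabilities represented as class-to-probability dicts.

```python
def brdt_proba(p_rdt, p_bc44):
    """Average two class-probability dicts (bRDT sketch; equal weights assumed)."""
    classes = set(p_rdt) | set(p_bc44)
    return {c: (p_rdt.get(c, 0.0) + p_bc44.get(c, 0.0)) / 2 for c in classes}

# RDT says 0.7 positive, bagged C4.4 says 0.5 -> bRDT reports 0.6.
p = brdt_proba({"pos": 0.7, "neg": 0.3}, {"pos": 0.5, "neg": 0.5})
print(p["pos"])    # 0.6
```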

Sampling strategy for Tasks 1 & 2

For station Z, the negative instances are partitioned into blocks (Block 1, …, Block n) such that the size of each block is approximately three times that of the positive set.
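The partitioning above can be sketched as a simple consecutive chunking of the negative set; the helper name and the choice of consecutive (rather than shuffled) blocks are assumptions for illustration.

```python
def partition_negatives(negatives, n_positive, ratio=3):
    """Split the negative instances into consecutive blocks whose size is
    roughly `ratio` times the number of positives (last block may be smaller)."""
    block_size = max(1, ratio * n_positive)
    return [negatives[i:i + block_size] for i in range(0, len(negatives), block_size)]

# 100 negatives, 10 positives -> blocks of 30, 30, 30, 10.
blocks = partition_negatives(list(range(100)), n_positive=10)
print([len(b) for b in blocks])    # [30, 30, 30, 10]
```

Each block can then be paired with the full positive set to train one classifier on a roughly 3:1 negative-to-positive sample.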

Tasks 1 & 2 - Results

For station V, rows 2 and 3 correspond to Tasks 1 and 2 respectively. The optimal classifiers for Tasks 1 and 2 are the same for stations W, X, Y, and Z, so there is only one row for each of these four stations.