Decision Tree (Rule Induction)


1 Decision Tree (Rule Induction)
Analysis of customer behavior and service modeling

2 Poll: Which data mining technique..?

3 Classification Process with 10 records Step 1: Model Construction with 6 records
Training Data → Classification Algorithms → Classifier (Model)
Learned rule: IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’

4 Step 2: Test model with 6 records & Use the Model in Prediction
Testing Data → Classifier
Unseen Data → Classifier: (Jeff, Professor, 4) Tenured?
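As a minimal sketch, the rule learned in Step 1 can be applied to the unseen record (Jeff, Professor, 4). The function name `predict_tenured` and the default class `'no'` are assumptions for illustration; the slides only state the condition for `'yes'`.

```python
def predict_tenured(rank: str, years: int) -> str:
    """Apply the rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    if rank.lower() == "professor" or years > 6:
        return "yes"
    # Assumption: records that do not match the rule fall into the 'no' class.
    return "no"


# The unseen record from the slide: (name, rank, years)
name, rank, years = ("Jeff", "Professor", 4)
print(f"{name}: tenured = {predict_tenured(rank, years)}")
```

Jeff matches the first condition (rank is professor), so the rule fires even though years is not greater than 6.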

5 Who buys notebook computer? Training Dataset is given below:
This follows an example from Quinlan’s ID3
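ID3 chooses the attribute to split on by information gain, the reduction in class entropy after the split. The sketch below shows that calculation on class counts. The counts in the usage lines are hypothetical and are not the training data from the slide.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, partitions):
    """Parent entropy minus the size-weighted entropy of the partitions."""
    total = sum(parent_counts)
    remainder = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent_counts) - remainder


# Hypothetical counts: 4 'yes' and 4 'no' records before the split.
perfect_split = information_gain([4, 4], [[4, 0], [0, 4]])
useless_split = information_gain([4, 4], [[2, 2], [2, 2]])
print(perfect_split, useless_split)
```

An attribute that separates the classes completely gets the highest gain, and one that leaves every partition as mixed as the parent gets zero.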

6 Tree Output: A Decision Tree for Credit Approval
age?
  <=30: student?
    no: no
    yes: yes
  30..40: yes
  >40: credit rating?
    excellent: yes
    fair: no

7 Extracting Classification Rules from Trees
Represent the knowledge in the form of IF-THEN rules
One rule is created for each path from the root to a leaf
Each attribute-value pair along a path forms a conjunction
The leaf node holds the class prediction
Rules are easier for humans to understand
Example
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
IF age = “31…40” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “no”
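The extraction procedure can be sketched as a depth-first walk that collects attribute-value pairs along each root-to-leaf path. The nested-tuple encoding of the tree and the function name `extract_rules` are assumptions for illustration; the branch values and leaf classes follow the example rules on this slide.

```python
# Internal node: (attribute, {branch_value: subtree_or_leaf}); leaf: class label string.
TREE = ("age", {
    "<=30": ("student", {"no": "no", "yes": "yes"}),
    "31…40": "yes",
    ">40": ("credit_rating", {"excellent": "yes", "fair": "no"}),
})


def extract_rules(node, conditions=()):
    """Return one IF-THEN rule string per root-to-leaf path."""
    if isinstance(node, str):
        # Leaf: the conjunction of conditions on the path predicts this class.
        lhs = " AND ".join(f'{attr} = "{value}"' for attr, value in conditions)
        return [f'IF {lhs} THEN buys_computer = "{node}"']
    attribute, branches = node
    rules = []
    for value, child in branches.items():
        rules.extend(extract_rules(child, conditions + ((attribute, value),)))
    return rules


for rule in extract_rules(TREE):
    print(rule)
```

The tree has five leaves, so the walk produces five rules, one per path.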

8 An Example of ‘Car Buyers’ – Who buys Lexton?
no Job M/F Area Age Y/N
1 NJ M N 35 2 F 51 3 OW 31 Y 4 EM 38 5 S 33 6 54 7 49 8 32 9 10 11 12 50 13 36 14
* (a,b,c) means a: total # of records, b: ‘N’ counts, c: ‘Y’ counts

9 Lab on Decision Tree(1) SPSS Clementine, SAS Enterprise Miner
See5/C5.0: Download the See5/C5.0 evaluation version from

10 Lab on Decision Tree(2) From the initial screen shown below, choose File – Locate Data

11 Lab on Decision Tree(3) Select housing.data from Samples folder and click open.

12 Lab on Decision Tree(4)
This data set concerns house prices in the Boston area. It has 350 cases and 13 variables.

13 Lab on Decision Tree(5)
Input variables:
crime rate
proportion large lots: residential space
proportion industrial: ratio of commercial area
CHAS: dummy variable
nitric oxides ppm: pollution rate in ppm
av rooms per dwelling: number of rooms per dwelling
proportion pre-1940
distance to employment centers: distance to the center of the city
accessibility to radial highways: accessibility to highways
property tax rate per $10,000
pupil-teacher ratio: teachers’ rate
B: racial statistics
percentage low income earners: ratio of low-income people
Decision variable: Top 20%, Bottom 80%
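The slides do not say how the two classes of the decision variable were derived. One plausible reading, shown here purely as an assumption, is a rank cut on house price: the most expensive 20% of cases are labeled as the top class and the rest as the bottom class. The function name `label_top_fraction` and the prices are hypothetical.

```python
def label_top_fraction(values, top_fraction=0.2):
    """Label the highest `top_fraction` of values 'top' and the rest 'bottom'."""
    n_top = round(len(values) * top_fraction)
    # Indices ordered from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    top_indices = set(order[:n_top])
    return ["top" if i in top_indices else "bottom" for i in range(len(values))]


# Hypothetical house prices, not taken from housing.data.
prices = [10, 50, 20, 40, 30, 15, 25, 35, 45, 5]
print(label_top_fraction(prices))
```

With ten cases and a 20% cut, exactly the two highest prices receive the top label.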

14 Lab on Decision Tree(6) For the analysis, click the Construct Classifier button or choose Construct Classifier from the File menu.

15 Lab on Decision Tree(7) Check the Global pruning box (✓). Then click OK.

16 Lab on Decision Tree(8) Decision Tree Evaluation with Training data
Evaluation with Test data

17 Lab on Decision Tree(9) Understanding picture
We can see that (av rooms per dwelling) is the most important variable in deciding house price.

18 Lab on Decision Tree(11) It is hard to read the rules from the decision tree picture.
To view the rules, close the current screen and click Construct Classifier again, or choose Construct Classifier from the File menu.

19 Lab on Decision Tree(12) Choose/click Rulesets. Then click OK.

20 Lab on Decision Tree(13)

