Sachin Singh. Data Mining - Concepts: extracting meaningful knowledge from huge chunks of ‘raw’ data. Types: Association, Classification, Temporal.


1 - Sachin Singh

2 Data Mining - Concepts
Extracting meaningful knowledge from huge chunks of ‘raw’ data.
Types
– Association
– Classification
– Temporal

3 Classification Method
Prediction model
The C4.5 Tree algorithm

Trans_Id  Age  Student  Credit_rating  Buys_Computer
1000      28   no       excellent      no
1001      25   yes      excellent      no
1002      35   yes      fair           yes
1003      38   no       excellent      yes
1004      33   no       fair           yes
1005      56   yes      excellent      yes
1006      76   no       fair           no
1007      57   no       fair           no
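The slide names C4.5 but does not show how it picks a split. As a rough illustration only, the sketch below computes the information gain and gain ratio (the criterion C4.5 is known for) of the two categorical attributes in the table above. The function names and the lower-cased values are assumptions made for this example; the continuous Age attribute, which C4.5 would handle with a threshold search, is left out to keep the sketch short.

```python
from collections import Counter
from math import log2

# Rows transcribed from the slide's table (values lower-cased for consistency).
ROWS = [
    {"student": "no",  "credit_rating": "excellent", "buys_computer": "no"},
    {"student": "yes", "credit_rating": "excellent", "buys_computer": "no"},
    {"student": "yes", "credit_rating": "fair",      "buys_computer": "yes"},
    {"student": "no",  "credit_rating": "excellent", "buys_computer": "yes"},
    {"student": "no",  "credit_rating": "fair",      "buys_computer": "yes"},
    {"student": "yes", "credit_rating": "excellent", "buys_computer": "yes"},
    {"student": "no",  "credit_rating": "fair",      "buys_computer": "no"},
    {"student": "no",  "credit_rating": "fair",      "buys_computer": "no"},
]
TARGET = "buys_computer"


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(rows, attr):
    """Reduction in class entropy obtained by splitting rows on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[TARGET])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[TARGET] for r in rows]) - remainder


def gain_ratio(rows, attr):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy([r[attr] for r in rows])
    return info_gain(rows, attr) / split_info if split_info > 0 else 0.0


for attr in ("student", "credit_rating"):
    print(attr, round(info_gain(ROWS, attr), 4), round(gain_ratio(ROWS, attr), 4))
```

On this small table the class column is evenly split, and Credit_rating divides the rows into two groups that are each evenly split as well, so it carries no information gain by itself.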

4 Classification Tree

5 Analysis of Trees
Current work focuses largely on generation of trees
– Efficient algorithms
– Disk-resident, gigantic data sources
– Improving accuracy of the generated models
Motivation
– Current research area: need for analysis

6 Areas of Analysis
Two Sub Problems
– Filtering Sub Problem
– Comparison Sub Problem

7 Filtering Sub Problem
Typical data warehouses are huge
Generation of “bushy” trees
Not all outcomes are significant
Need to filter trees based on the required outcomes

8 Filtering Sub Problem
[Figure: full classification tree alongside the filtered classification tree]
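One way to picture filtering a tree by required outcomes is to prune every branch that cannot reach a wanted leaf. The sketch below is a minimal illustration under stated assumptions, not the presenter's algorithm: the nested-dict tree layout, the `filter_tree` name, and the example tree itself are all hypothetical, since the slides show the trees only as images.

```python
# Hypothetical tree layout: an internal node is {"test": attr, "branches": {...}},
# a leaf is {"outcome": label}. The tree below is invented for illustration.
FULL_TREE = {
    "test": "age",
    "branches": {
        "<=30": {"outcome": "no"},
        "31..40": {"outcome": "yes"},
        ">40": {
            "test": "credit_rating",
            "branches": {
                "excellent": {"outcome": "yes"},
                "fair": {"outcome": "no"},
            },
        },
    },
}


def filter_tree(node, wanted):
    """Return a copy of the tree keeping only paths that end in a wanted outcome.

    Returns None when no leaf under this node has a wanted outcome.
    """
    if "outcome" in node:
        return node if node["outcome"] in wanted else None
    kept = {}
    for value, child in node["branches"].items():
        subtree = filter_tree(child, wanted)
        if subtree is not None:
            kept[value] = subtree
    if not kept:
        return None
    return {"test": node["test"], "branches": kept}


FILTERED = filter_tree(FULL_TREE, {"yes"})
print(FILTERED)
```

The filtered result keeps the original test at each surviving node, so it can still be read as a set of rules for the outcomes of interest.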

9 Filtering Sub Problem
Advantages
– Efficient querying, faster results
– Easily managed
– Useful for the comparison sub problem

10 Comparison Sub Problem
Need to monitor changes in data trends by comparing the classification trees
Levels of changes identified
– Change in test (partition) value
– Change in the partitions
– Change in node levels
– Change in outcome (leaves)
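The levels of change listed above can be sketched as a parallel walk over two trees that records where they differ. This is a loose interpretation rather than the presenter's method: the nested-dict layout, the `compare` function, the change labels, and both example trees are assumptions made for illustration. Here "test" means the attribute tested at a node differs, "partitions" means the set of branch values differs, "level" means a leaf became a subtree or the reverse, and "outcome" means a leaf's label changed.

```python
# Hypothetical trees: internal node {"test": attr, "branches": {...}}, leaf {"outcome": label}.
OLD = {
    "test": "age",
    "branches": {
        "<=30": {"outcome": "no"},
        "31..40": {"outcome": "yes"},
        ">40": {"test": "credit_rating",
                "branches": {"excellent": {"outcome": "yes"}, "fair": {"outcome": "no"}}},
    },
}
NEW = {
    "test": "age",
    "branches": {
        "<=30": {"test": "student",
                 "branches": {"yes": {"outcome": "yes"}, "no": {"outcome": "no"}}},
        "31..40": {"outcome": "yes"},
        ">40": {"test": "credit_rating",
                "branches": {"excellent": {"outcome": "yes"}, "fair": {"outcome": "yes"}}},
    },
}


def compare(old, new, path=()):
    """Return a list of (path, kind, old_value, new_value) differences."""
    old_leaf, new_leaf = "outcome" in old, "outcome" in new
    if old_leaf and new_leaf:
        if old["outcome"] != new["outcome"]:
            return [(path, "outcome", old["outcome"], new["outcome"])]
        return []
    if old_leaf != new_leaf:
        # A leaf turned into a subtree or vice versa: the depth of the path changed.
        return [(path, "level", "leaf" if old_leaf else "subtree",
                 "leaf" if new_leaf else "subtree")]
    if old["test"] != new["test"]:
        return [(path, "test", old["test"], new["test"])]
    changes = []
    old_keys, new_keys = set(old["branches"]), set(new["branches"])
    if old_keys != new_keys:
        changes.append((path, "partitions", sorted(old_keys), sorted(new_keys)))
    # Recurse only into branches present in both trees.
    for value in sorted(old_keys & new_keys):
        changes.extend(compare(old["branches"][value], new["branches"][value],
                               path + ((old["test"], value),)))
    return changes


CHANGES = compare(OLD, NEW)
for change in CHANGES:
    print(change)
```

Each reported path is the sequence of (test, value) pairs leading to the point of difference, which gives a comparison result that can be read as a rule rather than as a position in a file.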

11 Comparison Sub Problem
Issues
– Structure of trees is unpredictable
– Comparing two trees with no standard structure

12 Solution
XML Trees
– Convert the tree structure into XML files
– XML is inherently tree-structured
– Take advantage of existing XML-related technologies
– Standard specs
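To make the XML idea concrete, the sketch below serialises a small classification tree with Python's standard `xml.etree.ElementTree` module and then queries it with a path expression. The element and attribute names (`node`, `branch`, `leaf`, `test`, `value`, `outcome`) and the example tree are hypothetical; the proposed file format appears only as an image in the slides and is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree: internal node {"test": attr, "branches": {...}}, leaf {"outcome": label}.
TREE = {
    "test": "age",
    "branches": {
        "<=30": {"outcome": "no"},
        "31..40": {"outcome": "yes"},
        ">40": {"test": "credit_rating",
                "branches": {"excellent": {"outcome": "yes"}, "fair": {"outcome": "no"}}},
    },
}


def to_xml(node):
    """Convert a nested-dict tree into an ElementTree element."""
    if "outcome" in node:
        return ET.Element("leaf", {"outcome": node["outcome"]})
    elem = ET.Element("node", {"test": node["test"]})
    for value, child in node["branches"].items():
        branch = ET.SubElement(elem, "branch", {"value": value})
        branch.append(to_xml(child))
    return elem


root = to_xml(TREE)
xml_text = ET.tostring(root, encoding="unicode")  # special characters are escaped
print(xml_text)

# Existing XML tooling can then query the tree, e.g. all leaves predicting "yes".
yes_leaves = root.findall(".//leaf[@outcome='yes']")
print(len(yes_leaves))
```

Once the tree is in XML, the filtering and comparison problems can lean on standard parsers and path queries instead of a custom tree format.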

13 Solution – Proposed File format

14 Approach
Devise algorithms to solve the filtering and comparison problems
Analyze the results of comparison in logical terms
Measure the efficiency of the algorithms through time and space complexities

15 Progress

