Part II Tools for Knowledge Discovery. Knowledge Discovery in Databases Chapter 5.


1 Part II Tools for Knowledge Discovery

2 Knowledge Discovery in Databases Chapter 5

3 5.1 A KDD Process Model

4 Figure 5.1 A seven-step KDD process model

5 Figure 5.2 Applying the scientific method to data mining

6 Step 1: Goal Identification. Define the Problem. Choose a Data Mining Tool. Estimate Project Cost. Estimate Project Completion Time. Address Legal Issues. Develop a Maintenance Plan.

7 Step 2: Creating a Target Dataset

8 Figure 5.3 The Acme credit card database

9 Step 3: Data Preprocessing. Noisy Data. Missing Data.

10 Noisy Data. Locate Duplicate Records. Locate Incorrect Attribute Values. Smooth Data.
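The duplicate-detection and smoothing items above can be sketched roughly as follows. This is a minimal illustration, not the chapter's own procedure: the function names `find_duplicates` and `moving_average` are hypothetical, records are assumed to be tuples of hashable values, and a centered moving average is only one of several possible smoothing techniques (the slide does not name one).

```python
def find_duplicates(records):
    """Return the indices of records that exactly repeat an earlier record."""
    seen = set()
    duplicates = []
    for i, rec in enumerate(records):
        key = tuple(rec)
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)
    return duplicates


def moving_average(values, window=3):
    """Smooth a numeric sequence with a centered moving average.

    Assumption: the window simply shrinks at the two ends of the sequence.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

Locating incorrect attribute values usually needs domain knowledge (valid ranges, legal category names), so it is left out of this sketch.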

11 Preprocessing Missing Data. Discard Records With Missing Values. Replace Missing Real-valued Items With the Class Mean. Replace Missing Values With Values Found Within Highly Similar Instances.
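As an illustration of the first two options above, the sketch below discards incomplete records and replaces missing real-valued items with the class mean. It assumes missing values are represented as `None` and that every instance carries a class label; the function names are hypothetical.

```python
from collections import defaultdict


def discard_incomplete(records):
    """Keep only the records that contain no missing (None) values."""
    return [rec for rec in records if all(v is not None for v in rec)]


def impute_class_mean(values, labels):
    """Replace each missing value with the mean of the known values in its class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, c in zip(values, labels):
        if v is not None:
            sums[c] += v
            counts[c] += 1
    result = []
    for v, c in zip(values, labels):
        if v is None and counts[c] > 0:
            result.append(sums[c] / counts[c])
        else:
            # Known value, or a class with no known values: left unchanged.
            result.append(v)
    return result
```

The third option, borrowing a value from a highly similar instance, depends on the similarity measure in use and is not shown here.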

12 Processing Missing Data While Learning. Ignore Missing Values. Treat Missing Values As Equal Compares. Treat Missing Values As Unequal Compares.
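One way to read the three options above is as three policies inside an attribute-by-attribute similarity score for categorical instances. The sketch below is an assumption about how such a score might look, not the behavior of any particular data mining tool; `similarity` and its `missing` parameter are hypothetical names, and missing values are assumed to be `None`.

```python
def similarity(a, b, missing="ignore"):
    """Fraction of compared attributes on which instances a and b agree.

    missing="ignore":  attribute pairs with a missing value are skipped.
    missing="equal":   a missing value counts as a match.
    missing="unequal": a missing value counts as a mismatch.
    """
    if missing not in ("ignore", "equal", "unequal"):
        raise ValueError("missing must be 'ignore', 'equal', or 'unequal'")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            if missing == "ignore":
                continue
            compared += 1
            if missing == "equal":
                matches += 1
        else:
            compared += 1
            if x == y:
                matches += 1
    return matches / compared if compared else 0.0
```

For two instances that agree on one of two known attributes and have one missing value between them, the three policies give three different scores, which is why the choice matters during learning.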

13 Step 4: Data Transformation. Data Normalization. Data Type Conversion. Attribute and Instance Selection.

14 Data Normalization. Decimal Scaling. Min-Max Normalization. Normalization Using Z-scores. Logarithmic Normalization.
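The four normalization methods listed above can be sketched with their common textbook definitions. The slide gives no formulas, so the details here are assumptions: decimal scaling divides by the smallest power of 10 that brings every absolute value below 1, min-max maps values onto [0, 1], z-scores use the population standard deviation, and the logarithm defaults to base 2. All function names are hypothetical.

```python
import math
import statistics


def decimal_scaling(values):
    """Divide by the smallest power of 10 that makes every |value| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]


def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Express each value as standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        raise ValueError("z-scores are undefined for constant data")
    return [(v - mean) / std for v in values]


def log_normalize(values, base=2):
    """Replace each strictly positive value with its logarithm."""
    return [math.log(v, base) for v in values]
```

Min-max and z-score normalization both fail on a constant attribute, so the sketch raises an error there rather than guessing a value.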

15 Attribute and Instance Selection. Eliminating Attributes. Creating Attributes. Instance Selection.


17 Step 5: Data Mining. 1. Choose training and test data. 2. Designate a set of input attributes. 3. If learning is supervised, choose one or more output attributes. 4. Select learning parameter values. 5. Invoke the data mining tool.
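The first item above, choosing training and test data, might be sketched as a seeded random split. The function name `train_test_split`, the 70/30 default, and the fixed seed are illustrative assumptions; the slide does not prescribe a split ratio or a sampling method.

```python
import random


def train_test_split(instances, test_fraction=0.3, seed=0):
    """Shuffle a copy of the instances and split it into (train, test) lists."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Fixing the seed keeps the split reproducible, which helps when the same experiment is rerun with different input attributes or learning parameter values.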

18 Step 6: Interpretation and Evaluation. Statistical analysis. Heuristic analysis. Experimental analysis. Human analysis.

19 Step 7: Taking Action. Create a report. Relocate retail items. Mail promotional information. Detect fraud. Fund new research.

20 5.9 The CRISP-DM Process Model. 1. Business understanding 2. Data understanding 3. Data preparation 4. Modeling 5. Evaluation 6. Deployment

21 5.10 Experimenting with ESX

22 A Four-Step Model for Knowledge Discovery. 1. Identify the goal. 2. Prepare the data. 3. Apply data mining. 4. Interpret and evaluate the results.

23 Experiment 1: Attribute Evaluation *Applying the Four-Step Process Model to the Credit Screening Dataset*


26 Experiment 2: Parameter Evaluation *Applying the Four-Step Process Model to the Satellite Image Dataset*

27 Figure 5.4 Satellite image data

