Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Chapter 12 Discovering New Knowledge – Data Mining.

1 Chapter 12 Discovering New Knowledge – Data Mining

2 Chapter Objectives
Introduce the student to the concept of data mining (DM):
  - How it is different from knowledge elicitation from experts
  - How it is different from extracting existing knowledge from databases
The objectives of data mining:
  - Explanation of past events (descriptive DM)
  - Prediction of future events (predictive DM)
(continued)

3 Chapter Objectives (cont.)
Introduce the student to the different classes of methods available for DM:
  - Symbolic (induction)
  - Connectionist (neural networks)
  - Statistical
Introduce the student to the details of some of the methods described in the chapter.

4 Section 12.1 - Objectives
Introduce the chapter contents.

5 Section 12.2 - Objectives
Define the concept of data mining and the reasons for performing DM studies.
Define the objectives of data mining:
  - Descriptive DM
  - Predictive DM
Introduce the three basic approaches to DM:
  - Symbolic (induction)
  - Connectionist (neural networks)
  - Statistical (curve fitting, others)

6 Section 12.3 - Objectives
Present a detailed description of the symbolic approach to data mining: rule induction.
Present the main algorithm for rule induction, C5.0, and its ancestors, ID3 and CLS.
Present several example applications of rule induction.
Present alternative algorithms for rule induction:
  - CART
  - CHAID

7 Section 12.4 - Objectives
Provide a detailed description of the connectionist approach to data mining: neural networks.
Present the basic neural network architecture: the multi-layer feedforward neural network.
Present the main supervised learning algorithm: backpropagation.
Present the main unsupervised neural network architecture: the Kohonen network.
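As a rough illustration of the backpropagation algorithm named on this slide, the sketch below computes the error gradients for a tiny feedforward network with one sigmoid hidden layer and one sigmoid output. The network size, the squared-error loss, the bias terms, and all function names are assumptions made for this example; the chapter's own formulation may differ.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def forward(x, W1, b1, W2, b2):
    """Forward pass: inputs -> sigmoid hidden layer -> one sigmoid output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def backprop(x, target, W1, b1, W2, b2):
    """Gradients of E = 0.5 * (y - target)**2 for all weights and biases."""
    h, y = forward(x, W1, b1, W2, b2)
    delta_out = (y - target) * y * (1.0 - y)      # error signal at the output
    grad_W2 = [delta_out * hj for hj in h]
    grad_b2 = delta_out
    # Send the output error backwards through the hidden-to-output weights.
    delta_hid = [delta_out * w * hj * (1.0 - hj) for w, hj in zip(W2, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hid]
    grad_b1 = list(delta_hid)
    return grad_W1, grad_b1, grad_W2, grad_b2

def gradient_step(x, target, W1, b1, W2, b2, lr=0.5):
    """One gradient-descent update; returns the new parameters."""
    gW1, gb1, gW2, gb2 = backprop(x, target, W1, b1, W2, b2)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    b2 = b2 - lr * gb2
    return W1, b1, W2, b2
```

Training would repeat `gradient_step` over many labeled examples; the learning rate `lr` is an arbitrary illustrative value.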

8 Section 12.5 - Objectives
Provide a detailed description of the most important statistical methods for data mining:
  - Curve fitting with the least squares method
  - Multivariate correlation
  - K-means clustering
  - Market basket analysis
  - Discriminant analysis
  - Logistic regression
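To make one of the listed methods concrete, here is a minimal sketch of the two measures commonly used in market basket analysis, support and confidence. The function names and the baskets are hypothetical, invented for this illustration, and are not taken from the chapter.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```

A rule such as "bread -> milk" is usually kept only when both its support and its confidence exceed thresholds chosen by the analyst.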

9 Section 12.6 - Objectives
Provide useful guidelines for determining which technique to use for specific problems.

10 Section 12.7 - Objectives
Discuss the importance of errors in data mining studies.
Define the types of errors possible in data mining studies.

11 Section 12.8 - Objectives
Summarize the chapter.
Provide key terms.
Provide review questions.
Provide review exercises.

12 Figure 12.1
[Decision tree; the figure labels a root node and a leaf node. Decision nodes: "Is the stock's price/earnings ratio > 5?", "Has the company's quarterly profit increased over the last year by 10% or more?", "Is the company's management stable?". Branches: Yes, No. Leaves: Buy, Don't buy.]

13 Figure 12.2
[Decision tree. Root {DS1, DS2, DS3, DS4} split on Outlook, with branches = Sunny, = Cloudy, = Rain. Leaf contents: DS1 - Enjoyable, DS4 - Not Enjoyable; DS2 - Not Enjoyable; DS3 - Not Enjoyable.]

14 Figure 12.3
[Decision tree. Root {DS1, DS2, DS3, DS4} split on Outlook (= Sunny, = Cloudy, = Rain), followed by a split on Temperature (= Cold, = Hot, = Mild). Leaf contents: DS1 - Enjoyable; DS4 - Not Enjoyable; DS2 - Not Enjoyable; DS3 - Not Enjoyable; None.]

15 Figure 12.4
[Decision tree. Split on Language, with branches Lisp, C++, Java. Node contents: MilliExpert, ThoughtGen, OffSite, Genie, XS, SilverWorks.]

16 Figure 12.5
[Decision tree. The Language split (Lisp, C++, Java) followed by splits with branches Forward and Backward. Node contents: MilliExpert, ThoughtGen, OffSite, Genie, XS, SilverWorks.]

17 Figure 12.6
[Decision tree. The Language split (Lisp, C++, Java) and the Forward / Backward splits, plus a further split whose extracted labels are SpreadsheetXL, ASCII, Devices, dBase. Node contents: MilliExpert, ThoughtGen, OffSite, Genie, XS, SilverWorks.]

18 Figure 12.7
[Decision tree. Root {DS1, DS2, DS3, DS4} split on Humidity, with branches = Dry and = Humid. Leaf contents: DS1 - Enjoyable; DS2 - Not Enjoyable, DS3 - Not Enjoyable, DS4 - Not Enjoyable.]
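The difference between the Outlook tree of Figure 12.2 and the Humidity tree of Figure 12.7 can be sketched with the information-gain measure used by ID3-style induction. The branch memberships below are read from the flattened figures and should be treated as an assumption: Outlook is taken to leave DS1 and DS4 together in one branch, and Humidity to separate DS1 from the other three data sets. The function names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(parent, branches):
    """Parent entropy minus the size-weighted entropy of the branches."""
    total = len(parent)
    remainder = sum(len(b) / total * entropy(b) for b in branches)
    return entropy(parent) - remainder

E, N = "Enjoyable", "Not Enjoyable"
root = [E, N, N, N]                    # DS1, DS2, DS3, DS4
outlook_split = [[E, N], [N], [N]]     # assumed: {DS1, DS4}, {DS2}, {DS3}
humidity_split = [[E], [N, N, N]]      # assumed: {DS1}, {DS2, DS3, DS4}

print(information_gain(root, outlook_split))
print(information_gain(root, humidity_split))
```

Under these assumptions Humidity yields the larger gain, because each of its branches contains a single class, while Outlook still needs a second split (Temperature in Figure 12.3) to separate DS1 from DS4.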

19 Figure 12.8
[Diagram of a single neuron. Inputs x1, x2, ..., xk; weights W1, W2, ..., Wn; summation Σ; activation function f(); output y.]
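The neuron in Figure 12.8 can be read as a weighted sum of the inputs passed through the activation function f(). The sketch below follows that reading; the function names, the choice of a sigmoid for f(), and the sample values are assumptions for illustration.

```python
import math

def sigmoid(s):
    """Numerically stable logistic function, used here as f()."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def neuron(inputs, weights, activation=sigmoid):
    """Output y = f(sum of W_i * x_i) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    s = sum(w * x for w, x in zip(weights, inputs))
    return activation(s)

# Hypothetical inputs and weights.
print(neuron([1.0, 0.5], [0.4, -0.8]))
```

Any of the activation functions shown in Figure 12.9 could be passed in place of `sigmoid`.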

20 Figure 12.9
[Activation functions. Panels: threshold function, piecewise linear function, sigmoid function. Vertical axis marked at 1.0.]
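The three activation functions named in Figure 12.9 might be written as follows. The threshold location and the breakpoints of the piecewise linear function are assumptions, since the figure's axis values other than 1.0 are not recoverable from the transcript.

```python
import math

def threshold(s, theta=0.0):
    """Step function: 1 when the net input reaches theta, otherwise 0."""
    return 1.0 if s >= theta else 0.0

def piecewise_linear(s):
    """Linear between -0.5 and 0.5, saturating at 0 below and 1 above."""
    return min(1.0, max(0.0, s + 0.5))

def sigmoid(s):
    """Smooth S-shaped function with values strictly between 0 and 1."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

for s in (-2.0, 0.0, 2.0):
    print(s, threshold(s), piecewise_linear(s), sigmoid(s))
```

All three map the net input into the range 0 to 1; only the sigmoid is differentiable everywhere, which is why gradient-based training such as backpropagation typically relies on it or a similar smooth function.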

21 Figure 12.10
[Labels: Inputs, Outputs.]

22 Figure 12.11

23 Figure 12.12
[Scatter plot. Axes: Variable A, Variable B. Groups: Cluster #1, Cluster #2.]
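A grouping like the two clusters in Figure 12.12 can be produced by k-means clustering, one of the statistical methods listed for Section 12.5. The sketch below is a plain version of the standard assign-and-recompute loop; the data points, starting centroids, and function name are hypothetical, and the figure itself may have been produced by a different method.

```python
def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical (Variable A, Variable B) observations.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(clusters)
```

The starting centroids are passed in explicitly so the result is reproducible; in practice they are often chosen at random, and the outcome can depend on that choice.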

24 Figure 12.13
[Labels: Inputs, Wi.]
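Assuming Figure 12.13 depicts the Kohonen network named in the Section 12.4 objectives (the transcript preserves only the labels "Inputs" and "Wi"), one learning step can be sketched as follows: find the unit whose weight vector Wi is closest to the input, then move that unit and its neighbors toward the input. The one-dimensional grid, the fixed learning rate and radius, and the function names are all assumptions for illustration.

```python
def best_matching_unit(x, weights):
    """Index of the unit whose weight vector is closest to the input x."""
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    return dists.index(min(dists))

def kohonen_step(x, weights, lr=0.5, radius=1):
    """Move the winning unit and its neighbors on a 1-D grid toward x."""
    winner = best_matching_unit(x, weights)
    updated = []
    for i, w in enumerate(weights):
        if abs(i - winner) <= radius:
            w = [wi + lr * (xi - wi) for xi, wi in zip(x, w)]
        updated.append(list(w))
    return winner, updated

# Hypothetical weight vectors for three units and one input.
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
winner, weights = kohonen_step([1.0, 2.0], weights, lr=0.5, radius=0)
print(winner, weights)
```

Practical implementations usually shrink both the learning rate and the neighborhood radius as training proceeds; this sketch keeps them fixed for brevity.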

25 Figure 12.14
[Plot. Axes: x, y.]

26 Figure 12.15
[Plot. Axes: x, y. Label: Best fitting equation.]
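If the best fitting equation in Figure 12.15 is a straight line (an assumption; the transcript does not preserve its form), the least squares method listed for Section 12.5 reduces to the closed-form fit sketched below. The function name and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

The fitted line minimizes the sum of squared vertical distances between the data points and the line.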

27 Table 12.1

28 Table 12.2

29 Table 12.3

30 Table 12.3 (cont.)

31 Table 12.4

32 Table 12.5

33 Table 12.5 (cont.)

34 Table 12.6

35 Table 12.7

36 Conclusions
The student should be able to use:
  - The C5.0 algorithm to capture rules from examples.
  - Basic feedforward neural networks with supervised learning.
  - Unsupervised learning, clustering techniques, and Kohonen networks.
  - Curve-fitting algorithms.
  - Statistical methods for clustering.
  - Other statistical techniques.

37 Chapter 12 Discovering New Knowledge – Data Mining

