
1
**Data Mining: Methods and Algorithms**

Romi Satria Wahono

2
**Romi Satria Wahono**

- SD Sompok Semarang (1987)
- SMPN 8 Semarang (1990)
- SMA Taruna Nusantara, Magelang (1993)
- S1, S2, and S3 (on leave), Department of Computer Sciences, Saitama University, Japan ( )
- Research interests: Software Engineering and Intelligent Systems
- Founder of IlmuKomputer.Com
- Researcher at LIPI ( )
- Founder and CEO of PT Brainmatics Cipta Informatika

3
**Course Outline**

- Introduction to Data Mining
- The Data Mining Process
- Evaluation and Validation in Data Mining
- Data Mining Methods and Algorithms
- Data Mining Research

4
Methods and Algorithms

5
**Methods and Algorithms**

- Inferring rudimentary rules
- Statistical modeling
- Constructing decision trees
- Constructing rules
- Association rule learning
- Linear models
- Instance-based learning
- Clustering

6
**Simplicity first**

Simple algorithms often work very well! There are many kinds of simple structure, e.g.:

- One attribute does all the work
- All attributes contribute equally and independently
- A weighted linear combination might do
- Instance-based: use a few prototypes
- Use simple logical rules

The success of a method depends on the domain.

7
**Inferring rudimentary rules**

1R learns a 1-level decision tree, i.e., a set of rules that all test one particular attribute. Basic version:

- One branch for each value of the attribute
- Each branch assigns the most frequent class
- Error rate: the proportion of instances that don't belong to the majority class of their branch
- Choose the attribute with the lowest error rate (assumes nominal attributes)

8
**Pseudo-code for 1R**

```
For each attribute,
    For each value of the attribute, make a rule as follows:
        count how often each class appears
        find the most frequent class
        make the rule assign that class to this attribute-value
    Calculate the error rate of the rules
Choose the rules with the smallest error rate
```

Note: "missing" is treated as a separate attribute value.
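The pseudo-code above translates almost directly into Python. The sketch below is illustrative: the dataset literals are a small made-up fragment in the style of the weather data, not part of the original slides.

```python
# Minimal 1R sketch: one rule per attribute value, pick the attribute
# whose rules make the fewest errors on the training data.
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (best_attribute, rules, error_count)."""
    best = None
    for attr in attributes:
        # For each value of this attribute, count class frequencies.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Each branch predicts its most frequent class.
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Errors = instances outside the majority class of their branch.
        errors = sum(n for v, c in counts.items()
                     for cls, n in c.items() if cls != rules[v])
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best

weather = [
    {"outlook": "sunny",    "windy": "false", "play": "no"},
    {"outlook": "sunny",    "windy": "true",  "play": "no"},
    {"outlook": "overcast", "windy": "false", "play": "yes"},
    {"outlook": "rainy",    "windy": "false", "play": "yes"},
    {"outlook": "rainy",    "windy": "true",  "play": "no"},
]
attr, rules, errors = one_r(weather, ["outlook", "windy"], "play")
```

On this toy fragment, 1R picks `outlook` with the rules sunny → no, overcast → yes, rainy → yes, making one error on the five instances.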

9
**Evaluating the weather attributes**

| Attribute | Rules | Errors | Total errors |
|-----------|-------|--------|--------------|
| Outlook | Sunny → No | 2/5 | 4/14 |
| | Overcast → Yes | 0/4 | |
| | Rainy → Yes | 2/5 | |
| Temp | Hot → No* | 2/4 | 5/14 |
| | Mild → Yes | 2/6 | |
| | Cool → Yes | 1/4 | |
| Humidity | High → No | 3/7 | 4/14 |
| | Normal → Yes | 1/7 | |
| Windy | False → Yes | 2/8 | 5/14 |
| | True → No* | 3/6 | |

\* indicates an arbitrary choice between two equally frequent classes

10
**Dealing with numeric attributes**

- Discretize numeric attributes
- Divide each attribute's range into intervals:
  - Sort instances according to the attribute's values
  - Place breakpoints where the class changes (majority class)
  - This minimizes the total error

Example: temperature from the weather data (instances sorted by temperature; classes shown with breakpoints):

Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
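The sort-and-split step can be sketched as follows. The temperature/class pairs below are illustrative stand-ins, not necessarily the exact weather data, and the midpoint convention for breakpoints is an assumption.

```python
# Sketch of discretization by class change: sort (value, class) pairs
# and put a candidate breakpoint at the midpoint wherever the class flips.
def breakpoints(pairs):
    """Return candidate split points between class changes."""
    pairs = sorted(pairs)  # sort instances by attribute value
    cuts = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if c1 != c2:
            cuts.append((v1 + v2) / 2)  # midpoint between neighbors
    return cuts

temps = [(64, "yes"), (65, "no"), (68, "yes"),
         (69, "yes"), (70, "yes"), (71, "no")]
print(breakpoints(temps))  # → [64.5, 66.5, 70.5]
```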

11
**The problem of overfitting**

- This procedure is very sensitive to noise: one instance with an incorrect class label will probably produce a separate interval
- Also: a time-stamp attribute would have zero errors
- Simple solution: enforce a minimum number of instances in the majority class per interval

Example (with min = 3):

- Before: Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
- After: Yes No Yes Yes Yes | No No Yes Yes Yes | No Yes Yes No
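The minimum-count fix can be sketched as a greedy scan over the sorted class sequence; this is a simplified illustration under that assumption, not necessarily the exact interval-merging procedure from the literature.

```python
# Greedy sketch: close an interval once its majority class has at least
# min_count members and the next instance would start a different class;
# any leftover tail becomes its own interval.
from collections import Counter

def partition(classes, min_count=3):
    intervals, cur = [], []
    for i, c in enumerate(classes):
        cur.append(c)
        majority, n = Counter(cur).most_common(1)[0]
        nxt = classes[i + 1] if i + 1 < len(classes) else None
        if n >= min_count and nxt != majority:
            intervals.append(cur)
            cur = []
    if cur:  # leftover tail
        intervals.append(cur)
    return intervals

seq = "yes no yes yes yes no no yes yes yes no yes yes no".split()
print([len(p) for p in partition(seq)])  # → [5, 5, 4]
```

On the example sequence this reproduces the three intervals shown above.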

12
**With overfitting avoidance**

Resulting rule set:

| Attribute | Rules | Errors | Total errors |
|-----------|-------|--------|--------------|
| Outlook | Sunny → No | 2/5 | 4/14 |
| | Overcast → Yes | 0/4 | |
| | Rainy → Yes | 2/5 | |
| Temperature | ≤ 77.5 → Yes | 3/10 | 5/14 |
| | > 77.5 → No* | 2/4 | |
| Humidity | ≤ 82.5 → Yes | 1/7 | 3/14 |
| | > 82.5 and ≤ 95.5 → No | 2/6 | |
| | > 95.5 → Yes | 0/1 | |
| Windy | False → Yes | 2/8 | 5/14 |
| | True → No* | 3/6 | |

13
**Discussion of 1R**

1R was described in a paper by Holte (1993):

- Contains an experimental evaluation on 16 datasets (using cross-validation so that results were representative of performance on future data)
- The minimum number of instances was set to 6 after some experimentation
- 1R's simple rules performed not much worse than much more complex decision trees
- Simplicity first pays off!

Robert C. Holte, "Very Simple Classification Rules Perform Well on Most Commonly Used Datasets," Computer Science Department, University of Ottawa, 1993.

14
**Discussion of 1R: Hyperpipes**

- Another simple technique: build one rule for each class
- Each rule is a conjunction of tests, one for each attribute
- For numeric attributes: the test checks whether the instance's value is inside an interval; the interval is given by the minimum and maximum observed in the training data
- For nominal attributes: the test checks whether the value is one of a subset of attribute values; the subset is given by all possible values observed in the training data
- The class with the most matching tests is predicted
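For numeric attributes, the hyperpipes idea above can be sketched as follows. Function names and the toy data are invented for illustration; only the min/max-interval logic comes from the description.

```python
# Hyperpipes sketch (numeric attributes only): per class, record the
# min/max of each attribute seen in training; predict the class whose
# intervals cover the most attribute values of the test instance.
def train_hyperpipes(rows, n_attrs):
    pipes = {}
    for *values, cls in rows:
        lo, hi = pipes.setdefault(cls, ([float("inf")] * n_attrs,
                                        [float("-inf")] * n_attrs))
        for i, v in enumerate(values):
            lo[i] = min(lo[i], v)  # widen the interval to include v
            hi[i] = max(hi[i], v)
    return pipes

def predict(pipes, values):
    def matches(cls):
        lo, hi = pipes[cls]
        return sum(lo[i] <= v <= hi[i] for i, v in enumerate(values))
    return max(pipes, key=matches)  # class with most matching tests

data = [(1.0, 2.0, "a"), (1.5, 2.5, "a"), (9.0, 8.0, "b"), (9.5, 7.0, "b")]
pipes = train_hyperpipes(data, 2)
print(predict(pipes, (1.2, 2.2)))  # → a
```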

15
**Statistical modeling**

- "Opposite" of 1R: use all the attributes
- Two assumptions: attributes are equally important, and statistically independent (given the class value)
  - I.e., knowing the value of one attribute says nothing about the value of another (if the class is known)
- The independence assumption is never correct!
- But... this scheme works well in practice

16
**Probabilities for weather data**

Counts (and conditional probabilities) per attribute value and class:

| Outlook | Yes | No |
|---------|-----|----|
| Sunny | 2 (2/9) | 3 (3/5) |
| Overcast | 4 (4/9) | 0 (0/5) |
| Rainy | 3 (3/9) | 2 (2/5) |

| Temperature | Yes | No |
|-------------|-----|----|
| Hot | 2 (2/9) | 2 (2/5) |
| Mild | 4 (4/9) | 2 (2/5) |
| Cool | 3 (3/9) | 1 (1/5) |

| Humidity | Yes | No |
|----------|-----|----|
| High | 3 (3/9) | 4 (4/5) |
| Normal | 6 (6/9) | 1 (1/5) |

| Windy | Yes | No |
|-------|-----|----|
| False | 6 (6/9) | 2 (2/5) |
| True | 3 (3/9) | 3 (3/5) |

| Play | Yes | No |
|------|-----|----|
| | 9 (9/14) | 5 (5/14) |

17
**Probabilities for weather data**

A new day:

| Outlook | Temp. | Humidity | Windy | Play |
|---------|-------|----------|-------|------|
| Sunny | Cool | High | True | ? |

Likelihood of the two classes:

- For "yes" = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 ≈ 0.0053
- For "no" = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 ≈ 0.0206

Conversion into probabilities by normalization:

- P("yes") = 0.0053 / (0.0053 + 0.0206) = 0.205
- P("no") = 0.0206 / (0.0053 + 0.0206) = 0.795
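The likelihood computation above can be checked with exact rational arithmetic; the counts are those from the weather-data probability table.

```python
# Naïve Bayes likelihoods for the new day (Sunny, Cool, High, True),
# computed with fractions so the arithmetic is exact until the final print.
from fractions import Fraction as F

# Pr[attribute value | class] terms, times the class prior
like_yes = F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9) * F(9, 14)
like_no  = F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5) * F(5, 14)

# Normalize so the two class probabilities sum to 1
p_yes = like_yes / (like_yes + like_no)
p_no  = like_no / (like_yes + like_no)
print(round(float(p_yes), 3), round(float(p_no), 3))  # → 0.205 0.795
```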

18
**Bayes's rule**

Probability of event H given evidence E:

Pr[H | E] = Pr[E | H] × Pr[H] / Pr[E]

- A priori probability of H, Pr[H]: probability of the event before the evidence is seen
- A posteriori probability of H, Pr[H | E]: probability of the event after the evidence is seen

Thomas Bayes. Born: 1702 in London, England. Died: 1761 in Tunbridge Wells, Kent, England.

19
**Naïve Bayes for classification**

- Classification learning: what's the probability of the class given an instance?
  - Evidence E = instance
  - Event H = class value for the instance
- Naïve assumption: the evidence splits into parts (i.e., attributes) that are independent given the class:

Pr[H | E] = Pr[E1 | H] × Pr[E2 | H] × … × Pr[En | H] × Pr[H] / Pr[E]

20
**Weather data example**

Evidence E:

| Outlook | Temp. | Humidity | Windy | Play |
|---------|-------|----------|-------|------|
| Sunny | Cool | High | True | ? |

Probability of class "yes":

Pr[yes | E] = Pr[Outlook = Sunny | yes] × Pr[Temperature = Cool | yes] × Pr[Humidity = High | yes] × Pr[Windy = True | yes] × Pr[yes] / Pr[E]
= (2/9 × 3/9 × 3/9 × 3/9 × 9/14) / Pr[E]

21
**Cognitive Assignment II**

- Understand and master one data mining method from the various literature
- Summarize it in detail as slides, covering the definition, the development of the method, the steps of the algorithm, and its application to a case study, and write or find Java code that executes the example case
- Present it in class in the next lecture, in plain, human-friendly language

22
**Choice of Algorithm or Method**

- Neural Network
- Support Vector Machine
- Naïve Bayes
- K-Nearest Neighbor
- CART
- Linear Discriminant Analysis
- Agglomerative Clustering
- Support Vector Regression
- Expectation Maximization
- C4.5
- K-Means
- Self-Organizing Map
- FP-Growth
- Apriori
- Logistic Regression
- Random Forest
- K-Medoids
- Radial Basis Function
- Fuzzy C-Means
- K*
- Support Vector Clustering
- OneR

23
**References**

- Ian H. Witten, Eibe Frank, and Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Elsevier, 2011
- Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons, 2005
- Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer, 2011
- Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd Edition, Elsevier, 2006
- Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2nd Edition, Springer, 2010
- Warren Liao and Evangelos Triantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Scientific, 2007
