Data Mining CSCI 307, Spring 2019 Lecture 7


Data Mining CSCI 307, Spring 2019 Lecture 7 Output: Trees WEKA intro

Can Use Trees for Numeric Prediction Too
  Regression: the process of computing an expression that predicts a numeric quantity
  Regression tree: a "decision tree" where each leaf predicts a numeric quantity
    The predicted value is the average value of the training instances that reach the leaf
  Model tree: a "regression tree" with linear regression models at the leaf nodes
    Linear patches approximate a continuous function
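The difference between the two leaf types can be sketched in a few lines of Python. This is a minimal illustration, not how WEKA builds its trees: the training data, the single split point SPLIT, and the function names are all assumptions made up for the example, and a real learner would search for the split rather than fix it.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on a single attribute."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

# Hypothetical training data: y is roughly piecewise linear in x.
train = [(1, 2.0), (2, 4.0), (3, 6.0), (6, 20.0), (7, 21.0), (8, 22.0)]
SPLIT = 4.5  # assumed split point, fixed by hand for this sketch

left = [(x, y) for x, y in train if x <= SPLIT]
right = [(x, y) for x, y in train if x > SPLIT]

def regression_tree_predict(x):
    """Leaf prediction = average target of the training instances in the leaf."""
    leaf = left if x <= SPLIT else right
    return mean(y for _, y in leaf)

def model_tree_predict(x):
    """Leaf prediction = linear model fitted to the instances in the leaf."""
    leaf = left if x <= SPLIT else right
    a, b = fit_line([p[0] for p in leaf], [p[1] for p in leaf])
    return a + b * x

print(regression_tree_predict(3), model_tree_predict(3))
print(regression_tree_predict(8), model_tree_predict(8))
```

The regression tree returns the same value everywhere inside a leaf, while the model tree follows the local slope, which is the "linear patches" idea from the slide.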

Linear Regression for the CPU Data
  PRP = -56.1 + 0.049 MYCT + 0.015 MMIN + 0.006 MMAX + 0.630 CACH - 0.270 CHMIN + 1.46 CHMAX
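As a quick check of how this equation is applied, the sketch below evaluates it for one machine. The coefficients are copied from the slide; the attribute values in `example` are illustrative inputs chosen for the example, and the names COEFFS, INTERCEPT, and predict_prp are hypothetical.

```python
# Coefficients of the linear regression for the CPU data, taken from the slide.
INTERCEPT = -56.1
COEFFS = {
    "MYCT": 0.049,
    "MMIN": 0.015,
    "MMAX": 0.006,
    "CACH": 0.630,
    "CHMIN": -0.270,
    "CHMAX": 1.46,
}

def predict_prp(machine):
    """Return the predicted PRP: intercept plus the weighted sum of attributes."""
    return INTERCEPT + sum(w * machine[name] for name, w in COEFFS.items())

# Illustrative attribute values (an assumption for this example).
example = {"MYCT": 125, "MMIN": 256, "MMAX": 6000,
           "CACH": 256, "CHMIN": 16, "CHMAX": 128}

print(round(predict_prp(example), 3))
```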

Regression Tree for the CPU Data

Model Tree for the CPU Data
  LM1: PRP = 8.29 + 0.004 MMAX + 2.77 CHMIN
  LM2: PRP = 20.3 + 0.004 MMIN - 3.99 CHMIN + 0.946 CHMAX
  LM3: PRP = 38.1 + 0.012 MMIN
  LM4: PRP = 19.5 + 0.002 MMAX + 0.698 CACH + 0.969 CHMAX
  LM5: PRP = 285 - 1.46 MYCT + 1.02 CACH - 9.39 CHMIN
  LM6: PRP = -65.8 + 0.003 MMIN - 2.94 CHMIN + 4.98 CHMAX
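The six leaf models can be written down as a lookup table, as in the sketch below. The tests that route an instance to a leaf appear only in the slide's figure and are not reproduced in this transcript, so the leaf name is passed in by the caller here; LEAF_MODELS and predict_at_leaf are hypothetical names for this example.

```python
# Leaf models LM1..LM6 from the slide: (intercept, {attribute: coefficient}).
LEAF_MODELS = {
    "LM1": (8.29, {"MMAX": 0.004, "CHMIN": 2.77}),
    "LM2": (20.3, {"MMIN": 0.004, "CHMIN": -3.99, "CHMAX": 0.946}),
    "LM3": (38.1, {"MMIN": 0.012}),
    "LM4": (19.5, {"MMAX": 0.002, "CACH": 0.698, "CHMAX": 0.969}),
    "LM5": (285.0, {"MYCT": -1.46, "CACH": 1.02, "CHMIN": -9.39}),
    "LM6": (-65.8, {"MMIN": 0.003, "CHMIN": -2.94, "CHMAX": 4.98}),
}

def predict_at_leaf(leaf, machine):
    """Apply the linear model stored at the given leaf.

    The caller supplies the leaf name because the tree's split tests
    are not available in this transcript.
    """
    intercept, coeffs = LEAF_MODELS[leaf]
    return intercept + sum(w * machine[name] for name, w in coeffs.items())

# Illustrative inputs (assumed values, not rows from the dataset).
print(predict_at_leaf("LM3", {"MMIN": 1000}))
print(predict_at_leaf("LM1", {"MMAX": 1000, "CHMIN": 1}))
```

Each leaf uses only the attributes that matter in its region of the data, which is why the models differ in both size and coefficients.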

WEKA: Waikato Environment for Knowledge Analysis
  On Radius, do this once (make a WEKA folder, copy all the .arff files, copy the weka jar file):
    cd
    mkdir WEKAfiles
    cd WEKAfiles
    cp /usr/local/weka-3-8-1/data/* .
    cp /usr/local/weka-3-8-1/weka.jar weka.jar
  To run the WEKA application (cd WEKAfiles first, if not there already):
    java -Xmx1000M -jar weka.jar
  To download onto a Windows or Mac computer, visit: https://www.cs.waikato.ac.nz/ml/weka/

WEKA Introduction
  An open source collection of many data mining and machine learning algorithms, including:
    data pre-processing
    classification
    clustering
    association rule extraction
  Created by researchers at the University of Waikato in New Zealand.
  Java based (also open source).

WEKA Main Features
  ~49 data preprocessing tools
  ~76 classification/regression algorithms
  ~8 clustering algorithms
  ~15 attribute/subset evaluators + 10 search algorithms for feature selection
  ~3 algorithms for finding association rules
  3 graphical user interfaces:
    "The Explorer" (exploratory data analysis)
    "The Experimenter" (experimental environment)
    "The Knowledge Flow" (new process model inspired interface)

WEKA

WEKA Application Interface
  Explorer: preprocessing, attribute selection, learning, visualization
  Experimenter: testing and evaluating machine learning algorithms
  Knowledge Flow: visual design of the KDD (Knowledge Discovery from Data / in Databases / with Data mining) process
  Simple Command-line: a simple interface for typing commands

WEKA Functions and Tools
  Preprocessing filters
  Attribute selection
  Classification/regression
  Clustering
  Association discovery
  Visualization

WEKA: Pros and Cons
  Pros
    Open source, free
    Extensible
    Can be integrated into other Java packages
    GUIs (Graphical User Interfaces)
    Relatively easy to use
    Features: run an individual experiment, or build KDD phases
  Cons
    Lacks proper and adequate documentation
    Systems are updated constantly (Kitchen Sink Syndrome)