W E K A Waikato Environment for Knowledge Analysis

Presentation transcript:

W E K A Waikato Environment for Knowledge Analysis

Goals of the workshop
– Acquire functional knowledge of the WEKA platform
– Be able to process (your own) data in WEKA
– Write a seminar paper:
  – identify a problem
  – transform it into data
  – choose an appropriate DM technique
  – apply it to the data
  – evaluate & interpret the results

What is WEKA?
Some basic facts about WEKA:
– WEKA (1) = a flightless bird with an inquisitive nature (found only on the islands of New Zealand)
– WEKA (2) = a software ‘workbench’ incorporating several standard ML/DM techniques
– Authors = Ian H. Witten, Eibe Frank (et al.)
– Programming language = Java
– Origin = The University of Waikato, New Zealand
– Literature = Ian H. Witten, Eibe Frank: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 1999
– Homepage =

Objectives of WEKA
– make ML/DM techniques generally available
– apply them to practical problems (in agriculture)
– develop new ML/DM algorithms
– contribute to the theoretical framework of the field (ML/DM)

Versions of WEKA
There are several versions of WEKA:
– WEKA 3.0: “book version”, compatible with the description in the data mining book
– WEKA 3.2: “GUI version”, adds graphical user interfaces (the book version is command-line only)
– WEKA 3.4: “development version”, with lots of improvements
This workshop is based on WEKA 3.4 (specifically 3.4.3)

The input to WEKA
ARFF format (“flat” files); example: Play-tennis domain

    % this is an example of a knowledge
    % domain in ARFF
    @relation weather

    @attribute outlook {sunny, overcast, rainy}
    @attribute temperature real
    @attribute humidity real
    @attribute windy {TRUE, FALSE}
    @attribute play {yes, no}

    @data
    sunny,85,85,FALSE,no
    sunny,80,90,TRUE,no
    overcast,83,86,FALSE,yes
    rainy,70,96,FALSE,yes
    rainy,68,80,FALSE,yes
    rainy,65,70,TRUE,no
    overcast,64,65,TRUE,yes
    sunny,72,95,FALSE,no
    sunny,69,70,FALSE,yes
    rainy,75,80,FALSE,yes
    sunny,75,70,TRUE,yes
    overcast,72,90,TRUE,yes
    overcast,81,75,FALSE,yes
    ...

Conversion to the ARFF format? Example: converting from MS Excel to ARFF
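One possible route from a spreadsheet to ARFF is sketched below. It is not the procedure shown in the workshop; it assumes the Excel sheet has first been saved as CSV with column names in the first row. The function name `csv_to_arff`, the relation name and the sample rows are illustrative choices. Columns whose values all parse as numbers become `numeric` attributes, everything else becomes a nominal attribute. Missing values and names containing spaces are not handled.

```python
import csv
import io


def is_number(text):
    """Return True if the cell parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text (first row = column names) into ARFF text.

    Hypothetical helper for illustration only.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for i, name in enumerate(header):
        column = [row[i] for row in data]
        if all(is_number(value) for value in column):
            lines.append("@attribute %s numeric" % name)
        else:
            # Nominal attribute: distinct values in order of first appearance.
            values = list(dict.fromkeys(column))
            lines.append("@attribute %s {%s}" % (name, ", ".join(values)))
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# A few rows of the play-tennis data, as they might look after export to CSV.
sample = """outlook,temperature,humidity,windy,play
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
"""
print(csv_to_arff(sample, "weather"))
```

Because nominal values are collected from the data, a value that never occurs in the file (for example `TRUE` in the three sample rows) will be absent from the attribute declaration; for real data it may be safer to write the value lists by hand.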

Starting WEKA – the GUI

A quick tour of the “explorer”: Preprocess panel
Labelled areas: Domain info. panel, Attributes panel, Status bar, Filters panel, Attribute info. panel, Log file, Attribute visualization panel

A quick tour of the “explorer”: Classify panel
Labelled areas: Classifier panel, Class attribute, Output panel, Test options panel, Result panel

A quick tour of the “explorer”: Visualize panel

The command line
example:

    C:\Temp>java weka.classifiers.trees.J48

    Weka exception: No training file and no object input file given.

    General options:

    -t  Sets training file.
    -T  Sets test file. If missing, a cross-validation will be performed on the training data.
    -c  Sets index of class attribute (default: last).
    -x  Sets number of folds for cross-validation (default: 10).
    -s  Sets random number seed for cross-validation (default: 1).
    -m  Sets file with cost matrix.
    -l  Sets model input file.
    -d  Sets model output file.
    -v  Outputs no statistics for training data.
    -o  Outputs statistics only, not the classifier.
    -i  Outputs detailed information-retrieval statistics for each class.
    -k  Outputs information-theoretic statistics.
    -p  Only outputs predictions for test instances.
    -r  Only outputs cumulative margin distribution.
    -z  Only outputs the source representation of the classifier, giving it the supplied name.
    -g  Only outputs the graph representation of the classifier.

    Options specific to weka.classifiers.j48.J48:

    -U  Use unpruned tree.
    -C  Set confidence threshold for pruning. (default 0.25)
    -M  Set minimum number of instances per leaf. (default 2)
    -R  Use reduced error pruning.
    -N  Set number of folds for reduced error pruning. One fold is used as pruning set. (default 3)
    -B  Use binary splits only.
    -S  Don't perform subtree raising.
    -L  Do not clean up after the tree has been built.
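When no test file is given, the options above say that a cross-validation is run on the training data, controlled by the number of folds (`-x`) and a random seed (`-s`). The sketch below illustrates the general idea of k-fold cross-validation only. It is not WEKA's implementation: it uses Python's own random generator and does no stratification, so the folds will differ from those WEKA produces. The function name `cross_validation_folds` is an illustrative choice.

```python
import random


def cross_validation_folds(n_instances, folds=10, seed=1):
    """Split instance indices into (train, test) pairs, one pair per fold.

    Plain, non-stratified k-fold split for illustration only.
    """
    indices = list(range(n_instances))
    # The seed makes the shuffle, and therefore the folds, reproducible.
    random.Random(seed).shuffle(indices)
    splits = []
    for k in range(folds):
        # Every folds-th shuffled index, starting at k, forms the test set.
        test = indices[k::folds]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        splits.append((train, test))
    return splits


# The play-tennis data has 14 instances; the defaults mirror -x 10 and -s 1.
for train, test in cross_validation_folds(14):
    print("train on", len(train), "instances, test on", sorted(test))
```

Each instance ends up in exactly one test set, so every instance is used for testing once and for training in all other folds.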

GUI vs. command line
– GUI (+): visualisation of data and (some) models
– GUI (-): not all the parameters can be set (reduced functionality)
– Command line (+): full functionality (‘saving the model’), batch processing
– Command line (-): only textual visualisation of models, awkward to use
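Batch processing from the command line could look roughly like the sketch below, which builds one J48 call per combination of training file and pruning confidence and saves each model with `-d`. It uses only the options listed in the help output (`-t`, `-C`, `-d`). The file names, the model naming scheme and the helper names `batch_commands` and `run_batch` are assumptions made for illustration. Actually running the commands requires Java and the WEKA classes on the classpath.

```python
import itertools
import subprocess


def batch_commands(train_files, confidences):
    """Build one J48 command (as an argument list) per file and threshold."""
    commands = []
    for train_file, confidence in itertools.product(train_files, confidences):
        stem = train_file.rsplit(".", 1)[0]
        # Hypothetical naming scheme for the saved models.
        model_file = "%s_C%s.model" % (stem, confidence)
        commands.append([
            "java", "weka.classifiers.trees.J48",
            "-t", train_file,        # training file
            "-C", str(confidence),   # confidence threshold for pruning
            "-d", model_file,        # model output file
        ])
    return commands


def run_batch(commands):
    """Run each command and collect the textual output (needs Java and WEKA)."""
    return [subprocess.run(command, capture_output=True, text=True).stdout
            for command in commands]


# Dry run: only print the command lines that would be executed.
for command in batch_commands(["weather.arff"], [0.1, 0.25]):
    print(" ".join(command))
```

Separating command construction from execution makes it possible to inspect the planned runs before starting a long batch.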

PROs & CONs of WEKA
PROs:
– open source (GNU licence)
– platform-independent (JAVA)
– easy to use
– (relatively) easy to modify
CONs:
– relatively slow (JAVA)
– ‘incomplete’ documentation (some GUI features could be explained better)
– some features available only from command line

Let’s go to work