The Knowledge Flow Interface. 김개원, Database Laboratory.


1. Overview

Overview
- Explorer
  - An environment for experimenting on a given dataset with a variety of options
- KnowledgeFlow
  - Functionally equivalent to the Explorer, but experiments are assembled by drag and drop
  - Supports incremental learning
- Simple CLI
  - A menu entry that launches a command-line interface, an environment for running WEKA Java classes directly
- Experimenter
  - An environment for comparing several machine learning algorithms

2. Components

Data Sources & Data Sinks
- Data Sources / Data Sinks
  - Used to configure where data is read from (source) and written to (sink)
- Possible Data Sources
  - ARFF file (Attribute-Relation File Format)
  - CSV file (Comma-Separated Values)
    - A file converted from spreadsheet data
  - C4.5 file
    - A file in the format used by the C4.5 decision tree algorithm
  - Serialized Instances
    - A data file stored as serialized Java object instances
  - Database

Data Sources: File Format Comparison
[Figure: ARFF file format (left) and CSV file format (right)]
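The comparison figure did not survive in the transcript, so the following is a minimal sketch of how the same data looks in both formats. The three-row weather dataset and the helper name `arff_to_csv` are illustrative assumptions, not taken from the slides, and the converter handles only simple dense ARFF files with unquoted values.

```python
import csv
import io

# Hypothetical three-row dataset, written by hand in ARFF for illustration.
ARFF_TEXT = """\
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def arff_to_csv(arff_text):
    """Convert a simple dense ARFF document to CSV text.

    Blank lines and full-line % comments are skipped. Quoted names,
    quoted values, and sparse rows are outside the scope of this sketch.
    """
    names = []
    rows = []
    in_data = False
    for line in arff_text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        lower = line.lower()
        if in_data:
            rows.append([cell.strip() for cell in line.split(",")])
        elif lower.startswith("@attribute"):
            # The attribute name is the second whitespace-separated token.
            names.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return out.getvalue()


csv_text = arff_to_csv(ARFF_TEXT)
print(csv_text)
```

The ARFF header declares each attribute's name and type, while the CSV version keeps only the names in its first row, so type information has to be inferred when a CSV file is loaded.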

Visualization
- Visualization
  - Used to present output visually, as text or graphs
- Components
  - Data Visualizer
  - Scatter Plot Matrix
  - Attribute Summarizer
  - Model Performance Chart
  - Text Viewer
  - Graph Viewer
  - Strip Chart

Visualization
[Figure panels: Data Visualizer, Scatter Plot Matrix, Attribute Summarizer, Model Performance Chart]

Evaluation
- Evaluation
  - Used to set up the inputs to and outputs of the learning algorithms
- Components
  - Training Set Maker
  - Test Set Maker
  - Cross Validation Fold Maker
  - Train Test Split Maker
  - Class Assigner
  - Class Value Picker
  - Classifier Performance Evaluator
  - Incremental Classifier Evaluator
  - Cluster Performance Evaluator

Evaluation
- Components
  - TrainingSetMaker / TestSetMaker
    - Turns a dataset into a training set / test set.
  - CrossValidationFoldMaker
    - Builds cross-validation folds from a dataset.
    - Cross-validation
      - Checks accuracy by taking a method fitted to one sample of a population and applying it to a different sample of the same population
    - K-fold cross-validation
      - In each round, 1 fold is the test set and the remaining K-1 folds form the training set
  - TrainTestSplitMaker
    - Sets what percentage of the dataset is used as the training set
  - ClassAssigner
    - Sets the attribute that is the target of the analysis (the dependent variable)
  - ClassValuePicker
  - ClassifierPerformanceEvaluator / ClusterPerformanceEvaluator
    - Collects evaluation statistics for the algorithm
    - Connects to the Visualization components
  - IncrementalClassifierEvaluator
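The two splitting strategies described above can be sketched in a few lines. This is an illustration of the idea only, not WEKA's implementation: the function names `percentage_split` and `cross_validation_folds` are made up for this example, and no stratification by class is attempted.

```python
import random


def percentage_split(instances, train_percent, seed=1):
    """Shuffle a copy of the data and split it into (train, test)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]


def cross_validation_folds(instances, k, seed=1):
    """Yield k (train, test) pairs; each fold is the test set exactly once."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    # Deal the shuffled instances into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


data = list(range(10))
train, test = percentage_split(data, 66)
pairs = list(cross_validation_folds(data, 5))

print(len(train), len(test))
for fold_train, fold_test in pairs:
    print(sorted(fold_test), len(fold_train))
```

With 10 instances and K = 5, every round trains on 8 instances and tests on 2, and each instance is used for testing exactly once across the five rounds.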

3. Operations

Edit Operations & Action Operations
- Edit Operation
  - The Edit operations delete components and open up their configuration panel
- Actions Operation
  - The Actions operations are specific to that type of component

Connections Operation
- Connections Operation
  - The Connections operations are used to connect components
- Two kinds of connection from data sources
  - Data Set
    - Batch operation
    - Connects to classifier components that process a complete test set or training set in one batch
  - Instance
    - Stream operation
    - Connects to classifier components that support incremental learning
- Two types of connection from classifier
  - graph, text
  - batchClassifier, incrementalClassifier
    - Connect to the Performance Evaluator and the Incremental Classifier Evaluator
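The difference between the two data-source connections can be illustrated with a toy learner that accepts data either way. Everything here is hypothetical (the `MajorityClass` learner, its method names, and the three-instance dataset); it only mirrors the batch versus stream distinction and does not use the WEKA API.

```python
from collections import Counter


class MajorityClass:
    """Toy learner that predicts the most frequent class label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def train_batch(self, data_set):
        # Data Set style connection: the whole training set arrives at once.
        self.counts = Counter(label for _, label in data_set)

    def update(self, instance):
        # Instance style connection: one instance arrives at a time.
        _, label = instance
        self.counts[label] += 1

    def predict(self):
        return self.counts.most_common(1)[0][0]


def stream(data_set):
    """Emit instances one by one, as a streaming source would."""
    for instance in data_set:
        yield instance


data = [((1.0,), "yes"), ((2.0,), "no"), ((3.0,), "yes")]

batch_learner = MajorityClass()
batch_learner.train_batch(data)

incremental_learner = MajorityClass()
for inst in stream(data):
    incremental_learner.update(inst)

print(batch_learner.predict(), incremental_learner.predict())
```

Both learners end up with the same counts; the streaming one simply never needs the full dataset in hand at any one moment.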

4. Incremental Learning

Incremental Learning
- Several classifiers and filters can handle data incrementally
  - Classifiers
    - AODE, NaiveBayesUpdateable, Winnow, and the instance-based learners (IB1, IBk, KStar, LWL)
  - Filters
    - Add, AddExpression, Copy, FirstOrder, MakeIndicator, MergeTwoValues, NonSparseToSparse, NumericToBinary, NumericTransform, Obfuscate, Remove, RemoveType, RemoveWithValues, SparseToNonSparse, and SwapValues
- Incremental learning algorithms can process data files that are too large to fit in memory
- Note, however, that many instance-based learners store the entire dataset internally
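To show why an incremental learner does not need the whole file in memory, here is a minimal sketch of a categorical naive Bayes model that keeps only counts and is updated one instance at a time. It is not WEKA's NaiveBayesUpdateable; the class name `IncrementalNaiveBayes`, the Laplace smoothing, and the toy weather stream are assumptions made for illustration.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Categorical naive Bayes that stores counts, never the instances."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        # value_counts[(attribute_index, value, label)] -> count
        self.value_counts = defaultdict(int)
        # values_seen[attribute_index] -> set of values observed so far
        self.values_seen = defaultdict(set)
        self.total = 0

    def update(self, attributes, label):
        """Fold one training instance into the counts."""
        self.total += 1
        self.class_counts[label] += 1
        for i, value in enumerate(attributes):
            self.value_counts[(i, value, label)] += 1
            self.values_seen[i].add(value)

    def predict(self, attributes):
        """Return the label with the highest Laplace-smoothed log score."""
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / self.total)
            for i, value in enumerate(attributes):
                count = self.value_counts.get((i, value, label), 0)
                n_values = len(self.values_seen[i] | {value})
                score += math.log((count + 1) / (n_label + n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical stream of (attributes, class label) pairs.
stream = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("overcast", "hot"), "yes"),
    (("rainy", "mild"), "yes"),
    (("overcast", "mild"), "yes"),
]

model = IncrementalNaiveBayes()
for attributes, label in stream:
    model.update(attributes, label)

print(model.predict(("overcast", "hot")))
print(model.predict(("sunny", "hot")))
```

Memory use grows with the number of distinct attribute values and classes rather than with the number of instances. An instance-based learner, by contrast, has to keep the instances themselves in order to compare new ones against them.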

5. Example

Example (batch mode)
[Figures only: three slides of screenshots]

Example (incremental learning)
[Figure only]

Example (incremental learning)
- The strip chart plots both the accuracy and the root mean-squared probability error against time
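The two quantities on the strip chart can be maintained as running statistics over the stream of predictions. The sketch below assumes one common definition of the root mean-squared probability error: the squared difference between the predicted class distribution and the one-hot true distribution, averaged over classes and then over instances. The slides do not spell out the formula, and the `RunningEvaluator` class and the sample predictions are hypothetical.

```python
import math


class RunningEvaluator:
    """Track accuracy and RMS probability error over a prediction stream."""

    def __init__(self):
        self.seen = 0
        self.correct = 0
        self.squared_error_sum = 0.0

    def update(self, probabilities, true_label):
        """probabilities maps each class label to its predicted probability."""
        self.seen += 1
        predicted = max(probabilities, key=probabilities.get)
        if predicted == true_label:
            self.correct += 1
        # Squared distance to the one-hot true distribution, averaged over classes.
        error = sum(
            (p - (1.0 if label == true_label else 0.0)) ** 2
            for label, p in probabilities.items()
        )
        self.squared_error_sum += error / len(probabilities)

    def accuracy(self):
        return self.correct / self.seen

    def rmse(self):
        return math.sqrt(self.squared_error_sum / self.seen)


# Hypothetical predictions: (predicted distribution, true label).
predictions = [
    ({"yes": 0.9, "no": 0.1}, "yes"),
    ({"yes": 0.6, "no": 0.4}, "no"),
    ({"yes": 0.2, "no": 0.8}, "no"),
]

evaluator = RunningEvaluator()
history = []  # one (accuracy, rmse) point per time step
for probabilities, truth in predictions:
    evaluator.update(probabilities, truth)
    history.append((evaluator.accuracy(), evaluator.rmse()))

for step, (acc, err) in enumerate(history, start=1):
    print(step, round(acc, 3), round(err, 3))
```

Each entry in `history` corresponds to one point on the chart: accuracy rises toward 1 as more predictions are correct, while the probability error falls as the predicted distributions move closer to the true labels.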