Clustering, performance evaluation, and Term Project
1. Term Project
2. Resource for review



Term Project
Questions? Examples:
– Research problems in Data Mining
– Industry problems in Data Mining/Data Warehousing
– Explore new data with existing/new tools (C5, Cubist, Weka)
– Explore data in comparative analysis (different algorithms, tool extensions, data selection, preprocessing)
– Focus on solving a problem (application or technical) and conduct a literature survey

Clustering (Dunham’s ppt, Part II: clustering)
– Similarity and distance measures
– Hierarchical algorithms (single link, …)
– Partition algorithms (K-Means, PAM, …)
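As a rough illustration of the topics on this slide, the sketch below combines a distance measure (Euclidean) with a partition algorithm (K-Means). It is not taken from Dunham's slides; the function names, the `seed` and `iterations` parameters, and the choice of random initial centroids are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, iterations=100, seed=0):
    """Minimal K-Means sketch: returns (centroids, clusters).

    Assumption: initial centroids are k distinct points drawn at random.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
```

On this toy data the two groups are well separated, so the result does not depend much on the initial centroids; on real data K-Means is usually run several times with different starts.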

Additional Notes on EM Algorithms: Clustering
Witten’s book, pdf
– Background and introduction to statistics-based clustering (EM algorithm)
Dunham’s book, pp. 47-51, Part I
– Basic concept of the EM algorithm
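To make the basic concept concrete, here is a minimal sketch of EM for a mixture of two one-dimensional Gaussians. It is an illustration under stated assumptions, not the presentation from either book: the initialization (minimum and maximum of the data as starting means), the variance floor, and all names are choices made for this example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(data, iterations=50, min_var=1e-6):
    """EM sketch for a two-component 1-D Gaussian mixture.

    Returns (mean_a, mean_b, var_a, var_b, weight_a).
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Assumed initialization: extremes as means, shared overall variance.
    mean_a, mean_b = min(data), max(data)
    var_a = var_b = max(overall_var, min_var)
    weight_a = 0.5
    for _ in range(iterations):
        # E-step: probability that each point belongs to component A.
        resp = []
        for x in data:
            pa = weight_a * gaussian_pdf(x, mean_a, var_a)
            pb = (1 - weight_a) * gaussian_pdf(x, mean_b, var_b)
            total = pa + pb
            resp.append(pa / total if total > 0 else 0.5)
        # M-step: re-estimate parameters from the soft assignments.
        na = sum(resp)
        nb = n - na
        if na == 0 or nb == 0:
            break  # one component has vanished; stop rather than divide by zero
        mean_a = sum(r * x for r, x in zip(resp, data)) / na
        mean_b = sum((1 - r) * x for r, x in zip(resp, data)) / nb
        var_a = max(sum(r * (x - mean_a) ** 2 for r, x in zip(resp, data)) / na, min_var)
        var_b = max(sum((1 - r) * (x - mean_b) ** 2 for r, x in zip(resp, data)) / nb, min_var)
        weight_a = na / n
    return mean_a, mean_b, var_a, var_b, weight_a


data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
mean_a, mean_b, var_a, var_b, weight_a = em_two_gaussians(data)
```

Unlike K-Means, each point contributes to both components in proportion to its membership probability, which is the sense in which EM gives a "soft" clustering.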

Performance Evaluation
Witten’s book, Chapter 5 (see on-line notes)
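One common way to estimate a model's performance is k-fold cross-validation. The sketch below assumes that this is among the techniques the chapter discusses; the majority-class "classifier" is a deliberately trivial stand-in, and every name in the code is hypothetical.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint folds of nearly equal size."""
    return [list(range(i, n, k)) for i in range(k)]


def majority_classifier(train_labels):
    """Trivial baseline: always predict the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def cross_validated_accuracy(features, labels, k=5):
    """Fraction of held-out instances predicted correctly across k folds."""
    n = len(labels)
    correct = 0
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        # Train only on instances outside the current fold.
        train_labels = [labels[i] for i in range(n) if i not in held_out]
        predict = majority_classifier(train_labels)
        # Score the model on the fold it never saw during training.
        correct += sum(predict(features[i]) == labels[i] for i in test_idx)
    return correct / n


features = list(range(10))
labels = ["a"] * 8 + ["b"] * 2
accuracy = cross_validated_accuracy(features, labels, k=5)
```

Every instance is used for testing exactly once and never while it is also in the training set, which is what keeps the estimate from being optimistically biased.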