Data Mining Teaching experience at the FIB

What is Data Mining?
A broad set of techniques and algorithms drawn from machine learning and statistics to make decisions based on data
… plus a lot of experience and common sense

What subset is taught at the FIB?
Introduction to DM
Multivariate statistics
Clustering
Association rules
Regression and GLMs
Decision trees
Bayesian Gaussian classifiers
Nearest neighbours
Neural networks
Support vector machines

Additional subjects
Feature selection and extraction
Model evaluation, selection and combination
Data mining in the real world …
– Does it need professional software?
– A talk by a professional

Practical side
No exam
Three practical homework assignments:
– 1. Multivariate statistics (15%)
– 2. Clustering & association rules (15%)
– 3. Full Data Mining Project (70%)
Involves a 20-minute oral defense!
Lab sessions (2 hours/week)
Language selected: R
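
As an illustration, the weighting above can be written as a small calculation. This is a minimal sketch, not the course's actual grading procedure: the function name `final_grade`, the dictionary keys, and the 0 to 10 grade scale in the example are assumptions. Only the 15/15/70 weights come from the slide.

```python
# Weights taken from the slide; names and grade scale are illustrative.
WEIGHTS = {
    "multivariate_statistics": 0.15,
    "clustering_association_rules": 0.15,
    "full_project": 0.70,
}


def final_grade(grades):
    """Return the weighted average of the three homework grades.

    `grades` maps each key of WEIGHTS to a numeric grade. The 0-10
    scale used in the example below is an assumption.
    """
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing grades for: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Hypothetical grades for one student.
example = {
    "multivariate_statistics": 8.0,
    "clustering_association_rules": 6.0,
    "full_project": 7.0,
}
print(round(final_grade(example), 2))
```

With these weights the project dominates: a one-point change in the project grade moves the final grade by 0.7 points, while a one-point change in either of the first two assignments moves it by only 0.15.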

Now for the funny part …
Why two teachers?
– One from the Software department (a computer scientist)
– One from the Statistics department (a statistician)

An equivalence table

Some tips for the students
A successful data mining project has four components:
1. Good data
2. Clear goals
3. Good algorithms
4. Human expertise

The students’ replies (and subtext)
R is not a programming language
– (it can’t be programmed as if it were C)
The results are not as good as we expected
– (although we are very clever)
Too much theory
– (but later on, we shall miss it)
All in all, we have enjoyed it
– (?)

In conclusion
The students like the course, especially the practical work
They tend to work autonomously, but not always for the better
In the end, no two results are identical
The course leaves much room for improvement on the teachers’ part