Credibility: Evaluating What’s Been Learned
This lecture is based on Ch. 5 of Witten & Frank
– Plan for this week
– 3 classes before midterm
– Paper and survey discussion



Plan for this week
Monday
– Performance prediction of learning schemes
– Cross-validation, leave-one-out, bootstrap
Wednesday
– Loss function, cost-sensitive learning
– Evaluating numeric prediction
– Minimum description length (MDL) principle
– Comparing data mining schemes

3 Classes before Midterm
– Concept review exercises in class
– Less homework
– More in-class discussion
Example: What is the design idea behind Apriori, the basic association rule mining algorithm?
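As a starting point for the Apriori review question, here is a minimal Python sketch of level-wise frequent itemset mining. It assumes the usual statement of the idea: every subset of a frequent itemset must itself be frequent, so candidates of size k are built only from frequent itemsets of size k-1. The function name `apriori`, the absolute `min_support` count, and the toy transactions are illustrative assumptions, not taken from the lecture, and rule generation from the itemsets is left out.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in >= min_support transactions.

    min_support is an absolute count here (an assumption for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the data for the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, count in apriori(baskets, min_support=2).items():
        print(sorted(itemset), count)
```

The prune step is where the design idea pays off: a candidate is discarded without touching the data as soon as one of its subsets is known to be infrequent.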

Discussion of paper and survey areas
One-page progress report due 3/24
– Research paper selection (title, authors, source)
– Project proposal (problem definition and proposed solution design, dataset, algorithms, evaluation methods, or survey topic …)

Concept Questions: Class 1
What are the design motivations and ideas for the following evaluation methods?
– Cross-validation
– Leave-one-out
– 0.632 bootstrap
What is the final error estimate in each of the above methods?
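To make the three methods concrete, the following Python sketch computes an error estimate with each of them. It is only an illustration under stated assumptions: the learner is a toy 1-nearest-neighbour classifier on a single numeric feature, the folds are not stratified, and the helper names (`train_1nn`, `cross_validation`, `leave_one_out`, `bootstrap_632`) are invented for this example. The bootstrap weighting 0.632 × (error on held-out instances) + 0.368 × (error on training instances) follows the usual textbook formulation.

```python
import random


def train_1nn(train):
    """Toy learner (an assumption): 1-nearest-neighbour over (x, label) pairs."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict


def error_rate(predict, data):
    """Fraction of (x, label) pairs that predict() gets wrong."""
    return sum(predict(x) != y for x, y in data) / len(data)


def cross_validation(data, k, learner=train_1nn, seed=0):
    """k-fold CV (assumes 2 <= k <= len(data)); final estimate = mean fold error."""
    items = list(data)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    fold_errors = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        fold_errors.append(error_rate(learner(train), test))
    return sum(fold_errors) / k


def leave_one_out(data, learner=train_1nn):
    """Leave-one-out is n-fold cross-validation with one instance per fold."""
    return cross_validation(data, k=len(data), learner=learner)


def bootstrap_632(data, rounds=50, learner=train_1nn, seed=0):
    """0.632 bootstrap: train on n draws with replacement, test on the rest."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        train = [data[i] for i in idx]
        test = [data[i] for i in range(n) if i not in chosen]
        if not test:  # every instance was drawn; nothing left to test on
            continue
        predict = learner(train)
        e_test = error_rate(predict, test)
        e_train = error_rate(predict, train)
        estimates.append(0.632 * e_test + 0.368 * e_train)
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    data = [(float(x), "a") for x in range(10)]
    data += [(float(x), "b") for x in range(20, 30)]
    print("10-fold CV:     ", cross_validation(data, k=10))
    print("leave-one-out:  ", leave_one_out(data))
    print("0.632 bootstrap:", bootstrap_632(data))
```

With 1-nearest-neighbour the error on the training instances is always zero, which shows why the bootstrap does not rely on that term alone: the held-out instances carry most of the weight.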

Concept Questions: Class 2
When do we care about the accuracy of predicted probabilities?
What role does the loss function play in evaluating a learning scheme?
Which loss function should you use if you are gambling on a particular event coming up?
What is the MDL principle? Give an application of MDL in data mining.
Why is a lift chart a valuable tool in marketing?
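For the loss-function questions, a small Python sketch of two common choices may help: quadratic loss, which sums squared differences between the predicted probabilities and the 0/1 outcome vector, and informational loss, which is the negative base-2 logarithm of the probability assigned to the class that actually occurred. The function names and the dictionary representation of a prediction are assumptions made for this example.

```python
import math


def quadratic_loss(probs, actual):
    """Sum over classes of (p_j - a_j)^2, where a_j is 1 for the actual class."""
    return sum(
        (p - (1.0 if cls == actual else 0.0)) ** 2
        for cls, p in probs.items()
    )


def informational_loss(probs, actual):
    """-log2 of the probability given to the actual class (in bits)."""
    p = probs.get(actual, 0.0)
    # A zero probability for the class that occurs gives an unbounded loss.
    return math.inf if p == 0.0 else -math.log2(p)


if __name__ == "__main__":
    hedged = {"yes": 0.5, "no": 0.5}
    confident = {"yes": 1.0, "no": 0.0}
    for name, pred in [("hedged", hedged), ("confident", confident)]:
        print(name,
              quadratic_loss(pred, "no"),
              informational_loss(pred, "no"))
```

The confident-but-wrong prediction shows the contrast: quadratic loss stays bounded, while informational loss becomes infinite. That behaviour is the reason informational loss is often described in betting terms, since staking everything on an outcome that does not occur loses everything.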