Data Science Credibility: Evaluating What’s Been Learned Evaluating Numeric Prediction WFH: Data Mining, Section 5.8 Rodney Nielsen Many of these.


1 Data Science
Credibility: Evaluating What’s Been Learned
Evaluating Numeric Prediction
WFH: Data Mining, Section 5.8
Rodney Nielsen
Many of these slides were adapted from: I. H. Witten, E. Frank and M. A. Hall

2 Credibility: Evaluating What’s Been Learned
Issues: training, testing, tuning
Predicting performance
Holdout, cross-validation, bootstrap
Comparing schemes: the t-test
Predicting probabilities: loss functions
Cost-sensitive measures
Evaluating numeric prediction
The Minimum Description Length principle

3 Evaluating Numeric Prediction
Same strategies: independent training, validation and test sets, significance tests, etc. (avoid cross-validation and bootstrapping for reporting)
Difference: error measures
Actual target values: $y_1, y_2, \ldots, y_N$
Predicted target values: $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N$
Most popular measure: mean-squared error, $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$
Easy to manipulate mathematically
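As a rough illustration of the mean-squared error named on this slide, the sketch below averages the squared differences between predicted and actual target values. The function name and the four data points are made up for the example and do not come from the slides.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between predictions and targets."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical targets and predictions, invented for illustration only.
actual = [500.0, 520.0, 480.0, 510.0]
predicted = [450.0, 530.0, 470.0, 510.0]

# Errors are -50, 10, -10 and 0, so the squared errors sum to 2700.
mse = mean_squared_error(actual, predicted)
print(mse)  # 675.0
```

The single error of 50 contributes 2500 of the 2700 total, which hints at why this measure reacts strongly to large individual errors.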

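4 Other Measures
The root mean-squared error: $\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2}$
The mean absolute error is less sensitive to outliers than the mean-squared error: $\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |\hat{y}_i - y_i|$
Sometimes relative error values are more appropriate (e.g. 10% for an error of 50 when predicting 500)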
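The outlier claim can be checked with a small sketch. It compares the root mean-squared error and the mean absolute error on two hypothetical prediction sets that differ in a single badly wrong value. All names and numbers here are illustrative assumptions.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mean_absolute_error(actual, predicted):
    """Average magnitude of the error, ignoring its sign."""
    n = len(actual)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / n


# Hypothetical data: the true value is always 10.
actual = [10.0, 10.0, 10.0, 10.0]
clean = [11.0, 9.0, 11.0, 9.0]      # every prediction is off by 1
outlier = [11.0, 9.0, 11.0, 30.0]   # one prediction is off by 20

rmse_clean = root_mean_squared_error(actual, clean)
mae_clean = mean_absolute_error(actual, clean)
rmse_outlier = root_mean_squared_error(actual, outlier)
mae_outlier = mean_absolute_error(actual, outlier)

print(rmse_clean, mae_clean)      # both are 1.0
print(rmse_outlier, mae_outlier)  # RMSE is about 10.04, MAE is 5.75

# Relative error from the slide: an error of 50 when predicting 500.
relative_error = abs(450.0 - 500.0) / 500.0
print(relative_error)             # 0.1, i.e. 10%
```

With one outlier the squared-error measure grows roughly tenfold while the absolute-error measure grows less than sixfold.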

5 Improvement on the Mean
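How much does the scheme improve on simply predicting the average?
The relative squared error is: $\mathrm{RSE} = \frac{\sum_{i=1}^{N} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$, where $\bar{y}$ is the average target value
Root relative squared error: $\mathrm{RRSE} = \sqrt{\mathrm{RSE}}$
Relative absolute error: $\mathrm{RAE} = \frac{\sum_{i=1}^{N} |\hat{y}_i - y_i|}{\sum_{i=1}^{N} |y_i - \bar{y}|}$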
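The three relative measures can be sketched as ratios against a baseline that always predicts the average. This sketch takes the average over the same target values it evaluates, which is an assumption; the baseline mean could instead come from the training data. The function names and data are hypothetical.

```python
import math


def relative_squared_error(actual, predicted):
    """Squared error of the scheme divided by that of the mean predictor."""
    mean_actual = sum(actual) / len(actual)  # assumption: mean of these targets
    scheme = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    baseline = sum((a - mean_actual) ** 2 for a in actual)
    return scheme / baseline  # fails if all targets are identical


def root_relative_squared_error(actual, predicted):
    """Square root of the relative squared error."""
    return math.sqrt(relative_squared_error(actual, predicted))


def relative_absolute_error(actual, predicted):
    """Absolute error of the scheme divided by that of the mean predictor."""
    mean_actual = sum(actual) / len(actual)
    scheme = sum(abs(p - a) for a, p in zip(actual, predicted))
    baseline = sum(abs(a - mean_actual) for a in actual)
    return scheme / baseline


# Hypothetical data, invented for illustration only.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 4.5]

rse = relative_squared_error(actual, predicted)
rrse = root_relative_squared_error(actual, predicted)
rae = relative_absolute_error(actual, predicted)
print(rse, rrse, rae)  # 0.125, about 0.354, about 0.417

# Predicting the average everywhere scores exactly 1 (that is, 100%).
rse_of_mean = relative_squared_error(actual, [3.0] * 5)
print(rse_of_mean)     # 1.0
```

Values below 1 mean the scheme beats the mean predictor, and values above 1 mean it does worse.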

6 Correlation Coefficient
Measures the statistical correlation between the predicted values and the actual values
Pearson product-moment correlation coefficient, ρ (rho)
Scale independent, between -1 and +1
Good performance leads to large values
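A minimal sketch of the correlation coefficient between predicted and actual values follows. It uses the standard Pearson formula: the sum of products of deviations from the two means, divided by the square root of the product of the two sums of squared deviations. The function name and data are illustrative assumptions.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted target values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sum of products of deviations from the two means.
    s_pa = sum((p - mean_p) * (a - mean_a) for a, p in zip(actual, predicted))
    # Sums of squared deviations for each variable.
    s_a = sum((a - mean_a) ** 2 for a in actual)
    s_p = sum((p - mean_p) ** 2 for p in predicted)
    return s_pa / math.sqrt(s_a * s_p)  # fails if either variable is constant


# Hypothetical data, invented for illustration only.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 4.5]

rho = correlation_coefficient(actual, predicted)
print(rho)  # about 0.943

# Predictions that fall as the targets rise give the lowest possible value.
rho_reversed = correlation_coefficient(actual, [5.0, 4.0, 3.0, 2.0, 1.0])
print(rho_reversed)  # -1.0
```

Unlike the error measures, larger values indicate better performance here.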

7 Pearson product-moment correlation coefficient
Examples of scatter diagrams with different values of correlation coefficient (ρ)

8 Pearson product-moment correlation coefficient
Several sets of (x, y) points, with the correlation coefficient of x and y for each set.
Note that the correlation reflects the strength and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom).
Note: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.
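The caption's three points can be reproduced numerically. The sketch below is a hypothetical helper that returns None when either variable has zero variance, mirroring the undefined case. The data sets are made up to imitate the figure's rows.

```python
import math


def pearson(xs, ys):
    """Pearson correlation, or None when either variable has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        return None  # undefined: the formula would divide by zero
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

steep = pearson(xs, [10.0 * x for x in xs])    # slope 10
shallow = pearson(xs, [0.1 * x for x in xs])   # slope 0.1
falling = pearson(xs, [-3.0 * x for x in xs])  # slope -3
parabola = pearson(xs, [x * x for x in xs])    # exact but nonlinear relation
flat = pearson(xs, [1.0] * len(xs))            # slope 0, zero variance in y

print(steep, shallow)  # both 1.0 up to rounding: the slope does not matter
print(falling)         # -1.0 up to rounding: only the direction changed
print(parabola)        # 0.0: a perfect nonlinear dependence goes unnoticed
print(flat)            # None: undefined when y never varies
```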

9 Which Measure?
Student Q: In what situations would we want to use the correlation coefficient as a performance measure for numeric prediction?
Best to look at all of them
Often it doesn’t matter
Example:

                              A       B       C       D
Root mean-squared error       67.8    91.7    63.3    57.4
Mean absolute error           41.3    38.5    33.4    29.2
Root relative squared error   42.2%   57.2%   39.4%   35.8%
Relative absolute error       43.1%   40.1%   34.8%   30.4%
Correlation coefficient       0.88    0.88    0.89    0.91

D best
C second-best
A, B arguable
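The slide's conclusion can be checked by ranking the four schemes on each measure. The sketch below uses the values from the example. The slide lists only three correlation values, so using 0.88 for both A and B is an assumption. The dictionary layout and function name are hypothetical.

```python
# Each entry: (scores per scheme, True if larger values are better).
measures = {
    "Root mean-squared error": (
        {"A": 67.8, "B": 91.7, "C": 63.3, "D": 57.4}, False),
    "Mean absolute error": (
        {"A": 41.3, "B": 38.5, "C": 33.4, "D": 29.2}, False),
    "Root relative squared error (%)": (
        {"A": 42.2, "B": 57.2, "C": 39.4, "D": 35.8}, False),
    "Relative absolute error (%)": (
        {"A": 43.1, "B": 40.1, "C": 34.8, "D": 30.4}, False),
    # Assumption: A and B share the value 0.88.
    "Correlation coefficient": (
        {"A": 0.88, "B": 0.88, "C": 0.89, "D": 0.91}, True),
}


def rank_schemes(scores, higher_is_better):
    """Return scheme names ordered from best to worst on one measure."""
    return sorted(scores, key=lambda name: scores[name], reverse=higher_is_better)


rankings = {
    name: rank_schemes(scores, higher_is_better)
    for name, (scores, higher_is_better) in measures.items()
}

for name, order in rankings.items():
    print(f"{name}: {' > '.join(order)}")
```

Every measure puts D first and C second. A beats B on the squared-error measures while B beats A on the absolute-error measures, which is why the order of A and B is arguable.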

