Assignments CS 434-534 Fall 2015

Assignment 1 (due 9-18-15):
- Generate the in silico data set y = 2 sin(1.5x) + N(0,1) with 100 random values of x between 0 and 5.
- Use 25 samples for training, 75 for validation.
- Fit polynomials of degree 1-5 to the training set; calculate E_train and E_val at each degree.
- Plot your results as shown on the previous slide to find the "elbow" in E_val and the best complexity for data mining.
- Use the full data set to find the optimum polynomial at the best complexity.
- Show this result as a plot of the data and the fit on the same set of axes; report the minimum sum of squared residuals and the coefficient of determination. (A MATLAB sketch of the workflow follows below.)
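A minimal MATLAB sketch of this workflow, assuming a mean-squared-error definition of E_train and E_val; the random seed and the elbow degree are placeholders, not prescribed by the assignment.

```matlab
% Assignment 1 sketch: generate data, fit polynomials, locate the elbow.
rng(0);                                % fixed seed for reproducibility (assumption)
x = 5*rand(100,1);                     % 100 random x values on [0,5]
y = 2*sin(1.5*x) + randn(100,1);       % target plus N(0,1) noise

xt = x(1:25);   yt = y(1:25);          % 25 training samples
xv = x(26:end); yv = y(26:end);        % 75 validation samples

Etrain = zeros(5,1); Eval_ = zeros(5,1);   % Eval_ avoids shadowing eval()
for d = 1:5
    p = polyfit(xt, yt, d);                        % least-squares fit of degree d
    Etrain(d) = mean((polyval(p, xt) - yt).^2);
    Eval_(d)  = mean((polyval(p, xv) - yv).^2);
end
plot(1:5, Etrain, '-o', 1:5, Eval_, '-s');
xlabel('polynomial degree'); legend('E_{train}', 'E_{val}');

dbest = 3;                             % placeholder: read the elbow off the plot
p = polyfit(x, y, dbest);              % refit at the best complexity on all data
SSR = sum((y - polyval(p, x)).^2);     % minimum sum of squared residuals
R2  = 1 - SSR/sum((y - mean(y)).^2);   % coefficient of determination
```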

Assignment 2 (due now): Suppose we want ε < 0.1 with 90% confidence (i.e. δ = 0.1). We require N ≥ (8/ε²) ln(4 m_H(2N)/δ); using m_H(N) ≤ N^d_VC + 1, we get N ≥ (8/ε²) ln(4((2N)^d_VC + 1)/δ). Use a non-linear root-finding code to solve this implicit relationship for N with d_VC = 3 and 6 (see the fzero sketch below).
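A short sketch of the root finding with MATLAB's fzero, treating the bound above as the equality N = (8/ε²) ln(4((2N)^d_VC + 1)/δ); the bracketing interval was chosen by trial and is an assumption.

```matlab
% Solve N = (8/epsilon^2) * ln(4*((2N)^dvc + 1)/delta) for the sample size N.
epsilon = 0.1; delta = 0.1;
for dvc = [3 6]
    f = @(N) N - (8/epsilon^2) * log(4*((2*N).^dvc + 1)/delta);
    N = fzero(f, [1e3 1e6]);           % interval brackets the root (assumption)
    fprintf('d_VC = %d : N = %.0f\n', dvc, N);
end
```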

Revised Assignment 3:
- Classify beer bottles for the 3 breweries with the most data; the randomized, shortened dataset is on the website. For better results, change label 6 to 3.
- Fit and bin the results; calculate the confusion matrix and the training accuracy.
- Estimate the CV-1 accuracy by the "leave one out" method (sketched below).
- Calculate ε and the confidence in E_out.
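A hedged sketch of the leave-one-out estimate, assuming classification by a least-squares linear fit whose real-valued output is binned to the nearest label; X and y are placeholder names for the loaded attributes and relabeled classes.

```matlab
% Leave-one-out ("CV-1") accuracy for the fit-and-bin classifier.
% Assumes X is n-by-d attributes and y is n-by-1 labels in {1,2,3}
% (label 6 already changed to 3).
n = size(X,1);
correct = 0;
for i = 1:n
    idx = [1:i-1, i+1:n];                  % hold out example i
    w = [ones(n-1,1) X(idx,:)] \ y(idx);   % least-squares linear fit
    yhat = [1, X(i,:)] * w;                % real-valued prediction for example i
    ybin = min(max(round(yhat), 1), 3);    % bin to the nearest label 1..3
    correct = correct + (ybin == y(i));
end
cv1_accuracy = correct/n;
```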

Assignment 4:
- Code the logistic regression algorithm for a fixed learning rate; use the stopping criteria on slide 48. For w(0), use random numbers uniformly distributed on [0,1].
- Modify the csv file logit-data on the class web page as needed to obtain a training set of 298 samples.
- For 10 different draws of w(0), find the optimum w and E_in.
- For the best case (smallest E_in), use the risk scores to construct a 2x2 confusion matrix.
- Use the risk scores of the best case to calculate the probability of a heart attack for each example in the training set; plot these probabilities using different symbols for positive and negative examples. (A gradient-descent sketch follows below.)
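A hedged sketch of fixed-rate gradient descent on the logistic cross-entropy error E_in; the learning rate and stopping threshold here are placeholders rather than the slide-48 criteria, and the labels are assumed coded as y in {-1, +1}.

```matlab
% Logistic regression by gradient descent with a fixed learning rate.
% Assumes X (n-by-d attributes) and y (n-by-1, values -1/+1) are loaded
% from the modified logit-data csv file.
[n, d] = size(X);
Z = [ones(n,1) X];                     % prepend the bias coordinate x0 = 1
w = rand(d+1, 1);                      % w(0) uniform on [0,1]
eta = 0.1;                             % fixed learning rate (assumption)
for t = 1:100000
    g = -(Z' * (y ./ (1 + exp(y .* (Z*w))))) / n;   % gradient of E_in
    w = w - eta*g;
    if norm(g) < 1e-6, break; end      % placeholder stopping criterion
end
Ein  = mean(log(1 + exp(-y .* (Z*w))));   % in-sample cross-entropy error
risk = 1 ./ (1 + exp(-(Z*w)));            % risk score: P(heart attack | x)
```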

Revised Assignment 5:
- Find the eigenvalues and eigenvectors of the covariance matrix for the data set randomized shortened glassdata.csv.
- Plot the PoV. How many eigenvalues are required to capture more than 90% of the variance?
- Transform the attribute data by the eigenvectors of the 3 largest eigenvalues; make a scatter plot of pc1 vs pc2 with data labels. (A PCA sketch follows below.)
- Use a validation set of 100 examples to find the best quadratic extension of a linear model by successively including z1^2, z2^2, z3^2, z1z2, z1z3, and z2z3. Plot E_val vs the number of added terms and identify the elbow.
- Use all the data to compare the best quadratic extension with the linear model in attribute space (confusion matrix and fraction correctly classified).
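A sketch of the eigen-decomposition and PoV steps in base MATLAB; the variable names X (attribute columns) and y (class labels) are assumptions about how the csv file is loaded.

```matlab
% PCA of the glass attributes: eigenvalues, PoV plot, top-3 projection.
Xc = X - mean(X);                      % center each attribute column
[V, D] = eig(cov(Xc));
[lam, order] = sort(diag(D), 'descend');
V = V(:, order);                       % eigenvectors, largest eigenvalue first
PoV = cumsum(lam)/sum(lam);            % proportion of variance captured
plot(PoV, '-o'); xlabel('number of eigenvalues'); ylabel('PoV');
k = find(PoV > 0.9, 1)                 % eigenvalues needed for >90% of variance
Z = Xc * V(:, 1:3);                    % z1, z2, z3: projections on top-3 PCs
scatter(Z(:,1), Z(:,2), 25, y, 'filled');  % pc1 vs pc2, colored by class label
xlabel('pc1'); ylabel('pc2');
```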

Assignment 6:
- Use the dataset randomized shortened glassdata.csv to develop a classifier for beer-bottle glass by ANN non-linear regression. Keep the class labels as 1, 2, and 6.
- With a validation set of 100 examples and a training set of 74 examples, select the best number of hidden nodes in a single hidden layer and the best number of epochs for weight refinement (see the sketch below).
- Use all the data to optimize the weights at the selected structure and training time; calculate the confusion matrix and the accuracy of prediction.
- Use 10-fold cross-validation to estimate the accuracy on a test set.
- MATLAB code for calculating confusion matrices is on the class web page.
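A hedged sketch of the structure-selection loop, assuming MATLAB's Neural Network Toolbox (fitnet/train) rather than the code on the class web page; the candidate node counts and the fixed index split are assumptions.

```matlab
% Select the hidden-layer size by validation error; labels 1, 2, 6 are
% treated as regression targets and predictions binned to the nearest label.
% Assumes X is n-by-d attributes and y is n-by-1 labels, with the first
% 74 rows used for training and the next 100 for validation.
labels = [1 2 6];
best = struct('vperf', inf, 'h', 0, 'epochs', 0);
for h = 1:10                               % candidate hidden-node counts (assumption)
    net = fitnet(h);
    net.divideFcn = 'divideind';           % fixed train/validation split
    net.divideParam.trainInd = 1:74;
    net.divideParam.valInd   = 75:174;
    net.divideParam.testInd  = [];
    [net, tr] = train(net, X', y');        % early stopping picks the epochs
    if tr.best_vperf < best.vperf
        best = struct('vperf', tr.best_vperf, 'h', h, 'epochs', tr.best_epoch);
    end
end
net = fitnet(best.h);                      % refit at the selected structure
net.trainParam.epochs = best.epochs;       % and the selected training time
net.divideFcn = 'dividetrain';             % use all of the data
net = train(net, X', y');
yhat = net(X')';                           % real-valued network outputs
[~, idx] = min(abs(yhat - labels), [], 2); % bin to the nearest class label
ypred = labels(idx); ypred = ypred(:);
```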