Simple Multiple Line Fitting Algorithm Yan Guo

Motivation: To generate better results than the EM algorithm and to avoid getting trapped in local optima.

Line Score: To describe how well a set of points fits a line, I developed a score function for the line model. A higher score indicates the points are more nearly collinear; a lower score indicates they are less likely to form a line. The score can be raised either by adding a point or by removing one.

Simple Linear Regression
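As a reference for the slides that follow, a minimal statement of the fit, assuming the standard ordinary least-squares setup (the slide's own notation is an assumption on my part):

$$\hat{y}_i = b_0 + b_1 x_i, \qquad b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}.$$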

R² is the proportion of variation explained by the regression model. It indicates how well the prediction line fits the data; in general, a higher value means a better fit. R² is the square of the Pearson correlation.
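In terms of sums of squares (standard definitions, with SSE the residual sum of squares, SSR the regression sum of squares, and SST the total sum of squares):

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}},$$

and in simple linear regression $R^2 = r^2$, where $r$ is the Pearson correlation between $x$ and $y$.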

Leverage: used to measure the influence of a point on the fitted line. Studentized residual and jackknife residual: used to assess how far each point lies from the fit; the jackknife residual follows a t distribution with (n - 3) degrees of freedom.
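For reference, the standard textbook forms of these quantities for simple linear regression (not necessarily the exact notation used on the slide), with $e_i$ the residual of point $i$, $s$ the residual standard error, and $s_{(i)}$ the residual standard error computed with point $i$ removed:

$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_j (x_j - \bar{x})^2}, \qquad r_i = \frac{e_i}{s\sqrt{1 - h_i}}, \qquad t_i = \frac{e_i}{s_{(i)}\sqrt{1 - h_i}},$$

where $h_i$ is the leverage, $r_i$ the studentized residual, and $t_i$ the jackknife residual, with $t_i \sim t_{n-3}$ (two fitted parameters).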

Line Score: Two factors enter the Line Score function: the R-square of the fit and the proportion of points that lie in the line. The score is defined in terms of N, the total number of points in the input, and n, the number of points in the current line.
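The exact formula is not reproduced here, so the following is only a minimal sketch, assuming the two factors are simply added (R-squared plus n/N); the author's actual combination may differ.

```python
import numpy as np

def line_score(x, y, N):
    """Hypothetical line score: R-squared of the least-squares fit plus the
    proportion n/N of the input points assigned to this line. The slide's
    actual formula may combine the two factors differently."""
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)      # least-squares fit of y on x
    residuals = y - (intercept + slope * x)
    sst = np.sum((y - np.mean(y)) ** 2)         # total sum of squares
    r_squared = 1.0 if sst == 0 else 1.0 - np.sum(residuals ** 2) / sst
    return r_squared + n / N
```

Under this reading, a perfect line containing every input point scores 2, while a stripe holding a small, noisy subset of the points scores well below 1.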

Experimenting with the Line Score

The algorithm (see the Python sketch after this list):
1. Divide the area into a finite number of stripes.
2. Calculate the line score for each stripe.
3. Pick the stripe with the highest score and filter out outliers.
4. Recalculate the stripe area with the fitted line in the middle.
5. Recalculate the line score using the points now inside the stripe.
6. If the new line score is higher, continue to the next step; otherwise go back to step 3 and pick the stripe with the next highest score.
7. Recalculate the stripe with the newly fitted line in the middle and go back to step 5. Repeat until no more points are added to the stripe.
8. Remove the points of the final stripe from the input and repeat from step 1.
9. Finalize the results, detecting noise, etc.
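A rough Python sketch of this loop follows, under several assumptions of mine that the slides do not specify: the stripes are horizontal bands of the y-range, the refined stripe is a fixed half-width band around the fitted line, the line score takes the R-squared-plus-proportion form sketched earlier, and the outlier filtering of step 3 is omitted for brevity.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y on x; returns (intercept, slope)."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def line_score(x, y, N):
    """Assumed score (see the earlier sketch): R-squared plus n/N."""
    b0, b1 = fit_line(x, y)
    sst = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 if sst == 0 else 1.0 - np.sum((y - (b0 + b1 * x)) ** 2) / sst
    return r2 + len(x) / N

def extract_one_line(x, y, N, n_stripes=10, half_width=0.5, min_pts=3):
    """Steps 1-7: find one line and return the mask of points it uses."""
    edges = np.linspace(y.min(), y.max(), n_stripes + 1)              # step 1
    stripes = [(y >= lo) & (y <= hi) for lo, hi in zip(edges[:-1], edges[1:])]
    stripes = [m for m in stripes if m.sum() >= min_pts]
    scored = sorted(((line_score(x[m], y[m], N), m) for m in stripes),
                    key=lambda t: t[0], reverse=True)                 # steps 2-3
    for score, mask in scored:
        while True:
            b0, b1 = fit_line(x[mask], y[mask])
            # Steps 4 and 7: re-center the stripe on the fitted line.
            new_mask = np.abs(y - (b0 + b1 * x)) <= half_width
            if new_mask.sum() < min_pts or np.array_equal(new_mask, mask):
                return mask                  # no more points are being added
            new_score = line_score(x[new_mask], y[new_mask], N)       # step 5
            if new_score <= score:           # step 6: try the next stripe
                break
            score, mask = new_score, new_mask
    return None

def simple_multiple_line_fitting(x, y, max_lines=5):
    """Step 8: peel lines off one at a time until none can be found."""
    remaining = np.ones(len(x), dtype=bool)
    lines = []
    for _ in range(max_lines):
        if remaining.sum() < 3:
            break
        mask = extract_one_line(x[remaining], y[remaining], len(x))
        if mask is None:
            break
        b0, b1 = fit_line(x[remaining][mask], y[remaining][mask])
        lines.append((b0, b1))
        idx = np.flatnonzero(remaining)
        remaining[idx[mask]] = False         # remove the used points (step 8)
    return lines
```

In practice the stripe count and half-width would need to be tuned to the scale of the data, and step 9 (final noise detection) is not shown.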

Simple Example (plots omitted): Line Score = 1.4 vs. Line Score = 2.

Complicated Example (plot omitted): Line Score = 0.578.

Filter out outliers

Extreme Cases

Future Improvements: A better scoring function. Are there more factors that should be considered in this function (the PRESS statistic, Mallows' Cp statistic, p-values, CVSS, etc.)? Adjusted R-square vs. R-square vs. correlation.

Conclusion: This algorithm works in some cases. It does not require initialization. It works best when the lines are perfectly straight. It can detect noise. It will not work in all cases, since it is probability-based.