Rodent Complaints in Boston. Number of rodent complaints in Boston per 2010 US census tract, 2011-2013.

Question: Is the spatial pattern of rodent complaints in Boston related to other information in (a) the Mayor’s Service Hotline data or (b) the 2010 US census? Result: While statistically significant correlations are found, no clear causal relationship is suggested by the information at hand. Data: Boston Mayor’s Service Hotline; US census (links to mass.gov website).

Linear Model of Rodent Complaints – OLS. Ordinary least squares (OLS) model of rodent complaints. Exogenous variables: 44 variables extracted from the Mayor’s Service Hotline data and the 2010 census. Question: Can other information in the Mayor’s Service Hotline data and the 2010 census explain the spatial variability in rodent complaints? Note: Gray census tracts are those with fewer than 500 residents, plus one outlier located in Allston. The model captures some of the spatial pattern in rodent complaints, but the difference map (observed minus modeled) reveals model deficiencies. Of particular importance are the large residuals in census tracts with high observed complaint counts.
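To make the OLS step concrete, here is a minimal sketch using NumPy. It is not the code behind the slides: the data are synthetic stand-ins (five made-up predictors instead of the 44 real ones), and the helper names `fit_ols` and `predict_ols` are my own.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients (intercept first) for predictors X and response y."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict_ols(beta, X):
    """Evaluate the fitted linear model at the rows of X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta

# Synthetic stand-in for tract-level data (NOT the Boston data).
rng = np.random.default_rng(0)
n_tracts, n_vars = 200, 5
X = rng.normal(size=(n_tracts, n_vars))
true_rate = np.exp(1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1])
y = rng.poisson(true_rate).astype(float)  # count-valued response

beta_ols = fit_ols(X, y)
residuals = y - predict_ols(beta_ols, X)  # what a difference map would show per tract
```

Nothing in the linear predictor keeps it above zero, which is one reason OLS can be a poor match for count data like complaints.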

Linear Model of Rodent Complaints – Poisson. Generalized linear model (GLM), assuming a Poisson distribution of rodent complaints. Question: Can we build a better model within a GLM framework by assuming that rodent complaints follow a Poisson distribution? Note: Gray census tracts are those with fewer than 500 residents, plus one outlier located in Allston. This exercise is reasonable because rodent complaints in Boston follow something closer to a Poisson than a Gaussian distribution. Flipping between this slide and the previous one shows that the red/blue tones in the difference map are somewhat muted for the GLM. However, large residuals persist.
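As a rough illustration of what a Poisson GLM fit involves, the sketch below implements iteratively reweighted least squares (IRLS) with a log link in plain NumPy. The log link, the synthetic data, and the names `fit_poisson_glm` and `predict_poisson` are assumptions for illustration; the slides do not say which software or link function was used.

```python
import numpy as np

def fit_poisson_glm(X, y, n_iter=50, tol=1e-10):
    """Fit a Poisson GLM with log link by IRLS; returns coefficients, intercept first."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    beta[0] = np.log(y.mean())              # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X1 @ beta                     # linear predictor
        mu = np.exp(eta)                    # expected counts, always positive
        z = eta + (y - mu) / mu             # working response
        XtW = X1.T * mu                     # working weights equal mu for Poisson
        beta_new = np.linalg.solve(XtW @ X1, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

def predict_poisson(beta, X):
    """Expected count for each row of X under the fitted model."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return np.exp(X1 @ beta)

# Synthetic stand-in for tract-level data (NOT the Boston data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.poisson(np.exp(1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1])).astype(float)

beta_glm = fit_poisson_glm(X, y)
mu_hat = predict_poisson(beta_glm, X)
```

Because the prediction is the exponential of the linear predictor, modeled counts cannot go negative, unlike in the OLS fit.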

Linear Model of Rodent Complaints – Poisson. Improvement from the Poisson GLM may be difficult to see in the maps, so I plot true and modeled rodent complaints in ascending order. The GLM outperforms OLS at small values of rodent complaints, where OLS often predicts negative values. The Poisson regression also performs better at large values of rodent complaints, though there is still room for improvement. Robust interpretation of a model with many exogenous variables, some of which may be strongly collinear, is difficult. I therefore seek a simpler model.
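The ascending-order comparison can be sketched as follows. I assume here that tracts are sorted by observed count and that each model's predictions are reordered the same way; the numbers are made-up toy values, not results from the Boston data, and `sort_for_comparison` is a hypothetical helper.

```python
import numpy as np

def sort_for_comparison(observed, *predictions):
    """Order tracts by observed count and apply the same ordering to each prediction."""
    order = np.argsort(observed, kind="stable")
    return observed[order], [p[order] for p in predictions]

# Toy values for illustration only (NOT results from the slides).
observed = np.array([3.0, 0.0, 7.0, 1.0])
ols_pred = np.array([2.5, -0.4, 5.0, 1.2])   # a linear model may go negative
glm_pred = np.array([2.8, 0.3, 6.1, 0.9])    # a log-link model stays positive

obs_sorted, (ols_sorted, glm_sorted) = sort_for_comparison(observed, ols_pred, glm_pred)
```

Plotting `obs_sorted`, `ols_sorted`, and `glm_sorted` against rank makes it easy to compare the two models at the low and high ends of the range.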

Linear Model of Rodent Complaints – Sparsity. I perform OLS regression again, regularizing the vector of regression coefficients using its L1 norm. The strength of the regularization is controlled by a parameter, α. L1 regularization promotes sparse solutions, meaning that many regression coefficients are set to zero. Plot: OLS coefficients at different strengths of regularization. The plot shows regression coefficients turning on as I relax the regularization constraint (moving from right to left on the x-axis). Perhaps I can make a simpler model using the first few coefficients to turn on. I select the variables associated with the first five regression coefficients to turn on, and build a linear model from this smaller set of variables.
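One way to reproduce the "first coefficients to turn on" idea is a small lasso solver run over a decreasing grid of α values. This is a sketch under my own assumptions: the objective is scaled as (1/2n)·||y − Xb||² + α·||b||₁, the solver is cyclic coordinate descent, predictors are standardized, and the data are synthetic. The function names are hypothetical and the slides do not state which solver was used.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink v toward zero by t; return exactly zero when |v| <= t."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200, tol=1e-8):
    """Minimise (1/2n)||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.
    Assumes the columns of X are standardised and y is centred."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.astype(float)                      # running residual y - X b (a copy)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            old = b[j]
            r += X[:, j] * old               # remove variable j's contribution
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, alpha) / col_sq[j]
            r -= X[:, j] * b[j]              # add the updated contribution back
            max_change = max(max_change, abs(b[j] - old))
        if max_change < tol:
            break
    return b

def first_to_turn_on(X, y, alphas, k):
    """Relax alpha from strong to weak; return the first k variables to become nonzero.
    Variables entering at the same grid value are listed in index order."""
    order = []
    for alpha in sorted(alphas, reverse=True):
        for j in np.flatnonzero(lasso_cd(X, y, alpha)):
            if j not in order:
                order.append(int(j))
        if len(order) >= k:
            break
    return order[:k]

# Synthetic data (NOT the Boston data): only variables 2 and 7 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + rng.normal(size=200)
y = y - y.mean()

alphas = np.logspace(1, -3, 40)
selected = first_to_turn_on(X, y, alphas, k=5)  # five, as on the slide
```

On this toy problem the two informative variables should enter first, with the remaining slots filled by noise variables, which illustrates why the selected set is not automatically meaningful.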

Linear Model of Rodent Complaints – Conclusions. A Poisson GLM using the five variables selected on the previous slide reveals nothing about rodent complaints in Boston. I skip showing the results because they are of no interest. Instead, I summarize my findings and move on to Part 2: unsupervised learning! Conclusions: The spatial distribution of rodent complaints is not obviously causally related to most information in the data set. Choosing an appropriate functional form for y can affect regression results. L1 regularization can provide sparse estimates of regression coefficients, but this doesn’t necessarily make regressions easier to interpret. Other data may be more useful for understanding the spatial distribution of rodents in Boston. I would prefer to have data on the age of buildings, zoning information (more rats around more food waste?), and the population density of outdoor cats! Most importantly, if this were a serious investigation, I would first speak with an expert in rodent control. Someone has put thought into this before, and that person could help facilitate this kind of analysis.