A linear approach to predicting house prices

Presentation transcript:

A linear approach to predicting house prices Jeno Yamma

Key takeaways: why exploring the data is very important; an understanding of the initial approach to a dataset; what distributions can tell us about the data; using a linear regression as opposed to a more complicated model; understanding the model evaluation method, root mean squared logarithmic error (RMSLE); reducing bias and variance; evaluating the model's predictions.
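The evaluation method named above can be sketched in a few lines. This section of the slides does not spell out the formula, so the commonly used definition is assumed here: the square root of the mean of (log(1 + predicted) - log(1 + actual))^2. The function name and the prices are made up for illustration.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # log1p(x) = log(1 + x), which keeps a value of zero well-defined
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical prices, for illustration only
actual = [500_000, 750_000, 1_200_000]
predicted = [520_000, 700_000, 1_500_000]
print(round(rmsle(actual, predicted), 4))
```

Because the error is taken on the log scale, it penalizes relative rather than absolute differences, so being off by 10% on a cheap house costs about as much as being off by 10% on an expensive one.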

Why data exploration is important. Business: come up with an optimal business solution to the problem. Data exploration allows us to: check whether the data given is appropriate (integrity, missing values, outliers, etc.); brainstorm ideas; see the behavior of the data; communicate the data to stakeholders to help explain your prediction. Different tools you can use: R, Python, Tableau, Excel!

Initial approach

Sit and wait for it to load…

Did your tool read in the data correctly? Do a simple data type check after reading in the data. It is important to correct this first: it leads to better model performance and exploration. Should we change anything?
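A data type check of this kind might look like the sketch below. It uses only the Python standard library; the column names and values are hypothetical and are not taken from the presentation's dataset.

```python
import csv
import io

# Hypothetical extract; column names and values are made up for illustration
RAW = """Rooms,Price,Date,Postcode
3,850000,3/12/2016,3067
2,,4/02/2016,3042
4,1200000,4/03/2017,3121
"""


def infer_type(values):
    """Return 'int', 'float' or 'str' for a column of raw strings, ignoring blanks."""
    present = [v for v in values if v.strip() != ""]
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in present:
                cast(v)
            return name
        except ValueError:
            continue
    return "str"


rows = list(csv.DictReader(io.StringIO(RAW)))
column_types = {
    col: infer_type([row[col] for row in rows]) for col in rows[0]
}
print(column_types)
```

In this toy extract the date column comes back as a plain string and the postcode as an integer. Both are candidates for changing: a date usually needs parsing, and a postcode is a label rather than a quantity.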

Statistical summaries are important. They give us an idea of the mean and spread of each variable. What is this and this?
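For readers working in Python rather than Tableau or Excel, a summary of one numeric variable can be sketched with the standard library. The function name, the choice of statistics and the sample values are assumptions for illustration.

```python
import statistics


def summarize(values):
    """Return count, mean, sample standard deviation, min, quartiles and max."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }


# Hypothetical room counts, for illustration only
rooms = [2, 3, 3, 4, 3, 2, 8]
for name, value in summarize(rooms).items():
    print(f"{name:>5}: {value}")
```

A maximum that sits far above the 75% quartile, as the 8 does here, is the kind of value worth a second look.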

Missing values: keep or remove?
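One way to inform the keep-or-remove decision is to measure how much of each column is missing, then compare dropping rows with filling in a typical value. The sketch below assumes missing values are represented as None; the records and helper names are hypothetical, and median imputation is only one of several possible choices.

```python
import statistics

# Hypothetical records; None marks a missing value
records = [
    {"Rooms": 3, "Landsize": 202.0, "YearBuilt": None},
    {"Rooms": 2, "Landsize": 156.0, "YearBuilt": 1900},
    {"Rooms": 4, "Landsize": None, "YearBuilt": None},
    {"Rooms": 3, "Landsize": 134.0, "YearBuilt": 2014},
]


def missing_share(records):
    """Fraction of rows with a missing value, per column."""
    return {
        col: sum(r[col] is None for r in records) / len(records)
        for col in records[0]
    }


def drop_rows_missing(records, column):
    """Option 1: remove rows where the column is missing."""
    return [r for r in records if r[column] is not None]


def fill_with_median(records, column):
    """Option 2: keep the rows and fill the gaps with the column median."""
    median = statistics.median(
        r[column] for r in records if r[column] is not None
    )
    return [
        {**r, column: median if r[column] is None else r[column]}
        for r in records
    ]


print(missing_share(records))
print(len(drop_rows_missing(records, "YearBuilt")), "rows left after dropping")
print(fill_with_median(records, "Landsize"))
```

A column that is mostly empty may be better dropped entirely, while a column with a handful of gaps is usually worth keeping.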

Checking variable relationships with a scatterplot. Do you know how to read this?

Easier to read. Can you find the most significant relationship?
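The same question can be asked numerically. The sketch below ranks features by the absolute Pearson correlation with price, which is used here as a stand-in for "most significant"; it only captures linear relationships. The feature names and numbers are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, for illustration only (price in thousands)
price = [400, 500, 600, 700, 800]
features = {
    "Rooms": [2, 3, 3, 4, 5],
    "Distance": [10, 2, 8, 3, 9],
}

# Strongest linear relationship with price first
ranked = sorted(
    ((name, pearson(values, price)) for name, values in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name}: r = {r:.3f}")
```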

Checking the spread of the data using boxplots
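A boxplot conventionally draws its whiskers at 1.5 times the interquartile range beyond the quartiles and plots anything further out as an individual point. Assuming that convention, the points a boxplot would single out can be listed directly. The bathroom counts are hypothetical.

```python
import statistics


def boxplot_outliers(values, whisker=1.5):
    """Values lying beyond whisker * IQR from the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical bathroom counts, for illustration only
bathrooms = [1, 1, 2, 1, 2, 3, 2, 1, 8]
print(boxplot_outliers(bathrooms))
```

Being flagged this way does not make a value wrong. It only marks it as worth investigating before deciding whether to keep it.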

or distributions. A house with 8 bathrooms…

Always be curious about your data. 8 bathrooms, only 4 rooms, built in 1928… damn. 8 rooms, a small number of bathrooms, and the land size is close to the average land size…

Exploring the categorical section

Sales Date

Property Type

Seller

Exploring the categorical section
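For a categorical variable such as property type or seller, a simple first look is the number of sales and the median price per category. The sketch below assumes records stored as dictionaries; the column names, category codes and prices are hypothetical.

```python
import statistics
from collections import defaultdict


def summarize_by_category(records, category, target):
    """Count and median of `target` for each value of `category`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[category]].append(record[target])
    return {
        key: {"count": len(values), "median_price": statistics.median(values)}
        for key, values in groups.items()
    }


# Hypothetical sales (price in thousands); "h" and "u" are made-up type codes
sales = [
    {"Type": "h", "Price": 900},
    {"Type": "u", "Price": 450},
    {"Type": "h", "Price": 1100},
    {"Type": "u", "Price": 550},
    {"Type": "h", "Price": 1000},
]
for key, stats in summarize_by_category(sales, "Type", "Price").items():
    print(key, stats)
```

The median is used rather than the mean so that a few very expensive sales do not dominate a category with few records.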

Do some more exploration and try to understand your data. For simplicity, use Tableau or Excel.

Let’s do some Machine Learning… (just a fancy way of saying let’s do some linear regression)
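As a minimal sketch of what "some linear regression" means, the block below fits a straight line to one feature by ordinary least squares using the closed-form solution. It is not the presentation's model, which is not shown in this section; the single feature, the function names and the numbers are assumptions for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum(
        (a - mean_x) * (b - mean_y) for a, b in zip(x, y)
    ) / sum((a - mean_x) ** 2 for a in x)
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for each value in x."""
    return [slope * a + intercept for a in x]


# Hypothetical data, for illustration only
rooms = [2, 3, 3, 4, 5]
price = [400_000, 500_000, 600_000, 700_000, 800_000]

slope, intercept = fit_line(rooms, price)
print(f"price = {slope:,.0f} * rooms + {intercept:,.0f}")
print(predict([3, 6], slope, intercept))
```

With several features the same idea applies, but the fit is normally delegated to a library rather than written by hand.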