Statistical methods for real estate data prof. RNDr. Beáta Stehlíková, CSc. 2013.


Beáta Stehlíková, Bratislava

Besides financial, energy, and material resources, information is currently the main factor of progress.

How to obtain new knowledge? We want to answer the question: how can we obtain new information and new knowledge from data?

This talk covers only one method of spatial statistics. Why spatial statistics? Methods of spatial statistics are designed for spatial data. Real estate data very often contain information about geographic location, so they are spatial data.

Variable and data. A variable is a characteristic of a population or sample that is of interest to us. Data are the actual values of variables.

Different kinds of data in real estate. Cross-sectional data are data on one or more variables collected at a single point in time. Time series data are collected over a period of time on one or more variables. Panel data follow the same cross-section over time.

Types of data (scale). As noted, data are the actual values of variables. Types of data: interval data are numerical observations; ordinal data are ordered categorical observations; nominal data are categorical observations.

Types of data (scale). Knowing the type of data (scale) is necessary to select the proper technique for analyzing the data.

Descriptive statistics. Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information is produced.

Graphical techniques (histogram). Numerical descriptive measures: mean (average), median (middle value), mode (most frequent value), variance, standard deviation.

Descriptive statistics are not enough. Consider two data sets A and B. Average (17.8), standard deviation (4.7), coefficient of variation (26.4 %), n = 25. It is necessary to know the probability distribution.

[Figure: data sets A and B]
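The coefficient of variation quoted on this slide is the standard deviation divided by the mean, 4.7 / 17.8 ≈ 26.4 %. The following Python sketch checks that arithmetic and then illustrates the slide's point with two small made-up data sets (they are not the author's A and B): both have the same mean, standard deviation, and coefficient of variation, yet their values are distributed quite differently.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


# Arithmetic check of the figures quoted on the slide.
slide_cv = 4.7 / 17.8
print(f"slide CV = {slide_cv:.1%}")

# Hypothetical data, chosen only so that the summary measures coincide.
data_a = [2, 4, 4, 4, 5, 5, 7, 9]   # values spread around the mean
data_b = [3, 3, 3, 3, 7, 7, 7, 7]   # two separate groups, none near the mean

for name, data in (("A", data_a), ("B", data_b)):
    print(
        name,
        "mean =", statistics.mean(data),
        "sd =", statistics.pstdev(data),
        "CV =", round(coefficient_of_variation(data), 3),
        "median =", statistics.median(data),
    )
```

Only the median (or a histogram) reveals that the two sets differ, which is why the summary measures alone do not identify the distribution.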

Second example. Consider two large data sets A and B.

The location information. It is not possible to identify differences between the data sets unless we take the location information into account.

The location information. Variograms quantify changes in values in space. When there is spatial autocorrelation, small distances correspond to small changes in values. When there is no spatial autocorrelation, small distances can correspond to large changes in values.
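To make the idea concrete, here is a minimal Python sketch of an empirical semivariogram, assuming observations on an evenly spaced one-dimensional transect (a simplification; real estate data are two-dimensional). For each lag h it averages half the squared difference between values h steps apart. The function name and both series are invented for illustration.

```python
def semivariogram_1d(values, max_lag):
    """Empirical semivariance for lags 1..max_lag on an evenly spaced transect.

    gamma(h) = sum of (z_i - z_{i+h})^2 over all pairs h steps apart,
               divided by twice the number of such pairs.
    """
    result = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        result[lag] = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))
    return result


# Hypothetical series: the same eight values in two spatial arrangements.
smooth = [1, 2, 3, 4, 5, 6, 7, 8]      # neighbours have similar values
scattered = [1, 8, 2, 7, 3, 6, 4, 5]   # neighbours have very different values

print("smooth   :", semivariogram_1d(smooth, 3))
print("scattered:", semivariogram_1d(scattered, 3))
```

In the smooth series the semivariance at lag 1 is small and grows with distance; in the scattered series it is already large at lag 1, even though both series contain exactly the same values.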

Spatial autocorrelation. The degree to which near and more distant things are interrelated. Measures of spatial autocorrelation attempt to deal with similarities in the location of spatial objects and their attributes.

Spatial autocorrelation can be positive (objects similar in location are similar in attribute), negative (objects similar in location are very different in attribute), or zero (attributes are independent of location).

Spatial autocorrelation - measures. Several measures are available: Moran’s coefficient I, Geary’s C coefficient, Getis-Ord coefficient G. These measures may be “global” (they apply to the whole study region) or “local” (autocorrelation may exist in some parts of the region but not in others).

Moran’s coefficient I varies approximately between –1.0 and +1.0. A value near 0 indicates no spatial autocorrelation (a random pattern); technically, the expected value under no autocorrelation is –1/(n–1). When autocorrelation is high, the I coefficient is close to 1 or –1. Negative values of I indicate negative autocorrelation. Positive values of I indicate positive autocorrelation (a tendency toward clustering).
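As a rough illustration of how the coefficient is computed, the sketch below implements the usual global Moran's I formula, I = (n / S0) · Σi Σj w_ij (x_i − x̄)(x_j − x̄) / Σi (x_i − x̄)², in plain Python. The chain-shaped neighbourhood (each location neighbours the one before and the one after it) and both example series are assumptions made for illustration, not data from the presentation.

```python
def chain_weights(n):
    """Binary weights for n locations in a row: w_ij = 1 if i and j are adjacent."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


w = chain_weights(8)
clustered = [1, 1, 1, 1, 5, 5, 5, 5]     # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5, 1, 5]   # neighbours always differ

print("clustered   I =", round(morans_i(clustered, w), 3))
print("alternating I =", round(morans_i(alternating, w), 3))
print("expected I under no autocorrelation =", round(-1 / (8 - 1), 3))
```

The clustered series gives a clearly positive I and the alternating series gives I = −1, matching the interpretation of the sign given on the slide.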

Regression analysis is a technique for using data to identify relationships among variables and for using these relationships to make predictions.

Regression analyses that ignore spatial dependency can have unstable parameter estimates and unreliable significance tests. Solution: spatial autoregressive models, namely the spatial lag model and the spatial error model.

Spatial Models

Ordinary Least Squares: no influence from neighbors. Y = β₀ + Xβ + ε
Spatial lag: dependent variable influenced by neighbors. Y = β₀ + λWY + Xβ + ε
Spatial error: residuals influenced by neighbors. Y = β₀ + Xβ + ρWε + ξ

The lag model controls spatial autocorrelation in the dependent variable. The error model controls spatial autocorrelation in the residuals, thus it controls autocorrelation in the dependent and the independent variables.

Software: GeoDa

Compare different spatial models. Neither R² nor adjusted R² can be used to compare different spatial regression models. We can use the Akaike Information Criterion (AIC): the smaller the AIC value, the better the model.
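The comparison can be sketched in a few lines of Python using the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and ln L is the maximized log-likelihood. The log-likelihoods and parameter counts below are invented placeholders, not results from the presentation, and software packages may count parameters differently.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) for three fitted models.
candidates = {
    "OLS": (-120.0, 2),
    "spatial lag": (-112.5, 3),
    "spatial error": (-110.0, 3),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:14s} AIC = {score:.1f}")
print("preferred model:", best)
```

With these placeholder numbers the spatial error model has the smallest AIC and would be preferred; with real output the same rule applies to the values the software reports.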

Example. Classical regression analysis: the dependent variable y is the price of a dwelling; the independent variable x is the living area.
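A minimal sketch of this classical regression in Python, fitting price against living area by ordinary least squares and computing the residuals. The five dwellings are made-up numbers (area in m², price in thousands of an arbitrary currency), not the author's data set.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations.
living_area = [40, 50, 60, 70, 80]
price = [82, 101, 118, 142, 157]

intercept, slope = fit_line(living_area, price)
residuals = [
    yi - (intercept + slope * xi) for xi, yi in zip(living_area, price)
]

print(f"price = {intercept:.2f} + {slope:.2f} * area")
print("residuals:", [round(r, 2) for r in residuals])
```

The residuals are what a spatial diagnostic such as Moran's I is then applied to: if neighbouring dwellings tend to have residuals of the same sign, the classical model has left spatial structure unexplained.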

Residuals. Moran's I = ; significance: p-value < 0.05. This indicates positive spatial autocorrelation among the residuals.

Spatial error model

Local Moran’s coefficients. Which values produce spatial autocorrelation?
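One way to answer that question is to compute a local Moran coefficient for every location. The sketch below uses the common form I_i = (z_i / m2) · Σj w_ij z_j, where z_i is the deviation from the mean, m2 is the mean squared deviation, and the weights are row-standardized. The chain neighbourhood and the example series are assumptions for illustration only.

```python
def row_standardized_chain(n):
    """Weights for n locations in a row; each row sums to 1."""
    raw = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
    return [[w / sum(row) for w in row] for row in raw]


def local_morans_i(values, weights):
    """Local Moran's I_i for each location (row-standardized weights assumed)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # mean squared deviation
    return [
        z[i] * sum(weights[i][j] * z[j] for j in range(n)) / m2
        for i in range(n)
    ]


values = [1, 1, 1, 1, 5, 5, 5, 5]  # hypothetical: a low cluster next to a high one
w = row_standardized_chain(len(values))
local_i = local_morans_i(values, w)

for position, coefficient in enumerate(local_i):
    print(position, round(coefficient, 2))
print("mean of local values:", sum(local_i) / len(local_i))
```

Locations inside either cluster get a positive coefficient, while the two locations on the boundary get zero. With row-standardized weights the average of the local coefficients equals the global Moran's I, so the local values show which locations contribute to the global figure.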

Spatial statistics. Methods of spatial statistics are very useful for data with location information. Art looks for beauty, and science looks for truth. Spatial statistics will help us find the truth when we use the right methods.
