METHODS OF SPATIAL ECONOMIC ANALYSIS, LECTURE 02. Dr. Marie-Noëlle Duquenne, Associate Professor, Tel. 24210-74438, Office Γ.6, UNIVERSITY.


METHODS OF SPATIAL ECONOMIC ANALYSIS, LECTURE 02
Dr. Marie-Noëlle Duquenne, Associate Professor, Office Γ.6
UNIVERSITY OF THESSALY, FACULTY OF ENGINEERING, DEPARTMENT OF PLANNING AND REGIONAL DEVELOPMENT
MASTER «EUROPEAN REGIONAL DEVELOPMENT STUDIES»

METHODS OF SPATIAL ECONOMIC ANALYSIS: STATISTICAL TREATMENT OF SPATIAL DATA, A FIRST APPROACH

OBJECTIVE OF THE LECTURE
Objective of the lecture:
 1. Exploratory statistical analysis of regional data.
 2. Familiarization with Eurostat regional data.
 3. Familiarization with statistical treatment through SPSS.
Main terms used in statistics:
 - Population: the complete set of data elements (e.g. a census or a register).
 - Sample: a portion of elements selected from a reference population.
 - Parameter: a measured characteristic of the whole population.
 - Statistic: a characteristic computed from the sample, used to estimate the corresponding parameter.

STATISTICAL TREATMENT: Types of Data
Categorical data:
 - Nominal (non-ordinal): family status, employment status, etc. The categories have no meaning as a measurement.
 - Ordinal: rating or score variables (Likert scale). Here the order of the categories has meaning.
Numeric data (they have a clear meaning as a measurement):
 - Discrete data
 - Continuous data
Most of the data used in regional analysis are numeric, which allows cartographic visualization.
Data visualization: [Map: Gross Domestic Expenditure on R&D (% of GDP). Source: Eurostat, EU 2020 indicators. See LECTURE_02_DATA.xls]

STATISTICAL TREATMENT: A Specific Case of Ordinal Data, the Likert Items
The Likert scale was originally a psychometric scale measuring the level of agreement or disagreement with a statement. It is now used more generally to rate characteristics according to objective or subjective criteria. The most commonly used scales have five, seven, nine and sometimes eleven levels.
Reference: Likert, R. (1932), "A Technique for the Measurement of Attitudes". Archives of Psychology 140: 1–55.
Representation of the Likert scale: [Figures: typical five-level psychometric scale; five-level scale for the classification of regions]

DATA FOR ANALYSIS: Presentation of Data for Statistical Treatment
Source: Eurostat, EU 2020 indicators. See LECTURE_02_DATA.xls
The file has two sheets:
 1. Analytical Data
 2. Data_SPSS
The second sheet has the format required to open the data with SPSS:
 - The 1st row contains the variable names.
 - The following 28 rows hold the 28 countries (the EU28 aggregate is excluded).
 - Each column holds one variable.

STATISTICAL TREATMENT: Central Parameters [01]
Central parameters for total R&D expenditures (% of GDP)
The two variables examined are RD_TOT04 and RD_TOT12, i.e. total R&D expenditure as % of GDP in 2004 and 2012.
 - Arithmetic Mean: the sum of all elements of the data set divided by the number of elements.
 - Weighted Mean: the sum of the weighted scores (each score multiplied by its weight, the weights summing to 1).
 - Geometric Mean: the nth root of the product of the n data elements.
Statistical Analysis with Excel         RD_TOT04   RD_TOT12
 E.U. 28                                1,82       2,07
 Arithmetic Mean  =AVERAGE(..,..)       1,34       1,67
 Weighted Mean                          1,56       1,83
 Geometric Mean   =GEOMEAN(..,..)       1,08       1,41
Conclusions: ______________________________________
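
As a cross-check outside Excel, the three means can be sketched in Python. This is a minimal illustration, not the lecture's own workflow: the function names are hypothetical, the RD_TOT12 values are copied from the country table in this lecture, and the weighted-mean call uses made-up values and weights because the population figures are not reproduced in the transcript.

```python
import math

# RD_TOT12: total R&D expenditure (% of GDP, 2012) for the 28 countries,
# copied from the country table in this lecture.
rd_tot12 = [
    0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
    1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
    2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55,
]

def arithmetic_mean(values):
    """Sum of all elements divided by the number of elements."""
    return sum(values) / len(values)

def weighted_mean(values, weights):
    """Sum of value * weight, normalised by the total weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def geometric_mean(values):
    """nth root of the product, computed via logs (values must be > 0)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

mean_2012 = arithmetic_mean(rd_tot12)
geo_2012 = geometric_mean(rd_tot12)

# Hypothetical values and weights, for illustration only.
toy_weighted = weighted_mean([1.0, 2.0, 3.0], [1, 1, 2])

print(round(mean_2012, 2), round(geo_2012, 2), toy_weighted)
```

The geometric mean is always at most the arithmetic mean for positive data, which is why it sits below the arithmetic mean in the Excel table.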

STATISTICAL TREATMENT: Central Parameters [02]
 - Mode: the value of the variable that occurs most frequently. The mode is not necessarily a single value.
 - Median: the value of the variable (with the data arranged in order of magnitude) below which 50% of the elements fall, i.e. 50% of the elements have a value lower than the median. Median = Arithmetic Mean when the distribution follows the Laplace-Gauss (Normal) distribution.
Be careful: the MODE command returns a single value even when the mode is not unique. In 2012 the mode is indeed not a single value (0,90, 1,30, 1,72 and 2,98 each occur twice) and MODE reports 2,98. The mode is of very limited interest here.
Examples:
 Country            RD_TOT12
 Cyprus             0,46
 Romania            0,49
 Bulgaria           0,64
 Latvia             0,66
 Greece             0,69
 Croatia            0,75
 Slovakia           0,82
 Malta              0,84
 Lithuania          0,90
 Poland             0,90
 Italy              1,27
 Spain              1,30
 Hungary            1,30
 Luxembourg         1,46
 Portugal           1,50
 Ireland            1,72
 United Kingdom     1,72
 Czech Republic     1,88
 Netherlands        2,16
 Estonia            2,18
 Belgium            2,24
 France             2,29
 Slovenia           2,80
 Austria            2,84
 Denmark            2,98
 Germany            2,98
 Sweden             3,41
 Finland            3,55
Statistical Analysis with Excel         RD_TOT04   RD_TOT12
 Mode    =MODE(..,..)                   0,51       2,98
 Median  =MEDIAN(..,..)                 1,08       1,48
In 2012 the mean is 1,67% of GDP, while the median (1,48%) is noticeably smaller.
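
The median and the full set of modes can also be obtained with Python's standard `statistics` module. The sketch below assumes the 2012 values from the table above and Python 3.8 or later (for `multimode`); the variable names are illustrative.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012), in ascending order as in the country table.
rd_tot12 = [
    0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
    1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
    2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55,
]

# With an even number of observations the median is the average of the
# two middle values (here the 14th and 15th).
median_2012 = statistics.median(rd_tot12)

# multimode returns every value that shares the highest frequency,
# in the order in which the values first appear in the data.
modes_2012 = statistics.multimode(rd_tot12)

print(median_2012, modes_2012)
```

Listing all tied modes makes it visible that a single reported mode is only one of several equally frequent values.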

STATISTICAL TREATMENT: Measures of Dispersion [01]
 - Range: the difference between the highest and the lowest data element.
 - Dispersion Ratio (DR): the quotient of the highest and the lowest data element.
 - Percentile (p%): the value of the variable below which p% of the elements fall. For dispersion analysis, the 5th and 95th percentiles are very useful.
Examples:
Statistical Analysis with Excel         RD_TOT04   RD_TOT12
 Minimum   =MIN(..,..)                  0,37       0,46
 Maximum   =MAX(..,..)                  3,58       3,55
 Range     =MAX - MIN                   3,21       3,09
 DR ratio  =MAX / MIN                   9,68       7,72
Conclusions: ______________________________________
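
A short Python sketch of these measures for RD_TOT12 follows. Percentile values depend on the interpolation rule: the example uses the default ("exclusive") method of `statistics.quantiles`, which is an assumption and may differ slightly from what Excel or SPSS report.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table in this lecture.
rd_tot12 = [
    0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
    1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
    2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55,
]

value_range = max(rd_tot12) - min(rd_tot12)        # MAX - MIN
dispersion_ratio = max(rd_tot12) / min(rd_tot12)   # MAX / MIN

# n=20 splits the data into 20 equal-probability groups and returns the
# 19 cut points: the first is the 5th percentile, the last the 95th.
cut_points = statistics.quantiles(rd_tot12, n=20)
p05, p95 = cut_points[0], cut_points[-1]

print(round(value_range, 2), round(dispersion_ratio, 2), p05, p95)
```

Unlike the range, the 5th to 95th percentile spread is not driven by a single extreme country.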

STATISTICAL TREATMENT: Measures of Dispersion [02]
 - Variance: the average squared distance of each score from the mean.
 - Weighted Variance: the weighted average squared distance of each score from the mean.
 - Standard deviation: σ = square root of the variance.
 - Coefficient of Variation (CV): the standard deviation divided by the arithmetic mean.
Examples:
Statistical Analysis with Excel               RD_TOT04   RD_TOT12
 Arithmetic Mean     =AVERAGE(..,..)          1,34       1,67
 Variance            =VAR(..,..)              0,809      0,880
 Standard Deviation  =STDEV(..,..)            0,900      0,938
 CV coefficient      =STDEV / AVERAGE         0,67       0,56
                                              67%        56%
Conclusions: ______________________________________
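
The same quantities can be recomputed in Python as a sanity check. The sketch assumes the sample (n − 1) definitions, which is what Excel's VAR and STDEV use, and takes the 2012 values from the country table in this lecture.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table in this lecture.
rd_tot12 = [
    0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
    1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
    2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55,
]

mean_2012 = statistics.mean(rd_tot12)
variance_2012 = statistics.variance(rd_tot12)  # sample variance (n - 1)
stdev_2012 = statistics.stdev(rd_tot12)        # sample standard deviation

# Coefficient of variation: standard deviation relative to the mean.
cv_2012 = stdev_2012 / mean_2012

print(round(variance_2012, 3), round(stdev_2012, 3), round(cv_2012, 2))
```

Because the CV is unit-free, it allows the dispersion of 2004 and 2012 to be compared even though the two means differ.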

STATISTICAL TREATMENT: Measures of Dispersion [03]
Weighted Coefficient of Variation (wCV): the weighted standard deviation divided by the mean.
With spatial units, w_i is generally the population weight of spatial unit i in the total area under examination. Considering the 28 EU countries, w_i = Pop_i / Pop, where Pop_i = population of country i and Pop = EU population.
Examples:
Statistical Analysis with Excel               RD_TOT04   RD_TOT12
 Arithmetic Mean     =AVERAGE(..,..)          1,34       1,67
 Variance            =VAR(..,..)              0,809      0,880
 Standard Deviation  =STDEV(..,..)            0,900      0,938
 CV coefficient      =STDEV / AVERAGE         0,67       0,56
                                              67%        56%
 Arithmetic Mean     =AVERAGE(..,..)          1,32       1,57
 Weighted variance   (see columns J & K)      0,662      0,680
 Weighted St. Deviation                       0,813      0,825
 wCV                 =wSTDEV / AVERAGE        0,607      0,494
                                              61%        49%
Conclusions: ______________________________________
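
The formula on this slide did not survive in the transcript, so the following Python sketch rests on stated assumptions: deviations are taken around the population-weighted mean and the result is divided by that same weighted mean, which is one common convention for a population-weighted CV. The Excel table above divides by AVERAGE, so its figures need not match this variant. The function name and the toy data are hypothetical.

```python
import math

def weighted_cv(values, populations):
    """Population-weighted coefficient of variation (assumed convention)."""
    total_pop = sum(populations)
    weights = [p / total_pop for p in populations]   # w_i = Pop_i / Pop
    w_mean = sum(w * x for w, x in zip(weights, values))
    w_var = sum(w * (x - w_mean) ** 2 for w, x in zip(weights, values))
    return math.sqrt(w_var) / w_mean

# Hypothetical example: three spatial units with an indicator value each
# and populations in arbitrary units.
toy_values = [1.0, 2.0, 3.0]
toy_populations = [1, 1, 2]
toy_wcv = weighted_cv(toy_values, toy_populations)

# With equal populations the weights are all 1/n.
equal_wcv = weighted_cv(toy_values, [1, 1, 1])

print(round(toy_wcv, 4), round(equal_wcv, 4))
```

Weighting by population keeps small countries with extreme values from dominating the measured dispersion.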

STATISTICAL TREATMENT: Normal / Gaussian Distribution, Representation [01]
Perfectly symmetric distribution of the random variable around the mean value: Mean = Median = Mode.
Normal distribution: X ~ N(μ, σ²), where P(X < μ) = 0,5 (50%).
Standard Normal distribution: if X ~ N(μ, σ²), then the standardized variable Z = (X − μ) / σ follows Z ~ N(0, 1).
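
Standardization can be illustrated with Python's `statistics.NormalDist`. The sketch borrows the 2012 mean (1,67) and standard deviation (0,938) from the Excel tables purely as example parameters. Treating R&D expenditure as Normal is an assumption made only for this illustration.

```python
from statistics import NormalDist

# Example parameters: 2012 mean and standard deviation from the Excel tables.
mu, sigma = 1.67, 0.938
x_dist = NormalDist(mu, sigma)   # X ~ N(mu, sigma^2)
z_dist = NormalDist(0, 1)        # Z ~ N(0, 1)

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

# Half of the probability mass lies below the mean.
p_below_mean = x_dist.cdf(mu)

# Standardizing a value (Finland's 3,55) does not change its probability:
# P(X < x) equals P(Z < z).
z_finland = standardize(3.55, mu, sigma)
p_x = x_dist.cdf(3.55)
p_z = z_dist.cdf(z_finland)

print(p_below_mean, round(z_finland, 2), round(p_x, 3))
```

A z-score of about 2 means the value lies roughly two standard deviations above the mean.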

STATISTICAL TREATMENT: Normal / Gaussian Distribution, Representation [02]
The shape of the distribution of a Normal variable depends on the specific values of its two parameters: mean and variance.
 - High variance → flattened curve (see blue curve): there is no concentration of values around the mean.
 - Small variance → high concentration around the mean value, low degree of variability (see red curve).

STATISTICAL TREATMENT: Confidence Interval and Confidence Level
 - Confidence interval: an estimated range of values that is likely to include an unknown population parameter, the range being calculated from a given set of sample data. The confidence interval is very informative because its width indicates how uncertain we are about the unknown parameter.
 - Confidence limits: the lower and upper boundaries of a confidence interval, i.e. the values that define its range.
 - Confidence level: the probability value (1 − α) associated with a confidence interval. If α = 5%, the confidence level is (1 − 0,05) = 0,95, i.e. a 95% confidence level.
Examples:
Statistical Analysis with Excel                             RD_TOT04   RD_TOT12
 Arithmetic Mean   =AVERAGE(..,..)                          1,34       1,67
 Margin of Error   =CONFIDENCE(α, STDEV, sample size)       0,333      0,347
 Confidence Interval
   Lower bound                                              1,008      1,322
   Upper bound                                              1,674      2,016
In this example we choose α = 0,05 (5%), i.e. a 95% CI, with sample size = 28.
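
The margin of error can be reproduced in Python. The sketch assumes the Normal-based formula z · s / √n, which is what Excel's CONFIDENCE function computes, and uses the rounded 2012 mean and standard deviation from the table. The bounds can therefore differ from the Excel figures in the last digit.

```python
import math
from statistics import NormalDist

def margin_of_error(alpha, stdev, n):
    """Half-width of a (1 - alpha) confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return z * stdev / math.sqrt(n)

# Rounded 2012 figures from the Excel table (mean, standard deviation, n).
mean_2012, stdev_2012, n = 1.67, 0.938, 28

margin = margin_of_error(0.05, stdev_2012, n)
lower_bound = mean_2012 - margin
upper_bound = mean_2012 + margin

print(round(margin, 3), round(lower_bound, 2), round(upper_bound, 2))
```

With only 28 observations, a Student t quantile would give a slightly wider interval. The Normal quantile is kept here to stay comparable with the Excel result.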

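STATISTICAL TREATMENT: Measures of Trends
Skewness [a3]: a measure of the asymmetry of the probability distribution of a random variable.
 - a3 = 0: symmetric distribution (e.g. the Normal distribution)
 - a3 > 0: right-skewed distribution (longer right tail)
 - a3 < 0: left-skewed distribution (longer left tail)
Kurtosis [a4]: a measure of the "peakedness" of the probability distribution of a random variable, expressed here relative to the Normal distribution.
 - a4 = 0: Normal distribution
 - a4 > 0: peaked distribution
 - a4 < 0: flat distribution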
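
The two coefficients can be sketched from central moments. This illustration assumes the simple moment-based definitions (a3 = m3 / m2^1.5 and a4 = m4 / m2² − 3); Excel's SKEW and KURT apply small-sample corrections, so their values differ somewhat. The function names and toy data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: mean of (x - mean)^k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """a3: third central moment scaled by the variance to the power 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """a4: fourth central moment over squared variance, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]        # evenly spread, no asymmetry
right_skewed = [1, 1, 1, 2, 10]    # one large value stretches the right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

The evenly spread sample has zero skewness and negative excess kurtosis (a flat shape), while the second sample is positively skewed.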

STATISTICAL TREATMENT: Measures of Correlation
Question: to what extent are R&D expenditures in 2012 correlated with R&D expenditures in 2004? We expect a strongly positive coefficient: countries with high initial expenditures tend to keep high expenditures.
 - Pearson coefficient of correlation (r_p): indicates the strength and the direction of a linear relationship between two random variables (X and Y). The correlation coefficient does not indicate a cause-and-effect relationship.
 - Spearman coefficient of correlation (r_s): indicates the strength and the direction of a monotonic (not necessarily linear) relationship between two random variables.
Examples:
Statistical Analysis with Excel
 Pearson Correlation  =CORREL(D2:D29,E2:E29)   0,914
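
The difference between the two coefficients can be seen in a small Python sketch. The 2004 values are not reproduced in this transcript, so the example uses hypothetical toy data. Spearman's coefficient is computed here as Pearson's coefficient on the ranks (with average ranks for ties), and all function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y grows with x, but not linearly (y = x squared).
toy_x = [1, 2, 3, 4, 5]
toy_y = [1, 4, 9, 16, 25]

print(round(pearson(toy_x, toy_y), 3), round(spearman(toy_x, toy_y), 3))
```

For this monotonic but non-linear relationship, Spearman's coefficient reaches 1 while Pearson's stays slightly below 1.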

INTRODUCTION TO METHODS OF SPATIAL ECONOMIC ANALYSIS: USING SPSS

DATA FOR ANALYSIS: From EXCEL to SPSS
Source: Eurostat, EU 2020 indicators
 - The Excel file LECTURE_02_DATA.xls has to be closed.
 - The worksheet with the appropriate format is Data_SPSS.
 - The data to be opened through SPSS are in the range A1:M29.
 - The 1st row has to contain the names of the variables.
 - Variable names cannot contain special characters such as a space, "/", etc.
 - Short variable names are recommended, because each variable can be described in detail in the Label column of the sheet that describes the variables.
Example: Population in 2004 = POP04 (POP 04 is not allowed because of the space).

DATA FOR ANALYSIS: From EXCEL to SPSS
Source: Eurostat, EU 2020 indicators
 - The Excel file Data_LECTURE02.xls has to be closed.
 - The worksheet with the appropriate format is Data_SPSS.
 - The data to be opened through SPSS are in the range A1:M29.
 - The 1st row has to contain the names of the variables.
Command: File > Open > Data
 1. Select the Excel file type.
 2. Select the file LECTURE02_DATA.xls.
 3. Click Open.
 4. In the new window, select the appropriate worksheet (Data_SPSS) and check the range.

DATA FOR ANALYSIS: Data in SPSS
As you can observe, the names of the variables are placed in the initial, unnumbered row. Consequently, row 1 holds the data of the first country (Belgium).
Each SPSS data file has two sheets:
 - Data View, with the data.
 - Variable View, where you can enter information about your data and specify the nature and the meaning of each variable.

DATA FOR ANALYSIS: Statistical Treatment with SPSS
How do we obtain the most important statistical parameters of our variables in order to carry out an exploratory analysis (descriptive statistics)?
Use the following command: Analyze > Descriptive Statistics > Explore

DATA FOR ANALYSIS: Statistical Treatment with SPSS
 1. Select the variables to be explored from the left-hand list. More than one variable can be selected, and all results are produced for each selected variable.
 2. Move the variables to the right pane: Dependent List.
 3. Explore calculates the statistical parameters and also draws the Box-Plot, through which we can detect the presence of outliers.
In some cases we want to examine the statistical parameters of one or more variables for sub-groups of the total population. In that case, move the variable that defines the sub-groups to the pane: Factor List.

DATA FOR ANALYSIS: Results from Explore
The results appear in a new window, Output, which is completely independent of the data sheet. The output can be saved or exported to Word, Excel, etc.
All the results are summarized in the table.

DATA FOR ANALYSIS: Results from Explore
With Explore we also obtain the Box-Plot of each variable. This diagram shows to what extent the variable has a roughly "Normal" distribution, and it also allows us to detect the presence of outliers (values that lie below or above the accepted thresholds). In this case, there is no outlier.
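
The outlier check behind a box plot can be approximated in Python. The sketch assumes the usual 1.5 × IQR rule and the default quartile method of `statistics.quantiles`. SPSS computes its box edges somewhat differently, so the fences here are only indicative. The data are the 2012 values from the country table in this lecture.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table in this lecture.
rd_tot12 = [
    0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
    1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
    2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55,
]

# Quartiles: the box spans q1 to q3, with the median (q2) inside it.
q1, q2, q3 = statistics.quantiles(rd_tot12, n=4)
iqr = q3 - q1

# Assumed thresholds: values beyond 1.5 box lengths from the box edges.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in rd_tot12 if x < lower_fence or x > upper_fence]

print(q1, q2, q3, outliers)
```

The empty list of outliers agrees with the reading of the SPSS box plot for this variable.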
