
METHODS OF SPATIAL ECONOMIC ANALYSIS LECTURE 02 Dr. Marie-Noëlle Duquenne, Associate Professor, Tel. 24210-74438, Office Γ.6 UNIVERSITY.


1 METHODS OF SPATIAL ECONOMIC ANALYSIS LECTURE 02 Dr. Marie-Noëlle Duquenne, Associate Professor, mdyken@prd.uth.gr Tel. 24210-74438, Office Γ.6 UNIVERSITY OF THESSALY FACULTY OF ENGINEERING DEPARTMENT OF PLANNING AND REGIONAL DEVELOPMENT MASTER «EUROPEAN REGIONAL DEVELOPMENT STUDIES»

2 METHODS OF SPATIAL ECONOMIC ANALYSIS: STATISTICAL TREATMENT OF SPATIAL DATA, A FIRST APPROACH

3 OBJECTIVE OF THE LECTURE
Objective of the Lecture
1. Exploratory statistical analysis of regional data.
2. Familiarization with Eurostat regional data.
3. Familiarization with statistical treatment through SPSS.
Main terms used in Statistics
Population: the complete set of data elements (e.g. census, register).
Sample: a portion of elements selected from a reference population.
Parameter: a measured characteristic of the whole population.
Statistic: a characteristic estimated from the sample.

4 STATISTICAL TREATMENT Types of Data
Categorical data:
- Non-ordinal: family status, employment status, etc. (the categories have no meaning as a measurement).
- Ordinal: rating-score variable (Likert scale). In this case the order of the categories has a meaning.
Numeric data (they have a clear meaning as a measurement):
- Discrete data
- Continuous data
Most of the data used in regional analysis are numeric, which allows a cartographic visualization.
Data Visualization
[Map: Gross Domestic Expenditure on R&D (% of GDP)]
Source: Eurostat, EU 2020 indicators. See LECTURE_02_DATA.xls

5 STATISTICAL TREATMENT A specific case of ordinal data: the Likert items
Originally, the Likert scale is a psychometric scale measuring the level of agreement or disagreement. The scale also has a more general use: it allows characteristics to be evaluated according to objective or subjective criteria. The most commonly used scales have five, seven, nine and sometimes eleven levels.
Likert, R. (1932), "A Technique for the Measurement of Attitudes". Archives of Psychology 140: 1–55.
Representation of Likert scale
[Figures: Typical five-level psychometric scale; Five-level scale for the classification of regions]

6 DATA FOR ANALYSIS Presentation of Data for Statistical Treatment
Source: Eurostat, EU 2020 indicators. See LECTURE_02_DATA.xls
The file has two sheets:
1. Analytical Data
2. Data_SPSS
The second sheet has the appropriate format for opening the data with SPSS, i.e.:
- The 1st row contains the names of the variables.
- The following 28 rows concern the 28 countries (the EU28 aggregate is not included).
- Each column concerns one variable.

7 STATISTICAL TREATMENT Central Parameters [01]
Central parameters for total R&D expenditures (% of GDP)
The two variables examined are RD_TOT04 and RD_TOT12, i.e. the total R&D expenditures as % of GDP in 2004 and 2012.
Arithmetic Mean: sum of all elements of the data set divided by the number of elements.
Weighted Mean: sum of the elements, each multiplied by its weight (the weights sum to 1).
Geometric Mean: the nth root of the product of the n data elements.
Statistical Analysis with Excel
                                     2004    2012
E.U. 28                              1,82    2,07
Arithmetic Mean  =AVERAGE(..,..)     1,34    1,67
Weighted Mean                        1,56    1,83
Geometric Mean   =GEOMEAN(..,..)     1,08    1,41
Conclusions: ______________________________________
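As a rough cross-check of the Excel formulas, the three means can be sketched in Python. The list `rd_tot12` holds the 2012 values from the country table of slide 8. The names `weighted_mean`, `example_values` and `example_pop` are illustrative, and the populations are made up, since the lecture's weights are not reproduced in this transcript.

```python
import statistics

# RD_TOT12: total R&D expenditure (% of GDP) in 2012 for the 28 EU countries,
# copied from the country table of slide 8 (sorted in ascending order).
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]

arithmetic_mean = statistics.fmean(rd_tot12)           # counterpart of =AVERAGE(..)
geometric_mean = statistics.geometric_mean(rd_tot12)   # counterpart of =GEOMEAN(..)


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical illustration only: three regions with made-up populations as weights.
example_values = [1.0, 2.0, 3.0]
example_pop = [10.0, 30.0, 60.0]
example_wmean = weighted_mean(example_values, example_pop)

print(round(arithmetic_mean, 2), round(geometric_mean, 2), example_wmean)
```

Dividing by the sum of the weights means that raw populations can be passed directly, without first converting them to shares.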

8 STATISTICAL TREATMENT Central Parameters [02]
Mode: the value of the variable that occurs most frequently. The mode is not necessarily a single value, and it is of very limited interest.
Median: the value of the variable (arranged in order of magnitude) below which 50% of the elements fall, i.e. 50% of the elements have a value lower than the median. Median = Arithmetic Mean when the distribution follows the Laplace-Gauss (Normal) distribution.
Examples
Be careful: the "MODE" command returns only one value when the mode is not a single value (here the highest one). In 2012 the mode is indeed not a single value: 0,90, 1,30, 1,72 and 2,98 each occur twice.
Country           RD_TOT12
Cyprus            0,46
Romania           0,49
Bulgaria          0,64
Latvia            0,66
Greece            0,69
Croatia           0,75
Slovakia          0,82
Malta             0,84
Lithuania         0,90
Poland            0,90
Italy             1,27
Spain             1,30
Hungary           1,30
Luxembourg        1,46
Portugal          1,50
Ireland           1,72
United Kingdom    1,72
Czech Republic    1,88
Netherlands       2,16
Estonia           2,18
Belgium           2,24
France            2,29
Slovenia          2,80
Austria           2,84
Denmark           2,98
Germany           2,98
Sweden            3,41
Finland           3,55
Statistical Analysis with Excel
                           2004    2012
Mode    =MODE(..,..)       0,51    2,98
Median  =MEDIAN(..,..)     1,08    1,48
In 2012, the mean is 1,67% of GDP, while the median (1,48%) is noticeably smaller.
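The mode and median of the 2012 table can be checked with a minimal Python sketch. It uses `statistics.multimode`, which lists every value tied for the highest frequency instead of reporting a single one. The variable names are illustrative.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table above.
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]

median = statistics.median(rd_tot12)     # counterpart of =MEDIAN(..)
modes = statistics.multimode(rd_tot12)   # all values tied for most frequent
mean = statistics.fmean(rd_tot12)

print("median:", round(median, 2))
print("modes:", modes)
print("mean minus median:", round(mean - median, 2))
```

A mean above the median, as here, is what one would expect when a few countries with high expenditure pull the average upwards.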

9 STATISTICAL TREATMENT Measures of dispersion [01]
Range: difference between the highest and the lowest data element.
Dispersion Ratio (DR): quotient of the highest and the lowest data element.
Percentile (p%): the value of the variable below which p% of the elements fall. For dispersion analysis, the 5% and 95% percentiles are very useful.
Examples
Statistical Analysis with Excel
                            2004    2012
Minimum   =MIN(..,..)       0,37    0,46
Maximum   =MAX(..,..)       3,58    3,55
Range     =MAX - MIN        3,21    3,09
DR ratio  =MAX / MIN        9,68    7,72
Conclusions: ______________________________________
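These dispersion measures can be sketched in Python on the 2012 values of slide 8. The percentile interpolation rule is an assumption (the "inclusive" method of the `statistics` module); other software may interpolate slightly differently, so the 5% and 95% values are indicative only.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table of slide 8.
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]

minimum, maximum = min(rd_tot12), max(rd_tot12)
data_range = maximum - minimum          # Range = MAX - MIN
dispersion_ratio = maximum / minimum    # DR = MAX / MIN

# n=20 splits the data into 20 equal-probability groups, which yields
# 19 cut points at 5%, 10%, ..., 95%.
cut_points = statistics.quantiles(rd_tot12, n=20, method="inclusive")
p5, p95 = cut_points[0], cut_points[-1]

print(round(data_range, 2), round(dispersion_ratio, 2))
print(round(p5, 3), round(p95, 3))
```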

10 STATISTICAL TREATMENT Measures of dispersion [02]
Variance: the average squared distance of each score from the mean.
Weighted Variance: the weighted average squared distance of each score from the mean.
Standard deviation: σ = square root of the variance.
Coefficient of Variation (CV): CV = σ / mean, i.e. the standard deviation expressed as a share of the mean.
Examples
Statistical Analysis with Excel
                                        2004     2012
Arithmetic Mean     =AVERAGE(..,..)     1,34     1,67
Variance            =VAR(..,..)         0,809    0,880
Standard Deviation  =STDEV(..,..)       0,900    0,938
CV coefficient      =STDEV / AVERAGE    0,67     0,56
                                        67%      56%
Conclusions: ______________________________________
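The chain from variance to coefficient of variation can be sketched in Python on the 2012 values of slide 8. The sketch assumes the sample formulas (division by n - 1), which is what Excel's VAR and STDEV compute. The variable names are illustrative.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table of slide 8.
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]

mean = statistics.fmean(rd_tot12)
variance = statistics.variance(rd_tot12)   # sample variance, divides by n - 1
stdev = statistics.stdev(rd_tot12)         # square root of the sample variance
cv = stdev / mean                          # CV = STDEV / AVERAGE

print(round(variance, 3), round(stdev, 3), round(cv, 3))
```

Because the CV is unit-free, it can be used to compare the dispersion of variables measured on different scales.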

11 STATISTICAL TREATMENT Measures of dispersion [03]
Weighted Coefficient of Variation (wCV): with spatial units, w_i is generally the population weight of spatial unit i in the total area under examination. Considering the 28 EU countries, w_i = Pop_i / Pop, where Pop_i = population of country i and Pop = EU population.
Examples
Statistical Analysis with Excel
Unweighted:
                                        2004     2012
Arithmetic Mean     =AVERAGE(..,..)     1,34     1,67
Variance            =VAR(..,..)         0,809    0,880
Standard Deviation  =STDEV(..,..)       0,900    0,938
CV coefficient      =STDEV / AVERAGE    0,67     0,56
                                        67%      56%
Weighted:
                                                            2004     2012
Arithmetic Mean         =AVERAGE(..,..)                     1,32     1,57
Weighted Variance       see calculation in columns J & K    0,662    0,680
Weighted St. Deviation                                      0,813    0,825
wCV                     =wSTDEV / AVERAGE                   0,607    0,494
                                                            61%      49%
Conclusions: ______________________________________
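A minimal Python sketch of a population-weighted CV follows. It assumes that the squared distances are taken from the plain arithmetic mean and that the same mean is the denominator, following the "wSTDEV / AVERAGE" row of the table; some formulations use the weighted mean instead. The function name `weighted_cv` and the three-unit data set are hypothetical, since the country populations are not reproduced in this transcript.

```python
import math
import statistics


def weighted_cv(values, populations):
    """Population-weighted coefficient of variation.

    Assumption: distances are measured from the unweighted arithmetic mean,
    and the result is wSTDEV / AVERAGE.
    """
    total_pop = sum(populations)
    weights = [p / total_pop for p in populations]   # w_i = Pop_i / Pop
    mean = statistics.fmean(values)
    weighted_variance = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))
    return math.sqrt(weighted_variance) / mean


# Hypothetical data: three spatial units with made-up values and populations.
example_values = [1.0, 2.0, 3.0]
example_pop = [10.0, 30.0, 60.0]
example_wcv = weighted_cv(example_values, example_pop)

print(round(example_wcv, 3))
```

With equal populations the weights are all 1/n, so the weighted standard deviation reduces to the ordinary population standard deviation.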

12 STATISTICAL TREATMENT Normal / Gaussian Distribution
Perfectly symmetric distribution of the random variable around the mean value: Mean = Median = Mode, and P(X < μ) = 0,5 (50%).
Standard Normal Distribution: if X ~ N(μ, σ²), then the standardized variable Z = (X − μ) / σ follows Z ~ N(0, 1).
Representation [01]
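Standardization can be illustrated with a short Python sketch based on `statistics.NormalDist`. The values of `mu` and `sigma` are the 2012 mean and standard deviation reported in this lecture, used purely as example parameters; treating the data as Normal is an assumption, and `x` is a hypothetical value.

```python
from statistics import NormalDist

# Example parameters only: 2012 mean and standard deviation from the lecture.
mu, sigma = 1.67, 0.938
x_dist = NormalDist(mu, sigma)   # X ~ N(mu, sigma^2)
z_dist = NormalDist(0.0, 1.0)    # Z ~ N(0, 1)


def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma


x = 2.5                           # hypothetical value of the variable
z = standardize(x, mu, sigma)

p_below_mean = x_dist.cdf(mu)     # P(X < mu)
p_x = x_dist.cdf(x)               # P(X < x) computed on the original scale
p_z = z_dist.cdf(z)               # the same probability on the standard scale

print(p_below_mean, round(z, 3), round(p_x, 4), round(p_z, 4))
```

The last two probabilities coincide, which is the practical point of standardizing: any Normal variable can be read from the single N(0, 1) table.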

13 STATISTICAL TREATMENT Normal / Gaussian Distribution
The shape of the distribution of a Normal variable depends on the specific values of its two parameters: mean and variance.
High value of variance → flattened curve (see blue curve): there is no concentration of values around the mean.
Small value of variance → high concentration around the mean value, low degree of variability (see red curve).
Representation [02]

14 STATISTICAL TREATMENT Confidence Interval
Confidence interval: an estimated range of values which is likely to include an unknown population parameter, the range being calculated from a given set of sample data. The confidence interval is very informative because its width gives us some idea of how uncertain we are about the unknown parameter.
Confidence limits: the lower and upper boundaries of a confidence interval, that is, the values which define its range.
Confidence Level
Confidence level: the probability value (1 − α) associated with a confidence interval. If α = 5%, the confidence level is (1 − 0,05) = 0,95, i.e. a 95% confidence level.
Statistical Analysis with Excel
                                                            2004     2012
Arithmetic Mean      =AVERAGE(..,..)                        1,34     1,67
Margin of Error      =CONFIDENCE(α; STDEV; sample size)     0,333    0,347
Confidence Interval  Lower bound                            1,008    1,322
                     Upper bound                            1,674    2,016
In this example, we choose α = 0,05 (5%), i.e. a 95% CI, with sample size = 28.
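The margin of error can be reproduced with a short Python sketch on the 2012 values of slide 8. It assumes a Normal-based interval (critical value taken from N(0, 1)), which is consistent with the figures in the table. The variable names are illustrative.

```python
import math
import statistics
from statistics import NormalDist

# RD_TOT12 values (% of GDP, 2012) from the country table of slide 8.
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]

alpha = 0.05                                        # 95% confidence level
n = len(rd_tot12)
mean = statistics.fmean(rd_tot12)
stdev = statistics.stdev(rd_tot12)

z_critical = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
margin = z_critical * stdev / math.sqrt(n)          # margin of error
lower, upper = mean - margin, mean + margin

print(round(margin, 3), round(lower, 3), round(upper, 3))
```

With only 28 observations, an interval based on Student's t distribution would be slightly wider than this Normal-based one.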

15 STATISTICAL TREATMENT Measures of Trends
Skewness [a3]: a measure of the asymmetry of the probability distribution of a random variable.
a3 = 0: symmetric distribution, as in the case of the Normal distribution
Kurtosis [a4]: a measure of the "peakedness" of the probability distribution of a random variable, expressed here relative to the Normal distribution.
a4 = 0: Normal distribution
a4 > 0: peaked distribution
a4 < 0: flat distribution
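Both shape measures can be sketched in Python from the central moments. This sketch assumes the simple moment-based definitions, with 3 subtracted from the kurtosis so that a Normal distribution gives 0; statistical packages may apply small-sample corrections, so their output can differ slightly. The function names and the two small data sets are hypothetical.

```python
import statistics


def central_moment(values, order):
    """Average of (x - mean) ** order over all elements."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """a3 = m3 / m2^(3/2); 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """a4 = m4 / m2^2 - 3; 0 for the Normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical symmetric data
right_tailed = [1.0, 1.0, 1.0, 1.0, 10.0]    # hypothetical data with one large value

print(skewness(symmetric), round(excess_kurtosis(symmetric), 2))
print(round(skewness(right_tailed), 2))
```

A positive a3 indicates a longer tail on the right of the mean, a negative a3 a longer tail on the left.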

16 STATISTICAL TREATMENT Measures of Correlation
Pearson coefficient of correlation r_p: it indicates the strength and the direction of a linear relationship between two random variables (X and Y). The correlation coefficient does not indicate a cause-and-effect relationship.
Spearman coefficient of correlation r_s: it indicates the strength and the direction of a monotonic relationship (not necessarily linear) between two random variables.
Examples
Question: to what extent are the R&D expenditures in 2012 correlated with the R&D expenditures in 2004? We expect a strongly positive coefficient: countries with high initial expenditures will tend to keep high expenditures.
Statistical Analysis with Excel
Pearson Correlation  =CORREL(D2:D29,E2:E29)  0,914
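The difference between the two coefficients can be sketched in Python. The Spearman coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their average rank. The functions `pearson`, `ranks` and `spearman` and the small data set are hypothetical, since the 2004 country values are not reproduced in this transcript.

```python
import math
import statistics


def pearson(x, y):
    """Pearson r: covariance divided by the product of the standard deviations."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman r_s: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y increases with x, but not along a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
print(round(pearson(x, y), 3), spearman(x, y))
```

In this example the Spearman coefficient reaches 1 because the ordering of the two variables is identical, while the Pearson coefficient stays below 1 because the relationship is not linear.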

17 INTRODUCTION TO METHODS OF SPATIAL ECONOMIC ANALYSIS: USING SPSS

18 DATA FOR ANALYSIS From EXCEL to SPSS
Source: Eurostat, EU 2020 indicators
The Excel file LECTURE_02_DATA.xls has to be closed.
The worksheet with the appropriate format is: Data_SPSS
The data that we are going to open through SPSS are in the range: A1:M29
The 1st row has to contain the names of the variables.
The names of the variables cannot contain special characters such as space, %, @, $, *, /, etc.
It is advisable to use short names for the variables, because each variable can be described in detail in the Label column of the sheet describing the variables.
Example: Population in 2004 = POP04 (POP 04 is not allowed because of the space).
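Before importing, the header row can be screened with a small Python sketch. The pattern below is a deliberately conservative assumption (a letter followed by letters, digits or underscores); it is not the exact naming rule of SPSS, and the function name `is_safe_name` and the sample header are hypothetical.

```python
import re

# Conservative assumption: start with a letter, then letters, digits or underscores.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_safe_name(name):
    """True if the name contains none of the characters the slide warns about."""
    return SAFE_NAME.fullmatch(name) is not None


# Hypothetical header row: two valid names and two problematic ones.
header = ["POP04", "RD_TOT12", "POP 04", "GDP%"]
rejected = [name for name in header if not is_safe_name(name)]
print(rejected)
```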

19 DATA FOR ANALYSIS From EXCEL to SPSS
Source: Eurostat, EU 2020 indicators
The Excel file LECTURE_02_DATA.xls has to be closed.
The worksheet with the appropriate format is: Data_SPSS
The data that we are going to open through SPSS are in the range: A1:M29
The 1st row has to contain the names of the variables.
Command: File > Open > Data
1. Select the Excel file type.
2. Select the file LECTURE_02_DATA.xls.
3. Click Open.
4. In the new window, select the appropriate worksheet (Data_SPSS) and check the range.

20 DATA FOR ANALYSIS Data in SPSS
As you can observe, the names of the variables are in the initial, unnumbered row. Consequently, the 1st numbered row gives the data of the first country (Belgium).
Each SPSS data file has two sheets:
- Data View, with the data.
- Variable View, where you can enter information about your data and specify the nature and the meaning of each variable.

21 DATA FOR ANALYSIS Statistical Treatment with SPSS
How can we obtain the most important statistical parameters of our variables in order to proceed to an exploratory analysis (descriptive statistics)?
Use the following command: Analyze > Descriptive Statistics > Explore

22 DATA FOR ANALYSIS Statistical Treatment with SPSS
1. Select the variables to be explored from the left-hand list. It is possible to select more than one variable and to produce all the results for the selected variables.
2. Move the variables to the right pane: Dependent List.
3. With Explore, the statistical parameters are calculated, as well as the Box-Plot, through which we can detect the presence of outliers.
In some cases, we want to examine the statistical parameters of one or more variables for sub-groups of the total population. In this case, move the variable defining the sub-groups to the pane: Factor List.

23 DATA FOR ANALYSIS Results from Explore
The results appear in a new sheet, Output, which is completely independent from the data sheet. This sheet can be saved or converted to Word, Excel, etc. All the results are summarized in the table.

24 DATA FOR ANALYSIS Results from Explore
With Explore, we also obtain the Box-Plot of each variable. This diagram allows us to check to what extent the variable has an approximately "Normal" distribution, and to detect the presence of outliers (values that lie below or above the accepted thresholds). In this case, there is no outlier.
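The outlier check behind a box-plot can be sketched in Python. The thresholds used here are an assumption: the common rule of 1,5 times the interquartile range beyond the first and third quartiles, with quartiles from the "inclusive" method of the `statistics` module. SPSS may compute its quartiles slightly differently. The function name `find_outliers` is hypothetical, and the data are the 2012 values of slide 8.

```python
import statistics

# RD_TOT12 values (% of GDP, 2012) from the country table of slide 8.
rd_tot12 = [0.46, 0.49, 0.64, 0.66, 0.69, 0.75, 0.82, 0.84, 0.90, 0.90,
            1.27, 1.30, 1.30, 1.46, 1.50, 1.72, 1.72, 1.88, 2.16, 2.18,
            2.24, 2.29, 2.80, 2.84, 2.98, 2.98, 3.41, 3.55]


def find_outliers(values, factor=1.5):
    """Values lying more than factor * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [x for x in values if x < low or x > high]


outliers = find_outliers(rd_tot12)
print(outliers)

# Hypothetical check: adding an artificial extreme value of 9.0 gets it flagged.
print(find_outliers(rd_tot12 + [9.0]))
```

Under this rule the 2012 values contain no outlier.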


