Socio-economic status of the counties in the US By Jean Eric Rakotoarisoa GIS project / spring 2002.



Background information
GIS is a system for the storage, retrieval, analysis, and display of spatial data.
GIS is often used to support socio-economic studies.
In these studies, attributes come from geographic areas, which are the units and levels of the study (e.g. county, state, or country).

Objective
Identifying the most prosperous counties in the US
–Successful
–Flourishing

Understanding the question
Defining the key word “prosperous” by:
–Per capita status: income, education
–Social status of the county: crime, unemployment, health care facilities

Methods
Data
–Source: ArcUSA 1:2M, published by ESRI in 1997
–Characteristics: 1:2,000,000-scale data; Albers conic equal-area projection; lat/long

Criteria for choosing variables
1. Standardized variables (to avoid the effect of area or population size; e.g. income per capita)
2. Variables that show enough variation (descriptive statistics)
3. Variables that can be seen as surrogates of other related variables (i.e. cause-and-effect relationships and simple correlation; for extrapolation of the results)
4. Ideally, data from the same year (some variables may be time sensitive)
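
The first criterion can be illustrated with a minimal sketch. The county figures below are hypothetical and are not taken from the ArcUSA data; the function names are chosen here for illustration only.

```python
def per_capita(total, population):
    """Return a total (e.g. aggregate income) divided by the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


def rate_per(count, population, base=100_000):
    """Return a count (e.g. serious crimes) expressed per `base` residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * base / population


# Two hypothetical counties of very different size.
big = {"income_total": 2_000_000_000, "crimes": 5_000, "population": 100_000}
small = {"income_total": 250_000_000, "crimes": 400, "population": 10_000}

for name, county in (("big", big), ("small", small)):
    income = per_capita(county["income_total"], county["population"])
    crime = rate_per(county["crimes"], county["population"])
    print(name, income, crime)
```

The raw totals favor the larger county on income and penalize it on crime simply because it has more residents; the standardized values remove that size effect so counties can be compared directly.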

Variables
–Income: money income per capita in 1985
–Education: percentage of people > 25 years old with 12 or more years of education in 1980
–Unemployment: unemployment rate of the civilian labor force in 1986
–Crime: serious crimes known to police per 100,000 population in 1985
–Health care facilities: number of hospital beds per 1,000 population in 1985

Understanding each variable
–Distribution (normal, skewed): information necessary for reclassification (i.e. equal interval, quantile, SD)
–Degree of variation: mean, min, max, variance, SD
The scale to be used was chosen as a function of both the degree of variation of the variables and the desired resolution of the output theme (coarse or high resolution)
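
A minimal sketch of the descriptive statistics and of two of the reclassification schemes named on this slide, using only the Python standard library. The sample values are hypothetical, population (not sample) variance is an assumption, and SD-based breaks are not shown.

```python
import statistics


def describe(values):
    """Summarize one variable: mean, min, max, population variance and SD."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "variance": statistics.pvariance(values),
        "sd": statistics.pstdev(values),
    }


def equal_interval_breaks(values, n_classes=5):
    """Return the n_classes - 1 cut points that split [min, max] evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]


def quantile_breaks(values, n_classes=5):
    """Return the n_classes - 1 cut points that put roughly equal counts in each class."""
    return statistics.quantiles(values, n=n_classes)


# Hypothetical, right-skewed values for one variable.
sample = [2, 4, 4, 4, 5, 5, 7, 9, 12, 30]
print(describe(sample))
print(equal_interval_breaks(sample))
print(quantile_breaks(sample))
```

On skewed data the two schemes diverge: equal-interval breaks leave most observations in the lowest classes, while quantile breaks spread them evenly, which is why the distribution needs to be inspected before choosing one.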

Relationship among variables
Variables are classified as “primary” (cause) or “secondary” (effect)
This classification is important when assigning weights (overlay)

[Diagram: Income and Education (primary); Unemployment (intermediate); Crime and Health care facilities (secondary)]

GIS operations
–Extract the data
–Convert to grid themes
–Construct the model (reclassify and weighted overlay)

Characteristics of the model
Index model designed for unequal contribution of each variable
Scale of 1 to 5, with 1 being the worst and 5 the best
Assigning the scale for each variable:
–Income: highest is given 5
–Education: highest is given 5
–Unemployment: highest is given 1
–Crime: highest is given 1
–Health care facilities: highest is given 5
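
The scale assignment can be sketched as follows. Equal-interval classes are an assumption made for illustration (the slides do not say which scheme was finally used), and `reclassify` is a hypothetical helper, not a GIS function.

```python
def reclassify(value, lo, hi, higher_is_better=True, n_classes=5):
    """Map a value in [lo, hi] to a class 1..n_classes using equal intervals.

    For variables where a high value is bad (crime, unemployment), the
    scale is inverted so that the highest raw value receives class 1.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    width = (hi - lo) / n_classes
    cls = int((value - lo) / width) + 1
    cls = max(1, min(cls, n_classes))  # keep the endpoints inside 1..n_classes
    return cls if higher_is_better else n_classes + 1 - cls


# Hypothetical ranges: income in dollars, unemployment in percent.
print(reclassify(18_000, lo=5_000, hi=20_000))                      # income
print(reclassify(12.0, lo=2.0, hi=14.0, higher_is_better=False))    # unemployment
```

Inverting the scale for crime and unemployment keeps the meaning of the output consistent: on every reclassified theme, 5 is the best class and 1 the worst.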

Weight
–Income: 30% (primary variable)
–Education: 30% (primary variable)
–*Crime: 20% (secondary)
–*Unemployment: 15% (intermediate)
–Health care facilities: 5% (secondary, not a very good variable)
* Strong relationship, which implies additive effects of weight
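
A minimal sketch of how these weights combine the reclassified scores for a single county or grid cell. The weights are those listed on this slide; the example scores and the function name are hypothetical.

```python
# Weights from the slide, expressed as fractions that sum to 1.
WEIGHTS = {
    "income": 0.30,
    "education": 0.30,
    "crime": 0.20,
    "unemployment": 0.15,
    "health": 0.05,
}


def weighted_overlay(scores, weights=WEIGHTS):
    """Return the weighted sum of reclassified scores (each on the 1-5 scale)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical reclassified scores for one county.
county = {"income": 4, "education": 3, "crime": 4, "unemployment": 3, "health": 2}
print(weighted_overlay(county))
```

Because income and education together carry 60% of the weight, the result is driven mostly by those two variables; the health care score barely moves it.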

Expected output: counties that have a higher income per capita and a higher percentage of people who have received at least 12 years of education, that are safe, with a lower rate of unemployment, and that have more health care facilities.

Flowchart of the model
[Flowchart: Income, Education, Crime, Unemployment, and Health care facilities are each reclassified (Rec income, Rec education, Rec crime, Rec unemployment, Rec health facilities), then combined by weighted overlay (labels shown: 30%, 20%, 15%, 5%) into the final map]

Results
[Map of the county prosperity in the US. Legend: level of prosperity, Restricted, No Data. Scale bar: 0 to 500 miles]

What is revealed by the map?
–Many counties meet our criteria
–The distribution of these counties follows a regional pattern
–The most prosperous regions are New England, the Upper Midwest, the Great Plains, and the western states (Arizona, Nevada, Colorado)
–There is not a huge difference in prosperity between counties across the US, based on our criteria (there are very few extreme values of 1 and no 5; most counties fall into scale 3 or 4)

Verifying the model
Study question: randomly chosen counties should belong to the level of prosperity assigned by the model
GIS aspect: the state-based study and the county-based study should show the same pattern

Verifying the model (cont.)
[Maps: county prosperity in the US and state prosperity in the US. Scale: 1:2M]

Discussions
Resolution
–County data were appropriate, since these variables (e.g. income, rate of unemployment) are probably more uniform within counties than within states, as shown by the map
Source of error
–GIS: defining the extent of the output (decreases accuracy); labels (misleading)
–Study question: data do not come from the same year

Discussions (cont.)
Limitations of the results
–Despite the number of variables used, the output mainly refers to counties that have a higher income and a higher proportion of people who graduated from high school (weighted overlay)
–High income does not necessarily imply a better standard of living (e.g. need to look into cost of living)

Discussions (cont.)
Did I have to use GIS?
–No!
–Simple equation: $Y = aX_1 + bX_2 + cX_3 + dX_4 + eX_5$
–$Y$ = counties (the prosperity value obtained for each county)
–$X_i$ = variables (attributes)
–$a, b, c, d, e$ = weights
–GIS was mostly used for visual purposes (e.g. distribution of the counties)
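
As a hedged illustration of the equation on this slide, the weights from the Weight slide can be substituted for the coefficients. The assignment of $X_1, \dots, X_5$ to income, education, crime, unemployment, and health care facilities is an assumption made here; the slides do not state the order.

```latex
\[
Y = 0.30\,X_1 + 0.30\,X_2 + 0.20\,X_3 + 0.15\,X_4 + 0.05\,X_5,
\qquad X_i \in \{1, 2, 3, 4, 5\}.
\]
Because the weights sum to 1,
\[
1 \le Y \le 5,
\]
with $Y = 5$ only when every $X_i = 5$ and $Y = 1$ only when every $X_i = 1$.
```

Under these assumptions, an extreme value requires a county to fall in the extreme class on all five variables at once, which may help explain why the map shows very few 1s and no 5s.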

Discussions (cont.)
What can be improved?
–Adding more variables to better characterize the feature of interest (e.g. number of doctors, nursing centers, and hospitals)
–Investigating relationships among variables (using inferential statistics)
–Adding other parameters (e.g. cost of living)

Difficulties / Advice
- Theoretical background: choosing variables, understanding their behavior
- GIS operations: understanding the effects of different choices whenever options are presented (e.g. equal interval, quantile, or SD used for reclassification)
- At every step of the analysis, try to understand the assumptions behind each option (e.g. defining scales) and always relate them to the objective, i.e. how each option (such as the choice of a particular scale) will affect the objective

Conclusions
–Study of interest: based on our criteria, the most prosperous counties in the US are in New England, the Upper Midwest, the Great Plains, and the western states
–GIS is only a tool. A good understanding of the study phenomenon is crucial before any GIS operations can be undertaken
–A good understanding of the different options offered by the GIS operations is important
–Poor knowledge of the study phenomenon or misuse of GIS only results in artifacts