I. Introduction to Data and Statistics A. Basic terms and concepts Data set - variable - observation - data value.


1 I. Introduction to Data and Statistics A. Basic terms and concepts Data set - variable - observation - data value

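2 [Table: example data set for the Central Gulf States. Observations: TX, LA, AL, MS. Variable labels as extracted: > 65, $, < 19, Rent, $, age. Data values as extracted (row and column assignment not recoverable): 5625786535 8912657825 7889581434 2598341953]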

3 B. Primary and Secondary data 1. Primary data - original data - collected for a specific purpose - sample design and procedures - time and $

4 2. Secondary data - archival data - agency or organization - organized in a set format - time and $ - data quality an issue - sample design

5 C. Individual and spatially aggregated data [Figure: State 1, State 2, State 3, and State 4 shown as individual units, then the same four states aggregated into a single Region]

6 D. Discrete and Continuous data 1. Discrete

7 2. Continuous

8 E. Qualitative and Quantitative data 1. Qualitative (categorical) Ex: land cover, sex, political party, race 2. Quantitative Ex: population, precipitation, grades

9 II. Scales of Measurement A. Nominal B. Ordinal C. Interval D. Ratio for comparison must use the same scale of measurement

10 A. Nominal - Mutually exclusive - Exhaustive Ex: Name: George = 1, Wanda = 2, Bob = 3; Land cover: forested = 45, urban = 39, etc.; Climate regimes: polar = 1, temperate = 2, tropical = 3; Sex: male = 1, female = 2

11 B. Ordinal - ranked data - arbitrary - comparisons - not a set interval between rankings Ex: Places rated (cities, beaches…) Level of satisfaction (poor, ok, good)

12 C. Interval - separated by absolute differences - does not have an absolute zero Ex: - temperature - elevation

13 D. Ratio - separated by absolute differences - absolute zero Ex: - precipitation - tree growth - income

14 III. Graphing procedures (univariate) A. frequency histogram B. cumulative histogram

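15 A. Frequency histogram [Figure: histogram with frequency (#, %) on the vertical axis and the variable (e.g., income, grades) on the horizontal axis, running from 0 (-) through 50 to 100 (+); a frequency polygon is overlaid]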

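16 B. Cumulative frequency histogram [Figure: histogram with cumulative frequency (#, %) on the vertical axis and the variable on the horizontal axis, running from 0 (-) through 50 to 100 (+); a cumulative frequency polygon is overlaid]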
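The two histograms above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the original slides: the `grades` values, the bin width of 10, and the helper name `frequency_table` are all assumptions made for the example.

```python
from itertools import accumulate


def frequency_table(values, bin_width, lower=0):
    """Count values per bin; bin k covers [lower + k*w, lower + (k+1)*w)."""
    counts = {}
    for v in values:
        k = int((v - lower) // bin_width)
        counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]


# Hypothetical grades on a 0-100 scale (illustrative, not from the slides).
grades = [52, 67, 71, 74, 78, 81, 83, 88, 90, 95]

freq = frequency_table(grades, bin_width=10)          # frequency histogram (#)
cum_freq = list(accumulate(freq))                     # cumulative frequency (#)
cum_pct = [100 * c / len(grades) for c in cum_freq]   # cumulative frequency (%)

for k, (f, c, p) in enumerate(zip(freq, cum_freq, cum_pct)):
    print(f"{k * 10:3d}-{k * 10 + 9:<3d} freq={f:2d} cum={c:2d} cum%={p:5.1f}")
```

The cumulative column always ends at the number of observations (100%), which is why the cumulative polygon only rises from left to right.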

17 IV. Descriptive Statistics (univariate) - summary of data characteristics - inferential; extend sample to a larger population A. Measures of Central Tendency B. Measures of Dispersion C. Measures of Shape

18 A. Measures of Central Tendency attempt to define the most typical value of a larger data set 1. Mode 2. Median 3. Mean (average)

19 1. Mode - value that occurs most frequently - only measure of central tendency appropriate for nominal level data - works better for grouped data, not raw values - many data sets will not have two identical data values

20 2. Median - the middle value from a set of ranked observations - equal number of observations on either side - appropriate when data is heavily skewed - ordinal, interval, or ratio level data, not nominal

21 3. Mean (average), Σx_i / n - most commonly used measure of central tendency - interval or ratio level data - sensitive to outliers - most easily understood - assumptions: unimodal, symmetric distribution

22 [Figure: normal distribution on an axis from 0 (-) to 100 (+), with mode, median, and mean marked at the center (50)]

23 [Figure: distribution on an axis from 0 (-) through 50 to 100 (+), with mode, median, and mean marked]
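The three measures of central tendency can be compared on a small example. The sketch below uses Python's standard `statistics` module; the `incomes` and `land_cover` values are hypothetical and were chosen so that one outlier pulls the mean away from the median.

```python
import statistics

# Hypothetical incomes in $1000s; 250 is a deliberate outlier.
incomes = [28, 31, 31, 35, 40, 42, 250]

mode = statistics.mode(incomes)      # value that occurs most frequently
median = statistics.median(incomes)  # middle value of the ranked observations
mean = sum(incomes) / len(incomes)   # sum of x_i divided by n

print(f"mode={mode} median={median} mean={mean:.2f}")

# The mode is the only one of the three that is defined for nominal data.
land_cover = ["forest", "urban", "forest", "water", "forest"]
land_cover_mode = statistics.mode(land_cover)
print(f"most common land cover: {land_cover_mode}")
```

With the outlier present the mean (about 65.3) sits well above the median (35), which illustrates why the median is preferred for heavily skewed data.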

24 B. Measures of Dispersion provide information about distribution of data 1. Range 2. Standard deviation 3. Coefficient of variation

25 1. Range - difference between largest and smallest value - simplest measure of dispersion - easy to calculate - can be misleading - ignores all other values - does not take into account clustering of data

26 2. Standard deviation - the average deviation of each value from the mean - based on the mean - better indicator of the dispersion of the entire sample (in comparison to the range) - scale dependent value

27 3. Coefficient of variation - standard deviation / mean - allows you to compare dispersion independent of scale - should be used to make comparisons where there are differences in mean

28 [Figure: distribution on an axis from 0 (-) to 100 (+), with 15, 50, and 85 marked] Range: 85 - 15 = 70. Std. dev.: based on the deviations x_i - X̄, where X̄ = 50. C.V. = Std. dev. / mean
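The three dispersion measures can be sketched as follows. The `sample` values are hypothetical, chosen only to match the figure's minimum (15), maximum (85), and mean (50). The slides do not say whether the standard deviation divides by n or by n - 1; this sketch assumes the population form (divide by n) and computes it as the square root of the mean squared deviation.

```python
import math


def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)


def std_dev(values):
    """Population standard deviation (divides by n; an assumption here)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def coeff_variation(values):
    """Standard deviation divided by the mean (unitless)."""
    return std_dev(values) / (sum(values) / len(values))


# Hypothetical sample: min 15, max 85, mean 50, as in the figure.
sample = [15, 35, 50, 65, 85]
scaled = [x * 10 for x in sample]  # same data expressed in different units

print(value_range(sample))                                   # range
print(round(std_dev(sample), 2), round(std_dev(scaled), 2))  # scale dependent
print(round(coeff_variation(sample), 4),
      round(coeff_variation(scaled), 4))                     # scale independent
```

Rescaling the data multiplies the standard deviation by the same factor but leaves the coefficient of variation unchanged, which is the sense in which it allows comparisons independent of scale.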

30 C. Measures of Shape 1. Skewness 2. Kurtosis

31 Leptokurtic Mesokurtic Platykurtic

32 (-) skew (+) skew Symmetrical (bell shaped)
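Skewness and kurtosis can both be expressed through central moments. The slides do not give formulas, so the sketch below assumes the common moment-based definitions: skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 - 3, where m_k is the k-th central moment. Both data sets are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: mean of (x - mean)^k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness: 0 if symmetric, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]         # hypothetical, evenly spread
right_skewed = [1, 1, 2, 2, 3, 10]  # hypothetical, one large value

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

Under this convention, excess kurtosis near 0 corresponds to mesokurtic, positive values to leptokurtic (peaked), and negative values to platykurtic (flat) distributions.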

33 Mean Center

34 [Figure: points plotted on axes x = 0 to 6, y = 0 to 4] A (2.8, 1.5), B (1.6, 3.8), C (3.5, 3.3), D (4.4, 2.0), E (4.3, 1.1), F (5.2, 2.4), G (4.9, 3.5)

35 [Figure: points plotted on axes x = 0 to 6, y = 0 to 4, with the mean center marked] A (2.8, 1.5), B (1.6, 3.8), C (3.5, 3.3), D (4.4, 2.0), E (4.3, 1.1), F (5.2, 2.4), G (4.9, 3.5). Mean Center (3.81, 2.51)
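The mean center is the mean of the x coordinates paired with the mean of the y coordinates. The sketch below recomputes it from the seven points listed on the slide; the function name `mean_center` is chosen here for illustration.

```python
# Point coordinates as listed on the slide.
points = {
    "A": (2.8, 1.5),
    "B": (1.6, 3.8),
    "C": (3.5, 3.3),
    "D": (4.4, 2.0),
    "E": (4.3, 1.1),
    "F": (5.2, 2.4),
    "G": (4.9, 3.5),
}


def mean_center(coords):
    """Return (mean of x, mean of y) for an iterable of (x, y) pairs."""
    xs, ys = zip(*coords)
    n = len(xs)
    return sum(xs) / n, sum(ys) / n


center_x, center_y = mean_center(points.values())
print(f"Mean center: ({center_x:.2f}, {center_y:.2f})")
```

The result rounds to (3.81, 2.51), matching the value shown on the slide.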

36 Weighted Mean Center

37 [Figure: points A to G plotted on axes x = 0 to 6, y = 0 to 4, labeled with weights] A (5), B (20), C (8), D (4), E (6), F (5), G (3)

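39 [Figure: points A to G plotted on axes x = 0 to 6, y = 0 to 4, labeled with weights, with the weighted mean center marked] A (5), B (20), C (8), D (4), E (6), F (5), G (3). Weighted Mean Center (3.10, 2.88)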
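The weighted mean center replaces the simple average with a weighted one: each coordinate is multiplied by its point's weight, and the sums are divided by the total weight. The sketch below combines the coordinates from the mean center slides with the weights shown here, assuming the letters A to G refer to the same points; the function name is chosen for illustration.

```python
# (x, y, weight) per point: coordinates and weights as listed on the slides.
weighted_points = {
    "A": (2.8, 1.5, 5),
    "B": (1.6, 3.8, 20),
    "C": (3.5, 3.3, 8),
    "D": (4.4, 2.0, 4),
    "E": (4.3, 1.1, 6),
    "F": (5.2, 2.4, 5),
    "G": (4.9, 3.5, 3),
}


def weighted_mean_center(records):
    """Return (sum(w*x)/sum(w), sum(w*y)/sum(w)) for (x, y, w) records."""
    records = list(records)
    total_weight = sum(w for _, _, w in records)
    wx = sum(x * w for x, _, w in records) / total_weight
    wy = sum(y * w for _, y, w in records) / total_weight
    return wx, wy


wx, wy = weighted_mean_center(weighted_points.values())
print(f"Weighted mean center: ({wx:.2f}, {wy:.2f})")
```

The result rounds to (3.10, 2.88). Compared with the unweighted center (3.81, 2.51), it is pulled toward point B, which carries the largest weight (20).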

41 Correlation - Bivariate relationship - Scattergrams 1. Direction: negative or positive 2. Strength of relationship: perfect, strong, weak, none

42 [Scattergram: positive (direct) correlation]

43 [Scattergram: negative (inverse) correlation]

44 [Scattergram: perfect correlation]

45 [Scattergram: strong correlation]

46 [Scattergram: weak correlation]

47 [Scattergram: no correlation (??)]

48 [Scattergram: controlled correlation]

49 [Scattergram: controlled correlation (clumping)]

50 [Scattergram: no caption]

51 [Scattergram: threshold]

52 [Scattergram: curvilinear]
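Direction and strength can be summarized in a single number. The slides do not name a specific coefficient, so the sketch below assumes Pearson's r, which ranges from -1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). All data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # perfect positive (direct)
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # perfect negative (inverse)

# Curvilinear case: y = x^2 on a range symmetric about zero.
curve_x = [-2, -1, 0, 1, 2]
r_curve = pearson_r(curve_x, [x ** 2 for x in curve_x])

print(r_positive, r_negative, r_curve)
```

The curvilinear case gives r = 0 even though y is completely determined by x, which is one reason to inspect the scattergram rather than rely on the coefficient alone.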
