Sociology 601 (Martin) Lecture for week 2: September 9 - 11. Chapter 3.1: Making Charts. Chapter 3.2 – 3.5 (if time permits): Measures of central tendency.

1 Sociology 601 (Martin) Lecture for week 2: September 9 - 11
Chapter 3.1:
–Making charts
Chapter 3.2 – 3.5 (if time permits):
–Measures of central tendency
–Measures of variation
Walk-through of the STATA graphical user interface.

2 Definitions for charts
frequency distribution: a graph listing intervals of possible values for a variable (on the x-axis) and the number of observations in each interval (on the y-axis).
relative frequency distribution: as above, but the y-axis shows the percent or proportion of observations in each interval.
bar graph: used when the variable is ordinal or nominal scale.
–The bars should not touch.
histogram: used when the variable is interval scale.
–The bars should touch.

3 General Rules for Relative Frequency Distributions
Whether you are making a bar graph or histogram:
–Make sure each observation is in one and only one category.
–Use categories of equal width.
–Choose an appealing number of categories.
–Decide whether to provide labels.
–Double-check your graph.
If you use fewer bars to describe the distribution of a variable, you lose information but gain clarity.

4 Example from Text, p. 36
Murders per 100,000 population, by State for 1993
Alabama       11.6    Louisiana       20.3    Ohio             6.0
Alaska         9.0    Maine            1.6    Oklahoma         8.4
Arizona        8.6    Maryland        12.7    Oregon           4.6
Arkansas      10.2    Massachusetts    3.9    Pennsylvania     6.8
California    13.1    Michigan         9.8    Rhode Island     3.9
Colorado       5.8    Minnesota        3.4    South Carolina  10.3
Connecticut    6.3    Mississippi     13.5    South Dakota     3.4
Delaware       5.0    Missouri        11.3    Tennessee       10.2
Florida        8.9    Montana          3.0    Texas           11.9
Georgia       11.4    Nebraska         3.9    Utah             3.1
Hawaii         3.8    Nevada          10.4    Vermont          3.6
Idaho          3.5    New Hampshire    2.0    Virginia         8.3
Illinois      11.4    New Jersey       5.3    Washington       5.2
Indiana        7.5    New Mexico       8.0    West Virginia    6.9
Iowa           2.3    New York        13.3    Wisconsin        4.4
Kansas         6.4    North Carolina  11.3    Wyoming          3.4
Kentucky       6.6    North Dakota     1.7

5 Frequency Distribution: Murders per 100,000 population for 1993, by State (chart not included in this transcript)
What have we lost? What have we gained?

6 Relative Frequency Distribution: Murders per 100,000 population, by State (chart not included in this transcript)

7 Collapsed Relative Frequency Distribution: Murders per 100,000 population, by State (chart not included in this transcript)
What have we lost? What have we gained?
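
Because the three charts are not reproduced in this transcript, the following is a minimal Python sketch of how such distributions could be tabulated from the state data on slide 4. Python is used only for illustration (the course itself uses STATA). The interval widths of 3 and 6 are assumptions, and the function names are hypothetical; the intervals on the original charts may differ.

```python
# Murder rates per 100,000 for the 50 states in 1993, copied from slide 4.
RATES = [
    11.6, 9.0, 8.6, 10.2, 13.1, 5.8, 6.3, 5.0, 8.9, 11.4,
    3.8, 3.5, 11.4, 7.5, 2.3, 6.4, 6.6, 20.3, 1.6, 12.7,
    3.9, 9.8, 3.4, 13.5, 11.3, 3.0, 3.9, 10.4, 2.0, 5.3,
    8.0, 13.3, 11.3, 1.7, 6.0, 8.4, 4.6, 6.8, 3.9, 10.3,
    3.4, 10.2, 11.9, 3.1, 3.6, 8.3, 5.2, 6.9, 4.4, 3.4,
]


def frequency_distribution(values, width):
    """Count observations in equal-width intervals [0, width), [width, 2*width), ..."""
    n_bins = int(max(values) // width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int(v // width)] += 1  # each observation lands in exactly one interval
    return counts


def relative_frequencies(counts):
    """Convert counts to proportions of all observations."""
    total = sum(counts)
    return [c / total for c in counts]


def show(values, width):
    """Print one line per interval: bounds, count, and percent of observations."""
    counts = frequency_distribution(values, width)
    for i, (c, p) in enumerate(zip(counts, relative_frequencies(counts))):
        lower = i * width
        print(f"{lower:4.1f} to under {lower + width:4.1f}: {c:2d}  {p:5.0%}")


show(RATES, 3)  # finer intervals (assumed width of 3)
print()
show(RATES, 6)  # collapsed: fewer, wider intervals (assumed width of 6)
```

Comparing the two printouts illustrates the trade-off from slide 3: the wider intervals are easier to read, but they hide detail such as the empty interval between the bulk of the states and Louisiana.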

8 3.2: Measuring central tendency - mean
Mean: the sum of the measurements divided by the number of measurements.
Equation for the mean of a sample: $\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$
or, if you don't have an equation editor, Y bar = SUM(Y i) / n
where…
Y bar is the sample mean
Y i is a measurement of Y for case i
n is the number of cases in the sample
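
As a quick illustration, the equation for the sample mean can be sketched in Python (used here only for illustration; the course uses STATA). The function name is hypothetical, and the eight values are those read from the practice exercise on slide 13.

```python
def sample_mean(values):
    """Y bar = SUM(Y i) / n."""
    return sum(values) / len(values)


sample = [1, 2, 3, 3, 4, 4, 7, 8]  # values read from the practice slide
print(sample_mean(sample))         # 32 / 8
```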

9 Weighted means
Weighted sample mean: the sum of the measurements, each multiplied by its weight (the number of cases that observation represents), divided by the sum of the weights.
–Example: we could weight the state murder rates by the number of persons in each state in 1993 to get the mean murder rate for persons in the US.
If n = 2, the equation for the weighted mean is $\bar{Y}_w = \frac{w_1 Y_1 + w_2 Y_2}{w_1 + w_2}$, where w 1 and w 2 are the weights for cases 1 and 2.
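
A minimal Python sketch of a weighted mean for two cases follows. The two rates are Alabama's and Alaska's from slide 4, but the population weights are hypothetical round numbers chosen for illustration, not actual 1993 populations; the function name is also hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


# Rates for Alabama and Alaska from slide 4. The populations (in millions)
# are HYPOTHETICAL round numbers, not actual 1993 figures.
rates = [11.6, 9.0]
populations = [4.0, 0.6]

unweighted = sum(rates) / len(rates)
weighted = weighted_mean(rates, populations)
print(round(unweighted, 2), round(weighted, 2))
```

With these assumed weights the weighted mean sits closer to the rate of the larger state, which is the point of weighting by population: it describes the typical person rather than the typical state.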

10 3.3 Other measures of central tendency
Median: the measurement that falls in the middle of an ordered sample (with an even number of cases, the midpoint of the two middle measurements).
–The median is the value of the 50th percentile.
Percentile: the pth percentile is the number such that p% of scores fall below it and (100-p)% of scores fall above it.
Mode: the value that occurs most frequently.
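
These three measures can be sketched with Python's standard `statistics` module (Python 3.8 or later), again using the eight values read from the practice slide. Software packages use slightly different conventions for interpolating percentiles, so the quartiles below reflect Python's default method and may not match STATA or the textbook exactly.

```python
import statistics

sample = [1, 2, 3, 3, 4, 4, 7, 8]  # values read from the practice slide

median = statistics.median(sample)             # midpoint of the two middle values
modes = statistics.multimode(sample)           # every value tied for most frequent
quartiles = statistics.quantiles(sample, n=4)  # 25th, 50th, and 75th percentiles

print(median, modes, quartiles)
```

This sample has two modes, which is why `multimode` is used rather than a function that returns a single value.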

11 3.4: Measures of variation
range: the difference between the largest and smallest observations.
interquartile range: the difference between the 75th and 25th percentiles.
deviation: for any observation, the difference between that observation and the sample mean: D i = Y i - Y bar.
(One averaged measure of variation for a sample would be the mean of the absolute values of all the deviations for the sample.)
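
A short Python sketch of these measures, using the eight values read from the practice slide. The interquartile range depends on the percentile convention; Python's default is used here as an assumption.

```python
import statistics

sample = [1, 2, 3, 3, 4, 4, 7, 8]  # values read from the practice slide
y_bar = sum(sample) / len(sample)

value_range = max(sample) - min(sample)

q1, _, q3 = statistics.quantiles(sample, n=4)
iqr = q3 - q1  # depends on the percentile convention used

deviations = [y - y_bar for y in sample]  # D i = Y i - Y bar
mean_abs_dev = sum(abs(d) for d in deviations) / len(sample)

print(value_range, iqr, deviations, mean_abs_dev)
```

The raw deviations always sum to zero, so they cannot be averaged directly; taking absolute values (as here) or squares (as in the variance) is what makes an average deviation informative.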

12 Variance and standard deviation: the most common measures of variation
variance: approximately the mean of the squared deviations for a sample (the sum of squared deviations is usually divided by n - 1 rather than n), labeled s².
standard deviation: the square root of the variance, or the root mean squared deviation, labeled s.
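
The slide's own equations are not shown in this transcript. Written out in the notation of slide 8, and assuming the n - 1 denominator that is conventional for a sample variance, the two definitions would read:

```latex
s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1},
\qquad
s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}
```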

13 Practice: Calculate the mean, variance, and standard deviation.
y i    y bar    y i - y bar    (y i - y bar)^2
1
2
3
3
4
4
7
8
Σ y i:            Σ (y i - y bar)^2:
y bar:            s^2:            s:
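
One way to check the practice exercise is the Python sketch below. It assumes the eight sample values are 1, 2, 3, 3, 4, 4, 7, 8 (an interpretation of the table as it appears in this transcript) and that the variance uses an n - 1 denominator.

```python
import math

sample = [1, 2, 3, 3, 4, 4, 7, 8]  # ASSUMPTION: values as read from the practice table
n = len(sample)

y_bar = sum(sample) / n
squared_devs = [(y - y_bar) ** 2 for y in sample]
sum_sq = sum(squared_devs)

variance = sum_sq / (n - 1)  # ASSUMPTION: n - 1 in the denominator
std_dev = math.sqrt(variance)

print(f"sum of y = {sum(sample)}, y bar = {y_bar}")
print(f"sum of squared deviations = {sum_sq}")
print(f"s^2 = {variance:.3f}, s = {std_dev:.3f}")
```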

14 Interpreting the standard deviation.
s is (formally) the root mean squared deviation.
s is one version of the typical distance of an observation from the sample mean.
Because s is based on squared deviations, it is strongly affected by extreme scores.
–Is this a desirable property?
–Compare these samples: (-3,-3,+3,+3) vs (-2,-2,-2,+6)
Generally, for a continuous quantitative variable Y with a roughly bell-shaped distribution, about 68% of scores fall between Y bar - s and Y bar + s.
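
The comparison of the two samples can be worked through with a small Python sketch. The function names are hypothetical, and the standard deviation assumes an n - 1 denominator.

```python
import math


def mean_abs_deviation(values):
    """Mean of the absolute deviations from the sample mean."""
    y_bar = sum(values) / len(values)
    return sum(abs(y - y_bar) for y in values) / len(values)


def sample_sd(values):
    """Sample standard deviation (ASSUMPTION: n - 1 denominator)."""
    y_bar = sum(values) / len(values)
    return math.sqrt(sum((y - y_bar) ** 2 for y in values) / (len(values) - 1))


a = [-3, -3, 3, 3]
b = [-2, -2, -2, 6]
for name, s in (("a", a), ("b", b)):
    print(name, mean_abs_deviation(s), round(sample_sd(s), 3))
```

Both samples have a mean of 0 and the same mean absolute deviation, but the second has the larger standard deviation because squaring gives the single extreme score (+6) extra weight.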

15 Interpreting sample statistics.
Recall that…
–A statistic is a single number calculated from a sample.
–A parameter is a single number that summarizes some quality of a variable in a population.
For means:
–The population mean is μ (mu).
–The sample mean Y bar is an estimator of μ.
For standard deviations:
–The population standard deviation is σ (sigma).
–The sample standard deviation s is an estimator of σ.

16 A conceptual map of STATA
source --------- interface --------- output
.do file, outside data set, command window, log file, data editor, results window, graphics, interactive data entry, pull-down menus, active data set, icons

17 The STATA windows environment - icons
–Open (use)
–Save
–Print Results
–Begin Log
–Start viewer
–Bring results window to front
–Bring graph window to front
–Do-file editor
–Data editor
–Data browser
–Clear
–Break

18 The .do file: interface of choice for social research
Icons within the .do file:
–New
–Open
–Save
–Print
–Find
–Cut
–Copy
–Paste
–Undo
–Do current file
–Run current file

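19 Sample commands in a .do file
* load the course data set, replacing any data already in memory
use "I:\601Fall08\socy601data.dta", clear
* summary statistics for AGE, unweighted and then weighted by ADULTS
summarize AGE
summarize AGE [weight=ADULTS]
* frequency table for AGE, unweighted and then weighted by ADULTS
tabulate AGE
tabulate AGE [weight=ADULTS]
* remove the data set from memory
clear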

20 How to create a log file
One approach is to use the log icon to start and stop a log.
Another approach is to type the log commands into a .do file:
log using I:\601Fall08\week01hmwk.txt, replace
*... (your work here)...
log close

