Descriptive Statistics Renan Levine. Frequency Table One can easily display all of the responses to survey questions in a frequency table. Public Agenda.


1 Descriptive Statistics Renan Levine

2 Frequency Table
One can easily display all of the responses to survey questions in a frequency table.
Source: Public Agenda Foundation Poll: Stand By Me--What Teachers Think About Unions, Merit Pay, and Other Professional Matters. March 17-April 31, Mail survey conducted by Robison and Muenster with a sample provided by Market Data Retrieval (MDR). Available at the Roper Center Data Archive.
How much do you agree or disagree with the following statements: Without a union, teachers would be vulnerable to school politics or administrators who abuse their power
Strongly agree: 56.5%
Somewhat agree: 31.2%
Somewhat disagree: 8.2%
Strongly disagree: 4.1%
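
As a rough illustration of how such a table is built from raw answers, here is a minimal Python sketch. The `responses` list is made up for the example (it is not the Public Agenda data), and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical raw responses, invented for illustration only.
responses = (
    ["Strongly agree"] * 5
    + ["Somewhat agree"] * 3
    + ["Somewhat disagree"] * 1
    + ["Strongly disagree"] * 1
)

def frequency_table(values, order):
    """Return (category, count, percent) rows in the given category order."""
    counts = Counter(values)
    total = len(values)
    return [(cat, counts[cat], 100 * counts[cat] / total) for cat in order]

order = ["Strongly agree", "Somewhat agree", "Somewhat disagree", "Strongly disagree"]
table = frequency_table(responses, order)

# Print one row per category: label, count, percent of all responses.
for category, count, percent in table:
    print(f"{category:<18} {count:>3} {percent:5.1f}%")
```

Passing the category order explicitly keeps the rows in the survey's own order instead of sorting them by frequency.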

3 Describe the distribution of the responses!
Let me suggest: More than half of all teachers strongly agree that without a union, teachers would be vulnerable to school politics or administrators. A further _____ somewhat agree that they would be vulnerable without a union. Only ____ disagree. From this we would conclude that teachers are {badly split}/{tend to agree/___} on their vulnerability without unions.
How much do you agree or disagree with the following statements: Without a union, teachers would be vulnerable to school politics or administrators who abuse their power
Strongly agree: 56.5%
Somewhat agree: 31.2%
Somewhat disagree: 8.2%
Strongly disagree: 4.1%

4 Professional sounding descriptions
Univariate descriptive statistics exist to succinctly give people a mental picture of the distribution of the observations.
Primary focus on what is the “typical” observation.
- Average (mean) response
- Middle (median) response
- Most common response (mode)
Secondary: how typical is “typical” (or, are many observations different from “typical”).

5 Typical? Measures of central tendency
Mode = Most frequent observation.
- Just look at which category has the most observations.
Median = observation in the middle
- Order observations by category in ascending or descending order.
- Look at which category has the “middle” observation, so that half of all observations are higher and half are lower.
Mean = Average
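
The three measures can be sketched with Python's standard `statistics` module. The `values` list below is a small made-up example, not data from any of the surveys cited in these slides.

```python
import statistics

# Hypothetical observations, invented for illustration only.
values = [1, 2, 2, 3, 7]

mode_value = statistics.mode(values)      # most frequent observation
median_value = statistics.median(values)  # middle observation once sorted
mean_value = statistics.mean(values)      # sum divided by number of observations

print("mode:", mode_value)
print("median:", median_value)
print("mean:", mean_value)
```

Here the single large value (7) pulls the mean above the mode and median.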

6 Calculating an Average
Order the observations in ascending or descending order.
Value 1 * number of observations with Value 1 = X
Value 2 * number of observations with Value 2 = Y
Value 3 * number of observations with Value 3 = Z
Average = (X + Y + Z) ÷ the total number of observations.
Mean is just a technical name for an average.
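
The calculation above can be sketched in Python as follows. The `counts` table is a hypothetical frequency table (value mapped to number of observations), and `mean_from_counts` is an illustrative helper name.

```python
# Hypothetical frequency table: value -> number of observations with that value.
counts = {1: 2, 2: 5, 3: 3}

def mean_from_counts(counts):
    """Multiply each value by its count, add the products, divide by the total count."""
    weighted_sum = sum(value * n for value, n in counts.items())
    total = sum(counts.values())
    return weighted_sum / total

average = mean_from_counts(counts)
print(average)
```

With these counts, X = 1 * 2, Y = 2 * 5, Z = 3 * 3, and the total number of observations is 10.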

7 What is typical? Choosing the right descriptive statistic depends primarily on the level of measurement of the variable. To ascertain what is “typical” one must first assess what level of measurement is used.

8 Summary: Levels of Measurement
Nominal – no order (usually includes dichotomous)
Ordinal – ordered, but no set distance between categories/values.
Interval/ratio – ordered, with a constant, standard distance between categories/values.

9 Levels of Measurement: Nominal
Nominal – categories are unordered
- Only differentiates categories.
- Categories are presented in an arbitrary order.
- Usually includes dichotomous variables.
- Examples:
  Provinces (QC, ON, NB, BC…)
  Occupation (Teacher, Manager, Retail clerk…)
  Which party did you vote for in the last election? (Liberals, NDP, Greens…)
  Do you approve of the performance of the Prime Minister? (Yes, No, I don’t know)

10 Other Key Terms Observation = a single datum or case in a set of data. Category = each possible value that the observations can take or in which the observations are assigned. Value = the numerical value assigned to an observation or category. Outlier = an observation with an extreme value relative to the other observations.

11 Nominal? Use mode.
Do you approve of the job performance of the current Mayor of San Diego?
- Yes: 37%
- No: 48%
- I don’t know: 15%
Nominal variables are unordered, so one cannot order the categories to find the median or the mean. The only “typical” measure one can rely on is the mode, the most frequent observation.
- In the example above, mode = “No” with 48%.
- Reporting the mode is more concise than saying, “just under half of all survey respondents do not approve of the Mayor of San Diego, 37% approve, and 15% say they do not know.”

12 Mode
Every variable has a mode or modal category. It can be identified simply by looking at the number of observations (the frequency) in each category. If two categories are tied for the honor of having the most observations, then the variable is said to be “bimodal”.
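
A tie can be detected with `statistics.multimode` (available in Python 3.8 and later), which returns every value that shares the highest count. The `answers` list is hypothetical and only meant to show a bimodal case.

```python
import statistics

# Hypothetical answers, invented so that two categories tie.
answers = ["Yes", "No", "No", "Yes", "I don't know"]

modes = statistics.multimode(answers)  # all categories tied for most frequent
is_bimodal = len(modes) == 2

print(modes, is_bimodal)
```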

13 Example I: Find the mode?
USA Today/Gallup Poll # : January – Economy / Obama: What should be the primary goal for the United States in Afghanistan:
1 Building a stable democratic government in Afghanistan: 300 (30.6%)
2 Weakening terrorists’ ability to stage attacks against the USA…: 557 (56.8%)
3 Both equally…: 123 (12.6%)
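
When the data arrive as a frequency table like this one, the mode is simply the category with the largest count. A minimal sketch using the three counts shown above (the variable names are illustrative):

```python
# Counts from Example I, keyed by the category number on the slide.
counts = {1: 300, 2: 557, 3: 123}

modal_category = max(counts, key=counts.get)  # category with the largest count
total = sum(counts.values())
modal_percent = round(100 * counts[modal_category] / total, 1)

print(modal_category, modal_percent)
```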

14 Example II: Find the mode?
Worldviews 2002: American and European Public Opinion on Foreign Policy (Chicago Council on Foreign Relations): There has been some discussion about whether the US should use its troops to invade Iraq and overthrow the government of Saddam Hussein. Which of the following positions is closest to yours:
1 The US should not invade Iraq…: 108 (15.3%)
2 The US should only invade Iraq with UN approval & the support of its allies…: 452 (64.1%)
3 The US should invade Iraq even if they have to go it alone…: 145 (20.6%)

15 Levels of Measurement: Ordinal
Ordinal – ordered, but no set distance between categories/values.
Examples:
- Any question that presents a statement and asks respondents to indicate: Strongly agree, agree, neither agree nor disagree, disagree, or strongly disagree, like:
  We have gone too far in pushing equal rights in this country (Canadian Election Survey 2004, MBS_A1)
  People who don’t vote have no right to criticize the government (Canadian Election Survey 2004, MBS_E1)

16 Ordinal? Find the median (usually)
The median is the value of the middle observation in an ordered distribution.
- If there is an even number of observations, take the average of the middle two observations.
The mean is also often reported, especially if the ordinal variable has many categories and there are no values that are unusually high compared to the other observations.

17 Median Example: College Shootings
There were nine shootings at US universities and colleges between January 2010 and December 2013.*
- The numbers of people killed in these shootings were (chronological order): 3, 3, 2, 7, 0, 0, 3, 1, 0
First, order the observations in ascending order: 0, 0, 0, 1, 2, 3, 3, 3, 7
There are nine observations, so the median is the fifth one in order: 2.
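
The same steps can be checked in Python with the figures from this slide; `statistics.median` sorts the observations and picks the middle one. The short even-length list at the end is a made-up example showing the averaging rule.

```python
import statistics

# Number of people killed at each of the nine shootings, in chronological order.
killed = [3, 3, 2, 7, 0, 0, 3, 1, 0]

ordered = sorted(killed)                   # ascending order
middle_index = len(ordered) // 2           # index 4 is the fifth of nine observations
median_killed = statistics.median(killed)

# With an even number of observations, the two middle values are averaged.
even_example = statistics.median([0, 0, 1, 3])  # hypothetical list

print(ordered, median_killed, even_example)
```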

18 Median Example: Shootings
The numbers of people wounded at those shootings were (in chronological order): 3, 0, 0, 3, 4, 2, 0, 0, 3
What was the median number of people wounded in those nine shooting events?
A. 0
B. 2
C. 3
D. 4

19 Median? Confidence in Unions
2004 Canadian Election Study, MBS_D5: Please indicate how much confidence you have in the following institutions? Unions.
What is the median?
- Are most Canadians confident in Unions?
Note: Unweighted responses are not reflective of the population.

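20 Median? Unions Example
There are 1632 non-missing observations, so the median observation is the 816th (strictly, with an even number of observations the median lies between the 816th and 817th, which here fall in the same category).
Look at the frequency column.
- There are only 86 observations in the first row, plus 445 in the second row = 531.
- So the 816th observation must be among the 735 observations in the 3rd row, which cover the 532nd through 1266th observations.
Conclude that the median is 3 = Not very much.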

21 Median? Unions Example in SPSS
Remember, the median observation is where half of all observations are below and half of all observations are above.
Look at the column on the far right, “Cum[ulative] Percent.” Find the first row that surpasses 50%.
- The second row is 32.54%, so the median must be higher than the second value.
- The third row is 77.57%, so the median, the observation that puts the distribution over 50%, must be here, since 50% is greater than 32.54% and less than 77.57%.
Any statistical package will also report the median for you below this table. In this case the median is ‘3’ = Not very much.
Note: Unweighted responses
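
The cumulative-percent reading can be reproduced with a short Python sketch. The first three counts (86, 445, 735) and the total of 1632 come from the slides; the fourth count (366) is inferred here as the remainder and is an assumption, as is the helper name `median_category`.

```python
from itertools import accumulate

# Category value -> frequency. Rows 1-3 are from the slides;
# row 4 is inferred as 1632 - (86 + 445 + 735) and is an assumption.
counts = {1: 86, 2: 445, 3: 735, 4: 366}

total = sum(counts.values())
cumulative = list(accumulate(counts.values()))            # running totals per row
cum_percent = [round(100 * c / total, 2) for c in cumulative]

def median_category(counts):
    """Return the first category whose cumulative count passes half of all observations.

    If a cumulative count lands exactly on half, the median falls between two
    categories; this sketch then returns the higher of the two.
    """
    total = sum(counts.values())
    running = 0
    for value, n in counts.items():
        running += n
        if running * 2 > total:
            return value

print(cum_percent, median_category(counts))
```

The second and third cumulative percentages match the 32.54% and 77.57% read off the SPSS output.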

22 Median? Follow news about economic stimulus
USA Today/Gallup Poll # : January – Economy / Obama: How closely have you been following the news about new economic stimulus proposals announced by President Obama and considered by Congress this past week?
What is the median?
- Did most Americans say they were paying close attention to one of President Obama’s first policy initiatives?
Note: Unweighted responses are not reflective of the population.

23 Categories and modes In the example above, looking at attention towards Obama’s economic stimulus, what is the mode – “Very Closely”, “Somewhat Closely,” “Not too closely” or “Not at all”? Because the mode is the most common observation value, remember that the mode is sensitive to the number of possible categories. What would be the mode if you combine everyone who says “Very Closely” or “Somewhat Closely,” into one category and everyone who says “Not too closely” or “Not at all” into a second category?
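
To see how combining categories can change the mode, consider the sketch below. The counts are entirely hypothetical (they are not the Gallup results), and the combined category labels are made up for the example.

```python
# Hypothetical counts, invented for illustration; NOT the actual poll results.
detailed = {
    "Very closely": 220,
    "Somewhat closely": 260,
    "Not too closely": 300,
    "Not at all": 120,
}

def mode_of(counts):
    """Return the category with the most observations."""
    return max(counts, key=counts.get)

# Collapse the four categories into two broader ones.
collapsed = {
    "Closely (very or somewhat)": detailed["Very closely"] + detailed["Somewhat closely"],
    "Not closely (not too or not at all)": detailed["Not too closely"] + detailed["Not at all"],
}

print(mode_of(detailed))   # modal category with four categories
print(mode_of(collapsed))  # modal category after combining
```

With these made-up numbers, the modal answer moves from a "not closely" response to the combined "closely" group once the categories are merged.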

24 Levels of Measurement: Interval/Ratio
Interval/ratio – ordered with standardized distances between categories/values.
- Sometimes called “continuous” variables (along with some ordinal variables with plentiful categories).
- Examples:
  Temperature (F or C)
  Income
  Gross Domestic Product (GDP)

25 Continuous? Look at the mean (usually)
For interval/ratio data, the mean should be reported.
- Survey data is rarely interval/ratio, but also look at the mean when the data is ordinal with many categories.

26 Calculating an Average
Order the observations in ascending or descending order.
Value 1 * number of observations with Value 1 = X
Value 2 * number of observations with Value 2 = Y
Value 3 * number of observations with Value 3 = Z
Average = (X + Y + Z) ÷ the total number of observations.
Mean is just a technical name for an average.

27 Ex: Population living on $2 a day Source: Quality of Government (QoG) v6, April 2011 Mean=42.6

28 Feelings towards Hillary Clinton Source: American National Election Study, 2008, v [recoded into 11 categories, and weighted] Mean = 5.9

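29 Example: Feelings towards Hillary Clinton (2008)
[Frequency table with columns Frequency, Percent, and Cum. %; rows run from Strongly dislike to Strongly like, plus a Total row. Cell values were not preserved in this transcript.]
Source: American National Election Study, 2008, v [recoded into 11 categories, and weighted]
Mean = 5.9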

30 Check the median too.
The mean is more sensitive to extreme values.
- When there are one or more observations that are very different from most of the other observations, the mean will be very different from the median.
- You may need to use your judgement as to whether to report the mean or the median. Best to also check the median.
- If there are no extreme outliers, the median and mean will be similar.
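
A quick way to see this sensitivity is to add one extreme value to a small data set and recompute both statistics. The numbers below are hypothetical and chosen only to make the effect obvious.

```python
import statistics

# Hypothetical observations with no outliers.
typical = [30, 32, 35, 38, 40]
mean_before = statistics.mean(typical)
median_before = statistics.median(typical)

# Same observations plus one extreme value.
with_outlier = typical + [500]
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # mean and median agree
print(mean_after, median_after)    # mean jumps, median barely moves
```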

31 Trimmed Mean
With continuous (interval/ratio) variables, some scholars will report the “10% trimmed mean.”
To solve the problem of extreme outliers making the mean atypical of the observations, the trimmed mean calculates the average of all the observations except the highest and lowest 10 percent of the observations.
- In a perfectly symmetrical distribution, the mean is the same as the median and the trimmed mean.
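
A minimal sketch of a trimmed mean in Python follows. The function name `trimmed_mean` and the data are hypothetical, and the rule for how many observations to drop when 10 percent is not a whole number (rounding down here) is an assumption; statistical packages may handle that case differently.

```python
def trimmed_mean(values, proportion=0.10):
    """Average the observations after dropping the lowest and highest `proportion` of them.

    The number dropped from each end is rounded down (an assumption of this sketch).
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and less than 0.5")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)      # observations to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

plain_mean = sum(data) / len(data)
trimmed = trimmed_mean(data)  # drops the 1 and the 100, averages the remaining eight

print(plain_mean, trimmed)
```

In this example the trimmed mean lands on the median (5.5), while the ordinary mean is pulled up by the single extreme value.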

32 Ex: Not much difference between Mean & Median
[Frequency table with columns Frequency, Percent, and Cum. %; rows run from Strongly dislike to Strongly like, plus a Total row. Cell values were not preserved in this transcript.]
Source: American National Election Study, 2008, v [recoded into 11 categories, and weighted]
Mean = 5.9
Median = 6

33 Example: Real GDP – Large Differences Source: Gleditsch, K. S via Quality of Government (QoG) v6, April 2011 Mean = $9, Median = $5, Mean is sensitive to a few very wealthy countries Trimmed Mean = $7549

