Descriptive Statistics Renan Levine. Frequency Table One can easily display all of the responses to survey questions in a frequency table. Ipsos-Reid.

1 Descriptive Statistics Renan Levine

2 Frequency Table
One can easily display all of the responses to a survey question in a frequency table.
Ipsos-Reid Canadian Online Year End Poll, Dec. 3-10, 2012. N=1021 registered voters nationwide. Margin of error ± 3.5%. Results accessed online at: http://www.globaltvedmonton.com/canadians+losing+confidence+in+rcmp+poll/6442780370/story.html
As you may know, the Royal Canadian Mounted Police, otherwise known as the RCMP, has been in the news lately. When it comes to the most senior leadership at the highest levels of the RCMP in its management and accountability of the force, overall, do you think they are doing a:
Great job: 9.0%
Good job: 37.0%
Fair job: 38.0%
Terrible job: 16.0%

3 Describe the distribution of the responses! Let me suggest: More than half of all Canadians think that the RCMP is doing a ________ including _____ who think the RCMP is doing a terrible job. _____ think the RCMP is doing a good or great job. From this we would conclude that Canadians are {badly split}/{tend to agree/___} on the RCMP. As you may know, the Royal Canadian Mounted Police, otherwise known as the RCMP, has been in the news lately. When it comes to the most senior leadership at the highest levels of the RCMP in its management and accountability of the force, overall, do you think they are doing a: Great job: 9.0% Good job: 37.0% Fair job: 38.0% Terrible job: 16.0%
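To describe a distribution like this one, it often helps to add up neighbouring categories. The sketch below does that for the Ipsos-Reid percentages; the function name `combined_share` and the way the categories are grouped are illustrative choices, not part of the original slides.

```python
# Percentages from the Ipsos-Reid frequency table on the RCMP question.
responses = {
    "Great job": 9.0,
    "Good job": 37.0,
    "Fair job": 38.0,
    "Terrible job": 16.0,
}


def combined_share(table, categories):
    """Add up the percentages of several response categories."""
    return sum(table[c] for c in categories)


# Grouping chosen for illustration: the two favourable answers
# versus the two less favourable ones.
good_or_great = combined_share(responses, ["Great job", "Good job"])
fair_or_terrible = combined_share(responses, ["Fair job", "Terrible job"])

print(f"Good or great job: {good_or_great:.1f}%")
print(f"Fair or terrible job: {fair_or_terrible:.1f}%")
```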

4 Professional sounding descriptions
Univariate descriptive statistics exist to succinctly give people a mental picture of the distribution of the observations.
Primary focus on what is the “typical” observation.
- Average (mean) response
- Middle (median) response
- Most common response (mode)
Secondary: how typical is “typical” (or, are many observations different from “typical”).

5 Typical? Measures of central tendency
Mode = Most frequent observation.
- Just look at which category has the most observations.
Median = observation in the middle
- Order observations by category in ascending or descending order.
- Look at which category has the “middle” observation, so that half of all observations are higher and half are lower.
Mean = Average

6 Calculating an Average
Order the observations in ascending or descending order.
Value 1 * Number of Observations with that value = X
Value 2 * Number of Observations with that value = Y
Value 3 * Number of Observations with that value = Z
Average = (X+Y+Z) ÷ the total number of observations.
Mean is just a technical name for an average.
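The recipe above can be written as a single formula. The symbols are introduced here for illustration and do not appear in the slides: v_k is the k-th distinct value, n_k is the number of observations with that value, and N is the total number of observations.

```latex
\bar{x} \;=\; \frac{v_1 n_1 + v_2 n_2 + \dots + v_K n_K}{N}
       \;=\; \frac{\sum_{k=1}^{K} v_k \, n_k}{\sum_{k=1}^{K} n_k}
```

With three distinct values, the numerator is X + Y + Z from the steps above.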

7 What is typical? Choosing the right descriptive statistic depends primarily on the level of measurement of the variable. To ascertain what is “typical” one must first assess what level of measurement is used.

8 Summary: Levels of Measurement
Nominal – no order (usually includes dichotomous variables).
Ordinal – ordered, but no set distance between categories/values.
Interval/ratio – ordered, with a constant, standard distance between categories/values.

9 Levels of Measurement: Nominal
Nominal – categories are unordered
- Only differentiates categories.
- Categories are presented in an arbitrary order.
- Usually includes dichotomous variables.
- Examples:
  Provinces (QC, ON, NB, BC…)
  Occupation (Teacher, Manager, Retail clerk…)
  Which party did you vote for in the last election? (Liberals, NDP, Greens…)
  Do you approve of the performance of the Prime Minister? (Yes, No, I don’t know)

10 Other Key Terms
Observation = a single datum or case in a set of data.
Category = each possible value that the observations can take or to which the observations are assigned.
Value = the numerical value assigned to an observation or category.
Outlier = an observation with an extreme value relative to the other observations.

11 Nominal? Use mode.
Do you approve of the job performance of the current Minister of Human Resources and Social Development?
- Yes – 37%
- No – 48%
- I don’t know – 15%
Nominal variables are unordered, so one cannot order the categories to find the median or the mean. The only “typical” measure one can rely on is the mode, the most frequent observation.
- In the example above, mode = “No” with 48%.
- Reporting the mode is more concise than saying, “just under half of all Canadians do not approve of the Minister of Human Resources, with 15% saying they do not know…”

12 Mode
Every variable has a mode or modal category. It can be identified simply by looking at the frequency of each category. If two categories are tied for the honor of having the most observations, then the variable is said to be “bimodal”.

13 Example: Find the mode?
Canadian Election Study, MBS_B1: Please circle the number that best reflects your opinion. The government should:
1. See to it that everyone has a decent standard of living: 1090 (65.7%)
2. Leave people to get ahead on their own: 384 (23.1%)
8. Not sure: 185 (11.1%)
Note: Unweighted responses are not reflective of the population.
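Finding the mode amounts to picking the category with the largest frequency. A minimal sketch using the MBS_B1 counts follows; the shortened category labels and the helper name `modal_categories` are made up for this example. Returning a list lets the same function report a tie (the “bimodal” case).

```python
# Unweighted frequencies from the MBS_B1 example (labels shortened here).
counts = {
    "Decent standard of living": 1090,
    "Get ahead on their own": 384,
    "Not sure": 185,
}


def modal_categories(freq):
    """Return every category that shares the highest frequency."""
    top = max(freq.values())
    return [category for category, n in freq.items() if n == top]


modes = modal_categories(counts)
print(modes)

# A tie yields two modal categories, i.e. a bimodal variable.
print(modal_categories({"A": 5, "B": 5, "C": 2}))
```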

14 Levels of Measurement: Ordinal
Ordinal – ordered, but no set distance between categories/values.
Examples:
- Any question that presents a statement and asks respondents to indicate: Strongly agree, agree, neither agree nor disagree, disagree, or strongly disagree, like:
  We have gone too far in pushing equal rights in this country (Canadian Election Study 2004, MBS_A1)
  People who don’t vote have no right to criticize the government (Canadian Election Study 2004, MBS_E1)

15 Ordinal? Find the median (usually)
The median is the value of the middle observation in an ordered distribution.
- If there is an even number of observations, take the average of the middle two observations.
The mean is also often reported, especially if the ordinal variable has many categories and there are no values that are unusually high compared to the other observations.

16 Median Example
What is the median number of years between each of the last ten elections (going back to 1980, Liberal Pierre Trudeau’s last triumph)? There were ten elections, so there are nine gaps: 4 years, 4 years, 5, 4, 3, 4, 2, 2, and 3 years.
First, order the observations in ascending order: 2, 2, 3, 3, 4, 4, 4, 4, 5.
There are nine observations, so the median is the fifth one in order: 4 years.
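The same steps can be checked in a few lines. This is only a sketch: the gaps are copied from the slide in the order given, and the variable names are illustrative.

```python
import statistics

# Years between consecutive elections, as listed on the slide.
gaps = [4, 4, 5, 4, 3, 4, 2, 2, 3]

# Step 1: order the observations.
ordered = sorted(gaps)

# Step 2: with an odd number of observations, take the middle one.
middle_index = len(ordered) // 2      # index 4 is the fifth observation
median_by_hand = ordered[middle_index]

print(ordered)
print(median_by_hand)

# The standard library agrees with the manual calculation.
assert median_by_hand == statistics.median(gaps)
```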

17 Median? Confidence in Unions
2004 Canadian Election Study, MBS_D5: Please indicate how much confidence you have in the following institutions: Unions.
What is the median?
- Are most Canadians confident in Unions?
Note: Unweighted responses are not reflective of the population.

18 Median? Unions Example
There are 1632 non-missing observations, so the median observation is the 816th.
Look at the frequency column.
- There are only 86 observations in the first row, plus 445 in the second row = 531.
So, the 816th observation must be among the 735 observations in the 3rd row.
Conclude that the median is 3 = Not very much.
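Counting through the frequency column can be sketched as a running total. The first three counts come from the slide; the count for the fourth category (366) is not stated there and is inferred here as 1632 minus the first three rows, so treat it as an assumption. The function name `category_at` is illustrative.

```python
def category_at(freq_by_value, position):
    """Return the value whose running total first reaches `position`."""
    running = 0
    for value in sorted(freq_by_value):
        running += freq_by_value[value]
        if running >= position:
            return value
    raise ValueError("position exceeds the number of observations")


# Rows 1-3 are from the slide; row 4 is inferred (1632 - 86 - 445 - 735).
counts = {1: 86, 2: 445, 3: 735, 4: 366}
total = sum(counts.values())

# The slide uses the 816th observation as the middle one.
median_value = category_at(counts, total // 2)
print(total, median_value)
```

With an even number of observations the strict rule averages the 816th and 817th observations; both fall in category 3 here, so the answer does not change.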

19 Median? Unions Example in SPSS
Remember, the median observation is where half of all observations are below, and half of all observations are above.
Look at the column on the far right, “Cum[ulative] Percent”. Find the first row that surpasses 50%.
- The second row is 32.54%, so the median must be higher than the second value.
- The third row is 77.57%, so the median, the observation that puts the distribution over 50%, must be here, since 50% is greater than 32.54% and less than 77.57%.
Any statistical package will also report the median for you below this table. In this case the median is ‘3’ = Not very much.
Note: Unweighted responses

20 Median Example II: Bilingualism
WE HAVE GONE TOO FAR IN PUSHING BILINGUALISM IN CANADA.
Value Label          Value   Frequency   Percent   Valid Percent   Cum Percent
STRONGLY AGREE       1.00    406         9.39      25.84           25.84
AGREE                2.00    474         10.96     30.17           56.02
DISAGREE             3.00    488         11.29     31.06           87.08
STRONGLY DISAGREE    4.00    203         4.70      12.92           100.00
TOTAL                .       1571
What is the median observation? In other words, do most Canadians agree that we have gone too far in pushing bilingualism in Canada?
Note: Unweighted responses
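The cumulative percent column can be rebuilt from the frequencies, and the median is the first category whose cumulative percent reaches 50%. The following is a sketch using the four valid-response frequencies from the table; the variable names are illustrative.

```python
# Valid (non-missing) frequencies, in the table's order.
counts = {
    "Strongly agree": 406,
    "Agree": 474,
    "Disagree": 488,
    "Strongly disagree": 203,
}
total = sum(counts.values())

# Build the cumulative percent column from a running total.
running = 0
cumulative = {}
for label, n in counts.items():
    running += n
    cumulative[label] = 100 * running / total

for label, pct in cumulative.items():
    print(f"{label:<18} {pct:6.2f}")

# Median category: first row whose cumulative percent reaches 50%.
median_label = next(label for label, pct in cumulative.items() if pct >= 50)
print("Median:", median_label)
```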

21 Categories and modes
In the example above, looking at attitudes towards bilingualism, what is the mode: “Agree”, “Disagree” or “Strongly Agree”? Because the mode is the most common observation value, remember that the mode is sensitive to the number of possible categories. What would be the mode if you combined everyone who says “Agree” or “Strongly Agree” into one category and everyone who says “Disagree” or “Strongly Disagree” into a second category?
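One way to see how sensitive the mode is to the choice of categories is to recompute it after collapsing. This sketch uses the frequencies from the bilingualism table; the collapsed labels “Agree (any)” and “Disagree (any)” are invented for the example.

```python
# Frequencies from the bilingualism table.
counts = {
    "Strongly agree": 406,
    "Agree": 474,
    "Disagree": 488,
    "Strongly disagree": 203,
}

# Mode with the original four categories.
mode_four = max(counts, key=counts.get)

# Collapse into two categories and find the mode again.
collapsed = {
    "Agree (any)": counts["Strongly agree"] + counts["Agree"],
    "Disagree (any)": counts["Disagree"] + counts["Strongly disagree"],
}
mode_two = max(collapsed, key=collapsed.get)

print(mode_four, counts[mode_four])
print(mode_two, collapsed[mode_two])
```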

22 Levels of Measurement: Interval/Ratio
Interval/ratio – ordered with standardized distances between categories/values.
- Sometimes called “continuous” variables (along with some ordinal variables with plentiful categories).
- Examples:
  Temperature (F or C)
  Income
  Gross Domestic Product (GDP)

23 Continuous? Look at the mean (usually)
For interval/ratio data, the mean should be reported.
- Survey data is rarely interval/ratio, but also look at the mean when the data is ordinal with many categories.

24 Calculating an Average
Order the observations in ascending or descending order.
Value 1 * Number of Observations with that value = X
Value 2 * Number of Observations with that value = Y
Value 3 * Number of Observations with that value = Z
Average = (X+Y+Z) ÷ the total number of observations.
Mean is just a technical name for an average.

25 Ex: Population living on $2 a day Source: Quality of Government (QoG) v6, April 2011 Mean=42.6

26 Feelings towards Conservative Party Source: Canadian Election Study, 2008, CES_MBS_I10a [National Weight] Mean = 4.8

27 Example: Feelings toward Conservatives
                     Frequency   Percent   Cum. %
Strongly dislike 0   100         8.64      8.64
1                    76          6.55      15.19
2                    136         11.7      26.89
3                    87          7.51      34.41
4                    85          7.33      41.74
5                    182         15.63     57.37
6                    108         9.26      66.63
7                    146         12.57     79.2
8                    143         12.3      91.5
9                    52          4.5       96
Strongly like 10     46          4         100
Total                1,162       100
Source: Canadian Election Study, 2008, CES_MBS_I10a [National Weight]
Mean = 4.8
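The reported mean can be approximately reproduced from the frequency column by multiplying each value by its frequency and dividing by the number of observations. This is a sketch, with one caveat: the listed frequencies add up to 1,161 rather than the reported total of 1,162, presumably because the weighted counts were rounded, so the result is only expected to match to one decimal place.

```python
# Value -> frequency, from the feelings-toward-Conservatives table.
freq = {
    0: 100, 1: 76, 2: 136, 3: 87, 4: 85, 5: 182,
    6: 108, 7: 146, 8: 143, 9: 52, 10: 46,
}

n = sum(freq.values())  # sum of the listed (rounded) frequencies
weighted_sum = sum(value * count for value, count in freq.items())
mean = weighted_sum / n

print(n, weighted_sum, round(mean, 1))
```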

28 Check the median too.
The mean is more sensitive to extreme values.
- When there are one or more observations that are very different from most of the other observations, the mean will be very different from the median.
- You may need to use your judgement as to whether to report the mean or the median.
Best to also check the median.
- If there are no extreme outliers, the median and mean will be similar.

29 Trimmed Mean
With continuous (interval/ratio) variables, some scholars will report the “10% trimmed mean.” To solve the problem of extreme outliers making the mean atypical of the observations, the trimmed mean calculates the average of all the observations except the highest and lowest 10 percent of the observations.
- In a perfectly symmetrical distribution, the mean is the same as the median and the trimmed mean.
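A minimal sketch of a trimmed mean follows. The data are hypothetical, made up purely to include one extreme outlier, and the function name `trimmed_mean` is illustrative. It assumes the trimming proportion is below 0.5 so that some observations remain.

```python
import statistics


def trimmed_mean(values, proportion=0.10):
    """Average after dropping the lowest and highest `proportion` of values.

    Assumes 0 <= proportion < 0.5 so that at least one value is kept.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # observations to drop at each end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)


# Hypothetical data: nine ordinary values and one extreme outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

print("mean:        ", statistics.mean(data))
print("median:      ", statistics.median(data))
print("trimmed mean:", trimmed_mean(data))
```

In this made-up example the outlier pulls the ordinary mean well above the median, while the trimmed mean stays close to it.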

30 Ex: Not much difference between Mean & Median
                     Frequency   Percent   Cum. %
Strongly dislike 0   100         8.64      8.64
1                    76          6.55      15.19
2                    136         11.7      26.89
3                    87          7.51      34.41
4                    85          7.33      41.74
5                    182         15.63     57.37
6                    108         9.26      66.63
7                    146         12.57     79.2
8                    143         12.3      91.5
9                    52          4.5       96
Strongly like 10     46          4         100
Total                1,162       100
Source: Canadian Election Study, 2008, CES_MBS_I10a [National Weight]
Mean = 4.8
Median = 5
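As a rough check, the frequency table can be expanded back into individual observations and passed to the standard library. This is a sketch only: it uses the listed (rounded, weighted) frequencies, which sum to 1,161 rather than the reported 1,162.

```python
import statistics

# Value -> frequency, from the table above.
freq = {
    0: 100, 1: 76, 2: 136, 3: 87, 4: 85, 5: 182,
    6: 108, 7: 146, 8: 143, 9: 52, 10: 46,
}

# Expand into one list entry per observation: [0, 0, ..., 1, 1, ..., 10].
observations = [value for value, count in freq.items() for _ in range(count)]

mean = statistics.mean(observations)
median = statistics.median(observations)

print(round(mean, 1), median)
```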

31 Example: Real GDP – Large Differences
Source: Gleditsch, K. S. 2002 via Quality of Government (QoG) v6, April 2011
Mean = $9,089.82
Median = $5,194.48
Trimmed Mean = $7,549
The mean is sensitive to a few very wealthy countries.

