Understanding Basic Statistics


1 Understanding Basic Statistics
Chapter 1 Getting Started Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze

2 What is Statistics?
Collecting data
Organizing data
Analyzing data
Interpreting data

3 Individuals and Variables
Individuals are people or objects included in the study. Variables are characteristics of the individual to be measured or observed.

4 Variables Quantitative Variable – The variable is numerical, so operations such as adding and averaging make sense. Qualitative Variable – The variable describes an individual through grouping or categorization.

5 Data Population Data – The data are from every individual of interest.
Sample Data – The data are from only some of the individuals of interest.

6 Data Which of the following Venn diagrams shows the relationship between population data and sample data? [Answer choices a) through d) are Venn diagrams of circles labeled P and S; the diagrams are not reproduced in this transcript.]

7 Levels of Measurement Nominal Level – The data consists of names, labels, or categories. Ordinal Level – The data can be ordered, but the differences between data values are meaningless.

8 Levels of Measurement Interval Level – The data can be ordered and the differences between data values are meaningful. Ratio Level – The data can be ordered, differences and ratios are meaningful, and there is a meaningful zero value.

9 Levels of Measurement The freezing points of four liquids are 32°F, 6°F, 13°F, and 20°F. What is the level of these measurements? a). Nominal b). Ordinal c). Interval d). Ratio

11 Two Branches of Statistics
Descriptive Statistics: Organizing, summarizing, and graphing information from populations or samples. Inferential Statistics: Using information from a sample to draw conclusions about a population.

12 Sampling Techniques
Simple random sampling (sample size = n): Each member of the population has an equal chance of being selected, and each sample of size n has an equal chance of being selected.
Stratified sampling: The population is divided into subgroups (strata), and a sample is drawn from each subgroup.
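The two techniques on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered members, the four equal-sized subgroups, the function names, and the fixed seed are all hypothetical, not taken from the textbook.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every sample of size n is equally likely."""
    rng = random.Random(seed)  # the seed only makes the demo repeatable
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of n_per_stratum members from each subgroup."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: members numbered 1..100, split into four subgroups.
population = list(range(1, 101))
strata = {f"Subgroup {i + 1}": population[25 * i: 25 * (i + 1)] for i in range(4)}

srs = simple_random_sample(population, n=10, seed=1)
stratified = stratified_sample(strata, n_per_stratum=3, seed=1)
print(srs)
print(stratified)
```

Drawing the same number from every subgroup is one simple design choice; a study could instead sample each subgroup in proportion to its size.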

13 Sampling Techniques
Systematic sampling: Number every member of the population. Select every kth member.
Cluster sampling: The population is naturally divided into pre-existing segments. Make a random selection of clusters, then select all members of each cluster.
Convenience sampling: Collect sample data from a readily available population database.
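A minimal sketch of systematic and cluster sampling follows. The random starting point for the systematic sample is an assumption (the slide only says to select every kth member), and the cluster names and contents are made up for illustration.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k members, then take every kth member."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumption: random start, not stated on the slide
    return population[start::k]

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical data for the demo.
population = list(range(1, 101))
clusters = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}

every_tenth = systematic_sample(population, k=10, seed=0)
from_clusters = cluster_sample(clusters, n_clusters=2, seed=0)
print(every_tenth)
print(from_clusters)
```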

15 Critical Thinking Which of the following sampling strategies is likely to lead to a non-sampling error? Individuals are selected at random from… a). A database of social security numbers. b). A cluster of phone books. c). A collection of birth certificates. d). None of these is likely to introduce non-sampling error. Answer: b). Not everyone has a phone, so sampling from phone books may introduce bias.

16 Guidelines For Planning a Statistical Study
Identify individuals or objects of interest. Specify the variables. Determine if you will use the entire population. If not, determine an appropriate sampling method. Determine a data collection plan, addressing privacy, ethics, and confidentiality if necessary.

17 Guidelines For Planning a Statistical Study
Collect data. Analyze the data using appropriate statistical methods. Note any concerns about the data and recommend any remedies for further studies.

18 Census vs. Sample In a census, measurements or observations are obtained from the entire population (uncommon and often impractical). In a sample, measurements or observations are obtained from part of the population (common).

19 Observational Studies and Experiments
Observational Study – Measurements are obtained in a way that does not change the response or the variable being measured. (No treatment is applied.) Experiment – A treatment is applied in order to observe its effect on the variable being measured.

20 Experiment Used to determine the effect of a treatment.
Experimental design needs to control for other possible causes of the effect, such as the placebo effect and lurking variables. To minimize these confounds, create one or more control groups that receive no treatment.

21 Experiment Designs Double-Blinding – minimizes the unintentional transfer of bias between researcher and subject.

22 Surveys Collecting data from respondents by asking them questions.
Survey pitfalls:
Nonresponse → undercoverage of population.
Truthfulness – respondents sometimes lie.
Faulty recall by the respondent.
Hidden bias – due to poor question wording.
Vague wording – “sometimes”, “often”, “seldom”.
Interviewer influence – who is asking the questions and in what manner.
Voluntary response – relatively interested individuals are more likely to participate.

23 Understanding Basic Statistics
Chapter 2 Organizing Data Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze

24 Frequency Tables
A frequency table organizes quantitative data. It partitions data into classes (intervals) and shows how many data values are in each class.

Test Score    Number of Students
61-70         4
71-80         8
81-90         15
91-100        7

25 Data Classes and Class Frequency
Class: an interval of values. Example: 61 ≤ x ≤ 70. Frequency: the number of data values that fall within a class. “Five data values fall within the class 61 ≤ x ≤ 70”. Relative Frequency: the proportion of data values that fall within a class. “18% of the data fall within the class 61 ≤ x ≤ 70”.
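As a rough illustration of frequency and relative frequency, the sketch below tallies values into classes and converts class counts to proportions. The function names and the short list of raw scores are hypothetical; the class counts reuse the test-score table from the Frequency Tables slide.

```python
def frequency_table(data, class_limits):
    """Count how many data values fall in each class (lower <= x <= upper)."""
    return {(lo, hi): sum(lo <= x <= hi for x in data) for lo, hi in class_limits}

def relative_frequencies(freqs):
    """Convert class frequencies to proportions of the total count."""
    total = sum(freqs.values())
    return {cls: f / total for cls, f in freqs.items()}

# Class frequencies from the test-score table (34 students in total).
score_freqs = {(61, 70): 4, (71, 80): 8, (81, 90): 15, (91, 100): 7}
rel = relative_frequencies(score_freqs)

# Hypothetical raw scores, only to show the tallying step.
limits = [(61, 70), (71, 80), (81, 90), (91, 100)]
tally = frequency_table([61, 65, 70, 71, 100], limits)

for (lo, hi), proportion in rel.items():
    print(f"{lo}-{hi}: {proportion:.1%}")
```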

26 Structure of a Data Class
A “data class” is basically an interval on a number line. It has: A lower limit a and an upper limit b. A width. A lower boundary and an upper boundary (integer data). A midpoint.

27 Structure of a Data Class
A “data class” is basically an interval on a number line. If a = 60 and b = 69 for integer data, what is the value of the lower boundary? a). 60 b). 59.5 c). 9 d). 64.5
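The parts of a data class can be computed directly from its limits. The sketch below uses the limits a = 60 and b = 69 from the question above and assumes integer data, so each boundary sits half a unit outside the corresponding limit; the function names are illustrative.

```python
def class_boundaries(lower_limit, upper_limit):
    """Boundaries for integer data: half a unit outside each class limit."""
    return lower_limit - 0.5, upper_limit + 0.5

def class_midpoint(lower_limit, upper_limit):
    """Midpoint of the class: the average of its two limits."""
    return (lower_limit + upper_limit) / 2

lower_boundary, upper_boundary = class_boundaries(60, 69)
midpoint = class_midpoint(60, 69)
print(lower_boundary, upper_boundary, midpoint)
```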

28 Constructing Data Classes
Find the class width: compute (largest data value − smallest data value) / (desired number of classes), then increase the computed value to the next higher whole number. Find the class limits: the lower limit of the “leftmost” class is set equal to the smallest value in the data set.

29 Constructing Data Classes, cont’d
Find the class boundaries (integer data). Subtract 0.5 from the lower class limit and add 0.5 to the upper class limit. For a certain data set, the minimum value is 25 and the maximum value is 58. If you wish to partition the data into 5 classes, what would be the class width? a). 5 b). 6 c). 7 d). 8
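The class-width question above can be worked through in code. This sketch assumes the usual rule of dividing the data range by the number of classes and always moving up to the next whole number (even when the quotient is already whole), and it assumes integer data when deriving the upper limits; the function names are hypothetical.

```python
import math

def class_width(min_value, max_value, n_classes):
    """(max - min) / number of classes, increased to the next higher whole number."""
    raw = (max_value - min_value) / n_classes
    return math.floor(raw) + 1  # assumption: bump up even if raw is already whole

def class_limits(min_value, width, n_classes):
    """Lower and upper limits for integer data, starting at the smallest value."""
    return [(min_value + i * width, min_value + (i + 1) * width - 1)
            for i in range(n_classes)]

width = class_width(25, 58, 5)        # (58 - 25) / 5 = 6.6, increased to 7
limits = class_limits(25, width, 5)
print(width, limits)
```

With a width of 7 the five classes run from 25 through 59, so the maximum value of 58 is covered.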

31 Histograms Histogram – graphical summary of a frequency table.
Uses bars to plot the data classes versus the class frequencies.

32 Making a Histogram Make a frequency table.
Place class boundaries on horizontal axis. Place frequencies on vertical axis. For each class, draw a bar with height equal to the class frequency and width equal to the class width (the bar spans from the lower class boundary to the upper class boundary).

33 Making a Histogram

34 Distribution Shapes Symmetric, Uniform, Bimodal, Skewed Left, Skewed Right

35 Graphical Displays…
… represent the data.
… induce the viewer to think about the substance of the graphic.
… should avoid distorting the message of the data.

36 Bar Graphs Used for qualitative or quantitative data.
Can be vertical or horizontal. Bars are uniformly spaced and have equal widths. Length/height of bars indicates counts or percentages of the variable. “Good practice” requires including titles and units and labeling axes.

37 Bar Graphs Example:

38 Pareto Charts A bar chart with two specific features:
Heights of bars represent frequencies. Bars are vertical and are ordered from tallest to shortest.
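The ordering step behind a Pareto chart is just a sort by frequency. The sketch below uses hypothetical counts of favorite ice cream flavors and prints a crude text version of the chart; it is not tied to any plotting library.

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted from most to least frequent."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Hypothetical counts of favorite flavors among 100 children.
flavor_counts = {"Vanilla": 28, "Chocolate": 35, "Strawberry": 17, "Mint": 12, "Other": 8}
ordered = pareto_order(flavor_counts)

for flavor, count in ordered:
    print(f"{flavor:<11}{'#' * count} {count}")
```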

39 Circle Graphs/Pie Charts
Used for qualitative data. Wedges of the circle represent proportions of the data that share a common characteristic. “Good practice” requires including a title and either wedge labels or legend.

40 Time-Series Shows data measurements in chronological order.
Data are plotted in order of occurrence at regular intervals over a period of time.

41 Critical Thinking – which type of graph to use?
Bar graphs are useful for quantitative or qualitative data. Pareto charts identify the frequency in decreasing order. Circle graphs display how a total is dispersed into several categories. Time-series graphs display how data change over time.

42 Critical Thinking – which type of graph to use?
What type of graph would be best for showing the ice cream flavor preferences of a group of 100 children? a). Histogram b). Pareto graph c). Time series graph d). Circle graph

44 Stem and Leaf Plots Displays the distribution of the data while maintaining the actual data values. Each data value is split into a stem and a leaf.

45 Stem and Leaf Plot Construction

46 Critical Thinking Large gaps between stems containing leaves, especially at the top or bottom, suggest the existence of outliers. Watch the outliers – are they data errors or simply unusual data values?

47 Averages and Variation
Chapter 3 Averages and Variation Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze

48 Measures of Central Tendency
Average – a measure of the center value or central tendency of a distribution of values. Three types of average: mode, median, and mean.

49 Mode The mode is the most frequently occurring value in a data set.
Example: Sixteen students are asked how many college math classes they have completed. {0, 3, 2, 2, 1, 1, 0, 5, 1, , 0, 2, 2, 7, 1, 3} The mode is 1.
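Finding the mode amounts to counting occurrences and keeping the most frequent value. The sketch below uses a small hypothetical data set (not the sixteen-student data above) and returns every value tied for the highest count, since a data set can have more than one mode.

```python
from collections import Counter

def modes(data):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

# Hypothetical numbers of math classes completed by ten students.
classes_completed = [0, 1, 1, 2, 3, 1, 0, 2, 1, 5]
print(modes(classes_completed))
```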

50 Median Finding the median:
1). Order the data from smallest to largest. 2). For an odd number of data values: Median = Middle data value 3). For an even number of data values: Median = (Sum of the middle two values) / 2
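The median steps translate almost line for line into code. This is a minimal sketch with a hypothetical function name and made-up inputs.

```python
def median(data):
    """Median by the ordered-data rule: middle value, or mean of the middle two."""
    ordered = sorted(data)            # step 1: order from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # step 2: odd count, take the middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2  # step 3: even count

print(median([7, 1, 5]))
print(median([7, 1, 5, 3]))
```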

51 Mean Sample mean: $\bar{x} = \frac{\sum x}{n}$ Population mean: $\mu = \frac{\sum x}{N}$

52 Resistant Measures of Central Tendency
A resistant measure will not be affected by extreme values in the data set. The mean is not resistant to extreme values. The median is resistant to extreme values. A trimmed mean is also resistant.
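A small experiment makes resistance concrete. The data below are hypothetical and contain one extreme value; the trimmed mean here drops a fixed fraction of the ordered values from each end, which is one common way to define it and is an assumption rather than the textbook's exact procedure.

```python
import statistics

def trimmed_mean(data, trim_fraction):
    """Mean after dropping trim_fraction of the ordered values from each end."""
    ordered = sorted(data)
    k = int(trim_fraction * len(ordered))   # number of values dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value (200).
data = [10, 12, 13, 14, 15, 15, 16, 17, 18, 200]

mean_value = statistics.mean(data)        # pulled upward by the extreme value
median_value = statistics.median(data)    # barely affected
trimmed_value = trimmed_mean(data, 0.10)  # drops 10 and 200 before averaging
print(mean_value, median_value, trimmed_value)
```

Here the mean lands far above most of the data, while the median and the 10% trimmed mean stay near the bulk of the values.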

53 Critical Thinking Four levels of data – nominal, ordinal, interval, ratio (Chapter 1) Mode – can be used with all four levels. Median – may be used with ordinal, interval, or ratio level. Mean – may be used with interval or ratio level.

54 Critical Thinking Mound-shaped data – values of mean, median and mode are nearly equal.

55 Measures of Variation
Three measures of variation: range, variance, and standard deviation.
Range = Largest value – smallest value. Only two data values are used in the computation, so much of the information in the data is lost.
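The three measures can be sketched as follows. The slide does not spell out the variance formulas, so the code assumes the standard definitions: the sum of squared deviations from the mean divided by n − 1 for a sample and by N for a population, with the standard deviation as the square root. The data set is hypothetical.

```python
import math

def data_range(data):
    """Largest value minus smallest value."""
    return max(data) - min(data)

def variance(data, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 (sample) or n."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)

def standard_deviation(data, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(data, sample))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean 5
print(data_range(data))
print(variance(data), standard_deviation(data))
print(variance(data, sample=False), standard_deviation(data, sample=False))
```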

56 The Coefficient of Variation
For samples: $CV = \frac{s}{\bar{x}} \cdot 100\%$ For populations: $CV = \frac{\sigma}{\mu} \cdot 100\%$
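As a quick illustration, the coefficient of variation expresses the standard deviation as a percentage of the mean. The sketch below relies on Python's standard `statistics` module and a hypothetical data set; the function names are made up for this example.

```python
import statistics

def cv_sample(data):
    """Sample CV: sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

def cv_population(data):
    """Population CV: population standard deviation as a percentage of the mean."""
    return statistics.pstdev(data) / statistics.mean(data) * 100

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean 5
print(f"sample CV: {cv_sample(data):.1f}%")
print(f"population CV: {cv_population(data):.1f}%")
```

Because the CV has no units, it can be used to compare the spread of data sets measured on different scales.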

57 Percentiles and Quartiles
For whole numbers P, 1 ≤ P ≤ 99, the Pth percentile of a distribution is a value such that P% of the data fall at or below it, and (100-P)% of the data fall at or above it. Q1 = 25th percentile. Q2 = 50th percentile = the median. Q3 = 75th percentile.

58 Quartiles and Interquartile Range (IQR)

59 Computing Quartiles

60 Five Number Summary
A listing of the following statistics: Minimum, Q1, Median, Q3, Maximum.
Box-and-whisker plot – represents the five-number summary graphically.
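A five-number summary can be computed with a short helper. Quartile conventions differ between textbooks and software; this sketch assumes the median-of-halves rule (Q1 is the median of the lower half of the ordered data and Q3 the median of the upper half, leaving out the overall median when the count is odd). The data set is hypothetical.

```python
import statistics

def quartiles(data):
    """Q1, median, Q3 using the median-of-halves convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)

def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum."""
    q1, q2, q3 = quartiles(data)
    return min(data), q1, q2, q3, max(data)

data = [2, 5, 7, 8, 10, 12, 15, 18, 20, 40]   # hypothetical data
summary = five_number_summary(data)
iqr = summary[3] - summary[1]                  # interquartile range, Q3 - Q1
print(summary, iqr)
```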

61 Box-and-Whisker Plot Construction

62 Critical Thinking Box-and-whisker plots display the spread of data about the median. If the median is centered and the whiskers are about the same length, then the data distribution is symmetric around the median. Fences – may be placed on either side of the box. Values that lie beyond the fences are outliers. (See problem 10)
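The slide does not say where the fences go. A widely used convention, assumed here, places them 1.5 × IQR below Q1 and 1.5 × IQR above Q3. The data and quartile values below are hypothetical, with the quartiles taken as given inputs.

```python
def fences(q1, q3, multiplier=1.5):
    """Lower and upper fences; the 1.5 multiplier is a common convention."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr

def outliers(data, q1, q3, multiplier=1.5):
    """Values that lie beyond either fence."""
    low, high = fences(q1, q3, multiplier)
    return [x for x in data if x < low or x > high]

data = [2, 5, 7, 8, 10, 12, 15, 18, 20, 40]   # hypothetical data
q1, q3 = 7, 18                                 # quartiles supplied for this data
print(fences(q1, q3))
print(outliers(data, q1, q3))
```

A flagged value is only a candidate outlier; it still needs to be checked against the source to see whether it is a data error or a genuine unusual value.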

63 Problems Pg. 109 #4, 5

