Chapter 2: Statistics of One Variable


1 Chapter 2: Statistics of One Variable
Grade 12 Data Management

2 Data Analysis With Graphs
Statistics is the gathering, organization, analysis, and presentation of numerical information. Unprocessed information collected for a study is called raw data. The quantity being measured is a variable. A continuous variable can have any value within a given range. A discrete variable can have only certain separate values (often integers).

3 Frequency tables and frequency diagrams can give a convenient overview of the distribution of values of the variable and reveal trends in the data. A histogram is a special form of bar graph in which the areas of the bars are proportional to the frequencies of the values of the variable. The bars are connected, which represents a continuous range of values.

4 A frequency polygon can illustrate the same information as a histogram or bar graph.

5 Example 1: Frequency Tables and Diagrams
Here are the sums of the two numbers from 50 rolls of a pair of standard dice.

6 Use a graph to illustrate the information in the frequency table.
[Figures: bar graph and frequency polygon]

7 Create a cumulative-frequency table and graph for the data.
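
A minimal sketch of how a frequency table and a cumulative-frequency table for dice sums could be tallied in Python. The 50 rolls are simulated with a fixed seed, which is an assumption: they are not the rolls from the slide, whose table is not reproduced in this transcript.

```python
import random
from collections import Counter
from itertools import accumulate

# Simulated data: NOT the rolls from the slide, just 50 illustrative rolls.
rng = random.Random(2)
rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(50)]

sums = list(range(2, 13))                  # possible sums of two dice
counts = Counter(rolls)
frequency = [counts[s] for s in sums]      # how often each sum occurred
cumulative = list(accumulate(frequency))   # running total of the frequencies

print("Sum  Frequency  Cumulative")
for s, f, c in zip(sums, frequency, cumulative):
    print(f"{s:>3}  {f:>9}  {c:>10}")
```

The last cumulative frequency always equals the number of rolls, which is a quick check that no value was missed.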

8 Example of Histogram
A frequency polygon may be superimposed onto the same grid as the histogram.

9 Setting Up a Frequency-Distribution Table
[Tables: an incorrect and a correct set of intervals]
The intervals should not overlap; otherwise, a value on a shared boundary would belong to two intervals, making the table inconsistent. For example, would a 38-year-old be placed in the “33 – 38” or the “38 – 42” interval? With overlapping intervals, there is no consistent way to decide.
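
One common way to avoid the overlap is to use half-open intervals, where each interval includes its lower bound and excludes its upper bound. The sketch below illustrates that idea; the function name `interval_label`, the starting value 28, and the width 5 are hypothetical choices, not values from the slide.

```python
def interval_label(age, start=28, width=5):
    """Return the half-open interval [low, low + width) that contains age."""
    low = start + ((age - start) // width) * width
    return f"{low} to under {low + width}"

# A boundary value now belongs to exactly one interval.
print(interval_label(37))  # 33 to under 38
print(interval_label(38))  # 38 to under 43
```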

10 Indices
An index relates the value of a variable (or a group of variables) to a base level, which is often the value on a particular date. Examples include the CPI and the TSE 300. Time-series graphs are often used to show how indices change over time. Determine the rate of change by finding the slope between two points on the graph: rate of change = rise/run.
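
The rise/run calculation can be sketched in a few lines of Python. The two points below are hypothetical readings from a time-series graph, not data from the slide.

```python
def rate_of_change(point_a, point_b):
    """Slope between two (time, index value) points: rise / run."""
    (t1, v1), (t2, v2) = point_a, point_b
    return (v2 - v1) / (t2 - t1)

# Hypothetical readings: (year, index value).
slope = rate_of_change((1990, 100.0), (2000, 125.0))
print(slope)  # 2.5 index points per year
```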

11 Sampling Techniques
Population refers to all individuals who belong to a group being studied. Sample refers to the segment of the population used in a study. Sampling frame refers to the group of individuals who actually have a chance of being selected.

12 Simple Random Sample
Every member of the population has an equal chance of being selected, and the selection of any particular individual does not affect the chances of any other individual being chosen.
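
As a rough illustration, Python's standard `random` module can draw a simple random sample without replacement. The population of 200 student IDs and the sample size of 20 are assumptions made up for this sketch.

```python
import random

population = [f"student_{i}" for i in range(1, 201)]  # hypothetical 200 members
rng = random.Random(42)                                # fixed seed for repeatability
sample = rng.sample(population, k=20)                  # 20 distinct members, equal chances

print(sample[:5])
```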

13 Systematic Sample
In a systematic sample, you go through the population sequentially and select members at regular intervals. The sample size and the population size determine the sampling interval: interval = population size / sample size. For example, if the interval is 3040, select a starting individual at random from the first 3040 individuals, then select every 3040th individual from that point on.
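
The procedure can be sketched as follows, assuming the population is available as an ordered list. The function name `systematic_sample` and the population of 1000 IDs are hypothetical.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick a random start within the first interval, then every interval-th member."""
    interval = len(population) // sample_size   # interval = population size / sample size
    start = rng.randrange(interval)             # random position among the first `interval` members
    return population[start::interval][:sample_size]

population = list(range(1, 1001))               # hypothetical population of 1000 IDs
chosen = systematic_sample(population, 50, random.Random(7))
print(chosen[:5])
```

Here the interval is 1000 / 50 = 20, so consecutive selections are always 20 positions apart.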

14 Stratified Sample
A population may include groups of members who share common characteristics, such as gender, age, or education level. Such groups are called strata. A stratified sample has the same proportion of members from each stratum as the population does.

15 Example 2: Designing a Stratified Sample
Before booking bands for the high school dances, the students’ council at Statsville High School wants to survey the music preferences of the student body. The following table shows the enrolment at the school. Design a stratified sample for a survey of 25% of the student body.

16 Example 2: Solution
To obtain a stratified sample with the correct proportions, simply select 25% of the students in each grade level.
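
A sketch of this design in Python, drawing 25% of each grade at random. The enrolment figures and student IDs are hypothetical, since the slide's enrolment table is not reproduced in this transcript.

```python
import random

# Hypothetical enrolment by grade (NOT the figures from the slide's table).
enrolment = {9: 220, 10: 200, 11: 180, 12: 160}
rng = random.Random(1)

sample = {}
for grade, count in enrolment.items():
    students = [f"g{grade}_{i}" for i in range(count)]  # stand-in student IDs
    k = round(0.25 * count)                             # 25% of each stratum
    sample[grade] = rng.sample(students, k)

for grade, chosen in sample.items():
    print(grade, len(chosen))
```

Because every grade contributes the same fraction, each grade's share of the sample matches its share of the school.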

17 Other Sampling Techniques
Cluster sample: if certain groups are likely to be representative of the entire population, you can use a random selection of such groups as a cluster sample.
Multi-stage sample: uses several levels of random sampling.
Voluntary-response sample: the researcher simply invites any member of the population to participate.
Convenience sample: often, a sample is selected simply because it is easily accessible.

18 Measures of Central Tendency
It is often convenient to use a central value to summarize a set of data. Various methods exist to find values around which a set of data tends to cluster. These are known as measures of central tendency.

19 Mean
Commonly referred to as the “average”.
Population mean (N = size of the entire population):
$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum x}{N}$
Sample mean (n = sample size):
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$

20 Median and Mode
The median is the middle value of the data when they are ranked from highest to lowest. When there is an even number of values, the median is the midpoint between the two middle values. The mode is the value that occurs most frequently in the distribution. Some distributions do not have a mode, while others have several.
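
For a quick check of the mean, median, and mode, Python's standard `statistics` module can be used. The data set below is hypothetical and chosen so that it has an even number of values and two modes.

```python
import statistics

data = [3, 5, 5, 7, 8, 9, 9, 10]         # hypothetical data set

mean = statistics.mean(data)              # sum of the values / number of values
median = statistics.median(data)          # midpoint of the two middle values, 7 and 8
modes = statistics.multimode(data)        # every value tied for most frequent

print(mean, median, modes)
```

With these values the mean is 56 / 8 = 7, the median is 7.5, and both 5 and 9 are modes.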

21 Weighted Mean
A weighted mean gives a measure of central tendency that reflects the relative importance of the data:
$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w_i x_i}{\sum w_i}$
It differs from the standard mean calculation because it gives a stronger weight (importance) to certain categories.

22 Example 3: Weighted Mean
The HR manager for Statsville Marketing Limited considers five criteria when interviewing a job applicant. The manager gives each applicant a score between 1 and 5 in each category, with 5 being the highest score. Each category has a weighting between 1 and 3. The following table lists a recent applicant’s score and the company’s weighting factors.

23 Determine the weighted mean score for this job applicant.

24 Example 3: Solution
$\bar{x}_w = \frac{2(4) + 2(2) + 3(5) + 3(5) + 1(4)}{2 + 2 + 3 + 3 + 1} = \frac{46}{11} \approx 4.2$
Therefore, the applicant has a weighted mean score of approximately 4.2.
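
The arithmetic can be checked with a short Python sketch. It assumes that in each product on the slide the first factor is the weighting and the second is the score, which is consistent with the weights summing to 11.

```python
# Assumption: each slide term w(x) is weighting times score.
weights = [2, 2, 3, 3, 1]
scores = [4, 2, 5, 5, 4]

numerator = sum(w * x for w, x in zip(weights, scores))  # 8 + 4 + 15 + 15 + 4
denominator = sum(weights)                               # 2 + 2 + 3 + 3 + 1
weighted_mean = numerator / denominator

print(numerator, denominator, round(weighted_mean, 1))   # 46 11 4.2
```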

25 Grouped Data
When a set of data has been grouped into intervals, the mean can be approximated using the formulas:
Population mean: $\mu \approx \frac{\sum f_i m_i}{\sum f_i}$
Sample mean: $\bar{x} \approx \frac{\sum f_i m_i}{\sum f_i}$
where $m_i$ is the midpoint value of an interval and $f_i$ is the frequency for that interval.
Estimate the median for grouped data by taking the midpoint of the interval within which the median is found.

26 Example 4: Calculate the Mean and Median for Grouped Data
A group of children were asked how many hours a day they spend watching television. The table at the right summarizes their responses. Determine the mean and median number of hours for this distribution.

27 Example 4: Solution
$\bar{x} \approx \frac{\sum f_i m_i}{\sum f_i} = \frac{49}{18} \approx 2.7$
Therefore, the mean time the children spent watching television is approximately 2.7 h a day.
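
The grouped-data estimates can be sketched in Python as below. The intervals and frequencies are illustrative assumptions: the slide's table did not survive in this transcript, so these values were chosen only to be consistent with the totals 49 and 18 in the solution.

```python
# Illustrative table (an assumption, NOT the slide's table): hours per day, frequency.
intervals = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
freqs = [1, 4, 7, 3, 2, 1]

midpoints = [(low + high) / 2 for low, high in intervals]
total_f = sum(freqs)                                      # sum of f_i
total_fm = sum(f * m for f, m in zip(freqs, midpoints))   # sum of f_i * m_i
mean_estimate = total_fm / total_f

# Median estimate: midpoint of the first interval whose cumulative
# frequency reaches half of the total (a simplification that assumes
# the middle values do not straddle two intervals).
running = 0
for (low, high), f in zip(intervals, freqs):
    running += f
    if running >= total_f / 2:
        median_estimate = (low + high) / 2
        break

print(round(mean_estimate, 1), median_estimate)           # 2.7 2.5
```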

28 Note that the values for the mean and median are approximate because it cannot be determined exactly where the data lie within each interval.

29 Example 5: Determine the Error in the Frequency-Distribution Table
Explain the problem with the intervals in the following table.
Answer: there are missing values between the intervals.

30 Homework
Page 101 #1a, 2, 3ab, 8, 12, 15
Page 117 #1, 2, 3, 7, 9
Reminders:
Mid-term exam (Thursday)
Chapter 2 quiz (entire chapter, next Monday)

