22 Uses and Abuses of Statistics
Contents: Case Study; 22.1 Statistical Surveys; 22.2 Sampling Methods; 22.3 Statistical Investigations; Chapter Summary
P. 2 Case Study
Mr. and Mrs. Chan want to buy a car. The salesman claims that far more Alpha cars are sold than Sonic cars.
‘Let’s see the graph here. The sales of our car are much better than Sonic’s…’
‘The sales of this car are almost double Sonic’s! Isn’t that good?’
‘Wait, it seems there’s something wrong with the graph…’
In the bar chart, the bar representing the sales of Alpha is almost twice as tall as the bar for Sonic. However, the vertical axis does not start from zero, so the figure gives the wrong impression that there is a big difference between the sales of the two brands.
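The effect of a vertical axis that does not start from zero can be checked with a little arithmetic. The sketch below is only an illustration: the sales figures and the axis starting value are hypothetical (the slide does not give the actual numbers), and `apparent_ratio` is a helper name introduced here.

```python
def apparent_ratio(value_a, value_b, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    if axis_start >= min(value_a, value_b):
        raise ValueError("axis_start must be below both values")
    return (value_a - axis_start) / (value_b - axis_start)


# Hypothetical sales figures, chosen only to illustrate the effect.
alpha_sales = 1100
sonic_sales = 1000

# With the axis starting at zero, bar heights match the real ratio.
true_ratio = apparent_ratio(alpha_sales, sonic_sales, axis_start=0)

# With the axis starting at 900, the Alpha bar is drawn twice as tall.
drawn_ratio = apparent_ratio(alpha_sales, sonic_sales, axis_start=900)

print(f"real ratio of sales: {true_ratio:.2f}")
print(f"ratio of bar heights on the truncated axis: {drawn_ratio:.2f}")
```

Under these assumed numbers, a 10% difference in sales is drawn as a bar that looks twice as tall, which is the kind of wrong impression described in the case study.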
P. 3 22.1 Statistical Surveys
People conduct different kinds of surveys to collect useful data for statistical investigations. An effective survey can help people gather information for policy formulation on public issues, business decision-making and social studies. The following shows the major steps of conducting a survey and the major points that must be considered in each step.
Step 1: Planning the Survey
1. When planning the survey, we must first clearly specify the objectives of the survey.
2. Next, we should clearly define the ‘population’ of the survey. The population is the target of the survey.
3. Then, we should set the budget for conducting the survey. It is important to have sufficient resources, such as time, money and manpower, to carry out the survey.
P. 4 22.1 Statistical Surveys
Step 2: Choosing an Appropriate Data-collection Method
After planning the survey, we have to choose an appropriate data-collection method, such as
1. interviews;
2. questionnaires;
3. observation;
4. direct testing or experiment;
5. collection of data from existing statistical reports.
The most common way to collect data is by questionnaire. The following are some general principles for designing a questionnaire:
(a) The questions must be relevant to the objectives of the survey.
(b) Long questionnaires are undesirable.
(c) The questions must be clear and easy to answer.
(d) Questions that lead respondents towards certain answers must be avoided.
P. 5 22.1 Statistical Surveys
(e) The data collected must be easy to interpret.
(f) Questions should be arranged in a proper order.
(g) The language used should be appropriate.
(h) Questions should be appropriate, specific and precise.
(i) Embarrassing questions should be avoided.
(j) Composite and double negative questions should be avoided.
(k) Questions which rely on respondents’ memory should be avoided.
(l) Options such as ‘Don’t know / No opinion / Others’ should be included as appropriate.
All of these principles affect the reliability and validity of the questionnaire. Reliability is concerned with the stability and consistency of the data collected. Validity is concerned with the relevance of the data collected to the objectives of the survey.
P. 6 22.1 Statistical Surveys
Step 3: Selecting the Sample
Since it is often very time-consuming to collect information from all the members of a population, most surveys are conducted on samples of the population. After designing the questionnaire, we have to decide on a suitable sampling method to select the sample.
Step 4: Collecting the Raw Data
After designing the questionnaire and selecting the sample, we can move on to collecting the data. By using questionnaires, we can collect information in the following ways:
1. Personal interviews
2. Telephone interviews
3. Self-administered questionnaires by mail/email
P. 7 22.1 Statistical Surveys
Step 5: Analysing the Data and Interpreting the Findings
All raw data collected have to be checked carefully and organized before they are compiled and analysed with suitable statistical techniques.
Step 6: Presenting the Investigation
After the statistical data have been compiled, the survey results are sent to the relevant parties or organizations. If the subject of the survey is of public interest, the results may be published.
P. 8 22.2 Sampling Methods
In many real-life cases, the population is very large or inaccessible. Collecting data from the whole population would be very expensive and time-consuming. Therefore, we can hardly carry out a statistical survey on the whole population. In these cases, we use a sampling method to choose a sample from the population at random, and use the results obtained from the sample to estimate the results for the whole population. There are two main types of sampling methods: probability sampling and non-probability sampling.
P. 9 22.2 Sampling Methods
A. Probability Sampling
There are three important methods of probability sampling: simple random sampling, systematic sampling and stratified random sampling.
(a) Simple Random Sampling
Simple random sampling is a method of selecting a sample such that each item in the population has an equal probability of being chosen. To use simple random sampling, we should first list all the items in the population, assign a unique identification number to each of them, and then group all the numbers in a table. This list of numbers is called the sampling frame of the population.
P. 10 22.2 Sampling Methods
A. Probability Sampling
Example 22.1T
Helen is a committee member of a youth centre. She wants to select 100 members at random and investigate their family status. Suggest how she can form the sampling frame for simple random sampling.
Solution:
She can form the sampling frame by using the member numbers.
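Simple random sampling from a sampling frame can be sketched in a few lines of Python. This is a minimal illustration, not part of the original example: the membership size of 1200 and the function name `simple_random_sample` are assumptions made here, and the standard-library `random.Random.sample` is used to draw without replacement.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw sample_size distinct items so that every item is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: member numbers 1 to 1200 (the real size is not given).
member_numbers = list(range(1, 1201))

selected_members = simple_random_sample(member_numbers, 100, seed=22)
print(sorted(selected_members)[:10])  # first few selected member numbers
```

Because the draw is made without replacement, no member number can be selected twice.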
P. 11 22.2 Sampling Methods
A. Probability Sampling
(b) Systematic Sampling
Systematic sampling is a method by which we first select a starting point at random, then select every kth (such as 10th or 50th) item in the population. Compared with simple random sampling, systematic sampling is much more efficient because we do not need to know the size of the population. However, when the items share some regular pattern, systematic sampling may lead to a biased sampling result.
P. 12 22.2 Sampling Methods
A. Probability Sampling
Example 22.2T
An insurance company has 10 000 policyholders. The company wants to conduct a survey on its clients’ spending habits. The marketing department selects 500 clients by systematic sampling and sends questionnaires to them.
(a) How can the company form a sampling frame?
(b) Could a client be selected more than once?
Solution:
(a) The company can use the policy numbers to form a sampling frame.
(b) Yes, because some clients may have more than one policy.
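The systematic sampling procedure (random starting point, then every kth item) can be sketched as follows. The code assumes, for illustration only, that the sampling frame is a list of 10 000 policy numbers running from 1 to 10 000 and that the interval is taken as k = 10 000 / 500 = 20; `systematic_sample` is a name introduced here.

```python
import random


def systematic_sample(sampling_frame, sample_size, seed=None):
    """Pick a random start within the first interval, then take every k-th item."""
    interval = len(sampling_frame) // sample_size  # k
    if interval < 1:
        raise ValueError("sample_size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random starting index in 0 .. k-1
    return [sampling_frame[start + i * interval] for i in range(sample_size)]


# Assumed sampling frame: policy numbers 1 to 10 000.
policy_numbers = list(range(1, 10001))

selected_policies = systematic_sample(policy_numbers, 500, seed=1)
print(selected_policies[:5])  # consecutive selections are 20 policy numbers apart
```

Note that the frame here lists policies, not clients, so a client holding several policies could still appear more than once, as the solution to part (b) points out.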
P. 13 22.2 Sampling Methods
A. Probability Sampling
(c) Stratified Random Sampling
Stratified random sampling is a sampling method that divides the population into at least two subgroups (called strata) whose members share the same characteristics (such as gender or age), and then selects samples from each stratum. Random samples are selected from each stratum, and the size of the sample from each stratum is proportional to the size of the stratum.
Notes: Stratified random sampling may reflect the characteristics of a population more accurately than the other two sampling methods. However, as detailed information about individual items in the population is needed, this method may be time-consuming and expensive.
P. 14 22.2 Sampling Methods
A. Probability Sampling
Example 22.3T
The students’ union of a university conducts a survey on the annual travel expenses of students in the university. They select 100 students for an interview by stratified random sampling. The following table shows the number of students in each year. How many students should be selected from Year 4?
Year | Number of students
Year 1 | 400
Year 2 | 500
Year 3 | 500
Year 4 | 600
Solution:
Total number of students = 400 + 500 + 500 + 600 = 2000
Number of students selected from Year 4 = 100 × 600 / 2000 = 30
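The proportional allocation used in stratified random sampling (sample size × stratum size ÷ population size) can be sketched in Python. The function name `proportional_allocation` is introduced here for illustration, and exact fractions are used so that non-integer allocations are visible rather than silently rounded; how to round them is left open.

```python
from fractions import Fraction


def proportional_allocation(stratum_sizes, sample_size):
    """Return the number to sample from each stratum, proportional to stratum size."""
    total = sum(stratum_sizes.values())
    return {
        name: Fraction(size * sample_size, total)
        for name, size in stratum_sizes.items()
    }


# Year sizes from Example 22.3T; 100 students are to be selected.
students_per_year = {"Year 1": 400, "Year 2": 500, "Year 3": 500, "Year 4": 600}

year_allocation = proportional_allocation(students_per_year, 100)
for year, count in year_allocation.items():
    print(year, count)
```

The same function applies to Follow-up 22.3 by passing the four income-level frequencies and a sample size of 240.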
P. 15 22.2 Sampling Methods
B. Non-probability Sampling
There are several types of non-probability sampling methods: convenience sampling, voluntary response sampling, judgment sampling, quota sampling and snowball sampling.
(a) Convenience Sampling
Convenience sampling is also called haphazard or accidental sampling. The sample is chosen at the convenience of the researcher.
(b) Voluntary Response Sampling
In this method, respondents themselves choose to take part in the survey. Write-in and call-in opinion polls use this kind of sampling method.
(c) Judgment Sampling
When using this method, the sample is chosen based on the judgment or experience of the researcher. Judgment sampling is also called purposive sampling because the sample is chosen with a purpose. The researcher tries to obtain a sample that appears to be representative of the population.
P. 16 22.2 Sampling Methods
B. Non-probability Sampling
(d) Quota Sampling
In quota sampling, interviewers are given quotas to fill from specified sub-groups of the population, and the interviewers select the sample. This is similar to stratified sampling, but in quota sampling the choice of the sample is non-random.
(e) Snowball Sampling
This sampling technique is often used for hidden populations which are difficult for researchers to access. To start with, the researcher compiles a short list of sample members from various sources. Each of these respondents is then asked to provide the names of other probable respondents. Because sample members are not selected from a sampling frame, snowball samples are subject to numerous biases.
P. 17 22.2 Sampling Methods
C. Comparing Probability and Non-probability Sampling
The following table gives the differences between probability and non-probability sampling.
Sampling method | Probability sampling | Non-probability sampling
Chance for each datum to be selected | Equal | Unequal
Nature of result | Unbiased | May be biased
Nature of sample | Representative | May not be representative
Random selection | Involved | Not involved
Knowledge on the population | Relatively low | Relatively high
Complexity | Relatively high | Relatively low
Time needed | Relatively longer | Relatively shorter
Budget needed | Relatively high | Relatively low
P. 18 22.3 Statistical Investigations
A. Uses of Statistics
In recent years, statistics has been widely used in many fields, and people benefit greatly from the use of statistical methods. Every day, many statistical reports are presented in the media, such as newspapers, journals, magazines, television and the internet. Such reports are often presented in the form of different statistical graphs according to the nature of the data. The following are four types of commonly used graphs:
1. Pie chart
2. Histogram / Bar chart
3. Broken line graph
4. Stem-and-leaf diagram
P. 19 22.3 Statistical Investigations
B. Abuses of Statistics
Statistical data are often presented in ways that favour the producers but not the users. The common ways used to mislead users are:
1. Using the average to mislead readers
2. Misinterpreted percentages
3. Misrepresentation of data by graphs
P. 20 22.3 Statistical Investigations
B. Abuses of Statistics
Example 22.4T
A group of students wanted to know the average amount of monthly pocket money of the students in their school. They interviewed 40 S6 students, and 30 of them have pocket money of over $1000 each month. They claimed that 75% of the students in the school have pocket money of over $1000 each month. Do you think it is misleading? Give a reason.
Solution:
Yes, it is misleading. Only S6 students were interviewed, and S6 students may in general have more pocket money than S1 students, so the sample does not represent the whole school.
P. 21 22.3 Statistical Investigations
B. Abuses of Statistics
Example 22.5T
The figure shows the sales of two brands of orange juice.
(a) Find the ratio of the sales.
(b) Does the advertisement over-emphasize the sales of Miss Orange?
Solution:
(a) Sales of ‘Sun’ : Sales of ‘Miss Orange’ (values read from the figure, which is not reproduced in this transcript)
(b) Area of figure for Sun : Area of figure for Miss Orange = 3 : 12 = 1 : 4
Yes, the advertisement over-emphasizes the sales of Miss Orange.
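Picture graphs over-emphasize a difference when a figure is enlarged in every direction, because the area grows with the square of the linear scale. The sketch below assumes similar figures and a hypothetical linear scale of 2 (the figure itself is not reproduced here); `area_scale` and `percentage_increase` are helper names introduced for illustration.

```python
def area_scale(linear_scale):
    """Area ratio of two similar figures whose lengths are in the ratio 1 : linear_scale."""
    return linear_scale ** 2


def percentage_increase(old, new):
    """Percentage increase from old to new."""
    return (new - old) / old * 100


linear_scale = 2  # hypothetical: the larger figure is twice as tall and twice as wide

print(area_scale(linear_scale))                          # the larger figure has 4 times the area
print(percentage_increase(1, linear_scale))              # increase suggested by the heights
print(percentage_increase(1, area_scale(linear_scale)))  # increase suggested by the areas
```

Under this assumption, a quantity that has doubled is drawn with a figure four times as large, so the reader's eye registers a 300% increase instead of 100%.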
P. 22 22.3 Statistical Investigations
C. Assessing the Statistical Investigations
There are some criteria for assessing statistical investigations presented in different sources. A good statistical investigation should consider the following:
1. Sponsorship of the survey
The sponsor of a survey might affect the response rate. Generally, the response rate of a survey would be higher if it is sponsored by a university.
2. Population covered
The researcher is responsible for defining the target population clearly. The population is defined in keeping with the objectives of the study.
3. Sampling method
When the population is too large, a sample is used to represent the population. The sample chosen should reflect the characteristics of the population from which it is drawn.
P. 23 22.3 Statistical Investigations
C. Assessing the Statistical Investigations
4. Mode of data collection
There are different ways to collect data, and each method has its advantages and disadvantages, which may affect the results of the survey.
5. Time period of data collection
The time period of data collection also affects the reliability of the survey.
6. Wording of questions
The wording of a question is very important. Words like ‘usually’, ‘often’, ‘sometimes’, ‘occasionally’, ‘seldom’ and ‘rarely’ are commonly used in questionnaires, but they do not have the same meaning to everyone.
7. Sample size and response rate
The response rate indicates how much confidence can be placed in the results of a survey. A low response rate will undermine the reliability of a study.
P. 24 Chapter Summary
22.1 Statistical Surveys
The steps in conducting a survey are:
1. Planning the survey
2. Choosing an appropriate data-collection method
3. Selecting the sample
4. Collecting the raw data
5. Analysing the data and interpreting the findings
6. Presenting the investigation
Data-collection methods:
(a) Interviews
(b) Questionnaires
(c) Observation
(d) Experiment
(e) Existing statistical reports
P. 25 Chapter Summary
22.2 Sampling Methods
1. A population in statistics refers to the entire set of individuals under study. A sample refers to a carefully chosen and representative part of the population.
2. Probability Sampling
(a) Simple random sampling is a method of selecting a sample such that each item in the population has an equal chance of being chosen.
(b) Systematic sampling is a method that selects a starting point randomly, then selects every kth item in the population.
(c) Stratified random sampling is a method that divides the population into strata, each of which is composed of data sharing the same characteristics, and then selects samples from each stratum.
P. 26 Chapter Summary 22.2 Sampling Methods 3.Non-probability Sampling (a)Convenience sampling is a method of selecting a sample at the convenience of the researcher. (b)Voluntary response sampling is a method in which the respondents themselves choose to take part in the survey. (c)Judgment sampling is a method to choose a sample based on the judgment or experience of the researcher. (d)Quota sampling is a method in which interviewers have been given quotas to fill from specific sub-groups of the population and the interviewers select the sample. (e)Snowball sampling is a method to select a sample by first contacting a few potential respondents, and then relying on the referrals from the initial respondents to generate additional subjects.
P. 27 Chapter Summary
22.3 Statistical Investigations
1. Uses of Statistics
In the media, there are many statistical reports, which are presented according to the nature of the data in different types of graphs, such as a
(a) pie chart
(b) broken line graph
(c) histogram / bar chart
(d) stem-and-leaf diagram
2. Abuses of Statistics
Statistical data are often presented in ways that favour the producers but not the users. The common ways used to mislead users are:
(a) Misuse of the ‘averages’
(b) Misinterpreted percentages
(c) Misrepresentation of data by graphs
3. Assessing the Statistical Investigations
There are some criteria for assessing statistical investigations presented in different sources.
Follow-up 22.1 22.2 Sampling Methods
A. Probability Sampling
Connie is a committee member of the school movie club, and she wants to investigate how often students watch movies each month. There are 1000 students in the school. She assigns a unique number to each student and selects 100 students at random to form a sample.
(a) Why does she assign a number to each of the students instead of using their class numbers?
(b) If she uses the random number table shown on the right, how many digits of the random number would she need to choose each time?
Solution:
(a) Because the class numbers are not unique.
(b) 3 digits, since the 1000 students can be numbered from 000 to 999.
Follow-up 22.2 22.2 Sampling Methods
A. Probability Sampling
Mary arranges the names of her students in alphabetical order and assigns a unique number to each student. She chooses one from every 20 students after selecting a random starting point.
(a) Identify the method used in this case.
(b) How can she arrange the names of the students in alphabetical order?
Solution:
(a) Systematic sampling
(b) She can get the name list of each class and sort the name lists with computer software.
Follow-up 22.3 22.2 Sampling Methods
A. Probability Sampling
A credit card company conducts a survey to find the monthly expenses of its customers. The customers are classified into different monthly income levels as shown below. If 240 customers are selected by stratified random sampling, how many customers should be selected from each level?
Salary | Frequency
Less than $10 000 | 1000
$10 000 – $24 999 | 2500
$25 000 – $49 999 | 3000
$50 000 or more | 1500
Solution:
Total number of customers = 1000 + 2500 + 3000 + 1500 = 8000
Number of customers selected from the ‘less than $10 000’ level = 240 × 1000 / 8000 = 30
Number of customers selected from the ‘$10 000 – $24 999’ level = 240 × 2500 / 8000 = 75
Follow-up 22.3 (continued) 22.2 Sampling Methods
A. Probability Sampling
Solution (continued), with 240 customers selected from a total of 1000 + 2500 + 3000 + 1500 = 8000 customers:
Number of customers selected from the ‘$25 000 – $49 999’ level = 240 × 3000 / 8000 = 90
Number of customers selected from the ‘$50 000 or more’ level = 240 × 1500 / 8000 = 45
Follow-up 22.4 22.3 Statistical Investigations
B. Abuses of Statistics
The personnel department collects data about the monthly salaries of 5 managers in the company. Their average salary is $35 000. The personnel department claims that the average salary in the company is $35 000. Do you think it is misleading? Give a reason.
Solution:
Yes, it is misleading, since the salary of a manager is normally higher than that of other staff. / The sample size is too small.
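A quick calculation shows how an average taken from managers alone can misrepresent a whole company. All salary figures below are hypothetical and chosen only so that the five managers average $35 000 as in the question; the company's real payroll is not given.

```python
from statistics import mean

# Hypothetical monthly salaries of the 5 managers (average $35 000, as in the question).
manager_salaries = [30000, 32000, 35000, 38000, 40000]

# Hypothetical monthly salaries of 15 other staff members.
other_staff_salaries = [12000, 13000, 14000, 15000, 16000] * 3

all_salaries = manager_salaries + other_staff_salaries

print("average of managers only:", mean(manager_salaries))
print("average of all staff:", mean(all_salaries))
```

With these assumed figures the managers' average is far above the average for all staff, so quoting it as the company average would be misleading.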
Follow-up 22.5 22.3 Statistical Investigations
B. Abuses of Statistics
The figure shows the number of television sets sold by a company in the last two months.
(a) What is the linear magnification between the figures?
(b) Find the percentage increase between the areas of the figures.
(c) What is the actual percentage increase of the sales?
Solution:
(a) Linear magnification (value read from the figure, which is not reproduced in this transcript)
(b) Area of figure for Oct : Area of figure for Sept = 18 : 2 = 9 : 1
Let k be the area of the figure for Sept. Then the area of the figure for Oct = 9k.
Percentage increase = (9k − k) / k × 100% = 800%
(c) The actual percentage increase (value read from the figure, which is not reproduced in this transcript)