Presentation on theme: "22 Uses and Abuses of Statistics Case Study 22.1 Statistical Surveys"— Presentation transcript:
1 22 Uses and Abuses of Statistics
Case Study
22.1 Statistical Surveys
22.2 Sampling Methods
22.3 Statistical Investigations
Chapter Summary
2 Case Study
‘Let’s see the graph here. The sales of our car are much better than Sonic’s…’
‘The sales of this car are almost double Sonic’s! Isn’t that good?’
‘Wait, it seems there’s something wrong with the graph…’
Mr. and Mrs. Chan want to buy a car. The salesman claims that the number of Alpha cars sold is much higher than that of Sonic’s.
As shown in the bar chart, the bar representing the sales of Alpha is almost twice as high as Sonic’s. However, the vertical axis does not start from zero.
The figure gives the wrong impression that there is a big difference between the sales of the two brands.
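The effect of a vertical axis that does not start from zero can be checked with a little arithmetic. The following is a minimal sketch, not taken from the slides: the sales figures (1100 and 1000) and the axis starting value (900) are hypothetical numbers chosen only to reproduce the "almost double" impression, and the function name `apparent_ratio` is an assumption.

```python
def apparent_ratio(value_a: float, value_b: float, axis_start: float) -> float:
    """Ratio of the drawn bar heights when the vertical axis starts at axis_start."""
    if axis_start >= min(value_a, value_b):
        raise ValueError("axis_start must be below both values")
    return (value_a - axis_start) / (value_b - axis_start)


# Hypothetical sales figures, used for illustration only.
alpha_sales = 1100
sonic_sales = 1000

# Ratio of bar heights when the axis starts from zero (the honest picture).
true_ratio = apparent_ratio(alpha_sales, sonic_sales, axis_start=0)

# Ratio of bar heights when the axis starts from 900 (the misleading picture).
truncated_ratio = apparent_ratio(alpha_sales, sonic_sales, axis_start=900)

print(f"Axis from 0:   Alpha's bar is {true_ratio:.2f} times as high as Sonic's")
print(f"Axis from 900: Alpha's bar is {truncated_ratio:.2f} times as high as Sonic's")
```

With these assumed figures, a 10% difference in sales is drawn as a bar twice as high once the axis is cut off at 900.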
3 22.1 Statistical Surveys
People conduct different kinds of surveys to collect useful data for statistical investigations. An effective survey can help people gather information for policy formulation on public issues, business decision-making and social studies.
The following shows the major steps of conducting a survey and the major points that must be considered in each step.
Step 1: Planning the Survey
1. When planning the survey, we must first clearly specify its objectives.
2. Next, we should define the ‘population’ of the survey clearly. The population is the target of the survey.
3. Then, we should set the budget for conducting the survey. It is important to have sufficient resources such as time, money and manpower to carry out the survey.
4 22.1 Statistical Surveys
Step 2: Choosing an Appropriate Data-collection Method
After planning the survey, we have to choose an appropriate data-collection method, such as
1. interviews;
2. questionnaires;
3. observation;
4. direct testing or experiment;
5. collection of data from existing statistical reports.
The most common way to collect data is by using questionnaires.
The following are some general principles for designing a questionnaire:
(a) The questions must be relevant to the objectives of the survey.
(b) Long questionnaires are undesirable.
(c) The questions must be clear and easy to answer.
(d) Questions that lead respondents’ opinions towards certain answers must be avoided.
5 22.1 Statistical Surveys
(e) The data collected must be easy to interpret.
(f) Questions should be arranged in a proper order.
(g) The language used should be appropriate.
(h) Questions should be appropriate, specific and precise.
(i) Embarrassing questions should be avoided.
(j) Composite and double negative questions should be avoided.
(k) Questions which rely on respondents’ memory should be avoided.
(l) Options such as ‘Don’t know / No opinion / Others’ should be included as appropriate.
All of these principles affect the reliability and validity of the questionnaire.
Reliability is concerned with the stability and consistency of the data collected.
Validity is concerned with the relevance of the data collected to the objective of the survey.
6 22.1 Statistical Surveys
Step 3: Selecting the Sample
Since it is often very time-consuming to collect information from all the members of a population, most surveys are conducted on samples of the whole population.
After designing the questionnaires, we have to decide on a suitable sampling method to select samples.
Step 4: Collecting the Raw Data
After designing the questionnaires and selecting the samples, we can move on to collecting the data.
By using the questionnaires, we can collect information in the following ways:
1. Personal interviews
2. Telephone interviews
3. Self-administered questionnaires by mail/
7 22.1 Statistical Surveys
Step 5: Analysing the Data and Interpreting the Findings
All raw data collected have to be checked carefully before being compiled with suitable statistical techniques.
Also, the data should be organized before analysis.
Step 6: Presenting the Investigation
After compiling the statistical data, the survey results will be sent to the relevant parties or organizations.
If the subject of the survey is of public interest, the results may be published.
8 22.2 Sampling Methods
In many real-life cases, the population is very large or inaccessible. Collecting data from the whole population would be very expensive and time-consuming.
Therefore, we can hardly carry out a statistical survey on the whole population.
In these cases, we use a sampling method to choose a sample from the population at random, and use the results obtained from the sample to estimate the results for the whole population.
There are two main types of sampling methods: probability sampling and non-probability sampling.
9 22.2 Sampling Methods
A. Probability Sampling
There are three important methods of probability sampling: simple random sampling, systematic sampling and stratified random sampling.
(a) Simple Random Sampling
Simple random sampling is a method of selecting a sample such that each item in the population has an equal probability of being chosen.
In order to use the method of simple random sampling, we should first list all the items in the population and assign a unique identification number to each of them, and then group all the numbers in a table.
This number list is called the sampling frame of the population.
10 22.2 Sampling Methods
A. Probability Sampling
Example 22.1T
Helen is a committee member of a youth centre. She wants to select 100 members at random and investigate their family status. Suggest how she can form the sampling frame for simple random sampling.
Solution:
She can form the sampling frame by using the member numbers.
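Once a sampling frame of unique numbers exists, drawing the simple random sample itself is mechanical. The sketch below illustrates this with Python's standard `random` module. The size of the youth centre (800 members) is not given in the example and is assumed here, as are the function name `simple_random_sample` and the seed value.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw sample_size distinct items; every item is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: member numbers 1 to 800 (the real size is not stated).
member_numbers = list(range(1, 801))

chosen_members = simple_random_sample(member_numbers, 100, seed=22)
print(sorted(chosen_members)[:10])  # first few selected member numbers
```

`random.sample` selects without replacement, so no member number can appear twice in the sample.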
11 22.2 Sampling Methods
A. Probability Sampling
(b) Systematic Sampling
Systematic sampling is a method by which we first select a starting point at random, then select every kth (such as 10th or 50th) item in the population.
Compared with simple random sampling, systematic sampling is much more efficient because we do not need to know the size of the population.
However, when the items share some regular pattern, systematic sampling may lead to a biased sampling result.
12 22.2 Sampling Methods
A. Probability Sampling
Example 22.2T
An insurance company has policyholders. The company wants to conduct a survey on its clients’ spending habits. The marketing department selects 500 clients by systematic sampling and sends questionnaires to them.
(a) How can the company form a sampling frame?
(b) Will a client be selected more than once?
Solution:
(a) They can use the policy numbers to form a sampling frame.
(b) Yes, some clients may have more than one policy.
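Systematic sampling can be sketched in a few lines: pick a random starting point within the first k items, then take every kth item after it. The number of policies is missing from the example, so the frame of 20 000 policy numbers and the interval k = 40 below are assumptions chosen only so that 500 policies are selected. The function name `systematic_sample` is also hypothetical.

```python
import random


def systematic_sample(sampling_frame, k, seed=None):
    """Choose a random start among the first k items, then take every k-th item."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    start = random.Random(seed).randrange(k)  # random index from 0 to k - 1
    return sampling_frame[start::k]


# Hypothetical sampling frame: policy numbers 1 to 20 000 (the real size is not stated).
policy_numbers = list(range(1, 20001))

selected_policies = systematic_sample(policy_numbers, k=40, seed=22)
print(len(selected_policies), selected_policies[:5])
```

Because the frame lists policies rather than people, a client holding two selected policies would receive two questionnaires, which is the point made in part (b).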
13 22.2 Sampling Methods
A. Probability Sampling
(c) Stratified Random Sampling
Stratified random sampling is a sampling method that divides the population into at least two subgroups (called strata) whose members share the same characteristics (such as gender or age), and then selects samples from each stratum.
Random samples are selected from each stratum, and the size of the sample from each stratum is proportional to the stratum size.
Notes:
The stratified random sampling method may reflect the characteristics of a population more accurately than the other two sampling methods.
However, as detailed information about individual items in the population is needed, this method may be time-consuming and expensive.
14 22.2 Sampling Methods
A. Probability Sampling
Example 22.3T
The students’ union of a university conducts a survey on the annual travel expenses of students in the university. They select 100 students for an interview by stratified random sampling. The following shows the number of students in each year. How many students should be selected from year 4?
Year      Number of Students
Year 1    400
Year 2    500
Year 3    500
Year 4    600
Solution:
Total number of students = 400 + 500 + 500 + 600 = 2000
Number of students selected from year 4 = 100 × 600/2000 = 30
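The proportional allocation used in this example can be written as a short calculation. The sketch below is illustrative only: the function name `proportional_allocation` is an assumption, and the Year 3 figure of 500 is inferred from the stated total of 2000 rather than read from the table.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Share out sample_size across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    return {name: sample_size * size / total for name, size in stratum_sizes.items()}


# Year 3 is taken as 500 so that the four years add up to the stated total of 2000.
students_per_year = {"Year 1": 400, "Year 2": 500, "Year 3": 500, "Year 4": 600}

allocation = proportional_allocation(students_per_year, sample_size=100)
for year, count in allocation.items():
    print(f"{year}: {count:.0f} students")
```

In this example every share is a whole number. When the shares are not whole numbers, they have to be rounded in a way that keeps the total equal to the sample size.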
15 22.2 Sampling Methods
B. Non-probability Sampling
There are several types of non-probability sampling methods: convenience sampling, voluntary response sampling, judgment sampling, quota sampling and snowball sampling.
(a) Convenience Sampling
Convenience sampling is also called haphazard or accidental sampling.
The sample is chosen at the convenience of the researcher.
(b) Voluntary Response Sampling
In this method, respondents themselves choose to take part in the survey.
Write-in and call-in opinion polls use this kind of sampling method.
(c) Judgment Sampling
When using this method, the sample is chosen based on the judgment or experience of the researcher.
Judgment sampling is also called purposive sampling because the sample is chosen with a purpose.
The researcher tries to obtain a sample that appears to be representative of the population.
16 22.2 Sampling Methods
B. Non-probability Sampling
(d) Quota Sampling
In quota sampling, interviewers are given quotas to fill from specified sub-groups of the population, and the interviewers themselves select the sample.
This is similar to stratified sampling, but in quota sampling, the choice of the sample is non-random.
(e) Snowball Sampling
This sampling technique is often used for hidden populations which are difficult for researchers to access.
To start with, the researcher compiles a short list of sample members from various sources. Each of these respondents is then asked to provide the names of other probable respondents.
Because sample members are not selected from a sampling frame, snowball samples are subject to numerous biases.
17 22.2 Sampling Methods
C. Comparing Probability and Non-probability Sampling
The following table gives the differences between probability and non-probability sampling.
Sampling method                        Probability sampling    Non-probability sampling
Chance for each datum to be selected   Equal                   Unequal
Nature of result                       Unbiased                May be biased
Nature of sample                       Representative          May not be representative
Random selection                       Involved                Not involved
Knowledge on the population            Relatively low          Relatively high
Complexity
Time needed                            Relatively longer       Relatively shorter
Budget needed
18 22.3 Statistical Investigations
A. Uses of Statistics
In recent years, statistics has been widely used in many fields. People benefit greatly from the use of statistical methods.
Every day, many statistical reports are presented in the media, such as newspapers, journals, magazines, television and the internet.
Such reports are often presented in the form of different statistical graphs according to the nature of the data.
The following are four types of commonly used graphs:
1. Pie chart
2. Histogram / Bar chart
3. Broken line graph
4. Stem-and-leaf diagram
19 22.3 Statistical Investigations
B. Abuses of Statistics
Statistical data are often presented in ways that favour the producers but not the users. The common ways used to mislead users are:
1. Using the average to mislead readers
2. Misinterpreted percentages
3. Misrepresentation of data by graphs
20 22.3 Statistical Investigations
B. Abuses of Statistics
Example 22.4T
A group of students wanted to know the average amount of pocket money per month of the students in their school. They interviewed 40 S6 students, and 30 of them have pocket money over $1000 each month. They claimed that 75% of the students have pocket money over $1000 each month. Do you think it is misleading? Give a reason.
Solution:
Yes, it is misleading since S6 students may have more pocket money than S1 students in general.
21 22.3 Statistical Investigations
B. Abuses of Statistics
Example 22.5T
The figure shows the sales of two brands of orange juice.
(a) Find the ratio of the sales.
(b) Does the advertisement over-emphasize the sales of Miss Orange?
Solution:
(a) Sales of ‘Sun’ : Sales of ‘Miss Orange’
(b) Area of figure for Sun : Area of figure for Miss Orange = 3 : 12 = 1 : 4
Yes, the advertisement over-emphasizes the sales of Miss Orange.
22 22.3 Statistical Investigations
C. Assessing the Statistical Investigations
There are some criteria for assessing statistical investigations presented in different sources. A good statistical investigation should consider the following:
1. Sponsorship of the survey
The sponsor of a survey might affect the response rate.
Generally, the response rate of a survey would be higher if it is sponsored by a university.
2. Population covered
The researcher is responsible for defining the target population clearly.
The population is defined in keeping with the objectives of the study.
3. Sampling method
When the population is too large, a sample is used to represent the population. The sample chosen should reflect the characteristics of the population from which it is drawn.
23 22.3 Statistical Investigations
C. Assessing the Statistical Investigations
4. Mode of data collection
There are different ways to collect data, and each method has advantages and disadvantages which may affect the results of the survey.
5. Time period of data collection
The time period of data collection also affects the reliability of the survey.
6. Wording of questions
The wording of a question is very important. Words like ‘usually’, ‘often’, ‘sometimes’, ‘occasionally’, ‘seldom’ and ‘rarely’ are commonly used in questionnaires, but they do not have the same meaning to everyone.
7. Sample size and response rate
The response rate indicates how much confidence can be placed in the results of a survey. A low response rate will ruin the reliability of a study.
24 Chapter Summary
22.1 Statistical Surveys
Data-collection methods:
(a) Interviews
(b) Questionnaires
(c) Observation
(d) Experiment
(e) Existing statistical reports
The steps in conducting a survey are:
1. Planning the survey
2. Choosing an appropriate data-collection method
3. Selecting the sample
4. Collecting the raw data
5. Analysing the data and interpreting the findings
6. Presenting the investigation
25 Chapter Summary
22.2 Sampling Methods
1. A population in statistics refers to the entire set of individuals under study. A sample refers to a carefully chosen and representative part of the population.
2. Probability Sampling
(a) Simple random sampling is a method of selecting a sample such that each item in the population has an equal chance of being chosen.
(b) Systematic sampling is a method that selects a starting point randomly, then selects every kth item in the population.
(c) Stratified random sampling is a method that divides the population into strata, each of which is composed of data sharing the same characteristics, and then selects samples from each stratum.
26 Chapter Summary
22.2 Sampling Methods
3. Non-probability Sampling
(a) Convenience sampling is a method of selecting a sample at the convenience of the researcher.
(b) Voluntary response sampling is a method in which the respondents themselves choose to take part in the survey.
(c) Judgment sampling is a method of choosing a sample based on the judgment or experience of the researcher.
(d) Quota sampling is a method in which interviewers are given quotas to fill from specific sub-groups of the population and the interviewers select the sample.
(e) Snowball sampling is a method of selecting a sample by first contacting a few potential respondents, and then relying on referrals from the initial respondents to generate additional subjects.
27 Chapter Summary
22.3 Statistical Investigations
1. Uses of Statistics
In the media, there are many statistical reports which are presented according to the nature of the data in different types of graphs, such as a
(a) pie chart
(b) broken line graph
(c) histogram / bar chart
(d) stem-and-leaf diagram
2. Abuses of Statistics
Statistical data are often presented in ways that favour the producers but not the users. The common ways used to mislead users are:
(a) Misuse of the ‘averages’
(b) Misinterpreted percentages
(c) Misrepresentation of data by graphs
3. Assessing the Statistical Investigations
There are some criteria for assessing statistical investigations presented in different sources.
28 22.2 Sampling Methods
A. Probability Sampling
Follow-up 22.1
Connie is a committee member of the school movie club and she wants to investigate the frequency of watching movies each month. There are 1000 students in the school. She assigns a unique number to each student and selects 100 students at random to form a sample.
(a) Why does she assign a number to each of the students instead of using their class numbers?
(b) If she uses the random number table shown on the right, how many digits of the random number would she need to choose each time?
Solution:
(a) Because the class numbers are not unique.
(b) 3 digits
29 22.2 Sampling Methods
A. Probability Sampling
Follow-up 22.2
Mary arranges the names of her students in alphabetical order and assigns a unique number to each student. She chooses one from every 20 students after selecting a random starting point.
(a) Identify the method used in this case.
(b) How can she arrange the names of the students in alphabetical order?
Solution:
(a) Systematic sampling
(b) She can get the name list of each class and sort the name list by computer software.
30 22.2 Sampling Methods
A. Probability Sampling
Follow-up 22.3
A credit card company conducts a survey to find the monthly expenses of its customers. The customers are classified into different monthly income levels as shown below. If 240 customers are selected by stratified random sampling, how many customers in each sector should be selected?
Salary                 Frequency
Less than $10 000      1000
$10 000 – $24 999      2500
$25 000 – $49 999      3000
$50 000 or more        1500
Solution:
Total number of customers = 1000 + 2500 + 3000 + 1500 = 8000
Number of customers selected from the ‘less than $10 000’ level = 240 × 1000/8000 = 30
Number of customers selected from the ‘$10 000 – $24 999’ level = 240 × 2500/8000 = 75
31 22.2 Sampling Methods
A. Probability Sampling
Follow-up 22.3 (continued)
Solution (continued):
Total number of customers = 8000
Number of customers selected from the ‘$25 000 – $49 999’ level = 240 × 3000/8000 = 90
Number of customers selected from the ‘$50 000 or more’ level = 240 × 1500/8000 = 45
32 22.3 Statistical Investigations
B. Abuses of Statistics
Follow-up 22.4
The personnel department collects data about the monthly salaries of 5 managers in the company and finds their average salary. The personnel department then claims that this figure is the average salary of the company. Do you think it is misleading? Give a reason.
Solution:
Yes, it is misleading since the salary of a manager is normally higher than that of other staff. / The sample size is too small.
33 22.3 Statistical Investigations
B. Abuses of Statistics
Follow-up 22.5
The figure shows the number of television sets sold by a company in the last two months.
(a) What is the linear magnification between the figures?
(b) Find the percentage increase between the areas of the figures.
(c) What is the actual percentage increase of the sales?
Solution:
(a) Linear magnification = 3
(b) Area of figure for Oct : Area of figure for Sept = 18 : 2 = 9 : 1
Let k be the area of the figure for Sept.
The area of the figure for Oct = 9k
Percentage increase = (9k − k)/k × 100% = 800%
(c) The actual percentage increase
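The gap between the increase in area and the increase in the quantity shown can be checked numerically. The sketch below uses only the area ratio 18 : 2 given in the solution and treats the two figures as similar shapes. The last step rests on an assumption not confirmed by the transcript, namely that sales are represented by the linear dimension (for example, the height) of each figure. The function name `percentage_increase` is hypothetical.

```python
import math


def percentage_increase(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new - old) / old * 100


area_sept = 2
area_oct = 18

# For similar figures, the area ratio is the square of the linear magnification.
linear_magnification = math.sqrt(area_oct / area_sept)

# Increase that the reader sees when comparing the two areas.
area_increase = percentage_increase(area_sept, area_oct)

# ASSUMPTION: sales are proportional to the linear dimension of each figure.
assumed_sales_increase = percentage_increase(1, linear_magnification)

print(f"Linear magnification: {linear_magnification:.0f}")
print(f"Increase in area: {area_increase:.0f}%")
print(f"Increase in sales under the stated assumption: {assumed_sales_increase:.0f}%")
```

Under that assumption, the picture suggests an 800% rise while the quantity it stands for rises by 200%, which is why enlarging a figure in two dimensions exaggerates the change.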