Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 1 Section 2 – Slide 1 of 22
Chapter 1 Section 2
Observational Studies, Experiments, and Simple Random Sampling

●Learning objectives
  1. Distinguish between an observational study and an experiment
  2. Obtain a simple random sample

●There are different ways to collect data
  ○ Census
  ○ Existing sources
  ○ Survey sampling
  ○ Designed experiments
●These are good methods of data collection, if done correctly

●A census is a list
  ○ Of all the individuals in a population
  ○ That records the characteristics of the individuals
  ○ One example is the US Census, held every 10 years (other populations can have a census too)
●Advantages
  ○ Answers are known with certainty, since every individual is included (no sampling error)
●Disadvantages
  ○ May be difficult or impossible to obtain
  ○ Costs may be prohibitive

●An existing source is
  ○ An appropriate data set that has already been collected
  ○ That can be used for this study
●Advantages
  ○ Saves time and money
●Disadvantages
  ○ There may not be an applicable data set

●A survey sample is
  ○ A study in which only a subset of the population is considered
  ○ A study in which there is no attempt to influence the value of the variable of interest
●Advantages
  ○ Saves time and money
●Disadvantages
  ○ Choosing an appropriate sample can be difficult

●A survey sample is an example of an observational study
  ○ An observational study is one where there is no attempt to influence the value of the variable
  ○ An observational study is also called an ex post facto (after the fact) study
●Advantages
  ○ It can detect associations between variables
●Disadvantages
  ○ It cannot isolate causes to determine causation

●A designed experiment is an experiment
  ○ That applies a treatment to individuals
  ○ That often compares the treated group to a control (untreated) group
  ○ Where the variables can be controlled
●Advantages
  ○ Can analyze individual factors
●Disadvantages
  ○ Cannot be done when the variables cannot be controlled
  ○ Cannot be applied in some cases for moral or ethical reasons

●Observational studies and designed experiments have some fundamental differences
  ○ Observational studies do not control the variable under analysis, while designed experiments do
  ○ Because variables are uncontrolled in an observational study, the results can only show associations
  ○ Because variables are controlled in a designed experiment, the results can support conclusions of causation

●A danger in observational studies is lurking variables
●In an observational study, two variables can be found to be associated
●Association does not mean that one variable causes the other
  ○ A lurking variable, one not considered in the study, may be related to both
●A simple observational study may find that smoking and cancer are associated
  ○ From that study alone, we cannot conclude that smoking causes cancer
  ○ Nor can we conclude that cancer causes people to smoke
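The effect of a lurking variable can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are artificial, the names `lurking`, `x`, `y`, and `pearson_r` are hypothetical, and the correlation coefficient is used as one possible measure of association (the slides do not specify a measure). Neither `x` nor `y` influences the other, yet they come out associated because both depend on the same lurking variable.

```python
import random


def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# Artificial lurking variable that the "study" never records.
lurking = [rng.gauss(0, 1) for _ in range(5000)]

# x and y each depend on the lurking variable plus independent noise.
# There is no direct link from x to y or from y to x.
x = [z + rng.gauss(0, 1) for z in lurking]
y = [z + rng.gauss(0, 1) for z in lurking]

r = pearson_r(x, y)
print(round(r, 2))  # clearly positive, even though x does not cause y
```

An observational study that recorded only `x` and `y` would see the association but could not tell where it comes from.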

●Usually only a part of the population can be analyzed
●How do you choose your sample?
●The process is called sampling
●We will discuss
  ○ Simple random sampling
  ○ Stratified sampling
  ○ Systematic sampling
  ○ Cluster sampling
  ○ Convenience sampling

●A sample of size n from a population of size N is a simple random sample if every possible sample of size n has an equal chance of occurring
●Examples
  ○ For a simple random sample of size n = 1 from a population of size N = 5, each of the 5 possible samples is equally likely to occur
  ○ For a simple random sample of size n = 2 from a population of size N = 4, each of the 6 possible samples is equally likely to occur
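The counts in these two examples can be checked by listing every possible sample. A minimal Python sketch, assuming the individuals are simply labeled with letters (the labels and the helper name `all_samples` are illustrative and not from the slides):

```python
from itertools import combinations
from math import comb


def all_samples(population, n):
    """Return every possible sample of size n (order does not matter)."""
    return list(combinations(population, n))


# Hypothetical population of N = 4 individuals, labeled for illustration.
population = ["A", "B", "C", "D"]
samples = all_samples(population, 2)

print(len(samples))  # number of possible samples of size n = 2
print(comb(4, 2))    # same count from the formula N! / (n! (N - n)!)
for s in samples:
    print(s)
```

In a simple random sample of size 2 from this population, each of the listed pairs would be selected with the same probability, 1/6.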

●Simple random sampling requires that we have a list of all the individuals within a population
●This list is called a frame
●If we do not have a frame, then a different sampling method must be used

●A simple (but not foolproof) method
  ○ Write each individual’s name on a separate piece of paper
  ○ Put all the papers into a hat
  ○ Draw a random paper from the hat
●Physical methods have some issues
  ○ Are the papers sufficiently mixed?
  ○ Are some of the papers folded?

●A method using a table of random numbers
  ○ List and number the individuals
  ○ Decide on a way to pick the random numbers (how to choose the starting point, and what rule to use to select digits after that)
  ○ Select the random numbers
  ○ Match the numbers to the individuals
●With the technology available today, this method is outdated

●A method using technology
  ○ List and number the individuals
  ○ Use technology (a calculator, or software such as MINITAB or Excel) to generate random numbers
  ○ Match the random numbers to the individuals
●The method must be decided in advance; it is not statistically valid to keep drawing samples until a “good” one comes up
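The slides name a calculator, MINITAB, and Excel; the same three steps can be sketched in Python as an alternative. This is a minimal sketch, assuming a hypothetical frame of 30 numbered students; the names `frame` and `simple_random_sample` and the seed value are illustrative. Fixing the seed before drawing is one possible way to commit to the procedure in advance and keep the draw reproducible.

```python
import random

# Step 1: list and number the individuals (hypothetical frame).
frame = {i: f"Student {i}" for i in range(1, 31)}


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals so that every subset of size n is equally likely."""
    rng = random.Random(seed)
    # Step 2: generate n distinct random numbers from the frame's numbering.
    chosen_numbers = rng.sample(sorted(frame), n)
    # Step 3: match the random numbers to the individuals.
    return [frame[k] for k in chosen_numbers]


sample = simple_random_sample(frame, 3, seed=2024)
print(sample)
```

`random.sample` draws without replacement, so no individual can appear twice in the sample.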

●Simple random sampling example
●We wish to select a random sample of 3 out of a group of 30 students
●We generate a two-digit random number to choose the first student
●Examples
  ○ If the number is “63”, we disregard this random number and choose another one (no student has that number)
  ○ If the number is “17”, we pick student number 17

●We generate another two-digit random number to choose the second student
  ○ Assume that the first student was number 17
●Examples
  ○ If the second number is “17”, we disregard this random number and choose another one (we want 2 different people)
  ○ If the number is “65”, we disregard this random number and choose another one
  ○ If the number is “8”, we choose student 8 as our second student

●We generate another two-digit random number to choose the third student
  ○ Assume that the first student was number 17 and the second was number 8
●Examples
  ○ If the third number is “17”, or “8”, or anything “31” and higher, we disregard this random number and choose another one (we want 3 different people)
  ○ If the number is “2”, we choose student 2 as our third and final student
●Our sample is {student 17, student 8, student 2}
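The procedure in this example (draw a two-digit number, discard it if it is out of range or already used, otherwise accept it) can be sketched as a short loop. This is only a sketch under assumptions: two-digit numbers are taken to run from 00 to 99, and the function name `draw_by_rejection` and the seed are hypothetical. The numbers actually drawn depend on the random number generator, so they will generally differ from 17, 8, and 2.

```python
import random


def draw_by_rejection(population_size, sample_size, rng):
    """Select distinct individuals numbered 1..population_size by discarding unusable draws."""
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(0, 99)  # a two-digit random number (assumed range 00 to 99)
        if number < 1 or number > population_size:
            continue  # like "63" or "65": no student has this number
        if number in chosen:
            continue  # like a second "17": we want different people
        chosen.append(number)
    return chosen


rng = random.Random(1)  # seed chosen in advance, for reproducibility
students = draw_by_rejection(30, 3, rng)
print(students)
```

Discarded numbers are simply skipped, so each remaining student is equally likely to be picked at every step.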

Summary: Chapter 1 – Section 2
●There are different ways of collecting data
  ○ A census uses the entire population
  ○ An existing source uses an existing data set
  ○ An observational study measures the characteristics of a sample without influencing the variable of interest
  ○ A designed experiment applies a treatment to a sample to isolate the effects of a variable
●The method of simple random sampling can be used to select the sample

