Chapter 4: How to Get the Data, Part 1
- In the first three lectures of this course we spoke at length about the care we should take in conducting a study ourselves or in interpreting the results of someone else's study.
- What we didn't mention was how to actually conduct the study. That will be the topic of today's lecture.
- We saw that studies can take three forms:
  - Experiments
  - Observational Studies
  - Sample Surveys

Experiments
- We saw earlier that experiments are possibly the best way to conduct studies, as the researcher usually has complete control over the elements of the study. Experiments also allow a determination of cause and effect.
- Since this is a Psychology course and not a Biology or Chemistry course, we will restrict ourselves to experiments involving humans.
- No, not what you may think: experiments these days rely on volunteer subjects.

- The experimental procedure involves manipulating something called the Explanatory Variable and seeing the effect on something called the Outcome Variable.
- Example: In an experiment to test the effect of a new drug designed to reduce blood pressure, the explanatory variable would be the amount of drug administered and the outcome variable would be the reduction in blood pressure.
- The experiment has to be designed to eliminate any extraneous effects, so that it measures only the effect of the explanatory variable on the outcome variable.

- This is accomplished by randomly assigning participants to one of two groups:
  - One group (the treatment group) receives the treatment.
  - The other group (the control group) receives a placebo, i.e. no active treatment at all.
- This random assignment to a treatment group or a control group is the way most clinical trials are conducted today.
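The random assignment described above can be sketched in a few lines of Python. This is only an illustration: the function name `assign_groups`, the volunteer labels and the even split into two halves are assumptions made for the example, not part of the lecture.

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment group and a control group.

    Hypothetical helper: shuffles a copy of the list and cuts it in half.
    """
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Made-up volunteer labels, purely for illustration.
volunteers = [f"volunteer_{i}" for i in range(1, 21)]
groups = assign_groups(volunteers, seed=1)
print(len(groups["treatment"]), "receive the drug;",
      len(groups["control"]), "receive the placebo")
```

Because chance alone decides who lands in which group, any extraneous characteristic (age, diet, and so on) should be spread roughly evenly across the two groups.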

Observational Studies
- This form of study is similar to an experiment except that the treatment occurs naturally and is not imposed on the subjects.
- It is much harder to establish a cause and effect relationship using an observational study than using an experiment, because we cannot create control and treatment groups to eliminate confounding effects.
- One attempt to isolate the explanatory variable is to conduct what is called a case control study.
- We will examine this type of study in detail later.

FLUNK.NET'S STUDENT SEX SURVEY
- Total Poll: 1,041
- Men: 603 (57.9%)
- Women: 438 (42.1%)
- Virgins: 142 (13.7%)
- Average age: 20

WEB vs 2000 Class

                                        WEB     2000 Class
Average Age of Virginity Loss           17      17.3
Average Number of Sexual Partners       4.2     2.7
% Who have had sex on Campus            34%     approx 3%

WEB vs 2000 Class

                                                WEB     2000 Class
% Who would sleep with a lecturer               49%     49.5%
% Who have slept with someone in their class    43%     approx 4%
% Who would sleep with more than one partner    64%     ??????

Some interesting facts
% Virgins: 31%
Non-Virgins:
  14    0.01
  15    0.01
  16    0.22
  17    0.32
  18    0.27
  19    0.13
  20    0.02

Some interesting facts

# Partners    Count    Proportion
 1            93       0.47
 2            47       0.24
 3            26       0.13
 4             4       0.02
 5             7       0.04
 6             4       0.02
 7             3       0.02
 8             1       0.01
 9             1       0.01
10             2       0.01
11             4       0.02
14             1       0.01
15             2       0.01
24             1       0.01
25             1       0.01

Sample Surveys
- We will concentrate for the rest of this lecture on sample surveys.
- First some definitions:
  - A Unit is a single individual or object to be measured.
  - A Population is the entire collection of Units about which we would like information.
  - A Sample is the collection of Units we actually measure.
  - A Sampling Frame is a list of Units from which the sample is chosen. Ideally the sampling frame includes the whole Population.

- In a Sample Survey measurements are taken on a sample chosen from the Population.
- In a Census the entire Population is surveyed.

Why is Sampling used
- Resources are needed to conduct a Census.
  - The CSO spends about £20 million to conduct the five-yearly Census of Population.
- Sometimes the measuring process destroys the thing being measured, e.g. testing the strength of a weld, or testing an individual's blood. Who among us would be willing to donate all of our blood for a test?
- Because of the work involved in a Census it is much faster to conduct a survey, and sometimes it is important to have results fast.

Accuracy in Surveys
- There are accuracy advantages to be had in conducting a sample survey:
  - It is easier to get complete coverage of a sample than of a population.
  - It is easier to train a small number of interviewers for a sample survey than to train a large number for a census.
- OK, but a sample is still a sample and is bound to be inaccurate by its very nature, isn't it?
- British General Election

Accuracy in Sampling
- We mentioned before that if a sample is chosen to be representative of the target population then it can be very accurate. How accurate?
- For surveys conducted to measure a sample proportion as an estimate of a population proportion we can define a Margin of Error.
- The sample proportion will differ from the population proportion by more than the margin of error less than 5% of the time.
- The Margin of Error for a sample of size n is 1/√n.

Accuracy in Sampling
- For example, with a sample of 1600 the margin of error is 1/√1600 = 1/40, or 2.5%.
- So a survey conducted using a sample of size 1600 will be accurate to within 2.5% more than 95% of the time.
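The 1/√n rule is easy to check numerically. The short Python sketch below simply evaluates it for a few sample sizes; the function name `margin_of_error` and the list of sample sizes are choices made for this illustration.

```python
import math

def margin_of_error(n):
    """Margin of error 1/sqrt(n) for a sample proportion from n units."""
    return 1 / math.sqrt(n)

# Sample sizes picked only to show how slowly the margin shrinks.
for n in (100, 400, 1600, 10000):
    print(f"n = {n:>5}: margin of error = {margin_of_error(n):.1%}")
```

Note that quadrupling the sample size only halves the margin of error, which is one reason very large samples are rarely worth the cost.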

How to choose a Sample
- We saw already that in order for the sampling procedure to work properly the sample must be representative of the target population.
- There are several ways to get a representative sample:
  - Simple Random Sampling
  - Stratified Random Sampling
  - Cluster Sampling
  - Systematic Sampling
  - Random Digit Dialling
  - Multi-Stage Sampling

Simple Random Sampling
- The simplest form of sampling procedure.
- Every possible group of individuals of the required size has the same chance of being chosen.
- Therefore each individual has the same chance of being chosen.
- Use Random Number Tables or a random number generator.
- Or put names in a hat or roll a die.
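With a random number generator, a simple random sample takes one call. The sketch below uses Python's standard `random.sample`; the sampling frame of 100 made-up student labels and the helper name `simple_random_sample` are assumptions for the example.

```python
import random

def simple_random_sample(frame, k, seed=None):
    """Draw k distinct units from the sampling frame, every subset equally likely."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(frame, k)

# Hypothetical sampling frame: a list of 100 student labels.
sampling_frame = [f"student_{i}" for i in range(1, 101)]
srs = simple_random_sample(sampling_frame, 10, seed=42)
print(srs)
```

This is the electronic equivalent of putting 100 names in a hat and drawing 10 without putting any back.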

Stratified Random Sampling
- Polling companies don't have a list of all adults from which they can select randomly.
- Instead they use other methods, like Stratified Random Sampling.
- We first divide the population into different strata, then sample randomly within those strata.
- Example: To conduct an opinion poll we might divide the population into different age groups, or by sex, or by County of residence.

Stratified Random Sampling
- Advantages of this method are:
  - We can get individual estimates for each stratum.
  - We can use different interviewers for each stratum and train them appropriately.
  - If the strata are different geographic regions it may be cheaper to sample them separately.
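Stratified random sampling can be sketched as "group the units by stratum, then take a simple random sample inside each group". In the Python example below the age-group strata, their sizes and the 10% sampling fraction are all invented for illustration, and `stratified_sample` is a hypothetical helper rather than a standard library function.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Sample the same fraction at random from every stratum.

    stratum_of is a function that returns the stratum label of a unit.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Made-up population of 1000 people in three age groups.
labels = ["18-34"] * 500 + ["35-54"] * 300 + ["55+"] * 200
population = [{"id": i, "age_group": g} for i, g in enumerate(labels)]

by_age = stratified_sample(population, lambda p: p["age_group"], 0.10, seed=3)
for group, people in by_age.items():
    print(group, len(people))
```

Because the result is kept separately for each stratum, it is straightforward to produce the individual per-stratum estimates mentioned above.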

Cluster Sampling
- Divide the Population into similar groups or clusters. Then choose a random sample of clusters.
- The analysis of this type of survey is more complicated than for simple random sampling.
- NOTE: This is not the same as Stratified Sampling. In Cluster Sampling the clusters are chosen so that they resemble each other as much as possible.
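A minimal Python sketch of cluster sampling follows. It assumes the simplest variant, in which every unit inside a chosen cluster is surveyed; the schools, pupil labels and the helper name `cluster_sample` are invented for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose n_clusters whole clusters at random.

    clusters maps a cluster name to the list of units it contains.
    Assumption: every unit in a chosen cluster is then surveyed.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 10 schools (clusters) of 30 pupils each.
schools = {
    f"school_{s}": [f"school_{s}_pupil_{p}" for p in range(1, 31)]
    for s in range(1, 11)
}
chosen_schools = cluster_sample(schools, 3, seed=7)
surveyed = [pupil for pupils in chosen_schools.values() for pupil in pupils]
print(sorted(chosen_schools), len(surveyed))
```

Contrast this with the stratified approach: there we sample inside every group, whereas here we pick a few groups at random and ignore the rest.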

Systematic Sampling
- This is where a plan is used to choose the participants in the study.
- For example: We might decide to survey every 3rd person we meet, or to choose every 5th house to be surveyed.
- Sometimes this procedure can be very biased.
- What happens if every 5th house is an end house?
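The end-house problem can be made concrete with a small Python sketch. The street layout is entirely hypothetical: 100 houses built in terraces of five, so that the first and last house of each terrace are end houses. `systematic_sample` is likewise just an illustrative helper.

```python
import random

def systematic_sample(frame, step, start=None, seed=None):
    """Take every step-th unit, beginning at index start (random if not given)."""
    if start is None:
        start = random.Random(seed).randrange(step)
    return frame[start::step]

# Hypothetical street: houses 1..100 in terraces of five, so house numbers
# 1, 5, 6, 10, 11, 15, ... are end houses (40% of the street).
houses = [{"number": i, "end_house": i % 5 in (0, 1)} for i in range(1, 101)]

def share_of_end_houses(sample):
    return sum(h["end_house"] for h in sample) / len(sample)

every_fifth = systematic_sample(houses, 5, start=4)  # houses 5, 10, 15, ...
middle_only = systematic_sample(houses, 5, start=2)  # houses 3, 8, 13, ...
print("whole street:", share_of_end_houses(houses))
print("start at house 5:", share_of_end_houses(every_fifth))
print("start at house 3:", share_of_end_houses(middle_only))
```

When the sampling step matches a pattern in the list, the sample can consist entirely of end houses or contain none at all, and neither reflects the street as a whole.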

Random Digit Dialling
- Used very much in the US.
- Phone numbers in certain area codes are dialled randomly by computer; when someone answers, an interviewer asks the questions.

Multi-Stage Sampling
- Used for large surveys.
- Involves using a combination of the methods described above.
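As one possible illustration of combining methods, the Python sketch below runs a two-stage design: first a random sample of clusters, then a simple random sample inside each chosen cluster. The counties, household labels, sample sizes and the helper name `two_stage_sample` are assumptions for the example; real multi-stage surveys may combine the methods differently.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: simple random sample in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical frame: 26 counties with 200 listed households each.
counties = {
    f"county_{c}": [f"county_{c}_household_{h}" for h in range(1, 201)]
    for c in range(1, 27)
}
staged = two_stage_sample(counties, n_clusters=5, n_per_cluster=20, seed=11)
print({county: len(households) for county, households in staged.items()})
```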

What can go wrong in Sampling
- Here are 5 ways to make a mess of the sampling procedure:
  1. Use the wrong sampling frame
  2. Fail to reach the individuals selected
  3. Get no response
  4. Get a sample of volunteers
  5. Use a convenient or haphazard sample
- The last 2 of these are disastrous.

What went wrong in Sampling
- 1936 US election.
- The Literary Digest had been extremely successful in its earlier predictions.
- In 1936 it predicted a 3-2 victory for the Republican, Landon, over the Democrat, FDR.
- George Gallup's American Institute of Public Opinion predicted FDR's win correctly, and also predicted what the Literary Digest would predict.
- Literary Digest: 10 million
- Gallup: 50,000
- The Literary Digest drew on magazine subscribers, phone directories and car owners, i.e. the wealthy.
- Most serious, though: 23% volunteer response.

