An Introduction to Web-based Experimentation Stian Reimers
Overview Why bother with web testing? Different ways of implementing web tests Design issues: How to get top-quality data Ethics and good scientific practice The future of experimenterless experiments
Why bother with web testing? Cheap - don’t need to pay participants Time saving - once set up, experiment can be left to run Thousands of participants Wider range of people than traditional undergraduate subject pool Possible to target low-frequency subgroups Reduces experimenter bias Encourages dialogue between academic and public
Please go to the following URL: snipurl.com/128xy
Case Study: Task Switching at the BBC Set up in 2003, to tie in with TV series Coded in Flash, mainly using ActionScript Visitors to BBC S&N website click through Data passed to our server at end of experiment
Age effects on specific and general switch costs
Summary: BBC Task Switching Experiment Around 50,000 participants so far Possible to examine task switching in ages 10-66 Reported in Reimers and Maylor (2005; 2006) Experiment has changed several times Allows us to test new theories with minimal effort Gets participants for further experiments
[Diagram] HTML page → Perl script, which saves the data, generates a dynamic response page, and emails the experimenter
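The three jobs of the server-side script can be sketched in a few lines. The slide describes a Perl script; Python is substituted here purely for illustration, and the function name `handle_submission`, the field names, and the email address are all hypothetical. Actually sending the email and wiring the function to a web server are left out.

```python
import csv
import html
import io
from email.message import EmailMessage


def handle_submission(form, data_file, experimenter_email):
    """Save one submitted form, build a response page, and prepare an email."""
    # 1. Save data: append the submitted values as one CSV row,
    #    with the fields in alphabetical order so columns stay stable.
    fields = sorted(form)
    csv.writer(data_file).writerow([form[name] for name in fields])

    # 2. Generate a dynamic response page. User input is escaped
    #    before it is echoed back into HTML.
    score = html.escape(str(form.get("score", "not recorded")))
    page = (
        "<html><body><p>Thank you. Your score was "
        + score
        + ".</p></body></html>"
    )

    # 3. Prepare a notification email for the experimenter.
    #    Sending it (for example via smtplib) is left to the caller.
    message = EmailMessage()
    message["To"] = experimenter_email
    message["Subject"] = "New experiment submission"
    message.set_content("Fields received: " + ", ".join(fields))
    return page, message


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for the data file on the server
    page, message = handle_submission(
        {"participant": "p001", "score": "42"},
        buffer,
        "experimenter@example.org",
    )
    print(buffer.getvalue().strip())
    print(page)
```

Keeping the three steps in one function mirrors the diagram, but in a real deployment each step would need error handling so that a failed email does not lose the saved data.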
Please return to: snipurl.com/128xy and follow the link to example 2
Please return to: snipurl.com/128xy and follow the link to example 3
Java: Advantages Can be used for fairly accurate RT measurement. Sandboxed, so can’t damage a user’s computer Ostensibly platform independent, so the same code could be used for web-, lab-, and, say, mobile-based execution Costs nothing. Client-side implementation Ideal for experiments measuring RT.
Java: Disadvantages Relatively difficult language to master, particularly for running on multiple platforms Need skill to make programs intuitive and ergonomic Slow start-up time Issues of different versions (Sun vs. Microsoft) Not always installed/enabled Not good for novice programmers, one-off experiments
You can return to: snipurl.com/128xy to retry the task switching experiment
Adobe Flash: Advantages Similar advantages to Java –Client side processing, sandboxed, platform independent Designed for web implementation, so easier than Java to make good-looking experiments Can combine code written in ActionScript with animation-style features More ubiquitous plug-in Ideal for ‘fun’ or multi-stage experiments.
Adobe Flash: RT Measurement Reimers, S., & Stewart, N. (in press). Adobe Flash as a medium for online experimentation: A test of RT measurement capabilities. Behavior Research Methods.
Adobe Flash: Disadvantages Requires plug-in (but 97.3% of computers have it installed already) Commercial software, so costs money Easily decompilable Awkward stimulus timing May be blocked by advert-filtering software Possible differences in performance across platforms Not good for very low spec machines, tachistoscopic presentation, sequences of rigidly timed stimuli.
Please return to: snipurl.com/128xy and follow the link to example 4
Adobe Authorware: Advantages Similar advantages to Flash And very user friendly Quite similar to testing applications like Superlab Ideal for experimenters who aren’t very confident programmers but want to run web experiments.
Adobe Authorware: Disadvantages Not cheap to buy Requires plug-in which most people won’t have Quite a niche product - harder to find casual programmers to code up experiments Relatively untested with respect to measurement accuracy, display consistency etc Not good for uncommitted participants or cash-strapped researchers.
Key Differences Between Web- and Lab-based Testing Less social pressure –May reduce demand issues –But also increases drop-out rate, lying Unverifiable demographics Less control over experimental setting –Loud music, monitor size, drunkenness Less control over multiple submissions
Multiple Submissions Historically not that big a problem (Krantz & Dalal, 2000; Musch & Reips, 2000) –But likely to be more so if participants are paid Ask people if they’ve taken part before Get unique identifier (email address, NI number) Set a cookie Log their IP address
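A minimal sketch of how the identifier and IP-address checks listed above might be combined at analysis time. The record layout (`email`, `ip`, `taken_part_before`) and the function names are assumptions made for illustration. Identifiers are hashed so the raw values need not be kept, and a match is only flagged, not deleted, because several genuine participants can share one IP address (a household or a university network, for example).

```python
import hashlib


def fingerprint(kind, value):
    """Hash an identifier so the raw email or IP address need not be stored."""
    normalised = f"{kind}:{value.strip().lower()}"
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def flag_repeats(submissions):
    """Mark each submission whose email or IP address matches an earlier one."""
    seen = set()
    flagged = []
    for sub in submissions:
        keys = {
            fingerprint("email", sub["email"]),
            fingerprint("ip", sub["ip"]),
        }
        # A repeat is suspected if either key was seen before, or if the
        # participant said they had already taken part.
        repeat = bool(keys & seen) or sub.get("taken_part_before", False)
        flagged.append({**sub, "possible_repeat": repeat})
        seen |= keys
    return flagged


if __name__ == "__main__":
    data = [
        {"email": "a@example.org", "ip": "192.0.2.1"},
        {"email": "A@example.org ", "ip": "192.0.2.7"},  # same email, new IP
        {"email": "b@example.org", "ip": "192.0.2.9"},
    ]
    for row in flag_repeats(data):
        print(row["email"].strip(), row["possible_repeat"])
```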
Dropout in most studies is a minor problem Sample is not representative –But still better than undergraduates Ideally, should log the number of participants who start the experiment relative to number who finish it. Gives useful info on how much people are enjoying your study
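Logging starts against finishes needs very little code. The sketch below assumes a hypothetical event log of `(participant_id, event)` pairs, where the event is either `"start"` or `"finish"`; the names are illustrative only.

```python
def completion_summary(events):
    """Count participants who started and finished, and the dropout rate."""
    started = {pid for pid, event in events if event == "start"}
    # Only count a finish if the same participant was also logged starting.
    finished = {pid for pid, event in events if event == "finish"} & started
    n_started = len(started)
    n_finished = len(finished)
    dropout_rate = 1 - n_finished / n_started if n_started else 0.0
    return {
        "started": n_started,
        "finished": n_finished,
        "dropout_rate": dropout_rate,
    }


if __name__ == "__main__":
    log = [
        ("p1", "start"), ("p1", "finish"),
        ("p2", "start"),
        ("p3", "start"), ("p3", "finish"),
        ("p4", "start"),
    ]
    print(completion_summary(log))
```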
Dropout in experiments can lead to sampling biases and misleading results [Diagram: Lazy People and Committed People in the Easy Condition and in the Hard Condition]
So, to prevent this sort of problem, use the ‘high-hurdle’ approach [Diagram: Dull, irrelevant task placed before the Easy Condition and the Hard Condition]
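The bias and the remedy on the two slides above can be illustrated with made-up numbers. Everything in this sketch is an assumption chosen for the arithmetic: "lazy" participants always score 50, "committed" participants always score 70, the condition has no true effect on scores, and lazy participants abandon the hard condition but finish the easy one.

```python
from statistics import mean

# Hypothetical scores. The condition itself changes nobody's score.
SCORES = {"lazy": 50, "committed": 70}


def observed_mean(finishers):
    """Mean score among the participants who finished a condition."""
    return mean(SCORES[kind] for kind in finishers)


def run(high_hurdle):
    """Return the observed (easy, hard) condition means."""
    # 50 lazy and 50 committed people arrive at each condition.
    arrivals = ["lazy"] * 50 + ["committed"] * 50
    if high_hurdle:
        # A dull task comes first, so lazy people leave before assignment.
        arrivals = [p for p in arrivals if p == "committed"]
    easy = arrivals  # assumed: nobody quits the easy condition
    hard = [p for p in arrivals if p == "committed"]  # lazy people quit
    return observed_mean(easy), observed_mean(hard)


if __name__ == "__main__":
    print("without hurdle:", run(high_hurdle=False))
    print("with hurdle:   ", run(high_hurdle=True))
```

Without the hurdle the hard condition appears to raise scores by 10 points, purely because its finishers are a different mix of people. With the hurdle both conditions are completed by the same kind of participant and the spurious difference disappears, at the cost of a smaller sample.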
And generally, try to prevent drop-out by making things fun, easy, and interesting Make it fun to do and nice to look at Implement as a game where possible Sunk cost effect: Put the dull stuff at the end Ask people to complete the entire test Feedback –Tell people about themselves –Comparisons with rest of population Describe the experiment’s aims and the science behind it
Dishonesty, carelessness, misunderstanding Not as big a problem as you might imagine –3.5-6.3% junk /1% split-half inconsistencies (Johnson, 2005) –1-5% inconsistency in sex differences study (Reimers, in press) –Cf. 0.7% of pencil and paper (Gough & Bradley, 1996) Make submission of demographic data voluntary –Or give option of ‘I’d rather not say’ Ask the same questions at start and end –Check for consistency, but may look sneaky Put in equivalent of a ‘lie scale’ Obviously, remove people who aren’t responding honestly
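Asking the same questions at the start and end lends itself to an automatic check. The sketch below assumes each participant's answers are stored as two dictionaries, `start` and `end`; this layout and the item names are hypothetical. A missing answer is compared as `None`, so someone who skips the item both times is treated as consistent.

```python
def inconsistent_participants(responses, repeated_items):
    """Return ids of participants whose start and end answers disagree."""
    flagged = []
    for pid, answers in responses.items():
        for item in repeated_items:
            if answers["start"].get(item) != answers["end"].get(item):
                flagged.append(pid)
                break  # one mismatch is enough to flag this participant
    return flagged


if __name__ == "__main__":
    responses = {
        "p1": {"start": {"age": 34, "sex": "F"}, "end": {"age": 34, "sex": "F"}},
        "p2": {"start": {"age": 21, "sex": "M"}, "end": {"age": 99, "sex": "M"}},
        "p3": {"start": {"sex": "F"}, "end": {"sex": "F"}},  # age skipped twice
    }
    flagged = inconsistent_participants(responses, ["age", "sex"])
    print(flagged, f"{len(flagged) / len(responses):.0%} inconsistent")
```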
Mental state of participants Can’t screen for people in abnormal mental states Relatively small proportion of experimental population Remove egregious datasets at analysis stage Ask people directly (and sensitively) Include screening questions to show general competence
Getting participants to do your (ergonomic, well-designed) experiment Get links (e.g. from department or study index site) Advertise (banners etc) –Costs money. Unproven effectiveness, but great potential. Set up email list of willing participants Pay participants –Costs money. Multiple submissions, careless participation. Hassle to implement. Use a reward scheme like ipoints –Effective, can pay little, no multiple submissions, select appropriate demographic, easy to run
Ethics of Web-based Experiments Key differences Informed consent Sensitive material / personal questions Unflattering feedback Deception Debrief
Key Differences between lab- and web-based research You are not present –Can’t offer feedback and reassurance –Can’t check a participant is in a suitable mood –Can’t tell how old a participant is –Can’t answer any questions or concerns Broader demographic –More lonely or socially isolated participants –More participants with mental illnesses
Informed consent: Pros and cons Follows ethical guidelines Explains things that may otherwise have caused concern to the participant –Dropping out is okay, data are anonymous Makes the experiment look more authoritative and serious But may scare off people who’d otherwise have enjoyed the experiment Seems to be more of a back-covering exercise than an attempt to ensure the participant is protected
Do I need informed consent? Kraut R., Olson J., Banaji M., Bruckman A., Cohen J., & Couper M. (2004) Psychological research online: Report of board of scientific affairs' advisory group on the conduct of research on the Internet, American Psychologist 59, 105-117.
Sensitive material / Personal questions You may offend people or evoke unpleasant thoughts or memories. Warn people at the start of the experiment Remind people that responding is optional Say ‘Adults only’ or better still get people to enter their age, and skip sensitive questions if under 18 Be sensitive in wording of questions and implications of particular ways of framing information Offer contact details for further information
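Skipping sensitive questions for under-18s is a simple filter once each question is marked. In this sketch the `Question` class, the `sensitive` flag, and the example wording are hypothetical. As an added precaution that is an assumption rather than something on the slide, a participant who does not give an age is treated like a minor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    sensitive: bool = False


def questions_for(age, questions, adult_age=18):
    """Return the questions to show, hiding sensitive ones from minors."""
    if age is None or age < adult_age:
        # Unknown or under-age: show only the non-sensitive questions.
        return [q for q in questions if not q.sensitive]
    return list(questions)


if __name__ == "__main__":
    survey = [
        Question("How many hours do you sleep per night?"),
        Question("Have you ever used recreational drugs?", sensitive=True),
        Question("What is your favourite colour?"),
    ]
    for age in (16, 30, None):
        shown = questions_for(age, survey)
        print(age, [q.text for q in shown])
```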
Feedback risks making a participant feel stupid or establishing apparent norms Don’t tell people they’re in the bottom decile for performance on a cognitive/IQ task ‘all the women were strong, all the men were good looking and all the children were above average’ Use broad categories for giving feedback, but better not to lie about actual performance. ‘You did better than 20%...’ Include caveats about how poor a measure of performance your test is And how performance varies a lot intraindividually And how the other participants may not be representative
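One way to give truthful but broad feedback is to map a score to a small number of bands and attach the caveats automatically. The three bands, their cut-offs at 33% and 66%, and the wording below are all assumptions made for this sketch, not values from the talk.

```python
from bisect import bisect_left

CAVEAT = (
    "This is only a rough measure, scores vary a lot from one attempt to "
    "the next, and other participants may not be typical of the population."
)


def percentile_rank(score, reference_scores):
    """Percentage of reference scores strictly below the given score."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def broad_feedback(score, reference_scores):
    """Describe a score using three broad bands instead of an exact rank."""
    rank = percentile_rank(score, reference_scores)
    if rank >= 66:
        band = "towards the higher end of the scores so far"
    elif rank >= 33:
        band = "around the middle of the scores so far"
    else:
        band = "towards the lower end of the scores so far"
    return f"Your score was {band}. {CAVEAT}"


if __name__ == "__main__":
    reference = list(range(1, 101))  # made-up scores from earlier participants
    for score in (10, 50, 90):
        print(score, "->", broad_feedback(score, reference))
```

The band is always a true statement about the participant's position, but it never singles out the bottom decile.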
Deception is not recommended online Always a sensitive issue Difficult online, because debrief is harder Need to reassure participants that they are not being mocked or exploited when experimenter is not present Get ethics board input before running
Debrief Try to explain the aim of the experiment in simple terms –Run it past your friends and family first to make sure it’s easily understandable Thank the participant for their time Give them an email address to contact you if they want further information or to see the final results
Sixteen standards for web-based experimenting (Reips, 2002)
Sixteen standards Consider a software tool for development Pretest for clarity Decide on HTML vs. plugins Check for errors Link to several sites to check self selection Run online and offline for comparison Use warm-up technique to avoid dropout (maybe) Use dropout to check motivational confounding
Sixteen standards Minimise dropout Highlight seriousness of experiment Check for obvious naming of files or passwords Avoid multiple submissions Perform consistency checks Keep full details for others to analyse Report and analyse dropout curves Keep experiment available online
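The standard on reporting dropout curves can be made concrete with a short calculation. This sketch assumes the only thing logged per participant is the number of the last page they reached, with pages numbered from 1; the function name and data layout are hypothetical. The curve gives, for each page, the proportion of starters who got at least that far.

```python
from collections import Counter


def dropout_curve(last_page_reached, n_pages):
    """Proportion of starters who reached each page, from page 1 to n_pages."""
    total = len(last_page_reached)
    if total == 0:
        return []
    stopped_at = Counter(last_page_reached)
    remaining = total
    curve = []
    for page in range(1, n_pages + 1):
        curve.append(remaining / total)
        # People whose last page was this one are gone before the next page.
        remaining -= stopped_at.get(page, 0)
    return curve


if __name__ == "__main__":
    last_pages = [1, 2, 2, 3, 4, 4, 4, 4]  # eight made-up participants
    for page, share in enumerate(dropout_curve(last_pages, 4), start=1):
        print(f"page {page}: {share:.1%} of starters")
```

A sharp fall at one particular page points to a problem with that page, whereas a steady decline suggests general loss of interest.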
Massive longitudinal panel experiments Already used to running experiments with >250,000 participants Possible to get thousands of people from ever broader demographic to participate repeatedly Look at, for example, cognitive aging of individuals Set up panels of reliable participants –Choose demographic, etc. Cross-tabulate results from many experiments, get vast amounts of data
New devices Run experiments using WAP on mobile phones If you know Java, it’s relatively easy to adapt an application to, say, Series 60 Nokias E.g., memory task. Participants download application. Every hour the phone vibrates and participants see another item. Test at end of day. Send results by SMS. Give people a task to do at unpredictable points, check effect of time of day, mood, etc.
Conclusion Web-based testing can be a powerful tool for investigating issues hard to investigate in the lab Web-based testing has some core differences from lab-based testing These differences have advantages and disadvantages In years to come there will be new ways to test people outside the laboratory Web-based testing is now accepted in the research community