Running Experiments with Amazon Mechanical-Turk. Gabriele Paolacci, Jesse Chandler, Panagiotis G. Ipeirotis. Judgment and Decision Making, Vol. 5, No. 5, August 2010.


KSE 801: Human Computation and Crowdsourcing

Practical Advantages of M-Turk
Supportive infrastructure:
– Fast recruiting
– Convenient to run experiments
– An external site can be used (e.g., with a validation code)
Subject identifiability and prescreening:
– M-Turk workers can be required to earn “qualifications” (or answer prescreening questions) before completing a HIT
Subject identifiability and longitudinal studies:
– Worker IDs can be used to explicitly re-contact former subjects, or code can be written that restricts the availability of a HIT to a predetermined list of workers
Cultural diversity:
– Cross-cultural comparisons are feasible (e.g., country, language, currency)
Subject anonymity (not easy, though):
– Ensuring workers’ anonymity (if an external site is used)
– M-Turk studies can be exempted from review by IRBs (Institutional Review Boards) if anonymity is guaranteed

Tradeoffs of Different Recruiting Methods

A Comparative Study
Tested various Judgment and Decision Making (JDM) findings
– Subject pools: M-Turk, a traditional subject pool at a large Midwestern US university, and visitors of online discussion boards
– Conducted from April to May 2010
Survey:
– Asian disease problem
– Linda problem
– Physician problem

Survey (Asian Disease Problem)
Asian disease problem (a framing problem; Tversky and Kahneman, 1981)
Subjects read one of two hypothetical scenarios
– Imagine that the United States is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows:
– Problem 1: If Program A is adopted, 200 people will be saved. If Program B is adopted, there is 1/3 probability that 600 people will be saved and 2/3 probability that no people will be saved. Which of the two programs would you favor?
– Problem 2: If Program A is adopted, 400 people will die. If Program B is adopted, there is 1/3 probability that nobody will die, and 2/3 probability that 600 people will die.
The two scenarios are numerically identical, but subjects responded very differently
In the scenario framed in terms of gains, subjects were risk-averse (72% chose Program A); in the scenario framed in terms of losses, 78% of subjects preferred Program B (Tversky and Kahneman, 1981)
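The statement that the two scenarios are numerically identical can be checked with a little arithmetic. The following is a minimal sketch, not part of the original slides; the names `expected_saved` and `TOTAL_AT_RISK` are illustrative assumptions. It converts the loss frame from deaths to lives saved and compares expected values.

```python
from fractions import Fraction

TOTAL_AT_RISK = 600  # people expected to die if nothing is done


def expected_saved(outcomes):
    """Expected number of people saved, given (probability, saved) pairs."""
    return sum(p * saved for p, saved in outcomes)


# Gain frame (Problem 1): outcomes are stated as lives saved.
program_a_gain = [(Fraction(1), 200)]
program_b_gain = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]

# Loss frame (Problem 2): outcomes are stated as deaths,
# so convert each one to lives saved out of the 600 at risk.
program_a_loss = [(Fraction(1), TOTAL_AT_RISK - 400)]
program_b_loss = [
    (Fraction(1, 3), TOTAL_AT_RISK - 0),
    (Fraction(2, 3), TOTAL_AT_RISK - 600),
]

for name, program in [
    ("Problem 1, Program A", program_a_gain),
    ("Problem 1, Program B", program_b_gain),
    ("Problem 2, Program A", program_a_loss),
    ("Problem 2, Program B", program_b_loss),
]:
    print(name, expected_saved(program))
```

All four programs have the same expected number of lives saved; only the certainty of the outcome and the wording differ.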

Survey (Linda Problem)
Example: “Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.”
Which is more probable?
– Linda is a bank teller
– Linda is a bank teller and is active in the feminist movement
Linda problem (Tversky & Kahneman, 1983)
– Demonstrates the conjunction fallacy
– People often fail to regard a combination of events as less probable than a single event in the combination
The probability of two events occurring together (in “conjunction”) is always less than or equal to the probability of either one occurring alone
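The conjunction rule can be illustrated by counting. The sketch below uses an invented population of 1,000 people matching Linda's description; the counts are assumptions made up for illustration, and only the resulting inequality matters.

```python
# Hypothetical population; the group sizes are invented for illustration.
population = (
    [{"bank_teller": True, "feminist": True}] * 40
    + [{"bank_teller": True, "feminist": False}] * 10
    + [{"bank_teller": False, "feminist": True}] * 800
    + [{"bank_teller": False, "feminist": False}] * 150
)


def probability(people, predicate):
    """Fraction of people for whom the predicate holds."""
    return sum(1 for person in people if predicate(person)) / len(people)


p_teller = probability(population, lambda p: p["bank_teller"])
p_teller_and_feminist = probability(
    population, lambda p: p["bank_teller"] and p["feminist"]
)

print("P(bank teller) =", p_teller)
print("P(bank teller and feminist) =", p_teller_and_feminist)
```

Every person counted in the conjunction is also counted as a bank teller, so the second probability cannot exceed the first, whatever counts are chosen.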

Survey (Physician Problem)
The physician problem demonstrates the outcome bias: a surgeon decides whether or not to perform a risky surgery on a patient.
– The surgery had a known probability of success (e.g., 8%)
– Subjects were presented with either a good or a bad outcome (in this case, the patient living or dying) and asked to rate the quality of the surgeon's pre-operation decision.
Judgments of the quality of a decision often depend on the valence of the outcome (Baron and Hershey, 1988)
Subjects rated the quality of a physician’s decision to perform an operation on a patient (on a 7-point scale)
– 1: incorrect and inexcusable; 7: clearly correct, and the opposite decision would be inexcusable
– Those presented with bad outcomes rated the decision worse than those presented with good outcomes.

After Survey
After the survey, subjects completed the Subjective Numeracy Scale (SNS; Fagerlin et al., 2007), which yields an SNS score
– An eight-item self-report measure of perceived ability to perform various mathematical tasks and preference for the use of numerical vs. prose information
– Used as a parsimonious measure of an individual’s quantitative abilities
Additional “catch trial” question: tests whether subjects were attending to the questions (by requiring a precise and obvious answer)
– E.g., “While watching the television, have you ever had a fatal heart attack?” (with a six-point scale anchored on “Never” and “Often”)
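A catch trial of this kind is straightforward to score. The following is a minimal sketch under stated assumptions: the scale is coded 1 to 6 with 1 meaning “Never”, and the response records and field names are hypothetical. The slides do not describe how the authors coded or handled failures.

```python
# Assumption: the six-point scale is coded 1..6, with 1 = "Never".
# Nobody answering the survey can have had a fatal heart attack,
# so "Never" is the only attentive answer.
CORRECT_CATCH_ANSWER = 1

# Hypothetical response records for illustration.
responses = [
    {"subject": "s01", "catch_trial": 1},
    {"subject": "s02", "catch_trial": 4},
    {"subject": "s03", "catch_trial": 1},
]


def passed_catch_trial(response):
    """True if the subject gave the only sensible answer."""
    return response["catch_trial"] == CORRECT_CATCH_ANSWER


attentive = [r["subject"] for r in responses if passed_catch_trial(r)]
failed = [r["subject"] for r in responses if not passed_catch_trial(r)]

print("passed:", attentive)
print("failed:", failed)
```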

Configuration
M-Turk:
– Pay: $0.10 (N=318 participated)
– Title: “Answer a short decision survey”
– Description: “Make some choices and judgments in this 5-minute survey”
– Estimated completion time is included to provide workers with a rough assessment of the reward/effort ratio (e.g., $1.71/hour)
Lab subject pool:
– N=141 students from an introductory subject pool at a large university
Internet discussion board:
– Posted a link to the survey to several online discussion boards that host online experiments in psychology
– Online for 2 weeks; N=137 visitors took part in the survey
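The reward/effort ratio is simple arithmetic on the pay and the completion time. The sketch below is illustrative; the function names are assumptions. Note that $0.10 for the advertised 5 minutes works out to $1.20/hour, so the $1.71/hour figure appears to correspond to a shorter completion time of roughly 3.5 minutes. That reading is an inference from the numbers, not something the slide states.

```python
def hourly_rate(pay_dollars, minutes):
    """Effective hourly wage for a task paying pay_dollars and taking minutes."""
    return pay_dollars * 60 / minutes


def implied_minutes(pay_dollars, rate_per_hour):
    """Completion time (in minutes) implied by a pay and an hourly rate."""
    return pay_dollars * 60 / rate_per_hour


advertised = hourly_rate(0.10, 5)      # rate if the task takes the advertised 5 minutes
implied = implied_minutes(0.10, 1.71)  # minutes that would yield $1.71/hour

print(f"rate at 5 minutes: ${advertised:.2f}/hour")
print(f"minutes implied by $1.71/hour: {implied:.1f}")
```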

Subject Pools: Characteristics
Subjects recruited from online discussion forums were significantly less likely to complete the survey than the subjects on M-Turk (69.3% vs. 91.6%, χ² = 20.915, p < .001)
The number of respondents who failed the catch trial was low and did not differ significantly across subject pools (χ²(2, N = 301) = 0.187, p = 0.91)
Subjects in the three subject pools did not differ significantly in SNS score: F(2, 299) = 1.193, p = 0.30
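The catch-trial p-value can be sanity-checked from the reported statistic alone. With 2 degrees of freedom, the upper-tail probability of a chi-square statistic x has the closed form exp(-x/2), so no statistics library is needed. This is a minimal sketch; the function name is an assumption.

```python
import math


def chi2_sf_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom.

    For df = 2 the chi-square distribution is exponential with mean 2,
    so the survival function is exp(-x / 2).
    """
    return math.exp(-statistic / 2)


p_catch = chi2_sf_df2(0.187)
print(f"p = {p_catch:.2f}")
```

The result is approximately 0.91, consistent with no significant difference in catch-trial failures across the three pools.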

Results on Experimental Tasks
M-Turk is a reliable source of experimental data in JDM

Labor Supply
Economic theory predicts that increasing the price paid for labor will increase the supply of labor in most cases
M-Turk experiment: after completing a demographic survey and a first task (transcription), subjects were randomly assigned to one of four treatment groups and offered the chance to perform another transcription for p cents, where p is 1, 5, 15, or 25
Workers receiving higher offers were more likely to accept
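The random assignment step could look something like the sketch below. It is illustrative only: the worker IDs, the function name, and the fixed seed are assumptions, and the slides do not say how the original experiment implemented assignment.

```python
import random

OFFERS_CENTS = [1, 5, 15, 25]  # the four treatment groups from the slide


def assign_offers(worker_ids, seed=0):
    """Independently assign each worker one of the four offers at random.

    A seeded generator makes the assignment reproducible for auditing.
    """
    rng = random.Random(seed)
    return {worker_id: rng.choice(OFFERS_CENTS) for worker_id in worker_ids}


workers = [f"worker_{i:03d}" for i in range(8)]  # hypothetical worker IDs
assignments = assign_offers(workers)

for worker_id, offer in assignments.items():
    print(f"{worker_id}: offered {offer} cents for a second transcription")
```

Independent draws like this do not guarantee equal group sizes; shuffling a balanced list of offers would be one alternative if equal cells are wanted.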