
1 Evaluation Eyal Ophir CS 376 4/28/09

2 Readings Methodology Matters (McGrath, 1994) Practical Guide to Controlled Experiments on the Web (Kohavi et al., 2007)

3 Methodology Matters

4 Methods for Research in the Behavioral and Social Sciences
  Different methods have strengths and weaknesses
  Tradeoff between:
    Generalizability
    Precision
    Realism
  Credibility requires consistency, convergence across methods

5 Study Design
  Find baserates, correlations, or differences
  Randomization of selection, assignment to conditions
  Statistical significance
  Validity (internal, statistical, construct, external)

6 Measures
  Self report
  Trace measures
  Observation (by a visible or hidden observer)
  Archival records (public or private)

7 Manipulation
  Selection
  Direct intervention
  Induction (indirect intervention: confederates, deception)

8 Case Study: Multitasking UI
  Users play two simultaneous instantiations of a game
  Does making the two instantiations visually different make it easier to switch back and forth?

9 Case Study

10

11
  Tradeoffs: Generalizability, Precision, Realism
  Design: baserates, correlations, differences
  Random selection, assignment
  Validity: internal, statistical, construct, external
  Measures: self-report, trace measures, observation, archival records
  Manipulation: selection, intervention, induction

12 General Question Has social psychology resisted formal theory, and if so, why?

13 Practical Guide to Controlled Experiments on the Web

14 Web Experiments OEC: Overall Evaluation Criterion

15 Web Experiments
  Hypothesis testing and sample size
    Confidence, power
    Reducing the standard error
      Sufficiently large sample size
      OEC with inherently low variability
      Reduce variability by excluding irrelevant cases
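
The link between sample size, variability, and detectable effect can be sketched in code. The following is a minimal illustration, assuming the rule-of-thumb n = 16σ²/Δ² users per variant (roughly 95% confidence and 80% power) discussed in the Kohavi et al. reading; the function name and the example numbers are hypothetical, not taken from the slides.

```python
import math


def users_per_variant(sigma: float, delta: float) -> int:
    """Approximate users needed in EACH variant.

    Assumes the rule of thumb n = 16 * sigma^2 / delta^2, where sigma is
    the standard deviation of the OEC and delta is the smallest difference
    in the OEC that the experiment should be able to detect.
    """
    if sigma <= 0 or delta <= 0:
        raise ValueError("sigma and delta must be positive")
    return math.ceil(16 * sigma ** 2 / delta ** 2)


# Hypothetical numbers, chosen only to show the scaling.
baseline = users_per_variant(sigma=2.0, delta=0.5)
smaller_effect = users_per_variant(sigma=2.0, delta=0.25)  # halve delta
less_noise = users_per_variant(sigma=1.0, delta=0.5)       # halve sigma

print(baseline, smaller_effect, less_noise)
```

Under this assumption, halving the detectable difference quadruples the required sample, while halving the standard deviation of the OEC (for example by excluding irrelevant cases) cuts it to a quarter.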

16 Web Experiments
  Extensions for Online Experiments
    Treatment ramp-up
    Automation
    Software Migration

17 Web Experiments
  Limitations of web experiments
    No explanation of mechanism
    Focus on short term effects
    Primacy/newness
    Must implement treatments

18 Web Experiments
  Implementation
    Randomization
      Pseudorandom with caching
      Hash and partition
    Assignment
      Traffic splitting
      Server-side
      Client-side
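
As a rough illustration of the hash-and-partition idea, the sketch below hashes a user ID together with an experiment name and maps the result to one of 100 buckets. The function name, the choice of SHA-256, the bucket count, and the example IDs are all assumptions made for this example, not details from the slides or the reading.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_percent: int = 50) -> str:
    """Deterministically assign a user to 'treatment' or 'control'.

    The same (user_id, experiment) pair always hashes to the same bucket,
    so no per-user assignment needs to be stored or cached.
    """
    if not 0 <= treatment_percent <= 100:
        raise ValueError("treatment_percent must be between 0 and 100")
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Partition the hash space into 100 buckets, numbered 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "treatment" if bucket < treatment_percent else "control"


# Hypothetical user IDs and experiment name.
users = [f"user-{i}" for i in range(10_000)]
in_treatment = sum(assign_variant(u, "two-game-ui") == "treatment" for u in users)
print(in_treatment / len(users))  # should be close to 0.5
```

Including the experiment name in the hashed key is one way to keep assignments in different experiments from lining up with each other, and raising `treatment_percent` step by step gives a simple form of treatment ramp-up in which users already in treatment stay there.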

19 Lessons Learned (i.e., tips for the researcher)
  Analysis
    Mine the data
    Time matters
    Multi-factor experiments

20 Lessons Learned
  Trust and Execution
    Run A/A tests (test your system)
    Ramp-up and abort
    Correct sample size
    Assign 50% to treatment
    Beware day of week effects
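
One way to see what an A/A test checks is to simulate it. The sketch below is illustrative only: it draws both groups from the same simulated distribution and counts how often a two-sample z-test (normal approximation) reports a significant difference at the 5% level. In a healthy system this should happen in roughly 5% of runs. The function names, the simulated metric, and all parameter values are assumptions made for this example.

```python
import math
import random
import statistics


def two_sided_p_value(a, b):
    """Two-sided p-value for a difference in means (two-sample z-test)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def aa_false_positive_rate(runs=500, users_per_group=200, alpha=0.05, seed=0):
    """Fraction of simulated A/A runs that look 'significant' at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        # Both groups get the SAME experience, so any "effect" is noise.
        a = [rng.gauss(10.0, 3.0) for _ in range(users_per_group)]
        b = [rng.gauss(10.0, 3.0) for _ in range(users_per_group)]
        if two_sided_p_value(a, b) < alpha:
            hits += 1
    return hits / runs


print(aa_false_positive_rate())  # expected to be near 0.05
```

If a real A/A test flags differences far more often than the chosen significance level, that points to a problem in assignment, logging, or analysis rather than in any treatment.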

21 Lessons Learned
  Culture and Business
    Agree on OEC upfront
    Beware “harmless” features
    Weigh performance vs. maintenance cost
    Data-driven (vs. opinion-driven) culture

22 Extended Case Study
  Assume the game UI from the first case study was an actual gaming site
  The website is interested in promoting multiple simultaneous games between users, but users complain that it’s difficult to manage multiple games
  Design a web-based study informed by the reading to test the new design

23 Case Study
  OEC
  Sample size, reducing error
  Ramp-up, automation
  Mechanism explanation, short vs. long-term effects, primacy/newness
  Randomization/assignment
  Mine the data, multi-factor experiments
  A/A tests, sample size, day of week effects

24 Data-Oriented Culture
  Pros? Cons?
  How can we best use user tests to inform design and innovation?
  Trade-offs of experimentation vs. intuition
  Why the OEC? What are good measures for non-commerce sites?
  Do online tests maximize all of McGrath’s parameters?

