Internet Enabled Human Computation CSE 454 Daniel Weld.


1 Internet Enabled Human Computation CSE 454 Daniel Weld

2 To do
  Challenge: mechanisms for deterring vandals
    Reputation
    Gold standard answers
    Randomized redundancy
  Balloon challenge
  More on Foldit
  Game design, plateaus & levels
  Aardvark & Quora

3 Crowdsourcing: “a neologistic compound of Crowd and Outsourcing for the act of taking tasks traditionally performed by an employee or contractor, and outsourcing them to a group of people or community, through an "open call" to a large group of people (a crowd) asking for contributions” [Wikipedia]

4 The Mechanical Turk: built in 1770 by Wolfgang von Kempelen

5 56/17/2015

6 Powerset

7 Your sentence is: The term silver dollar is often used for any large white metal coin issued by the United States with a face value of one dollar; although purists insist that a dollar is not silver unless it contains some of that metal. Enter one term per box. $0.05

8 Fast & Cheap, but is it Good? [Snow et al. EMNLP-08]

9 How Cheap + Fast? In our experiment we ask for 10 annotations each of the full 30 word pairs, at an offered price of $0.02 for each set of 30 annotations (or, equivalently, at the rate of 1500 annotations per USD). The most surprising aspect of this study was the speed with which it was completed; the task of 300 annotations was completed by 10 annotators in less than 11 minutes … 1724 annotations / hour. [Snow et al. EMNLP-08]
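The figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide; the function names are made up for illustration.

```python
# Numbers quoted from Snow et al. (EMNLP-08) on the slide above.
PRICE_PER_SET_USD = 0.02        # offered price for one set of 30 annotations
ANNOTATIONS_PER_SET = 30
TOTAL_ANNOTATIONS = 300         # 10 annotations for each of 30 word pairs
REPORTED_RATE_PER_HOUR = 1724   # throughput reported on the slide


def annotations_per_dollar(price_per_set, set_size):
    """How many annotations one US dollar buys at the offered price."""
    return set_size / price_per_set


def total_cost(total, price_per_set, set_size):
    """Total payout in USD for `total` annotations bought in sets."""
    return total / set_size * price_per_set


def minutes_to_finish(total, rate_per_hour):
    """Minutes needed to collect `total` annotations at the given hourly rate."""
    return total / rate_per_hour * 60


if __name__ == "__main__":
    print(annotations_per_dollar(PRICE_PER_SET_USD, ANNOTATIONS_PER_SET))
    print(total_cost(TOTAL_ANNOTATIONS, PRICE_PER_SET_USD, ANNOTATIONS_PER_SET))
    print(minutes_to_finish(TOTAL_ANNOTATIONS, REPORTED_RATE_PER_HOUR))
```

At the offered price the whole task costs about 20 cents, and the reported hourly rate is consistent with finishing in "less than 11 minutes".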

10 Turker Demographics March, 2008 (Panos Ipeirotis)

11 Turker Demographics February, 2010 (Panos Ipeirotis)

12 Turker Demographics May, 2010 (Crowdflower) http://blog.crowdflower.com/2010/05/amazon-mechanical-turk-survey/

13 Complex Jobs: TurKit [Little et al. 09], CastingWords

14 TurKit [Little et al. 09]
  Determine a fixed allowance
    Money spent on a problem
  Each improvement iteration
    Ask two workers to vote
    A third is asked if the first two disagree
    Keep the artifact by majority vote
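The fixed workflow on this slide can be sketched as a loop. This is a minimal illustration, not TurKit's actual API: `improve` and `ask_vote` are hypothetical callbacks standing in for posting tasks to workers, the costs are placeholder parameters, and the stopping rule (stop when the allowance cannot cover a worst-case iteration) is an assumption.

```python
def compare(old, new, ask_vote):
    """Two workers vote; a third is asked only if they disagree.

    ask_vote(old, new) returns True when the worker prefers `new`.
    Returns (keep_new, votes_used).
    """
    first = ask_vote(old, new)
    second = ask_vote(old, new)
    if first == second:
        return first, 2
    # Disagreement: the third vote decides the majority.
    return ask_vote(old, new), 3


def iterate(artifact, improve, ask_vote, allowance, improve_cost, vote_cost):
    """Run improve-then-vote iterations until the allowance runs out."""
    spent = 0
    # Assumed stopping rule: one improvement plus three votes must be affordable.
    worst_case = improve_cost + 3 * vote_cost
    while spent + worst_case <= allowance:
        candidate = improve(artifact)
        keep_new, votes_used = compare(artifact, candidate, ask_vote)
        spent += improve_cost + votes_used * vote_cost
        if keep_new:
            artifact = candidate
    return artifact, spent
```

Note that the number of iterations depends only on the allowance and the costs, never on how good the artifact already is.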

15 Iterative Improvement ?

16 Iterative Improvement, Version 7: “A close-up photograph of the following items: A CASIO multi-function, solar-powered scientific calculator. A blue ball point pen with a blue rubber grip and the tip extended. British coins, two of £1 value, three of 20p value and one of 1p value. Seems to be a theme illustration for a brochure or document cover treating finance – probably personal finance.”

17 Limitation: Workflow is Fixed
  Number of iterations is determined
    By the allowance
    Not by the quality of the answers or the workers
  Number of votes / iteration is almost fixed
    Not based on the difficulty of the job

18 TurKontrol [Dai AAAI10]
  [Diagram components: Learner, Model, Planner; Problem, Solution; HITs, Answers]
  Input: a picture and an initial description
  Output: a high-quality description

19 TurKontrol Workflow
  [Flowchart nodes: Improvement needed? (Y/N); Generate improvement HIT; Generate ballot HIT; More voting needed? (Y/N)]
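One way to read the flowchart is as a loop with two decision points. The sketch below is an assumption about how the nodes connect, and every callback name is hypothetical; in TurKontrol the two decisions come from a decision-theoretic planner, which is replaced here by plain functions supplied by the caller.

```python
def run_workflow(artifact, improvement_needed, request_improvement,
                 request_ballot, more_voting_needed, pick_winner,
                 max_rounds=100):
    """Adaptive improve-and-vote loop with caller-supplied decisions."""
    for _ in range(max_rounds):
        # Decision 1: is another improvement worth requesting?
        if not improvement_needed(artifact):
            break
        candidate = request_improvement(artifact)   # improvement HIT
        ballots = [request_ballot(artifact, candidate)]  # ballot HIT
        # Decision 2: keep collecting ballots only while they are useful.
        while more_voting_needed(ballots):
            ballots.append(request_ballot(artifact, candidate))
        artifact = pick_winner(artifact, candidate, ballots)
    return artifact
```

Unlike the fixed workflow, both the number of improvement rounds and the number of ballots per round are decided at run time.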

20 Evaluation Measures
  Quality measure: quality improvement probability (QIP)
    An artifact has QIP q, where q = 1 - Pr(an average worker improves the artifact)
    Never exactly known
    Can be estimated by a random variable Q
  Utility function U(q)

21 Control Problem is a POMDP

22 Comparison with Fixed Workflows: Cost = (30,10); Allowance of TurKit = 400 [Figure: comparison chart]

23 How to Motivate People to Help? Money

24 DARPA Network Challenge: $40k prize; 10 moored weather balloons; 10am ET, Saturday 12/5/09

25 Winner: MIT Red Balloon Challenge Team, all 10 balloons, 8:52
  Also notable: Groundspeak Geocachers, 7 balloons, 6:02
  https://networkchallenge.darpa.mil/ProjectReport.pdf

26 Selected competitors
  The MIT Media Lab team (http://balloon.mit.edu/) was the winning team, correctly identifying the locations of all 10 balloons in 8 hrs and 52 min. The MIT Media Lab team was organized within Professor Alex “Sandy” Pentland’s Human Dynamics Laboratory. The team designed and launched a recursive incentive recruiting method that reached almost 5,400 individuals in approximately 36 hours. The ingenuity of the recruiting method was that the incentive to join the effort was transferred undiminished with each subsequent layer of network nodes. MIT also enjoyed name recognition and mass media coverage (CNN Headline News) on execution day that helped them become one of the preferred sources to receive balloon reports. MIT collected extensive network structure data during the Challenge and plans several scientific studies of human dynamics and social networks using data from the DNC.
  George Hotz: George Hotz learned about the Challenge the day before the balloon launch. He announced his personal effort and website (http://dudeitsaballoon.com/) in a Tweet an hour before the start of the DNC. Hotz has an existing Twitter network of almost 50,000 followers, due in no small part to his fame as a hacker (including the first untethering of the iPhone when he was 17 years old). With only an hour of preparation before the Challenge, Hotz was able to locate 8 balloons (4 from direct reports of his existing Twitter network, 4 through trades with other teams).
  The Groundspeak team (http://www.10balloonies.com/) mobilized their extensive, pre-existing network of active geocachers using email alerts one and two days prior to balloon launch. Groundspeak is the largest geocache coordinator with an estimated active network of premium users in the hundreds of thousands (plus several hundred thousand additional free content members). Groundspeak was able to use their member database to do very effective geographic targeting of reported balloon locations for verification.

27 Successful Tools
  Marketing + media broadcast strategies to get team members
  Recursive, incentivized recruiting of networks to build team
  Extraction of reported locations from open Internet sources (e.g. Twitter)
  Automated means of extracting data, e.g. Twitter crawler
  Deployment of automatic reporting capability, e.g. iPhone apps
  Dispatching team members as spotters to confirm
  Website design that motivates, encourages recruitment, or allows easy, secure reporting
  Search engine rank optimization of website

28 Recursive Incentivizing: The team designed and launched a recursive incentive recruiting method that reached almost 5,400 individuals in approximately 36 hours. The ingenuity of the recruiting method was that the incentive to join the effort was transferred undiminished with each subsequent layer of network nodes. MIT also enjoyed name recognition and mass media coverage (CNN Headline News) on execution day.
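The slide does not give the payout rule, so the numbers below are an assumption taken from public accounts of the MIT scheme: the person who finds a balloon gets a fixed reward, and each person up the invitation chain gets half of what the person they invited received. The function name and the example chain are hypothetical.

```python
def referral_payouts(chain, finder_reward=2000.0):
    """Split a per-balloon reward along an invitation chain.

    chain: names ordered from the balloon finder upward through the
    people who recruited them. The halving rule and the default reward
    are assumptions, not figures from the slides.
    """
    payouts = {}
    reward = finder_reward
    for person in chain:
        payouts[person] = reward
        reward /= 2  # each recruiter gets half of their recruit's share
    return payouts


if __name__ == "__main__":
    chain = ["dave", "carol", "bob", "alice"]  # dave spotted the balloon
    shares = referral_payouts(chain)
    print(shares)
    print(sum(shares.values()))
```

Because the shares form a geometric series, the total paid per balloon stays below twice the finder's reward however long the chain grows, so every new recruit can be offered the same deal.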

29 How to Motivate People to Help? Money, Altruism, Esteem, Self-Interest, Fun

30 Altruism Self-Esteem

31

32 Collaborative Geomapping
  State Troopers' Reaction to Trapster
  Motivation & Vandalism Control
  Other Applications
    North Korea Uncovered (Google Earth)
    DARPA Network Challenge

33 Self-Interest

34 Hybrid Models

35 StackOverflow

36

37 StackOverflow: Optional Reputation
  Answer voted up      +10
  Question voted up    +5
  Answer accepted      +15 (+2 to acceptor)
  Post voted down      -2 (-1 to voter)
  Max 30 votes / user / day
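The point values on this slide amount to a small scoring table. The sketch below applies them to a list of events; the event names, the `start` parameter, and the helper functions are hypothetical, and only the numbers come from the slide.

```python
# Point values as listed on the slide; the event names are made up here.
REPUTATION_DELTAS = {
    "answer_upvoted": 10,
    "question_upvoted": 5,
    "answer_accepted": 15,      # credited to the answer's author
    "accepted_an_answer": 2,    # credited to the user who accepts
    "post_downvoted": -2,       # charged to the post's author
    "cast_downvote": -1,        # charged to the voter
}

MAX_VOTES_PER_DAY = 30


def reputation(events, start=0):
    """Sum the reputation changes for a sequence of event names."""
    return start + sum(REPUTATION_DELTAS[event] for event in events)


def can_vote(votes_cast_today):
    """True while the user is under the daily vote cap."""
    return votes_cast_today < MAX_VOTES_PER_DAY
```

For example, two upvoted answers, one accepted answer, and one downvoted post give `reputation([...])` of 10 + 10 + 15 - 2.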

38 Reputation → Privileges
  15    vote up
  15    flag offensive
  50    leave comments
  100   edit community wiki posts
  125   vote down (costs 1 rep)
  500   retag questions
  1000  create new tags
  2000  edit other people’s posts
  Etc.
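The threshold table can be read as a lookup from a reputation score to the set of unlocked privileges. A minimal sketch using the thresholds listed on the slide; the function names are hypothetical.

```python
# (threshold, privilege) pairs in the order listed on the slide.
PRIVILEGES = [
    (15, "vote up"),
    (15, "flag offensive"),
    (50, "leave comments"),
    (100, "edit community wiki posts"),
    (125, "vote down (costs 1 rep)"),
    (500, "retag questions"),
    (1000, "create new tags"),
    (2000, "edit other people's posts"),
]


def unlocked(rep):
    """Privileges available at the given reputation."""
    return [name for threshold, name in PRIVILEGES if rep >= threshold]


def next_privilege(rep):
    """The next (threshold, privilege) to earn, or None if all are unlocked."""
    for threshold, name in PRIVILEGES:
        if rep < threshold:
            return threshold, name
    return None
```

Spacing the thresholds widely gives users a visible next goal at every level, which is the motivational point of the slide.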

39 Motivating People: Money, Fun

40 IMAGE SEARCH ON THE WEB USES FILENAMES AND HTML TEXT Slides by Luis von Ahn

41 ACCESSIBILITY: LESS THAN 10% OF THE WEB IS ACCESSIBLE TO THE VISUALLY IMPAIRED. REASON: MOST IMAGES DON’T HAVE A CAPTION

42 LABELING IMAGES WITH WORDS: STILL A COMPLETELY OPEN PROBLEM [IMAGE LABELS: FACE, MAN, SUPER, SEXY]

43 DESIDERATA: A METHOD THAT CAN LABEL ALL IMAGES ON THE WEB, FAST AND CHEAP

44 THE ESP GAME
  TWO-PLAYER ONLINE GAME
  PARTNERS DON’T KNOW EACH OTHER AND CAN’T COMMUNICATE
  OBJECT OF THE GAME: TYPE THE SAME WORD
  THE ONLY THING IN COMMON IS AN IMAGE

45 THE ESP GAME
  PLAYER 1                      PLAYER 2
  GUESSING: CAR                 GUESSING: BOY
  GUESSING: KID                 GUESSING: CAR
  GUESSING: HAT
  SUCCESS! YOU AGREE ON CAR     SUCCESS! YOU AGREE ON CAR
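The agreement rule of the game can be sketched in a few lines: a round succeeds as soon as one player types a word the partner has already typed. This is an illustration only; the function name, the input format, and the case-insensitive comparison are assumptions, and the guess order in the example is one plausible reading of the slide.

```python
def first_agreement(guesses):
    """Return the first word typed by both players, or None.

    guesses: iterable of (player, word) pairs in the order typed,
    where player is 1 or 2.
    """
    seen = {1: set(), 2: set()}
    for player, word in guesses:
        word = word.strip().lower()
        partner = 2 if player == 1 else 1
        if word in seen[partner]:
            return word          # both players have now typed this word
        seen[player].add(word)
    return None


if __name__ == "__main__":
    round_log = [(1, "CAR"), (2, "BOY"), (1, "KID"), (2, "CAR")]
    print(first_agreement(round_log))
```

Since the players cannot communicate, the image is the only shared signal, so an agreed word is likely to be a reasonable label for it.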

46 © 2004 Carnegie Mellon University, all rights reserved. Patent Pending.

47 THE ESP GAME IS FUN: 3.2 MILLION LABELS WITH 22,000 PLAYERS. MANY PEOPLE PLAY OVER 20 HOURS A WEEK

48 LABELING THE ENTIRE WEB: INDIVIDUAL GAMES IN YAHOO! AND MSN AVERAGE OVER 10,000 PLAYERS AT A TIME. 5000 PEOPLE PLAYING SIMULTANEOUSLY CAN LABEL ALL IMAGES ON GOOGLE IN 30 DAYS!

49 9 BILLION MAN-HOURS OF SOLITAIRE WERE PLAYED IN 2003
  EMPIRE STATE BUILDING: 7 MILLION MAN-HOURS (6.8 HOURS OF SOLITAIRE)
  PANAMA CANAL: 20 MILLION MAN-HOURS (LESS THAN A DAY OF SOLITAIRE)
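The comparisons on this slide follow from spreading the yearly solitaire total evenly over the hours of a year. A minimal check, assuming a 365-day year and uniform play (both assumptions, not stated on the slide); the function name is made up for illustration.

```python
SOLITAIRE_MAN_HOURS_2003 = 9e9   # figure from the slide
HOURS_PER_YEAR = 365 * 24        # assumes a 365-day year


def solitaire_hours_equivalent(project_man_hours):
    """Clock hours of worldwide solitaire that add up to the project's man-hours."""
    man_hours_per_clock_hour = SOLITAIRE_MAN_HOURS_2003 / HOURS_PER_YEAR
    return project_man_hours / man_hours_per_clock_hour


if __name__ == "__main__":
    print(solitaire_hours_equivalent(7e6))    # Empire State Building
    print(solitaire_hours_equivalent(20e6))   # Panama Canal
```

Under these assumptions the Empire State Building comes out at roughly 6.8 hours and the Panama Canal at under 24 hours, matching the slide.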

50 GWAP Problem?

51 PhotoCity: Reconstructing the World in 3D, Bringing Games with a Purpose Outdoors

52 PhotoCity Gameplay

53 30 Photo Seed with Holes

54 Mobile App

55

56 Hybrid Models Revisited: Effect of Pay on Job Completion

57 Hybrid Models Revisited

58

59 Hybrids: What else could you add to an MT task? Leaderboards, raffles, ????

60 Motivation: Money, Altruism, Esteem, Self-Interest, Fun

