1 Task and Workflow Design II KSE 801 Uichin Lee

2 Contents
– Turkomatic: divide-and-conquer strategy for performing more “challenging tasks” on M-Turk
– TurKontrol: decision-theoretic approach to workflow control (e.g., how many improve/vote tasks?)
– Turkalytics: monitoring workers’ behavior remotely

3 Turkomatic: Automatic Recursive Task and Workflow Design for Mechanical Turk CHI'11 WIP

4 Turkomatic
The Turkomatic interface accepts task requests written in natural language.
Subdivide phase:
– For each request, it posts a HIT to M-Turk, asking workers to break the task down into a set of logical subtasks
– Each subtask is then automatically reposted to M-Turk; a subtask can be broken down further
Merge phase:
– Once all subtasks are completed, HITs are posted asking workers to combine the subtask solutions into a coherent whole
The end result is then delivered to the requester.
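The subdivide and merge phases above amount to a recursion, which can be sketched roughly as follows. This is only an illustration, not Turkomatic's actual code: the `subdivide`, `solve`, and `merge` callbacks are hypothetical stand-ins for HITs answered by workers, and `max_depth` is an assumed safeguard that the slide does not mention.

```python
from typing import Callable, List

# Hypothetical stand-ins for HITs posted to M-Turk. In a real system each
# call would post a HIT and block until a worker submits an answer.
Subdivide = Callable[[str], List[str]]  # returns [] when the task needs no split
Solve = Callable[[str], str]
Merge = Callable[[str, List[str]], str]


def turkomatic(task: str, subdivide: Subdivide, solve: Solve, merge: Merge,
               max_depth: int = 3) -> str:
    """Recursively split a task, solve the leaves, and merge partial solutions."""
    # max_depth is an assumption of this sketch; it stops endless subdivision.
    subtasks = subdivide(task) if max_depth > 0 else []
    if not subtasks:
        return solve(task)
    solutions = [turkomatic(s, subdivide, solve, merge, max_depth - 1)
                 for s in subtasks]
    return merge(task, solutions)


# Toy "workers" used only to exercise the control flow.
def toy_subdivide(task: str) -> List[str]:
    parts = task.split(" and ")
    return parts if len(parts) > 1 else []


def toy_solve(task: str) -> str:
    return f"[{task}]"


def toy_merge(task: str, solutions: List[str]) -> str:
    return " + ".join(solutions)


result = turkomatic("write intro and write body and write conclusion",
                    toy_subdivide, toy_solve, toy_merge)
print(result)
```

With the toy callbacks, the request is split once into three leaves, each leaf is solved directly, and a single merge step combines the three answers.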

5 Subdivide Phase Decomposition of tasks, and the creation of solution elements

6 Divide and Merge

7

8 Evaluation
Tasks:
– Producing a written essay in response to a prompt: “please write a five-paragraph essay on the topic of your choice”
– Solving an example SAT test: “Please solve the 16-question SAT located at http://bit.ly/SATexam”
– Payment: $0.10 to $0.40 per HIT
Each “subdivide” or “merge” HIT received answers within 4 hours; solutions to the initial task were completed within 72 hours.
Essay: the final essay (about “university legacy admissions”) displayed a reasonably good understanding of the topic, yet the writing quality was mixed.
SAT: the task was divided into 12 subtasks (each containing 1-3 questions); the score was 12/17.

9 Decision-Theoretic Control of Crowd-Sourced Workflows Peng Dai, Mausam, Daniel S. Weld AAAI 2010

10 Motivation
The iterative workflow (i.e., improve and vote) used in TurKit leaves the following questions open:
– What is the optimal number of iterations?
– How many ballots (votes) should we use?
– How do answers change if the workers are more/less skilled?
Figure: iterative workflow

11 TurKontrol: Computation Model
Text α is improved to text α’ (after an improve task).
Given a pair (α, α’), a series of votes $b_k$ can be collected to judge which one is better.

12 TurKontrol: Computation Model
– Text α: quality density function $f_Q(q)$ (prior)
– A worker x takes an improvement job and submits α’
– Text α’ done by worker x: quality density function $f_{Q'|q,x}(q')$ (posterior)
– Quality density function of text α’: $f_{Q'}(q') = \int f_{Q'|q,x}(q')\, f_Q(q)\, dq$

13 TurKontrol: Computation Model
Voting:
– A series of n votes: $\vec{b} = b_1, b_2, \ldots, b_n$, where $b_i \in \{1, 0\}$
– Posterior densities after n votes: $f_{Q|\vec{b}}(q)$ and $f_{Q'|\vec{b}}(q')$
Difficulty:
– The closer the two results, the more difficult they are to judge
– $d(q, q') = 1 - |q - q'|^{M}$, where $M$ is a constant and $d \in [0, 1]$
Accuracy (of a worker x):
– $a_x(d) = \tfrac{1}{2}\left[1 + (1 - d)^{r}\right]$, where $r$ is a knob for controlling the accuracy distribution
– The i-th voter $x_i$ has accuracy $a_{x_i}(d)$, i.e., the probability that the vote $b_i$ is correct
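To get a feel for the difficulty and accuracy formulas on this slide, they can be evaluated directly. The sketch below assumes that qualities lie in [0, 1], that M and r act as exponents, and that M = 0.5 and r = 1.0; these default values are illustrative choices, not values taken from the slides.

```python
def difficulty(q: float, q_prime: float, M: float = 0.5) -> float:
    """d(q, q') = 1 - |q - q'|^M; equals 1 when the two texts are equally good."""
    return 1.0 - abs(q - q_prime) ** M


def accuracy(d: float, r: float = 1.0) -> float:
    """a_x(d) = 1/2 * [1 + (1 - d)^r]; equals 0.5 (a coin flip) when d = 1."""
    return 0.5 * (1.0 + (1.0 - d) ** r)


# Qualities are assumed to lie in [0, 1] so that d stays within [0, 1].
for q, q_prime in [(0.5, 0.5), (0.5, 0.6), (0.2, 0.9), (0.0, 1.0)]:
    d = difficulty(q, q_prime)
    print(f"q={q:.1f} q'={q_prime:.1f} difficulty={d:.3f} accuracy={accuracy(d):.3f}")
```

A larger r lowers the accuracy at every difficulty strictly between 0 and 1, so r models less skilled workers, while the endpoints stay fixed at a perfect vote (d = 0) and a coin flip (d = 1).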

14 TurKontrol: Computation Model
For a given pair (α, α’), the posterior densities of (Q, Q’) after the votes are $f_{Q|\vec{b}}(q)$ and $f_{Q'|\vec{b}}(q')$.
Since we do not know in advance which worker will vote, an average worker is used.
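One way to picture the posterior update after a vote is a Bayesian update on a discretized quality grid. The sketch below is a simplification under several assumptions that are not on the slides: it tracks a joint belief over (q, q’) rather than two separate densities, it assumes an average worker votes for the better text with probability a(d), and it uses illustrative values M = 0.5 and r = 1.0.

```python
from typing import List


def vote_likelihood(vote: int, q: float, q_prime: float,
                    M: float = 0.5, r: float = 1.0) -> float:
    """P(vote | q, q') for an average worker; vote == 1 means "α' is better".

    Assumption of this sketch: the worker picks the truly better text with
    probability a(d), where d is the difficulty of the pair.
    """
    d = 1.0 - abs(q - q_prime) ** M
    a = 0.5 * (1.0 + (1.0 - d) ** r)
    p_votes_new = a if q_prime > q else 1.0 - a
    return p_votes_new if vote == 1 else 1.0 - p_votes_new


def update_joint(joint: List[List[float]], grid: List[float],
                 vote: int) -> List[List[float]]:
    """One Bayesian update of joint[i][j], the belief that q=grid[i], q'=grid[j]."""
    post = [[joint[i][j] * vote_likelihood(vote, q, qp)
             for j, qp in enumerate(grid)]
            for i, q in enumerate(grid)]
    total = sum(map(sum, post))
    return [[p / total for p in row] for row in post]


grid = [i / 10 for i in range(11)]              # quality values 0.0 .. 1.0
n = len(grid)
uniform = [[1.0 / (n * n)] * n for _ in range(n)]  # hypothetical flat prior
posterior = update_joint(uniform, grid, vote=1)

p_new_better = sum(posterior[i][j]
                   for i in range(n) for j in range(n) if grid[j] > grid[i])
print(f"P(q' > q | one vote for α') = {p_new_better:.3f}")
```

Each additional vote would be folded in by calling `update_joint` again on the previous posterior, which is how a belief after b votes becomes the belief after b+1 votes.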

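15 TurKontrol: Computation Model
Figure: workflow diagram.
– Improve: α → α’ (cost: c_imp), with densities $f_Q(q)$ and $f_{Q'}(q')$
– Vote on (α, α’) (cost: c_b): $f_{Q|\vec{b}}(q)$, $f_{Q'|\vec{b}}(q')$ are updated to $f_{Q|\vec{b}+1}(q)$, $f_{Q'|\vec{b}+1}(q')$
– Utility function: plot of utility against quality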

16 TurKontrol: Computation Model
Utility estimation for a pair (α, α’), where U is the utility function, c_b the vote cost, and c_imp the improve cost:
– (1) utility of an improve task
– (2) utility of a vote task
Decision making:
– Three options: (a) vote, (b) improve, or (c) accept
– k-step lookahead: evaluate all sequences of k decisions, and find the sub-sequence with the highest utility

17 Numerical Results
– Convex utility function with max 1000
– Fixed cost (improve, vote) = (30, 10)
– Net utility: utility of the submitted artifact minus payment to workers
– TurKit: performs as many iterations as possible (max allowance 400)
– TurKontrol (2): lookahead of 2
– cf. accuracy of workers: $a_x(d) = \tfrac{1}{2}\left[1 + (1 - d)^{r}\right]$
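The net-utility bookkeeping can be sketched as below. The costs (30, 10) and the maximum of 1000 come from this slide; the exponential shape of `utility` is only an assumed example of a convex function with that maximum, and the iteration counts in the usage lines are made up.

```python
import math

C_IMPROVE, C_VOTE = 30, 10  # fixed costs (improve, vote) from the slide


def utility(q: float, u_max: float = 1000.0) -> float:
    """Hypothetical convex utility on q in [0, 1], with utility(1) == u_max."""
    return u_max * (math.exp(q) - 1.0) / (math.e - 1.0)


def net_utility(q: float, n_improve: int, n_votes: int) -> float:
    """Utility of the submitted artifact minus the payment to workers."""
    payment = n_improve * C_IMPROVE + n_votes * C_VOTE
    return utility(q) - payment


# Made-up scenarios: extra iterations only pay off if quality rises enough.
print(round(net_utility(0.80, n_improve=3, n_votes=6), 1))
print(round(net_utility(0.85, n_improve=8, n_votes=16), 1))
```

This is the trade-off the controller weighs: each improve or vote task costs a fixed amount, so it is worth issuing only when the expected gain in utility exceeds that cost.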

18 Turkalytics: Real-time Analytics for Human Computation Paul Heymann and Hector Garcia-Molina WWW'11

19 Basic Buyer human programming A human program generates forms; advertised through a marketplace. Workers look at posts, and then complete the forms for compensation.

20 Game Maker human programming The programmer writes a human program and a game. The game implements features to make it fun and difficult to cheat. The human program loads and dumps data from the game.

21 Human Processing programming

22 Human Processing
Task description:
– Input, output, web forms, human driver, other information
– A task (description) is instantiated as a human task instance
Human drivers: interact with workers
– Functions: initialization (forms, games), retrieving results
– The “human program” accesses workers via “human drivers”
Recruiters: post task instances into the marketplaces (by working with marketplace drivers)
– A marketplace driver provides an interface to a marketplace

23 Turkalytics
Challenge: collecting reliable data about the workers and the tasks they perform
Why?
– If a task is not being completed, is it because no workers are seeing it? Is it because the task is currently being offered at too low a price?
– How does the task completion time break down?
– Do workers spend more time previewing tasks or doing them?
– Do they take long breaks?
– Which are the more “reliable” workers?

24 Interaction Model Search-Preview-Accept (SPA) model

25 Interaction Model
Search-Continue-RapidAccept-Accept-Preview (SCRAP) model
– Continue: continue completing a task that was accepted but not submitted
– RapidAccept: accept the next task in a HITGroup without previewing it

26 Turkalytics Data Models

27 Turkalytics Architecture
Figure: client-side JavaScript (ta.js) in each worker's browser sends log messages (JSON) via Ajax POST to the Log Server; the Log Server passes the log messages (JSON) on to the Analysis Server.

28 Implementation: client-side JavaScript
The requester embeds the Turkalytics script (ta.js) into a HIT when designing it:
– Monitoring: detect relevant worker data and actions
– Sending: log events by making image requests to the log server (Ajax: POST)

29 Implementation: ta.js (client-side JavaScript)
ta.js’s monitoring activities:
– Client information: What is the worker’s screen resolution? What plugins are supported? Can ta.js set cookies?
– DOM events: over the course of a page view, the browser emits various events (e.g., load, submit, beforeunload, and unload)
– Activity: listens on a second-by-second basis for mousemove, scroll, and keydown events to determine whether the worker is active or inactive
– Form contents: examines the forms on the page and their contents; logs initial form contents, incremental updates, and final state
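The second-by-second activity check can be illustrated on the analysis side. The sketch below assumes that a second counts as "active" when at least one mousemove, scroll, or keydown event falls inside it; that rule and the timestamps are illustrative assumptions, not ta.js's actual logic.

```python
from typing import Iterable


def active_seconds(event_times: Iterable[float], start: float, end: float) -> int:
    """Count whole seconds in [start, end) that contain at least one input event."""
    # Each event is mapped to the index of the one-second bucket it falls in.
    buckets = {int(t - start) for t in event_times if start <= t < end}
    return len(buckets)


# Hypothetical timestamps (in seconds) of mousemove/scroll/keydown events
# recorded during a 10-second page view.
events = [0.2, 0.9, 1.5, 7.1, 7.8, 9.0]
total = 10
active = active_seconds(events, start=0, end=total)
print(f"active {active}s of {total}s")
```

Comparing active seconds with total seconds per page view is what separates a worker who is busy on the task from one who left the tab open.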

30 Implementation: log/analysis
Log Server:
– Simple web app built on Google’s App Engine
– Receives logging events from clients running ta.js and saves them to a data store, along with IP address, user agent, referer, etc.
Analysis Server:
– Periodically polls the log server to download any new events that have been received
– Each event is inserted into the DB, taking the following into account:
  – Time constraints: how soon data must be available to the analysis server
  – Dependencies: events may depend on one another
  – Incomplete input: not all events may have been received yet
  – Unknown input: unexpected input may be received
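The insertion concerns listed for the analysis server (dependencies, incomplete input, unknown input) can be sketched with a small in-memory store. Everything here is hypothetical: the event fields `id`, `type`, and `depends_on`, the set of known types, and the class itself are invented for illustration and do not reflect the Turkalytics schema.

```python
from typing import Dict, List

# Hypothetical event types; the real system's vocabulary is not given here.
KNOWN_TYPES = {"page_load", "activity", "form_update", "submit"}


class EventStore:
    """In-memory stand-in for the analysis server's database."""

    def __init__(self) -> None:
        self.rows: List[dict] = []               # events inserted so far
        self.seen_ids: set = set()
        self.pending: Dict[int, List[dict]] = {}  # parent id -> waiting events
        self.unknown: List[dict] = []

    def ingest(self, event: dict) -> None:
        if event["id"] in self.seen_ids:
            return  # repeated polling may deliver the same event twice
        if event.get("type") not in KNOWN_TYPES:
            self.unknown.append(event)  # keep unexpected input for inspection
            return
        parent = event.get("depends_on")
        if parent is not None and parent not in self.seen_ids:
            # Incomplete input: hold the event until its dependency arrives.
            self.pending.setdefault(parent, []).append(event)
            return
        self.seen_ids.add(event["id"])
        self.rows.append(event)
        for waiting in self.pending.pop(event["id"], []):
            self.ingest(waiting)


store = EventStore()
store.ingest({"id": 2, "type": "submit", "depends_on": 1})  # arrives early
store.ingest({"id": 1, "type": "page_load"})
store.ingest({"id": 3, "type": "mystery"})
print([e["id"] for e in store.rows], len(store.unknown))
```

Deferring dependent events rather than rejecting them lets the server tolerate out-of-order delivery, at the cost of holding some events in memory until their parents show up.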

31 Implementation: analysis
Figure: example log message, annotated with the type of data (event) being sent, the actual data for that type (detailed info about the task), and the session ID.

32 Experiments
Tasks:
– Named Entity Recognition (NER): This task, posted in groups of 200 by a researcher in Natural Language Processing, asks workers to label words in a Wikipedia article if they correspond to people, organizations, locations, or demonyms. (2,000 HITs, 1 HIT Type, more than 500 workers.)
– Turker Count (TC): This task, posted once a week by a professor of business at U.C. Berkeley, asks workers to push a button, and is designed just to gauge how many workers are present in the marketplace. (2 HITs, 1 HIT Type, more than 1,000 workers each.)
– Create Diagram (CD): This task, posted by the authors, asked workers to draw diagrams for this paper based on hand-drawn sketches.

33 Experiments: origin of workers GeoLite City DB from MaxMind to geolocate all remote users by IP address

34 Experiments: worker characteristics

35 Experiments: states/actions RapidAccept is quite popular (Continue is rare)

36 Experiments: # previews
Artificial recency for NER/CD (the tasks are kept near the top of the list): NER and CD exhibit a less severe drop in previews than TC.

37 Experiments: activity vs. delay Average active and total seconds for each worker who completed the NER task (correlation 0.88)
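A correlation such as the one reported here is a Pearson coefficient over per-worker (active, total) pairs. The sketch below shows the computation; the numbers are made up for illustration and are not the experiment's data.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up per-worker averages in seconds, NOT the values from the slides.
active = [40.0, 55.0, 70.0, 90.0]
total = [60.0, 80.0, 150.0, 130.0]
r = pearson(active, total)
print(f"correlation = {r:.2f}")
```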

38 Discussion
– Multi-tasking users? Activity vs. working time
– Privacy?
  – We can collect as much as we can...
  – How about Google Analytics? Any web page that we visit can collect such information...
– False data injection?
– How can we better utilize the dataset?
  – Re-designing existing tasks, pricing, etc. (or mining user behavior?)

39 Summary
– Turkomatic: divide-and-conquer strategy for performing more “challenging tasks” on M-Turk
– TurKontrol: decision-theoretic approach to workflow control (e.g., how many improve/vote tasks?)
– Turkalytics: monitoring workers’ behavior remotely

