Published by Antony Embry. Modified over 2 years ago.

1
David Karger Sewoong Oh Devavrat Shah MIT + UIUC

2
o A patient is asked: rate your pain on a scale of 1-10
o Medical student gets answer: 5
o Intern gets answer: 8
o Fellow gets answer: 4.5
o Doctor gets answer: 6
o So what is the “right” amount of pain?
o Crowd-sourcing
  o Pain of patient = task
  o Answer of patient = completion of task by a worker

6
o Goal: reliably estimate the task values at minimal cost
o Key operational questions:
  o Task assignment
  o Inferring the “answers”

7
o N tasks
  o Denote by t_1, t_2, …, t_N the “true” values, each in {1, …, K}
o M workers
  o Denote by w_1, w_2, …, w_M the workers, each with a “confusion” matrix
  o Worker j: confusion matrix P^j = [P^j_{kl}]
  o Worker j answers l for a task with true value k with probability P^j_{kl}
o Binary symmetric case
  o K = 2: each task takes value +1 or -1
  o Worker j gives the correct answer with probability p_j
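The binary symmetric case above can be simulated in a few lines. This is a minimal sketch, not code from the talk: the names `simulate_answers`, `true_values`, `reliabilities` and `edges` are hypothetical, and the answer of worker j on task i is written A[i, j].

```python
import random

def simulate_answers(true_values, reliabilities, edges, seed=0):
    """Binary symmetric model: A[i, j] = t_i with probability p_j, else -t_i."""
    rng = random.Random(seed)
    answers = {}
    for i, j in edges:
        # rng.random() is uniform on [0, 1), so this is True with probability p_j.
        correct = rng.random() < reliabilities[j]
        answers[(i, j)] = true_values[i] if correct else -true_values[i]
    return answers

# Toy instance (hypothetical values): worker 0 is always right, worker 1 always wrong.
tasks = [+1, -1, +1]
workers = [1.0, 0.0]
edges = [(i, j) for i in range(len(tasks)) for j in range(len(workers))]
answers = simulate_answers(tasks, workers, edges)
```

With these extreme reliabilities the output is deterministic: worker 0 reproduces every t_i and worker 1 flips every one of them.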

8
[Figure: bipartite graph connecting tasks t_1, t_2, …, t_N to workers w_1, w_2, …, w_M; the edge between task i and worker j carries the answer A_ij]
o Binary tasks:
o Worker reliability:
o Necessary assumption: we know

9
o Goal: given N tasks
  o Obtain the correct answer with probability at least 1-ε
o What is the minimal number of questions (edges) needed?
o How to assign them, and how to infer the task values?

10
o Task assignment graph
  o Random regular graph
  o Or, regular graph with large girth
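One common way to draw a random regular bipartite assignment is the configuration model: give every task l edge stubs and every worker r stubs, then pair them through a random permutation. The sketch below assumes that construction (the slide does not say which one is used); `random_regular_assignment` and the worker degree `r` are hypothetical names.

```python
import random
from collections import Counter

def random_regular_assignment(num_tasks, num_workers, l, r, seed=0):
    """Configuration-model sketch: each task gets l edges, each worker r edges."""
    if num_tasks * l != num_workers * r:
        raise ValueError("need num_tasks * l == num_workers * r")
    rng = random.Random(seed)
    task_stubs = [i for i in range(num_tasks) for _ in range(l)]
    worker_stubs = [j for j in range(num_workers) for _ in range(r)]
    rng.shuffle(worker_stubs)  # random matching of task stubs to worker stubs
    # Note: the same (task, worker) pair can appear twice; this sketch keeps such repeats.
    return list(zip(task_stubs, worker_stubs))

edges = random_regular_assignment(num_tasks=6, num_workers=4, l=2, r=3)
task_degree = Counter(i for i, _ in edges)
worker_degree = Counter(j for _, j in edges)
```

Every task ends up with exactly l questions and every worker with exactly r, whatever the seed. Repeated pairs become rare as the graph grows, and a practical version might resample when they occur.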

11
o Majority:
o Oracle:
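A minimal sketch of the two baselines, under stated assumptions: "Majority" is taken to be the sign of the summed answers for a task, and "Oracle" a vote that knows each p_j and weights worker j by log(p_j / (1 - p_j)), the usual likelihood-ratio weighting. The function names and the toy numbers are hypothetical.

```python
import math
from collections import defaultdict

def majority_vote(answers):
    """Sign of the unweighted sum of answers per task (ties go to +1)."""
    totals = defaultdict(float)
    for (i, _j), a in answers.items():
        totals[i] += a
    return {i: (1 if s >= 0 else -1) for i, s in totals.items()}

def oracle_vote(answers, reliabilities):
    """Weighted vote assuming every p_j is known and lies strictly in (0, 1)."""
    totals = defaultdict(float)
    for (i, j), a in answers.items():
        p = reliabilities[j]
        totals[i] += a * math.log(p / (1.0 - p))
    return {i: (1 if s >= 0 else -1) for i, s in totals.items()}

# One task with true value +1: a reliable worker says +1, two poor workers say -1.
reliabilities = [0.9, 0.4, 0.4]
answers = {(0, 0): +1, (0, 1): -1, (0, 2): -1}
majority = majority_vote(answers)
oracle = oracle_vote(answers, reliabilities)
```

Here the majority follows the two poor workers and returns -1, while the weighted vote trusts the reliable worker (and discounts workers with p_j < 1/2 by giving them negative weight) and returns +1.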

12
o Majority:
o Oracle:
o Our Approach:

13
o Iteratively learn
o Message-passing
  o O(# edges) operations
o Approximation of Maximum Likelihood
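The slide does not spell out the update rules, so the following is only a sketch of one message-passing scheme in this spirit. It assumes task-to-worker messages x and worker-to-task messages y, each computed as a sum over the other neighbours, with y started near 1; the per-iteration rescaling is an added assumption to keep numbers bounded and does not change any sign. Each iteration touches every edge a constant number of times.

```python
import random
from collections import defaultdict

def iterative_inference(answers, num_iters=10, init_std=1.0, seed=0):
    """answers maps (task i, worker j) -> +1 or -1. Returns {task: +1 or -1}."""
    rng = random.Random(seed)
    edges = list(answers)
    # Worker-to-task message on each edge, started near 1 (assumed initialisation).
    y = {e: 1.0 + init_std * rng.gauss(0.0, 1.0) for e in edges}
    for _ in range(num_iters):
        # Task update: x on edge (i, j) sums A * y over the task's other workers.
        task_total = defaultdict(float)
        for (i, j) in edges:
            task_total[i] += answers[(i, j)] * y[(i, j)]
        x = {(i, j): task_total[i] - answers[(i, j)] * y[(i, j)] for (i, j) in edges}
        # Worker update: y on edge (i, j) sums A * x over the worker's other tasks.
        worker_total = defaultdict(float)
        for (i, j) in edges:
            worker_total[j] += answers[(i, j)] * x[(i, j)]
        y = {(i, j): worker_total[j] - answers[(i, j)] * x[(i, j)] for (i, j) in edges}
        # Rescale so the largest message has magnitude 1 (assumption, sign-preserving).
        scale = max(abs(v) for v in y.values()) or 1.0
        y = {e: v / scale for e, v in y.items()}
    score = defaultdict(float)
    for (i, j) in edges:
        score[i] += answers[(i, j)] * y[(i, j)]
    return {i: (1 if s >= 0 else -1) for i, s in score.items()}

# Toy instance (hypothetical): workers 0 and 1 always right, worker 2 always wrong.
truth = [1, -1, 1]
signs = [1, 1, -1]
answers = {(i, j): truth[i] * signs[j] for i in range(3) for j in range(3)}
estimates = iterative_inference(answers, num_iters=5, init_std=0.0)
```

Setting `init_std=0.0` starts every message at exactly 1, which makes the toy run deterministic. In this instance the messages from worker 2 turn negative, so that worker's flipped answers end up counting in favour of the true values.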

14
o Theorem (Karger-Oh-Shah).
  o Let n tasks be assigned to n workers as per an (l, l) random regular graph
  o Let ql > √2, where q is the crowd quality
  o Then, for all n large enough (i.e. n = Ω(l^{O(log(1/q))} e^{lq})), after O(log(1/q)) iterations of the algorithm

15
o To achieve target P_error ≤ ε, we need
  o Per-task budget l = Θ((1/q) log(1/ε))
  o And this is minimax optimal
o Under majority voting (with any graph choice)
  o Per-task budget required is l = Ω((1/q^2) log(1/ε))
o No significant gain from knowing side-information (golden questions, reputation, …)!
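To get a feel for the gap between the two scalings, the sketch below evaluates both expressions with all hidden constants set to 1. That choice is an assumption made purely for illustration, so only the ratio between the two numbers is meaningful; `budget_scalings` is a hypothetical helper.

```python
import math

def budget_scalings(q, eps):
    """Return ((1/q) log(1/eps), (1/q^2) log(1/eps)), constants taken as 1."""
    log_term = math.log(1.0 / eps)
    return log_term / q, log_term / q ** 2

# Example values (hypothetical): crowd quality q = 0.1, target error eps = 0.01.
iterative_budget, majority_budget = budget_scalings(q=0.1, eps=0.01)
ratio = majority_budget / iterative_budget
```

The ratio equals 1/q whatever the value of ε, so the lower the crowd quality, the more redundancy majority voting needs relative to the optimal scaling.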

16
Theorem (Karger-Oh-Shah). Given any adaptive algorithm, let Δ be the average number of workers required per task to achieve the desired P_error ≤ ε. Then there exists {p_j} with quality q so that
o Gain through adaptivity is limited

17
Theorem (Karger-Oh-Shah). To achieve reliability 1-ε, per-task redundancy scales as (K/q)(log(1/ε) + log K).
o Through reducing the K-ary problem to K binary problems (and dealing with a few asymmetries)

18
o Learning similarities
o Recommendations
o Searching, …

22
o Crowd-sourcing
  o Regular graph + message passing
  o Useful for designing surveys/taking polls
o Algorithmically
  o Iterative algorithm is like power-iteration
o Beyond stand-alone tasks
  o Learning global structure, e.g. ranking
