Trust-Aware Optimal Crowdsourcing With Budget Constraint Xiangyang Liu 1, He He 2, and John S. Baras 1 1 Institute for Systems Research and Department.

Presentation transcript:

Trust-Aware Optimal Crowdsourcing With Budget Constraint
Xiangyang Liu (1), He He (2), and John S. Baras (1)
(1) Institute for Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, MD
(2) Department of Computer Science, University of Maryland, College Park, MD

Motivation
The requester has a budget constraint. Workers on the Amazon Mechanical Turk (AMT) platform vary in reliability, and some are even malicious. Each task incurs a cost: difficult tasks are more expensive and easy tasks are cheaper. Workers with higher reliability expect higher pay, while workers with lower reliability are cheaper to recruit. Goal: optimally assign tasks to workers with varying trust and reliability under a budget constraint.

Problem Setting
[Diagram: a crowdsourcing assignment engine distributes tasks to Amazon Turkers, who range from malicious workers to more reliable workers to pure experts. Components: trust evaluation and true label inference. Objective: minimize estimation error.]

Problem Formulation
[Diagram labels: crowdsourcing assignment engine; task distributed to Turkers.] Objective: minimize estimation error. […] depends on the choice of estimation algorithm. […] is the truth value for question i. w are the estimated trust values of the workers, given by an independent component introduced in the previous subproblem. They are assumed to be fixed and serve as input to the assignment engine.
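The slide's symbols did not survive in this transcript, so the following is only one plausible way to write the assignment problem described above. All notation is assumed for illustration and is not taken from the slides: n_{ij} is the number of workers from crowd j assigned to question i, c_{ij} is the cost of one such answer, w is the vector of fixed trust estimates, y_i is the true label of question i, \hat{y}_i is its estimate, N and M are the numbers of questions and crowds, and B is the budget.

```latex
\min_{n_{ij} \in \mathbb{Z}_{\ge 0}} \;
  \frac{1}{N} \sum_{i=1}^{N} \Pr\bigl(\hat{y}_i(n, w) \neq y_i\bigr)
\quad \text{subject to} \quad
  \sum_{i=1}^{N} \sum_{j=1}^{M} c_{ij}\, n_{ij} \le B
```

The integer constraint on n_{ij} is what makes the exact problem hard and motivates a relaxation.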

Trust-Aware Budget Allocation
Applying the Probably Approximately Correct (PAC) framework from learning theory, we relax the previous nondeterministic optimization problem into a convex optimization problem: […] Proof: Let the right-hand side equal […]

Trust-Aware Budget Allocation
Proof continued: With probability […], the following holds: […] We express […] as: […] If […], question i is always estimated correctly. Otherwise we get the wrong answer with probability […]

Trust-Aware Budget Allocation
Proof continued: Therefore, we obtain the upper bound on the error rate, which we are going to minimize: […] The last inequality holds since the w's take values in [0,1]. We have thus relaxed the optimization problem to minimizing the new upper bound: […]

Analytical Solution
Applying the Probably Approximately Correct (PAC) framework from learning theory, we get a relaxed problem: […] Intuition behind the solution: when the budget is not sufficient, assign it to the most efficient workers. The most efficient worker is the one with the highest ratio of squared reliability to cost.
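The intuition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes one trust value and one per-answer cost per crowd, the same cost for every question, and an even split of the budget across questions. The function names and the rounding down to whole workers are assumptions made for the example.

```python
from math import floor

def most_efficient_crowd(trust, cost):
    """Index of the crowd with the highest squared-trust-to-cost ratio."""
    ratios = [w * w / c for w, c in zip(trust, cost)]
    return max(range(len(ratios)), key=ratios.__getitem__)

def allocate_to_most_efficient(trust, cost, budget, num_questions):
    """Workers per question from each crowd when all budget goes to one crowd.

    Assumption: the budget is split evenly across questions and the number
    of workers is rounded down to a whole number.
    """
    best = most_efficient_crowd(trust, cost)
    per_question_budget = budget / num_questions
    allocation = [0] * len(trust)
    allocation[best] = floor(per_question_budget / cost[best])
    return allocation

trust = [0.9, 0.6, 0.3]   # hypothetical crowd trust values
cost = [4.0, 1.0, 0.5]    # hypothetical cost per answer
print(allocate_to_most_efficient(trust, cost, budget=100, num_questions=10))
```

With these hypothetical numbers the middle crowd wins, because its ratio 0.36 / 1.0 exceeds both 0.81 / 4.0 and 0.09 / 0.5, even though it is neither the most trusted nor the cheapest.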

Trust-Aware Budget Allocation With Penalty
Intuition: when the budget is high, we want to allocate budget to expensive (more trustworthy) workers as well, instead of only to the most efficient workers. The penalty term pushes the solution toward a uniform strategy, which serves exactly this purpose.

Theoretical Guarantees
We provide an upper bound on the error probability of the trust-aware budget allocation scheme given budget B, i.e., […] This error bound decreases exponentially with the budget B, decreases when the cost is lower, and decreases when workers are more reliable.

Theoretical Guarantees: Proof
Assuming weighted majority vote, the labels contributed by workers are aggregated by […] The Hoeffding concentration bound gives us: […] Plugging in the optimal solution, we obtain the bound directly.
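The aggregation formula is missing from this transcript, so here is a minimal sketch of the standard form of weighted majority vote. It assumes binary labels coded as -1 and +1 and uses each worker's trust value directly as the weight. The function name and the tie-breaking rule (ties go to +1) are assumptions for illustration.

```python
def weighted_majority_vote(labels, weights):
    """Aggregate binary labels in {-1, +1} by the sign of the weighted sum.

    Assumption: ties (a weighted sum of exactly zero) resolve to +1.
    """
    score = sum(w * label for w, label in zip(weights, labels))
    return 1 if score >= 0 else -1

# One trusted worker outvotes two less trusted ones.
print(weighted_majority_vote([1, -1, -1], [0.9, 0.3, 0.2]))
```

Under unweighted majority vote the same three labels would give -1, which is why the trust estimates matter for the final answer.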

Theoretical Guarantees
If […], then with probability at least […], the total error probability satisfies: […]

Experiment Benchmarks
UA: the algorithm allocates the same number of people from each available crowd to answer a question. If the budget is not used up, it randomly chooses an expert from the set of crowds for each question.
CQSA: for each question, the algorithm only chooses people from the most trustworthy crowd. If the budget is not consumed, it iterates over the question set again and randomly chooses an expert from the set of crowds for each question.
CA: the algorithm only chooses the cheapest crowd (the least trustworthy crowd) for questions according to […]
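The three baselines can be approximated as follows. This is a rough sketch under stated assumptions, not the authors' code: it uses one cost per crowd that is the same for every question, it covers only the deterministic first phase of each baseline, and it omits the random top-up step that UA and CQSA apply to leftover budget. All function names are hypothetical.

```python
from math import floor

def uniform_allocation(cost, budget, num_questions):
    """UA, first phase: the same number of workers from every crowd.

    Returns the per-question allocation and the unspent budget.
    """
    cost_per_round = num_questions * sum(cost)
    k = floor(budget / cost_per_round)
    return [k] * len(cost), budget - k * cost_per_round

def single_crowd_allocation(crowd, cost, budget, num_questions):
    """Spend the whole budget on one crowd, rounded down per question."""
    allocation = [0] * len(cost)
    allocation[crowd] = floor(budget / (num_questions * cost[crowd]))
    return allocation

def cqsa_allocation(trust, cost, budget, num_questions):
    """CQSA, first phase: only the most trustworthy crowd."""
    best = max(range(len(trust)), key=trust.__getitem__)
    return single_crowd_allocation(best, cost, budget, num_questions)

def ca_allocation(cost, budget, num_questions):
    """CA: only the cheapest crowd."""
    cheapest = min(range(len(cost)), key=cost.__getitem__)
    return single_crowd_allocation(cheapest, cost, budget, num_questions)

trust = [0.9, 0.6, 0.3]   # hypothetical crowd trust values
cost = [4.0, 1.0, 0.5]    # hypothetical cost per answer
print(uniform_allocation(cost, 120, 10))
print(cqsa_allocation(trust, cost, 120, 10))
print(ca_allocation(cost, 120, 10))
```

Each function returns the number of workers per question from each crowd, so the outputs can be compared directly against a trust-aware allocation for the same budget.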

Experiment Results
TAA: trust-aware allocation. TAAP: trust-aware allocation with penalty. TAAP performs best among all benchmarks across the whole range of budgets. When the budget is small (<200), TAA and TAAP improve on CQSA and UA by up to 30%. TAA performs poorly when the budget is high, because of the floor function applied during assignment and the sparsity of the optimal solution to the relaxed problem (only workers from the most efficient group are chosen). TAAP fixes this.

Experiment Results
In this experiment, we do not assume that the trust values of workers can be perfectly estimated. We add Gaussian noise to the true trust values of workers and run experiments with varying noise levels (increasing means of the Gaussian noise). TAAP is affected to some extent when the noise is high. However, when the budget is high, the allocation scheme is very robust against all noise levels.
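The perturbation described above could look like the sketch below. It is an illustration only: the slides do not give the noise parameters, and clipping the result back into [0, 1] is an assumption made here so that the noisy values remain valid trust values. The function name and the seed parameter are hypothetical.

```python
import random

def noisy_trust(trust, noise_mean, noise_std, seed=0):
    """Add Gaussian noise to each trust value and clip to [0, 1].

    Assumption: clipping is not described in the slides; it keeps the
    perturbed values in the range that trust values are stated to take.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    noisy = [w + rng.gauss(noise_mean, noise_std) for w in trust]
    return [min(1.0, max(0.0, w)) for w in noisy]

true_trust = [0.9, 0.6, 0.3]  # hypothetical true trust values
for mean in (0.0, 0.1, 0.2):  # increasing noise means, as in the experiment
    print(mean, noisy_trust(true_trust, mean, noise_std=0.05))
```

The noisy values would then be passed to the allocation step in place of the true ones, and the resulting error compared against the noise-free run.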

Conclusions
We formalize the problem of trust-aware task allocation in crowdsourcing and provide a principled way to solve it. We model workers' trustworthiness as reliability, and the cost depends on both the workers' trust and the questions' difficulty. Our method is flexible in that you can plug in a more complicated aggregation method than weighted majority vote. We provide a theoretical guarantee for our trust-aware allocation scheme.

Future Work
The theoretical guarantee for the trust-aware algorithm when the weights (trust values) cannot be perfectly estimated has not been addressed yet, and it would be interesting to investigate. At present the trust estimate is assumed to be fixed and given by another component. We will consider the case where trust is updated dynamically and crowdsourcing assignment is done online instead of offline.

Thank you