Systematization of Crowdsourcing for Data Annotation Aobo, Feb. 2010.

Presentation transcript:

Systematization of Crowdsourcing for Data Annotation Aobo, Feb. 2010

Outline  Overview  Related Work  Analysis and Classification  Recommendation  Future Work  Conclusions  References

Overview  Contribution  Provide a faceted analysis of existing crowdsourcing annotation applications.  Discuss recommendations on how practitioners can take advantage of crowdsourcing.  Discuss the potential opportunities in this area.  Definitions  Crowdsourcing  GWAP  Distributed Human-based Computation  AMT  HIT

Related Work  “A Taxonomy of Distributed Human Computation”  Authors: A. Quinn and B. Bederson  Year: 2009  Contribution  Divides DHC applications into seven genres.  Proposes six dimensions to help characterize the different approaches.  Proposes recommendations and future directions.

Related Work  “A Survey of Human Computation Systems”  Authors: Yuen, Chen and King  Year: 2009  Contribution  General survey of various human computation systems, each treated individually.  Compares GWAPs based on game structure, verification method, and game mechanism.  Presents performance issues of GWAPs.

Analysis and Classification  Dimensions

Analysis and Classification  GWAP  High score in: GUI design, Implementation cost, Annotation speed  Low score in: Annotation cost, Difficulty, Participation time, Domain coverage, Popularization  Medium score in: Annotation accuracy, Data size  NLP tasks: - Word Sense Disambiguation - Coreference Annotation
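The GWAP verification mechanism behind these scores (as popularized by the ESP game) can be illustrated as a simple output-agreement check: a label is accepted only when two independent players both produce it. This is an illustrative sketch, not code from the talk; the function and variable names are hypothetical:

```python
def output_agreement(labels_a, labels_b):
    """Keep only the labels that two independent players both produced.

    labels_a, labels_b: dicts mapping item id -> set of labels from one player.
    Returns: item id -> set of agreed (verified) labels.
    """
    agreed = {}
    for item in labels_a.keys() & labels_b.keys():
        common = labels_a[item] & labels_b[item]
        if common:
            agreed[item] = common
    return agreed

# Two players tag the same images independently; only matching tags survive.
player1 = {"img1": {"dog", "grass"}, "img2": {"car"}}
player2 = {"img1": {"dog", "sky"}, "img2": {"tree"}}
print(output_agreement(player1, player2))  # {'img1': {'dog'}}
```

Requiring independent agreement is what buys GWAPs their medium annotation accuracy despite anonymous, unpaid players.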

Analysis and Classification  AMT  High score in: Annotation cost  Low score in: GUI design, Implementation cost, Number of participants, Data size  Medium score in: Popularization, Difficulty, Domain coverage, Participation time, Annotation accuracy  NLP tasks: - Parsing - Part-of-Speech Tagging
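Because individual AMT judgments are only medium accuracy, the standard remedy (used, e.g., by Snow et al. in the references) is to buy several cheap redundant labels per item and aggregate them by majority vote. A minimal sketch under that assumption; the names are hypothetical:

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate redundant worker labels per item by majority vote.

    annotations: dict mapping item id -> list of labels from different workers.
    Returns: item id -> winning label (ties broken by first occurrence).
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

# Five non-expert judgments per sentence, aggregated to one label each.
hits = {
    "sentence-1": ["positive", "positive", "negative", "positive", "positive"],
    "sentence-2": ["negative", "negative", "positive", "negative", "negative"],
}
print(majority_vote(hits))  # {'sentence-1': 'positive', 'sentence-2': 'negative'}
```

With 5–10 redundant labels, aggregate accuracy can approach expert quality while keeping the per-item cost low.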

Analysis and Classification  Wisdom of Volunteers  High score in: Number of participants, Data size, Difficulty, Participation time  Low score in: GUI design, Fun  Medium score in: Implementation cost, Annotation accuracy  NLP tasks: - Paraphrasing - Machine Translation - Summarization
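For volunteer-produced annotations, a crude but common quality proxy is the observed agreement rate between two annotators who labeled the same items. A minimal illustrative sketch (not from the talk; more rigorous measures such as Cohen's kappa correct for chance agreement):

```python
def observed_agreement(labels1, labels2):
    """Fraction of items on which two annotators give the same label."""
    assert len(labels1) == len(labels2), "annotators must label the same items"
    matches = sum(a == b for a, b in zip(labels1, labels2))
    return matches / len(labels1)

# Two volunteers label the same five sentences as paraphrase / not-paraphrase.
v1 = ["para", "para", "not", "para", "not"]
v2 = ["para", "not", "not", "para", "not"]
print(observed_agreement(v1, v2))  # 0.8
```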

Recommendation  GWAP  Submit GWAP games to a popular game website that hosts and recommends new games to players  Build a uniform game-development platform  AMT  Make the tasks fun  Rank the employers by their contribution  Reward employers who provide original data to be annotated  Donate all or part of the proceeds to a charity  Wisdom of Volunteers  Rank the users by their contribution  Push the tasks to the public

Conclusions  Propose dimensions for analyzing existing crowdsourcing annotation applications.  Discuss recommendations for each crowdsourcing approach.  Discuss the potential opportunities in this area.

References 1. Alexander J. Quinn and Benjamin B. Bederson. 2009. A taxonomy of distributed human computation. 2. Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. 3. Rion Snow, Brendan O’Connor, Daniel Jurafsky, and Andrew Y. Ng. 2008. Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks. 4. Alexander Sorokin and David Forsyth. 2008. Utility data annotation with Amazon Mechanical Turk. 5. Luis von Ahn and Laura Dabbish. 2008a. Designing games with a purpose. Commun. ACM, 51(8):58–67. 6. Luis von Ahn and Laura Dabbish. 2008b. General techniques for designing games with a purpose. Commun. ACM, 51(8):58–67. 7. Man-Ching Yuen, Ling-Jyh Chen, and Irwin King. 2009. A survey of human computation systems. IEEE International Conference on Computational Science and Engineering, 4:723–728.