
1 Beyond datasets: Learning in a fully-labeled real world Thesis proposal Alexander Sorokin

2 Research projects

3

4 Thesis

5

6 Motivation

7 Task: Amazon Mechanical Turk. [Diagram] A requester posts a task to the broker (www.mturk.com) together with the pay, e.g. "Is this a dog? ( ) Yes ( ) No" — Task: Dog?, Pay: $0.01. A worker submits the answer ("Yes") and the $0.01 is paid out.
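As a rough sketch of how such a requester-broker-worker loop is wired up programmatically, here is a hypothetical $0.01 yes/no HIT posted through boto3's MTurk client. The endpoint, question XML, and parameters are illustrative assumptions, not the tooling described in the talk:

```python
# Sketch: post a $0.01 yes/no question to Mechanical Turk with boto3.
# Illustrates the requester -> broker -> worker flow from the slide.
import boto3

client = boto3.client(
    "mturk",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",  # sandbox
)

QUESTION_XML = """<?xml version="1.0" encoding="UTF-8"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionIdentifier>is_dog</QuestionIdentifier>
    <QuestionContent><Text>Is this a dog?</Text></QuestionContent>
    <AnswerSpecification>
      <SelectionAnswer>
        <Selections>
          <Selection><SelectionIdentifier>yes</SelectionIdentifier><Text>Yes</Text></Selection>
          <Selection><SelectionIdentifier>no</SelectionIdentifier><Text>No</Text></Selection>
        </Selections>
      </SelectionAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>"""

hit = client.create_hit(
    Title="Is this a dog?",
    Description="Answer a single yes/no question about an image.",
    Reward="0.01",                     # pay per assignment, in dollars
    MaxAssignments=3,                  # ask several workers for redundancy
    LifetimeInSeconds=24 * 3600,
    AssignmentDurationInSeconds=120,
    Question=QUESTION_XML,
)
print(hit["HIT"]["HITId"])
```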

8 Select examples Joint work with Tamara and Alex Berg http://vision.cs.uiuc.edu/annotation/data/simpleevaluation/html/horse.html

9 Click on landmarks $0.01 http://vision-app1.cs.uiuc.edu:8080/mt/results/people14-batch11/p7/

10 Outline something $0.01 http://vision.cs.uiuc.edu/annotation/results/production-3-2/results_page_013.html Data from Ramanan NIPS06

11 Mark object attributes $0.03

12 Teach a robot

13 How do we define the task?

14 Annotation specification

15 Annotation language

16 Ideal task properties

17

18 How good are the annotations?

Submission is | Volume | Action       | Redo
Empty         | 6%     | Reject       | yes
Clearly bad   | 2%     | Reject       | yes
Almost good   | 4%     | Accept (pay) | yes
Good          | 88%    | Accept (pay) | no

Task: label people, box + 14 points; volume: 3078 HITs

19 How do we make it better?

20 1. Average N annotations
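The slide does not fix the aggregation rule, so here is a minimal sketch of one reasonable reading for point landmarks: take a per-landmark median over the N workers, which shrugs off an occasional wild click better than a mean (the data layout is hypothetical):

```python
# Sketch: combine N workers' landmark clicks by a per-landmark median.
import numpy as np

def aggregate_landmarks(annotations):
    """annotations: shape (n_workers, n_landmarks, 2) of (x, y) clicks."""
    a = np.asarray(annotations, dtype=float)
    return np.median(a, axis=0)          # (n_landmarks, 2) consensus points

workers = [
    [[10, 12], [40, 41]],   # worker 1: two landmarks
    [[11, 13], [42, 40]],   # worker 2
    [[95, 90], [41, 39]],   # worker 3: first landmark is an outlier
]
print(aggregate_landmarks(workers))      # the outlier barely moves the median
```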

21 2. Require qualification. Please read the detailed instructions to learn how to perform the task, then confirm that you understand them by answering the following questions: Which of the following checkboxes are correct for this annotation?
No people (there are people in the image)
> 20 people (there are fewer than 20 people of appropriate size)
Small heads (there are unmarked small heads in the image)
Task: put a box around every head
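Mechanically, such a gate can be expressed as an auto-graded qualification test that workers must pass before they see the HITs. A sketch using boto3's MTurk API; the qualification name and the one-question test/answer-key XML are invented for illustration:

```python
# Sketch: require a graded qualification test before workers see the task.
import boto3

TEST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
  <Question>
    <QuestionIdentifier>q1</QuestionIdentifier>
    <QuestionContent><Text>Which checkbox is correct for the sample image?</Text></QuestionContent>
    <AnswerSpecification>
      <SelectionAnswer>
        <Selections>
          <Selection><SelectionIdentifier>small_heads</SelectionIdentifier><Text>Small heads</Text></Selection>
          <Selection><SelectionIdentifier>no_people</SelectionIdentifier><Text>No people</Text></Selection>
        </Selections>
      </SelectionAnswer>
    </AnswerSpecification>
  </Question>
</QuestionForm>"""

ANSWER_KEY_XML = """<?xml version="1.0" encoding="UTF-8"?>
<AnswerKey xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/AnswerKey.xsd">
  <Question>
    <QuestionIdentifier>q1</QuestionIdentifier>
    <AnswerOption>
      <SelectionIdentifier>small_heads</SelectionIdentifier>
      <AnswerScore>100</AnswerScore>
    </AnswerOption>
  </Question>
</AnswerKey>"""

client = boto3.client("mturk")
qual = client.create_qualification_type(
    Name="head-boxing-instructions",          # hypothetical name
    Description="Checks that the worker read the head-annotation instructions.",
    QualificationTypeStatus="Active",
    Test=TEST_XML,
    AnswerKey=ANSWER_KEY_XML,                 # graded automatically by MTurk
    TestDurationInSeconds=300,
)
print(qual["QualificationType"]["QualificationTypeId"])
```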

22 2. Require qualification

23 3. Use task pipeline
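A pipeline here means chaining tasks so that one stage's output becomes the next stage's input, e.g. one batch of workers draws boxes and a second batch verifies them. A minimal sketch, with every task-posting function left as a hypothetical callable:

```python
# Sketch: a two-stage annotation pipeline. Both task functions are
# hypothetical stand-ins for posting a HIT and collecting its results.
def run_pipeline(images, post_drawing_hit, post_verification_hit):
    dataset = {}
    for image in images:
        boxes = post_drawing_hit(image)                 # stage 1: draw boxes
        verdicts = post_verification_hit(image, boxes)  # stage 2: yes/no per box
        # only boxes that pass verification enter the final dataset
        dataset[image] = [b for b, ok in zip(boxes, verdicts) if ok]
    return dataset
```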

24 4. Do grading

25 Grade conflicts Total grades: 4410

26 5. Automatic grading

27 Learning to grade

Task     | Bottles | People | Hands | Large objects
Accuracy | 95.0%   | 83.8%  | 45.5% | 29.5%
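The general recipe behind numbers like these: featurize each submission, hand-grade a sample, and fit a binary accept/reject classifier. A sketch with invented placeholder features and labels, not the thesis' actual grader:

```python
# Sketch: learn an accept/reject grader from hand-graded submissions.
# Features per submission are placeholders, e.g. box count, mean box
# area, time spent, and the worker's historical accept rate.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))                    # stand-in feature matrix
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)    # stand-in accept/reject labels

grader = LogisticRegression()
print(cross_val_score(grader, X, y, cv=5).mean())  # held-out grading accuracy
```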

28 Quality control

29 Setting the pay
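One common way to set the pay, sketched under assumed numbers: pick a target hourly wage and divide by the median completion time measured in a pilot batch.

```python
# Sketch: back out a per-HIT price from a target hourly wage and the
# median completion time of a pilot batch (numbers illustrative).
def price_per_hit(target_hourly_wage, median_seconds_per_hit):
    return target_hourly_wage * median_seconds_per_hit / 3600.0

print(round(price_per_hit(2.00, 18), 3))  # 18 s/HIT at $2/h -> ~$0.01/HIT
```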

30 Annotation Method Comparison

Approach      | Cost | Scale | Setup effort | Collaborative | Quality | Directed | Central | Elastic to $
MTurk         | $    | +++   | *            | no            | +/+++   | Yes      | no      | +++++
GWAP          | +    | +++   | ***          | no            | +       | Yes      |         | +
LabelMe       | +    | +     |              | Yes           | ++      | no       | Yes     |
Image Parsing | $$   | ++    | **           | no            | ++++    | Yes      |         | +++
In house      | $$$  | +     | *            | no            | +++     | Yes      | no      | ++

31 Is it useful?

32 Publications

33 Thesis

34

35 Fully labeled world assumption Goal: learn to detect every object

36 Why is it important

37 Computer vision task

38 Challenges

39 Lighting conditions, background clutter: here, lighting and background are known
Within-class variability, viewpoint changes, internal deformations
100 000 categories; how many instances? Tens of billions in total, 10 000 locally
1000 examples per category; 1-10 labels per object
Single image vs. rich sensor data

40 PR2 Sensing capabilities

41 Autonomous data collection

42 Data labeling

43 Learning

44 Preliminary learning results UChicago-VOC2008-person

45 Expected outcome

46 Thesis

47 Detect-Sample-Label

48 Sampling based estimation

49 Standard deviation table

50 Estimating recall
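Slides 48-50 rest on a standard sampling argument: have humans label a random sample of size n and report the sample proportion p, whose standard deviation is sqrt(p(1-p)/n). A sketch under those assumptions; the graded sample here is simulated, and the same estimator covers recall by grading a random sample of labeled objects for whether the detector found them:

```python
# Sketch: estimate a detector's precision by hand-grading a random sample
# of its detections, with the binomial standard error sqrt(p*(1-p)/n).
# Recall works the same way over a sample of known labeled objects.
import math, random

def proportion_with_se(labels):
    n = len(labels)
    p = sum(labels) / n
    return p, math.sqrt(p * (1 - p) / n)

random.seed(0)
# Hypothetical grades: 1 = "correct detection", 0 = "wrong".
graded_sample = [1 if random.random() < 0.8 else 0 for _ in range(200)]
p, se = proportion_with_se(graded_sample)
print(f"precision ~= {p:.2f} +/- {2 * se:.2f} (95% interval, n=200)")
```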

51 Experimental results

52 What are the errors?

53 Timeline

54

55

56

57

58

59 Acknowledgments. Special thanks to: David Forsyth; Nicolas Loeff, Ali Farhadi, Du Tran, Ian Endres; Tamara Berg, Pushmeet Kohli; Dolores Labs (Lukas Biewald); Willow Garage (Gary Bradski, Alex Teichman, Daniel Munoz, …); all workers at Amazon Mechanical Turk. This work was supported in part by the National Science Foundation under IIS-0534837 and in part by the Office of Naval Research under N00014-01-1-0890 as part of the MURI program. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or the Office of Naval Research.

60 Thank you

61 What is an annotation task?

62 PR2 Platform: 2 laser scanners (fixed and tilting); 7 cameras (2 stereo pairs, 1 hi-res 5 Mpx, 2 in the arms); structured light; 16 cores, 48 GB RAM; 2 arms

63 What are datasets good for?
Training – the data is fully labeled
Evaluation:
  Tweaking the parameters – performance is computed automatically
  Comparing algorithms – “they run on the exact same data”

64 Why are datasets bad?
Data sampling and labeling bias
Small changes in performance are insignificant
Parameter tweaking doesn’t generalize
Overfitting to the datasets
Datasets should be discarded after performance is measured

65

