16-824: Learning-based Methods in Vision


16-824: Learning-based Methods in Vision. Instructors: Alexei (Alyosha) Efros, 225 Smith Hall; Leon Sigal, Disney Research Pittsburgh. Web Page:

Today: Introduction; Why This Course?; Administrative stuff; Overview of the course.

A bit about us:
Alexei (Alyosha) Efros: Ph.D. 2003 from UC Berkeley (signed by Arnie!); Postdoctoral Fellow, University of Oxford, ’03-’04. Research interests: Vision, Graphics, Data-driven “stuff”.
Leonid Sigal: Ph.D. 2007 from Brown University; Postdoctoral Fellow, University of Toronto, ’07-’09. Research interests: Vision, Graphics, Machine Learning.

Why this class? The Old Days™: 1. Graduate Computer Vision 2. Advanced Machine Perception

Why this class? The New and Improved Days: 1. Graduate Computer Vision 2. Advanced Machine Perception, which has since split into Physics-based Methods in Vision, Geometry-based Methods in Vision, and Learning-based Methods in Vision.

The Hip & Trendy Learning: Describing Visual Scenes using Transformed Dirichlet Processes. E. Sudderth, A. Torralba, W. Freeman, and A. Willsky. NIPS, Dec. 2005.

Learning as Last Resort

EXAMPLE: Recovering 3D geometry from a single 2D projection. An infinite number of possible solutions! (from [Sinha and Adelson 1993])
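To make the ambiguity concrete, here is a minimal worked example using the standard pinhole projection model (textbook notation, not taken from the slide): a scene point (X, Y, Z) seen by a camera with focal length f projects to the image point

\[ x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}. \]

Any scene point of the form (\lambda x / f, \lambda y / f, \lambda) with depth \lambda > 0 produces exactly the same image coordinates, so a single 2D projection is consistent with an entire ray of 3D interpretations; picking one requires extra assumptions, such as priors, learned statistics, or additional views.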

Learning-based Methods in Vision. This class is about trying to solve problems that do not have a solution! Don’t tell your mathematician friends! This will be done using Data: e.g., what happened before is likely to happen again. Google Intelligence (GI): the AI for the post-modern world! Note: this is not quite statistics. Why is this even worthwhile? Even a decade ago, at ICCV’99, Faugeras claimed it wasn’t!
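As a toy illustration of the “what happened before is likely to happen again” idea (not the specific method of any paper covered in the course), here is a minimal data-driven sketch in which “learning” amounts to storing past examples and reusing the label of the closest one; the feature vectors and scene labels below are made-up placeholders:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical database of previously seen images, each summarized by a feature
    # vector (in practice: GIST/HOG/etc. descriptors; here: random placeholders).
    rng = np.random.default_rng(0)
    database_features = rng.random((1000, 128))            # 1000 past images, 128-D features
    database_labels = ["street"] * 500 + ["forest"] * 500  # made-up scene labels

    # "Learning" here is just remembering the data and indexing it for fast lookup.
    index = NearestNeighbors(n_neighbors=1).fit(database_features)

    # For a new query image, find the most similar past image and reuse its label.
    query_features = rng.random((1, 128))
    _, nearest = index.kneighbors(query_features)
    print("predicted label:", database_labels[nearest[0][0]])

The point is not the particular index structure but the stance: with enough stored experience, matching against the data can substitute for an explicit model of the world.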

The Vision Story Begins… “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking.” -- David Marr, Vision (1982)

Vision: a split personality. “What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.” Answer #1: pixel of brightness 243 at position (124,54) … and depth 0.7 meters. Answer #2: looks like the bottom edge of a whiteboard showing at the top of the image. Which do we want? Is the difference just a matter of scale? [figure: depth map]

Measurement vs. Perception

Brightness: Measurement vs. Perception

Proof!

Lengths: Measurement vs. Perception Müller-Lyer Illusion

Vision as a Measurement Device: real-time stereo on Mars, Structure from Motion, Physics-based Vision, Virtualized Reality.

…but why do Learning for Vision? “What if I don’t care about this wishy-washy human perception stuff? I just want to make my robot go!” Small reason: for measurement, other sensors are often better (in the DARPA Grand Challenge, vision was barely used!), yet for navigation you still need to learn! Big reason: the goals of computer vision (what + where) are defined in terms of what humans care about.

So what do humans care about? slide by Fei Fei, Fergus & Torralba

Verification: is that a bus? slide by Fei Fei, Fergus & Torralba

Detection: are there cars? slide by Fei Fei, Fergus & Torralba

Identification: is that a picture of Mao? slide by Fei Fei, Fergus & Torralba

Object categorization: sky, building, flag, wall, banner, bus, cars, bus, face, street lamp. slide by Fei Fei, Fergus & Torralba

Scene and context categorization: outdoor, city, traffic, … slide by Fei Fei, Fergus & Torralba

Rough 3D layout, depth ordering slide by Fei Fei, Fergus & Torralba

Challenges 1: viewpoint variation (Michelangelo) slide by Fei Fei, Fergus & Torralba

Challenges 2: illumination slide credit: S. Ullman

Challenges 3: occlusion Magritte, 1957 slide by Fei Fei, Fergus & Torralba

Challenges 4: scale slide by Fei Fei, Fergus & Torralba

Challenges 5: deformation Xu, Beihong 1943 slide by Fei Fei, Fergus & Torralba

Challenges 6: background clutter Klimt, 1913 slide by Fei Fei, Fergus & Torralba

Challenges 7: object intra-class variation slide by Fei-Fei, Fergus & Torralba

Challenges 8: local ambiguity slide by Fei-Fei, Fergus & Torralba

Challenges 9: the world behind the image

In this course, we will: Take a few baby steps…

Role of Learning: Learning Algorithm, Features, Data

Role of Learning: Data, Features, Algorithm (Shashua)
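To ground the Data / Features / Algorithm split, here is a minimal sketch of one common instantiation (hand-designed HOG features feeding a linear SVM); the images and labels are random placeholders, and each of the three boxes can be swapped for many other choices discussed in the course:

    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    # Data: a stack of placeholder grayscale training images with made-up binary labels.
    rng = np.random.default_rng(0)
    images = rng.random((200, 64, 64))       # 200 hypothetical 64x64 images
    labels = rng.integers(0, 2, size=200)    # hypothetical 0/1 labels

    # Features: one hand-designed representation among many (HOG descriptors).
    features = np.stack([hog(im, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
                         for im in images])

    # Algorithm: one learning method among many (a linear SVM classifier).
    classifier = LinearSVC().fit(features, labels)
    print("training accuracy:", classifier.score(features, labels))

Swapping in a different dataset, a different descriptor, or a different classifier changes only one box at a time, which is exactly the kind of controlled comparison the Deconstruction Project below asks for.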

Course Outline
- Overview of Learning for Vision (1 lecture)
- Overview of Data for Vision (1 lecture)
- Features:
  - Human Perception and visual neuroscience; Theories of Human Vision
  - Low-level Vision: filters, edge detection, interest points, etc.
  - Mid-level Vision: segmentation, occlusions, 2-1/2D, scene layout, etc.
  - High-Level Vision: object recognition, scene understanding, action / motion understanding, etc.

Goals
- Read some interesting papers together
- Learn something new: both you and us!
- Get up to speed on a big chunk of vision research (understand 70% of CVPR papers!)
- Use learning-based vision in your own work
- Learn how to speak
- Learn how to think critically about papers
- Participate in an exciting meta-study!

Course Organization
Requirements:
1. Class Participation (33%)
   - Keep an annotated bibliography
   - Post on the Class Blog before each class
   - Ask questions / debate / fight / be involved!
2. Two Projects (66%)
   - Deconstruction Project: implement and evaluate a paper and present it in class. Must talk to us AT LEAST 2 weeks beforehand! Can be done in groups of 2 (but must then do 2 projects).
   - Synthesis Project: do something worthwhile with what you learned from the Deconstruction Project. Can be done in groups of 2 (1 project).

Class Participation
Keep an annotated bibliography of the papers you read (always a good idea!). The format is up to you, but at a minimum it needs to have:
- Summary of key points
- A few interesting insights, “aha moments”, keen observations, etc.
- Weaknesses of the approach, unanswered questions, areas for further investigation or improvement
Before each class:
- Submit your summary for the current paper(s) in hard copy (printout/xerox)
- Submit a comment on the Class Blog: ask a question, answer a question, post your thoughts, praise, criticism, start a discussion, etc.

Deconstruction Project
1. Pick a paper / set of papers from the list.
2. Understand it as if you were the author:
   - Re-implement it. If there is code, understand the code completely.
   - Run it on the same data (you can sometimes contact the authors for data, and even code).
3. Understand it better than the author:
   - Run it on two other data sets (e.g. the LabelMe dataset, the Flickr dataset, etc.)
   - Run it with two other feature representations
   - Run it with two other learning algorithms
   - Maybe suggest directions for improvement.
4. Prepare an amazing 45-minute presentation. Discuss it with me twice: once when you start the project, and again 3 days before the presentation.

Synthesis Project: hopefully this can grow out of your Deconstruction Project. 2 people can work on one.

End of Semester Awards
We will vote for:
- Best Deconstruction Project
- Best Synthesis Project
- Best Blog Comment
Prize: dinner in a French restaurant in Paris (transportation not included!), or some other worthy prizes.