Presentation on theme: "Lecture 12 What happens if we solve object recognition? 6.870 Object Recognition and Scene Understanding"— Presentation transcript:
Lecture 12 What happens if we solve object recognition? Object Recognition and Scene Understanding
Do we really need computer vision? We can put RFID tags to all objects, we can use GPS, … Sure, we also do not need to understand locomotion if we are happy being plants. Some objects might not like to have RFID tags attached to them. The goal of Vision is to understand the world, whether the world wants or not.
Polygons world Blocks world Imagine that object recognition and segmentation is solved, so, now what?
Here word recognition is solved, we can access the meaning of the words. But yet, we are far from having solved language understanding. Detecting objects is just one small piece of understanding scenes (it might not even be the hardest). Images and sequence tell stories, and the structure of those stories are as complex as sentences, paragraphs and books. “I went to the airport by car, but it took me a very long time because of the traffic.”
Images, with our current set of features, look more like strange sentences: I performed the action of going from one place to another in order to reach the point from where there are devices with wings in which people can get inside and perform the action of going from one place to another even when the other place is really far away. To perform the first action, I used another device that lacks wings and that, instead, has four round things attached to the sides. It took me a long time to complete the first action as there was many other people using the same device-with-four-round-things attached to them performing similar actions to me and trying to occupy the same space as me.
How to give a talk
How to give a talk Preparation: It helps me to go to the conference room a day earlier and get a sense of the speakers viewpoint. This way, the day of the talk it will not be the first time I am in that situation. It removes uncertainty. There are no unimportant talks. There are no big or small audiences. Prepare each talk with the same enthusiasm.
How to give a talk Delivering: Look at the audience! Try not to talk to your laptop or to the screen. Instead, look at the other humans in the room. You have to believe in what you present, be confident… even if it only lasts for the time of your presentation. Do not be afraid to acknowledge limitations of whatever you are presenting. Limitations are good. They leave job for the people to come. Trying to hide the problems in your work will make the preparation of the talk a lot harder and your self confidence will be hurt.
How to give a talk Talk organization: here there are as many theories as there are talks. Here there are some extreme advices: 1.Go into details / only big picture 2.Go in depth on a single topic / cover as many things as you can 3.Be serious (never make jokes, maybe only one) / be funny (it is just another form of theater) Corollary: ask people for advice, but at the end, if will be just you and the audience. Chose what fits best your style. What everybody agree on is that you have to practice in advance (the less your experience, the more you have to practice). Do it with an audience or without, but practice. The best advice I got came from Yair Weiss while preparing my job talk: “just give a good talk”
How to give the project class talk Initial conditions: I started with a great idea It did not work The day before the presentation I found 40 papers that already did this work Then I also realized that the idea was not so great How do I present? Just give a good talk
Next week Alec Rivers Scene Understanding Based on Object Relationships Gokberk Cinbis Category Level 3D Object Detection Using View-Invariant Representations Hueihan Jhuang and Sharat Chikkerur Video shot boundary detection using GIST representation Jenny Yuen Semiautomatic alignment of text and images Nathaniel R Twarog A Filtering Approach to Image Segmentation: Perceptual Grouping in Feature Space Nicolas Pinto Evaluating dense feature descriptor and multi-kernel learning for face detection/recognition Tilke Judd and Vladimir Bychkovsky Identify the same people in different photographs from the same event Tom Kollar Context-based object priors for scene understanding Tom Ouyang Hand-Drawn Sketch Recognition, A Vision-Based Approach Papers due this Friday: send PDF by