Deep Learning for Graphics and Vision

Presentation transcript:

Deep Learning for Graphics and Vision
Thomas Funkhouser
Princeton University
COS 598F, Spring 2017

Goals of the Seminar
- Study recent methods in deep learning for graphics and vision
- Survey the current state of the art in learning visual representations
- Discuss research projects currently being done by students
- Brainstorm about future research directions

Seminar Organization
- Somewhere between a group meeting and a reading group
- Mainly reading, presentation, and discussion
  - Recent papers and ongoing research
- No formal assignments
  - No exams, no programming assignments
- Cannot be taken for undergraduate credit

Coursework
- Study and discuss research topics/questions
  - Make presentation(s) describing the state of the field on a topic/question (30 minute talk, plus discussion afterwards)
- Read and discuss research papers
  - Make a presentation describing at least one research paper in detail (20 minutes or so, plus discussion afterwards)
- Present ongoing research projects
  - Make a presentation about your own research project related to the course (20 minute talk, plus discussion afterwards)

Focus of Study
Using deep networks to learn representations of the real world from images, and leveraging them in graphics/vision applications

What Will Not be Our Focus …
- Background on graphics and vision …
  - Lighting, materials, cameras, optics, geometry, light transport, etc.
  - Stereo, flow, tracking, reconstruction, etc.
  - Hand-tuned features
- Background on deep learning …
  - Network types: CNN, RNN, LSTM, …
  - Network architectures: layers, connections, pooling, …
  - Training strategies: backpropagation, stochastic gradient descent, initialization, dropout, batch normalization

Proposed Topics
Recent work on …
- Learning without full supervision
  - Weakly supervised, self-supervised, unsupervised
- Learning 3D representations of the world
  - Voxels, point clouds, surfaces
- Learning mappings between domains
  - Text, RGB, depth, video, 3D shapes, etc.
  - Synthetic data
- Applications at the boundary of graphics and vision
  - Synthesis, editing, etc.

Supervised Learning
Andrey Kurenkov
Will unsupervised and self-supervised learning methods remove the need for human labels?

Supervised Learning
What labels? More categories? More images per category? More specific spatial labels? More data with noisier labels?
- "What Makes ImageNet Good for Transfer Learning?" Minyoung Huh, Pulkit Agrawal, Alexei Efros, NIPS Workshop on Large-Scale Computer Vision Systems, 2016
- "LSUN: Construction of a Large-Scale Image Dataset Using Deep Learning with Humans in the Loop," Fisher Yu, Ari Seff, Yinda Zhang, Shuran Song, Thomas Funkhouser, Jianxiong Xiao, arXiv 2016
- "What's the Point: Semantic Segmentation with Point Supervision," Amy Bearman, Olga Russakovsky, Vittorio Ferrari, Li Fei-Fei, ECCV 2016
Will unsupervised and self-supervised learning methods remove the need for human labels?

Weakly Supervised Learning
How much can be learned from weak supervision?
- "Dilated Residual Networks," Fisher Yu et al., 2017
Will unsupervised and self-supervised learning methods remove the need for human labels?

Self-Supervised Learning
- "Object-Centric Representation Learning from Unlabeled Videos," Ruohan Gao, Dinesh Jayaraman, Kristen Grauman, ACCV 2016
- "Context Encoders: Feature Learning by Inpainting," D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, A. A. Efros, CVPR 2016
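
To make the self-supervision idea concrete, here is a minimal sketch of context-encoder-style inpainting in PyTorch. It is an illustrative toy, not the published model: a central patch of each unlabeled image is masked out, a small encoder-decoder is trained to reconstruct the image, and the encoder weights can later be reused as a visual representation. All layer sizes and the masking scheme are assumptions.

```python
# Minimal sketch of context-encoder-style self-supervision (assumed architecture,
# not the published model): mask a patch, reconstruct it, keep the encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                      # 64x64 RGB -> 8x8 feature map
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(                      # back to 64x64 RGB
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def mask_center(images, size=16):
    """Zero out a central square; the network must hallucinate it back."""
    masked = images.clone()
    c = images.shape[-1] // 2
    masked[:, :, c - size // 2:c + size // 2, c - size // 2:c + size // 2] = 0.0
    return masked

model = ContextEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
images = torch.rand(8, 3, 64, 64)             # stand-in for an unlabeled image batch
recon = model(mask_center(images))
loss = F.mse_loss(recon, images)              # reconstruction loss on the full image
opt.zero_grad(); loss.backward(); opt.step()
```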

Self-Supervised Learning
What types of data naturally provide self-supervision?
- RGB-D images
- Multi-view
- Synthetic
- Video
- ???

Unsupervised Learning
Variational Autoencoder
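
As a refresher on the mechanics (a generic sketch, not taken from any paper in the seminar), a variational autoencoder encodes an input to the mean and log-variance of a latent Gaussian, samples with the reparameterization trick, decodes, and minimizes reconstruction error plus a KL penalty toward the standard normal prior. The layer sizes and data below are placeholders.

```python
# Generic VAE sketch in PyTorch (illustrative only): encode to (mu, logvar),
# sample z with the reparameterization trick, decode, optimize the ELBO.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, dim=784, hidden=400, latent=20):
        super().__init__()
        self.enc = nn.Linear(dim, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec1 = nn.Linear(latent, hidden)
        self.dec2 = nn.Linear(hidden, dim)

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        recon = torch.sigmoid(self.dec2(F.relu(self.dec1(z))))
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term + KL divergence to the N(0, I) prior
    rec = F.binary_cross_entropy(recon, x, reduction='sum')
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 784)                  # stand-in for a batch of flattened images
recon, mu, logvar = model(x)
loss = vae_loss(recon, x, mu, logvar)
opt.zero_grad(); loss.backward(); opt.step()
```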

Generative Adversarial Networks
What are the limits of GANs?
- "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks," Alec Radford, Luke Metz, Soumith Chintala, arXiv 2016
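
For reference, the adversarial game itself is compact. The sketch below is generic (toy MLP generator and discriminator, random placeholder data rather than a real dataset): a discriminator step pushes real samples toward label 1 and generated samples toward 0, then a generator step updates G so that D outputs 1 on its samples.

```python
# Minimal GAN training step (generic sketch; MLP networks and random data
# stand in for a real architecture and dataset).
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(32, data_dim) * 2 - 1        # placeholder batch of "real" samples
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

# Discriminator step: real -> 1, fake -> 0
fake = G(torch.randn(32, latent_dim)).detach()
d_loss = bce(D(real), ones) + bce(D(fake), zeros)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator output 1 on fakes
fake = G(torch.randn(32, latent_dim))
g_loss = bce(D(fake), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```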

Reinforcement Learning
"Learning to Act by Predicting the Future," Alexey Dosovitskiy and Vladlen Koltun, ICLR 2017

Physical Interaction
"The Curious Robot: Learning Visual Representations via Physical Interactions," Lerrel Pinto, Dhiraj Gandhi, Yuanfeng Han, Yong-Lae Park, Abhinav Gupta, ECCV 2016

Questions to Consider
- Is supervised learning necessary? Is there a combination of unsupervised, self-supervised, and reinforcement learning methods that will subsume the need for supervised human labels?
- What real-world constraints are useful for unsupervised learning? Multiple views, materials, lights, surfaces, occlusion, light transport, physics, etc.
- Is physical interaction necessary to learn the best representations?

Proposed Topics
Recent work on …
- Learning without full supervision
  - Weakly supervised, self-supervised, unsupervised
- Learning 3D representations of the world
  - Voxels, point clouds, surfaces
- Learning mappings between domains
  - Text, RGB, depth, video, 3D shapes, etc.
  - Synthetic data
- Applications at the boundary of graphics and vision
  - Synthesis, editing, etc.

Discriminative Models of 3D Shape
"VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition," Daniel Maturana and Sebastian Scherer, IROS 2015
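
To ground the discussion, here is a minimal 3D convolutional classifier over occupancy grids in the spirit of VoxNet. It is a hedged sketch: the layer sizes, grid resolution, and class count are illustrative assumptions, not the published architecture.

```python
# Minimal 3D convolutional classifier over a voxel occupancy grid
# (a generic sketch in the spirit of VoxNet, not the published model).
import torch
import torch.nn as nn

class VoxelCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=5, stride=2), nn.ReLU(),   # 32^3 -> 14^3
            nn.Conv3d(32, 32, kernel_size=3), nn.ReLU(),            # 14^3 -> 12^3
            nn.MaxPool3d(2))                                         # 12^3 -> 6^3
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 6 * 6 * 6, 128), nn.ReLU(),
            nn.Linear(128, num_classes))

    def forward(self, voxels):             # voxels: (batch, 1, 32, 32, 32) occupancy
        return self.classifier(self.features(voxels))

model = VoxelCNN()
voxels = (torch.rand(4, 1, 32, 32, 32) > 0.9).float()   # stand-in occupancy grids
logits = model(voxels)                                    # (4, num_classes)
```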

Discriminative Models of 3D Shape
(Figure panels: input, labeled voxels, ground truth)
"Semantic Scene Completion from a Single Depth Image," S. Song, F. Yu, A. Zeng, A. Chang, M. Savva, and T. Funkhouser, arXiv 2016

Generative Models of 3D Shape
"Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling," Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, Josh Tenenbaum, NIPS 2016

Generative Models of 3D Shape
"VConv-DAE: Deep Volumetric Shape Learning without Object Labels," Abhishek Sharma, Oliver Grau, Mario Fritz, ECCV 2016 Workshops

Generative Models of 3D Shape
"Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling," Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, Josh Tenenbaum, NIPS 2016

Questions to Consider
- What parameterization is best for 3D shapes? Voxels, point clouds, surface fans, mapping to a sphere, etc.?
- How should networks for 3D shapes be different than for natural images?
- How do networks, datasets, and training protocols have to be adapted to work effectively for 3D data?

Proposed Topics
Recent work on …
- Learning without full supervision
  - Weakly supervised, self-supervised, unsupervised
- Learning 3D representations of the world
  - Voxels, point clouds, surfaces
- Learning mappings between domains
  - Text, RGB, depth, video, 3D shapes, etc.
  - Synthetic data
- Applications at the boundary of graphics and vision
  - Synthesis, editing, etc.

Image to 3D Shape Representation
"Multi-view Convolutional Neural Networks for 3D Shape Recognition," Hang Su, Subhransu Maji, Evangelos Kalogerakis, Erik Learned-Miller, ICCV 2015
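
The multi-view approach is easy to sketch: render several views of a shape, embed each with a shared 2D CNN, and pool the features across views before classification. The code below is an illustrative toy; the encoder, number of views, and class count are placeholder assumptions, not the authors' setup.

```python
# Sketch of multi-view pooling for 3D shape recognition (generic, not the
# authors' code): a shared 2D CNN embeds each rendered view, features are
# max-pooled across views, and a linear head classifies the shape.
import torch
import torch.nn as nn

class MultiViewCNN(nn.Module):
    def __init__(self, num_classes=40):
        super().__init__()
        self.cnn = nn.Sequential(                    # shared per-view encoder
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten())
        self.head = nn.Linear(64, num_classes)

    def forward(self, views):                        # views: (batch, V, 3, H, W)
        b, v = views.shape[:2]
        feats = self.cnn(views.flatten(0, 1))        # (batch * V, 64)
        feats = feats.view(b, v, -1).max(dim=1).values   # view pooling
        return self.head(feats)

model = MultiViewCNN()
views = torch.rand(2, 12, 3, 64, 64)                 # 12 rendered views per shape
logits = model(views)                                 # (2, num_classes)
```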

Image and Depth Together
"Learning with Side Information through Modality Hallucination," Judy Hoffman, Saurabh Gupta, Trevor Darrell, CVPR 2016

Text to Image
"Generative Adversarial Text to Image Synthesis," Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Honglak Lee, Bernt Schiele, ICML 2016

Image to 3D to Image
"Learning Dense Correspondence via 3D-guided Cycle Consistency," Tinghui Zhou, Philipp Krähenbühl, Mathieu Aubry, Qixing Huang, Alyosha Efros, CVPR 2016

Questions to Consider
- How can a representation learned in one domain be transferred to another?
- How can information from multiple domains be leveraged in a single representation?

Proposed Topics
Recent work on …
- Learning without full supervision
  - Weakly supervised, self-supervised, unsupervised
- Learning 3D representations of the world
  - Voxels, point clouds, surfaces
- Learning mappings between domains
  - Text, RGB, depth, video, 3D shapes, etc.
  - Synthetic data
- Applications at the boundary of graphics and vision
  - Synthesis, editing, etc.

Inverse Graphics
"Deep Convolutional Inverse Graphics Network," Tejas D. Kulkarni, Will Whitney, Pushmeet Kohli, Joshua B. Tenenbaum, arXiv:1503.03167, 2015

Image Colorization
"Colorful Image Colorization," Richard Zhang, Phillip Isola, Alexei A. Efros, ECCV 2016

Image Style Transfer
"A Neural Algorithm of Artistic Style," Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, CVPR 2016
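
The key ingredient here is a Gram-matrix style loss combined with a feature-matching content loss, optimized over the pixels of the generated image. The sketch below is illustrative: random tensors stand in for activations from a pretrained network such as VGG, and the layer choices and weighting are assumptions rather than the paper's exact setup.

```python
# Sketch of the Gram-matrix style loss used in neural style transfer
# (generic; layer choices and weights are illustrative).
import torch
import torch.nn.functional as F

def gram_matrix(feats):
    """feats: (batch, channels, H, W) -> (batch, channels, channels) correlations."""
    b, c, h, w = feats.shape
    f = feats.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_content_loss(gen_feats, content_feats, style_feats, style_weight=1e3):
    # Content: match features of the content image at one layer.
    content_loss = F.mse_loss(gen_feats[-1], content_feats[-1])
    # Style: match Gram matrices of the style image at several layers.
    style_loss = sum(F.mse_loss(gram_matrix(g), gram_matrix(s))
                     for g, s in zip(gen_feats, style_feats))
    return content_loss + style_weight * style_loss

# Stand-ins for CNN activations at three layers of a pretrained network:
gen = [torch.rand(1, 64, 32, 32, requires_grad=True) for _ in range(3)]
content = [torch.rand(1, 64, 32, 32) for _ in range(3)]
style = [torch.rand(1, 64, 32, 32) for _ in range(3)]
loss = style_content_loss(gen, content, style)
loss.backward()     # gradients flow to the generated-image activations
```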

Image Editing
"Neural Photo Editing with Introspective Adversarial Networks," Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston, 2016

Questions to Consider
Will GANs completely replace patch-based synthesis?
"Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis," Chuan Li, Michael Wand, CVPR 2016

Questions? Comments?

For the Rest of this Meeting …
- Break into small groups
- Discuss the most interesting and important topics to study in Deep Learning for Graphics and Vision
- Report back to the whole group
- Integrate suggestions into the class schedule