Understanding and Predicting Image Memorability at a Large Scale A. Khosla, A. S. Raju, A. Torralba and A. Oliva International Conference on Computer Vision.


Presentation transcript:

Understanding and Predicting Image Memorability at a Large Scale A. Khosla, A. S. Raju, A. Torralba and A. Oliva International Conference on Computer Vision (ICCV), 2015 Presented by Yue Guo

Large difference in image memorability
Memorable: ~90%; Average: ~70%; Forgettable: ~40%
Isola et al. (2014). PAMI

Main Contributions
1. Built LaMem, the largest annotated image memorability dataset to date.
2. Achieved state-of-the-art performance on LaMem using Convolutional Neural Networks (CNNs).
3. Provided a method for manipulating image memorability.

LaMem
It contains 60,000 images drawn from MIR Flickr, the AVA dataset, the affective images dataset (consisting of the Art and Abstract datasets), image saliency datasets (MIT1003 and NUSEF), SUN, the image popularity dataset, the Abnormal Objects dataset, the aPascal dataset, and others. Each image is annotated with a memorability score.

Procedure
Vigilance repeats ensured that workers were paying attention.
LaMem is about 27 times larger than the previous dataset.
Memorability as a function of the time delay between images is modeled by a log-linear relationship.
Obtained 80 scores per image on average, for a total of about 5 million data points.

Math
Memorability as a function of time interval
Objective function
Solve this optimization problem using expectation–maximization (EM)
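
The slide's equations did not survive in this transcript, so the following is only a minimal sketch of what fitting a log-linear memorability model could look like. It assumes a model of the form score = base_i + alpha * log(delay), where `base_i` is a per-image base score and `alpha` is a shared decay slope, and it minimizes squared error by alternating least-squares updates. That alternating scheme is a stand-in for illustration, not necessarily the EM procedure named on the slide, and the names `fit_log_linear` and `memorability_at` are hypothetical.

```python
import math


def fit_log_linear(observations, num_images, iterations=200):
    """Fit score = base[i] + alpha * log(delay) by alternating least squares.

    observations: list of (image_index, delay, score) tuples, with delay > 0.
    This model form is an assumption made for illustration.
    """
    alpha = 0.0
    base = [0.0] * num_images
    logs = [math.log(delay) for _, delay, _ in observations]
    for _ in range(iterations):
        # Step 1: with alpha fixed, each base score is the mean residual of its image.
        sums = [0.0] * num_images
        counts = [0] * num_images
        for (i, _, score), log_t in zip(observations, logs):
            sums[i] += score - alpha * log_t
            counts[i] += 1
        base = [s / n if n else 0.0 for s, n in zip(sums, counts)]
        # Step 2: with base scores fixed, alpha is a one-dimensional least-squares slope.
        num = sum((score - base[i]) * log_t
                  for (i, _, score), log_t in zip(observations, logs))
        den = sum(log_t * log_t for log_t in logs)
        alpha = num / den if den else 0.0
    return alpha, base


def memorability_at(base_score, alpha, delay):
    """Predicted memorability of an image at a given delay under the assumed model."""
    return base_score + alpha * math.log(delay)
```

Once `alpha` is estimated, `memorability_at` can re-express every image's score at one common delay, which is what makes scores collected at different delays comparable.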

Findings
1. Popularity: The popularity scores of the most memorable images are statistically higher than those of the others.
2. Saliency: Images that are more memorable tend to have more consistent human fixations.
3. Emotion: Images that evoke negative emotions such as anger and fear tend to be more memorable than those portraying positive ones.
4. Aesthetics: The aesthetic score of an image and its memorability have little to no correlation.

CNNs
MemNet: AlexNet, pre-trained on the ILSVRC 2012 and Places datasets, then fine-tuned with a Euclidean loss layer as the last layer.
Evaluation metric: rank correlation. Human consistency on LaMem is 0.68.
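
The two quantities on this slide can be illustrated with a small pure-Python sketch. It assumes the Euclidean loss is half the mean squared difference between predicted and ground-truth scores, and that "rank correlation" means Spearman's coefficient, computed here as the Pearson correlation of average ranks. Both are assumptions about conventions, and the function names are hypothetical.

```python
import math


def euclidean_loss(predicted, target):
    """Half the mean squared difference between two equal-length score lists."""
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / (2.0 * n)


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)
```

Rank correlation depends only on the ordering of the images, not on the absolute scores, so a model can be compared with human consistency on the same scale even if its predictions are offset or rescaled.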

Analysis
Visualization of the CNN features after fine-tuning
The memorability heat maps

Subjective judgments do not predict image memorability
Figure: images grouped as Memorable / Forgettable and as Think memorable / Think forgettable, with memorability scores (89%) (86%) (44%) (40%).
Isola et al. (2011). Neural Information Processing Systems (NIPS)