Understanding and Predicting Image Memorability at a Large Scale A. Khosla, A. S. Raju, A. Torralba and A. Oliva International Conference on Computer Vision.

Presentation transcript:

Understanding and Predicting Image Memorability at a Large Scale A. Khosla, A. S. Raju, A. Torralba and A. Oliva International Conference on Computer Vision (ICCV), 2015 Presented by Yue Guo

Large difference in image memorability: Memorable ~90%, Average ~70%, Forgettable ~40%. Isola et al. (2014), PAMI.

Main Contributions
1. Built LaMem, the largest annotated image memorability dataset to date.
2. Achieved state-of-the-art performance on LaMem using Convolutional Neural Networks (CNNs).
3. Provided a method to perform image memorability manipulation.

LaMem: It contains 60,000 images from MIR Flickr, the AVA dataset, the affective images dataset (consisting of the Art and Abstract datasets), image saliency datasets (MIT1003 and NUSEF), SUN, the image popularity dataset, the Abnormal Objects dataset, the aPascal dataset, and others. Figure label: memorability score.

Procedure: Vigilance repeats ensured that workers were paying attention. LaMem is about 27 times larger than the previous dataset. Memorability is modeled as a log-linear function of the time delay between the two presentations of an image. On average, 80 scores were obtained per image, for a total of about 5 million data points.
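One way to write the log-linear relationship is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the slides: m_i(t) is the memorability of image i at delay t, c_i is a per-image offset, \alpha is a slope shared across images, and T is a reference delay. The second line follows from the first by subtracting the expressions at t and T. The last line only checks that the stated counts are consistent with each other.

```latex
\begin{align*}
m_i(t) &= c_i + \alpha \log t \\
m_i(T) &= m_i(t) + \alpha \log\frac{T}{t} \\
60{,}000 \times 80 &= 4.8 \times 10^{6} \approx 5 \text{ million data points}
\end{align*}
```

Under this form, a single hit or miss observed at any delay t can be mapped to the common delay T, which is what makes scores from different delays comparable.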

Math: Memorability as a function of time interval. Objective function. Solve this optimization problem using expectation-maximization (EM).
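The equations on this slide did not survive in the transcript, so the following is only a minimal sketch of how such a fit could look, not the authors' procedure. It assumes a squared-error objective over hit/miss observations and a log-linear dependence on delay, and it uses plain alternating least squares in place of the EM solver named on the slide. The function name `fit_memorability`, the data layout, and the reference delay of 100 are all hypothetical.

```python
import math


def fit_memorability(observations, ref_delay=100.0, n_iters=50):
    """Fit per-image scores and a shared log-delay slope.

    observations: dict mapping image id -> list of (hit, delay) pairs,
    where hit is 1 if the repeat was detected (0 otherwise) and delay is
    the number of images shown between the two presentations.
    Assumed model (not from the slides): hit ~ score + alpha * log(delay / ref_delay).
    """
    # Precompute d = log(delay / ref_delay) for every observation.
    data = {
        img: [(float(hit), math.log(delay / ref_delay)) for hit, delay in obs]
        for img, obs in observations.items()
    }
    alpha = 0.0
    # With alpha = 0 the best score for an image is its mean hit rate.
    scores = {img: sum(x for x, _ in obs) / len(obs) for img, obs in data.items()}
    den = sum(d * d for obs in data.values() for _, d in obs)

    for _ in range(n_iters):
        # Step 1: scores fixed, closed-form least-squares update of alpha.
        num = sum(d * (x - scores[img]) for img, obs in data.items() for x, d in obs)
        alpha = num / den if den > 0 else 0.0
        # Step 2: alpha fixed, each score is the mean delay-corrected hit.
        for img, obs in data.items():
            scores[img] = sum(x - alpha * d for x, d in obs) / len(obs)
    return alpha, scores


# Hypothetical toy data: (hit, delay) pairs for two images.
toy = {
    "img_a": [(1, 10), (1, 100), (0, 1000)],
    "img_b": [(1, 20), (0, 40), (0, 80)],
}
slope, mem_scores = fit_memorability(toy)
```

Each step minimizes the same squared error over one block of variables, so the error never increases from one iteration to the next. The returned scores are expressed at the reference delay.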

Findings
1. Popularity: The popularity scores of the most memorable images are statistically higher than those of the others.
2. Saliency: Images that are more memorable tend to have more consistent human fixations.
3. Emotion: Images that evoke negative emotions such as anger and fear tend to be more memorable than those portraying positive ones.
4. Aesthetics: The aesthetic score of an image and its memorability have little to no correlation.

CNNs: MemNet is AlexNet pre-trained on ILSVRC 2012 and the Places dataset, then fine-tuned with a Euclidean loss layer as the last layer. Performance is measured by rank correlation; human consistency on LaMem is 0.68.
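The two quantities named on this slide can be illustrated with a small, dependency-free sketch. It assumes the Euclidean loss is half the mean squared difference between predicted and target scores (one common convention) and that the rank correlation is Spearman's, computed as the Pearson correlation of average ranks. The function names and the example numbers are hypothetical and are not results from the paper.

```python
import math


def euclidean_loss(predicted, target):
    """Half the mean squared difference between predictions and targets."""
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / (2.0 * n)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical predicted and ground-truth memorability scores.
predicted = [0.91, 0.62, 0.75, 0.48]
ground_truth = [0.88, 0.70, 0.66, 0.41]
rho = spearman(predicted, ground_truth)
loss = euclidean_loss(predicted, ground_truth)
```

Rank correlation only rewards getting the ordering of images right, so a model can have a nonzero Euclidean loss and still reach a rank correlation of 1. The human consistency figure gives a reference point for reading a model's rank correlation.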

Analysis: Visualization of the CNN features after fine-tuning. The memorability heat maps.

Subjective judgments do not predict image memorability. Memorable, Forgettable; Think memorable, Think forgettable: (89%) (86%) (44%) (40%). Isola et al. (2011), Neural Information Processing Systems (NIPS).