Food Recognition Using Statistics of Pairwise Local Features Shulin (Lynn) Yang University of Washington Mei Chen Intel Labs Pittsburgh Dean Pomerleau.

Presentation transcript:

Food Recognition Using Statistics of Pairwise Local Features Shulin (Lynn) Yang University of Washington Mei Chen Intel Labs Pittsburgh Dean Pomerleau Robotics Institute Rahul Sukthankar Carnegie Mellon 1

Abstract Food items are deformable objects that exhibit significant variations in appearance, which makes food recognition difficult. The key to recognizing food is to exploit the spatial relationships between different ingredients (such as meat and bread in a sandwich). 2

Introduction The goals of food recognition systems are to enable people to better understand the nutritional content of their dietary choices and to provide medical professionals with objective measures of their patients’ food intake. 3

Pairwise local feature distribution (PFD) 1. Soft labeling of pixels 2. Global Ingredient Representation (GIR) 3. Pairwise Features 4. Histogram representation for pairwise feature distribution 5. Histogram normalization 6. Classification with local feature distributions 4

Semantic Texton Forest (STF) 5

6

7

8

Pairwise local feature distribution (PFD) 1. Soft labeling of pixels 2. Global Ingredient Representation (GIR) 3. Pairwise Features 4. Histogram representation for pairwise feature distribution 5. Histogram normalization 6. Classification with local feature distributions 9

Global Ingredient Representation (GIR) 10

Pairwise local feature distribution (PFD) 1. Soft labeling of pixels 2. Global Ingredient Representation (GIR) 3. Pairwise Features 4. Histogram representation for pairwise feature distribution 5. Histogram normalization 6. Classification with local feature distributions 11

Pairwise Features 12

Between-pair category: B(P1, P2) The feature for each pixel pair has t discrete values, where t is the number of pixels that lie along the line between the two pixels. We use B(P1, P2) to represent the feature set for pixels P1 and P2. 13
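The between-pair feature described on this slide can be illustrated with a minimal Python sketch. It assumes that each pixel already carries a single hard ingredient label and that the endpoints P1 and P2 count as part of the line; the function names, the toy label map, and the ingredient ids are hypothetical, and the slides do not say how the line is rasterized.

```python
import numpy as np

def pixels_between(p1, p2):
    """Return the integer (row, col) pixels on the segment p1..p2, endpoints included."""
    (r1, c1), (r2, c2) = p1, p2
    # One sample per step along the longer axis, so no pixel on the line is skipped.
    t = max(abs(r2 - r1), abs(c2 - c1)) + 1
    rows = np.rint(np.linspace(r1, r2, t)).astype(int)
    cols = np.rint(np.linspace(c1, c2, t)).astype(int)
    return list(zip(rows.tolist(), cols.tolist()))

def between_pair_feature(label_map, p1, p2):
    """Ingredient labels of the t pixels on the line between p1 and p2."""
    return [int(label_map[r, c]) for r, c in pixels_between(p1, p2)]

# Toy 4x4 label map with hypothetical ingredient ids: 0 = bread, 1 = meat.
label_map = np.array([
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])

# A vertical pair crossing the whole "sandwich": bread, meat, meat, bread.
feature = between_pair_feature(label_map, (0, 1), (3, 1))
print(feature)
```

Here t is 4, so the feature set for this pair holds four discrete values, one per pixel on the line.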

14

15

Joint pairwise features 16

Pairwise local feature distribution (PFD) 1. Soft labeling of pixels 2. Global Ingredient Representation (GIR) 3. Pairwise Features 4. Histogram representation for pairwise feature distribution 5. Histogram normalization 6. Classification with local feature distributions 17

Histogram representation for pairwise feature distribution 18

19

Pairwise local feature distribution (PFD) 1. Soft labeling of pixels 2. Global Ingredient Representation (GIR) 3. Pairwise Features 4. Histogram representation for pairwise feature distribution 5. Histogram normalization 6. Classification with local feature distributions 20

Classification with local feature distributions 21

Experimental Methodology 1. Dataset 2. Baseline approaches 3. Preprocessing with STF 22

Pittsburgh Food Image Dataset(PFID) 23

Experimental Methodology 1. Dataset 2. Baseline approaches 3. Preprocessing with STF 24

Bag of SIFT features 25

SVM(Support Vector Machine) 26

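SVM theory: the solid line is the separating hyperplane that is found, and H1 and H2 are called the support hyperplanes. We want to find the optimal classification hyperplane, the one that maximizes the margin between the two support hyperplanes. 27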
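The maximum-margin idea on this slide is usually written as follows. This is the standard hard-margin formulation, not notation taken from the slides: the symbols w, b, x_i, and y_i are the conventional ones and are assumed here.

```latex
% Separating hyperplane and the two support hyperplanes H_1, H_2
\mathbf{w}^{\top}\mathbf{x} + b = 0, \qquad
H_1:\ \mathbf{w}^{\top}\mathbf{x} + b = +1, \qquad
H_2:\ \mathbf{w}^{\top}\mathbf{x} + b = -1

% Distance between H_1 and H_2
\text{margin} = \frac{2}{\lVert \mathbf{w} \rVert}

% Maximizing the margin is equivalent to
\min_{\mathbf{w},\, b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1, \quad i = 1, \dots, n
```

Each training point x_i has label y_i in {-1, +1}; the constraint keeps every point on the correct side of its support hyperplane.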

Experimental Methodology 1. Dataset 2. Baseline approaches 3. Preprocessing with STF 28

Preprocessing with STF 29

Results 1. Classification accuracy on the 61 categories 30

Confusion matrix Rows: the 61 categories of food Columns: the ground truth categories 31
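As a small illustration of this layout, the sketch below builds a confusion matrix with columns indexed by ground truth. It assumes the rows hold the predicted category, which the slide implies but does not state; the three-class toy labels are hypothetical.

```python
import numpy as np

def confusion_matrix(predicted, truth, num_classes):
    """Count matrix with rows = predicted class and columns = ground-truth class."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for p, t in zip(predicted, truth):
        cm[p, t] += 1
    return cm

# Hypothetical labels for six test images over three classes.
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(predicted, truth, num_classes=3)
# Correct predictions sit on the diagonal, so accuracy is trace / total.
accuracy = np.trace(cm) / cm.sum()
print(cm)
print(accuracy)
```

Off-diagonal entries show which categories are mistaken for which: cm[1, 0] counts images of class 0 that were predicted as class 1.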

Such cases are challenging 32

Such cases are hard to distinguish even for humans, so the 61 PFID food categories are grouped into 7 major groups. 33

2. Classification accuracy into 7 major food types: 1. sandwiches 2. salads/sides 3. chicken 4. breads/pastries 5. donuts 6. bagels 7. tacos 34
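One way to see why accuracy rises at the coarser level is to collapse fine-grained predictions into major groups, as in the sketch below. The mapping and the category names are hypothetical stand-ins, not the actual PFID assignment, and the slides do not say whether the authors collapsed predictions this way or trained a separate 7-way classifier.

```python
# Hypothetical mapping from fine-grained categories to major food types.
FINE_TO_GROUP = {
    "ham_sandwich": "sandwiches",
    "chicken_sandwich": "sandwiches",
    "glazed_donut": "donuts",
    "plain_bagel": "bagels",
}

def fine_accuracy(predicted, truth):
    """Fraction of images whose fine-grained category is predicted exactly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

def group_accuracy(predicted, truth, fine_to_group=FINE_TO_GROUP):
    """Fraction of images whose predicted category falls in the correct major group."""
    hits = sum(fine_to_group[p] == fine_to_group[t] for p, t in zip(predicted, truth))
    return hits / len(truth)

truth = ["ham_sandwich", "chicken_sandwich", "glazed_donut", "plain_bagel"]
predicted = ["chicken_sandwich", "chicken_sandwich", "plain_bagel", "plain_bagel"]

fine = fine_accuracy(predicted, truth)
coarse = group_accuracy(predicted, truth)
print(fine, coarse)
```

Confusing one sandwich with another is an error at the fine level but a hit at the group level, which is the effect the grouping is meant to capture.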

35

Confusion matrix Rows: the major 7 food categories Columns: the ground truth major categories 36

Result: Orientation and midpoint (OM) is the higher-order feature that gives the best accuracy. This feature pair leverages the vertically layered structure of many fast foods. 37
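A rough sketch of a joint orientation and midpoint feature follows. The transcript does not define either component, so this assumes that orientation is the quantized, direction-free angle of the line joining the two pixels and that midpoint is the ingredient label of the pixel halfway between them. The bin count of 12, the function names, and the toy label map are all hypothetical.

```python
import math
import numpy as np

def orientation_bin(p1, p2, num_bins=12):
    """Quantize the undirected angle of the line p1-p2 into num_bins bins over [0, pi)."""
    dr, dc = p2[0] - p1[0], p2[1] - p1[1]
    angle = math.atan2(dr, dc) % math.pi  # same value for p1->p2 and p2->p1
    # Clamp in case floating-point rounding lands exactly on pi.
    return min(int(angle / math.pi * num_bins), num_bins - 1)

def midpoint_label(label_map, p1, p2):
    """Ingredient label of the pixel halfway between p1 and p2 (assumed definition)."""
    mr = (p1[0] + p2[0]) // 2
    mc = (p1[1] + p2[1]) // 2
    return int(label_map[mr, mc])

def om_feature(label_map, p1, p2, num_bins=12):
    """Joint (orientation bin, midpoint label) feature for one pixel pair."""
    return orientation_bin(p1, p2, num_bins), midpoint_label(label_map, p1, p2)

# Toy layered "sandwich" with hypothetical ids: 0 = bread, 1 = meat.
label_map = np.array([
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
])

vertical = om_feature(label_map, (0, 1), (4, 1))    # bread-to-bread pair across the filling
horizontal = om_feature(label_map, (0, 0), (0, 2))  # pair that stays inside the top bread
print(vertical, horizontal)
```

In this toy case the vertical pair has meat at its midpoint while the horizontal pair has bread, which is the kind of layered structure the slide refers to.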

In future work We plan to extend our method to: (1) Perform food detection and spatial localization in addition to whole-image recognition (2) Handle cluttered images containing several foods and non-food items (3) Develop practical food recognition applications (4) Explore how the proposed method generalizes to other recognition domains 38