Food Recognition Using Statistics of Pairwise Local Features
Shulin (Lynn) Yang, University of Washington; Mei Chen, Intel Labs Pittsburgh; Dean Pomerleau, Robotics Institute; Rahul Sukthankar, Carnegie Mellon
Abstract: Food items are deformable objects that exhibit significant variations in appearance, which makes food recognition difficult. The key to recognizing food is to exploit the spatial relationships between different ingredients (such as meat and bread in a sandwich).
Introduction: The goals of automatic food recognition systems are to enable people to better understand the nutritional content of their dietary choices and to provide medical professionals with objective measures of their patients’ food intake.
Pairwise local feature distribution (PFD)
1. Soft labeling of pixels
2. Global Ingredient Representation (GIR)
3. Pairwise Features
4. Histogram representation for pairwise feature distribution
5. Histogram normalization
6. Classification with local feature distributions
Semantic Texton Forest (STF)
Global Ingredient Representation (GIR)
Pairwise Features
Between-pair category: B(P1,P2). The feature for each pixel pair takes t discrete values, where t is the number of pixels that lie along the line between the two pixels. We use B(P1,P2) to represent the feature set for pixels P1 and P2.
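The between-pair feature can be illustrated with a minimal sketch that collects the labels of the pixels on the segment between P1 and P2. The function name `between_pair_labels`, the hard-label input `label_map`, and the sampling rule (one sample per step along the longer axis) are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def between_pair_labels(label_map, p1, p2):
    """Return the labels of the pixels lying strictly between p1 and p2.

    label_map: 2-D integer array of per-pixel ingredient labels
               (hypothetical input, e.g. the arg-max of a soft labeling).
    p1, p2:    (row, col) pixel coordinates.
    """
    (r1, c1), (r2, c2) = p1, p2
    # One sample per pixel step along the longer axis, endpoints included.
    n = max(abs(r2 - r1), abs(c2 - c1)) + 1
    rows = np.rint(np.linspace(r1, r2, n)).astype(int)
    cols = np.rint(np.linspace(c1, c2, n)).astype(int)
    # Drop the two endpoints so only the t in-between pixels remain.
    return label_map[rows[1:-1], cols[1:-1]]

# Toy 1 x 5 label map: the pair (0,0)-(0,4) has t = 3 pixels in between.
toy_map = np.array([[0, 1, 1, 2, 0]])
print(between_pair_labels(toy_map, (0, 0), (0, 4)))
```

Under these assumptions, the returned array has length t, so adjacent pixels yield an empty feature set.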
Joint pairwise features
Histogram representation for pairwise feature distribution
Classification with local feature distributions
Experimental Methodology
1. Dataset
2. Baseline approaches
3. Preprocessing with STF
Pittsburgh Fast-Food Image Dataset (PFID)
Bag of SIFT features
SVM (Support Vector Machine)
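SVM theory: The solid line is the hyperplane that has been found, and H1 and H2 are called the support hyperplanes. We want to find the optimal classification hyperplane, the one that gives the largest margin between the two support hyperplanes.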
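For reference, the margin maximization described here can be written in the standard linear hard-margin form. The notation (weight vector w, bias b, training samples x_i with labels y_i in {-1, +1}) is the usual textbook convention and is assumed here rather than taken from the slides.

```latex
\begin{aligned}
H_1 &: \mathbf{w}^\top \mathbf{x} + b = +1, \qquad H_2 : \mathbf{w}^\top \mathbf{x} + b = -1, \\
\text{margin} &= \frac{2}{\lVert \mathbf{w} \rVert}, \\
\min_{\mathbf{w},\, b} \ & \tfrac{1}{2} \lVert \mathbf{w} \rVert^2
\quad \text{subject to} \quad y_i \, (\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 \ \ \text{for all } i .
\end{aligned}
```

Maximizing the distance 2/||w|| between H1 and H2 is equivalent to minimizing ||w||^2 under the constraint that every training sample lies on or outside its own support hyperplane.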
Preprocessing with STF
Results
1. Classification accuracy on the 61 categories
Confusion matrix. Rows: the 61 categories of food. Columns: the ground truth categories.
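A confusion matrix with this layout can be sketched as follows, assuming that the rows index the predicted category and the columns index the ground-truth category. The helper names `confusion_matrix` and `accuracy` and the toy labels are illustrative only.

```python
import numpy as np

def confusion_matrix(predicted, truth, n_classes):
    """Count matrix with rows = predicted category, columns = ground-truth category."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly when the same (row, col) cell repeats.
    np.add.at(cm, (np.asarray(predicted), np.asarray(truth)), 1)
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal, i.e. predicted == ground truth."""
    return np.trace(cm) / cm.sum()

# Toy example with 3 categories (the real experiment uses 61).
cm = confusion_matrix(predicted=[0, 1, 1, 2], truth=[0, 1, 2, 2], n_classes=3)
print(cm)
print(accuracy(cm))
```

Off-diagonal cells show which categories are mistaken for one another, which is how visually similar foods show up in such a matrix.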
Such cases are challenging even for humans to distinguish.
The 61 PFID food categories are therefore organized into 7 major groups.
2. Classification accuracy into 7 major food types: (1) sandwiches, (2) salads/sides, (3) chicken, (4) breads/pastries, (5) donuts, (6) bagels, (7) tacos
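One way to score predictions at the level of major food types is to map each fine-grained category index to its major type through a lookup table. This is only a sketch: the slides do not say whether the 7-way results come from collapsing 61-way predictions or from training directly on the major types, and the table `fine_to_major` below is a made-up toy mapping, not the actual PFID grouping.

```python
import numpy as np

MAJOR_TYPES = ["sandwiches", "salads/sides", "chicken",
               "breads/pastries", "donuts", "bagels", "tacos"]

def to_major(fine_labels, fine_to_major):
    """Map fine-grained category indices to major-type indices via a lookup table."""
    return np.asarray(fine_to_major)[np.asarray(fine_labels)]

def major_accuracy(predicted_fine, truth_fine, fine_to_major):
    """A prediction counts as correct when it lands in the same major type as the truth."""
    pred = to_major(predicted_fine, fine_to_major)
    true = to_major(truth_fine, fine_to_major)
    return float(np.mean(pred == true))

# Hypothetical toy mapping: 4 fine categories, the first two both sandwiches.
fine_to_major = [0, 0, 1, 2]
print(major_accuracy([0, 1, 2, 3], [1, 1, 3, 3], fine_to_major))
```

With this scoring, confusing two different sandwiches is no longer counted as an error, while confusing a sandwich with a salad still is.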
Confusion matrix. Rows: the 7 major food categories. Columns: the ground truth major categories.
Result: Orientation and midpoint (OM) is the higher-order feature that gives the best accuracy. This pair of features can leverage the vertically layered structure of many fast foods.
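To make the OM idea concrete, here is a minimal sketch of a joint orientation-and-midpoint histogram over pixel pairs. It assumes that orientation is the undirected angle of the line through the two pixels, quantized into `n_bins` bins, and that midpoint is the label of the pixel halfway between them. The bin count of 12, the hard-label input `label_map`, and all function names are assumptions for illustration, and the sketch leaves out the labels of the two endpoint pixels.

```python
import numpy as np

def orientation_bin(p1, p2, n_bins=12):
    """Quantize the undirected angle of the line p1-p2 (in row/col coordinates)
    into n_bins equal bins over [0, 180) degrees."""
    dr, dc = p2[0] - p1[0], p2[1] - p1[1]
    angle = np.degrees(np.arctan2(dr, dc)) % 180.0
    return int(angle // (180.0 / n_bins)) % n_bins

def midpoint_label(label_map, p1, p2):
    """Label of the pixel halfway between p1 and p2 (integer midpoint)."""
    r = (p1[0] + p2[0]) // 2
    c = (p1[1] + p2[1]) // 2
    return int(label_map[r, c])

def om_histogram(label_map, pairs, n_labels, n_bins=12):
    """Normalized joint histogram over (orientation bin, midpoint label)."""
    hist = np.zeros((n_bins, n_labels))
    for p1, p2 in pairs:
        hist[orientation_bin(p1, p2, n_bins), midpoint_label(label_map, p1, p2)] += 1
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy label map with one "ingredient 1" pixel in the top row.
toy_map = np.zeros((3, 5), dtype=int)
toy_map[0, 2] = 1
pairs = [((0, 0), (0, 4)), ((0, 0), (2, 4))]
print(om_histogram(toy_map, pairs, n_labels=2).shape)
```

A vertically layered food such as a sandwich would concentrate mass in the near-vertical orientation bins for specific midpoint ingredients, which is the kind of structure this pair of features is meant to capture.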
Future work: We plan to extend our method to: (1) perform food detection and spatial localization in addition to whole-image recognition; (2) handle cluttered images containing several foods and non-food items; (3) develop practical food recognition applications; (4) explore how the proposed method generalizes to other recognition domains.