Ilya Gurvich. An Undergraduate Project under the supervision of Dr. Tammy Avraham. Conducted at the ISL Lab, CS, Technion.

1 Ilya Gurvich. An Undergraduate Project under the supervision of Dr. Tammy Avraham. Conducted at the ISL Lab, CS, Technion.

3 What is it?

4 [Figure: labeled scenery image; region labels: sky, trees, hill, brushes, river, water]

6
• “Outdoor” LabelMe category.
• Additional filtering of “open country”, “mountain” and “coast” images.
• A total of 1144 images (256x256 pixels each).
• These images are divided into a “training set” and a “testing set” of equal size.
• Handling synonyms

7
• Use feature vectors to represent patches
• Use the multi-class SVM algorithm to learn the classes to which these patches belong
• Find the optimal parameters for the SVM algorithm
• Classify whole regions
• This project is part of a larger study in which global context was used

8
• The feature vector must be as discriminative as possible
• Our feature vector is a concatenation of:
  ◦ HSV Histogram
  ◦ Edge Directions Histogram / Histogram of Oriented Gradients (HoG)
  ◦ Gray-Level Co-occurrence Matrix (GLCM)
• Based on Vogel & Schiele, IJCV 2007

9
• A histogram is built for each color channel (Hue, Saturation, Value)
• These histograms are then concatenated
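The per-channel histogram idea on this slide can be sketched in a few lines. This is only an illustration, not the project's implementation: the function name `hsv_histogram`, the bin count, and the per-channel normalization are assumptions, and the conversion uses Python's standard `colorsys` module.

```python
import colorsys

def hsv_histogram(pixels, bins=8):
    """Concatenate per-channel H, S, V histograms for a patch.

    pixels: iterable of (r, g, b) tuples with components in 0..255.
    bins:   bins per channel (8 is an arbitrary choice, not from the slides).
    """
    hists = [[0] * bins for _ in range(3)]
    count = 0
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Each channel value lies in [0, 1]; clamp 1.0 into the last bin.
            index = min(int(value * bins), bins - 1)
            hists[channel][index] += 1
        count += 1
    if count == 0:
        return [0.0] * (3 * bins)
    # Normalize each channel so patches of different sizes are comparable
    # (an assumption; the slide does not say whether this is done).
    return [h / count for hist in hists for h in hist]


patch = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
features = hsv_histogram(patch, bins=4)
```

With 4 bins per channel the result has 12 entries: the hue bins first, then saturation, then value.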

10
• The image is first converted to gray-scale
• The Canny algorithm is then used to detect edges
• For each pixel on which an edge is detected, the direction of the gradient is calculated
• The directions are then quantized and distributed to the histogram bins
• The histogram is then normalized

11
• Used as an improvement to the Edge Directions Histogram
• A gray-scale image is used
• The directions and magnitudes of the gradients are calculated for every pixel of the image
• The directions are quantized. Every pixel adds its gradient magnitude to the histogram bin determined by its direction
• More formally:
  ◦ The value of a bin for the directions in the range [α, α+Δα] is: $h(\alpha) = \sum_{p \in I,\ \angle\nabla I(p) \in [\alpha,\, \alpha+\Delta\alpha]} \lVert \nabla I(p) \rVert$
  ◦ Where I is the image and $\nabla I(p)$ is the gradient at the pixel p.
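A minimal sketch of the magnitude-weighted direction histogram described on this slide. Several details are assumptions rather than the project's choices: gradients come from central differences, border pixels are skipped, directions cover the full [0, 2π) range, and the histogram is normalized to sum to 1.

```python
import math

def orientation_histogram(image, bins=8):
    """Magnitude-weighted histogram of gradient directions.

    image: 2D list of gray-scale values (rows of equal length).
    bins:  number of direction bins over [0, 2*pi); 8 is an assumed value.
    """
    rows, cols = len(image), len(image[0])
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    # Central differences on interior pixels only (border handling is a choice).
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Every pixel adds its gradient magnitude to the bin of its direction.
            hist[min(int(angle / width), bins - 1)] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# A horizontal ramp: intensity grows left to right, so gradients point along +x.
ramp = [[10 * x for x in range(5)] for _ in range(5)]
hog = orientation_histogram(ramp, bins=4)
```

For the ramp image all gradient energy falls into the first bin, because every interior gradient has direction 0.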

12
• Grey-Level Co-occurrence Matrix texture measurements have been the workhorse of image texture since they were proposed by Haralick in the 1970s.
• Works on gray-scale images
• Everyday texture terms - rough, silky, bumpy - refer to touch.
• A texture that is rough to touch has:
  ◦ A large difference between high and low points, and
  ◦ A space between highs and lows approximately the same size as a finger.
• Silky would have:
  ◦ Little difference between high and low points, and
  ◦ The differences would be spaced very close together relative to finger size.
Adapted from http://www.fp.ucalgary.ca/mhallbey/tutorial.htm by Mryka Hall-Beyer

13
• The GLCM is a tabulation of how often different combinations of pixel brightness values occur in an image.
• The input to the GLCM computation algorithm is a gray-scale image and a displacement vector (D).
• The size of the GLCM matrix is NxN, where N is the number of quantized gray-levels.
• GLCM(i,j) counts the number of times that a pixel with the value i occurs in the image with a pixel of value j at offset D from it.
• More formally: $(\mathrm{GLCM})_{i,j} = \sum_{a=1}^{R} \sum_{b=1}^{C} \mathbf{1}\left[ I_{a,b} = i \ \wedge\ I_{(a,b)+D} = j \right]$
  ◦ Where: $(\mathrm{GLCM})_{i,j}$ is the value of the GLCM matrix entry at (i,j). I is the image. R is the number of rows of the image. C is the number of columns of the image. $I_{a,b}$ is the gray-scale value at the pixel (a,b) in the image.

14
• We compute 4 GLCMs with the following displacements:
  ◦ (1,0), (1,1), (0,1), (-1,1)
• We then calculate the following statistical measurements on each of the GLCMs:
  ◦ Contrast, Energy, Entropy, Homogeneity, Inverse Difference Moment, Correlation.
• The 6 measurements per GLCM are then concatenated, forming a vector of 24 elements.
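As a rough illustration of how such measurements are derived from a GLCM, the sketch below computes four of the six (contrast, energy, entropy, homogeneity) and concatenates them across matrices. The formulas are common textbook definitions and are an assumption here; the slides do not give the exact definitions used, and Inverse Difference Moment and Correlation are left out for brevity.

```python
import math

def glcm_features(matrix):
    """Return (contrast, energy, entropy, homogeneity) of a GLCM.

    The matrix is normalized to probabilities first. The formulas are the
    common textbook ones and may differ from the project's exact definitions.
    """
    total = float(sum(sum(row) for row in matrix))
    contrast = energy = entropy = homogeneity = 0.0
    if total == 0.0:
        return contrast, energy, entropy, homogeneity
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            p = count / total
            if p == 0.0:
                continue
            contrast += (i - j) ** 2 * p
            energy += p * p
            entropy -= p * math.log(p)
            homogeneity += p / (1.0 + abs(i - j))
    return contrast, energy, entropy, homogeneity


def texture_vector(matrices):
    """Concatenate the measurements of several GLCMs into one flat list."""
    return [value for m in matrices for value in glcm_features(m)]


uniform = [[2, 0], [0, 2]]   # only identical neighbours: a flat texture
checker = [[0, 2], [2, 0]]   # only differing neighbours
vector = texture_vector([uniform, checker])
```

With four measurements and two matrices the vector has 8 elements; with all six measurements and the four displacements from the slide it would have the 24 elements mentioned above.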

16
• A multi-class SVM algorithm with an RBF kernel is used to classify patches
• A grid search was performed to find the optimal SVM parameters: C and γ
• The grid search was implemented to run in parallel in MATLAB
• On a 4-core 2.5 GHz machine the search ran for 2 days
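The project ran its search in MATLAB; the Python sketch below only illustrates the general shape of a parallel grid search over C and γ. Everything named here is hypothetical: `log_grid`, `grid_search`, the power-of-two ranges, and especially `toy_score`, which stands in for the cross-validated SVM accuracy that a real search would compute at each grid point.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def log_grid(low_exp, high_exp, step=2):
    """Powers of two from 2**low_exp up to 2**high_exp, stepping the exponent."""
    return [2.0 ** e for e in range(low_exp, high_exp + 1, step)]


def grid_search(score, c_values, gamma_values, workers=4):
    """Evaluate score(C, gamma) on every grid point and return the best.

    score is a placeholder for cross-validated SVM accuracy; it is supplied by
    the caller because no SVM library is assumed here.
    """
    points = list(product(c_values, gamma_values))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda point: score(*point), points))
    best = max(range(len(points)), key=scores.__getitem__)
    return points[best], scores[best]


# Toy stand-in for cross-validation accuracy, peaking at C=8, gamma=0.5.
def toy_score(c, gamma):
    return 1.0 / (1.0 + abs(c - 8.0) + abs(gamma - 0.5))


best_point, best_score = grid_search(
    toy_score, log_grid(-1, 7), log_grid(-7, 1)
)
```

Grid points are independent of each other, which is what makes the search easy to parallelize. A thread pool keeps the sketch simple; for CPU-bound training in Python a process pool would be the more likely choice.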

18
truth\prediction: field, grass, ground, land, mountain, plain, plants, river, rocks, sand, sea, sky, trees, snow
field: 66.1 0.3 0.9 0.0 19.2 1.1 2.4 0.0 1.2 0.6 4.2 0.4 3.7 0.0
grass: 91.9 0.0 […] 2.0 0.0 3.0 0.0 […] 3.0 0.0
ground: 19.0 0.0 4.5 0.0 46.5 0.0 […] 0.8 1.7 14.1 13.2 0.2 0.0
land:
mountain: 4.0 0.0 0.9 0.0 82.2 0.1 0.2 0.1 […] 0.4 3.7 7.5 0.9 0.0
plain: 18.5 0.0 […] 42.9 0.0 […] 3.6 33.3 1.8 0.0
plants: 49.4 0.0 […] 28.8 0.0 7.7 0.0 […] 14.1 0.0
river: 5.0 0.0 5.8 0.0 34.7 0.0 0.2 0.5 0.0 1.6 34.3 17.9 0.0
rocks: 9.0 0.0 12.5 0.0 67.4 0.0 […] 10.4 0.0 0.7 0.0
sand: 4.9 0.0 0.6 0.0 27.6 0.9 0.0 0.1 0.0 12.0 32.4 21.5 0.0
sea: 4.4 0.0 0.9 0.0 13.1 0.0 […] 0.4 0.0 3.7 61.9 15.6 0.0
sky: 0.0 […] 4.4 0.0 […] 0.5 3.6 91.4 0.0 0.1
trees: 30.6 0.0 […] 49.2 0.0 0.5 0.0 […] 0.3 1.2 1.1 17.1 0.0
snow: 0.0 […] 11.8 0.0 […] 27.1 0.0 […] 2.4 11.8 31.8 0.0 15.3
General accuracy rate: 71.76%

19
truth\prediction: field, grass, ground, land, mountain, plain, plants, river, rocks, sand, sea, sky, trees, snow
field: 66.4 0.5 1.0 0.0 19.2 1.0 2.0 0.1 1.2 0.5 4.2 0.4 3.7 0.0
grass: 92.9 0.0 2.0 0.0 2.0 0.0 […] 3.0 0.0
ground: 19.2 0.0 4.9 0.0 46.7 0.0 […] 0.8 1.7 13.7 12.8 0.2 0.0
land:
mountain: 3.9 0.1 1.0 0.0 82.0 0.1 […] 0.4 3.8 7.3 0.9 0.0
plain: 18.5 0.0 0.6 0.0 42.9 0.0 […] 4.8 31.5 1.8 0.0
plants: 53.8 0.0 0.6 0.0 27.6 0.0 3.2 0.0 […] 14.7 0.0
river: 5.5 0.0 6.1 0.0 34.3 0.0 […] 0.6 0.0 1.3 34.4 17.8 0.0
rocks: 9.0 0.0 10.4 0.0 68.1 0.0 […] 10.4 0.0 2.1 0.0
sand: 5.1 0.0 0.3 0.0 28.2 0.7 0.0 […] 13.1 32.9 19.7 0.0
sea: 4.4 0.0 0.9 0.0 13.1 0.0 […] 0.2 0.0 3.6 62.5 15.4 0.0
sky: 0.0 […] 4.5 0.0 […] 0.4 3.5 91.4 0.0 0.1
trees: 30.5 0.0 […] 50.0 0.2 […] 0.0 […] 1.1 0.9 17.2 0.0
snow: 0.0 […] 11.8 0.0 […] 25.9 0.0 […] 2.4 12.9 34.1 0.0 12.9
General accuracy rate: 71.85%

20
• The accuracy rates are correlated with the sizes of the classes
  ◦ Unbalanced dataset
  ◦ Learning the prior
• Members of smaller classes are often confused with the semantically most similar larger class
• Labeling noise
• Upper bound on the accuracy rate of local patches
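To make the effect of an unbalanced dataset concrete, the sketch below compares overall accuracy with the mean of the per-class accuracies on a small confusion matrix. The counts are invented for illustration and are not taken from the project's results; the function names are likewise hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of all samples on the diagonal of a count confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def mean_class_accuracy(confusion):
    """Average of per-class accuracies, ignoring classes with no samples."""
    rates = [row[i] / sum(row) for i, row in enumerate(confusion) if sum(row)]
    return sum(rates) / len(rates)


# Hypothetical counts (rows: truth, columns: prediction) for sky, sea, snow.
counts = [
    [90, 10, 0],   # sky: large class, mostly correct
    [20, 30, 0],   # sea: medium class
    [4, 4, 2],     # snow: small class, mostly confused with larger ones
]
overall = overall_accuracy(counts)
balanced = mean_class_accuracy(counts)
```

The overall figure is dominated by the large class, while the per-class average exposes how poorly the small class is recognized.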

21
• Using the HSI+GLCM+Edges Feature Vector
• Every region contains several patches
• Associating a region with a category/class gives us more global knowledge about the scene
• Two voting methods
  ◦ A single vote per patch
  ◦ A weighted vote per patch, according to its probability (an output of the probabilistic SVM)
• Will this improve the accuracy rates? Remember that there are usually several patches that form a region.
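The two voting methods can be sketched as follows. This is one possible reading, not necessarily the project's: in the weighted variant each patch votes for its most probable class with that probability as the weight (summing the full probability vectors would be another reasonable interpretation). The probabilities in the example are made up.

```python
from collections import defaultdict

def majority_vote(patch_labels):
    """Each patch casts one vote for its predicted class."""
    votes = defaultdict(float)
    for label in patch_labels:
        votes[label] += 1.0
    return max(votes, key=votes.get)


def weighted_vote(patch_probabilities):
    """Each patch votes for its top class, weighted by that class's probability.

    patch_probabilities: list of {class_name: probability} dicts, one per patch.
    """
    votes = defaultdict(float)
    for probabilities in patch_probabilities:
        label = max(probabilities, key=probabilities.get)
        votes[label] += probabilities[label]
    return max(votes, key=votes.get)


# Hypothetical probabilistic SVM outputs for three patches of one region.
region = [
    {"sea": 0.40, "sky": 0.35, "river": 0.25},
    {"sea": 0.40, "sky": 0.35, "river": 0.25},
    {"sea": 0.05, "sky": 0.95},
]
hard_labels = [max(p, key=p.get) for p in region]
single = majority_vote(hard_labels)
weighted = weighted_vote(region)
```

In this example the two methods disagree: two uncertain "sea" patches outvote one confident "sky" patch under single voting, while the weighted vote favors "sky".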

22
truth\prediction: field, grass, ground, land, mountain, plain, plants, river, rocks, sand, sea, sky, trees, snow
field: 71.6 0.0 […] 21.3 1.4 0.7 0.0 0.7 […] 2.1 0.0 1.4 0.0
grass: 100.0 0.0
ground: 27.6 0.0 3.4 0.0 44.8 0.0 […] 3.4 6.9 13.8 0.0
land:
mountain: 5.8 0.0 0.2 0.0 85.8 0.0 […] 0.4 2.3 4.8 0.6 0.0
plain: 12.5 0.0 […] 50.0 0.0 […] 37.5 0.0
plants: 45.0 0.0 […] 35.0 0.0 5.0 0.0 […] 15.0 0.0
river: 10.9 0.0 2.2 0.0 39.1 0.0 […] 32.6 15.2 0.0
rocks: 14.3 0.0 7.1 0.0 71.4 0.0 […] 7.1 0.0
sand: 5.5 0.0 […] 34.5 0.0 […] 9.1 29.1 21.8 0.0
sea: 5.0 0.0 0.6 0.0 13.3 0.0 […] 1.7 64.1 15.5 0.0
sky: 0.0 […] 6.0 0.0 […] 0.4 0.8 92.8 0.0
trees: 39.6 0.0 […] 40.7 0.0 […] 1.1 0.0 […] 2.2 16.5 0.0
snow: 0.0 […] 40.0 0.0 […] 40.0 0.0 20.0
General accuracy rate: 70.77%

23
truth\prediction: field, grass, ground, land, mountain, plain, plants, river, rocks, sand, sea, sky, trees, snow
field: 70.2 0.0 […] 22.0 1.4 0.7 0.0 0.7 […] 2.8 0.0 1.4 0.0
grass: 100.0 0.0
ground: 27.6 0.0 3.4 0.0 44.8 0.0 […] 10.3 13.8 0.0
land:
mountain: 4.6 0.0 0.4 0.0 86.6 0.0 […] 0.4 2.3 4.8 0.8 0.0
plain: 12.5 0.0 […] 50.0 0.0 […] 37.5 0.0
plants: 45.0 0.0 […] 35.0 0.0 5.0 0.0 […] 15.0 0.0
river: 10.9 0.0 2.2 0.0 37.0 0.0 […] 34.8 15.2 0.0
rocks: 7.1 0.0 14.3 0.0 71.4 0.0 […] 7.1 0.0
sand: 3.6 0.0 […] 34.5 0.0 […] 9.1 27.3 25.5 0.0
sea: 5.0 0.0 0.6 0.0 12.7 0.0 […] 1.7 63.0 17.1 0.0
sky: 0.0 […] 4.8 0.0 […] 0.2 0.8 94.2 0.0
trees: 30.8 0.0 […] 47.3 0.0 […] 1.1 0.0 […] 2.2 18.7 0.0
snow: 0.0 […] 20.0 0.0 […] 60.0 0.0 20.0
General accuracy rate: 71.34%

24
• The project was combined with:
  ◦ Non-Local Characterization of Scenery Images: Statistics, 3D Reasoning, and a Generative Model / Tamar Avraham and Michael Lindenbaum
• Submitted to CVPR 2011:
  ◦ Multiple Region Classification for Scenery Images using Top-Bottom Order and Boundary Shape Cues
• The following are now taken into account:
  ◦ The relative location of the region
  ◦ The height of the region
  ◦ The boundary between the regions
  ◦ Texture and color

25
• Goal: to show that region classification using global + local descriptors is better than using only local descriptors
[Figure: labels from layout only + labels from color & texture only = combined labeling (sky, mountain, sea, rocks)]

26 [Figure: top/bottom order of regions; labels: sky, trees, ground, sea]

27 [Figure panels: Ground truth, Input image, Relative location, Boundary shape, Color & texture, All cues]

28
19 categories!
• Accuracy per class:
  ◦ Color & texture: higher accuracy for trees, field, rocks, plants, snow
  ◦ Layout: better for sky, mountain, sea, sand
  ◦ Other classes performance: very low due to their number.

Cue                               Accuracy
Color&Texture                     0.615
Relative Location                 0.503
Boundary Shape                    0.452
Relative Loc. + Boundary Shape    0.573
Color&Texture + Relative Loc.     0.676
Color&Texture + Boundary Shape    0.641
All (ORC)                         0.682

29 [Figure: class hierarchy; captions: basic classes, high level categories; groups: land structure, land cover]
Lowercase labels: sky, sea, river, lake, mountain, cliff, plateau, land, field, valley, bank, beach, sand, ground, rocks, plants, trees, grass, snow
Uppercase labels: SKY, WATER, LAND, SAND, GROUND, ROCKS, PLANTS, TREES, GRASS, SNOW, MOUNTAIN, PLAIN, VALLEY, BANK

30 [Figure panels: ground truth, Input image, M-ORC]

32
• Scenery images
• Feature vectors
• Optimal parameters
• Patches classification
• Regions classification
• Incorporating global context

33
• Segmentation
• Scene categorization
• Extension to other domains
• Picture alignment
