Implementing GIST on the GPU
Reference
  Original work: Aude Oliva, Antonio Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. International Journal of Computer Vision, Vol. 42(3).
  CPU Implementation Torralba.pdf
  Our work: parallelize the method to run on the GPU.
Introduction
  Goal: recognition of real-world scenes.
  Spatial Envelope: a very low-dimensional representation of the scene.
  It captures five properties of the scene:
    Naturalness
    Openness
    Roughness
    Expansion
    Ruggedness
Intuitive Notion
  Naturalness
    Straight lines indicate man-made structures.
    Crooked, rough lines indicate a natural environment.
  Openness
    The presence of a horizon reflects a highly open environment.
  Roughness
    Refers to the size of the scene's major components.
    Is correlated with the fractal dimension of the scene.
Intuitive Notion contd…
  Degree of Expansion (DoE)
    Converging parallel lines let the viewer perceive the depth gradient of the space.
    A flat view of a building has a low DoE.
    A street with long vanishing lines has a high DoE.
  Degree of Ruggedness
    The deviation of the ground with respect to the horizon.
    Open environments have a flat, horizontal ground level.
    Mountainous landscapes have rugged ground.
    Ruggedness produces oblique contours and hides the horizon.
    Man-made environments are built on flat ground and are less rugged.
Basic Algorithm
  1. Create the Gabor filter bank.
  2. Preprocess the image:
       Local contrast normalization
       Local luminance invariance normalization for color images
  3. Compute the descriptor:
       Convolve the image with the filters.
       For each filter, divide the filtered image into blocks.
       The mean of each block is one number in the feature vector.
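The local contrast normalization step can be illustrated with a minimal CPU sketch. It is a simplified spatial-domain stand-in, not necessarily the formulation used in the original work: the function name, the square window of half-width `radius`, and the constant `eps` are all assumptions made for illustration.

```python
import math

def local_contrast_normalize(img, radius=1, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation.

    img is a list of rows of floats. `radius` and `eps` are illustrative
    assumptions, not values taken from the slides.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the neighbourhood, clipping the window at the image border.
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            # eps avoids division by zero in perfectly flat regions.
            out[y][x] = (img[y][x] - mean) / (math.sqrt(var) + eps)
    return out
```

Every output pixel depends only on a small neighbourhood of the input, so this step also maps naturally onto one GPU thread per pixel.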
Parameters Involved
  Number of scales: typically 3
  Number of orientations per scale: typically 8
  Image size: typically 320x240 to 1024x1024
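These parameters fix the length of the feature vector: one block mean per block per filter. The short sketch below works through the arithmetic. The 4x4 block grid (`blocks_per_side=4`) is an assumption made for illustration; the slides do not state the number of blocks.

```python
def descriptor_length(num_scales=3, orientations_per_scale=8, blocks_per_side=4):
    """Number of values in the descriptor for one image channel.

    blocks_per_side is a hypothetical choice; the other defaults are the
    typical values listed on the slide.
    """
    num_filters = num_scales * orientations_per_scale   # 3 * 8 = 24 filters
    num_blocks = blocks_per_side * blocks_per_side       # 4 * 4 = 16 blocks
    return num_filters * num_blocks

print(descriptor_length())  # 384
```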
Ways to Parallelize
  Over scales * orientations per scale
    Number of threads = 24
    Far fewer than the number of pixels in a typical image
  Over the M x N image pixels
    Number of threads = M*N >> 24
    High degree of parallelism
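The gap between the two strategies can be made concrete with a little arithmetic. The helper below is only an illustration (its name is hypothetical); it counts the threads available under each strategy for the typical parameter values given earlier.

```python
def parallelism(width, height, num_scales=3, orientations_per_scale=8):
    """Return (threads per filter strategy, threads per pixel strategy, ratio)."""
    per_filter = num_scales * orientations_per_scale  # one thread per filter
    per_pixel = width * height                        # one thread per pixel
    return per_filter, per_pixel, per_pixel / per_filter

print(parallelism(320, 240))    # (24, 76800, 3200.0)
print(parallelism(1024, 1024))  # (24, 1048576, 43690.666666666664)
```

Even at the smallest typical image size, the per-pixel strategy exposes thousands of times more independent work items than the per-filter strategy.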
Parallelization for creation of Gabor filter
  Pixel level: each thread calculates one Gabor filter value.
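The pixel-level idea can be sketched on the CPU as a pure function of the pixel coordinates. The textbook spatial-domain Gabor form is used here as an assumption; the original implementation may build its filters differently (for example, directly in the frequency domain), and the parameter names `wavelength`, `theta`, `sigma`, and `gamma` are illustrative.

```python
import math

def gabor_value(x, y, size, wavelength, theta, sigma, gamma=1.0):
    """Value of a real Gabor kernel at pixel (x, y) of a size x size filter."""
    # Centre the coordinates on the middle of the filter.
    cx = x - (size - 1) / 2.0
    cy = y - (size - 1) / 2.0
    # Rotate the coordinates by the filter orientation.
    xr = cx * math.cos(theta) + cy * math.sin(theta)
    yr = -cx * math.sin(theta) + cy * math.cos(theta)
    # Gaussian envelope multiplied by a cosine carrier.
    envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma))
    carrier = math.cos(2.0 * math.pi * xr / wavelength)
    return envelope * carrier

def gabor_filter(size, wavelength, theta, sigma):
    # On the GPU, each (x, y) iteration of this loop nest would be one thread.
    return [[gabor_value(x, y, size, wavelength, theta, sigma)
             for x in range(size)]
            for y in range(size)]
```

Because `gabor_value` reads nothing but its own arguments, no thread needs to communicate with any other while the filter is being filled in.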
Parallelization for prefiltering image
  Pixel level
  Fast Fourier Transforms
  Pixel-by-pixel multiplications => inherent parallelism
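The reason the multiplications are independent is the convolution theorem: filtering becomes an element-wise product of spectra. The sketch below checks this in one dimension with a naive DFT, chosen for brevity; a real implementation would use a 2-D FFT library, and all function names here are illustrative.

```python
import cmath

def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def filter_in_frequency_domain(signal, kernel):
    spectrum_s = dft(signal)
    spectrum_k = dft(kernel)
    # Each product uses only one frequency bin: one GPU thread per bin.
    product = [s * k for s, k in zip(spectrum_s, spectrum_k)]
    return [v.real for v in dft(product, inverse=True)]

def circular_convolution(signal, kernel):
    """Direct circular convolution, used only to check the result."""
    n = len(signal)
    return [sum(signal[m] * kernel[(t - m) % n] for m in range(n))
            for t in range(n)]
```

Both routes give the same filtered signal up to rounding error, but only the frequency-domain route reduces the filtering itself to independent per-element multiplications.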
Calculating Descriptor
  RGB image components -> Filters -> Division into blocks -> Descriptor
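The block-averaging step that turns one filtered image into descriptor values can be sketched as follows. This is a minimal CPU version under stated assumptions: `response` is one filter's output as a list of rows, and the function name and block-boundary rounding are illustrative choices.

```python
def block_means(response, blocks_y, blocks_x):
    """Mean of each block of a filter response, in row-major block order."""
    h, w = len(response), len(response[0])
    means = []
    for by in range(blocks_y):
        # Integer block boundaries; blocks may differ by one row or column
        # when the image size is not a multiple of the grid size.
        y0, y1 = by * h // blocks_y, (by + 1) * h // blocks_y
        for bx in range(blocks_x):
            x0, x1 = bx * w // blocks_x, (bx + 1) * w // blocks_x
            total = sum(response[y][x]
                        for y in range(y0, y1)
                        for x in range(x0, x1))
            means.append(total / ((y1 - y0) * (x1 - x0)))
    return means

response = [[1, 1, 2, 2],
            [1, 1, 2, 2],
            [3, 3, 4, 4],
            [3, 3, 4, 4]]
print(block_means(response, 2, 2))  # [1.0, 2.0, 3.0, 4.0]
```

Concatenating the lists returned for every filter (and every color component) gives the full feature vector.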
Graphs
Speed up
  [Table: Image Size | CPU Time (ms) | GPU Time (ms)]
Thank You !!