Presentation is loading. Please wait.

Presentation is loading. Please wait.

IMAGE DATABASES Prof. Hyoung-Joo Kim OOPSLA Lab. Computer Engineering Seoul National University.

Similar presentations


Presentation on theme: "IMAGE DATABASES Prof. Hyoung-Joo Kim OOPSLA Lab. Computer Engineering Seoul National University."— Presentation transcript:

1 IMAGE DATABASES Prof. Hyoung-Joo Kim OOPSLA Lab. Computer Engineering Seoul National University

2 Contents Introduction Image database systems Indexing issues and basic mechanisms A taxonomy on image indexes Color-spatial hierarchical indexes Signature-based color-spatial retrieval Summary

3 Introduction Traditional DBMS –effective in managing structured data –not effective for images that are non-alphanumeric and unstructured Applications to manage images –medical application –geographic information system –satellite image database –criminal database –interior design –art galleries and museum management –etc...

4 Introduction(2) Content-based image retrieval techniques –techniques that retrieve image based on their visual properties such as texture, color, shape, etc –sequentially comparing image feature is time- consuming and impractical –need content-based image index

5 Image Database System Additional functionalities –feature extraction the system must be able to analyze an image to extract key features such as shape, color, texture –feature-based indexing the system must build indexes based on the features extracted –content-based retrievals the system should support a wide range of queries that involve the contents of the image –measure of similarity the system requires a measure to capture what we humans perceive as similarity between two images

6 Architecture of Image DBMS Input Image PREPROCESSING MODULE QUERY MODULE Image Input/Scanner Feature Extraction Update Index/ Database Interactive Query Formulation Browsing & Feedback Feature Extraction Feature Matching Runtime Processor Concurrency Control & Recovery Manager Feature/Image Database User Query Output Retrieved Image

7 Indexing Issues Three issues in content-based index design –determine a representation for the indexing feature –determine a similarity measure between two images based on their representation –determine an appropriate index organization

8 Indexing Issues Desirable properties of a representation(first issue) –exactness –space efficiency –computationally inexpensive similarity matching –preservation of the similarity between the features –automatic extraction –insensitivity to noise, distortion, rotation

9 Indexing Issues Main criterion of the similarity measure(second issue) –if two images are similar under the indexing feature, then their representations should remain so. Several alternatives to determine the similarity –exact match the representation of an image feature is usually coarse, in sense that images with similar feature will be mapped to the same representation –approximate match the degree of similarity between the image representations is computed based on some approximation techniques

10 Indexing Issues Criteria for selection of an index structure –the similarity measure can be supported efficiently –storage efficiency –maintenance(update) overhead Appropriate index structure –be determined by the representation and similarity measure –example if image feature is represented as a vector, an the similarity measure is the Euclidean distance, then a natural choice is the multi-dimensional point access method

11 Basic Index Schemes Spatial access methods(SAMs) –basic idea extract k image features for each image map images into points in a k-dimensional feature space use SAMs such as the grid file, quad-tree, the family of R- tree –problem: “high-dimensionality curse” these techniques perform no better than sequential scanning as the dimension becomes sufficiently large

12 Basic Index Schemes Inverted file –basic idea an inverted list is created for each distinct key(indexed features) the inverted list consists of a list of pointers to the objects that contain features that are similar to the indexed feature –problem high storage overhead & expensive update Signature file –refer [Text Database]

13 Taxonomy on Image Indexes Shape feature Spatial relationship Texture Color

14 Taxonomy on Image Indexes Content-based indexes Spatial relationship Texture Objects Shape 1-D string Inverted file Numerical vectors Color Color histogramColor-Spatial Signature2-D string Multi-level signature file Sequential file Multi-dimensional index Multi-dimensional index Multi-level histogram 2-level B+-tree 3-tier color index Tamura features Multi-dimensional index Signature file Similarity against representative objects Signature Inverted file

15 Shape Feature Representation –boundary information –a collection of rectangle that forms a rectangular cover of the shape –mathematical morphology –pattern spectrum –numerical vector using 2-D Fourier transformation or Wavelet transformation Problems –shape vary widely from object to object –unless the images have very distinct shape, the performance may suffer

16 Spatial relationship Representation –2-D string semantic representation for spatial relationship using a two-dimensional string projection of the symbols along the x-axis and y-axis example : “O1 is to left of O2 which is left of O3” the projection on the x-axis - O1 < O2 < O3 variations and extensions to the 2-D string have been proposed –Problem spatial relationships can be drastically affected by the orientation of the image

17 Texture and Color Texture –can be represented by the coarseness, contrast, directionality –the extraction of text information is a computationally intensive operations Color –color histogram that captures the color composition of images –color alone is not sufficient to characterize an image consider two image - one with the top half blue and bottom half red, while the other’s left half is red and it’s right half is blue although these two images are similar in color composition, the are entirely different to a human observer. –recent studies have proposed to integrate color and its spatial distribution

18 Color-spatial hierarchical Indexes Hierarchical Indexes –multiple indexing mechanisms are integrated to form a single index structure –three indexes that have been proposed to integrated color and spatial information for retrieval Two-level B + -tree structure Three-tier color index Sequential Multi-Attribute Tree(SMAT)

19 Two-level B + -tree Feature extraction –the color-spatial information of an image is modeled by splitting the image into 9 equal sub- areas(3  3) –the color information within each sub-area is represented by a color histogram –one can obtain a more accurate similarity by matching the corresponding color histogram of two image –the color histogram within each sub-area is mapped into a numerical key Retrieval technique

20 Two-level B+-tree Retrieval technique –use two level information –the first level describes the composition of colors corresponding to the histogram of the region the colors are grouped into 11 bins each group is assigned a range which bounds the percentage of pixels in the mage with colors of group –the second level contains the average H, average V, and average C values of all the 11 histogram bins

21 Two-level B+-tree Level 1: B+-tree on Normal Pixel Count Level 2: B+-tree on Average H,V, and C values

22 Three-tier Color Index Layer 1: –dominant color classification a fixed number of dominant colors is extracted the dominant colors are those with the largest number of pixel count Layer 2: –multidimensional R-tree structure the image can be assigned to a partition prune away image within the candidate partitions that are not relevant Layer 3: –multi-level color histogram(quad-tree structure) compare the histograms of the query image with those of remaining potential candidate image

23 Tree-tier Color Index Tier 1: Dominant Color Classification Tier 2: R-tree Tier 1: Multi-Level Color Histogram K = 1 K = 2 K = 3 0 1 2 n(no.of colors) (0,1) (0,2).. (0,n) (1,2) …...…(2,n) …. (0,1,2)... (0,1,n) ……. (0,2,n) …...… …. K : dominant colors

24 SMAT Height-balanced color-spatial index –the problem with the two approaches individual tree structures(B+-tree, R-tree, Dominant Color Classification) are height-balanced, the entire hierarchical index structure may not be so. –height-balancing of SMAT SMAT is a multi-tier index structure similar to two approaches, but remains height-balancing of the entire index structure by controlling the growth of the tree using a certain threshold which is application-specific value. refer to the VLDB Journal(1998) 7: 115-128 for more information

25 Signature-based Color-Spatial Retrieval Representation of color-spatial information –an image is partitioned into a grid of m  n cells of equal size –the colors that can be used to represent a cell are determined –for a given color, each cell is examined to determine the percentage of the total number of pixels in the cell having that color 0 12 31 An image partitioned into a 4x8 grid Cell does not satisfy the threshold Cell does satisfy the threshold - an image is represented by 32-bit color signature, 10001111001000110000000001100011

26 The retrieval process –an image with k colors has k color signatures –let Q i and D i denote the signature of color i for a query image Q and a database image D. –let the representative color sets of Q and D be C Q and C D. Signature-based Color-Spatial Retrieval

27 Summary Promising area that require further research –performance evaluation comparative study will be useful for application designers and practitioners to pick the best method for their applications –more on access method designing efficient access methods will make the content- based retrieval techniques more practical and useful –concurrent access and distributed indexing we expect to see more real-time application as well as application running in parallel or distributed environment

28 Summary –Integration and optimization most content-based image retrieval techniques only capture a part of the image’s semantics important issues –selecting an “optimal” set of image features that fits best for an application –developing techniques that can integrate them into achieve the optimal results one promising method(semantic-based retrieval technique) –use content-based techniques as the basis, but also exploits semantic meanings of the image and queries to support concept-based queries


Download ppt "IMAGE DATABASES Prof. Hyoung-Joo Kim OOPSLA Lab. Computer Engineering Seoul National University."

Similar presentations


Ads by Google