Presentation on theme: "Indexing Techniques for Multimedia Databases"— Presentation transcript:
1 Indexing Techniques for Multimedia Databases. Multimedia Technologies, 7/17/97. Outline: Multimedia Similarity, Search Structure, Image Indexing, Video Indexing. Kien A. Hua
2 Traditional DBMS: Designed to manage one-dimensional datasets consisting of simple data types, such as strings and numbers. Limited kinds of queries: exact match, partial match, and range queries. Well-understood indexing methods: B-trees, hashing.
3 Characteristics of Multimedia Queries: We normally retrieve a few records from a traditional DBMS by specifying exact queries based on the notion of “equality”. The types of queries expected in an image/video DBMS are relatively vague or fuzzy, and are based on the notion of “similarity”. The indexing structure should be able to satisfy similarity-based queries for a wide range of similarity measures.
4 Content-Based Retrieval: It is necessary to extract the features that characterize the image and to index the image on these features. Examples: shape descriptions, texture properties. Typically there are a few different quantitative measures that describe the various aspects of each feature. Example: the texture attribute of an image can be modeled as a 3-dimensional vector with measures of directionality, contrast, and coarseness.
5 Introduction: Multimedia requires support for multi-dimensional datasets, e.g., a 256-dimensional feature vector. That implies specialized kinds of queries and new indexing approaches. Two choices: (1) map n-dimensional data to a single dimension and use traditional indexing structures (B-trees); (2) develop specialized indexing structures.
6 Low-Dimensional Indexing Applications: Spatial databases (GIS, CAD/CAM). Number of dimensions: 2-4. Spatial queries, for example: which objects intersect a given 2D or 3D rectangle; which objects intersect a given object. Specialized indexing structures: quad-tree, BSP-tree, K-D-B-tree, R-tree, R+-tree, R*-tree, X-tree, …
7 High-Dimensional (HD) Indexing Applications: Multimedia databases (images, sounds, movies). Map each multimedia object to an n-dimensional point called a feature vector. Number of dimensions: typically [value missing]. Indexing: only the feature vectors are actually indexed. Data structures used: the same as for spatial databases (R-trees, X-trees), or structures tailored specifically to indexing feature vectors (TV-tree).
8 HD Considerations (1): Main problem: in general there is no total ordering of d-dimensional objects that preserves spatial proximity. Data comes in two forms: n-dimensional points, and n-dimensional objects extended in space. Objects can have rather complex shapes (extents). Typically we abstract from the actual form and index some simpler shape, such as a Minimum Bounding Box (MBB) or an n-dimensional hypersphere.
9 HD Considerations (2): “Dimensionality curse”. As the number of dimensions increases, performance tends to degrade (often exponentially). Indexing structures become inefficient for certain kinds of queries. Performance is often CPU-bound, not just I/O-bound as in a traditional DBMS.
10 HD Queries Overview: No standard algebra or query language. The set of operators strongly depends on the application domain. Queries are usually expressed by an extension of SQL (e.g., abstract data types). Although there are no standards, some queries are common.
11 Multiattribute and Spatial Indexing of Multimedia Objects: Spatial databases: queries involve regions that are represented as multidimensional objects. Example: a rectangle in a 2-dimensional space involves four values: two points and two values for each point. Access methods that index on multidimensional keys yield better performance for spatial queries. Multimedia databases: multimedia objects typically have several attributes that characterize them. Example: attributes of an image include coarseness, shape, color, etc. Multimedia databases are also good candidates for multikey search structures.
12 Measure of Similarity: A suitable measure of similarity between an image feature vector F and a query vector Q is the weighted metric $W = (F - Q)^T A (F - Q)$, where A is an n×n matrix that can be used to specify suitable weighting measures.
13 Similarity Based on Euclidean Distance: Example with $A = I$ (the identity matrix): $F_1 = (3, 4, 6)^T$, $F_2 = (2, 4, 7)^T$, $F_3 = (3, 4, 7)^T$, $Q = (2, 4, 6)^T$.
$D(F_1, Q) = [1\ 0\ 0] \cdot I \cdot [1\ 0\ 0]^T = 1$
$D(F_2, Q) = [0\ 0\ 1] \cdot I \cdot [0\ 0\ 1]^T = 1$
$D(F_3, Q) = [1\ 0\ 1] \cdot I \cdot [1\ 0\ 1]^T = 2$
14 Similarity Based on Euclidean Distance (cont.): [Figure: F1, F2, F3, and Q plotted on axes Feature 1 and Feature 2.] Points that lie at the same distance from the query point are all equally similar, e.g., F1 and F2.
15 Similarity Based on Weighted Euclidean Distance: $D(F, Q) = (F - Q)^T A (F - Q)$, where A is a diagonal matrix. Example: $F_1 = (4, 5, 7)^T$, $F_2 = (3, 5, 8)^T$, $Q = (3, 5, 7)^T$, $A = \mathrm{diag}(1, 1, 2)$.
$D(F_1, Q) = [1\ 0\ 0] \cdot A \cdot [1\ 0\ 0]^T = 1$
$D(F_2, Q) = [0\ 0\ 1] \cdot A \cdot [0\ 0\ 1]^T = 2$
Since $D(F_1, Q) < D(F_2, Q)$, F1 is more similar to Q.
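The weighted distance on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming the quadratic form (F - Q)^T A (F - Q) and the example vectors as read from the slide; the function name `weighted_distance` is illustrative, not from the presentation.

```python
def weighted_distance(f, q, a):
    """Return (f - q)^T a (f - q) for vectors f, q and a square matrix a."""
    d = [fi - qi for fi, qi in zip(f, q)]
    n = len(d)
    return sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))

# Values from the weighted example: A = diag(1, 1, 2).
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
Q = [3, 5, 7]
F1 = [4, 5, 7]
F2 = [3, 5, 8]

d1 = weighted_distance(F1, Q, A)
d2 = weighted_distance(F2, Q, A)
print(d1, d2)  # 1 2
```

Passing the identity matrix for `a` gives the unweighted case of the previous slides.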
16 How to determine the weights? The variance of the individual feature measures can be used as their weights: $A = \mathrm{diag}(S_1^2, S_2^2, S_3^2)$, where $S_i^2$ is the variance of the i-th feature measure. Rationale: a feature with a larger variance is more discriminating.
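As a rough illustration of variance-based weights, the sketch below builds the diagonal matrix from a set of feature vectors. The slide does not say whether the population or the sample variance is meant; population variance (`statistics.pvariance`) is an assumption here, and the sample data is made up.

```python
from statistics import pvariance

def variance_weights(vectors):
    """Diagonal matrix whose i-th diagonal entry is the variance of feature i."""
    n = len(vectors[0])
    variances = [pvariance([v[i] for v in vectors]) for i in range(n)]
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical feature vectors, one per image.
features = [[4, 5, 7], [3, 5, 8], [3, 5, 7]]
A = variance_weights(features)
for row in A:
    print(row)
```

In this sample the second feature never varies, so its weight is zero and it does not contribute to the distance.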
17 Query Types: Querying in an image DBMS is envisioned to be iterative in nature. Vague queries: queries at the earlier stage can be very “loose”, e.g., “Retrieve images containing textures similar to this sample.” K-nearest-neighbor queries: the user specifies the number of close matches to the given query point, e.g., “Retrieve 10 images containing textures directionally similar to this sample.” Range queries: an interval is given for each dimension of the feature space, and all the records that fall inside this hypercube are retrieved. [Figure: three panels showing data points around a query point Q: a range query with radius r; r is large => vague query; r is small => 3-nearest-neighbor query.]
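The k-nearest-neighbor and range queries described above can be written as plain linear scans, which is the baseline that an index tries to beat. This is a sketch with made-up points; the function names are illustrative.

```python
import heapq

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def knn_query(points, q, k):
    """Return the k points closest to q."""
    return heapq.nsmallest(k, points, key=lambda p: sq_dist(p, q))

def range_query(points, low, high):
    """Return the points inside the hypercube low[i] <= x[i] <= high[i]."""
    return [p for p in points
            if all(lo <= x <= hi for x, lo, hi in zip(p, low, high))]

points = [(1, 1), (2, 2), (5, 5), (9, 9)]
nearest = knn_query(points, (0, 0), 2)
in_box = range_query(points, (0, 0), (3, 3))
print(nearest, in_box)
```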
18 Indexing Multimedia Objects: Can’t we index multiple features using a B+-tree? A B+-tree defines a linear order, so similar objects (e.g., O1 and O2) can be far apart in the indexing order. Why multidimensional indexing? A multidimensional index defines a “spatial order”: conceptually similar objects are spatially near each other in the indexing order (e.g., O1 and O2). [Figure: objects O1 and O2 plotted on axes Feature X and Feature Y.]
20 Space Filling Curves: Assume that each dimension is represented by a fixed-bit-width number. Partition the universe with a grid. Label each grid cell with a unique number called the curve value. For points, store that number in a traditional one-dimensional index. Objects can be handled through decomposition into multiple cells. [Figure: Z-ordering curve with 2 bits.]
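One common way to compute a Z-ordering curve value is to interleave the bits of the cell coordinates. The sketch below does this for two dimensions; which dimension supplies the higher bit of each pair is an assumption, since the slide's figure is not preserved.

```python
def z_value(x, y, bits=2):
    """Interleave the bits of x and y; x supplies the higher bit of each pair."""
    z = 0
    for i in range(bits - 1, -1, -1):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z

# Curve values for a 4 x 4 grid (2 bits per dimension).
for y in range(4):
    print([z_value(x, y) for x in range(4)])
```

The resulting integers can be stored in any one-dimensional index, such as a B-tree.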
21 k-d Trees: A k-d tree is a multidimensional binary search tree. Each node consists of a “record” and two pointers, each of which is either null or points to another node. Nodes have levels, and each level of the tree discriminates on one attribute. The partitioning of the n-dimensional search space alternates between the attributes from level to level. Example: 2-d tree. Input sequence: A = (65, 50), B = (60, 70), C = (70, 60), D = (75, 25), E = (50, 90), F = (90, 65), G = (10, 30), H = (80, 85), I = (95, 75). [Figure: the resulting 2-d tree, with A(65, 50) at the root (discriminator X), B(60, 70) and C(70, 60) at the next level (discriminator Y), and D through I below.]
22 k-d Tree: Search Algorithm
Notation, for a node L(..., KA(L), ...) with children M = Low(L) and N = High(L): Disc(L) is the discriminator at L’s level; KA(L) is the A-attribute value of L; Low(L) is the left child of L; High(L) is the right child of L.
Algorithm: search for P(K1, ..., Kn).
Q := Root; /* Q will be used to navigate the tree */
While NOT DONE do the following:
If Q is null, P is not in the tree and we are DONE.
If Ki(P) = Ki(Q) for i = 1, ..., n, then we have located the node and we are DONE.
Otherwise, let A = Disc(Q); if KA(P) < KA(Q) then Q := Low(Q), else Q := High(Q).
Performance: O(log N) when the tree is balanced, where N is the number of records.
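The search procedure, together with the matching insertion rule, can be sketched in Python as follows. This is an illustrative implementation under the slide's conventions (smaller key goes to Low, otherwise High); the class and function names are assumptions, and the data is the input sequence from the 2-d tree example.

```python
class Node:
    def __init__(self, point, name=None):
        self.point = point
        self.name = name
        self.low = None   # subtree with smaller discriminator values
        self.high = None  # subtree with greater-or-equal discriminator values

def insert(root, point, name=None):
    """Insert a point, cycling the discriminator through the attributes by level."""
    if root is None:
        return Node(point, name)
    q, level = root, 0
    while True:
        a = level % len(point)  # discriminator at this level
        side = "low" if point[a] < q.point[a] else "high"
        child = getattr(q, side)
        if child is None:
            setattr(q, side, Node(point, name))
            return root
        q, level = child, level + 1

def search(root, point):
    """Return the node holding `point`, or None if it is absent."""
    q, level = root, 0
    while q is not None:
        if q.point == point:
            return q
        a = level % len(point)
        q = q.low if point[a] < q.point[a] else q.high
        level += 1
    return None

sequence = [("A", (65, 50)), ("B", (60, 70)), ("C", (70, 60)), ("D", (75, 25)),
            ("E", (50, 90)), ("F", (90, 65)), ("G", (10, 30)), ("H", (80, 85)),
            ("I", (95, 75))]
root = None
for name, point in sequence:
    root = insert(root, point, name)

print(search(root, (95, 75)).name)  # I
```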
23 Multidimensional Tries: Multidimensional tries, or k-d tries, are similar to k-d trees except that they divide the embedding space. Each split evenly divides a region. Example: construction of a 2-D trie by inserting A(65, 50), B(60, 70), C(70, 60), and D(75, 25) in that order. [Figure: partitioning of the space and the trie after each insertion, with splits at X = 50, Y = 50, X = 75, Y = 25, Y = 75, and X = 62.5.]
24 Multidimensional Tries: Using Buckets: Disadvantage: the maximum level of decomposition depends on the minimum separation between two points. A solution: split a region only if it contains more than p points.
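A bucketed k-d trie along these lines can be sketched as below. Assumptions not stated in the slides: the space is [0, 100] x [0, 100], the split axis alternates X, Y, X, ... by depth, and a point on the split line goes to the lower half. With `capacity=1` the sketch reproduces the one-point-per-region construction; a larger capacity corresponds to the bucket size p.

```python
class TrieNode:
    def __init__(self, low, high, depth=0):
        self.low, self.high, self.depth = low, high, depth
        self.points = []      # bucket contents while the node is a leaf
        self.children = None  # (lower half, upper half) after a split

def leaf_for(node, point):
    """Descend to the leaf region that contains the point."""
    while node.children is not None:
        axis = node.depth % len(point)
        mid = (node.low[axis] + node.high[axis]) / 2
        node = node.children[0] if point[axis] <= mid else node.children[1]
    return node

def insert(node, point, capacity=1):
    """Add a point; split the leaf evenly while it holds more than `capacity` points.
    Note: identical points can never be separated, so duplicates beyond
    `capacity` would recurse without end."""
    leaf = leaf_for(node, point)
    leaf.points.append(point)
    if len(leaf.points) > capacity:
        split(leaf, capacity)

def split(leaf, capacity):
    axis = leaf.depth % len(leaf.low)
    mid = (leaf.low[axis] + leaf.high[axis]) / 2
    lower_high = list(leaf.high)
    lower_high[axis] = mid
    upper_low = list(leaf.low)
    upper_low[axis] = mid
    leaf.children = (TrieNode(leaf.low, tuple(lower_high), leaf.depth + 1),
                     TrieNode(tuple(upper_low), leaf.high, leaf.depth + 1))
    points, leaf.points = leaf.points, []
    for p in points:
        insert(leaf, p, capacity)

root = TrieNode((0, 0), (100, 100))
for p in [(65, 50), (60, 70), (70, 60), (75, 25)]:
    insert(root, p)

print(leaf_for(root, (70, 60)).depth)  # 5
```

B(60, 70) and C(70, 60) are only separated at X = 62.5, five levels down, which illustrates why the depth depends on the minimum separation between points.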
25 Grid Files: [Figure: a grid file with linear scales at 25, 50, 75, and 100 on each axis, a grid directory, and data buckets.] Split strategy: the partitioning is done with only one hyperplane, but the split extends to all the regions in the splitting direction. Consequences: (1) The directory is quite sparse. (2) Many adjacent directory entries may point to the same data block. (3) For partial-match and range queries, many directory entries, but only a few data blocks, may have to be scanned.
26 Point-Quad Trees: Each node of a k-dimensional quad tree partitions the object space into 2^k quadrants. The partitioning is performed along all search dimensions and is data dependent, as in k-d trees. Example: points A(50, 50), B(75, 80), C(90, 65), D(35, 85), E(25, 25). [Figure: partitioning of the space and the corresponding quad tree, with children labeled NE, NW, SW, and SE.] To insert P(55, 75): since XA < XP and YA < YP, go to NE (i.e., B). Since XB > XP and YB > YP, go to SW, which in this case is null.
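The insertion walk for P(55, 75) can be traced with a small two-dimensional point quad tree. This is a sketch; how ties on a coordinate are resolved is an assumption (here they go north/east), and the names are illustrative.

```python
class QuadNode:
    def __init__(self, name, x, y):
        self.name, self.x, self.y = name, x, y
        self.children = {}  # keys: "NE", "NW", "SW", "SE"

def quadrant(node, x, y):
    """Quadrant of (x, y) relative to the node's point."""
    ns = "N" if y >= node.y else "S"
    ew = "E" if x >= node.x else "W"
    return ns + ew

def insert(root, name, x, y):
    """Insert a point and return the (node, quadrant) steps taken."""
    path = []
    node = root
    while True:
        quad = quadrant(node, x, y)
        path.append((node.name, quad))
        if quad not in node.children:
            node.children[quad] = QuadNode(name, x, y)
            return path
        node = node.children[quad]

root = QuadNode("A", 50, 50)
for name, x, y in [("B", 75, 80), ("C", 90, 65), ("D", 35, 85), ("E", 25, 25)]:
    insert(root, name, x, y)

path = insert(root, "P", 55, 75)
print(path)  # [('A', 'NE'), ('B', 'SW')]
```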
27 Spatial Index Trees: We will talk about data normalized to the range [0, 1] in all dimensions. Minimum Bounding Region (MBR) refers to the smallest region (rectangle, circle) that encloses the entire shape of the objects or all the data points.
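For the rectangular case, a minimum bounding region and the overlap test that spatial index trees rely on can be sketched as follows. Representing a rectangle as a (low corner, high corner) pair is an assumption made for illustration.

```python
def mbr(points):
    """Minimum bounding rectangle of a set of points, as (low, high) corners."""
    dims = range(len(points[0]))
    low = tuple(min(p[i] for p in points) for i in dims)
    high = tuple(max(p[i] for p in points) for i in dims)
    return low, high

def overlaps(r1, r2):
    """True if the two rectangles intersect in every dimension."""
    (lo1, hi1), (lo2, hi2) = r1, r2
    return all(l1 <= h2 and l2 <= h1
               for l1, h1, l2, h2 in zip(lo1, hi1, lo2, hi2))

box = mbr([(0.1, 0.2), (0.4, 0.9), (0.3, 0.5)])
print(box)
print(overlaps(box, ((0.35, 0.8), (1.0, 1.0))))
```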
28 R-tree: R-trees are higher-dimensional generalizations of B-trees. The nodes correspond to disk pages. All leaf nodes appear at the same level. Root and intermediate nodes contain [r, <page pointer>] pairs, where r is the smallest rectangle that encloses the child node. Leaf nodes contain [r, <RID>] pairs, i.e., pointers to the actual objects. A rectangle may be spatially contained in several nodes (e.g., J), yet it can be associated with only one node.
29 R-Trees: Hierarchy of nested d-dimensional intervals (boxes). Each node v corresponds to a disk page and a d-dimensional interval. Stores the MBB or MBR of each n-dimensional object. Permits overlap of index entries. The index is used as a filter mechanism for queries. Every node contains between m and M entries unless it is the root. The root node has at least 2 entries unless it is a leaf. Height-balanced. Which of the above properties are similar to those of B-trees?
30 R-tree: Insertion: A new object is added to the appropriate leaf node. If the insertion causes the leaf node to overflow, the node must be split and its records distributed between the two resulting leaf nodes. Goals of the split: minimizing the total area of the covering rectangles; minimizing the area common to the covering rectangles. Splits are propagated up the tree (similar to a B-tree).
31 R-tree: Delete: If a deletion causes a node to underflow, its entries are reinserted (instead of being merged with adjacent nodes as in a B-tree). There is no concept of adjacency in an R-tree.
32 D-tree: Domain Decomposition: If the number of objects inside a domain exceeds a certain threshold, the domain is split into two subdomains. Example 1: horizontal split. Example 2: vertical split (split along the longest dimension). [Figure: an original domain containing objects A through G, the split line, the resulting subdomains, and a border object.]
34 D-tree: Split Example (continued): [Figure: the D-tree and the embedding space after the 3rd split (subdomains D11, D121, D122, D2) and after the 4th split (subdomains D11, D121, D122, D21, D22), with internal and external nodes marked.]
35 D-tree: Range Queries
Note: a range query can be represented as a hypercube embedded in the search space.
Search strategy: Retrieve the set, say S, of all subdomains that overlap with the query cube. For each subdomain in S that is not fully contained in the query cube, discard the objects falling outside the query cube.
Algorithm: Search(D_tree_root, search_cube)
Current_node = D_tree_root
For each entry in Current_node, say (D, P), if D overlaps with search_cube, do the following:
If Current_node is an external node, retrieve the objects in D.P that fall within the overlap region.
If Current_node is an internal node, call Search(D.P, search_cube).
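The range search above can be sketched recursively. The node layout is an assumption made for illustration: every node holds (domain, pointer) entries, a domain is a (low, high) rectangle, objects are points, and the pointer of an external node is a list of points. None of these names come from the presentation.

```python
class DTreeNode:
    def __init__(self, entries, external):
        self.entries = entries    # list of (domain, pointer) pairs
        self.external = external  # True if pointers are lists of points

def overlaps(domain, cube):
    (lo1, hi1), (lo2, hi2) = domain, cube
    return all(l1 <= h2 and l2 <= h1
               for l1, h1, l2, h2 in zip(lo1, hi1, lo2, hi2))

def inside(point, cube):
    low, high = cube
    return all(lo <= x <= hi for x, lo, hi in zip(point, low, high))

def search(node, cube):
    """Collect all points that fall inside the query cube."""
    result = []
    for domain, pointer in node.entries:
        if not overlaps(domain, cube):
            continue  # prune subdomains that miss the query cube
        if node.external:
            result.extend(p for p in pointer if inside(p, cube))
        else:
            result.extend(search(pointer, cube))
    return result

left = ((0, 0), (50, 100))
right = ((50, 0), (100, 100))
leaf1 = DTreeNode([(left, [(10, 10), (40, 80)])], external=True)
leaf2 = DTreeNode([(right, [(60, 20), (90, 90)])], external=True)
root = DTreeNode([(left, leaf1), (right, leaf2)], external=False)

found = search(root, ((30, 10), (70, 90)))
print(found)  # [(40, 80), (60, 20)]
```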
36 D-tree: Desirable Properties: D-trees are balanced. The search path for an object is unique, so there are no redundant searches. More splits occur in the denser regions of the search space, so objects are evenly distributed among the data nodes. Similar objects are physically clustered in the same or neighboring data nodes. Good performance is ensured regardless of the insertion order of the data.
37 Content-Based Image Indexing: Keyword approach. Problem: there is no commonly agreed-upon vocabulary for describing image properties. Computer vision techniques. Problem: general image understanding and object recognition are beyond the capability of current computer vision technology. Image analysis techniques: it is relatively easy to capture primitive image properties such as prominent regions, their colors and shapes, and related layout and location information within images. These features can be used to index image data.
38 Possible Features: Edge, Region, Color, Shape, Location, Size, Texture.
39 Edge: Types of edges: step, ramp, spike, and roof. Three stages in edge detection: (1) Filtering: the image is passed through a filter in order to remove noise. (2) Differentiation: highlights the locations where intensity changes are significant. (3) Detection.
40 Classes of edge detection schemes: Prewitt, Roberts, Sobel, and Laplacian: 3x3 and 5x5 gradient operators. Hueckel, Hartley, and Haralick: surface fitting. Canny: the derivatives of a Gaussian.
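As an example of a 3x3 gradient operator, the sketch below applies the standard Sobel masks to a small grayscale image and thresholds the gradient magnitude. It covers only the differentiation and detection stages; no separate noise filter is applied, and the test image and threshold are made up.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def apply_mask(img, r, c, mask):
    """Weighted sum of the 3x3 neighborhood centered at (r, c)."""
    return sum(mask[i][j] * img[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))

def sobel_edges(img, threshold):
    """Binary edge map; border pixels are left at 0."""
    rows, cols = len(img), len(img[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = apply_mask(img, r, c, SOBEL_X)
            gy = apply_mask(img, r, c, SOBEL_Y)
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[r][c] = 1
    return edges

# A vertical step edge between columns 1 and 2.
img = [[0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_edges(img, threshold=200)
for row in edges:
    print(row)
```

Without a thinning step the step edge is marked two pixels wide, one on each side of the intensity jump.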
41 Canny Edge Detector: The results of choosing the standard deviation sigma of the edge detector as 3. [Figure panels: lena.gif; vertical edges; horizontal edges; norm of the gradient; after thresholding; after thinning.]
42 Features Acquisition: Region Segmentation: Group adjacent pixels with similar color properties into one region, and segment pixels with distinct color properties into different regions.
43 Definition of Segmentation: All pixels must have the same .. All pixels must not differ by more than .. All pixels must not differ by more than T from the mean .. The standard deviation must be small ..
45 Seed Segmentation: (1) Compute the histogram. (2) Smooth the histogram by averaging to remove small peaks. (3) Identify candidate peaks and valleys. (4) Detect good peaks by a peakiness test. (5) Segment the image using thresholds. (6) Apply a connected-component algorithm.
46 Region Growing: Split and Merge Algorithm; Phagocyte Algorithm; Likelihood Ratio Test.
48 Color: We can divide the color space into a small number of zones, each of which is clearly distinct from the others to the human eye. Each zone is assigned a sequence number beginning from zero. Notes: It has been shown that human eyes are not very sensitive to colors. In fact, users only have a vague idea about the colors they want to specify.
49 Shape: The shape feature can be measured by the following properties: circularity, major axis orientation, and moment. Circularity: [formula missing]. Note: the more circular the shape, the closer the circularity is to one. Major axis orientation: [formula missing]. Moment: the first and the second.
50 Location: The image is divided into sub-areas, and each sub-area is labeled with a number. The region location is represented by the number of the sub-area that contains the centroid (center of gravity) of the region. Note: when a user queries the database by visual contents, approximate feature values are used; it is meaningless to use absolute feature values as indices. Example: the location of A is 4; the location of B is 1. [Figure: an image divided into numbered sub-areas, with regions A and B.]
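Mapping a region to its sub-area number can be sketched as below. The 3 x 3 grid, the row-by-row numbering starting at zero, and the sample coordinates are assumptions; the slide's figure is not preserved well enough to confirm them.

```python
def centroid(pixels):
    """Center of gravity of a region given as a list of (x, y) pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def location(point, width, height, grid=3):
    """Number of the sub-area containing the point, counted row by row from 0."""
    x, y = point
    col = min(int(x * grid / width), grid - 1)
    row = min(int(y * grid / height), grid - 1)
    return row * grid + col

# Hypothetical region in the middle of a 300 x 300 image.
region = [(140, 140), (160, 140), (140, 160), (160, 160)]
print(location(centroid(region), 300, 300))  # 4
```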
51 Size: The total number of pixels occupied by the region. The size range is divided into groups, and a region’s size is represented by the corresponding group number. Example: [Table: group number versus size range, where S is the object size and Asub is the size of the sub-area.] Note: only regions larger than one-fourth of the sub-area are registered.
52 Texture: Approach based on statistics: angular second moment (energy, homogeneity, or uniformity), entropy, correlation, inverse difference moment, contrast (inertia), variance, sum average, sum variance, difference variance, difference entropy, information measure of correlation I, information measure of correlation II, and maximal correlation coefficient. Approach based on human perception: coarseness, contrast, directionality, line-likeness, regularity, and roughness; busyness, complexity, and texture strength; repetitiveness, orientation, and complexity.
53 Image Indexing by Contents: By applying image segmentation techniques, a set of regions is detected along with their locations, sizes, colors, textures, and shapes. These features can be used to index image data.
54 Texture Areas: Texture areas and images with dominant high-frequency components are beyond the capacity of image segmentation techniques. Matching on the distribution of colors (i.e., color histograms) is a simple yet effective means for these areas. Strategy: divide an image into sub-areas and create a histogram for each of the sub-areas. Note: the image is partitioned to capture locality information. We don’t want to match an image with a red balloon at the top with an image with a red car at the bottom.
55 Histograms: Gray-level histogram: a plot of the number of pixels that assume each discrete value that the quantized image intensity can take. Color histogram: holds information on the color distribution. It is a plot of the statistics of the R, G, B components in the 3-D color space.
56 Histograms (cont.): Most histogram bins are sparsely populated, with only a small number of bins capturing the majority of pixel counts. We can use the largest, say 20, bins as the representative bins of the histogram. These 20 bins form a chain in the 3-D color space. If we can represent such chains using a number, then we can index the color images using various tree structures. Connecting order: the representative bins are sorted in ascending order by their distance from the origin of the color space. Weighted Perimeter (WP): [formula missing]. Weighted Angle (WA): [formula missing]. Format of the index key: WP (10 bits), WA (10 bits). [Figure: a chain of representative bins, such as (8,2,6), (3,2,3), (0,1,1), (6,2,0), and (2,3,0), in the R, G, B color space.]
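Selecting and ordering the representative bins can be sketched as follows. The quantization into 8 levels per channel and the sample pixels are assumptions made for illustration; the weighted perimeter and weighted angle are not computed because their formulas are not given in the transcript.

```python
from collections import Counter

def representative_bins(pixels, levels=8, top=20):
    """Quantize RGB pixels into levels^3 bins and return the `top` fullest bins,
    sorted in ascending order of distance from the origin of the color space."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    largest = [bin_ for bin_, _ in counts.most_common(top)]
    return sorted(largest, key=lambda b: b[0] ** 2 + b[1] ** 2 + b[2] ** 2)

# Hypothetical image: mostly red, some black, a little green, one gray pixel.
pixels = ([(255, 0, 0)] * 5 + [(0, 0, 0)] * 3 +
          [(10, 200, 30)] * 2 + [(128, 128, 128)])
chain = representative_bins(pixels, top=3)
print(chain)  # [(0, 0, 0), (0, 6, 0), (7, 0, 0)]
```

The returned list is the chain in connecting order, ready for a numeric summary to be computed over it.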