1 Multimedia Indexing and Retrieval Kowshik Shashank Project Advisor: Dr. C.V. Jawahar

2 Problem Statement "Develop efficient algorithms for a real-time, private multimedia database."

3 Applications  Defense systems  Surveillance systems  Image/Video collections (under copyright notices)  Web 2.0  Web image search

4 [Diagram: CBIR pipeline. Database images pass through Feature Extraction and Indexing into the Database; a Query passes through Feature Extraction and a Similarity Measure against the Database to produce the Result.]

5 Indexing Schemes  Hierarchical Structures  Vocabulary Trees  Hashing

6 Private Retrieval In Hierarchical Structures

7 Querying in CBIR [Diagram: a Query Image is converted into a Feature vector.]

8 Private Content Based Image Retrieval
1. The user extracts the feature vector of the query image, say f_query.
2. The user asks for the data stored in the root node of the indexing structure.
3. f_query and the information are used to decide whether to access the left or the right sub-tree.
4. The user frames a query Q_i to access the node at level i.
5. The database replies with A_i for the query Q_i.
6. The user performs a function f(A_i) to obtain the information at the node. Go to step 3.
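
A minimal Python sketch of this loop (illustrative only, not the authors' code; private_fetch is a stand-in for the QR-based query of slides 13-18, and the single split-value comparison assumes a simple KD-tree-style node):

def private_fetch(level_nodes, index):
    # Stand-in for the QR-based private query of slides 13-18: in the real
    # protocol the user frames Q_i, the database answers A_i, and the user
    # computes f(A_i). Here it simply reads the value directly.
    return level_nodes[index]

def private_descent(f_query, tree_levels):
    # tree_levels[i] holds the split values of a binary tree at level i.
    index = 0
    for level_nodes in tree_levels:
        split_value = private_fetch(level_nodes, index)   # steps 4-6
        go_right = f_query > split_value                  # step 3
        index = 2 * index + (1 if go_right else 0)
    return index   # index of the candidate leaf/bucket that was reached

# Toy usage: a 3-level tree over scalar "features".
levels = [[0.5], [0.25, 0.75], [0.1, 0.4, 0.6, 0.9]]
print(private_descent(0.8, levels))   # -> 6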

9 Private Content Based Image Retrieval [Diagram: the user holds the feature vector f_query; the database returns the Root Info, then for each level the user sends queries Q_1, Q_2, … and receives replies A_1, A_2, …, computing f(A_1), f(A_2), … to continue the descent.]

10 Quadratic Residuosity Assumption Consider a natural number N = p·q, where p and q are large primes. A number y ∈ Z_N* is called a Quadratic Residue (QR) if there exists an x ∈ Z_N* such that y = x² mod N; otherwise y is called a Quadratic Non-Residue (QNR). Construct a set Y_N with an equal number of QRs and QNRs.

11 Quadratic Residuosity Assumption: Given a number y ∈ Y_N, it is computationally hard to decide whether y is a QR or a QNR. Basic rules: QNR × QNR = QR; QNR × QR = QNR; QR × QR = QR.
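
A toy Python check of these rules, assuming small primes p = 7 and q = 11 so residuosity can be tested directly via Euler's criterion (the real scheme uses large primes, which is what makes the decision hard without the factorization):

from math import gcd

p, q = 7, 11                 # toy primes; the real scheme uses large primes
N = p * q

def is_qr(y):
    # With the factorization known, residuosity follows from Euler's criterion.
    return pow(y, (p - 1) // 2, p) == 1 and pow(y, (q - 1) // 2, q) == 1

def in_Y(y):
    # Members of Y_N: coprime to N and a residue modulo both primes or neither.
    rp = pow(y, (p - 1) // 2, p) == 1
    rq = pow(y, (q - 1) // 2, q) == 1
    return gcd(y, N) == 1 and rp == rq

qr  = next(y for y in range(2, N) if in_Y(y) and is_qr(y))
qnr = next(y for y in range(2, N) if in_Y(y) and not is_qr(y))

assert is_qr(qnr * qnr % N)        # QNR * QNR = QR
assert not is_qr(qnr * qr % N)     # QNR * QR  = QNR
assert is_qr(qr * qr % N)          # QR * QR   = QR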

12 Viewing the nodes in a level

13 Querying on a Linear Database [Diagram: the user sends the query Q = (Q_1, …, Q_n), with a QNR at the i-th position and QRs elsewhere, to the database.] The database returns A, where A[i] = Q[i] if the bit is 0 and A[i] = Q[i]² if the bit is 1. The user then checks whether the i-th element of A is a QR or a QNR and thereby decides the data at the i-th index in the database.
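
A toy Python sketch of this one-row exchange (illustrative, not the authors' code; the same small-prime setup is repeated, and the user is assumed to know p and q, so only the user can test residuosity):

import random
from math import gcd

p, q = 7, 11               # toy primes; real use requires large primes
N = p * q

def is_qr(y):
    return pow(y, (p - 1) // 2, p) == 1 and pow(y, (q - 1) // 2, q) == 1

def in_Y(y):
    rp = pow(y, (p - 1) // 2, p) == 1
    rq = pow(y, (q - 1) // 2, q) == 1
    return gcd(y, N) == 1 and rp == rq

QRS  = [v for v in range(2, N) if in_Y(v) and is_qr(v)]
QNRS = [v for v in range(2, N) if in_Y(v) and not is_qr(v)]

def make_query(n, i):
    # User side: a QNR at the index of interest, QRs everywhere else.
    return [random.choice(QNRS) if j == i else random.choice(QRS) for j in range(n)]

def answer(bits, Q):
    # Database side: A[j] = Q[j] if the j-th bit is 0, Q[j]^2 otherwise.
    return [pow(Q[j], 2, N) if bits[j] else Q[j] for j in range(len(bits))]

def decode(A, i):
    # User side: A[i] is a QR exactly when the i-th bit is 1.
    return 1 if is_qr(A[i]) else 0

bits = [0, 1, 1, 0, 1]
i = 2
assert decode(answer(bits, make_query(len(bits), i)), i) == bits[i]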

14 Converting to 2D database

15 [Diagram: a length-m query vector alongside the m × n grid of nodes, with a QNR at the row containing the desired node and QRs in every other position.] Frame a query of length m with a QNR in the position of the row in which the node occurs.

16 The database forms an m × n matrix with the first bit of the information. [Diagram: the m × n 0/1 matrix next to the query of QRs with one QNR.]

17 Put the square of the number if the bit value is 1, else retain the same number. Multiply along the columns to obtain the reply A_i. [Diagram: the m × n matrix of query numbers, selectively squared, and the column-wise products forming A_i.]

18 Framing the Query and Reply If the user is interested in the data at node (x, y)  Frame a query of length m in which the x-th value is a QNR and the rest are QRs.  The database computes the reply A_i of length n and returns it to the user.  If the value of A_i[y] is a QR, then the bit is 1; else it is 0.
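
A toy Python sketch that extends the previous one-row exchange to the full 2D scheme of slides 15-18 (illustrative, not the authors' code): the user frames a length-m query with a QNR in row x, the database selectively squares and multiplies down the columns, and the user tests A_i[y] for residuosity:

import random
from math import gcd

p, q = 7, 11               # toy primes; real use requires large primes
N = p * q

def is_qr(y):
    return pow(y, (p - 1) // 2, p) == 1 and pow(y, (q - 1) // 2, q) == 1

def in_Y(y):
    rp = pow(y, (p - 1) // 2, p) == 1
    rq = pow(y, (q - 1) // 2, q) == 1
    return gcd(y, N) == 1 and rp == rq

QRS  = [v for v in range(2, N) if in_Y(v) and is_qr(v)]
QNRS = [v for v in range(2, N) if in_Y(v) and not is_qr(v)]

def frame_query(m, x):
    # User: length-m query with a QNR in row x, QRs in every other row.
    return [random.choice(QNRS) if r == x else random.choice(QRS) for r in range(m)]

def reply(bits, Q):
    # Database: square Q[r] where the bit is 1, keep it where the bit is 0,
    # then multiply down each column modulo N.
    m, n = len(bits), len(bits[0])
    A = [1] * n
    for r in range(m):
        for c in range(n):
            z = pow(Q[r], 2, N) if bits[r][c] else Q[r]
            A[c] = (A[c] * z) % N
    return A

def decode(A, y):
    # User: A[y] is a QR exactly when the bit at (x, y) is 1.
    return 1 if is_qr(A[y]) else 0

bits = [[0, 1, 0, 1],      # 3 x 4 toy bit matrix (one bit plane of the index)
        [1, 1, 0, 0],
        [0, 0, 1, 1]]
x, y = 1, 2                # the user wants the bit at node (x, y)
Q = frame_query(len(bits), x)
A = reply(bits, Q)
assert decode(A, y) == bits[x][y]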

19 Complexity of the algorithm The communication complexity is O(m) on the user side and O(n) on the server side; hence the overall communication complexity is O(max(m, n)). If m = n = √t, where t is the total number of entries, the communication complexity is O(√t).
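
A quick worked instance (t here denotes the total number of entries in the m × n matrix, an assumption consistent with slides 15-17): t = m·n, and choosing m = n = √t gives O(max(m, n)) = O(√t). For example, with t = 2^20 entries and m = n = 2^10, the user sends roughly 1024 numbers and the server returns roughly 1024 numbers, rather than an answer proportional to 2^20.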

20 Extension to other Hierarchical Structures A hierarchical structure is characterized by  the number of nodes at each level and  the information at a node. Any number of nodes can be converted into an m × n matrix, and any information can be represented in binary format. If the user knows the indexing structure and the format of the information stored at a node, the algorithm can be simulated for any hierarchical structure.
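
One possible packing sketch for these observations (illustrative names and layout; following slide 16, one m × n bit matrix is built per bit position of the node payloads, with m and n chosen near the square root of the number of nodes to match the O(√t) communication of slide 19):

import math

def node_grid_shape(num_nodes):
    # Near-square grid: m and n close to sqrt(num_nodes).
    n = math.isqrt(num_nodes)
    if n * n < num_nodes:
        n += 1
    m = math.ceil(num_nodes / n)
    return m, n

def bit_matrices(node_payloads, bits_per_node):
    # One m x n 0/1 matrix per bit position; missing grid cells padded with 0.
    m, n = node_grid_shape(len(node_payloads))
    padded = node_payloads + [0] * (m * n - len(node_payloads))
    matrices = []
    for b in range(bits_per_node):
        shift = bits_per_node - 1 - b      # most significant bit first
        matrices.append([[(padded[r * n + c] >> shift) & 1 for c in range(n)]
                         for r in range(m)])
    return matrices

mats = bit_matrices([5, 12, 9, 3, 7], bits_per_node=4)
print(len(mats), len(mats[0]), len(mats[0][0]))   # 4 matrices, each 2 x 3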

21 Results KD Tree and Corel Database  Corel Database consists of 9907 images.  Color feature extracted as color histogram with 768 dimensions.  Average Retrieval Time: 0.596 secs  Sample Results

22 Results Vocabulary Tree and Nister Dataset  Nister Dataset consists of 10,200 images.  SIFT features used to obtain visual words.  Vocabulary size of 10000 visual words.  Average Retrieval Time: 0.320 secs  Sample Results

23 Results The vocabulary size was varied to test the scalability of the algorithm. As the vocabulary size increases, the tree grows, more data is exchanged, and the average retrieval time increases.

24 Results LSH and Corel Dataset  LSH – Locality Sensitive Hashing  90 hash functions, each having 450 bins on average.  Two-level hierarchy.  Average Retrieval Time: 0.221 secs  The confusion metric was varied to obtain various levels of privacy.  As the confusion metric decreases, the data exchanged decreases, giving faster retrieval times.

25 Results The algorithm was tested for its scalability. Synthetic datasets to the tune of a million images were used to test the practicality of the algorithm.
Dataset Size    Query Time (secs)
2^10            0.005832
2^12            0.008856
2^14            0.012004
2^16            0.037602
2^18            0.129509
2^20            0.261255

26 Conclusion We have addressed the problem of private retrieval in image databases. The algorithm is shown to be customizable for all hierarchical structures as well as hash-based indexing. The experimental study shows that the algorithm is accurate, efficient and scalable. The algorithm is fully private and feasible on large image databases using state-of-the-art indexing schemes. We demonstrated a near-linear operating region for image databases, where the trade-off between privacy and speed is feasible.

27 [Backup slide: diagram recapping the m × n query matrix Q_i, the selective squaring, the column-wise multiplication, and the reply A_i.]

