
Presentation transcript:

Spring-2008 MMDB-Audio 1 Audio Databases

Spring-2008 MMDB-Audio 2 Metadata Metadata is used to represent audio content in much the same way as for video. The metadata that represents audio content may be viewed as a set of objects spread out over a timeline. We may index the metadata associated with audio in exactly the same way as we indexed video, and the same query-processing techniques may be reused.

Spring-2008 MMDB-Audio 3 Example: The following figure shows the line segments associated with part of an opera. Activity1 may be Act 1 of the opera, activity2 may be Act 1, Scene 1, and so on.

Spring-2008 MMDB-Audio 4 Example (cont.): Each activity may have an associated set of fields.
- Singers: a set-valued field containing records with a Role, a SingerType and a SingerName. If the triple (Lohengrin, Tenor, Rene Kollo) appears in the segment [50, 100), then Rene Kollo, a tenor, is singing the role of Lohengrin during the time segment [50, 100) of the opera.
- Score: a field of type music_doc that points to the relevant part of the music score associated with the time segment [50, 100).
- Transcript: a field of type document that points to the relevant part of the libretto during the time segment [50, 100).
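The fields above can be pictured as records attached to half-open time segments. The following is a minimal sketch, not the deck's implementation: the class name `Activity`, the helper names and the segment bounds other than [50, 100) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One metadata object covering the half-open time segment [start, end)."""
    name: str
    start: int
    end: int
    # Set-valued field: (role, singer_type, singer_name) triples.
    singers: list = field(default_factory=list)


def activities_at(activities, t):
    """Return every activity whose segment [start, end) contains time t."""
    return [a for a in activities if a.start <= t < a.end]


def singers_at(activities, t):
    """Collect the singer triples of all activities active at time t."""
    return [s for a in activities_at(activities, t) for s in a.singers]


# Hypothetical timeline: only the [50, 100) segment comes from the slide.
opera = [
    Activity("Act 1", 0, 300),
    Activity("Act 1, Scene 1", 50, 100,
             singers=[("Lohengrin", "Tenor", "Rene Kollo")]),
]

print(singers_at(opera, 75))
```

A linear scan is enough for a handful of segments; a real system would presumably use the same segment index as for video.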

Spring-2008 MMDB-Audio 5 Signal-Based Audio Content In some applications, creating metadata is difficult, for example when the speaker is unknown or the content is unclear. In that case the audio data is treated as a signal over time x. Different features of the signal are extracted, indexed and stored for efficient retrieval. Metadata may still be used to complement the signal data.

Spring-2008 MMDB-Audio 6 Sample Audio Signals

Spring-2008 MMDB-Audio 7 Signal
- Period of vibration, T: the time taken for a "particle" in the wave to return to its starting position, e.g. from point A to point B.
- Frequency of vibration, f: the number of vibrations per second, f = 1/T.
- Velocity, v: the speed at which the crests and troughs move to the right, v = w/T = w × f, where w denotes the wavelength of the wave.
- Amplitude, a: the maximum intensity of the signal associated with the wave.
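As a quick numeric check of the relations f = 1/T and v = w × f, here is a small sketch. The function names and the sample values (a 0.01 s period and a 3.4 m wavelength) are illustrative assumptions, not values from the slides.

```python
def frequency(period_s):
    """f = 1/T: vibrations per second for a period given in seconds."""
    return 1.0 / period_s


def velocity(wavelength_m, period_s):
    """v = w/T = w * f: speed at which crests and troughs travel."""
    return wavelength_m * frequency(period_s)


# Hypothetical wave: period 0.01 s, wavelength 3.4 m.
print(frequency(0.01), velocity(3.4, 0.01))
```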

Spring-2008 MMDB-Audio 8 Indexing by Segmentation Split up the audio signal into relatively homogeneous "windows." This may be done in one of two ways:
- The application developer specifies, a priori, a window size w (in seconds or minutes) and assumes that the wave's properties within each window are obtained by averaging.
- Use a homogeneity predicate as in the case of images, except that this homogeneity predicate applies to the one-dimensional case.
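The first option, a fixed window size with averaged properties, can be sketched as follows. The function names, the sample rate and the choice of mean absolute amplitude as the averaged property are assumptions for illustration only.

```python
def split_into_windows(samples, sample_rate, window_seconds):
    """Cut a list of samples into consecutive fixed-size windows."""
    size = int(sample_rate * window_seconds)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def mean_abs_amplitude(window):
    """Average one property of the wave over the window."""
    return sum(abs(x) for x in window) / len(window)


# Hypothetical signal: quiet for one second, then louder for one second.
signal = [1, -1, 1, -1, 3, -3, 3, -3]
windows = split_into_windows(signal, sample_rate=4, window_seconds=1)
print([mean_abs_amplitude(w) for w in windows])
```

The last window may be shorter than the others when the signal length is not a multiple of the window size.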

Spring-2008 MMDB-Audio 9 Windowing an Audio Signal The following figure shows a nonhomogeneous audio signal. After it is split into five windows, each window is homogeneous in the sense that it has a constant amplitude, wavelength, and wave velocity.

Spring-2008 MMDB-Audio 10 Indexing Using Feature Extraction After segmentation, the audio signal may be viewed as a sequence of n windows, w_1, …, w_n. For each window, we extract some features associated with the audio signal. If k features are extracted, then an audio signal may be considered to be a sequence of n points in a k-dimensional space.

Spring-2008 MMDB-Audio 11 Example Features
- Intensity (I): the power of the signal generated by the wave, in watts per square metre. [Formula not preserved in the transcript.] In the formula, ρ is the density of the material through which the sound is being propagated.
- Loudness (L): [formula not preserved in the transcript], where L_0 denotes the loudness at the lowest frequency (about 15 Hz) that a human ear can detect.

Spring-2008 MMDB-Audio 12 Content Index In general, to index the content of an audio signal, we proceed in two steps:
1. Find a set w_1, …, w_n of window segments.
2. For each window w_i, store a vector consisting of K acoustical attributes.
An audio database may be viewed as a set of (K+3)-tuples consisting of the audio source (audio file), the window (within that audio file), the duration of the window, and the K feature values associated with that window. A k-d tree can be used to index audio data.
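The (K+3)-tuple view can be illustrated with a short sketch. The function names, the file name "opera.wav" and the two example features (peak and mean absolute amplitude, so K = 2) are assumptions; the slides do not say which attributes are stored, and the k-d tree itself is not built here.

```python
def peak_amplitude(window):
    """Largest absolute sample value in the window."""
    return max(abs(x) for x in window)


def mean_abs_amplitude(window):
    """Mean absolute sample value in the window."""
    return sum(abs(x) for x in window) / len(window)


def build_content_index(source, windows, window_seconds, features):
    """Return one (K+3)-tuple per window:
    (source, window number, duration, feature_1, ..., feature_K)."""
    rows = []
    for number, window in enumerate(windows, start=1):
        values = tuple(f(window) for f in features)
        rows.append((source, number, window_seconds) + values)
    return rows


# Hypothetical file name and windows; K = 2 features.
rows = build_content_index(
    "opera.wav", [[1, -1], [3, -2]], 0.5,
    [peak_amplitude, mean_abs_amplitude],
)
for row in rows:
    print(row)
```

The last K components of each tuple are the point that a multidimensional index such as a k-d tree would store.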

Spring-2008 MMDB-Audio 13 Content-based Retrieval for Music Databases

Spring-2008 MMDB-Audio 14 Introduction The management of large collections of music data in a multimedia database has received much attention in the past few years. For music content-based retrieval, we can extract the features, such as melodies, rhythms and chords, from the music data and develop indices that will help to retrieve the relevant music data quickly.

Spring-2008 MMDB-Audio 15 Music Feature String A sample of "You Are My Sunshine": "sol-do-re-mi-mi-mi-mi-re-mi-do-do".
- Melody feature string: eabccccbaa
- Rhythm string: 1112211122
- Music feature string: e1 a1 b1 c2 c2 c1 c1 b1 a2 a2

Spring-2008 MMDB-Audio 16 Features of Music Data Coding scheme: a music object → a sequence of music segments.
- music segment = (segment type, segment duration, segment pitch)
- four segment types: ┌┐ (type A), └┘ (type B), ┌┘ (type C), and └┐ (type D)

Spring-2008 MMDB-Audio 17 Features of Music Data For example, the sequence of music segments: (B,3,-3) (A,1,+1) (D,3,-3) (B,1,-2) (C,1,+2) (C,1,+2) (C,1,+1)

Spring-2008 MMDB-Audio 18 music segment = (type, duration, pitch)

Spring-2008 MMDB-Audio 19 Music Data Retrieval: System Architecture

Spring-2008 MMDB-Audio 20 Indexing
- String indexing for music data: suffix tree
- Numeric indexing for music data: R-tree

Spring-2008 MMDB-Audio 21 Suffix tree A suffix tree is an index structure for locating the substrings of a string that exactly match a target string. No two edges out of a node can have edge labels beginning with the same character. For any leaf i, the concatenation of the edge labels on the path from the root to leaf i spells out exactly the suffix of the string that starts at position i.
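The exact-matching idea can be sketched with an uncompressed suffix trie, which is simpler than a true suffix tree (one character per edge, quadratic space, no edge compression). The dictionary layout and function names below are assumptions for illustration, not the structure used in the deck.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie, one character per edge.
    Each node records the 1-based start positions of the suffixes
    that pass through it."""
    root = {"children": {}, "starts": []}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node["children"].setdefault(
                ch, {"children": {}, "starts": []})
            node["starts"].append(start + 1)
    return root


def find(root, pattern):
    """Return the 1-based positions where pattern occurs exactly."""
    node = root
    for ch in pattern:
        node = node["children"].get(ch)
        if node is None:
            return []
    return node["starts"]


trie = build_suffix_trie("ababc")
print(find(trie, "ab"), find(trie, "bc"), find(trie, "ca"))
```

Because children are keyed by character, no two edges out of a node start with the same character, which mirrors the property stated above.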

Spring-2008 MMDB-Audio 22 Ex: ababc, with suffixes {ababc, babc, abc, bc, c}. [Figure: step-by-step construction of the suffix tree for "ababc", inserting the suffixes that start at positions 1 to 5.]

Spring-2008 MMDB-Audio 23 Ex: "Do Re Do Re Mi" → ababc. [Figure: suffix tree for "ababc".]

Spring-2008 MMDB-Audio 24 Numeric Mapping Numeric mapping function [formula not preserved in the transcript]:
- v(m): the integer value of a segment of m adjacent notes
- m: the number of adjacent notes taken from the melody feature string
- P(x_i): the integer value of each note, 1 ≤ i ≤ m

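Spring-2008 MMDB-Audio 25 Numeric Mapping (cont.) For example, for a music feature string 'bcdbc' with n=10 and m=4, the string is split into the overlapping segments 'bcdb' and 'cdbc', and each segment is mapped to an integer. [Numeric results not preserved in the transcript.]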
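Since the mapping formula itself did not survive in the transcript, the sketch below uses an assumed form: notes are numbered a=0, b=1, c=2, ... and the i-th note of a segment is weighted by n^(i-1). This choice is an inference, made because it reproduces the values 1322 for 'ccdb' and 1132 for 'cdbb' quoted in the exact-matching example later in the deck. All function names are hypothetical.

```python
def note_value(note):
    """P(x): assumed numbering a=0, b=1, c=2, ..."""
    return ord(note) - ord("a")


def segment_value(segment, n=10):
    """v(m): assumed to be the sum of P(x_i) * n**(i-1) over the segment."""
    return sum(note_value(x) * n ** i for i, x in enumerate(segment))


def segment_values(melody, m=4, n=10):
    """Slide a window of m notes over the melody and map each segment."""
    return [
        (pos + 1, melody[pos:pos + m], segment_value(melody[pos:pos + m], n))
        for pos in range(len(melody) - m + 1)
    ]


print(segment_values("bcdbc"))
```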

Spring-2008 MMDB-Audio 26 Example: Two Tigers (S1: Do Re Mi Do Do Re Mi Do) Melody: abcaabca. Music segments for V(4): abca, bcaa, caab, aabc, abca. [Table: the integer value of each music segment of "Two Tigers"; the numeric entries are not preserved in the transcript.]

Spring-2008 MMDB-Audio 27 Numeric Indexing Structure (R-Tree) [Figure: an R-tree with a non-leaf node labelled (21, 2110), leaf nodes, and linked lists holding the segment entries s11, s12, s14 and s15.]

Spring-2008 MMDB-Audio 28 Pitch Change The segments abca and bcdb share the same sequence of pitch changes: 1, 1, -2.
- m: the number of adjacent notes taken from the melody feature string
- Adj: the maximum distance between two pitches
- D: the total number of possible pitch distances

Spring-2008 MMDB-Audio 29 Example: "abcaabca" Suppose: m=10, Adj=9, D=19.
Music segment | Segment value
abca | 10, 10, 7
bcaa | 10, 7, 9
caab | 7, 9, 10
aabc | 9, 10, 10
abca | 10, 10, 7
[The V(4) expansion and integer value columns are not preserved in the transcript.]
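A possible reading of the pitch-change mapping is sketched below. It assumes each pitch change d is shifted to d + Adj and that the i-th shifted change is weighted by D^(i-1) with D = 2 × Adj + 1. This is inferred rather than quoted: it reproduces the segment values 10, 10, 7 for 'abca' and the numbers 2727, 3392 and 3809 that appear on the Numeric Index slide. The function names are hypothetical.

```python
def pitch_changes(segment):
    """Differences between consecutive notes, e.g. 'abca' -> [1, 1, -2]."""
    return [ord(b) - ord(a) for a, b in zip(segment, segment[1:])]


def pitch_change_value(segment, adj=9):
    """Assumed mapping: shift each change by adj and weight it by
    base**i, where base = 2*adj + 1 (19 when adj is 9)."""
    base = 2 * adj + 1
    total = 0
    for i, d in enumerate(pitch_changes(segment)):
        if abs(d) > adj:
            raise ValueError("pitch change exceeds Adj")
        total += (d + adj) * base ** i
    return total


# Transposed segments map to the same value.
print(pitch_change_value("abca"), pitch_change_value("bcdb"))
```

Working on differences rather than absolute pitches makes the index insensitive to transposition, which is presumably the point of the abca and bcdb example.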

Spring-2008 MMDB-Audio 30 Numeric Index [Figure: numeric index with node ranges labelled 2726, 2727 and 3392, 3809, and linked lists holding the segment entries s15, s21, s22 and s23.]

Spring-2008 MMDB-Audio 31 Searching in Numeric Index Exact matching. For example, the music query segment 'ccdbb' is split into the segments {ccdb} and {cdbb}, whose V(4) values are 1322 and 1132.

Spring-2008 MMDB-Audio 32 [Figure: R-tree with non-leaf ranges labelled 21, 1002; 1132, 1322; and 2110, 3224, with leaf nodes and linked lists of segment entries.] Looking up 1132 returns the entries s23 and s34; looking up 1322 returns s22 and s31. Both lookups give the candidate set {s2, s3}. The matching positions are (2, 3) in s2 and (1, 4) in s3. Only in s2 are the two segments adjacent, so the answer is s2.
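The lookup-then-check-positions procedure can be sketched with a plain dictionary standing in for the R-tree. Everything here is an assumption for illustration: the note numbering (a=0, b=1, ...), the positional weights n^(i-1), the function names and the three sample melodies.

```python
from collections import defaultdict


def segment_value(segment, n=10):
    """Assumed numeric mapping: a=0, b=1, ...; i-th note weighted by n**(i-1)."""
    return sum((ord(x) - ord("a")) * n ** i for i, x in enumerate(segment))


def build_index(songs, m=4, n=10):
    """Map each segment value to a list of (song id, 1-based position)."""
    index = defaultdict(list)
    for song_id, melody in songs.items():
        for pos in range(len(melody) - m + 1):
            value = segment_value(melody[pos:pos + m], n)
            index[value].append((song_id, pos + 1))
    return index


def exact_search(index, query, m=4, n=10):
    """Songs in which the query's segments occur at consecutive positions."""
    values = [segment_value(query[k:k + m], n)
              for k in range(len(query) - m + 1)]
    candidates = set(index.get(values[0], []))
    for offset, value in enumerate(values[1:], start=1):
        postings = set(index.get(value, []))
        candidates = {(s, p) for (s, p) in candidates
                      if (s, p + offset) in postings}
    return sorted({s for s, _ in candidates})


# Hypothetical melodies: s3 contains both segments, but not adjacently.
songs = {"s1": "abcaabca", "s2": "accdbba", "s3": "cdbbccdb"}
index = build_index(songs)
print(exact_search(index, "ccdbb"))
```

The query must contain at least m notes; shorter queries would need a different strategy.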

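Spring-2008 MMDB-Audio 33 Approximate Searching We can examine the difference between the transformed value of the query string and that of the existing data. The two match approximately if the difference satisfies one of the following conditions:
- difference ≤ h × n^0
- difference ≤ h × n^1 and is a multiple of n^1
- difference ≤ h × n^2 and is a multiple of n^2
- …
- difference ≤ h × n^(m-1) and is a multiple of n^(m-1)
Here n is the number of pitches, m is the number of adjacent notes taken from the melody feature string, and h is the allowed distance between two pitches.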

Spring-2008 MMDB-Audio 34 Example: approximate matching conditions for m=4, n=10, h=1:
1) difference ≤ 1 × 10^0
2) difference ≤ 1 × 10^1 and is a multiple of 10^1
3) difference ≤ 1 × 10^2 and is a multiple of 10^2
4) difference ≤ 1 × 10^3 and is a multiple of 10^3
Ex: 'bbcd' compared with 'abcd'.
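The conditions can be checked mechanically. The sketch below assumes the numeric mapping a=0, b=1, ... with the i-th note weighted by n^(i-1), and takes "difference" to mean the absolute difference of the two mapped values; both are assumptions, as are the function names.

```python
def segment_value(segment, n=10):
    """Assumed numeric mapping: a=0, b=1, ...; i-th note weighted by n**(i-1)."""
    return sum((ord(x) - ord("a")) * n ** i for i, x in enumerate(segment))


def approx_match(query_value, data_value, m=4, n=10, h=1):
    """True if the difference satisfies one of the m conditions:
    diff <= h * n**i and diff is a multiple of n**i, for i = 0..m-1."""
    diff = abs(query_value - data_value)
    for i in range(m):
        weight = n ** i
        if diff <= h * weight and diff % weight == 0:
            return True
    return False


print(approx_match(segment_value("bbcd"), segment_value("abcd")))
```

Under this mapping a single note that differs by at most h changes the value by a multiple of one power of n, which is what each condition tests; two differing notes generally fail every condition.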

Spring-2008 MMDB-Audio 35 Multi-Feature Indexing
- Combined Suffix Tree
- Independent Suffix Tree
- Twin Suffix Tree
- Grid-Twin Suffix Tree
- Numeric Index
- Hybrid Multi-Feature Index

Spring-2008 MMDB-Audio 36 Combined Suffix Tree In the Combined Suffix Tree, the feature strings are used directly to construct the index. Ex: "a1a2b1 → {12, 7}", "121 → {12, 7, 1, 6…}".

Spring-2008 MMDB-Audio 37 Independent Suffix Tree The Independent Suffix Tree index separates each feature string into a melody string and a rhythm string and stores them in two independent suffix trees. Example: the trees constructed from "a1b2a1b2c2" (Melody: ababc) (Rhythm: 12122).
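The separation step can be sketched as follows, assuming each symbol of the feature string is one lowercase pitch letter followed by one duration digit, as in the example. The function names are hypothetical, and the suffix lists stand in for the two suffix trees.

```python
import re


def split_feature_string(feature):
    """Split e.g. 'a1b2a1b2c2' into melody 'ababc' and rhythm '12122'.
    Assumes one pitch letter followed by one duration digit per note."""
    pairs = re.findall(r"([a-z])(\d)", feature)
    if "".join(p + d for p, d in pairs) != feature:
        raise ValueError("unexpected symbol in feature string")
    melody = "".join(p for p, _ in pairs)
    rhythm = "".join(d for _, d in pairs)
    return melody, rhythm


def suffixes(s):
    """The suffixes that a suffix tree over s would index."""
    return [s[i:] for i in range(len(s))]


melody, rhythm = split_feature_string("a1b2a1b2c2")
print(melody, suffixes(melody))
print(rhythm, suffixes(rhythm))
```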

Spring-2008 MMDB-Audio 38 Twin Suffix Tree Twin Suffix Tree is constructed by adding additional information to the Independent Tree. This index structure consists of a melody and a rhythm suffix tree with links pointing.

Spring-2008 MMDB-Audio 39 Twin Suffix Tree [Figure: the Twin Suffix Tree constructed from "a1b2a2b1a2b2c2".]

Spring-2008 MMDB-Audio 40 Grid-Twin Suffix Tree Use a hash function to map each suffix of the feature string into a specific bucket of a 2D grid. The hash function uses the first n symbols of the suffix to map it into a specific bucket.

Spring-2008 MMDB-Audio 41 Grid-Twin Suffix Tree [Figure: the Grid-Twin Suffix Tree for "a1b2a2c1a3".]

Spring-2008 MMDB-Audio 42 Condensed Grid-Twin Suffix Tree

Spring-2008 MMDB-Audio 43 Condensed Grid-Twin Suffix Tree [Figure: condensed grid for the strings "abaca" and "caaca", with each entry pointing to a music ID.]

Spring-2008 MMDB-Audio 44 Multi-Feature Numeric Indexing for Music Data [Figure: the feature string "a1b1c1a1" plotted on melody and rhythm axes.]

Spring-2008 MMDB-Audio 45 Multi-Feature Numeric Indexing for Music Data [Figure: index structure with a non-leaf node, leaf nodes and linked lists.]

Spring-2008 MMDB-Audio 46 Multi-Feature Numeric Indexing for Music Data [Figure: three-dimensional feature space with melody, rhythm and chord axes.]

Spring-2008 MMDB-Audio 47 Hybrid Multi-Feature Index Uses a multi-feature tree structure instead of the grid structure of the Grid-Twin Suffix Tree (GTST). [Figure: example points (2, 3), (3, 5), (6, 2), (5, 5), (4, 3.75), (1, 1), (1.5, 2).]

Spring-2008 MMDB-Audio 48 Suffix Trees with Bit Arrays Instead of the links between corresponding feature nodes used in the Twin Suffix Tree, bit arrays are created to indicate the relationships between the suffix trees.

Spring-2008 MMDB-Audio 49 Feature Extraction of Music Data Some sequences of notes appear more than once in a music object; these are called repeating patterns. Much research in musicology and music psychology agrees that the repeating pattern is one of the general features used in modeling music structure.

Spring-2008 MMDB-Audio 50 Repeating Patterns of Music Data
- Repeating pattern: a substring of the string S that appears more than once and whose length is equal to or greater than 2.
- Non-trivial repeating pattern: a repeating pattern X for which no other repeating pattern of S contains X as a substring and has the same frequency as X.
- Fault-tolerant non-trivial repeating pattern: sequences that differ in a few notes are allowed to count as occurrences of the same non-trivial repeating pattern.

Spring-2008 MMDB-Audio 51 Example: Consider the melody string "C-D-E-F-C-D-E-C-D-E-F". This melody string has ten repeating patterns:
RP | Freq
C-D-E-F | 2
C-D-E | 3
D-E-F | 2
C-D | 3
D-E | 3
E-F | 2
C | 3
D | 3
E | 3
F | 2
Non-trivial: freq("C-D-E-F") = freq("D-E-F") = freq("E-F") = freq("F") = 2, and freq("C-D-E") = freq("C-D") = freq("D-E") = freq("C") = freq("D") = freq("E") = 3, so only "C-D-E-F" and "C-D-E" are non-trivial.
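The example can be reproduced by brute force. This sketch counts every substring (overlapping occurrences included) and then drops each pattern that is contained in another pattern with the same frequency. It is a naive illustration of the definition, not one of the extraction algorithms covered later in the deck; the function names are hypothetical, each note is written as one character, and the minimum length defaults to 1 because the example counts single notes.

```python
from collections import Counter


def repeating_patterns(s, min_len=1):
    """All substrings of s that occur at least twice, with frequencies."""
    counts = Counter(
        s[i:j]
        for i in range(len(s))
        for j in range(i + min_len, len(s) + 1)
    )
    return {p: c for p, c in counts.items() if c >= 2}


def non_trivial(patterns):
    """Keep X unless some other pattern Y contains X with equal frequency."""
    return {
        x: fx
        for x, fx in patterns.items()
        if not any(x != y and x in y and fx == fy
                   for y, fy in patterns.items())
    }


melody = "CDEFCDECDEF"  # C-D-E-F-C-D-E-C-D-E-F, one character per note
patterns = repeating_patterns(melody)
print(len(patterns), non_trivial(patterns))
```

Enumerating all substrings takes quadratic space in the melody length, which is the cost the later techniques try to avoid.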

Spring-2008 MMDB-Audio 52 Music Feature Extractions
- Correlative Matrix
- FastPET
- RP-Tree
- 2RC
- Similar Non-trivial Repeating Pattern
- Fault Tolerance Non-trivial Repeating Patterns

Spring-2008 MMDB-Audio 53 CORRELATIVE MATRIX The correlative matrix of the string S = "CAACCAACDCBC". CS is the candidate set, with entries of the form (pattern, rep_count, sub_count). There are four cases for updating CS:
1. T[i,j] = 1 and T[i+1,j+1] = 0. Example: T[1,4] = 1 and T[2,5] = 0, so insert ("C",1,0) into CS.
2. T[i,j] = 1 and T[i+1,j+1] ≠ 0. Example: T[1,5] = 1 and T[2,6] ≠ 0, so modify the entry to ("C",2,1).
3. T[i,j] > 1 and T[i+1,j+1] ≠ 0. Example: T[2,6] = 2 and T[3,7] ≠ 0, so insert ("CA",1,1) and ("A",1,1).
4. T[i,j] > 1 and T[i+1,j+1] = 0. Example: T[4,8] = 4 and T[5,9] = 0, so insert ("CAAC",1,0), ("AAC",1,1) and ("AC",1,1), and change ("C",6,1) into ("C",7,2).
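The matrix entries quoted in the four cases are consistent with the usual recurrence: T[i,j] = T[i-1,j-1] + 1 when the i-th and j-th notes are equal (for i < j), and 0 otherwise. The sketch below fills the upper triangle under that assumption; it does not implement the candidate-set bookkeeping, and the function name is hypothetical.

```python
def correlative_matrix(s):
    """Upper-triangular matrix T with 1-based indices (row and column 0
    are an all-zero border). T[i][j] is the length of the common run
    ending at positions i and j, for i < j."""
    n = len(s)
    T = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            if s[i - 1] == s[j - 1]:
                T[i][j] = T[i - 1][j - 1] + 1
    return T


T = correlative_matrix("CAACCAACDCBC")
print(T[1][4], T[2][6], T[3][7], T[4][8], T[5][9])
```

T[4][8] = 4 says that the four notes ending at position 4 ("CAAC") repeat ending at position 8, which is the pattern inserted in case 4.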

Spring-2008 MMDB-Audio 54 CORRELATIVE MATRIX (cont.) Two more tasks remain:
1. If a repeating pattern is a substring of another repeating pattern and the two have the same repeating frequency, it is removed from the candidate set CS. Ex: ("CA",1,1), ("CAA",1,1), ("AA",1,1), ("AAC",1,1) and ("AC",1,1) are removed, since they are all substrings of the repeating pattern ("CAAC",1,0).
2. We calculate the real repeating frequency f of every repeating pattern found from its rep_count. Ex: "C". [Worked values not preserved in the transcript.]

Spring-2008 MMDB-Audio 55 RP-TREE The RP-tree for the music feature string S = "ABCDEFGHABCDEFGHIJABC", level by level:
Length 8: {ABCDEFGH,2,(1,9)}
Length 4: {ABCD,2,(1,9)} {BCDE,2,(2,10)} {CDEF,2,(3,11)} {DEFG,2,(4,12)} {EFGH,2,(5,13)}
Length 2: {AB,3,(1,9,19)} {BC,3,(2,10,20)} {CD,2,(3,11)} {DE,2,(4,12)} {EF,2,(5,13)} {FG,2,(6,14)} {GH,2,(7,15)}
Length 1: {A,3,(1,9,19)} {B,3,(2,10,20)} {C,3,(3,11,21)} {D,2,(4,12)} {E,2,(5,13)} {F,2,(6,14)} {G,2,(7,15)} {H,2,(8,16)}

Spring-2008 MMDB-Audio 56 RP-TREE (cont.)
(a) {ABCDEFGH,2,(1,9)} {AB,3,(1,9,19)} {BC,3,(2,10,20)}
(b) {AB,3,(1,9,19)} {BC,3,(2,10,20)} {ABCDEFGH,2,(1,9)} {ABC,3,(1,9,19)}
(c) {ABCDEFGH,2,(1,9)} {ABC,3,(1,9,19)}

Spring-2008 MMDB-Audio 57 FastPET: Fast Pattern Extracting Technique [Figure: correlative matrix for "abcdbcdabcabcd", with rows indexed by i and columns by j.]

Spring-2008 MMDB-Audio 58 FastPET (cont.) [Figure: partially filled correlative matrix for "abcdbcdabcabcd".] After 'abc' is found: P[8] = {3}, P[11] = {3}. PatternSet = {{'abc', 3}}

Spring-2008 MMDB-Audio 59 FastPET (cont.) [Figure: partially filled correlative matrix for "abcdbcdabcabcd".] P[8] = {3}, P[11] = {3, 4}. PatternSet = {{'abc', 3}, {'abcd', 2}}

Spring-2008 MMDB-Audio 60 FastPET (cont.) Non-trivial repeating patterns for 'abcdbcdabcabcd':
Non-trivial repeating pattern | Frequency | Pattern length | Starting positions
bc | 4 | 2 | 2, 5, 9, 12
abc | 3 | 3 | 1, 8, 11
bcd | 3 | 3 | 2, 5, 12
abcd | 2 | 4 | 1, 11
P[5] = {3}, P[8] = {3}, P[9] = {2}, P[11] = {3, 4}, P[12] = {3}. PatternSet = {{'bc', 4}, {'abc', 3}, {'bcd', 3}, {'abcd', 2}}

Spring-2008 MMDB-Audio 61 2RC (Two-Row Comparison) 2RC saves memory: it needs only O(n) space. Example: S = "abcdbcdabcabcd". At i=1, Row A holds the row for 'a', with nonzero entries 1, 1, 1.

Spring-2008 MMDB-Audio 62 2RC (cont.) At i=2, Row A holds the row for 'a' (1, 1, 1) and Row B the row for 'b' (2, 1, 2, 2).

Spring-2008 MMDB-Audio 63 2RC (cont.) At i=3, Row A holds the row for 'b' (2, 1, 2, 2) and Row B the row for 'c' (3, 2, 3, 3).

Spring-2008 MMDB-Audio 64 2RC (cont.) At i=4, Row A holds the row for 'c' (3, 2, 3, 3) and Row B the row for 'd' (4, 3, 4). PatternSet = {{"abc", 3}}
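The memory-saving idea, keeping only the previous and the current row of the correlative matrix, can be sketched as below. This illustrates the two-row evaluation only and is not the full 2RC pattern-extraction procedure; the function names are hypothetical, and the diagonal (i = j) is left out, so each row has one nonzero entry fewer than the rows shown on the slides.

```python
def two_row_scan(s):
    """Yield (i, row) for each row of the correlative matrix while
    holding only two rows in memory. Rows are 1-based lists; entry j
    is nonzero only for j > i."""
    n = len(s)
    prev = [0] * (n + 1)
    for i in range(1, n + 1):
        curr = [0] * (n + 1)
        for j in range(i + 1, n + 1):
            if s[i - 1] == s[j - 1]:
                curr[j] = prev[j - 1] + 1
        yield i, curr
        prev = curr


def longest_repeated_substring(s):
    """Use the two-row scan to find the longest substring occurring twice."""
    best_len, best_end = 0, 0
    for i, row in two_row_scan(s):
        longest_in_row = max(row)
        if longest_in_row > best_len:
            best_len, best_end = longest_in_row, i
    return s[best_end - best_len:best_end]


S = "abcdbcdabcabcd"
rows = dict(two_row_scan(S))
print({j: v for j, v in enumerate(rows[2]) if v})
print(longest_repeated_substring(S))
```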

Spring-2008 MMDB-Audio 65 True suffix tree approach for non-trivial repeating pattern discovery (TRP)
Step 1. Construct the suffix tree after appending a stop symbol '#' to the tail of string S.
Step 2. Find the repeating patterns.
Step 3. Pattern sweeping.

Spring-2008 MMDB-Audio 66 Example 1 - Step 1 of TRP [Figure: true suffix tree of S = "abcdbcdabcabcd#", with leaves numbered 1 to 14 by suffix start position.]

Spring-2008 MMDB-Audio 67 Example 1 - Step 2 of TRP All repeating patterns of music object S, "abcdbcdabcabcd":
Repeating pattern | Frequency | Pattern length | Starting positions
abcd | 2 | 4 | 1, 11
abc | 3 | 3 | 1, 8, 11
bcd | 3 | 3 | 2, 5, 12
cd | 3 | 2 | 3, 6, 13
bc | 4 | 2 | 2, 5, 9, 12

Spring-2008 MMDB-Audio 68 Example 1 - Step 3 of TRP Pattern sweeping for music object S, "abcdbcdabcabcd", to obtain the non-trivial repeating patterns:
Repeating pattern | Pattern length | Starting positions | Scopes of repeating pattern
abcd | 4 | 1, 11 | 1~4, 11~14
abc | 3 | 1, 8, 11 | 1~3, 8~10, 11~13
bcd | 3 | 2, 5, 12 | 2~4, 5~7, 12~14
cd | 2 | 3, 6, 13 | 3~4, 6~7, 13~14
bc | 2 | 2, 5, 9, 12 | 2~3, 5~6, 9~10, 12~13
The ending position of each occurrence is the upper bound of its scope.

Spring-2008 MMDB-Audio 69 Example 2 - TRP Pattern sweeping for the repeating patterns of S = "aaaaaaaaaa":
Repeating pattern | Length | Frequency | Scope
aa | 2 | 9 | 1~2
aaa | 3 | 8 | 1~3
aaaa | 4 | 7 | 1~4
aaaaa | 5 | 6 | 1~5
aaaaaa | 6 | 5 | 1~6
aaaaaaa | 7 | 4 | 1~7
aaaaaaaa | 8 | 3 | 1~8
aaaaaaaaa | 9 | 2 | 1~9
The slide marks the non-trivial repeating pattern in this table.

Spring-2008 MMDB-Audio 70 Fault Tolerant Non-trivial Repeating Pattern Discovery (FTRP)
Step 1. Constructing the suffix tree
Step 2. Creating the repeating pattern table
Step 3. Greedily concatenating repeating patterns
Step 4. Extracting fault-tolerant non-trivial repeating patterns

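Spring-2008 MMDB-Audio 71 Step 2 of FTRP Creating the repeating pattern (RP) table:
RP | Length | Start | End (= Start + Length - 1) | Scope
abcbc | 5 | 1, 6 | 1+5-1=5, 6+5-1=10 | 1~5, 6~10
bcbc | 4 | 2, 7 | 2+4-1=5, 7+4-1=10 | 2~5, 7~10
cbc | 3 | 3, 8 | 3+3-1=5, 8+3-1=10 | 3~5, 8~10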

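Spring-2008 MMDB-Audio 72 Step 3 of FTRP Greedily concatenating repeating patterns. The notes at positions 1 to 13 are b c f d a e h g b c d a e. The repeating pattern "bc" has scopes 1~2 and 9~10, and the repeating pattern "dae" has scopes 4~6 and 11~13. Concatenating them with 1 fault in the first occurrence and 0 faults in the second gives the fault-tolerant pattern "bc?dae".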

Spring-2008 MMDB-Audio 73 Step 4 of FTRP Extracting fault-tolerant non-trivial repeating patterns:
Pattern | Scope
FTRP bc?dae | 1~6, 9~13
RP bc | 1~2, 9~10
RP dae | 4~6, 11~13
"bc" and "dae" are both contained in "bc?dae".

Spring-2008 MMDB-Audio 74 Performance Study The Effect on Repeating Pattern Found

Spring-2008 MMDB-Audio 75 Hit Ratio Improvement