Fast Fourier Transform

1 Fast Fourier Transform
Algorithms in Action: Fast Fourier Transform. Haim Kaplan, Uri Zwick, Tel Aviv University, March 2016. Last updated: March 28, 2017.

2 String Matching
Text: abraabracadabracadabraabara
Pattern: abracadabra (shown at two alignments)
Given a text of length $n$ and a pattern of length $m$, find all occurrences of the pattern in the text. The naïve algorithm runs in $O(mn)$ time. Several classical algorithms run in $O(m+n)$ time. [Knuth-Morris-Pratt (1977)] [Boyer-Moore (1977)]
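
As a baseline for the $O(mn)$ bound, the naïve algorithm can be sketched in Python as follows. The function name `naive_find_all` is an illustrative choice and does not come from the slides.

```python
def naive_find_all(text, pattern):
    """Return every start index k at which pattern occurs in text.

    Tries all n - m + 1 alignments and compares up to m symbols at each,
    so the worst-case running time is O(mn).
    """
    n, m = len(text), len(pattern)
    matches = []
    for k in range(n - m + 1):
        j = 0
        while j < m and text[k + j] == pattern[j]:
            j += 1
        if j == m:
            matches.append(k)
    return matches


print(naive_find_all("abraabracadabracadabraabara", "abracadabra"))  # [4, 11]
```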

3 More String Matching Problems
Text: abraabracadabracadabraabara
Pattern: abracadabra (shown at two alignments)
Count the number of matches/mismatches in each alignment of the pattern with the text. (Find all alignments with at most $k$ mismatches.) Allow a wildcard ("don't care") symbol ($*$) that matches any single symbol in the pattern and/or text. "Traditional" string matching techniques are not so efficient for these extensions.

4 (Cross-)Correlation
$\mathbf{x} = (x_0, x_1, x_2, x_3)$, $\mathbf{y} = (y_0, y_1, y_2, y_3)$
$z_{-3} = x_0 y_3$
$z_{-2} = x_0 y_2 + x_1 y_3$
$z_{-1} = x_0 y_1 + x_1 y_2 + x_2 y_3$
$z_0 = x_0 y_0 + x_1 y_1 + x_2 y_2 + x_3 y_3$
$z_1 = x_1 y_0 + x_2 y_1 + x_3 y_2$
$z_2 = x_2 y_0 + x_3 y_1$
$z_3 = x_3 y_0$

5 (Cross-)Correlation
$z_k = \sum_i x_i y_{i-k} = \sum_j x_{j+k} y_j = (\mathbf{x} * \mathbf{y}^R)_{k+n-1}$, for $k = -(n-1), \ldots, n-1$, where $\mathbf{y}^R$ is $\mathbf{y}$ reversed.
A convolution without the initial reversal, with a shift of indices.
The correlation of two vectors of length $n$ can be computed in $O(n \log n)$ time.
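
The identity above can be checked with a small Python sketch that computes the correlation as a convolution with the reversed vector. The recursive radix-2 FFT and the helper names `fft`, `convolve` and `correlate` are illustrative choices, not code from the slides.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (inverse if invert=True, without the 1/n scaling).

    len(a) must be a power of two.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def convolve(x, y):
    """Linear convolution of x and y via zero-padded FFTs."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    fy = fft(list(y) + [0] * (size - len(y)))
    prod = fft([a * b for a, b in zip(fx, fy)], invert=True)
    return [v.real / size for v in prod[:out_len]]


def correlate(x, y):
    """Return [z_k for k = -(n-1), ..., n-1] with z_k = sum_j x[j+k] * y[j].

    z_k is entry k + n - 1 of the convolution of x with y reversed.
    """
    assert len(x) == len(y)
    return convolve(x, y[::-1])


x, y = [1, 2, 3, 4], [5, 6, 7, 8]
print([round(v) for v in correlate(x, y)])  # [8, 23, 44, 70, 56, 39, 20]
```

The first printed value is $z_{-3} = x_0 y_3$ and the middle one is $z_0$, the ordinary inner product.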

6 (Cross-)Correlation (unequal lengths)
$\mathbf{x} = (x_0, \ldots, x_5)$, $\mathbf{y} = (y_0, \ldots, y_3)$
$z_{-3} = x_0 y_3$

13 (Cross-)Correlation
$\mathbf{x} = (x_0, \ldots, x_5)$, $\mathbf{y} = (y_0, \ldots, y_3)$
$z_{-3} = x_0 y_3$
$z_{-2} = x_0 y_2 + x_1 y_3$
$z_{-1} = x_0 y_1 + x_1 y_2 + x_2 y_3$
$z_0 = x_0 y_0 + x_1 y_1 + x_2 y_2 + x_3 y_3$
$z_1 = x_1 y_0 + x_2 y_1 + x_3 y_2 + x_4 y_3$
$z_2 = x_2 y_0 + x_3 y_1 + x_4 y_2 + x_5 y_3$
$z_3 = x_3 y_0 + x_4 y_1 + x_5 y_2$
$z_4 = x_4 y_0 + x_5 y_1$
$z_5 = x_5 y_0$

14 (Cross-)Correlation
$z_k = \sum_i x_i y_{i-k} = \sum_j x_{j+k} y_j = (\mathbf{x} * \mathbf{y}^R)_{k+m-1}$
If $\mathbf{x}$ is of length $n$ and $\mathbf{y}$ of length $m$, where $m \le n$, then $k = -(m-1), \ldots, n-1$. Sometimes, only the values $k = 0, \ldots, n-m$, corresponding to a full overlap of $\mathbf{x}$ with a shift of $\mathbf{y}$, are of interest.
Exercise: The correlation of two vectors of lengths $n$ and $m$, where $m \le n$, can be computed in $O(n \log m)$ time.

15 Counting mismatches [Fischer-Paterson (1974)]
Let $\Sigma$ be the alphabet of the pattern and text. We may assume that $|\Sigma| \le m+1$. (Why?)
For every $a \in \Sigma$ create two Boolean strings:
$P_a[j] = 1$ iff $P[j] = a$
$T_a[i] = 1$ iff $T[i] \ne a$
Correlation of $P_a$ and $T_a$ counts mismatches involving $a$.

16 Counting mismatches
Text: abraabracadabracadabraabara
Pattern: abracadabra

17 Counting mismatches
Text: abraabracadabracadabraabara
Pattern: abracadabra

18 Counting mismatches
Let $\Sigma$ be the alphabet of the pattern and text. We may assume that $|\Sigma| \le m+1$. (Why?)
For every $a \in \Sigma$ create two Boolean strings:
$P_a[j] = 1$ iff $P[j] = a$
$T_a[i] = 1$ iff $T[i] \ne a$
Correlation of $P_a$ and $T_a$ counts mismatches involving $a$. Summing over all $a \in \Sigma$ we get the total number of mismatches.
Complexity: $O(|\Sigma|\, n \log m)$ word operations. (Each word is assumed to hold $\Theta(\log n)$ bits.) Fast only if $|\Sigma|$ is small.
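
The reduction can be sketched in Python as below. To keep the example short, `correlate_full_overlap` is a direct $O(nm)$ stand-in for the correlation step; an FFT-based routine would take its place to reach the stated bound. Both function names are illustrative.

```python
def correlate_full_overlap(x, y):
    """z_k = sum_j x[k+j] * y[j] for k = 0, ..., n-m (direct O(nm) stand-in)."""
    n, m = len(x), len(y)
    return [sum(x[k + j] * y[j] for j in range(m)) for k in range(n - m + 1)]


def count_mismatches(text, pattern):
    """Number of mismatching positions for every full alignment k."""
    totals = [0] * (len(text) - len(pattern) + 1)
    # Symbols absent from the pattern have an all-zero P_a and contribute nothing,
    # so it is enough to loop over the symbols of the pattern.
    for a in set(pattern):
        p_a = [1 if c == a else 0 for c in pattern]   # P_a[j] = 1 iff P[j] = a
        t_a = [1 if c != a else 0 for c in text]      # T_a[i] = 1 iff T[i] != a
        for k, v in enumerate(correlate_full_overlap(t_a, p_a)):
            totals[k] += v
    return totals


counts = count_mismatches("abraabracadabracadabraabara", "abracadabra")
print(counts[0])                                    # 6
print([k for k, c in enumerate(counts) if c == 0])  # [4, 11]
```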

19 Counting mismatches with wildcards [Fischer-Paterson (1974)]
For every $a \in \Sigma$ create two Boolean strings:
$P_a[j] = 1$ iff $P[j] = a$
$T_a[i] = 1$ iff $T[i] \ne a$ and $T[i] \ne *$
Complexity: $O(|\Sigma|\, n \log m)$ word operations.
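
A minimal Python sketch of the wildcard variant follows, assuming `*` is the wildcard and is not itself a member of $\Sigma$. As before, the direct `correlate_full_overlap` helper only stands in for an FFT-based correlation, and all names are illustrative.

```python
WILDCARD = "*"


def correlate_full_overlap(x, y):
    """z_k = sum_j x[k+j] * y[j] for k = 0, ..., n-m (direct O(nm) stand-in)."""
    n, m = len(x), len(y)
    return [sum(x[k + j] * y[j] for j in range(m)) for k in range(n - m + 1)]


def count_mismatches_wild(text, pattern):
    """Mismatch counts per alignment; a wildcard on either side never mismatches."""
    totals = [0] * (len(text) - len(pattern) + 1)
    for a in set(pattern) - {WILDCARD}:
        # A pattern wildcard equals no symbol a, so P_a is 0 there automatically.
        p_a = [1 if c == a else 0 for c in pattern]
        # A text wildcard is excluded explicitly, as in the definition of T_a.
        t_a = [1 if c != a and c != WILDCARD else 0 for c in text]
        for k, v in enumerate(correlate_full_overlap(t_a, p_a)):
            totals[k] += v
    return totals


counts = count_mismatches_wild("abraabraca*abracadabraabara", "abracada*ra")
print([k for k, c in enumerate(counts) if c == 0])  # [4, 11]
```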

20 Counting mismatches with wildcards
Text: abraabraca*abracadabraabara
Pattern: abracada*ra
Text: abraabra*adabracadabraabara
Pattern: abracada*ra

21 Counting mismatches with wildcards
If we only want to find exact matches, replace each character $a \in \Sigma$ by a specific $\lceil \log_2 |\Sigma| \rceil$-bit string.

22 Counting mismatches with wildcards
Example encoding: b r * c; 001 010 011 *** 100
Count mismatches of the binary strings as before (2 convolutions). A result of 0 corresponds to a match.
Complexity drops to $O(\log|\Sigma| \cdot n \log m)$. Can we get rid of the dependence on $|\Sigma|$?
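
One way to realize this idea is sketched below in Python: handle one bit position of the codes at a time, counting binary mismatches with two correlations per bit position. Processing bit planes separately, numbering the codes from 0, and the direct $O(nm)$ correlation stand-in are all assumptions of this sketch rather than details taken from the slides.

```python
WILDCARD = "*"


def correlate_full_overlap(x, y):
    """z_k = sum_j x[k+j] * y[j] for k = 0, ..., n-m (direct O(nm) stand-in)."""
    n, m = len(x), len(y)
    return [sum(x[k + j] * y[j] for j in range(m)) for k in range(n - m + 1)]


def binary_mismatches(text_bits, pattern_bits):
    """Mismatch counts over {0, 1}, with None as the wildcard (2 correlations)."""
    totals = [0] * (len(text_bits) - len(pattern_bits) + 1)
    for a in (0, 1):
        p_a = [1 if b == a else 0 for b in pattern_bits]
        t_a = [1 if b is not None and b != a else 0 for b in text_bits]
        for k, v in enumerate(correlate_full_overlap(t_a, p_a)):
            totals[k] += v
    return totals


def bit_plane(s, code, bit):
    """Bit number `bit` of each symbol's code; wildcards stay wildcards."""
    return [None if c == WILDCARD else (code[c] >> bit) & 1 for c in s]


def exact_matches_wild(text, pattern):
    symbols = sorted(set(text + pattern) - {WILDCARD})
    code = {c: i for i, c in enumerate(symbols)}
    width = max(1, (len(symbols) - 1).bit_length())   # ceil(log2 |Sigma|) bits
    totals = [0] * (len(text) - len(pattern) + 1)
    for bit in range(width):
        planes = bit_plane(text, code, bit), bit_plane(pattern, code, bit)
        for k, v in enumerate(binary_mismatches(*planes)):
            totals[k] += v
    # Only a total of 0 is meaningful: it means no bit of any symbol disagrees.
    return [k for k, v in enumerate(totals) if v == 0]


print(exact_matches_wild("abraabraca*abracadabraabara", "abracada*ra"))  # [4, 11]
```

A nonzero total counts disagreeing bits, not disagreeing characters, which is why this trick only answers the exact-match question.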

23 $L_2$-matching [Lipsky-Porat (2011)]
Standard string matching uses the Hamming distance. Two characters either match or they do not. $a$ is not closer to $b$ than to $z$.
Suppose that each "character" is a real number. We want to find approximate matches.
For each $k = 0, 1, \ldots, n-m$ we want to compute
$d_k = \sum_{j=0}^{m-1} (p_j - t_{k+j})^2$
$L_2$-distance: $\|\mathbf{x} - \mathbf{y}\|^2 = \sum_{j=0}^{m-1} (x_j - y_j)^2$

24 $L_2$-matching [Lipsky-Porat (2011)]
$\sum_{j=0}^{m-1} (p_j - t_{k+j})^2 = \sum_{j=0}^{m-1} p_j^2 - 2 \sum_{j=0}^{m-1} p_j t_{k+j} + \sum_{j=0}^{m-1} t_{k+j}^2$
First term: constant. $O(m)$ time.
Second term: correlation. $O(n \log m)$ time.
Third term: easy in $O(n)$ time.
$L_2$-matching can be computed in $O(n \log m)$ time.
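
The three terms of the expansion can be computed separately, as in the following Python sketch. The cross term uses a direct $O(nm)$ correlation as a stand-in for an FFT-based one, and the third term uses prefix sums of $t_i^2$. The function names are illustrative.

```python
def correlate_full_overlap(x, y):
    """z_k = sum_j x[k+j] * y[j] for k = 0, ..., n-m (direct O(nm) stand-in)."""
    n, m = len(x), len(y)
    return [sum(x[k + j] * y[j] for j in range(m)) for k in range(n - m + 1)]


def l2_distances(text, pattern):
    """d_k = sum_j (p_j - t_{k+j})^2 for k = 0, ..., n-m."""
    n, m = len(text), len(pattern)
    pattern_sq = sum(p * p for p in pattern)         # constant term, O(m)
    cross = correlate_full_overlap(text, pattern)    # sum_j p_j * t_{k+j}
    prefix = [0.0]                                   # prefix sums of t_i^2, O(n)
    for t in text:
        prefix.append(prefix[-1] + t * t)
    return [pattern_sq - 2 * cross[k] + (prefix[k + m] - prefix[k])
            for k in range(n - m + 1)]


print(l2_distances([1.0, 2.0, 3.5, 2.0, 1.0, 0.0], [2.0, 3.0, 2.0]))
# [4.25, 0.25, 4.25, 8.0]
```

The smallest value, at $k = 1$, marks the alignment where the pattern is closest to the text.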

25 Exact matches with wildcards [Clifford-Clifford (2007)]
Replace each character by a positive integer. Replace the wildcard by 0.
For each $k = 0, 1, \ldots, n-m$ compute
$d_k = \sum_{j=0}^{m-1} p_j t_{k+j} (p_j - t_{k+j})^2$
There is an exact match at position $k$ iff $d_k = 0$.

26 Exact matches with wildcards [Clifford-Clifford (2007)]
$d_k = \sum_{j=0}^{m-1} p_j t_{k+j} (p_j - t_{k+j})^2 = \sum_{j=0}^{m-1} p_j^3 t_{k+j} - 2 \sum_{j=0}^{m-1} p_j^2 t_{k+j}^2 + \sum_{j=0}^{m-1} p_j t_{k+j}^3$
Compute three correlations of appropriate sequences in $O(n \log m)$ time. Running time is independent of $|\Sigma|$!
Assuming that each character fits in a $\Theta(\log n)$-bit word and that operations on such words take constant time.
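
A Python sketch of the three-correlation computation is given below, assuming `*` is the wildcard. The symbol numbering, the recursive FFT and all helper names are illustrative choices; the correlation values are rounded to integers, which is safe for inputs as small as this example.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def convolve(x, y):
    """Linear convolution of x and y via zero-padded FFTs."""
    out_len = len(x) + len(y) - 1
    size = 1
    while size < out_len:
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    fy = fft(list(y) + [0] * (size - len(y)))
    prod = fft([a * b for a, b in zip(fx, fy)], invert=True)
    return [v.real / size for v in prod[:out_len]]


def correlate_full_overlap(x, y):
    """Rounded z_k = sum_j x[k+j] * y[j] for k = 0, ..., n-m (integer inputs)."""
    m = len(y)
    full = convolve(x, y[::-1])            # lag k is stored at index k + m - 1
    return [round(v) for v in full[m - 1:len(x)]]


def wildcard_matches(text, pattern, wildcard="*"):
    symbols = sorted(set(text + pattern) - {wildcard})
    code = {c: i + 1 for i, c in enumerate(symbols)}   # positive integers
    code[wildcard] = 0                                 # wildcard becomes 0
    t = [code[c] for c in text]
    p = [code[c] for c in pattern]
    c1 = correlate_full_overlap(t, [v ** 3 for v in p])                    # sum p^3 t
    c2 = correlate_full_overlap([v ** 2 for v in t], [v ** 2 for v in p])  # sum p^2 t^2
    c3 = correlate_full_overlap([v ** 3 for v in t], p)                    # sum p t^3
    return [k for k in range(len(c1)) if c1[k] - 2 * c2[k] + c3[k] == 0]


print(wildcard_matches("abraabraca*abracadabraabara", "abracada*ra"))  # [4, 11]
```

Every term of $d_k$ is nonnegative, so $d_k = 0$ forces each position to either agree or involve a wildcard.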
