Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming in C with MPI and OpenMP Michael J. Quinn.


Chapter 15: The Fast Fourier Transform

Outline
- Fourier analysis
- Discrete Fourier transform
- Fast Fourier transform
- Parallel implementation

Discrete Fourier Transform
- Many applications in science and engineering
- Examples:
  - Voice recognition
  - Image processing
- Straightforward implementation: Θ(n^2)
- Fast Fourier transform: Θ(n log n)

Fourier Analysis
- Fourier analysis: represent continuous functions by potentially infinite series of sine and cosine functions
- Discrete Fourier transform: map a sequence over time to another sequence over frequency
  - Signal strength as a function of time → Fourier coefficients as a function of frequency

DFT Example (1/4)
[Figure: 16 data points representing signal strength over time]

DFT Example (2/4)
[Figure: DFT yields amplitudes and frequencies of sine/cosine functions]

DFT Example (3/4)
[Figure: plot of four constituent sine/cosine functions and their sum]

DFT Example (4/4)
[Figure: continuous function and original 16 samples]

DFT of Speech Sample
[Figure: signal, and its frequency and amplitude, for the utterance "Angora cats are furrier..."]
Figure courtesy Ron Cole and Yeshwant Muthusamy of the Oregon Graduate Institute

Computing DFT
- Matrix-vector product F_n x
  - x is the input vector (signal samples)
  - f_{i,j} = ω_n^{ij} for 0 ≤ i, j < n, where ω_n is a primitive nth root of unity
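
The matrix-vector formulation can be sketched directly in code. This is a minimal illustration, not the book's implementation: the function name `dft` is hypothetical, and the choice ω_n = e^(2πi/n) is an assumption made so that ω_2 = -1 and ω_4 = i, matching the two examples in this transcript. The nested iteration over i and j is what gives the Θ(n^2) cost.

```python
import cmath

def dft(x):
    """Naive DFT as the matrix-vector product F_n x, with f[i][j] = w**(i*j)."""
    n = len(x)
    # Assumption: w is the primitive nth root of unity exp(2*pi*i/n),
    # so that w = -1 for n = 2 and w = i for n = 4.
    w = cmath.exp(2j * cmath.pi / n)
    # Row i of F_n dotted with x; n rows of n terms each, hence Theta(n^2).
    return [sum(w ** (i * j) * x[j] for j in range(n)) for i in range(n)]
```

Calling `dft([2, 3])` and `dft([1, 2, 4, 3])` reproduces the two small examples, up to floating-point rounding.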

Example 1
- Compute the DFT of the vector (2, 3)
- ω_2, the primitive square root of unity, is -1
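
The worked product from the slide did not survive in this transcript. Under the definition f_{i,j} = ω_n^{ij} with ω_2 = -1, it would presumably read:

```latex
F_2 x =
\begin{pmatrix} \omega_2^{0\cdot 0} & \omega_2^{0\cdot 1} \\ \omega_2^{1\cdot 0} & \omega_2^{1\cdot 1} \end{pmatrix}
\begin{pmatrix} 2 \\ 3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\begin{pmatrix} 2 \\ 3 \end{pmatrix}
=
\begin{pmatrix} 5 \\ -1 \end{pmatrix}
```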

Example 2
- Compute the DFT of the vector (1, 2, 4, 3)
- The primitive 4th root of unity is i
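
As with Example 1, the slide's worked matrix is missing from the transcript. Taking ω_4 = i and f_{i,j} = ω_4^{ij}, the computation can be reconstructed as follows (a sketch derived from the stated definition, not copied from the slide):

```latex
F_4 x =
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & i & -1 & -i \\
1 & -1 & 1 & -1 \\
1 & -i & -1 & i
\end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 4 \\ 3 \end{pmatrix}
=
\begin{pmatrix}
1 + 2 + 4 + 3 \\
1 + 2i - 4 - 3i \\
1 - 2 + 4 - 3 \\
1 - 2i - 4 + 3i
\end{pmatrix}
=
\begin{pmatrix} 10 \\ -3 - i \\ 0 \\ -3 + i \end{pmatrix}
```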

Fast Fourier Transform
- A Θ(n log n) algorithm to perform the DFT
- Based on a divide-and-conquer strategy
- Suppose we want to compute f(x)
- We define two new functions, f^[0] and f^[1]
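
The definitions of the two functions are not preserved in the transcript. Assuming f is the polynomial whose coefficients a_0, ..., a_{n-1} are the input samples and that n is a power of 2, the usual even/odd split would be:

```latex
f(x)       = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1}
f^{[0]}(x) = a_0 + a_2 x + a_4 x^2 + \cdots + a_{n-2} x^{n/2-1}
f^{[1]}(x) = a_1 + a_3 x + a_5 x^2 + \cdots + a_{n-1} x^{n/2-1}
```

With these definitions, f^{[0]} collects the even-indexed coefficients and f^{[1]} the odd-indexed ones, each a polynomial of half the size.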

FFT (continued)
- Note: f(x) = f^[0](x^2) + x f^[1](x^2)
- The problem of evaluating f(x) at n values of ω reduces to
  - Evaluating f^[0](x) and f^[1](x) at n/2 values of ω
  - Performing f^[0](x^2) + x f^[1](x^2)
- Leads to a recursive algorithm with time complexity Θ(n log n)
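
The recursion described above can be sketched as follows. This is an illustrative version under stated assumptions, not the code shown on the original slides: the name `fft_recursive` is hypothetical, n is assumed to be a power of 2, and ω_n = e^(2πi/n) is assumed so the results agree with the small DFT examples in this transcript.

```python
import cmath

def fft_recursive(a):
    """Evaluate the polynomial with coefficients a at the n nth roots of unity."""
    n = len(a)  # assumed to be a power of 2
    if n == 1:
        return list(a)
    y0 = fft_recursive(a[0::2])  # f[0]: even-indexed coefficients
    y1 = fft_recursive(a[1::2])  # f[1]: odd-indexed coefficients
    w_n = cmath.exp(2j * cmath.pi / n)  # assumed primitive nth root of unity
    y = [0] * n
    w = 1
    for k in range(n // 2):
        # f(w^k) = f[0](w^(2k)) + w^k f[1](w^(2k)); w^(k + n/2) = -w^k
        t = w * y1[k]
        y[k] = y0[k] + t
        y[k + n // 2] = y0[k] - t
        w *= w_n
    return y
```

Each level of recursion does Θ(n) work and there are log n levels, which is where the Θ(n log n) bound comes from.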

Iterative Implementation Preferable
- A well-written iterative version performs fewer index computations than the recursive version
- The iterative version evaluates a key common sub-expression only once
- It is easier to derive a parallel FFT algorithm when the sequential algorithm is in iterative form

Recursive → Iterative (1/3)
[Figure: recursive implementation of FFT]

Recursive → Iterative (2/3)
[Figure: determining which computations are performed for each function invocation]

Recursive → Iterative (3/3)
[Figure: tracking the flow of data values (input vector at bottom)]
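
Since the figures are missing from the transcript, here is a hedged sketch of what an iterative form typically looks like. It is the standard bit-reversal formulation, which may differ in detail from the version on the slides. The names `bit_reverse` and `fft_iterative` are hypothetical, n is assumed to be a power of 2, and ω_m = e^(2πi/m) is assumed.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_iterative(a):
    n = len(a)  # assumed to be a power of 2
    bits = n.bit_length() - 1
    # Permute the input into bit-reversed order (the bottom row of the data flow).
    y = [a[bit_reverse(i, bits)] for i in range(n)]
    m = 2
    while m <= n:  # log n iterations, combining blocks of size m
        w_m = cmath.exp(2j * cmath.pi / m)
        for start in range(0, n, m):
            w = 1
            for j in range(m // 2):
                # Common sub-expression: computed once, used in both outputs.
                t = w * y[start + j + m // 2]
                u = y[start + j]
                y[start + j] = u + t
                y[start + j + m // 2] = u - t
                w *= w_m
        m *= 2
    return y
```

In iteration s (with m = 2^s), each element is combined with the element m/2 positions away, which is the access pattern a parallel version has to respect.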

Parallel Program Design
- Domain decomposition
  - Associate a primitive task with each element of input vector a and the corresponding element of output vector y
- Add channels to handle communications between tasks

FFT Task/Channel Graph

Agglomeration and Mapping
- Agglomerate primitive tasks associated with contiguous elements of the vector
- Map one agglomerated task to each process

After Agglomeration, Mapping
[Figure labels: Input, Output]

Phases of Parallel FFT Algorithm
- Phase 1: Processes permute the a's (all-to-all communication)
- Phase 2:
  - First log n - log p iterations of FFT
  - No message passing is required
- Phase 3:
  - Final log p iterations
  - Processes organized as a logical hypercube
  - In each iteration, every process swaps values with its partner across a hypercube dimension
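
The split between Phase 2 and Phase 3 follows from the distance between butterfly partners. The sketch below is a sequential model, not MPI code, and assumes a block decomposition with n/p contiguous elements per process, with n and p powers of 2. The names `fft_schedule` and `partner` are hypothetical. It reports, for each FFT iteration, whether the butterflies stay inside one process or cross a hypercube dimension.

```python
def fft_schedule(n, p):
    """Classify each of the log n FFT iterations as local or as a swap."""
    # Assumptions: n and p are powers of 2, p <= n, and each process
    # holds a contiguous block of n // p elements.
    log_n = n.bit_length() - 1
    block = n // p
    schedule = []
    for s in range(1, log_n + 1):
        span = 1 << (s - 1)  # distance between butterfly partners
        if span < block:
            schedule.append((s, "local", None))  # Phase 2: no messages
        else:
            dim = (span // block).bit_length() - 1
            schedule.append((s, "swap", dim))    # Phase 3: hypercube swap
    return schedule

def partner(rank, dim):
    """Process on the other side of hypercube dimension `dim`."""
    return rank ^ (1 << dim)
```

For n = 16 and p = 4, the first two iterations are local and the last two cross dimensions 0 and 1, matching the log n - log p and log p counts above. The Phase 1 permutation is not modeled here.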

Complexity Analysis
- Each process performs an equal share of the computation: Θ(n log n / p)
- All-to-all communication: Θ(n log p / p)
- Sub-vector swaps during the last log p iterations: Θ(n log p / p)

Isoefficiency Analysis
- Sequential time complexity: Θ(n log n)
- Parallel overhead: Θ(n log p)
- Isoefficiency relation: n log n ≥ C n log p ⇒ log n ≥ C log p ⇒ n ≥ p^C
- Scalability depends on C, a function of the ratio between computing speed and communication speed
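
The role of C can be made a little more concrete with the scalability function. The derivation below is a sketch: it assumes the memory requirement is M(n) = n (one vector of n elements) and uses the general definition of the scalability function as M(f(p))/p, neither of which is stated on this slide.

```latex
n \log n \ge C\, n \log p
\;\Rightarrow\; \log n \ge C \log p
\;\Rightarrow\; n \ge p^{C}

\frac{M(p^{C})}{p} = \frac{p^{C}}{p} = p^{C-1}
```

Under these assumptions, memory per process must grow as p^(C-1) to hold efficiency constant, so a small C (fast communication relative to computation) gives much better scalability than a large one.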

Summary
- Discrete Fourier transform used in many scientific and engineering applications
- Fast Fourier transform important because it implements the DFT in time Θ(n log n)
- Developed a parallel implementation of the FFT
- Why isn't scalability better?
  - Θ(n log n) sequential algorithm
  - Parallel version requires all-to-all data exchange