Time-scale and pitch modification: algorithms review. Alexey Lukin

“Time-scale and pitch modification algorithms”
The problem
Goal: change the duration or tonality of a musical piece
Naïve approach:
► (analog) record on tape and change the playback speed
► (digital) resample the waveform
Alas: pitch and duration change synchronously!
Audio example: Celine Dion, sped up by 20%
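The coupling of pitch and duration under naïve resampling can be checked numerically. The following is a minimal sketch, assuming NumPy and linear interpolation; the helper names `resample_linear` and `dominant_frequency`, the 440 Hz test tone, and the 8 kHz sample rate are illustrative choices, not part of the slides.

```python
import numpy as np

def resample_linear(x, speed):
    """Play x back `speed` times faster using linear interpolation."""
    n_out = int(round(len(x) / speed))
    positions = np.arange(n_out) * speed           # fractional read positions in x
    return np.interp(positions, np.arange(len(x)), x)

def dominant_frequency(x, sample_rate):
    """Frequency in Hz of the strongest FFT bin of a Hann-windowed signal."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.argmax(spectrum) * sample_rate / len(x)

sample_rate = 8000                                  # assumed sample rate
t = np.arange(sample_rate) / sample_rate            # one second of audio
tone = np.sin(2 * np.pi * 440.0 * t)                # 440 Hz test tone
fast = resample_linear(tone, 1.2)                   # "speed up by 20%"

print(len(tone), dominant_frequency(tone, sample_rate))
print(len(fast), dominant_frequency(fast, sample_rate))
```

The sped-up signal is shorter by the factor 1.2 and its dominant frequency is higher by the same factor, which is exactly the synchronous change the slide complains about.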

The problem
Goal: independent control of time-scale and pitch; the timbre should stay natural!
Applications:
► Samplers and virtual instruments
► Production: synchronization of audio and video
► Post-production: pull-up, pull-down
► Entertainment: karaoke (changing key)
► Education: sonic microscope
► More?

Time domain
Time-domain algorithms operate on the waveform, not the spectrum:
1. Break the signal into short granules
2. Repeat or discard (or shift) some granules to change duration
3. Resample to change pitch
Some pictures in this presentation are taken from the Ph.D. thesis of J. Bonada.

Time domain
Problems:
► Granules can add in phase (good) or out of phase (bad)
► Transients are duplicated or discarded
Audio example: guitar + castanets, slowed down to 220% length
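The granule-based duration change can be sketched as a plain overlap-add with a fixed granule size. This is a hedged illustration only: the function name `ola_stretch`, the 1024-sample granule, the Hann window, and the 50% overlap are assumptions. No phase alignment is attempted, so it shows the out-of-phase problem rather than solving it.

```python
import numpy as np

def ola_stretch(x, stretch, grain=1024):
    """Time-stretch x by `stretch` using fixed-size overlapping granules."""
    hop_out = grain // 2                        # output step: 50% overlap
    hop_in = hop_out / stretch                  # input step: smaller when stretching
    window = np.hanning(grain + 1)[:grain]      # periodic Hann, sums to 1 at 50% overlap
    n_grains = int((len(x) - grain) / hop_in) + 1
    y = np.zeros((n_grains - 1) * hop_out + grain)
    for k in range(n_grains):
        # Input granules overlap more than output granules when stretch > 1,
        # so material is effectively repeated; when stretch < 1 it is skipped.
        start = int(round(k * hop_in))
        y[k * hop_out : k * hop_out + grain] += window * x[start : start + grain]
    return y

sample_rate = 16000                             # assumed sample rate
x = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate) / sample_rate)
y = ola_stretch(x, 2.2)                         # "slow down to 220% length"
print(len(x), len(y))
```

Because the granules of the sine are pasted without regard to phase, the output envelope fluctuates; a constant input, by contrast, is reproduced exactly away from the edges.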

Time domain
Solutions:
► Ensure that pasted granules are in phase by making the granule size a multiple of the pitch period (requires autocorrelation or pitch analysis)
► Prohibit duplicating and skipping of transient granules (requires transient detection and advanced scheduling of granule duplication)
Examples: fixed granule size; pitch-synchronous granule size (“PSOLA”); pitch-synchronous granule size with transient detection
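The autocorrelation-based choice of granule size can be illustrated as follows. This is a minimal sketch, assuming NumPy; the function name `pitch_period`, the lag search range, and the two-harmonic test signal are hypothetical, and a real pitch tracker would need to handle unvoiced and polyphonic frames.

```python
import numpy as np

def pitch_period(frame, min_lag, max_lag):
    """Return the lag (in samples) that maximizes the frame's autocorrelation."""
    frame = frame - np.mean(frame)
    # Keep non-negative lags only: corr[0] is lag 0, corr[1] is lag 1, and so on.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    return min_lag + int(np.argmax(corr[min_lag:max_lag + 1]))

sample_rate = 8000                                   # assumed sample rate
t = np.arange(2048) / sample_rate
# Test signal: 200 Hz fundamental plus its second harmonic.
frame = np.sin(2 * np.pi * 200.0 * t) + 0.5 * np.sin(2 * np.pi * 400.0 * t)

period = pitch_period(frame, min_lag=20, max_lag=400)  # search 400 Hz down to 20 Hz
grain_size = 2 * period                              # granule as a multiple of the period
print(period, grain_size)
```

Granules cut at multiples of the estimated period line up in phase when they are repeated or skipped.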

Time domain
Pitch-synchronous overlap-add (PSOLA)
► Granules are 2 pitch periods long
► Granules are repeated or discarded
► Requires pitch detection → unstable results for non-pitched or polyphonic material
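A simplified picture of PSOLA time stretching, assuming the pitch period is already known and constant, might look like this. The function name `psola_stretch` and the evenly spaced pitch marks are assumptions made for brevity; practical PSOLA tracks a time-varying pitch and places marks on actual glottal or waveform peaks.

```python
import numpy as np

def psola_stretch(x, period, stretch):
    """Time-stretch a signal with a known, constant pitch period (in samples)."""
    window = np.hanning(2 * period + 1)[:2 * period]       # granule = 2 pitch periods
    marks_in = np.arange(period, len(x) - period, period)  # assumed input pitch marks
    n_out = int(len(marks_in) * stretch)
    y = np.zeros((n_out + 1) * period)
    for k in range(n_out):
        # Map each output mark back to an input mark: marks are repeated
        # when stretch > 1 and discarded when stretch < 1.
        src = marks_in[min(int(k / stretch), len(marks_in) - 1)]
        y[k * period : (k + 2) * period] += window * x[src - period : src + period]
    return y

period = 40                                                # 200 Hz at 8 kHz
x = np.sin(2 * np.pi * np.arange(4000) / period)
y = psola_stretch(x, period, 1.5)
print(len(x), len(y))
```

For a perfectly periodic input every granule is identical and the overlapping windows sum to one, so the stretched output is the same waveform, only longer. The same scheme degrades when the period estimate is wrong, which is the instability mentioned for non-pitched and polyphonic material.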

Time domain
Summary
► (+) Very fast (1…5% CPU)
► (+) Good quality for pitched signals (solo instruments, vocals)
► (–) Poor quality for non-pitched and polyphonic material:
    ► Amplitude modulation (out-of-phase overlapping of granules for some parts/instruments)
    ► Repeated or discarded transients (unless special care is taken)
Implementations
► Editors, samplers: Audition, Cubase, Logic, Ableton, ACID
► Vocal correctors: Melodyne, Autotune

Vocoders
Frequency-domain algorithms operate on the short-time spectrum of the signal
Idea: build a spectrogram of the signal (using a short-time Fourier transform) and re-synthesize the signal from the spectrogram with a different time stride (hop)
Problem: during synthesis, signal granules can overlap out of phase
Solution: phase modification at each frequency channel, called phase unwrapping

Vocoders
Traditional vocoder algorithm:
1. Calculate the short-time Fourier transform (STFT) of the signal
2. Unwrap the phases of each frequency channel (to compensate for the change of synthesis stride at step 3); don’t modify the magnitudes
3. Synthesize the signal using the inverse STFT with a different time stride
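The three steps of the traditional vocoder can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `phase_vocoder`, the 1024-point Hann window, and the fixed synthesis stride of a quarter window are choices made here, and the analysis stride is rounded to an integer, so the effective ratio is only approximately `stretch`.

```python
import numpy as np

def phase_vocoder(x, stretch, n_fft=1024):
    """Time-stretch x by roughly `stretch` with a basic phase vocoder."""
    hop_s = n_fft // 4                                  # synthesis stride
    hop_a = max(1, int(round(hop_s / stretch)))         # analysis stride
    window = np.hanning(n_fft + 1)[:n_fft]              # periodic Hann
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft  # bin centers, rad/sample
    n_frames = (len(x) - n_fft) // hop_a + 1
    y = np.zeros((n_frames - 1) * hop_s + n_fft)
    prev_phase = synth_phase = None
    for m in range(n_frames):
        # Step 1: one STFT frame.
        frame = np.fft.rfft(window * x[m * hop_a : m * hop_a + n_fft])
        mag, phase = np.abs(frame), np.angle(frame)
        if m == 0:
            synth_phase = phase.copy()
        else:
            # Step 2: deviation from the expected phase advance, wrapped to
            # [-pi, pi), gives the true frequency of each channel.
            dev = phase - prev_phase - omega * hop_a
            dev = (dev + np.pi) % (2 * np.pi) - np.pi
            synth_phase = synth_phase + (omega + dev / hop_a) * hop_s
        prev_phase = phase
        # Step 3: inverse transform with unchanged magnitudes, overlap-add.
        grain = np.fft.irfft(mag * np.exp(1j * synth_phase), n_fft)
        y[m * hop_s : m * hop_s + n_fft] += window * grain
    return y / 1.5   # squared periodic Hann windows at 75% overlap sum to 1.5

sample_rate = 8000                                      # assumed sample rate
x = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate) / sample_rate)
y = phase_vocoder(x, 2.2)
print(len(x), len(y))
```

Unlike resampling, the stretched tone keeps its 440 Hz pitch; only the number of frames per second of input changes.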

Vocoders
Magnitudes do not change
Phase unwrapping equations should provide in-phase overlapping of shifted granules at each frequency channel: “horizontal phase coherence”
[Equations: phase increment, phase unwrapping, synthesis phase]
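A commonly used textbook form of the three labeled equations (phase increment, phase unwrapping, synthesis phase) is sketched below. The symbols are assumptions introduced here and are not necessarily the notation used on the slide.

```latex
% Assumed notation: \varphi_a = analysis phase, \varphi_s = synthesis phase,
% k = frequency channel, m = frame index, R_a / R_s = analysis / synthesis stride,
% \Omega_k = 2\pi k / N = center frequency of channel k for an N-point STFT,
% princarg = wrapping of a phase to the interval [-\pi, \pi).
\begin{align}
\Delta\varphi(k,m) &= \varphi_a(k,m) - \varphi_a(k,m-1) - R_a\,\Omega_k
  && \text{(phase increment)} \\
\hat{\omega}(k,m) &= \Omega_k + \frac{\operatorname{princarg}\bigl[\Delta\varphi(k,m)\bigr]}{R_a}
  && \text{(phase unwrapping)} \\
\varphi_s(k,m) &= \varphi_s(k,m-1) + R_s\,\hat{\omega}(k,m)
  && \text{(synthesis phase)}
\end{align}
```

With these definitions each channel advances its synthesis phase by its estimated true frequency times the synthesis stride, which is what keeps successive granules in phase within a channel.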

Vocoders
Phase coherence problem
► Horizontal phase coherence is ensured by phase unwrapping
► What about vertical phase coherence (coherence of phases between different frequency bins)? It is lost (except for integer stretching ratios)! This leads to:
    ► “Phasiness” due to out-of-phase signals in the frequency bins within every signal harmonic
    ► Transients that are time-smeared along the whole granule
Audio example: guitar + castanets, vocoder, 220% length

Vocoders
Vertical phase coherence improvement: a “phase locking” algorithm locks the phases within each spectrum peak
1. Divide the frequency spectrum into intervals of harmonics
2. Unwrap the phase of the central (peak) frequency channel
3. Modify the phases of the other bins according to the phase of the central channel
This reduces phasiness, but still doesn’t help transients
Examples: no phase locking; phase locking
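One way to realize the three phase-locking steps for a single STFT frame is sketched below. It assumes the variant often called identity phase locking, in which every bin keeps its analysis phase offset relative to its peak; the slides do not say which variant is meant. The function name `lock_phases`, the peak criterion (larger than two neighbors on each side), and the toy spectrum are illustrative assumptions.

```python
import numpy as np

def lock_phases(mag, analysis_phase, synth_phase):
    """Return synthesis phases locked to the nearest spectral peak."""
    # Step 1: find peaks and assign every bin to its nearest peak, which
    # splits the spectrum into one interval per peak.
    k = np.arange(2, len(mag) - 2)
    is_peak = ((mag[k] > mag[k - 1]) & (mag[k] > mag[k - 2]) &
               (mag[k] > mag[k + 1]) & (mag[k] > mag[k + 2]))
    peaks = k[is_peak]
    if len(peaks) == 0:
        return synth_phase.copy()
    bins = np.arange(len(mag))
    nearest = peaks[np.argmin(np.abs(bins[:, None] - peaks[None, :]), axis=1)]
    # Step 2: the peak channel keeps its unwrapped synthesis phase.
    # Step 3: other bins get the same rotation as their peak, so the
    # analysis phase differences inside each interval are preserved.
    rotation = synth_phase[nearest] - analysis_phase[nearest]
    return analysis_phase + rotation

# Toy frame with two peaks, at bins 5 and 12.
mag = np.full(20, 0.1)
mag[[4, 5, 6]] = [0.5, 1.0, 0.5]
mag[[11, 12, 13]] = [0.8, 2.0, 0.8]
rng = np.random.default_rng(0)
analysis_phase = rng.uniform(-np.pi, np.pi, 20)
synth_phase = rng.uniform(-np.pi, np.pi, 20)
locked = lock_phases(mag, analysis_phase, synth_phase)
```

In a full vocoder this function would be applied to the unwrapped synthesis phases of every frame before the inverse transform.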

Vocoders
How to improve the sharpness of transients?
► The frequency resolution of human hearing is not uniform: it is better at low frequencies and worse at high frequencies
► So we can use longer STFT windows for bass (to get better frequency resolution) and shorter windows for treble
Examples: just phase locking; phase locking and multiple window sizes
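The trade-off behind multiple window sizes is simple arithmetic: the bin spacing of an STFT is the sample rate divided by the window length, while the time span of a window is the window length divided by the sample rate. The sketch below tabulates both; the helper name `stft_resolution`, the 44.1 kHz rate, and the three band/window pairs are hypothetical values chosen for illustration only.

```python
def stft_resolution(sample_rate, n_fft):
    """Return (bin spacing in Hz, window length in ms) for an n_fft-sample window."""
    return sample_rate / n_fft, 1000.0 * n_fft / sample_rate

# Hypothetical band split and window sizes at 44.1 kHz.
for band, n_fft in [("bass", 8192), ("mid", 2048), ("treble", 512)]:
    hz, ms = stft_resolution(44100, n_fft)
    print(f"{band:>6}: {n_fft:5d} samples, {hz:6.2f} Hz per bin, {ms:6.1f} ms window")
```

A long window resolves closely spaced bass harmonics but spreads a transient over its whole length, while a short window keeps transients compact at the cost of coarse frequency bins, which matters less at high frequencies.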

Vocoders
How to improve the sharpness of transients?
► We can paste transients directly to the output without stretching (and without phase modification)
► Unwrapping of steady harmonics continues through transients
Example: phase locking and multiple window sizes + transients pasted

Vocoders
Summary
► (+) Good quality for complex, polyphonic signals
► (–) Some phasiness (even with phase locking)
► (–) Smearing of transients (unless special care is taken)
► (–) Noise sometimes sounds unnatural
► (–) CPU-intensive (but still faster than real time)
Implementations
► Specialized software: SlowGold, Serato Pitch ’n Time, iZotope Radius