Automatic Speech Processing Project

Voice Morphing
Peramananallur Ranganathan Gurumoorthy, Student ID 9383-0698

What is Voice Morphing?
Voice morphing is a technique for modifying a source speaker's speech so that it sounds as if it were spoken by a different, target speaker. In simpler terms, it changes the speech of one speaker into that of another. Applications of voice morphing range from recreational to security-related uses.

[Figure: time-domain plots of the source and target signals, showing the pitch]

How to Morph Voice?
We need to change the pitch from that of a male speaker to that of a female speaker. Recall that the excitation signal carries information about the speaker, in particular the pitch. We find the LPC coefficients of the source and target signals and use these coefficients to interpolate between the two signals. The new LPC coefficients are given by the formula

new LPC coeff = const * (LPC source) + (1 - const) * (LPC target), 0 <= const <= 1 …
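The coefficient interpolation can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the function names (`autocorrelation`, `lpc`, `interpolate_lpc`) are hypothetical, and the autocorrelation method with the Levinson-Durbin recursion is assumed as the way of estimating the LPC coefficients, since the slides do not say which estimator was used. With the formula as written, const = 1 returns the source coefficients and const = 0 returns the target coefficients.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients [1, a1, ..., a_order] of A(z).

    Assumption: autocorrelation method solved with Levinson-Durbin.
    The frame must not be all zeros (r[0] would be 0).
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error                     # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k                 # updated prediction error
    return a


def interpolate_lpc(lpc_source, lpc_target, const):
    """new = const * source + (1 - const) * target, as in the slide."""
    if not 0.0 <= const <= 1.0:
        raise ValueError("const must lie in [0, 1]")
    return [const * s + (1.0 - const) * t
            for s, t in zip(lpc_source, lpc_target)]
```

For example, `interpolate_lpc(lpc(source_frame, 10), lpc(target_frame, 10), 0.5)` would weight both speakers equally, where `source_frame` and `target_frame` are placeholder names for equally long lists of samples.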

How to Morph Speech? (contd.)
A female speaker's pitch is typically higher than a male speaker's and can approach twice its value. In our example the pitch of the male speaker is 141 Hz and that of the female speaker is 210 Hz, a ratio of about 1.5. We therefore need a time-stretching algorithm with which to implement pitch shifting. We obtain the residue of the source signal and stretch it according to the value of const, which indicates where the morphed signal lies between the source and the target. For example, const = 0.2 gives a morphed signal that is closer in pitch to the source, while const = 0.8 gives a pitch that is closer to the target.
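The arithmetic behind these numbers can be written out as a short sketch. The slides do not state how const maps to an intermediate pitch, so the linear interpolation below is an assumption, as are the function names `pitch_ratio`, `morphed_pitch`, and `stretch_factor`. Here const = 0 gives the source pitch and const = 1 the target pitch, matching the description above.

```python
SOURCE_PITCH_HZ = 141.0  # male speaker in the example
TARGET_PITCH_HZ = 210.0  # female speaker in the example


def pitch_ratio(source_hz, target_hz):
    """Ratio of target pitch to source pitch."""
    return target_hz / source_hz


def morphed_pitch(source_hz, target_hz, const):
    """Assumed linear interpolation: const=0 -> source, const=1 -> target."""
    if not 0.0 <= const <= 1.0:
        raise ValueError("const must lie in [0, 1]")
    return source_hz + const * (target_hz - source_hz)


def stretch_factor(source_hz, target_hz, const):
    """Factor by which the source pitch must rise to reach the morphed pitch."""
    return morphed_pitch(source_hz, target_hz, const) / source_hz


if __name__ == "__main__":
    print(round(pitch_ratio(SOURCE_PITCH_HZ, TARGET_PITCH_HZ), 3))
    for const in (0.2, 0.8):
        print(const, round(morphed_pitch(SOURCE_PITCH_HZ, TARGET_PITCH_HZ, const), 1))
```

Under this assumption, const = 0.2 corresponds to roughly 154.8 Hz and const = 0.8 to roughly 196.2 Hz.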

How do we shift the Pitch?
We break the residue signal into small windows and apply a fade-in and a fade-out to each block. We then recombine the blocks to form the pitch-shifted signal. Depending on the value of alpha, we can time stretch the residue as required.

How do we finally Morph?
We now have the pitch-shifted residue signal and the new LPC coefficients. We resample the pitch-shifted signal so that it is played at a faster rate. [Remember that the time-stretched residue lasts longer than the original.] Passing the resampled, pitch-shifted residue through the synthesis filter built from the new LPC coefficients (the inverse of the LPC analysis filter) produces the morphed speech.
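The three steps described here (windowed overlap-add stretching, resampling, and synthesis filtering) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's implementation: the Hann window as the fade-in/fade-out shape, the 50% block overlap, the block size of 256, the linear-interpolation resampler, and all function names are choices made for this example. The slides do not define alpha precisely; it is treated here as the time-stretch factor.

```python
import math


def hann_window(length):
    """Periodic Hann window: fades each block in and then out."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length)
            for n in range(length)]


def time_stretch(residue, alpha, block=256):
    """Overlap-add stretch: output lasts roughly alpha times longer."""
    hop_out = block // 2                 # 50% overlap on the output side
    hop_in = hop_out / alpha             # read blocks more densely
    window = hann_window(block)
    n_blocks = max(1, int((len(residue) - block) / hop_in) + 1)
    out = [0.0] * ((n_blocks - 1) * hop_out + block)
    for b in range(n_blocks):
        start_in = int(round(b * hop_in))
        start_out = b * hop_out
        for n in range(block):
            idx = start_in + n
            if idx < len(residue):
                out[start_out + n] += window[n] * residue[idx]
    return out


def resample_linear(signal, new_length):
    """Resample to new_length samples by linear interpolation."""
    if new_length < 2 or len(signal) < 2:
        raise ValueError("need at least two samples")
    step = (len(signal) - 1) / (new_length - 1)
    out = []
    for i in range(new_length):
        pos = i * step
        left = min(int(pos), len(signal) - 2)
        frac = pos - left
        out.append((1.0 - frac) * signal[left] + frac * signal[left + 1])
    return out


def synthesis_filter(residue, lpc_coeffs):
    """All-pole filter 1/A(z), with lpc_coeffs = [1, a1, ..., ap]."""
    out = []
    for n, x in enumerate(residue):
        acc = x
        for k in range(1, len(lpc_coeffs)):
            if n - k >= 0:
                acc -= lpc_coeffs[k] * out[n - k]
        out.append(acc)
    return out


def morph(residue, new_lpc, alpha):
    """Stretch, resample back to the original length, then synthesize."""
    stretched = time_stretch(residue, alpha)
    shifted = resample_linear(stretched, len(residue))
    return synthesis_filter(shifted, new_lpc)
```

Resampling the stretched residue back to the original number of samples is what raises the pitch while keeping the duration unchanged. A simple overlap-add of this kind does not align blocks to pitch periods, so some audible artifacts should be expected.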