ACOUSTIC ANALYSIS Acoustic methods applied to language


Three main tools: Praat; MATLAB; WinPitch (F0, pitch tracking, pitch detection algorithms, on-the-fly alignment)

Some PhDs on phonetics (data collection / empirical)
- Adrien Méli: PhD on French learners' pronunciation of English vowels
- Maelle Amand: diphthongs in NECTE (co-supervision with Karen Corrigan and Aurélie Fischer): analysing a research project from the 1960s with current data mining techniques; corpus analysis (vowels); identification of variation / classification tasks
- Aurélie Chlébowski: acoustic analysis of nasal grunts
- Esther Legrézause: analysis of UM/HUM + misc. analyses on data mining approaches
- Léa Burin: MA on accommodation: how you imitate natives as a learner (repetition task)

Analysing the emergence of vowel categorisation in a longitudinal learner corpus: the kernel density estimate method. Caen, EPIP5, 17 May 2017. Nicolas Ballier & Adrien Méli, UFR études anglophones, EA 3967, CLILLAC-ARP, Univ Paris Diderot, Sorbonne Paris Cité

Outline
- Research questions
- Corpus
- A non-parametric approach: density estimation
- Unidimensional approach: non-native and duration as cue to KIT/FLEECE
- Multidimensional approach: 3D representations of F1, F2 and duration
- The ‘window’ effect: the kernel method

Maptask from Anderson et al. 1991

Protocols and RQs
- Phonological awareness
- Emphasis and read speech: ability to render italics (the ‘Landlady’)
- Stress patterns (Cameron’s speech)
- Missing: perception tasks…
- Repetition tasks (ANGLISH)
- Reformulation tasks (LeaP)

Méli (PhD in progress)

DATA: 20 female speakers / 5 male speakers. Syllable features: CVC / CV

Speaker 20 Session 2: A preliminary analysis (Méli, in progress)

Unnormalized per-monophthong F1 and F2 values: each dot represents the occurrence of one monophthong

Duration values for each monophthong (in seconds)

Duration for diphthongs

Per-diphthong distribution of the vectorial coefficients

DATA (duration of recordings)

MODELS
- Cosine transforms: filtering
- DATA: lm models and repeated measures: mi random // word: fixed effects
- FREQUENCY OF LEARNERS: + FREQUENCY OF
- Logistic regression -> parametric analysis
- Central limit theorem: too many outliers
- Summarize the word: average of word position
- Pairwise differences for time series
- CTree: ctree(), C5.0 package, rattle
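The "cosine transforms" item can be illustrated with a small sketch. The slides do not say which transform or scaling was used, so this assumes a DCT-II scaled so that the first coefficient equals the mean of the track. The function name `dct_coefficients` and the F2 values are hypothetical, chosen only for illustration. Under this scaling, the first coefficient summarises the overall level of a formant track, the second its tilt, and the third its curvature, which is one way a diphthong trajectory can be reduced to a few numbers.

```python
import math

def dct_coefficients(track, n_coeffs=3):
    """Return the first n_coeffs DCT-II coefficients of a sampled trajectory.

    Scaling is an assumption: coefficient 0 equals the mean of the track.
    """
    n = len(track)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                    for i, x in enumerate(track))
        scale = 1.0 / n if k == 0 else 2.0 / n
        coeffs.append(scale * total)
    return coeffs

# Hypothetical F2 track (Hz) sampled at 5 equidistant points of a rising diphthong.
f2_track = [1200.0, 1400.0, 1600.0, 1800.0, 2000.0]
c0, c1, c2 = dct_coefficients(f2_track)
print(c0, c1, c2)
```

With this sign convention a rising track gives a negative second coefficient, and a perfectly linear track gives a third coefficient of (numerically) zero.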

Kernels (Paroissin 2015)

Density estimates Don’t use with this kind of data:

NO ASSUMPTIONS: PLOTTING THE DATA
- Continuous functions: assumptions about the data: no discontinuous data
- Cross-validation for the better; variance / bias (biased: one number // )
- Density estimates in one dimension: the bandwidth is a box; cross-validation in boxes
- Bandwidth -> scaling and normalizing the data: centre on the mean, divide by the SD (variance = 1)
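The three ingredients listed on this slide (scaling, a one-dimensional density estimate, and cross-validation of the bandwidth) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure. It uses a Gaussian kernel and leave-one-out likelihood cross-validation, neither of which is specified in the slides. The duration data are simulated, and the function names (`zscore`, `gaussian_kde`, `select_bandwidth`) are hypothetical.

```python
import math
import random

def zscore(values):
    """Centre on the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def gaussian_kde(data, x, bandwidth):
    """Gaussian kernel density estimate evaluated at point x."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data) / norm

def loo_log_likelihood(data, bandwidth):
    """Score each point with a KDE built from all the other points."""
    total = 0.0
    for i, x in enumerate(data):
        rest = data[:i] + data[i + 1:]
        # The tiny constant guards against log(0) for isolated points.
        total += math.log(gaussian_kde(rest, x, bandwidth) + 1e-300)
    return total

def select_bandwidth(data, candidates):
    """Pick the candidate bandwidth with the highest leave-one-out score."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))

# Simulated durations (s): a short and a long vowel category (assumed values).
random.seed(1)
durations = ([random.gauss(0.08, 0.015) for _ in range(60)]
             + [random.gauss(0.16, 0.02) for _ in range(60)])
scaled = zscore(durations)
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6]
best_h = select_bandwidth(scaled, candidates)
print("selected bandwidth (in SD units):", best_h)
```

A bandwidth that is too small follows individual tokens (high variance), while one that is too large flattens real structure (high bias). The leave-one-out score is one common way to balance the two.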

KDE

3D representations of KDE

Distribution per word

R packages: NPDEN; testing the relevance: NPUDIST; favorite: mgcv package; library(splines)

DURATION > 0.03 s (aligners…)

KDE, unidimensional density: bimodal distributions of duration?
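Whether a duration distribution looks bimodal depends on the bandwidth. The sketch below counts the local maxima of a Gaussian kernel density estimate on a grid for two bandwidths. The durations are invented toy values, and `count_modes` is a hypothetical helper rather than a function from any package named in the slides.

```python
import math

def gaussian_kde(data, x, bandwidth):
    """Gaussian kernel density estimate evaluated at point x."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data) / norm

def count_modes(data, bandwidth, grid_size=200):
    """Count strict local maxima of the estimated density on a regular grid."""
    lo = min(data) - 3 * bandwidth
    hi = max(data) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    dens = [gaussian_kde(data, lo + i * step, bandwidth) for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1)
               if dens[i - 1] < dens[i] > dens[i + 1])

# Toy vowel durations in seconds: a cluster of short and a cluster of long tokens.
durations = [0.06, 0.07, 0.07, 0.08, 0.08, 0.09, 0.15, 0.16, 0.16, 0.17, 0.18]
modes_narrow = count_modes(durations, bandwidth=0.01)
modes_wide = count_modes(durations, bandwidth=0.05)
print(modes_narrow, modes_wide)
```

On this toy sample the narrow bandwidth keeps the two clusters apart, while the wide one merges them into a single peak. This is why an apparent bimodality should be checked against the bandwidth choice before it is read as two categories.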

F2 x duration
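A two-dimensional estimate over F2 and duration can be sketched with a product of two Gaussian kernels, one per dimension. This is an assumed, simplified form (diagonal bandwidths, no cross-validation). Giving each dimension its own bandwidth stands in here for scaling the two variables, which are measured in very different units. The token values and the name `kde_2d` are hypothetical.

```python
import math

def kde_2d(points, x, y, hx, hy):
    """Product Gaussian kernel density estimate at (x, y).

    hx and hy are the bandwidths for the first and second dimension.
    """
    norm = len(points) * 2 * math.pi * hx * hy
    total = 0.0
    for px, py in points:
        total += math.exp(-0.5 * (((x - px) / hx) ** 2 + ((y - py) / hy) ** 2))
    return total / norm

# Invented tokens: (F2 in Hz, duration in s), a short and a long vowel group.
tokens = [(2100, 0.07), (2150, 0.08), (2050, 0.06),
          (2400, 0.15), (2450, 0.16), (2380, 0.14)]

# Density inside the first cluster versus in the gap between the two clusters.
density_near = kde_2d(tokens, 2100, 0.07, hx=60.0, hy=0.02)
density_far = kde_2d(tokens, 2250, 0.11, hx=60.0, hy=0.02)
print(density_near > density_far)
```

Evaluating `kde_2d` on a regular grid of F2 and duration values gives the surface that a 3D or contour plot would display.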

Kernel effects (?)

Similar

Kitchen sink methods: all learners; all dimensions; pairwise comparisons for sessions

NEXT PLANS
- PhD viva
- R package with most of the coding
- Paper describing the syllable algorithm > GitHub
- Data paper with some subsamples of the data

REFERENCES
Ballier, N., & Martin, P. (2013). Developing corpus interoperability for phonetic investigation of learner corpora. Automatic Treatment and Analysis of Learner Corpus Data, 59.
Baayen, R. H. 2008. Analyzing linguistic data (Vol. 505). Cambridge, UK: Cambridge University Press.
Best, C. T. 1995. A direct realist view of cross-language speech perception. In: Strange, W. (ed.), Speech perception and linguistic experience: Theoretical and methodological issues. Baltimore: York Press, 171–204.
Bybee, J. 2007. Frequency of Use and the Organization of Language. Oxford: Oxford University Press.
Bybee, J. 2010. Language, Usage and Cognition. Cambridge: Cambridge University Press.
Boersma, P. & Weenink, D. (2005). Praat: doing phonetics by computer (Version 5.3.71). Retrieved from http://www.praat.org.
Bigi, B. (2012). Sppas: A tool for the phonetic segmentations of speech. In Proceedings of LREC 2012, pp. 1748–1755.
De Cara, B., & Goswami, U. (2002). Similarity relations among spoken words: The special status of rimes in English. Behavior Research Methods, Instruments, & Computers, 34(3), 416–423.
R Development Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org
Gramacki, A., & Gramacki, J. (2015). FFT-Based Fast Computation of Multivariate Kernel Estimators with Unconstrained Bandwidth Matrices. arXiv preprint arXiv:1508.02766. http://arxiv.org/pdf/1508.02766.pdf
Wand, M. and B. Ripley (2015). Functions for Kernel Smoothing Supporting Wand & Jones (1995). R package version 2.23-15.
Wand, M. and M. Jones (1995). Kernel Smoothing. Chapman & Hall.

Some references
Cauvin, E. 2013. « Intonational phrasing as a potential indicator for establishing prosodic learner profiles ». In S. Granger, G. Gilquin & F. Meunier (eds) (2013) Twenty Years of Learner Corpus Research: Looking back, Moving ahead. Corpora and Language in Use - Proceedings 1, Louvain-la-Neuve: Presses universitaires de Louvain, 75-88.
Díaz Negrillo, A. 2007. A Fine-Grained Error Tagger for Learner Corpora. Ph.D. thesis, University of Jaén, Spain.
Díaz-Negrillo, A., Ballier, N., & Thompson, P. (Eds.). 2013. Automatic Treatment and Analysis of Learner Corpus Data (Vol. 59). John Benjamins Publishing Company.
Granger, Sylviane, et al. 2009. “The LONGDALE Project: Longitudinal Database of Learner English”. http://cecl.fltr.ucl.ac.be/LONGDALE.html
Gries, S. [2009] 2013. Statistics for Linguists: A Practical Introduction. Berlin, New York: Mouton de Gruyter.
Herment, S., Loukina, A., Tortel, A., Hirst, D., Bigi, B. 2013. AixOx, a multi-layered learners corpus: automatic annotation. In Díaz Pérez Javier & Díaz Negrillo Ana (eds.). Specialisation and variation in language corpora. Bern: Peter Lang.
Méli, A. 2010. Aspects of a longitudinal corpus-based study of French learners of English. A preliminary investigation. Unpublished Master 2 thesis, supervised by N. Ballier, Université Paris Diderot.
Méli, A. 2013. Phonological acquisition in the French-English interlanguage. Rising above the phoneme. In Díaz-Negrillo, A., N. Ballier and P. Thompson (eds.), Automatic Treatment and Analysis of Learner Corpus Data, Amsterdam: Benjamins, 207–226.
Milde, Jan-Torsten & Gut, Ulrike (2002). “A Prosodic Corpus of Non-native Speech”. Speech prosody 2002. (10/09) http://aune.lpl.univ-aix.fr/sp2002/pdf/milde-gut.pdf
Myssyk, A. 2011. Predicting and evaluating a speaker's level of English: a proposal for pronunciation criteria. Unpublished Master 2 thesis, supervised by N. Ballier, Université Paris Diderot.
Tortel, A. & Hirst, D. 2008. ANGLISH. (10/09) http://crdo.fr/crdo000731
PRAAT: praat.org
R: http://www.r-project.org/
SPPAS: http://aune.lpl.univ-aix.fr/~bigi/sppas/
WinPitch: winpitch.com


THANKS! adrien.meli@gmail.com nballier@free.fr

ALTERNATES

Some perspectives after Adrien Méli's PhD
- Correlation with usage frequency?
- Attractors (« lexical » magnets), modelling realisations for lexical sets (people)?
- « Templatic effects »: transfers of French syllable structures (CVC vs. CV)