Measures of Variation Among English and American Dialects Robert Shackleton U.S. Congressional Budget Office.



Goals
• Compare speech variants used by English and American speakers, using easily accessible data
• Use several different quantitative methods to assess variation among speakers
• Compare different quantitative methods
• Use results to gain some insight into English origins of American speech variants

Data
• Nearly all data from Kurath & McDavid’s Pronunciation of English in the Atlantic States; some from Kurath’s Dialect Structure of Southern England
• All or nearly all data collected by Guy Lowman
• 82 phonemes classified into 285 variants by Kurath and McDavid

Data
• Four regions
– Southern England (59 informants); settled <700
– Southeastern Massachusetts (22 informants); settled <1650
– S.E. Virginia / N.E. North Carolina (31 informants); settled ~<1690
– S.W. Virginia / S. West Virginia (19 informants); settled ~
• Informants largely older, rural, from long-settled families
• In some cases, more than one variant per informant
• Some missing data
• Some data arbitrarily attributed to one of two or three possible informants in a given locality

Methods
• Shared variants: based on proportion of variants shared between two speakers
• Genetic distance: based on relative frequencies of variants, treating variants of a given phoneme as analogous to alleles of a given gene
• Linguistic distance: measured as a Euclidean distance between variants in an idealized geometric grid (e.g. ² and e are closer to each other than i and Þ)
• Each measure involves arbitrary assumptions
– Choice of phonemes to include
– Classification of responses into variants
– Quantification of distances among variants
• Important difference: the first two approaches assume that variants are discrete; the linguistic approach does not
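The shared-variants measure can be sketched in a few lines. The slides do not say how the proportion is computed when a speaker has several variants or missing data, so this sketch makes assumptions: phonemes missing for either speaker are skipped, and a phoneme counts as shared if the two speakers have at least one variant in common. The phoneme and variant names are hypothetical.

```python
def shared_variants(a, b):
    """Proportion of comparable phonemes on which two speakers share a variant.

    a, b: dicts mapping a phoneme label to the set of variants recorded
    for that speaker (assumed representation, not from the slides).
    """
    # Only phonemes recorded for both speakers are comparable.
    common = [p for p in a if p in b and a[p] and b[p]]
    if not common:
        raise ValueError("speakers have no phonemes in common")
    # A phoneme counts as shared if the variant sets overlap.
    shared = sum(1 for p in common if a[p] & b[p])
    return shared / len(common)


# Hypothetical data: four phonemes per speaker, three of them comparable.
speaker_a = {"p1": {"v1"}, "p2": {"v3"}, "p3": {"v5", "v6"}, "p4": {"v7"}}
speaker_b = {"p1": {"v1"}, "p2": {"v4"}, "p3": {"v6"}, "p5": {"v9"}}
print(shared_variants(speaker_a, speaker_b))
```

Here two of the three comparable phonemes overlap, so the proportion is two thirds.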

Genetic Approach
• Nei's genetic distance D measures how closely related populations of pronunciation patterns are if:
– Change is always to a completely new variant
– All phonemes have the same rate of change
– Population sizes remain constant over time
• Occurrence of variant = 1; absence = 0
– Occasionally, frequency of variant in a set of similar words (0 < x < 1)
– In some cases, more than one variant per speaker
• Each informant represented by a vector of 285 numbers, each between 0 and 1
• In this sample:
– D ranges from 0.00 to 1.70
– 50% shared pronunciations => D = 0.7
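The slides do not write out the formula. The sketch below assumes Nei's standard genetic distance, D = -ln(I), where I is the normalized identity between the two frequency vectors. The example vectors are hypothetical: four phonemes with two variants each, and two speakers who agree on half of them.

```python
import math


def nei_distance(x, y):
    """Nei's standard genetic distance between two variant-frequency vectors."""
    jxy = sum(a * b for a, b in zip(x, y))
    jx = sum(a * a for a in x)
    jy = sum(b * b for b in y)
    if jx == 0 or jy == 0:
        raise ValueError("a vector with no recorded variants has no distance")
    identity = jxy / math.sqrt(jx * jy)
    if identity == 0:
        return math.inf  # no variants in common at all
    return -math.log(identity)


# Hypothetical speakers: same variant on phonemes 1-2, different on 3-4.
x = [1, 0, 1, 0, 1, 0, 1, 0]
y = [1, 0, 1, 0, 0, 1, 0, 1]
print(round(nei_distance(x, y), 3))  # 0.693
```

With one variant per phoneme the identity reduces to the share of matching phonemes, so 50% agreement gives -ln(0.5), about 0.69, in line with the D = 0.7 figure on the slide.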

Linguistic Approach
• Variants are characterized by a set of numbers representing degrees of height, backing, rounding, rhoticity, length

Linguistic Approach
• Difference between variants measured as Euclidean distance
• Distance between two speakers LD measured as the average Euclidean distance between their variants
• Could also measure the dispersion of distances, etc.
• In this sample:
– LD ranges from 0.00 to 1.68
– 50% shared variants => LD = 0.70 to 1.16
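A minimal sketch of LD as described above, assuming each variant is coded as a tuple (height, backing, rounding, rhoticity, length) and that each speaker has one variant per phoneme. The feature values below are hypothetical and are not the coding used in the study.

```python
import math


def linguistic_distance(a, b):
    """Average Euclidean distance between two speakers' variants.

    a, b: dicts mapping a phoneme label to a feature tuple
    (height, backing, rounding, rhoticity, length).
    """
    common = [p for p in a if p in b]
    if not common:
        raise ValueError("speakers have no phonemes in common")
    return sum(math.dist(a[p], b[p]) for p in common) / len(common)


# Hypothetical feature codings for two phonemes.
speaker_a = {"p1": (1.0, 0.0, 0.0, 0.0, 0.0), "p2": (0.5, 1.0, 1.0, 0.0, 0.0)}
speaker_b = {"p1": (0.0, 0.0, 0.0, 0.0, 0.0), "p2": (0.5, 1.0, 1.0, 0.0, 0.0)}
print(linguistic_distance(speaker_a, speaker_b))
```

Because the distance depends on how far apart the differing variants are, two pairs of speakers with the same share of matching variants can have different LD values, which is why the slide gives a range (0.70 to 1.16) rather than a single figure.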

Cluster Analysis
• Methods of grouping informants on the basis of similarity of their speech patterns
• Many different approaches
– Different measures of similarity: Pearson correlations, Euclidean distances, cosines, genetic or linguistic distances
– Different methods of grouping similar observations into clusters: single, average, and complete linkages; various algorithms for estimating phylogenetic relationships
• Results highly dependent on approach
– English speakers tend to group into five regions (East Midlands, East Anglia, Southeast, Southwest, Devonshire)
– North American regions tend to be distinct, and to cluster most closely with Southeast England
– EVNC and SWVA consistently cluster together
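As one illustration of the linkage methods listed above, here is a minimal average-linkage sketch in plain Python. It is not the procedure used in the study. The informant labels and the distance matrix are hypothetical, and any of the distance measures could supply the matrix.

```python
def average_linkage(dist, labels, n_clusters):
    """Agglomerative clustering with average linkage.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Merges the two closest clusters until n_clusters remain.
    """
    clusters = [[i] for i in range(len(labels))]

    def between(c1, c2):
        # Average of all pairwise distances across the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        _, i, j = min(
            (between(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(labels[k] for k in c) for c in clusters]


# Hypothetical distances among four informants.
labels = ["inf1", "inf2", "inf3", "inf4"]
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
print(average_linkage(dist, labels, 2))
```

Replacing the average in `between` with `min` or `max` would give single or complete linkage, which is one way the choice of method changes the result.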

Results
• Distance measures are generally correlated
• Nei’s distance and shared variants are very similar, despite nonlinearity
• Linguistic distance is least similar: it contains different information about similarity of speech forms

Shared Variants
[Figure; region labels: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Nei’s Genetic Distance
[Figure; region labels: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Linguistic Distance
[Figure; region labels: East Midlands, East Anglia, Southeast, Southwest, Devonshire, Massachusetts, EVNC, SWVA]

Distribution of Variants
• Some variants are widespread; others not
– 12% appear in all 8 regions
– 29% appear in at least 7 regions
– 42% appear in at least 6 regions
– 59% appear in at least 5 regions
• Even within regions, lots of variation
– Informants in a given region typically share 60% to 75% of variants, but range is 33% to 90%
– Degree of variation reflected in genetic and linguistic distance measures
• Southern England
– More diversity than in North America
– 91% of variants found somewhere
– 23% found in every region
– 20% found only in southern England
– Shared variants between English informants: 22% to 83%
– Shared variants between English and American informants: 18% to 63%
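Tallies like those above can be produced by counting, for each variant, the number of regions in which it is attested. The sketch below is a hypothetical illustration, and the region and variant names are made up. It reports cumulative shares ("in at least k regions"), which is an assumption about how the percentages are defined.

```python
from collections import Counter


def region_counts(variants_by_region):
    """Map each variant to the number of regions in which it appears."""
    counts = Counter()
    for variants in variants_by_region.values():
        counts.update(set(variants))
    return counts


def share_in_at_least(counts, k):
    """Share of all attested variants that appear in k or more regions."""
    return sum(1 for n in counts.values() if n >= k) / len(counts)


# Hypothetical data: three regions, four variants.
variants_by_region = {
    "R1": {"a", "b", "c"},
    "R2": {"a", "b"},
    "R3": {"a", "d"},
}
counts = region_counts(variants_by_region)
print(share_in_at_least(counts, 3), share_in_at_least(counts, 2))
```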

Distribution of Variants
• North American regions
– Less diversity than in England: 22% of southern English variants absent
– 80% of variants found somewhere
– 37% found in every region
– 9% found only in North America (12% of North American variants)
– Nearly half of American “innovations” shared across all N. American regions
– Many “innovations” are known to have existed in southern England, but were not recorded
• North American distribution of southern English variants
– Slightly greater frequency of eastern (esp. southeastern) English variants in American regions
– Of 41 variants found in eastern but not in western England, 14 (34%) appear in Massachusetts and in the South, 7 (17%) in Massachusetts but not in the South, 13 (32%) in the South but not in Massachusetts
– Of 33 variants found in western but not in eastern England, 5 (15%) appear in Massachusetts and in the South, 2 (6%) in Massachusetts but not in the South, 11 (33%) in the South but not in Massachusetts

Distribution of Variants
• Massachusetts
– Both more and fewer shared variants with English informants than the South
– On average, more shared variants with the South than with English informants
– By all measures, MA informants show somewhat greater affinity with eastern English
• The South
– EVNC and SWVA comparatively homogeneous and similar
– Similar intra- and interregional variation
– Similar variation with MA and England
– Slightly greater affinity with western English than MA
– Southern American informants have greatest number of shared variants with Devonshire informants, but lowest linguistic distance with southeastern English informants
• Can illustrate differences using average values, or values for “typical” informants who have the greatest average number of shared variants or lowest average distance with all other speakers in region
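Picking a “typical” informant by lowest average distance amounts to finding the medoid of a region. A minimal sketch, with hypothetical informant labels and a made-up distance matrix:

```python
def typical_informant(dist, labels):
    """Return the informant with the lowest mean distance to all others.

    dist: symmetric matrix (list of lists) of pairwise distances
    among the informants of one region.
    """
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two informants")

    def mean_distance(i):
        return sum(dist[i][j] for j in range(n) if j != i) / (n - 1)

    return labels[min(range(n), key=mean_distance)]


# Hypothetical distances among three informants in one region.
labels = ["inf1", "inf2", "inf3"]
dist = [
    [0.0, 0.2, 0.6],
    [0.2, 0.0, 0.3],
    [0.6, 0.3, 0.0],
]
print(typical_informant(dist, labels))
```

For the shared-variants measure the same idea applies with `max` in place of `min`, since a higher share means greater similarity.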

Regional Comparison: Averages

Regional Comparison: “Typical” Informants

Conclusions
• Different measures yield somewhat different, complementary insights into linguistic variation
• By all measures, extensive variation in and among regions
• Patterns of variation, increasing with population and age of settlement, are reminiscent of the species-area relationship
• American settlement resulted in lower variation in American regions, leveling, and somewhat different populations of variants in different regions
• Slightly dominant influence from the metropolitan area
• Greater eastern influence in the north, western influence in the south
• Relatively little innovation
• Leveling process analogous to loss of species during reduction in habitat
• Results are largely consistent with the historical record of early English immigration to North America (except for absence of East Anglian influence in Massachusetts)