CORPUS APPROACHES TO LANGUAGE STUDIES FL, AWL


1 CORPUS APPROACHES TO LANGUAGE STUDIES FL, AWL
ENG 626
Bambang Kaswanti Purwo

2 Word Levels
• A 10-year-old native speaker of English has a vocabulary of around 10,000 word families.
• A word family is the base form of a word plus its closely related inflected and derived forms.
• For example, the word family for absent: absent, absented, absenting, absents, absentee, absentees, absenteeism, absently

3 The vocabulary size of native speakers
• Rough estimate: add 1,000 word families for each year of life, up to the age of about 20.
• A native speaker of English who is a university graduate therefore probably knows at least 20,000 words.
• Goals for a learner of English as a second language: 20,000 words is very ambitious.
• Instead, split up the vocabulary learners need into four levels: high-frequency, low-frequency, academic, and technical.
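The rule of thumb above can be written as a small calculation. This is only a sketch: the function name estimate_word_families, its parameters, and the hard cut-off at exactly age 20 are illustrative assumptions, not something the slides define.

```python
def estimate_word_families(age, per_year=1000, cap_age=20):
    """Rough estimate of a native speaker's vocabulary, in word families.

    Assumes 1,000 word families per year of life, levelling off at about age 20.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    return per_year * min(age, cap_age)


print(estimate_word_families(10))  # 10000
print(estimate_word_families(35))  # 20000
```

The first call matches the figure given for a 10-year-old; the second shows the estimate levelling off for adults.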

4 Word Frequency
• Frequency of a word: how often it occurs in a text.
• The word most frequently used in written English is “the”.
• It has a frequency of around seven in every 100 words of text, so “the” occurs in almost every line of a written text.
• When Paul Nation started studying vocabulary teaching, he counted a 1,000-word text word by word to see how often each word occurred.
  ▪ manually: a whole weekend
  ▪ now with computers: less than a second
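Counting how often each word occurs, the task that once took a whole weekend, might look like this on a computer. It is a minimal sketch: the tokenizer is deliberately simple (lowercased runs of letters, with internal apostrophes), and both the function name word_frequencies and the sample sentence are invented for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # Simplified tokenizer (an assumption): runs of letters, allowing internal apostrophes.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


sample = "The cat sat on the mat. The mat was flat."
freq = word_frequencies(sample)

# Print the words from most to least frequent.
for word, count in freq.most_common():
    print(word, count)
```

A real study would need decisions the sketch skips, such as how to treat hyphenated words, numbers, and members of the same word family.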

5 Word Frequency
[original text: 1,906 words long, 532 different word types]
Most frequent words      End of the list
the          100         wide        1
of            74         will
to            58         without
and           56         work
words         46         working
a             41         write
in            39         yet
vocabulary    38         you
is            30         young
are           25         yourself
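The counts in this list already show how much of the text a handful of words account for. The sketch below only does arithmetic on the figures given for the ten most frequent words; the variable names are illustrative.

```python
TOTAL_TOKENS = 1906  # length of the original text, as stated on the slide

# The ten most frequent words and their counts, copied from the list.
top_ten = {
    "the": 100, "of": 74, "to": 58, "and": 56, "words": 46,
    "a": 41, "in": 39, "vocabulary": 38, "is": 30, "are": 25,
}

top_ten_tokens = sum(top_ten.values())
coverage = top_ten_tokens / TOTAL_TOKENS
the_share = top_ten["the"] / TOTAL_TOKENS

print(top_ten_tokens)       # 507
print(f"{coverage:.1%}")    # 26.6%
print(f"{the_share:.1%}")   # 5.2%
```

In this particular text, ten of the 532 word types make up roughly a quarter of all running words.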

6 A small number of words cover a lot of the text.
• “Running words” or “tokens”: all the words in a text, including repeated words.
• The sentence above has 11 running words; “a” and “of” each occur twice.
High- and Low-Frequency Words
• A relatively small group of words (around 2,000) is used much more frequently than the other words in the language.
• The 2,000 high-frequency words include both function words and content words.
  ▪ Function words: articles (a, the), conjunctions (because, but, although, and), prepositions (in, below, above), determiners (each, every, this, those), numbers
  ▪ Content words: nouns, verbs, adjectives, and adverbs
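The token count for the sentence that opens this slide can be checked in a few lines. This is a sketch with a naive tokenizer (lowercase, strip the final full stop, split on spaces), which is enough for this one sentence but not for real text.

```python
sentence = "A small number of words cover a lot of the text."

# Lowercase and drop the final full stop so that "A" and "a" count as the same word.
tokens = sentence.lower().rstrip(".").split()
types = set(tokens)
repeated = sorted(w for w in types if tokens.count(w) > 1)

print(len(tokens))  # 11 running words (tokens)
print(len(types))   # 9 different words (types)
print(repeated)     # ['a', 'of']
```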

7 General Service List of English Words (GSL) by Michael West
• 2,000 word families
• lots of useful information about frequency and meanings
• it has been proven to work in graded readers
  ▪ graded readers: books specially written in a limited vocabulary so that they are easy to read for learners of English (e.g. some books may use 300 words or fewer)
• the rest of the vocabulary is made up of low-frequency words
• most conservative estimate: 120,000 low-frequency English words (not including proper names)
• low-frequency words are always a problem for language learners and teachers (it is unpredictable when they will occur in a text)
• teachers need to deal with low- and high-frequency words differently

8 Academic and Technical Words
• academic vocabulary: an additional high-frequency word list known as the Academic Word List (AWL)
• to be learned after students acquire the 2,000 high-frequency words
• AWL (developed by Averil Coxhead): 570 word families (not in the most frequent 2,000 words); for anyone doing academic study in almost any subject area
• technical vocabulary: the vocabulary of particular subject areas, e.g. in computing: mouse, pixel, ROM, and retrieve

9 Vocabulary levels: number of words and text coverage
Vocabulary Level    Number of Words    Text Coverage
high-frequency      2,000              70%
academic            570                5%
technical           1,000              20%
low-frequency       6,000              (not given)
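One way to read the table is cumulatively: how much text a learner can cover after adding each level in turn. The sketch below uses only the figures in the table. The coverage for low-frequency words is not stated on the slide, so it is left as None rather than guessed, and the order in which levels are added is an assumption.

```python
# (level, number of words, text coverage in percent), as given in the table.
# Low-frequency coverage is not stated on the slide, so it stays None.
levels = [
    ("high-frequency", 2000, 70),
    ("academic", 570, 5),
    ("technical", 1000, 20),
    ("low-frequency", 6000, None),
]

cumulative_words = 0
cumulative_coverage = 0
for name, n_words, coverage in levels:
    cumulative_words += n_words
    if coverage is None:
        print(f"{name:15} {cumulative_words:>6}  coverage not given")
        continue
    cumulative_coverage += coverage
    print(f"{name:15} {cumulative_words:>6}  {cumulative_coverage}%")
```

On these figures, the first three levels together amount to 3,570 words and 95% of the text.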

10 Academic Word List (AWL)
Averil Coxhead (1998). An Academic Word List. English Language Institute Occasional Publication No. 18.
• developed at the School of Linguistics and Applied Language Studies at Victoria University of Wellington, NZ
• a list of 570 word families, excluding words in the most frequent 2,000 words of English
• to be used by learners at tertiary level study
• the headwords are the stem forms of the words
• the headwords of the AWL are listed on pp. 7–11
• the word families of the AWL are listed in sublists 1–10
• the word family analyse, for example, includes:
  ▪ the regular inflections of the verb: analysed, analysing, analyses
  ▪ the derivations of the word: analysis, analyst, analysts, analytical, analytically, etc.
  ▪ the American spelling: analyze, analyzed, analyzes, analyzing
• the most frequently used member of the family is in italics, e.g. analysis is the most common form in the word family analyse
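In corpus work, a word-family list is typically used to map any form found in a text back to its headword. The sketch below does this for the single family shown on the slide. The dictionary layout and the function name headword are illustrative assumptions, not the format of the published AWL, and the members hidden behind "etc." are not filled in.

```python
# One AWL word family, with members copied from the slide ("etc." is not expanded).
WORD_FAMILIES = {
    "analyse": [
        "analysed", "analysing", "analyses",              # regular inflections
        "analysis", "analyst", "analysts",                # derivations
        "analytical", "analytically",
        "analyze", "analyzed", "analyzes", "analyzing",   # American spelling
    ],
}

# Reverse index: every member, and the headword itself, points to its headword.
HEADWORD_OF = {}
for head, members in WORD_FAMILIES.items():
    HEADWORD_OF[head] = head
    for member in members:
        HEADWORD_OF[member] = head


def headword(word):
    """Return the headword for a word, or None if it is not in the table."""
    return HEADWORD_OF.get(word.lower())


print(headword("Analytically"))  # analyse
print(headword("analyzing"))     # analyse
print(headword("corpus"))        # None
```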

11 Academic Corpus (AC)
• the word families of the AWL were selected from the words in the Academic Corpus (AC), approx. 3,500,000 words
• the AC is a written corpus of academic English: journal articles, book chapters, course workbooks, laboratory manuals, and course notes
• four faculty sections:
  ▪ Arts
  ▪ Commerce
  ▪ Law
  ▪ Science
• each faculty section: approx. 875,000 running words
• each faculty section is divided into seven subject areas of approx. 125,000 running words each
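The corpus figures are consistent with one another, which a quick calculation confirms. All numbers come from the slide and are approximate; the variable names are illustrative.

```python
FACULTIES = ["Arts", "Commerce", "Law", "Science"]
WORDS_PER_FACULTY = 875_000       # approximate, from the slide
SUBJECT_AREAS_PER_FACULTY = 7

total_words = len(FACULTIES) * WORDS_PER_FACULTY
words_per_subject_area = WORDS_PER_FACULTY // SUBJECT_AREAS_PER_FACULTY
total_subject_areas = len(FACULTIES) * SUBJECT_AREAS_PER_FACULTY

print(total_words)             # 3500000
print(words_per_subject_area)  # 125000
print(total_subject_areas)     # 28
```

Four sections of equal size, each split into seven equal subject areas, give a corpus balanced across 28 subject areas.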
