Corpus Linguistics Lexicography. Questions for lexicography in corpus linguistics How common are different words? How common are the different senses.


1 Corpus Linguistics Lexicography

2 Questions for lexicography in corpus linguistics: How common are different words? How common are the different senses of a given word? Do words have systematic associations with other words? Do words have systematic associations with particular registers or dialects?

3 Six major research questions in lexicography
1. What are the meanings associated with a particular word?
2. What is the frequency of a word relative to other related words?
3. What non-linguistic association patterns does a particular word have (e.g., to registers, historical periods, or dialects)?
4. What words commonly co-occur with a particular word, and what is the distribution of these "collocational" sequences across registers?
5. How are the senses and uses of a word distributed?
6. How are seemingly synonymous words used and distributed in different ways?

4 Meaning of Words: A KWIC (key word in context) concordance can usually reveal the different meanings of a word (CL: 27, Figs. 2.1 and 2.2). Applications: the meanings of a particular word in a textbook, across learner groups, or in a register.
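A KWIC display can be sketched in a few lines of Python. This is only an illustration, not the tool used for the figures in CL: the function name `kwic`, the `width` parameter, and the sample sentences are all invented here.

```python
import re

def kwic(text, keyword, width=30):
    """Return one concordance line per occurrence of keyword.

    Each line shows `width` characters of left context, the keyword
    in brackets, and up to `width` characters of right context.
    """
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

# Invented sample text, for illustration only.
sample = ("They made a deal with the supplier. "
          "It took a great deal of time. "
          "Big deal, she said.")
concordance = kwic(sample, "deal", width=20)
for line in concordance:
    print(line)
```

Reading down the aligned column makes it easier to see that the three occurrences of "deal" carry different senses.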

5

6

7 Frequency of Words: A frequency list can serve this purpose. Issues in building a list: how to treat the various forms of a word, the basis for comparison, and the multiple grammatical functions of a word. Forms of a word: in a list of all words, the lemma is more useful than the raw form of a word (Figs. 2.3 & 2.4, CL: 28, 29).
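The difference between a raw-form list and a lemma list can be sketched as follows. The lemma table `LEMMA_OF` is a tiny hand-made stand-in (an assumption for this example); a real study would rely on a lemmatizer or a tagged corpus.

```python
from collections import Counter

# Hypothetical, hand-made lemma table covering only the forms in the sample.
LEMMA_OF = {"deals": "deal", "dealt": "deal", "dealing": "deal"}

def frequency_lists(tokens):
    """Return (raw-form counts, lemma counts) for a list of tokens."""
    raw = Counter(t.lower() for t in tokens)
    lemma = Counter()
    for form, n in raw.items():
        # Forms missing from the table are treated as their own lemma.
        lemma[LEMMA_OF.get(form, form)] += n
    return raw, lemma

# Invented sample, for illustration only.
tokens = "She deals cards . He dealt with it . A deal is a deal".split()
raw, lemma = frequency_lists(tokens)
print(raw["deal"], lemma["deal"])
```

In the raw list the occurrences are scattered over "deal", "deals", and "dealt"; the lemma list gathers them under a single entry.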

8 Frequency of Words: Word frequency is expressed as a percentage or normed to a basis of one million words. This shows how common a word is and helps with the problems caused by a small corpus. A tagged corpus shows the distribution of the grammatical forms of a word (Fig. 2.5, CL: 32). Applications: the distribution of a word in a curriculum, in a test, in a book, or in a register.
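Norming to a basis of one million words is a single multiplication and division. The sketch below uses invented counts and corpus sizes, and the function name `per_million` is an assumption for this example.

```python
def per_million(raw_count, corpus_size):
    """Convert a raw count into a frequency per one million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count * 1_000_000 / corpus_size

# Hypothetical figures: the same word in a small and a large corpus.
small = per_million(50, 200_000)       # 50 hits in 200,000 words
large = per_million(120, 2_000_000)    # 120 hits in 2,000,000 words
print(small, large)
```

Although the larger corpus has more raw hits, the normed rate is higher in the smaller one, which is why raw counts from corpora of different sizes should not be compared directly.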

9 Distribution across registers. Overall characterizations of a word can be misleading, because words are often used in very different ways in different registers. Table 2.1 shows the frequency by register for DEAL as a noun (Table 2.1, CL: 32). Table 2.2 shows the frequency of DEAL as a noun and as a verb in two registers (Table 2.2, CL: 34). Raw count: the actual number of occurrences of the word.

10 Distribution of senses across registers: One way to begin investigating the senses of words is to look at their collocates. Identifying the most common collocates of a word provides an efficient and effective means to begin analyzing senses. Table 2.3: Common collocates of DEAL as a noun (CL: 37).
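Counting collocates can be sketched as a window count around each occurrence of the node word. The function `collocates`, its `span` parameter, and the sample tokens are invented for this illustration; it counts plain co-occurrence and applies no association measure.

```python
from collections import Counter

def collocates(tokens, node, span=1):
    """Count words occurring within `span` tokens of each occurrence of node."""
    counts = Counter()
    node = node.lower()
    lowered = [t.lower() for t in tokens]
    for i, tok in enumerate(lowered):
        if tok == node:
            # Words to the left and right of the node, excluding the node itself.
            window = lowered[max(0, i - span):i] + lowered[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Invented sample, for illustration only.
tokens = "a great deal of time and a great deal of money but no big deal".split()
neighbours = collocates(tokens, "deal", span=1)
print(neighbours.most_common())
```

Even in this toy sample, "great ... of" points to the amount sense, while "big" points to a different use.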

11 Distribution of senses across registers: The sense referring to an amount is the most common use of DEAL as a noun in both academic prose and fiction; other uses are relatively common in fiction. Table 2.4: Common dictionary definitions of DEAL as a noun.

12 Distribution of senses across registers: Findings
1. DEAL in the sense of amount is not covered until very late in some dictionaries.
2. The use of "big deal" to signal unimportance is not covered by these dictionaries.
3. Register differences are disregarded by all these dictionaries.

13 Synonymous words: Table 2.5: Frequency of big, large and great. There is great variation between the two registers. Big is over ten times more common in fiction than in academic prose. Great is over one and a half times more common in fiction.

14 Synonymous words: Large is three times more common in academic prose. Big: physical size, the most common sense in both registers. Large: quantity or amount in academic prose, but physical size in fiction. Great: used in fixed chunks in academic prose (great deal, great number); in fiction it is used in a much wider range of senses, e.g. great man.
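A statement such as "ten times more common in fiction" compares rates, not raw counts. A minimal sketch of that comparison follows; the counts and corpus sizes are invented and are not the values in Table 2.5, and `register_ratio` is a name made up for this example.

```python
def register_ratio(count_a, size_a, count_b, size_b):
    """Return how many times more frequent a word is in register A than in B.

    Equivalent to (count_a / size_a) / (count_b / size_b), so the two
    sub-corpora do not need to be the same size.
    """
    if count_b == 0 or size_a == 0:
        raise ValueError("count_b and size_a must be non-zero")
    return (count_a * size_b) / (count_b * size_a)

# Hypothetical figures: 500 hits in 1,000,000 words of fiction,
# 100 hits in 2,000,000 words of academic prose.
ratio = register_ratio(500, 1_000_000, 100, 2_000_000)
print(ratio)
```

With these made-up numbers the raw counts differ by a factor of five, but the rates differ by a factor of ten once corpus size is taken into account.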

15 Distribution across registers. Normed count: a count converted to a fixed basis. In Table 2.2 the totals for DEAL as a noun and as a verb differ only slightly, but the difference changes considerably when register is taken into account. Find the possible causes. Applications: the distribution of different forms of words in the college English curriculum and in students' writing.

