Research Methods Workshop Introducing Corpus Linguistics Techniques (1): Making the Most of the VIEW
A reminder
Corpus Linguistics is a methodology, which tends to:
–involve the analysis of “actual” language use in natural texts (but the analysis of literary texts is also possible)
–utilise a large and principled collection of natural texts, known as a “corpus”, as the basis for analysis
–make extensive use of computers, utilising both automatic and interactive techniques
–depend on both quantitative and qualitative analytical techniques: “The goal of corpus-based investigations is not simply to report quantitative findings, but to explore the importance of these findings for learning about the patterns of language use”
(adapted from Biber et al 1998: 4-5)
Concordancing = An alphabetical listing of the words in a text, given together with the contexts in which they appear.
–The most common form of concordance today is the Keyword-in-Context (KWIC) index:
Figure 1: Concordance of poor in Tale of Two Cities, Book
       taste it is that such poor cattle always have in their mouths
948           of sparing the poor child the inheritance of any part of
778     small property of my poor father, whom I never saw--so long
1870    desolate, while your poor heart pined away, weep for it
1947            Miss, if the poor lady had suffered so intensely
1884          the love of my poor mother hid his torture from me
1615  stockings, and all his poor tatters of clothes, had, in a long
1577       faded away into a poor weak stain. So sunken and
1001      on your way to the poor wronged gentleman, and, with a
1036     detachment from the poor young lady, by laying a brawny
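A KWIC index like Figure 1 can be approximated in a few lines of Python. This is only a minimal sketch: the function names `kwic` and `format_kwic`, the `width` parameter and the sample sentence are all made up for illustration, and real concordancers do considerably more (tokenisation, line references, corpus metadata).

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, keyword, right) triples for every whole-word hit."""
    rows = []
    # \b restricts the match to whole words, so "poor" does not match "poorly".
    pattern = r"\b%s\b" % re.escape(keyword)
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        rows.append((left, m.group(0), right))
    # Sort alphabetically on the context to the right of the keyword,
    # which is how the lines in Figure 1 are ordered.
    rows.sort(key=lambda row: row[2].lower())
    return rows

def format_kwic(rows, width=30):
    """Line the keyword up in a single column."""
    return [f"{left:>{width}}{kw}{right}" for left, kw, right in rows]

# Invented sample text, not a quotation from the novel.
sample = ("the poor child was asleep. My poor father never knew. "
          "A poor weak stain remained. She was poorly that day.")

for line in format_kwic(kwic(sample, "poor", width=20), width=20):
    print(line)
```

Sorting on the right-hand context groups together the words that follow the keyword, which is what makes patterns such as "poor child", "poor father" easy to spot.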
What do concordancers let you do?
–Let you look at a word in context, see how common it is, and see the style associated with it.
–Let you compare your usage with that of others (very useful in EFL)
–Let you compare usage across different genres/registers (very useful in ESL)
–More advanced users can explore attitudes (the thought processes that lie behind the words)
The recall problem: Although concordancers allow you to specify search words, it’s worth remembering that …
–Some tools will only give you the results for what you said you were looking for, which may not be the same thing as what you thought you were looking for.
–You notice only what you get back; you will not notice what you did not find.
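The recall problem can be seen in a small Python sketch. It assumes a hypothetical helper, `count_hits`, in which a trailing `*` stands for "any word ending"; the sample text is invented and the wildcard syntax of any particular tool may differ.

```python
import re

def count_hits(text, query):
    """Count whole-word matches; a trailing * matches any word ending."""
    if query.endswith("*"):
        pattern = r"\b%s\w*" % re.escape(query[:-1])
    else:
        pattern = r"\b%s\b" % re.escape(query)
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Invented sample text.
text = ("Sales slumped in May. The slump deepened. "
        "Analysts say slumps are cyclical.")

print(count_hits(text, "slump"))    # exact word only
print(count_hits(text, "slump*"))   # slump, slumped, slumps
```

The exact search silently leaves out "slumped" and "slumps": nothing in the results tells you they were there to be found.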
Becoming familiar with VIEW = Variation In English Words and Phrases
You can find it at:
So what’s so good about VIEW?
–It allows you to quickly and easily search for a wide range of English words and phrases in the 100 million word BNC.
–BNC = British National Corpus; it represents modern English of the late 20th century
–As with some other BNC interfaces, you can search for words and phrases by:
  exact word or phrase
  wildcard or part of speech
  combinations of word/phrase and wildcard/part of speech
–Time permitting, we’re going to master the first two on the list.
Search: ‘corporation’
Clicking on the word brings up a concordance.
We can search the whole of the BNC, or just a small part of it (e.g. W_commerce), but remember to tick the “limit” box!
KWIC concordance of ‘corporation’ (in w_commerce)
Sorting our entries …
We can sort our entries according to ‘left’ and ‘right’ context by placing an * before or after the search word.
However, if we want to look at the same results, we have to pick the appropriate register in the left hand column, e.g. w_commerce
Results for … corporation * What strikes you about the results?
Results for … * corporation What strikes you about these results?
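One way to think about the two searches is as counts of the word immediately after (corporation *) or immediately before (* corporation) the search word. The Python sketch below works on that assumption; `neighbours` is a made-up helper and the token list is invented, so it illustrates the idea rather than how VIEW itself computes its results.

```python
from collections import Counter

def neighbours(tokens, node, side="right"):
    """Count the words immediately to the right or left of `node`."""
    offset = 1 if side == "right" else -1
    counts = Counter()
    for i, tok in enumerate(tokens):
        j = i + offset
        if tok.lower() == node and 0 <= j < len(tokens):
            counts[tokens[j].lower()] += 1
    return counts

# Invented sample, not BNC data.
tokens = ("the corporation tax rose while a public corporation "
          "announced that the corporation will merge").split()

print(neighbours(tokens, "corporation", "right").most_common())  # like: corporation *
print(neighbours(tokens, "corporation", "left").most_common())   # like: * corporation
```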
Group Task
How is ‘corporation’ used in tabloid newspapers?
Is it used in similar ways in W_commerce?
Let’s explore some other words … You choose!
Check out the “CHART” button … CHART is useful when you want to see the extent to which specific words are utilised in the different genres. …
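When comparing genres, raw counts can mislead because the registers differ in size. A common remedy is to normalise to a rate per million words. The sketch below shows the arithmetic only; whether CHART reports figures in this form is not stated here, and every number in `toy_counts` is invented for illustration.

```python
def per_million(hits, register_size):
    """Normalise a raw count to occurrences per million words."""
    return hits * 1_000_000 / register_size

# Invented counts and register sizes for illustration only;
# they are NOT real BNC figures.
toy_counts = {
    "spoken": (50, 10_000_000),
    "w_commerce": (300, 7_500_000),
}

for register, (hits, size) in toy_counts.items():
    print(f"{register:<12} {per_million(hits, size):6.1f} per million")
```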
Using VIEW to search for collocates
We use the ‘surrounding’ display … remembering to:
–Make sure we’re in the TABLE display
–Define the size of the window (the smaller the window, the closer the collocates will be to the search word)
–Set the ‘min freq’ (any number between 2 and 7)
–Tick the ‘limit’ box
–Choose the register
Using VIEW to search for collocates
–Search word = market
–Click on ‘surrounding’
–Register = W_commerce
–Tick limit box
What strikes you about these results?
Replicate the search on your computer, and then answer the following:
–Are there any collocates that are predictable in your view?
–Do any of the collocates of ‘market’ surprise you?
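The window and ‘min freq’ settings can be pictured with a short Python sketch. It is a simplified stand-in for a collocate search, not VIEW’s own method: `collocates`, its parameters and the token list are all invented for illustration.

```python
from collections import Counter

def collocates(tokens, node, left=4, right=4, min_freq=2):
    """Count words within `left`/`right` positions of each hit of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - left)
        # Words before the node plus words after it, excluding the node.
        window = tokens[start:i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    # Keep only words that reach the minimum frequency.
    return {w: c for w, c in counts.items() if c >= min_freq}

# Invented sample, not BNC data.
tokens = ("the stock market fell and the stock market rose "
          "while the labour market held").split()

print(collocates(tokens, "market", left=1, right=1, min_freq=2))
print(collocates(tokens, "market", left=2, right=2, min_freq=2))
```

Widening the window from 1/1 to 2/2 lets in "the", a grammatical word with little to say about ‘market’: a reminder that a smaller window keeps the collocates closer to the search word.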
We can use a similar process to search for antonyms and synonyms …
Comparing synonyms
–Set ‘Search String’ to worker/employee
–Set ‘Surrounding’ to on (5/5 window)
–Set ‘Register’ to W_commerce
–Set ‘Limit’ to on
–Set ‘Min freq.’ to 5
Comparing synonyms
Your chosen words
No. of times that word X appears near to chosen words
Group task: The collocates of worker/employee
What collocates with ‘worker’?
What collocates with ‘employee’?
Change your search so that you use the whole BNC (register 1 = -- IGNORE --)
–Have the collocates for ‘worker’ remained the same?
–Have the collocates for ‘employee’ remained the same?
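The side-by-side comparison of two near-synonyms can be sketched in Python as a small table: for each nearby word, how often it appears near the first word and how often near the second. The helpers `window_counts` and `compare` and the sample sentence are invented for illustration; this is an assumption about the shape of the comparison, not a description of VIEW’s internals.

```python
from collections import Counter

def window_counts(tokens, node, span=5):
    """Count words within `span` positions either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span])
    return counts

def compare(tokens, word_a, word_b, span=5, min_freq=1):
    """Map each collocate to (count near word_a, count near word_b)."""
    a = window_counts(tokens, word_a, span)
    b = window_counts(tokens, word_b, span)
    table = {}
    for w in set(a) | set(b):
        # A Counter returns 0 for words it has not seen.
        if a[w] + b[w] >= min_freq:
            table[w] = (a[w], b[w])
    return table

# Invented sample, not BNC data.
tokens = ("each skilled worker joined the union while "
          "each salaried employee joined the scheme").split()

for word, (near_a, near_b) in sorted(compare(tokens, "worker", "employee", span=1).items()):
    print(f"{word:<10} worker={near_a} employee={near_b}")
```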
Lexical priming and semantic prosody
Lexical priming: “Every word is primed for use in discourse as a result of the cumulative effects of an individual's encounters with the word ... Every word is primed to occur with particular words; these are its collocates.” (Hoey 2005)
Semantic prosody: … occurs when the habitual collocates of a word (or phrase) colour its meaning so it can no longer be seen in isolation from its semantic prosody.
Some questions to ponder …
How do we study 'semantic prosody'? What can it tell us? Where can we find it? How can we find it?
Searching for meaningful patterns
residual/core meaning
DENOTATION = literal meaning
COLLOCATION = patterns of words appearing together
COLLIGATION = collocation patterns based on syntactic groups rather than individual words
SEMANTIC ASSOCIATION:
–semantic PREFERENCE = tendency of a word to keep company with a semantic set or class; some members of this set or class will usually be collocates.
–semantic PROSODY = colouring of meaning (? Permanently ?)
textual meaning
Patterns contribute to the creation of a network of textual meanings; computers and human interpretation can be used in conjunction to identify (and make sense of) these patterns...
Group Task
Do a search for the following:
–“slump”, “slumped”, “slumps”, “jinxed”, “shortfall”, “demand”
–How are they used in context and are they always negative?
–Are the meanings of any of these terms “coloured” (i.e. can they no longer be seen in isolation from their semantic prosody)?
Now let’s explore parts of speech
What do you think the most common noun in English is?
–Write down your answers on a piece of paper
–Now do the following search to find out whether your “hunch” was correct: [nn*]
The most frequent nouns in the BNC We search for nouns by including [n*] here … What strikes you about the results?
Most frequent nouns in spoken section of BNC ( = 10 million words) Notice that TIME is now the second most frequent noun … but there are a lot of other nouns relating to periods of time … Indeed - YEAR, DAY, YEARS, WEEK, NIGHT, MORNING – are all in the top 25! Question: How much does this result suggest we are preoccupied with time in Britain?
Other parts of speech worthy of exploration [vv*] [v*] [aj*] [av*]
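Searches such as [nn*] or [v*] can be thought of as wildcard matches against each word’s part-of-speech tag. The Python sketch below assumes a list of (word, tag) pairs; the helper `most_frequent`, the sample data and the specific tags shown are illustrative assumptions, so check the tagset documentation for the actual labels.

```python
from collections import Counter
from fnmatch import fnmatchcase

def most_frequent(tagged, tag_pattern, n=3):
    """Rank words whose tag matches a pattern such as "[nn*]"."""
    # Strip the square brackets, which fnmatch would read as a character class.
    pattern = tag_pattern.strip("[]").lower()
    counts = Counter(word.lower() for word, tag in tagged
                     if fnmatchcase(tag.lower(), pattern))
    return counts.most_common(n)

# Invented sample; tags are illustrative.
tagged = [("time", "NN1"), ("time", "NN1"), ("time", "NN1"),
          ("year", "NN1"), ("year", "NN1"), ("people", "NN0"),
          ("think", "VVB"), ("was", "VBD"), ("good", "AJ0"),
          ("London", "NP0")]

print(most_frequent(tagged, "[nn*]", n=2))
print(most_frequent(tagged, "[n*]", n=10))
print(most_frequent(tagged, "[v*]", n=10))
```

Note that the narrower pattern [nn*] leaves out "London" while [n*] includes it, and [vv*] would leave out "was" while [v*] includes it, so the choice of pattern changes which words are counted.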
CL: Best Practice
–We need to balance a quantitative approach with a qualitative approach
–We need to know our data, or be prepared to become very familiar with it!
–We need to be prepared to engage with theory
References
Biber, D., Conrad, S. and Reppen, R. (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Barnbrook (1996). Language and computers. Edinburgh: Edinburgh University Press.
Hoey, M. (2005). Lexical priming: a new theory of words and language. London: Routledge.
Nelson, Mike. ‘Computers and Semantic Prosody’. Online paper, available at
Sinclair, J. (2004). Trust the text. London: Routledge.
Stubbs, M. (1996). Text and corpus analysis. Oxford: Blackwell.