1 Corpora in grammatical studies
Corpus Linguistics
Richard Xiao
2 Aims of this session
Lecture
  Corpus-based grammar: scope and principles
  The state of the art of using corpora in grammatical studies
  Using corpora to improve grammatical descriptions: infinitival complementation of help
Lab session
  Position of if-clauses in ICE-GB
3 Corpus revolution
Like lexicographic and lexical studies, grammar is another area which has frequently exploited corpus data
A balanced representative corpus provides a reliable basis for quantifying grammatical categories and syntactic features
It is also useful in testing hypotheses derived from grammatical theory
There has been increasing consensus that non-corpus-based grammars can contain biases while corpora can help to improve grammatical descriptions (McEnery & Xiao 2005)
Corpora have had a strong influence on recently published reference grammar books (at least for English)
‘even people who have never heard of a corpus are using the product of corpus-based investigation’ (Hunston 2002: 96)
4 Principles of corpus grammar (Leech 2000)
Data-oriented grammar
  allowing the combination of a quantitative and a qualitative description of the data
  a grammar accountable to observed data of attested language use
Functional Grammar
  establishing a relation between phenomena that are external to the language system and system-internal phenomena (form vs. meaning)
  explaining grammar in terms of the wider context of human psychology and behaviour
Variety Grammar
  allowing the description of the full range of varieties (e.g. conversation, fiction writing, news writing, academic writing)
Integrative Grammar
  allowing an integrated description of syntactic, lexical, and discourse features
  close to communicative grammar as opposed to the ‘autonomous syntax’ view of grammar
5 A new milestone in English grammar
Longman Grammar of Spoken and Written English (i.e. LGSWE, Biber et al 1999)
A new milestone following Quirk et al (1985) Comprehensive Grammar
Based entirely on the 40-million-word Longman Spoken and Written English Corpus
Giving “a thorough description of English grammar, which is illustrated throughout with real corpus examples, and which gives equal attention to the ways speakers and writers actually use these linguistic resources” (Biber et al 1999: 45)
6 Features of corpus-based grammars
Paying attention to the differences in speech and writing
Taking account of register/genre variations
Providing frequency information
Treating lexis as an integral part of grammatical description
Giving authentic examples
7 Some examples of corpus grammars
Corpus-based English grammars focusing on speech
  Carter, R. and McCarthy, M. (1997) Exploring Spoken English. Cambridge: Cambridge University Press.
  McCarthy, M. (1998) Spoken Language and Applied Linguistics. Cambridge: Cambridge University Press.
8 Some examples of corpus grammars
Corpus-based grammars with a focus on lexis
  Francis, G., Hunston, S. and Manning, E. (1996) Collins COBUILD Grammar Patterns 1: Verbs. London: HarperCollins.
  Francis, G., Hunston, S. and Manning, E. (1998) Collins COBUILD Grammar Patterns 2: Nouns and Adjectives. London: HarperCollins.
  Hunston, S. and Francis, G. Pattern Grammar. Amsterdam: John Benjamins.
9 Some examples of corpus grammars
Corpus-based grammar taking account of register variation
  Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999) Longman Grammar of Spoken and Written English. London: Longman.
10 A case study
Using corpora to improve grammatical descriptions
Infinitival complementation of HELP
11 A commonly used word
In the 100-million-word BNC
  245th most frequent word
  529 instances per million words
  72nd most frequent verb as a lemma
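The per-million figure on this slide can be turned back into an approximate raw count. The sketch below shows that conversion in both directions; the function names are illustrative, and the BNC size is taken as exactly 100 million words, which is only the rounded figure the slide gives.

```python
def per_million(count, corpus_size):
    """Normalised frequency: occurrences per million words."""
    return count * 1_000_000 / corpus_size


def raw_count(rate_per_million, corpus_size):
    """Approximate raw count implied by a per-million rate."""
    return rate_per_million * corpus_size / 1_000_000


# Assumption: corpus size rounded to 100 million words, as on the slide.
BNC_SIZE = 100_000_000

help_count = raw_count(529, BNC_SIZE)
print(help_count)                         # implied number of instances
print(per_million(help_count, BNC_SIZE))  # back to the per-million rate
```

Normalised rates like this are what make frequencies comparable across corpora of different sizes, such as the one-million-word corpora and the BNC used later in the case study.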
12 A verb with a distinctive syntax
English has two main-clause verbs that can control either a full or a bare infinitive: dare and help (Biber et al 1999: 735)
The choice between a full and bare infinitive is only available when dare is used as a lexical verb (as a modal verb, it is always followed by a bare infinitive)
HELP is the only English verb that can control either a full or bare infinitive AND occur either with or without an intervening NP
  HELP to V: Perhaps the book helped to prevent things from getting even worse.
  HELP NP to V: I thought I could help him to forget.
  HELP V: Savings can help finance other Community projects.
  HELP NP V: We helped him get to his feet and into the chair.
Dare can occur with or without an intervening NP, but it cannot control a bare infinitive when such an intervening NP is present
  Ernest <…> dared Archie to punch him in the stomach.
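The four HELP patterns can be told apart mechanically once a clause is part-of-speech tagged. The following is a minimal sketch, not the procedure used in the study: the function name, the hand-assigned tags, and the rule "anything between HELP and the infinitive counts as an NP" are all simplifying assumptions made for illustration.

```python
HELP_FORMS = {"help", "helps", "helped", "helping"}


def classify_help(tagged):
    """Classify the first HELP construction in a POS-tagged clause.

    `tagged` is a list of (word, tag) pairs. Only two tags matter here:
    "TO" (infinitive marker) and "VB" (base-form verb). Any other token
    after HELP is treated as part of an intervening NP, which is a
    simplification (an intervening adverbial would be miscounted as an NP).
    """
    for i, (word, _tag) in enumerate(tagged):
        if word.lower() not in HELP_FORMS:
            continue
        has_np = False
        for _next_word, next_tag in tagged[i + 1:]:
            if next_tag == "TO":
                return "HELP NP to V" if has_np else "HELP to V"
            if next_tag == "VB":
                return "HELP NP V" if has_np else "HELP V"
            has_np = True
        return None  # HELP found, but no infinitive follows it
    return None  # no form of HELP in the clause


# Shortened versions of the slide's examples, tagged by hand (assumed tags).
EXAMPLES = [
    [("the", "DT"), ("book", "NN"), ("helped", "VBD"), ("to", "TO"),
     ("prevent", "VB"), ("things", "NNS")],
    [("I", "PRP"), ("could", "MD"), ("help", "VB"), ("him", "PRP"),
     ("to", "TO"), ("forget", "VB")],
    [("Savings", "NNS"), ("can", "MD"), ("help", "VB"), ("finance", "VB"),
     ("other", "JJ"), ("projects", "NNS")],
    [("We", "PRP"), ("helped", "VBD"), ("him", "PRP"), ("get", "VB"),
     ("to", "IN"), ("his", "PRP$"), ("feet", "NNS")],
]

for clause in EXAMPLES:
    print(classify_help(clause))
```

In real corpus work the tags would come from the corpus annotation or a tagger, and the hits would still need manual checking, since the NP rule above is deliberately crude.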
13 A unique verb of great interest
A verb that has often been given prominence in textbooks, grammars and dictionaries
  E.g. Chalker (1984); Murphy (1985); Quirk et al (1972, 1985); Eastwood (1992); Biber et al (1999)
A verb that has aroused much interest and debate
  Language variety
  Language change
  Register variation
  Semantic distinction
  Syntactic conditions
15 Language variety: AmE vs. BrE
Bare infinitives are much more common in AmE (cf. Biber et al 1999)
  80% (AmE) vs. 52% (BrE)
  LL=23 (1 df), p<0.001
British preference for full infinitives
  You’re going to help me make to make a birthday cake for Jim remember. (BNC)
A construction of American provenance, which has penetrated rapidly into BrE
  Zandvoort (1966): ‘except in American English, however, to help usually takes an infinitive with to’
  No longer valid
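The LL values on these slides are presumably the log-likelihood ratio statistic (G2) computed on a 2x2 table of bare vs. full infinitives in two corpora; that reading is an assumption, since the slides do not show the formula. The sketch below implements the standard G2 calculation. The counts in the example are hypothetical, chosen only to mirror the 80% vs. 52% split, so the result is not expected to reproduce LL=23.

```python
import math


def log_likelihood_2x2(a, b, c, d):
    """G2 statistic for the 2x2 table [[a, b], [c, d]] (1 df).

    Rows are corpora, columns are bare vs. full infinitives.
    G2 = 2 * sum(O * ln(O / E)), with E from the row and column totals.
    """
    total = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = rows[i] * cols[j] / total
            if o > 0:  # a zero cell contributes nothing to the sum
                g2 += o * math.log(o / e)
    return 2 * g2


# Hypothetical counts (NOT from the study): 100 tokens of HELP per variety.
ame_bare, ame_full = 80, 20
bre_bare, bre_full = 52, 48

ll = log_likelihood_2x2(ame_bare, ame_full, bre_bare, bre_full)
print(f"LL = {ll:.2f}")
print("significant at p < 0.05" if ll > 3.84 else "not significant at p < 0.05")
```

The threshold 3.84 is the usual chi-square critical value for p = 0.05 with 1 degree of freedom.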
16 Language change: 1961-1991
Changing labels for bare infinitives
  (OED, 1933) “vulgar” -> (Vallins 1951) “not seriously questioned now…” -> (Mair 1995) “lost the informal ring”
An increase in the proportions of bare infinitives over the three decades in both AmE and BrE
  AmE: 68% -> 82% (+14%); LL=10.6 (1 df), p=0.001
  BrE: 22% -> 60% (+38%); LL=47.5 (1 df), p<0.001
A greater shift towards the use of bare infinitives in BrE because AmE was already more “tolerant” of bare infinitives in the 1960s
17 Spoken vs. written
Bare infinitives are slightly more frequent in speech than in writing, in both AmE and BrE
The differences are not statistically significant
  AmE: LL=2.71 (1 df), p=0.10
  BrE: LL=2.16 (1 df), p=0.142
No predictable distribution pattern for bare infinitives in 15 written genres
  Common in some formal genres (e.g. official documents) but infrequent in other formal genres (e.g. academic writing)
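The p-values quoted alongside the LL scores can be checked, assuming LL is referred to a chi-square distribution with 1 degree of freedom (the "1 df" on the slides suggests this). For 1 df the upper-tail probability has a closed form, p = erfc(sqrt(LL / 2)), so no statistics library is needed. The function name and labels below are illustrative.

```python
import math


def p_value_1df(ll):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(ll / 2))


# LL scores as reported on the slides for the HELP case study.
REPORTED = [
    ("AmE vs. BrE", 23.0),
    ("AmE 1961 vs. 1991", 10.6),
    ("BrE 1961 vs. 1991", 47.5),
    ("AmE spoken vs. written", 2.71),
    ("BrE spoken vs. written", 2.16),
]

for label, ll in REPORTED:
    print(f"{label}: LL={ll}, p={p_value_1df(ll):.4f}")
```

Run this way, the two spoken vs. written comparisons come out above the conventional 0.05 level and the other three well below it, in line with the slides.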
18 Semantic distinction
The debate has a long history
Some “pre-corpus” arguments
  Wood (1962: 107-8): to ‘can be omitted only when the helper does some of the work, or shares in the activity jointly with the person that is helped’
  Wood’s “unacceptable” examples
    These tablets will help you sleep. (But tablets do not sleep)
    Writing out a poem will help you learn it. (But writing does no learning)
  According to Quirk et al (1972: 841), the choice ‘is conditioned by the subject’s involvement’
    With a bare infinitive, ‘external help is called in’
    With a full infinitive, ‘assistance is outside the action proper’
19 Semantic distinction
Dixon (1991)
  John helped Mary eat the pudding: John ate part of the pudding as Mary did
  John helped Mary to eat the pudding: John fed the pudding to Mary
Duffley (1992)
  A bare infinitive evokes helping as ‘direct or active involvement’… help to V evokes help as a condition which enables the person being helped to realize the event
Lu (1996: 813)
  When the subject of ‘help’ does not take part in the helping activity, the infinitive must take to
  The book helped me to see the truth.
What do your intuitions tell you?
20 Semantic distinction
Not reported in more recent corpus-based works (e.g. Longman 1993/1996; Collins 1995; Biber et al 1999)
Quirk et al (1985) dropped the argument for semantic distinction
Collins CoBuild Dictionary
  “If you help someone, you make it easier for them to do something, for example by doing part of the work for them or by giving them advice or money.”
It is not always easy or even possible to tell whether or not the helper actually takes part in the helping activity
Counter examples are abundant in corpora
  I help people stop smoking. (FLOB)
  oh it says if you have a dose last thing at night it helps you sleep. (BNC)
21 Syntactic condition: Intervening NP
The previous claim (Lind 1983; Kjellmer 1985; Biber et al 1999) that an intervening NP increases the proportion of bare infinitives is only partly supported by our corpora
  Only valid in AmE, both written and spoken
  Unpredictable results, no statistical significance in BrE
22 Syntactic condition: Intervening adverbial
Lind (1983) claims that ‘an intervening adverbial will preclude omission of to’
  The whisky helped me not to stagger under this blow.
This claim is ungrounded, esp. in AmE (CPSA)
Some counter examples
  So, to help people not jump all over it as soon as they see it <…> (CPSA)
  <…> that would even help perhaps focus some of those responses. (CPSA)
  Mr. Clinton <…> also helped, to a much lesser degree, organize a huge march in Washington <…> (Frown)
  ...helping dramatically reduce poverty. (Time Magazine 2005/12/05)
  Now my daughter...is helping digitally restore the Disney films her grandfather worked on. (Time Magazine 2006/04/10)
23 Syntactic condition: to preceding help
To preceding help is a decisive syntactic condition that encourages the omission of to (cf. Lind 1983; Kjellmer 1985; Biber et al 1999)
  HELP (lemma): 60%
  help (finite verb): 65%
  to help (infinitive): 88% (+23%)
Consecutive repetition of to tends to be avoided on the grounds of euphony (cf. Lind 1983)
  They took on an estate manager and wine-maker to help run the business. (FLOB)
A statistical norm, not a categorical distinction
  In the BNC, to help V (2,161) is 17 times as frequent as to help to V (127)
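The "17 times" figure follows directly from the two BNC counts on this slide, and the same counts also give the share of bare infinitives after to help. The short sketch below does that arithmetic; the variable names are illustrative.

```python
# BNC counts as given on the slide.
to_help_bare = 2161  # "to help V"
to_help_full = 127   # "to help to V"

ratio = to_help_bare / to_help_full
bare_share = to_help_bare / (to_help_bare + to_help_full)

print(f"to help V is {ratio:.1f} times as frequent as to help to V")
print(f"bare infinitives after 'to help': {bare_share:.1%}")
```

The 127 attested cases of to help to V are what make this a statistical norm rather than a categorical rule.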
24 Syntactic condition: Passive voice
Palmer (1965: 169) observes that ‘passive occurs <…> only with to: They were helped to do it.’
All of the 9 instances of passivized HELP in our corpora take a full infinitive with no exception
No instance of BE helped V is found in the whole BNC or the 100-million-word Time corpus of AmE
Explanation (?): An analogy can be drawn between HELP and verbs such as MAKE, LET, SEE and HEAR: oC = bare infinitive
The infinitive shifts from oC to sC in passive transformation
  So they should be made to bring their prices down. (BNC)
  So the authorities should make them (*to) bring their prices down.
  Pupils should be helped to investigate topics on their own. (BNC)
  Teachers should help pupils (to) investigate topics on their own.
25 Case study: A summary
The choice of a full or bare infinitive following HELP is conditioned by a wide range of factors including, for example, language variety, language change, as well as various syntactic conditions
Non-corpus-based grammars are likely to contain biased descriptions that do not accord with attested language use
26 Adverbial clauses: Position vs. semantic types
Greenbaum and Nelson (1995)
27 Exploring if-clauses in ICE-GB
One million words
500 samples (300 spoken, 200 written)
Parsed corpus
Position of if-clauses
  Clause-initial position: If it’s a really nice day we could walk.
  Clause-final position: We could walk if it’s a really nice day.
Reference
  Nelson, G., Wallis, S. and Aarts, B. (2002) Exploring Natural Language: Working with the British Component of ICE. Amsterdam: John Benjamins.
42 Frequencies of initial / final positions
Initial position appears to be the “unmarked” position for if-clauses
  Initial position (886, 61.4%)
  Final position (556, 38.6%)
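The percentages on this slide can be recomputed from the two raw counts, which is a useful check when repeating the search in the lab session. The sketch below assumes the two counts cover all the if-clauses retrieved; the function name is illustrative.

```python
def shares(counts):
    """Map each category to its percentage of the overall total."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Counts of if-clauses by position, as given on the slide.
positions = {"initial": 886, "final": 556}

for label, pct in shares(positions).items():
    print(f"{label}: {positions[label]} ({pct:.1f}%)")
```

The same function can be applied to counts for individual registers to compare the spoken and written parts of the corpus.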
43 Written registers
Greenbaum and Nelson's (1995) observation about conditional clauses (64.8% initial and 35.32% final) applies only to written registers
44 Spoken registers
In the spoken data as a whole, the final position is preferred, though there is considerable internal variation.
The more "formal" spoken registers (parliamentary debates, legal presentations and non-broadcast (scripted) speeches) show a marked preference for the initial position.