Combining intuition with corpus linguistic analysis: A study of lexical chunks in four Chinese undergraduate students’ writing. Maria Leedham, FLaRN 2010.

1 Combining intuition with corpus linguistic analysis: A study of lexical chunks in four Chinese undergraduate students’ writing. Maria Leedham, FLaRN 2010. m.e.leedham@open.ac.uk

2 BACKGROUND TO STUDY 2 FLaRN 2010 Maria Leedham

3 Chunking through intuition: Study 1
RQ: To what extent can NSs and NNSs chunk NNS speech?
Data: transcripts of 2 intermediate-level Japanese students’ speech
  students were recorded 3 times, with a 2-month gap between recordings
  total of approx. 1,500 words across the 6 transcripts
Method
  Step 1: 3 NS linguists asked to underline chunks in the 6 transcripts (training, examples and practice given first)
  Step 2: Japanese students asked to identify chunks in their own transcripts
  Step 3: author chunks transcripts with assistance from WordSmith Tools
(Leedham, 2006)

4 Example of chunked transcript from Study 1
Key: italics – words classified by the NNS as a chunk; underline – words that 2 or 3 of the 3 NSs classified as a chunk
1 ahh…first err I, I learned, learnt? (mmhmm I learnt) err (2.0) I should err.. I
2 should be more positive? (right) positive… in UK because ahh…when, when I
3 went to London err… last Sunday (mhmm) ahh (2.0) some, some of the
4 underground line (mm) line was no service (oh dear) ((speaker laughs)) I was
5 really surprised and, because it can, cannot be (mm) in Japan (mm) you know,
6 sun- in, in Sunday, on? (mm) on Sunday many, many people (mm) come to
7 London (mm) and go around some place (mm)... so everyone need to, need a
8 train (mm) so, but maybe four or five lines… was not, no service (mm) so…
9 I… I have to think err what I should do ((speaker laughs)) and no, I’ve never, I
10 have never been to London that, so, this was the first time I’ve been to London
11 (mm) so…
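The "2 or 3 out of the 3 NSs" rule in the key amounts to a majority vote over word positions. The following is a minimal sketch of that rule only, not the procedure actually used in the study; the function name `majority_marked`, the token list and the three raters' marks are all hypothetical illustrations.

```python
from collections import Counter

def majority_marked(rater_marks, min_raters=2):
    """Return token positions marked as part of a chunk by at least
    `min_raters` raters. Each element of `rater_marks` is the set of
    token positions that one rater underlined."""
    counts = Counter(pos for marks in rater_marks for pos in set(marks))
    return sorted(pos for pos, n in counts.items() if n >= min_raters)

# Hypothetical data: six tokens and three raters' underlining.
tokens = "I was really surprised you know".split()
rater_marks = [{1, 2, 3, 4, 5}, {4, 5}, {2, 3}]

agreed = majority_marked(rater_marks)
print([tokens[i] for i in agreed])
```

Raising `min_raters` to 3 would keep only words that all three raters marked, which gives a stricter view of agreement.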

5 Findings from Study 1
Findings:
  little inter- or intra-rater reliability
  many ‘missing’ chunks (e.g. ‘of course’, ‘you know’), both across and within raters
  a frustrating and time-consuming task for NSs
BUT…
  the Japanese students could do this task
  AND also offered insights into when/why… (e.g. student M: “I used to say that but now I know it’s not usual”.)
  the more time spent looking for chunks, the more will be found
Coda
  a further recording, transcribing and awareness-raising cycle suggests that this resulted in uptake
  both students found it highly motivating to record and analyse transcripts of their talk

6 Chunking through intuition: Study 1
Method
  Step 1: 3 NS linguists asked to underline chunks in the 6 transcripts (training, examples and practice given first)
  Step 2: Japanese students asked to identify chunks in their own transcripts
  Step 3: author chunks transcripts with assistance from WordSmith Tools v.5

7 STUDY 2

8 Outline
1. Research questions
2. The students and the texts
3. The two methods
4. Findings
4.1 Method 1
4.2 Method 2
5. Conclusions and Implications

9 Research Questions
1. What can a study of lexical chunks reveal about these Chinese students’ writing?
2. What does each method contribute?

10 The Students
Wei: male, BSc Engineering
Feng: female, BSc Food Science with Business
Ping: female, BA Hospitality, Leisure & Tourism Management (HLTM)
Hong: male, BA HLTM
Criteria
  L1 Chinese (Mandarin or Cantonese)
  All secondary education in home country
  Contributions from years 1 & 2 and year 3 of undergraduate study

11 The texts
Reference corpora

12 Combining intuition and corpus searches
Method 1: Manual analysis
  Read all 4 Chinese students’ texts
  Read twice, with 6 months between readings
  Read equivalent, randomly-selected English students’ texts
  Noted ‘salient’ features, then searched corpora of the individual’s texts, the discipline, all Chinese students’ writing, and all English students’ writing
Method 2: Key n-gram searches
  Used WordSmith Tools, v.5 (Scott, 2008)
  Searched for key n-grams in the corpus of texts from each student, using the relevant discipline corpus from L1 English as reference
  Set p = 0.00001; deleted short n-grams contained within longer n-grams
  Compiled key n-gram lists
  Looked at concordance lines and texts for more context
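Two steps in Method 2, counting n-grams and deleting short n-grams that sit inside longer ones, can be illustrated in a few lines. This is a sketch under stated assumptions rather than a reproduction of WordSmith Tools: the function names, the 2 to 5 word range and the sample list are hypothetical, and WordSmith's own cluster settings may differ.

```python
from collections import Counter

def ngram_counts(tokens, n_min=2, n_max=5):
    """Count every contiguous n-gram of length n_min..n_max (assumed range)."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def contains(longer, shorter):
    """True if the tuple `shorter` occurs as a contiguous run inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))

def drop_nested(ngrams):
    """Drop any n-gram that occurs inside a longer n-gram in the same list."""
    ngrams = list(ngrams)
    return [g for g in ngrams
            if not any(len(h) > len(g) and contains(h, g) for h in ngrams)]

# Hypothetical list of n-grams already judged to be key.
key = [
    ("on", "the", "other", "hand"),
    ("the", "other", "hand"),
    ("other", "hand"),
    ("in", "light", "of"),
]
print(drop_nested(key))
```

Here `drop_nested` keeps 'on the other hand' and 'in light of' and discards the two shorter fragments. Dropping a fragment regardless of how often it occurs on its own is a simplification; a frequency-aware rule would be a different design choice.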

13 Formulaic sequences in sample of Wei’s writing (Engineering)
Introduction
A design methodology for a gearbox is presented in this report. The input horse power, the input speed and net reductions in the gearbox are the parameters to be specified. A gearbox takes an input shaft rotating and converts it via a gear train into up to three outputs, the process of designing a gearbox is to figure out which ratios are needed and to implement those ratios in the form of positioning various sizes of connected gears. The specification of the gearbox depends on its area of application. In this report, a gearbox is designed for a commercial meat slicer which has its final shaft rotating at between 80 and 100 rev/min. The input of the meat slicer is a constant speed AC motor running at 1800 rev/min and delivering 1.2 kW. A few points have to be considered on this system, the size of the gearbox is severe restricted, since it has to go onto a work surface where there is severe competition for space. And the motor may be in-line or at right angles to the grinder. Furthermore, the duty is expected to be up to 6 hours per day.

15 Idiosyncratic language
In one word computer based tools contribute an…
In one word the overall system can be described… (Wei, years 2 & 3)
In light of this, it is suggested that buying IHG…
In light of this, it can be suggested that…
In light of this, it is recommended that buying IHG… (Ping, year 3, in 1 text)
… but simply writing a responsible tourism policy is no longer enough. It is a must to show practical action,… (Hong, Year 1)
a winning city, the authorities of Liverpool have to rebuild its image to get rid of the negative picture. (Hong, Year 2)
…and boost its marketing campaigns in order to catch the world’s eyes on Scotland. (Hong, year 3)

16 Vague language
In catering services, restaurants in Oxford and Bath are more or less the same. (Hong, Year 1)
From those tables, the same thing as section 3.1 could be found … (Wei, Year 1)
…a measurement system for measuring low-lever force, a kind of cantilever rig which is called…
A kind of variable inductance sensor has been chosen…
…Furthermore, with processing data, a kind of filter is always needed to separate certain… (Wei, year 2, same assignment)
At that time, I found that this hotel is a little bit out of my expectation. (Hong, Year 2)

17 Vague language
L1 English students use ‘a bit of a’ + N, e.g. ‘a bit of a problem’, ‘a bit of a shock’, ‘a bit of a dog’s breakfast’
Often this is from reflective writing:
‘The conclusion was also a bit of a victim in my editings, bringing it down to one small sentence for each of the areas of discussion’. (6101c Cybernetics Year 3 essay)

18 Chunks with – and without – ‘I’ & ‘we’
From the experiment, it was known that the mechanical properties of carbon steel AN and carbon steel N….
It was found out the mechanical properties of carbon steel AN was incorrect in this experiment,… (Wei, Year 1)
Meanwhile, if we clipped the current probe round one of the motor supply leads, and connected it to Ch1 of the oscilloscope, we could get two copies of the transient starting current of the motor from the oscilloscope. From these two copies, we could calculated…

19 Chunks with – and without – ‘I’ & ‘we’
L1 English students

20 Linkers
This can create a positive image for Scotland, on the other hand, (Ping, Year 3)
…In other words, people are buying expectations... (Hong, year 3)
As a consequence, it can attract many travelers… (Hong, Year 2)
On the contrary, the predominance of SMEs... (Ping, Year 2)
First of all, the dimension of the brake disc is decided. (Wei, Year 3)
What is more, Bath is served by a large number of local bus services… (Hong, Year 1)
References to data
‘as shown in table’ (Wei x 2, Ping x 2)
‘according to’ (Wei x 4)
‘as illustrated in table + NUMBER’ (Ping x 2)

21 Summary of method 1 findings
Salient chunks in the Chinese students’ writing were:
  Idiosyncratic chunks (‘in light of the’)
  Vague language (‘a bit of’), though note English students’ use of ‘a little bit of’
  High use of chunks with ‘we’ and low use of chunks with ‘I’, partly due to English students’ reflective writing
  Use of favoured linkers (‘on the other hand’)
  Reference to data in tables and figures (‘according to the equation’)
BUT… very difficult to intuit chunks in unfamiliar disciplines

23 Method 2: Key n-gram searches
Used WordSmith Tools, version 5 (Scott, 2008)
Searched for key n-grams (= ‘key clusters’) in the corpus of texts from each of the 4 students
Relevant discipline corpus from L1 English used as reference corpus
p = 0.00001; deleted short n-grams contained within longer n-grams
Compiled a key n-gram list for each student
Grouped these key n-grams into themes
Looked at concordance lines for more context
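To make the p = 0.00001 cut-off concrete, here is a minimal sketch of a keyness calculation. It assumes the two-term log-likelihood statistic commonly used in corpus linguistics; the slides do not state which statistic WordSmith Tools was set to, so treat that choice as an assumption. The function names and the toy counts are hypothetical.

```python
import math

def log_likelihood(a, b, c, d):
    """Two-term log-likelihood for an item occurring `a` times in a study
    corpus of size `c` and `b` times in a reference corpus of size `d`."""
    e1 = c * (a + b) / (c + d)  # expected count in the study corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def p_value(ll):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(ll, 0.0) / 2))

def key_ngrams(study, reference, alpha=0.00001):
    """Return (n-gram, LL) pairs that are relatively more frequent in `study`
    than in `reference` and significant at `alpha`, highest LL first."""
    c, d = sum(study.values()), sum(reference.values())
    found = []
    for gram, a in study.items():
        b = reference.get(gram, 0)
        ll = log_likelihood(a, b, c, d)
        if a / c > b / d and p_value(ll) < alpha:
            found.append((gram, ll))
    return sorted(found, key=lambda pair: -pair[1])

# Toy counts, invented for illustration only.
study = {"on the other hand": 30, "filler": 970}
reference = {"on the other hand": 10, "filler": 9990}
print(key_ngrams(study, reference))
```

Only positively key items are returned, that is, n-grams that are relatively more frequent in the student's texts than in the reference corpus. Reporting negative keyness as well would need the frequency comparison to be relaxed.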

24 N-grams

25 Idiosyncratic language
Ping's year 2 proposal
‘aim of the’, ‘of the assignment is to design’, ‘to develop an understanding of’ (Wei)

26 Discipline-specific n-grams
  ‘Marriott Liverpool city centre’, ‘the Liverpool tourism industry’, ‘the tourism industry’ (Hong)
  ‘the hospitality industry’, ‘recruitment and selection’, ‘in the hospitality industry’ (Ping)
Passive voice
  ‘be worked out’, ‘can be calculated’ (Wei)
  ‘there will be’, ‘it is believed that’ (Ping)
References to data
  ‘with reference to appendix’, ‘please see appendix’ (Ping)
  ‘in the appendix’, ‘briefing sheet in appendix’, ‘is shown as’, ‘tables of data’, ‘were recorded as below’, ‘was calculated with eq.’ (Wei)

27 Favoured linkers decrease over time

28 Summary of method 2 findings
Many of the same findings from method 1:
  idiosyncratic chunks
  some linkers, esp. ‘on the other hand’
  low use of chunks with ‘I’
  references to data
Also…
  discipline-specific chunks
  Easy to compare one student’s texts with the discipline reference corpus & each L1 reference corpus
  Similar findings occur within the Chinese students overall
NB Keyness measures difference

30 Intuitive reading vs. key n-grams analysis
Intuitive reading
  Finds semantically whole units (formulaic sequences)
  Plus
    A person can recognise single instances that a computer would miss
    The text is read as a complete document, as intended by the writer
  Minus
    Time-consuming and tiring
    Problem of inter-rater reliability
    Problem of intra-rater consistency
    Hard to replicate
Key n-grams analysis
  Finds frequent chunks (n-grams)
  Plus
    Large quantities of data can be analysed quickly
    Accurate
    Easily replicable
  Minus
    Single chunks are missed
    Arbitrary parameters
    Conflation of writing from lots of individuals
    Sense of text as complete document is lost

31 Combining methods…
Combine the two methods through a recursive process of reading texts and checking the sequences in a corpus, also searching for key n-grams for less intuitive sequences.
“ultimately, the most revealing insights… will be gained from a closer look at the texts, the speakers, and the situational variables; quantitative analysis alone can never provide a satisfactory picture” (Simpson, 2004:41).

33 References
Foster, P. (2001). “Rules and routines: A consideration of their role in the task-based language production of native and non-native speakers”, in M. Bygate, P. Skehan and M. Swain (eds.), Task-Based Learning: Language Teaching, Learning and Assessment. London: Longman.
Heuboeck, A., Holmes, J. & Nesi, H. (2007). The BAWE Corpus Manual. Retrieved from http://www.coventry.ac.uk/researchnet/d/505/a/5160
Leedham, M. (2006). “Do I speak better? – A longitudinal study of lexical chunking in the spoken language of two Japanese students”. In The East Asian Learner.
Scott, M. (2008). WordSmith Tools v.5. Oxford University Press.
Wray, A. (2002). Formulaic Language and the Lexicon. Cambridge University Press.
BAWE corpus: ESRC project number RES-000-23-0800

