Today: Writing: using the comma – Writing task; Corpus linguistics talk, Part 2; Re-organize groups – Group news discussion

1 Today
Writing: using the comma
– Writing task
Corpus linguistics talk, Part 2
Re-organize groups
– Group news discussion

2
1.
2. He left the scene of the accident and tried to forget that it had happened.
3. Oil which is lighter than water rises to the surface.
4. Madame de Stael was an attractive gracious lady.
5. Nice is a word with many meanings and some of them are contradictory.
6. Taxicabs that are dirty are illegal in some cities.
7. The uninvited guest wore a dark blue tweed suit.
8. I hope that some day he will learn how to be polite.
9. Mark Twain's early novels I believe stand the test of time.
10. Write the editor of the Atlantic 8 Arlington Street Boston Massachusetts 02116.
11. He replied "I have no idea what you mean."
12. After a good washing and grooming the pup looked like a new dog.
13. Men who are bald are frequently the ones who are the most authoritative on the subject of baldness.
14. Hello Kitty cellphones which are very popular in Japan have not really caught on in Taiwan.

3 Introduction to corpus linguistics
Simon Smith & Adam Kilgarriff

4 Plan for today
Short review of corpus basics
4 ages of corpus research
– From the pre-computer age to SkE (Sketch Engine)
Functions of SkE
Demonstration of SkE in use

5 Quiz
What’s a (linguistic) corpus?
What does the Latin word mean?
What are corpora?
What’s the BNC?
How big is the British National Corpus?
What is the advantage of having a very big corpus?
What can corpora be used for?

6 5 major uses for linguistic corpora
Language learning and teaching
Theoretical research on language and linguistics
Literary research and analysis
Language technology
Lexicography (= dictionary making)
– Cobuild, Longman, …
– All learner dictionaries now use corpora

7 How do you make a dictionary? (What sources can you use?)
Use your own knowledge of words
Ask all your friends for their knowledge
Consult other dictionaries – and copy them
Read thousands of books – and take lots of notes
Use a corpus

8 Four ages of corpus research (in lexicography)
Kilgarriff, Lexical Computing. Taiwan, Dec 2006
Age 1: Pre-computer
Age 2: KWIC concordance (KWIC=?)
Age 3: Corpus query tools, e.g. Sketch Engine

9 Age 1: Pre-computer
First Oxford English Dictionary (1860): 20 million index cards
– a word (usually rare) and a citation

10 Age 2: KWIC Concordance

11 Age 2 (~1980-1990): KWIC Concordances
Using computers
List of lines which contain a keyword
The keyword is in the middle
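
A KWIC concordance of the kind described on this slide can be sketched in a few lines of Python. This is only an illustration, not how any particular concordancer is implemented: the function name `kwic`, the `width` parameter and the sample sentences are all made up for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per hit, with the keyword centred between
    `width` characters of left and right context."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context and left-align the right context,
        # so that every keyword starts in the same column.
        lines.append(f"{left:>{width}} {match.group(0)} {right:<{width}}")
    return lines

# Invented sample text, for illustration only.
sample = (
    "The party agreed to the terms. We went to a party last night. "
    "A search party found the climbers. The Labour party won the election."
)

for line in kwic(sample, "party", width=25):
    print(line)
```

Because the keyword always starts in the same column, the different uses of one word can be compared by reading down the page.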

12 The coloured pens method
1 political association
2 social event
3 group of people
4 person in an agreement/dispute
5 to be party to something...

13 Age 2: limitations
As corpora get bigger: too much data
– 50 lines for a word: read all
– 500 lines: could read all, takes a long time
– 5000 lines: impossible

15 Why do corpora keep getting bigger? (anyone?)
Improvements in technology
– Price of storage is going down
– Speed of access is going up
Representativeness
– Small corpus: many examples of common words, maybe
– But not enough examples of unusual words

16 Lexical distribution
What’s the most common word in English?
What % does it make up of a whole corpus?
The 100 most common words make up __% of all the words in a corpus?
The 7500 most common words make up __%
Answers:
– The, 5%, 45% and 90%
So:
– you need massive corpora if you really want to represent rare words properly
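
The percentages in this quiz are coverage figures: the share of all running words (tokens) accounted for by the N most frequent word types. A rough sketch of that calculation follows. The regex tokenizer, the function names and the toy sentence are assumptions made for the example; checking the slide's figures would need a large corpus such as the BNC.

```python
from collections import Counter
import re

def tokenize(text):
    """Very rough tokenizer: lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, top_n):
    """Fraction of all tokens covered by the top_n most frequent word types."""
    counts = Counter(tokens)
    covered = sum(freq for _, freq in counts.most_common(top_n))
    return covered / len(tokens)

# Toy text, for illustration only; far too small to show real proportions.
text = "the cat sat on the mat and the dog sat on the rug"
tokens = tokenize(text)

for n in (1, 3, 100):
    print(f"top {n} word types cover {coverage(tokens, n):.0%} of tokens")
```

On a real corpus the interesting part is the long tail: most word types occur only a handful of times, which is the slide's argument for very large corpora.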

18 Limitation of KWIC analysis
As corpora get bigger: too much data
– 50 lines for a word: read all
– 500 lines: could read all, takes a long time
– 5000 lines: no
Instead, look at a Word Sketch from Sketch Engine
– a statistical summary of word usage
– shows most common collocates
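
The idea of summarising many concordance lines by their most common collocates can be sketched as a simple neighbour count. This is a deliberately crude stand-in: an actual Word Sketch groups collocates by grammatical relation and ranks them with a statistic, neither of which is attempted here. The function name `collocates`, the `window` parameter and the toy text are assumptions for the example.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count the words found within `window` tokens either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Toy text, for illustration only.
text = "strong tea and strong coffee but powerful car and strong tea again"
tokens = text.split()

for word, freq in collocates(tokens, "strong", window=1).most_common():
    print(word, freq)
```

Raw counts like these are dominated by function words such as "and", which is one reason a statistical score is used alongside frequency.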

21 Maybe stop here

22 Functions of SkE
KWIC concordance
– Sorting, filtering etc.
Word sketch
Automatic thesaurus
Sketch difference
– discriminate near-synonyms
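
To illustrate what discriminating near-synonyms might involve, the sketch below compares the words that directly follow two adjectives and reports which are shared and which belong to only one of them. It is a toy approximation, not Sketch Engine's method: the function names, the "next word only" rule and the sample text are all assumptions made for the example.

```python
from collections import Counter

def neighbours(tokens, node):
    """Count the word immediately following each occurrence of `node`."""
    return Counter(
        tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == node
    )

def sketch_diff(tokens, word_a, word_b):
    """Split following-word collocates into those seen with only
    word_a, with both words, or with only word_b."""
    a = neighbours(tokens, word_a)
    b = neighbours(tokens, word_b)
    return {
        "only_a": sorted(set(a) - set(b)),
        "shared": sorted(set(a) & set(b)),
        "only_b": sorted(set(b) - set(a)),
    }

# Toy text, for illustration only.
tokens = (
    "strong tea strong wind strong argument "
    "powerful car powerful engine powerful argument"
).split()

print(sketch_diff(tokens, "strong", "powerful"))
```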

23 Lexical approach to language learning
Lewis (1993) and Schmitt (2000) say
– vocabulary is stored in the brain in collocations
– bacon is stored near eggs
– 蛋 (egg) is stored near 炒飯 (fried rice)
– scotch is stored with whisky
Saying strong car or powerful tea or broken house seems very “foreign”

24 From www.teachingenglish.org: a lexical approach activity, based on a story text

25 News task
4 sentences
News story must be from the current week
Please include the date when you print it
Make two lists of adjectives:
– (+) exciting; dramatic; unusual…
– (-) dull; complicated; bloodthirsty…
Choose the best story from your group
– I’m not very keen on that story because…
– I prefer this story because…

26 Collocations and sentences
5 words
Use the SkE beta
Say which corpus you used
3 collocations for each word
– State the frequency
– State the salience (顯著性)
Example sentence from SkE should use one of the collocations you chose
If you don’t understand the sentence, don’t use it!
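
Frequency and salience answer different questions: frequency is a raw count of how often two words occur together, while salience is an association score that also takes into account how common each word is on its own. As one illustration, the sketch below computes logDice, a measure documented for later versions of Sketch Engine. Whether the SkE beta used for this task computes salience in this way is an assumption, and all the counts are invented.

```python
import math

def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy: how often the two words occur together
    f_x, f_y: how often each word occurs overall
    """
    if f_xy <= 0:
        raise ValueError("co-occurrence frequency must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

# Hypothetical counts (f_xy, f_x, f_y), invented for illustration only.
pairs = {
    ("strong", "tea"): (30, 2000, 500),
    ("powerful", "tea"): (1, 1500, 500),
}

for (w1, w2), (f_xy, f_x, f_y) in pairs.items():
    print(w1, w2, "frequency:", f_xy, "logDice:", round(log_dice(f_xy, f_x, f_y), 2))
```

The score has a maximum of 14, reached only when the two words always occur together, so a higher value indicates a stronger collocation.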

27 Before this week’s reading, ask:
How many different cuisines can you name, from around the world?
Which cuisine do you think is the healthiest?

