Daniel Nkemleke, Humboldt Kolleg Kamerun, 30/07/2008 Corpus Linguistics and Language Education: Development and Utility of the Corpus of Cameroon English.

Presentation transcript:

1 Corpus Linguistics and Language Education: Development and Utility of the Corpus of Cameroon English
Daniel A. Nkemleke, Department of English, Ecole Normale Supérieure, University of Yaoundé I
Outline
• Introduction: Corpus Linguistics, history
• Some (main) existing corpora
• Development of the Corpus of Cameroon English (CCE)
• Corpus utility with reference to the CCE
• Prospects

3 Introduction: what is Corpus Linguistics?
• The study of language based on examples of "real-life" language use, collected, stored and processed by computer
• Facilitated by the advent of computer technology (1960s)
• Latin corpus ("body"): a body of text
• Any collection of more than one text, written or spoken

4 Introduction (cont'd): brief history
• Before the 1940s/1950s: "early corpus linguistics"
• Corpus-based methodology ("primitive corpora"?)
• Between the 1960s and 1980s: a minority of linguists continued corpus-based work (Quirk: SEU; Francis & Kučera: Brown Corpus; Svartvik: London-Lund Corpus)
• Computer technology: major support for CL
• First African corpus: 1989 (ICE-East Africa) (Schmied 1989)
• Second African corpus: 1992, CCE (Tiamajou 1993) / Nigeria??

5 Introduction (cont'd): brief history
"Thirty years ago when this research started it was considered impossible to process texts of several million words in length. Twenty years ago it was considered marginally possible but lunatic. Ten years ago it was considered quite possible but still lunatic. Today it is very popular" (Thomas/Short 1996: 4)

6 Some (main) existing corpora
L1 corpora
• Brown Corpus of American English
• Lancaster-Oslo/Bergen Corpus (LOB)
• London-Lund Corpus
• British National Corpus (BNC)
• Birmingham Corpus of British English
L2 corpora
• ICE-East Africa (Kenya & Tanzania)
• Corpus of Cameroon English
• Corpus of Nigerian English ??
• Kolhapur Corpus of Indian English
Multinational corpus project
• International Corpus of English (ICE)

7 Four main characteristics of a corpus
1. Sampling & representativeness
• Interest in a whole variety of English
• Attempt to construct a "representative" sample corpus
• The sample should represent the variety as fully as possible
• Aim: as accurate and reasonable a picture as possible of a language population

8 Four main characteristics of a corpus (cont'd)
2. Finite size
• A body of a finite number of words, e.g. 1,000,000
• Figure determined at the beginning of the project
• Monitor corpus: constant addition of texts

9 Four main characteristics of a corpus (cont'd)
3. Machine-readable form
• Past: "corpus" referred to printed text
• Nowadays: "corpus" implies machine-readable text
• Few in book form (e.g. the original London-Lund)
• Occasionally other media (microfiche, recordings)

10 Four main characteristics of a corpus (cont'd)
4. Standard reference
• Tacitly, a corpus constitutes a standard reference
• Presupposition: wide availability to other researchers
• Direct comparison of results with other varieties

11 Development of the Corpus of Cameroon English (CCE)
• Began in 1992 in collaboration with two British universities (Birmingham/Liverpool)
• Assistance of the British Council in Yaoundé
• Target of one million words reached in 1994
• Data used for classroom activities/research since then
• 2005: project benefited from a grant from the AvH (Alexander von Humboldt Foundation)
→ Goal: further development (tagging) of the database (TU Chemnitz)

12 Objectives
• Provide authentic data for the description of the main features and problems inherent in the variety of English written in Cameroon
• Provide a source of authentic material for English language teaching/learning in Cameroon
• Serve as a database for comparative studies on CamE in relation to other varieties of English

13 Text categories: written component

Text category                No. of texts    No. of words
A: Official Press                     257         126,539
B: Private Press                       42          49,098
C: Novels & Short Stories              21          77,096
D: Religion                            19          96,380
E: Tourism                              5          26,881
F: Official Letters                    77          12,285
G: Private Letters                    250          79,386
H: Students' Essays                    83         137,399
I: Government Memos                    16          71,368
J: Advertisement                       10           4,875
K: Miscellaneous                       22         139,247
TOTAL                                 802         820,554
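
The split between "No. of texts" and "No. of words" in the table above can be checked against the TOTAL row, and each category's share of the written component follows from the same figures. A minimal Python sketch, assuming the per-category figures are read as 257/126,539, 42/49,098 and so on; the names WRITTEN_COMPONENT, totals and word_share are illustrative and not part of the CCE project:

```python
# Figures copied from the written-component table (texts, words).
# The dictionary and function names are hypothetical, chosen for this sketch.
WRITTEN_COMPONENT = {
    "A: Official Press": (257, 126_539),
    "B: Private Press": (42, 49_098),
    "C: Novels & Short Stories": (21, 77_096),
    "D: Religion": (19, 96_380),
    "E: Tourism": (5, 26_881),
    "F: Official Letters": (77, 12_285),
    "G: Private Letters": (250, 79_386),
    "H: Students' Essays": (83, 137_399),
    "I: Government Memos": (16, 71_368),
    "J: Advertisement": (10, 4_875),
    "K: Miscellaneous": (22, 139_247),
}


def totals(table):
    """Return (total number of texts, total number of words)."""
    texts = sum(n_texts for n_texts, _ in table.values())
    words = sum(n_words for _, n_words in table.values())
    return texts, words


def word_share(table):
    """Return each category's fraction of the total word count."""
    _, total_words = totals(table)
    return {cat: n_words / total_words for cat, (_, n_words) in table.items()}


if __name__ == "__main__":
    print(totals(WRITTEN_COMPONENT))
    for category, share in word_share(WRITTEN_COMPONENT).items():
        print(f"{category:<28}{share:6.1%}")
```

Under this reading both column sums agree with the TOTAL row (802 texts, 820,554 words).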

14 Text categories: spoken component
Dialogues
1. Conversations
2. Phone calls
3. Broadcast discussions
4. Classroom lessons
5. Interviews
6. Parliamentary debates
7. Legal cross-examination
8. Business transactions
Monologues
1. Commentaries
2. Demonstrations
3. Legal presentations
4. Broadcast news
5. Broadcast talks
6. Non-broadcast talks

15 Corpus utility with reference to the CCE
13 possible ways in which a corpus may be useful:
1. Corpora as a source of empirical data
2. Corpora in language teaching and learning
3. Corpora in lexical studies
4. Corpora in grammar studies
5. Corpora in speech research
6. Corpora and semantic studies
7. Corpora in pragmatic and discourse studies
8. Corpora in sociolinguistic studies
9. Corpora and stylistic studies
10. Corpora in historical linguistics
11. Corpora in dialectology and variational studies
12. Corpora in psycholinguistics
13. Corpora in cultural studies

16 1. Corpus as a source of empirical data
• Linguists can make more objective statements about language use in the variety and compare it with other varieties:
Nkemleke/Mbangwana (2001); Nkemleke (2003); Nkemleke (2004a, 2004b); Nkemleke (2005); Nkemleke (2006); Nkemleke (2007a, 2007b); Nkemleke (fc: 2008a, 2008b, 2008c); Schmied/Nkemleke (fc: 2008a, 2008b)
• A number of post-graduate projects in ENS/Faculty

17 2. Corpora in language teaching/learning
• CCE data used for classroom activities over the years

18 Concordances: arrive _ NP (simplification)

19 Value of concordances
• Support teachers' classroom explanations
• Learners as researchers
• Data-driven learning
• Critical look at existing language teaching material
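
A concordance of the kind used for "arrive" can be approximated with a simple keyword-in-context (KWIC) listing. The following is a minimal Python sketch, not the tool used for the CCE; the function name kwic, the regular expression and the sample sentences are assumptions made for illustration, and the sentences are invented rather than taken from the corpus.

```python
import re


def kwic(texts, pattern, width=30):
    """Return one keyword-in-context line per regex match in `texts`.

    Each line shows up to `width` characters of left and right context,
    with the matched keyword in square brackets.
    """
    rx = re.compile(pattern, re.IGNORECASE)
    lines = []
    for text in texts:
        for match in rx.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sentences for illustration only; they are NOT taken from the CCE.
SAMPLE = [
    "The delegation will arrive Douala on Monday morning.",
    "We arrived at the station before noon.",
    "She arrives home late every Friday.",
]

if __name__ == "__main__":
    for line in kwic(SAMPLE, r"\barriv(?:e|es|ed|ing)\b"):
        print(line)
```

Aligning the keyword in one column lets learners scan what follows the verb, for example whether a preposition appears before the noun phrase, which is the data-driven learning use listed above.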

20 Natural data for textbooks
• CCE data used for studies on aspects of Cameroon English usage, e.g. Hans-Georg Wolf used data from the corpus in his book English in Cameroon, published in 2001 by Mouton de Gruyter (Berlin/New York).

21 3. Corpora in lexical studies
• Keep informed about new words and changing meanings
• Call up word combinations and co-occurring words
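
Calling up co-occurring words amounts to counting which tokens appear within a small window around a node word. A minimal Python sketch under stated assumptions: the function name collocates, the simple letters-and-apostrophes tokenizer and the window size are illustrative choices, and the sample sentences are invented rather than drawn from the CCE.

```python
import re
from collections import Counter


def collocates(texts, node, span=2):
    """Count words occurring within `span` tokens of `node` (case-insensitive)."""
    node = node.lower()
    counts = Counter()
    for text in texts:
        # Crude tokenizer: runs of letters and apostrophes, lower-cased.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token == node:
                left = tokens[max(0, i - span):i]
                right = tokens[i + 1:i + 1 + span]
                counts.update(left + right)
    return counts


# Invented sentences for illustration only; they are NOT taken from the CCE.
SAMPLE = [
    "He paid the bride price last year.",
    "The bride price was high.",
    "The price of fuel rose.",
]

if __name__ == "__main__":
    for word, freq in collocates(SAMPLE, "price", span=1).most_common():
        print(word, freq)
```

Raw counts like these favour frequent function words; real lexical studies usually add an association measure, which this sketch leaves out.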

22 Prospects
• ICE-Cameroon is ongoing
• Future possibility of more specialized corpora, e.g. academic texts, fiction

23 END Thank you!

