Applied Linguistics Chapter Four: Corpus Linguistics

Corpus Linguistics
Corpus linguistics (CL) explores actual patterns of language use and applies the findings to developing materials for language classroom instruction. It provides a powerful tool for analysing natural language and offers insights into how language use varies across situations: spoken/written, formal/casual.

Characteristics of Corpus-Based Analysis of Language
- It is empirical: it analyses the actual patterns of language use in natural texts.
- It uses a large and principled collection of natural texts, known as a corpus, as the basis for analysis.
- It makes extensive use of computers for analysis, using both interactive and automatic techniques.
- It depends on both quantitative and qualitative techniques.

- Although computers provide a wide range of sophisticated statistical techniques and accomplish mechanical tasks quickly and accurately, human analysis remains indispensable for deciding which information to extract from the corpus and for interpreting it appropriately. Hence, the greatest contribution of CL lies in bringing together quantitative and qualitative techniques.

1- Quantitative analyses provide accurate insight into macro-level characteristics.
2- Qualitative analyses provide the complementary micro-level perspective.

Corpus Design and Compilation
There is no minimum size for a text collection to be considered a corpus; yet the larger the corpus, the more valuable it is. It is therefore important to know how corpora are designed and compiled, both to evaluate existing corpora and to understand what sorts of analyses they are best suited for.

Types of Corpora
- General corpora: aim to represent language in its broadest sense. They include texts of different types and may include both spoken and written language.
- Specialized corpora: designed with more specific research goals. They may also include both spoken and written language. Examples include corpora of historical texts, fiction texts and newspaper writing.
- Learner corpora: corpora of spoken and written language samples from non-native speakers; the best-known example is the International Corpus of Learner English (ICLE).

Issues in Corpus Design
Reliability of the results: the composition of the corpus should reflect the anticipated research goals. (A corpus intended for exploring lexical questions needs to be very large to allow for accurate representation of a large number of words and their different senses.)

Corpus Compilation
When creating a corpus, data collection involves obtaining or creating electronic versions of the target texts, then storing and organizing them. For a written corpus, this can mean using a scanner and optical character recognition (OCR) software to convert documents into electronic text files. Otherwise, materials for written texts are mostly keyboarded manually.

Data collection for a spoken corpus is lengthy and costly. It requires deciding on a transcription system (most spoken corpora use an orthographic transcription system that does not capture prosodic details).
- Choosing the transcription system means deciding how the interactional characteristics of the speech will be represented in the transcripts.

What can a corpus tell us?
Word counts and basic corpus tools: the information a corpus can provide ranges from simple word lists to catalogues of complex grammatical structures. It also covers interactive analysis, linguistic and non-linguistic association patterns, the behaviour of individual linguistic features, the identification of features that characterize particular registers, and frequency-of-occurrence information.
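As an illustration of the simplest kind of output mentioned above, a word list with frequencies, here is a minimal Python sketch. The two-sentence sample corpus, the tokenizing rule and the function names are assumptions made for this example; real corpus tools apply more careful tokenization.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(texts):
    """Count how often each word form occurs across all texts in the corpus."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts

# Toy corpus invented for illustration; a real corpus would be read from files.
sample_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]

freqs = word_frequencies(sample_corpus)

# Print the word list, most frequent word first.
for word, count in freqs.most_common():
    print(word, count)
```

Sorting by frequency rather than alphabetically is a design choice: it puts the most common forms at the top, which is usually where an analyst starts reading.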

Concordancing packages provide additional information about lexical co-occurrence patterns. Once a search word or phrase is selected, the programme can provide a concordance listing showing each occurrence of the target word in context. (This display is known as 'Key Word in Context', or KWIC.)
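A Key Word in Context display can be sketched in a few lines of Python. This is only a rough illustration of the idea, not how any particular concordancing package works; the sample sentence, the function name kwic and the context width of three words are all assumptions.

```python
import re

def kwic(text, target, width=3):
    """Return (left context, key word, right context) for each hit of target.

    width is the number of words shown on each side of the key word.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Sentence invented for illustration.
sample_text = "You can open the can if you can find a can opener."

hits = kwic(sample_text, "can", width=3)

# Right-align the left context so the key words line up in a column.
for left, node, right in hits:
    print(f"{left:>20}  [{node}]  {right}")
```

Aligning the key word in a central column is what lets a reader scan down the listing and notice recurring words to its left and right.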

Working with tagged texts: to carry out more sophisticated types of corpus analysis, it is often necessary to have a tagged corpus. When a corpus is tagged, each word in the corpus is given a grammatical label. This can be a complex process: the word 'can', for example, may be a modal, a verb or a noun.
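To show why tags matter for a word like 'can', the sketch below reads a small hand-tagged sentence and counts the labels attached to that word. The word/TAG notation, the tag names (MODAL, VERB, NOUN and so on) and the sentence itself are assumptions for illustration and do not follow any particular tagset.

```python
from collections import Counter

# Hand-tagged toy sentence in word/TAG form; the labels are illustrative only.
tagged_text = "You/PRON can/MODAL can/VERB the/DET fruit/NOUN in/PREP a/DET can/NOUN"

def parse_tagged(text):
    """Split word/TAG items into (lowercased word, tag) pairs."""
    pairs = []
    for item in text.split():
        word, tag = item.rsplit("/", 1)
        pairs.append((word.lower(), tag))
    return pairs

def tags_for(pairs, target):
    """Count the grammatical labels assigned to one word form."""
    return Counter(tag for word, tag in pairs if word == target)

pairs = parse_tagged(tagged_text)
can_tags = tags_for(pairs, "can")

# Show each label the word form received and how often.
for tag, count in can_tags.items():
    print("can", tag, count)
```

An untagged search for 'can' would return all three uses together; with tags, a search can be restricted to, say, the modal only.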

Overview of Different Types of Corpus Studies
Corpus studies have addressed a number of issues:
- Language change, which intrigues researchers, teachers and language learners. Interest in historical change has led to specialized corpora that give insights into how the language has developed.
- Differences and similarities across national and regional varieties of a language.
- Differences between spoken and written language.
- Descriptions of sub-registers, which provide valuable resources for both teachers and learners.

Corpora and Language Teaching
Corpora help in deciding which language features and structures are important and how various features and structures are used.

Bringing Corpora into the Language Classroom
Teachers can shape instruction using corpus-based information, for example by consulting corpus-based studies to gain information about the features they intend to teach, and by directing instruction towards the grammatical structures learners are likely to encounter for particular language functions.