EUROCALL 2007 - University of Ulster, 5 - 8 September. Developing annotation solutions for online data-driven learning. Pascual Pérez-Paredes and Jose María Alcaraz.

Developing annotation solutions for online data-driven learning
Pascual Pérez-Paredes and Jose María Alcaraz
SACODEYL
Universidad de Murcia, Spain

System Aided Compilation and Open Distribution of European Youth Language
CP ES-MINERVA-M

Developing annotation solutions for online data-driven learning
1. Annotation in CL
2. Annotating corpora for the FL classroom
3. Challenges of pedagogical annotation
4. Developing annotation solutions
5. SACODEYL annotator
– Domain analysis
– Requirements and software specification

1. Annotation in Corpus Linguistics

Annotation in Corpus Linguistics
– Add-on
– Needs of the research community
– Annotation = analysis
– Annotation = processing

Why annotate?
Annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data.

Annotation
– Can be automatic, semi-automatic or manual
– Can be performed by one or several annotators or software operators
– Reflects the ultimate aim of the meta-information being added to the corpus

– Non-polysemic ambiguity: Poesio and Artstein (2005)
– Interest in L2 speakers’ errors: Abe and Tono (2005)

Strong research paradigm rooted in grammatical tagging, including morphological and syntactic information (Garside, Leech and McEnery 1997).

2. Annotating corpora for the FL classroom
2.1 Corpora in the FL classroom

Interest in corpora and FLT:
– Volumes: Sinclair 2004; Braun, Kohn and Mukherjee 2006; Hidalgo, Quereda and Santana 2007
– SIG EUROCALL
– 1st International Conference on Corpus-Based Approaches to ELT, November 2007

Normalisation is still an issue:
– Mauranen (2004:99) points out that for a teaching method to become an important innovation, it has to “make its way to the normal classroom where teachers and students can use it as part of their everyday routine, with not too much extra hassle”.
– Chambers 2007: major obstacles
– Braun 2007: secondary education

2. Annotating corpora for the FL classroom
2.2 Annotating with a view to learning

Braun (2007): pedagogically motivated corpora (a) provide a more systematic range of material than individual texts or scattered collections of activities and, if well-designed, (b) offer a wider range of idiolects than the average material.

Braun (2006) states that thematic annotation, including topic keys and section titles, is particularly useful in the implementation of pedagogically motivated corpora.

What we do

02 What we do

In the 60s, in the late 60s, I had worked in Germany for a while and I decided that I wanted to have my children reared in Ireland. So we came back from Germany, working for the Irish Tourist Board and started this enterprise. It's lovely now with the sunshine, we don't always have it like this, but very often. We started with 12 and then 20 caravans, and now we have about 35. And it's been a basis of what which we can live as a family, raise our children in a nice environment. We work very hard for three months and then have a very relaxed time of it, nine months. And in that time then I took on as a hobby computers, and Mary took on tour-guiding. So we have various different aspects to what we do. The horse caravans is a very intensive work just for those three months, but it's very enjoyable because we mix in the family a quiet nine months where we are very much en famille with the children, you can concentrate on them much more than if we were nine-to-five workers. And then the intensity of the three months means that we can also have our children employed, and learning how to work, learning how to deal with people. So, good mixture, isn't it.

The annotators have a pedagogical use of the text in mind when approaching the annotation stage. The tags highlight the relevance of the communicative purpose of texts, that is, the topics and the contents that characterize them.
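A minimal Python sketch of how thematic annotation of this kind could support retrieval for teaching purposes. The `Section` structure, the topic keys and the second section title are hypothetical, chosen only for illustration; they are not the SACODEYL tag set. The quoted text comes from the interview sample above.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One thematically annotated section of a transcript (hypothetical structure)."""
    title: str
    topics: set = field(default_factory=set)  # topic keys assigned by the annotator
    text: str = ""


def find_by_topic(sections, topic):
    """Return the sections whose topic keys include `topic` (case-insensitive)."""
    wanted = topic.lower()
    return [s for s in sections if wanted in {t.lower() for t in s.topics}]


# Illustrative data: the topic keys and the title "Hobbies" are assumptions.
sections = [
    Section(
        "What we do",
        {"work", "family"},
        "We started with 12 and then 20 caravans, and now we have about 35.",
    ),
    Section(
        "Hobbies",
        {"leisure"},
        "I took on as a hobby computers, and Mary took on tour-guiding.",
    ),
]

matches = find_by_topic(sections, "Family")
print([s.title for s in matches])
```

A teacher looking for material on a given topic would query by topic key rather than by word form, which is the point of annotating with a pedagogical use in mind.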

3. Annotation challenges

Remember the "Why annotate?" slide: annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data.
PEDAGOGY

Linguistic analysis of interest in FLT
Tsui (2004): corpus-based studies focus on 4 areas of description:
1. Lexical collocation
2. Syntactic patterning
3. Genre analysis
4. Discourse structure and cohesion
Word based and relying on co-occurrence of grammatical word-class tags

Linguistic analysis of interest in FLT
> Linguistics comes first
> DDL materials
– Researcher/Linguist
– End user
– Concordances and corpus

Pedagogical analysis (and annotation) of language corpora
> Pedagogy comes first
> Pedagogy-driven DDL
– Material developer/Teacher/Learner
– End user

CHALLENGES
– Problem-oriented tagging
– Corpus applications in FLT still need to gain a status of their own

CHALLENGES
– TECHNOLOGY
– DESIGN
– EPISTEMOLOGY

DESIGN
Leech (1993) maxims:
– it should be possible to remove the annotation from the text;
– if desired, the annotation could be extracted on its own;
– the annotation should be based on guidelines accessible to everyone;
– it should be made clear how and by whom the annotation was carried out;
– it should be based on widely agreed and theory-neutral principles
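The first two maxims can be illustrated with a short Python sketch using the standard library's `xml.etree.ElementTree`. The element and attribute names (`section`, `num`, `title`, `topic`) are hypothetical and only serve the example; the sentence is adapted from the interview sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline annotation; the tag names are illustrative assumptions.
annotated = (
    '<section title="What we do" topic="work">'
    "We started with <num>12</num> and then <num>20</num> caravans."
    "</section>"
)


def strip_annotation(xml_string):
    """Recover the raw text by discarding every tag and attribute."""
    root = ET.fromstring(xml_string)
    return "".join(root.itertext())


def extract_annotation(xml_string):
    """Collect the annotation on its own as (tag, attributes, annotated text) triples."""
    root = ET.fromstring(xml_string)
    return [(el.tag, dict(el.attrib), "".join(el.itertext())) for el in root.iter()]


raw = strip_annotation(annotated)
layers = extract_annotation(annotated)
print(raw)
for tag, attributes, text in layers:
    print(tag, attributes, text)
```

Because the annotation is well-formed markup, the raw text and the annotation layer can each be recovered without loss, which is what the maxims ask of a design.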

EPISTEMOLOGY
– Presuppositions and foundations: antecedent implications in the literature
– Annotation oriented towards pedagogical uses

EPISTEMOLOGY
Mukherjee (2006): corpora in language pedagogy for (a) dictionaries and material, (b) database and (c) representative samples of learner language.

EPISTEMOLOGY
Meunier (2002): methodological influence → use of classroom concordancing and an inductive approach to learning, leading to a “rehabilitation” of grammar (p. 135)

EPISTEMOLOGY
Bernardini (2000): inductive and deductive learning, a probabilistic notion of language, and a learning pedagogy that resolves the attention to form/meaning dichotomy

EPISTEMOLOGY
Bernardini (2000): learners as either researchers or travellers

EPISTEMOLOGY
Bernardini (2004): potential of corpora as a linguistic aid: favour descriptive insights and discovery learning

EPISTEMOLOGY
Pérez-Paredes (2003, 2004): integrative paradigm of CL in FLT

TECHNOLOGY
– User-friendly: non-computational linguists
– Multilingual support
– Standard-compliant: reusability and valorisation

4. Developing Annotation Solutions

Developing Annotation Solutions
From a software engineering perspective, development can be considered as the following process:
– From Challenges → To Requirements
– From Requirements → To Solutions

Input. Requirements
– Input = User Requirements
– Changing Approach = Changing Requirements
– Identifying New Requirements: Five Perspectives

Actors & Context. Linguistic Engineering vs Pedagogical Engineering
– Researching vs Teaching
– Powerful Tool vs Pedagogic Tool
– Research Oriented vs Learning Oriented
– Extensible & Modular vs Friendly
– Specific Domain vs General Domain
– Efficient vs Practical
– Complexity vs Simplicity
– Ad-Hoc Solutions vs Organizational
– Mandatory vs Optional

Data. Grammatical vs Pedagogical
Linguistic Engineering:
– Large amount of data (representative corpora)
– Grammatical annotation
– Oriented to retrieving statistical information
Learning:
– Reduced set of data
– Pedagogical annotation
– Oriented to retrieving learning information (hierarchical structures and selective information)

Epistemological & Empirical
– Multi-disciplinary support
– Multi-lingual support
– Multi-corpus management
– Multi-purpose support
– Based on standards

Choosing a Software Life Cycle
– Spiral approach
– Why?

5. SACODEYL Annotator

Output. SACODEYL Annotator
SACODEYL Annotator characteristics:
– Pedagogical motivation
– Teaching oriented
– Friendly interface
– Multi-language (UTF)
– Standardization (TEI)
– Multi-purpose
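As a rough illustration of the multi-language and standardization points, the Python sketch below builds a small TEI-style fragment and round-trips it through UTF-8. The elements `div`, `head` and `p` and the attributes `type` and `subtype` exist in TEI, but the way they are combined here is an assumption and not the SACODEYL Annotator's actual schema; the Spanish sample text is invented for the example, and the fragment is not a schema-valid TEI document.

```python
import xml.etree.ElementTree as ET


def build_section(title, topic, text):
    """Build a TEI-style <div> holding a section title, a topic key and the text."""
    div = ET.Element("div", {"type": "section", "subtype": topic})
    ET.SubElement(div, "head").text = title
    ET.SubElement(div, "p").text = text
    return div


# Invented Spanish sample, used only to exercise non-ASCII characters.
div = build_section(
    "Lo que hacemos",
    "trabajo",
    "Empezamos con 12 caravanas y ahora tenemos unas 35. ¿Qué más?",
)

# Serialise to UTF-8 bytes, then parse the bytes back to check nothing is lost.
data = ET.tostring(div, encoding="utf-8")
restored = ET.fromstring(data)
print(restored.find("head").text)
print(restored.find("p").text)
```

Storing the annotation as UTF-8 encoded XML keeps accented and other non-ASCII characters intact across languages, and using a shared vocabulary of element names is what makes the annotated corpus reusable by other tools.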

Developing annotation solutions for online data-driven learning
Contact information
Pascual Pérez-Paredes
Jose María Alcaraz
Universidad de Murcia, Spain