EUROCALL University of Ulster, September Developing annotation solutions for online data-driven learning Pascual Pérez-Paredes and Jose María Alcaraz SACODEYL Universidad de Murcia, Spain
System Aided Compilation and Open Distribution of European Youth Language CP ES-MINERVA-M
Developing annotation solutions for online data-driven learning
1. Annotation in CL
2. Annotating corpora for the FL classroom
3. Challenges of pedagogical annotation
4. Developing annotation solutions
5. SACODEYL annotator
Domain analysis
Requirements and software specification
1. Annotation in Corpus Linguistics
Annotation in Corpus Linguistics
– Add-on
– Needs of the research community
– Annotation = analysis
– Annotation = processing

Why annotate?
Annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data.

Annotation
– Can be automatic, semi-automatic or manual
– Can be performed by one or several annotators or software operators
– Reflects the differing aims of the meta-information being added to the corpus

Non-polysemic ambiguity: Poesio and Artstein (2005)
Interest in L2 speakers’ errors: Abe and Tono (2005)

Strong research paradigm rooted in grammatical tagging, including morphological and syntactic information (Garside, Leech and McEnery 1997).
2 Annotating corpora for the FL classroom
2.1 Corpora in the FL classroom

Interest in corpora and FLT:
– Volumes: Sinclair 2004; Braun, Kohn and Mukherjee 2006; Hidalgo, Quereda and Santana 2007
– SIG EUROCALL
– 1st International Conference on Corpus-Based Approaches to ELT, November 2007

Normalisation is still an issue:
– Mauranen (2004:99) points out that for a teaching method to become an important innovation, it has to “make its way to the normal classroom where teachers and students can use it as part of their everyday routine, with not too much extra hassle”.
– Chambers 2007: major obstacles
– Braun 2007: secondary education
2 Annotating corpora for the FL classroom
2.2 Annotating with a view to learning

Braun (2007): pedagogically motivated corpora (a) provide a more systematic range of material than individual texts or scattered collections of activities and, if well-designed, (b) offer a wider range of idiolects than the average material.

Braun (2006) states that thematic annotation, including topic keys and section titles, is particularly useful in the implementation of pedagogically motivated corpora.
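As a rough illustration of why topic keys and section titles help retrieval, the following Python sketch stores each corpus section with a title and a set of topic keys, then looks sections up by topic. The class and function names, the topic keys and the second section are all hypothetical and are not taken from the SACODEYL corpus or its annotation scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One corpus section with its thematic annotation (hypothetical model)."""
    title: str
    topics: set[str] = field(default_factory=set)
    text: str = ""


def sections_about(corpus: list[Section], topic: str) -> list[str]:
    """Return the titles of all sections annotated with the given topic key."""
    return [section.title for section in corpus if topic in section.topics]


# Illustrative data only: the topic keys are assumptions made for this sketch.
corpus = [
    Section("What we do", {"work", "family"},
            "We started with 12 and then 20 caravans, and now we have about 35."),
    Section("Placeholder section", {"free time"}, "Placeholder text."),
]

print(sections_about(corpus, "family"))
```

A teacher looking for material on a given theme would query by topic key rather than by word form, which is the kind of retrieval that word-class tagging alone does not support.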
02 What we do
In the 60s, in the late 60s, I had worked in Germany for a while and I decided that I wanted to have my children reared in Ireland. So we came back from Germany, working for the Irish Tourist Board and started this enterprise. It's lovely now with the sunshine, we don't always have it like this, but very often. We started with 12 and then 20 caravans, and now we have about 35. And it's been a basis of what which we can live as a family, raise our children in a nice environment. We work very hard for three months and then have a very relaxed time of it, nine months. And in that time then I took on as a hobby computers, and Mary took on tour-guiding. So we have various different aspects to what we do. The horse caravans is a very intensive work just for those three months, but it's very enjoyable because we mix in the family a quiet nine months where we are very much en famille with the children, you can concentrate on them much more than if we were nine-to-five workers. And then the intensity of the three months means that we can also have our children employed, and learning how to work, learning how to deal with people. So, good mixture, isn't it.
The annotators have a pedagogical use of the text in mind when approaching the annotation stage. The tags, and highlight the relevance of the communicative purpose of texts, that is, the topics and the contents that characterize them.
3 Annotation challenges
Remember the “Why annotate?” slide:
Annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data.
PEDAGOGY

Linguistic analysis of interest in FLT
Tsui (2004): corpus-based studies focus on 4 areas of description:
1. Lexical collocation
2. Syntactic patterning
3. Genre analysis
4. Discourse structure and cohesion
Word-based and relying on co-occurrence of grammatical word-class tags

Linguistic analysis of interest in FLT > Linguistics comes first > DDL materials
Researcher/Linguist
End user
Concordances and corpus

Pedagogical analysis (and annotation) of language corpora > Pedagogy comes first > Pedagogy-driven DDL
Material developer/Teacher/Learner
End user

CHALLENGES
Problem-oriented tagging
Corpus applications in FLT still need to gain a status of their own

CHALLENGES
TECHNOLOGY
DESIGN
EPISTEMOLOGY
DESIGN
Leech (1993) maxims:
– it should be possible to remove the annotation from the text;
– if desired, the annotation could be extracted by itself;
– it should be based on guidelines available to everyone;
– it should be made clear how and by whom the annotation was carried out;
– it should be based on widely agreed and theory-neutral principles
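The first two maxims can be illustrated with a minimal Python sketch, assuming the annotation is stored as inline XML. The element and attribute names (section, s, title, topics) and both function names are hypothetical and do not reproduce the SACODEYL scheme. The sample sentence is taken from the "What we do" transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline annotation: element and attribute names are
# illustrative assumptions, not the SACODEYL annotation scheme.
ANNOTATED = (
    '<section title="What we do" topics="work family">'
    '<s>We started with 12 and then 20 caravans, </s>'
    '<s>and now we have about 35.</s>'
    '</section>'
)


def remove_annotation(annotated: str) -> str:
    """Maxim 1: strip all markup and recover the raw text."""
    root = ET.fromstring(annotated)
    return "".join(root.itertext())


def extract_annotation(annotated: str) -> list[tuple[str, dict[str, str]]]:
    """Maxim 2: return the annotation alone, as (element name, attributes)
    pairs in document order, without the text."""
    root = ET.fromstring(annotated)
    return [(element.tag, dict(element.attrib)) for element in root.iter()]


print(remove_annotation(ANNOTATED))
print(extract_annotation(ANNOTATED))
```

Keeping both operations lossless is what lets the same corpus serve users who want the plain text and users who want only the pedagogical meta-information.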
EPISTEMOLOGY
Presuppositions and foundations: antecedent implications in the literature
Annotation oriented towards pedagogical uses
– Mukherjee (2006): corpora in language pedagogy for (a) dictionaries and material, (b) database and (c) representative samples of learner language.
– Meunier (2002): methodological influence: use of classroom concordancing and an inductive approach to learning, leading to the “rehabilitation” of grammar (p. 135)
– Bernardini (2000): inductive and deductive learning, a probabilistic notion of language, and a learning pedagogy that resolves the attention to form/meaning dichotomy
– Bernardini (2000): learners as either researchers or travellers
– Bernardini (2004): potential of corpora as a linguistic aid: favour descriptive insights and discovery learning
– Pérez-Paredes (2003, 2004): integrative paradigm of CL in FLT

TECHNOLOGY
– User-friendly: non-computational linguists
– Multilingual support
– Standard-compliant: reusability and valorisation
4. Developing Annotation Solutions
Developing Annotation Solutions
From a software engineering perspective, development can be considered as the following process:
– From Challenges To Requirements
– From Requirements To Solutions

Input: Requirements
Input = User Requirements
Changing Approach = Changing Requirements
Identifying New Requirements
– Five Perspectives

Actors & Context. Linguistic Engineering vs Pedagogical Engineering
Linguistic Engineering | Pedagogical Engineering
Researching | Teaching
Powerful Tool | Pedagogic Tool
Research Oriented | Learning Oriented
Extensible & Modular | Friendly
Specific Domain | General Domain
Efficient | Practical
Complexity | Simplicity
Ad-Hoc Solutions | Organizational
Mandatory | Optional

Data. Grammatical vs Pedagogical
Linguistic Engineering | Learning
Large amount of data (representative corpora) | Reduced set of data
Grammatical annotation | Pedagogical annotation
Oriented to retrieve statistical information | Oriented to retrieve learning information (hierarchical structures & selective information)

Epistemological & Empirical
– Multi-Disciplinary Support
– Multi-Lingual Support
– Multi-Corpus Management
– Multi-Purpose Support
– Based on Standards

Choosing Software Life Cycle
Spiral Approach
Why?
5 SACODEYL Annotator
Output. SACODEYL Annotator
SACODEYL Annotator characteristics:
– Pedagogical Motivation
– Teaching Oriented
– Friendly Interface
– Multi-Language (UTF)
– Standardization (TEI)
– Multi-Purpose
Developing annotation solutions for online data-driven learning
Contact information
Pascual Pérez-Paredes
Jose María Alcaraz
Universidad de Murcia, Spain