
1 EUROCALL 2007 - University of Ulster, 5 - 8 September Developing annotation solutions for online data-driven learning Pascual Pérez-Paredes and Jose María Alcaraz SACODEYL Universidad de Murcia, Spain

2 System Aided Compilation and Open Distribution of European Youth Language 225836-CP-1-2005-1-ES-MINERVA-M

3 Developing annotation solutions for online data-driven learning: 1. Annotation in CL 2. Annotating corpora for the FL classroom 3. Challenges of pedagogical annotation 4. Developing annotation solutions 5. SACODEYL annotator. Domain analysis. Requirements and software specification.

4 1. Annotation in Corpus Linguistics

5 Annotation in Corpus Linguistics: add-on; needs of the research community; annotation = analysis; annotation = processing

6 Why annotate? Annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data.

7 Annotation: can be automatic, semi-automatic or manual; can be performed by one or several annotators or software operators; reflects the differing aims of the meta-information being added to the corpus.

8 Non-polysemic ambiguity: Poesio and Artstein (2005). Interest in L2 speakers’ errors: Abe and Tono (2005).

9 A strong research paradigm rooted in grammatical tagging, including morphological and syntactic information (Garside, Leech and McEnery 1997).

10 2. Annotating corpora for the FL classroom: 2.1 Corpora in the FL classroom

11 Interest in corpora and FLT. Volumes: Sinclair 2004; Braun, Kohn and Mukherjee 2006; Hidalgo, Quereda and Santana 2007. SIG EUROCALL. 1st International Conference on Corpus-Based Approaches to ELT, November 2007.

12 Normalisation is still an issue. Mauranen (2004:99) points out that for a teaching method to become an important innovation, it has to “make its way to the normal classroom where teachers and students can use it as part of their everyday routine, with not too much extra hassle”. Chambers 2007: major obstacles. Braun 2007: secondary education.

13 2. Annotating corpora for the FL classroom: 2.2 Annotating with a view to learning

14 Braun (2007): pedagogically motivated corpora (a) provide a more systematic range of material than individual texts or scattered collections of activities and, if well designed, (b) offer a wider range of idiolects than the average material.

15 Braun (2006) states that thematic annotation, including topic keys and section titles, is particularly useful in the implementation of pedagogically motivated corpora.
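
As a rough illustration of why topic keys and section titles help retrieval, the sketch below filters corpus sections by topic. It is not the SACODEYL data model: the `Section` class, the `sections_by_topic` helper, the topic keys and the second section are hypothetical. Only the section title "02 What we do" and its sentence come from the sample shown in this talk.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Section:
    """A corpus section carrying thematic annotation (hypothetical model)."""
    title: str        # section title, e.g. "02 What we do"
    topics: Set[str]  # topic keys assigned by the annotator
    text: str         # the transcribed text of the section


def sections_by_topic(sections: List[Section], topic: str) -> List[Section]:
    """Return the sections whose topic keys include `topic`, ignoring case."""
    wanted = topic.lower()
    return [s for s in sections if wanted in {t.lower() for t in s.topics}]


# Topic keys and the second section are invented for illustration.
corpus = [
    Section("02 What we do", {"work", "family"},
            "We started with 12 and then 20 caravans, and now we have about 35."),
    Section("03 Hypothetical section", {"hobbies"}, "(placeholder text)"),
]

for section in sections_by_topic(corpus, "Family"):
    print(section.title)
```

A teacher looking for material on a given theme could then work with whole sections rather than isolated concordance lines.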


17 What we do 02 What we do In the 60s, in the late 60s, I had worked in Germany for a while and I decided that I wanted to have my children reared in Ireland. So we came back from Germany, working for the Irish Tourist Board and started this enterprise. It's lovely now with the sunshine, we don't always have it like this, but very often. We started with 12 and then 20 caravans, and now we have about 35. And it's been a basis of what which we can live as a family, raise our children in a nice environment. We work very hard for three months and then have a very relaxed time of it, nine months. And in that time then I took on as a hobby computers, and Mary took on tour-guiding. So we have various different aspects to what we do. The horse caravans is a very intensive work just for those three months, but it's very enjoyable because we mix in the family a quiet nine months where we are very much en famille with the children, you can concentrate on them much more than if we were nine-to-five workers. And then the intensity of the three months means that we can also have our children employed, and learning how to work, learning how to deal with people. So, good mixture, isn't it.

18 The annotators have a pedagogical use of the text in mind when approaching the annotation stage. The tags highlight the relevance of the communicative purpose of texts, that is, the topics and the contents that characterize them.


20 3. Annotation challenges

21 Remember the “Why annotate?” slide: annotation gives corpus users both refined information retrieval capabilities and the means for subsequent treatment of the data. PEDAGOGY

22 Linguistic analysis of interest in FLT. Tsui (2004): corpus-based studies focus on 4 areas of description: 1. Lexical collocation 2. Syntactic patterning 3. Genre analysis 4. Discourse structure and cohesion. Word-based and relying on co-occurrence of grammatical word-class tags.

23 Linguistic analysis of interest in FLT -> Linguistics comes first -> DDL materials, concordances and corpus. Actors: researcher/linguist; end user.

24 Pedagogical analysis (and annotation) of language corpora -> Pedagogy comes first -> Pedagogy-driven DDL. Actors: material developer/teacher/learner; end user.

25 Problem-oriented tagging. Corpus applications in FLT still need to gain a status of their own. CHALLENGES

26 CHALLENGES: TECHNOLOGY, DESIGN, EPISTEMOLOGY

27 Leech (1993) maxims: – it should be possible to remove the annotation from the text; – if desired, it should be possible to extract the annotation by itself; – the annotation should be based on guidelines available to everyone; – it should be made clear how and by whom the annotation was carried out; – it should be based on widely agreed and theory-neutral principles. DESIGN
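
The first two maxims (the annotation can be removed, and it can be extracted on its own) can be illustrated with inline XML markup. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the element and attribute names (`text`, `section`, `title`, `topic`) are assumptions made for the example, not the markup scheme used in the project.

```python
import xml.etree.ElementTree as ET

# A tiny annotated text; tag and attribute names are hypothetical.
ANNOTATED = (
    '<text>'
    '<section title="02 What we do" topic="work">'
    'We started with 12 and then 20 caravans, and now we have about 35.'
    '</section>'
    '</text>'
)


def strip_annotation(xml_string: str) -> str:
    """Maxim 1: drop all markup and return only the raw text."""
    root = ET.fromstring(xml_string)
    return "".join(root.itertext())


def extract_annotation(xml_string: str) -> list:
    """Maxim 2: return the markup alone as (tag, attributes) pairs."""
    root = ET.fromstring(xml_string)
    return [(el.tag, dict(el.attrib)) for el in root.iter()]


print(strip_annotation(ANNOTATED))
print(extract_annotation(ANNOTATED))
```

Because the raw text and the annotation can each be recovered separately, the same corpus can serve users who want the markup and users who do not.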

28 Presuppositions and foundations: antecedent implications in the literature. Annotation oriented towards pedagogical uses. EPISTEMOLOGY

29 Mukherjee (2006): corpora in language pedagogy for (a) dictionaries and material, (b) database and (c) representative samples of learner language. EPISTEMOLOGY

30 Meunier (2002): methodological influence: use of classroom concordancing and an inductive approach to learning, leading to the “rehabilitation” of grammar (p. 135). EPISTEMOLOGY

31 Bernardini (2000): inductive and deductive learning, probabilistic notion of language and learning pedagogy that resolves the attention to form/meaning dichotomy. EPISTEMOLOGY

32 Bernardini (2000): learners as either researchers or travellers. EPISTEMOLOGY

33 Bernardini (2004): potential of corpora as a linguistic aid: favour descriptive insights and discovery learning. EPISTEMOLOGY

34 Pérez-Paredes (2003, 2004): integrative paradigm of CL in FLT. EPISTEMOLOGY

35 TECHNOLOGY: user-friendly (non-computational linguists); multilingual support; standard-compliant (reusability and valorisation)

36 4. Developing Annotation Solutions

37 Developing Annotation Solutions. From a software engineering perspective, development can be considered as the following process: From Challenges -> To Requirements; From Requirements -> To Solutions.

38 Input Requirements. Input = User Requirements. Changing Approach = Changing Requirements. Identifying New Requirements: Five Perspectives.

39 Actors & Context. Linguistic Engineering vs Pedagogical Engineering. Linguistic Engineering: Researching; Powerful Tool; Research Oriented; Extensible & Modular; Specific Domain; Efficient; Complexity; Ad-Hoc Solutions; Mandatory. Pedagogical Engineering: Teaching; Pedagogic Tool; Learning Oriented; Friendly; General Domain; Practical; Simplicity; Organizational; Optional.

40 Data. Grammatical vs Pedagogical. Linguistic Engineering: large amount of data (representative corpora); grammatical annotation; oriented to retrieve statistical information. Learning: reduced set of data; pedagogical annotation; oriented to retrieve learning information (hierarchical structures & selective information).

41 Epistemological & Empirical: multidisciplinary support; multilingual support; multi-corpus management; multi-purpose support; based on standards

42 Choosing Software Life Cycle: Spiral Approach. Why?

43 5. SACODEYL Annotator

44 Output. SACODEYL Annotator characteristics: pedagogical motivation; teaching oriented; friendly interface; multi-language (UTF); standardization (TEI); multi-purpose
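
To make the "multi-language (UTF)" and "standardization (TEI)" points concrete, here is a small sketch that builds a TEI-style section and serialises it as UTF-8. It does not reproduce the SACODEYL Annotator's actual output: `build_section` and `to_utf8_bytes` are hypothetical helpers, the Spanish sample text is invented, and the TEI header and namespace are omitted, so the result is not a complete, valid TEI document.

```python
import xml.etree.ElementTree as ET

# The standard xml:lang attribute, written in ElementTree's {namespace}name form.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_section(title: str, paragraphs: list, lang: str) -> ET.Element:
    """Build a TEI-style tree: TEI > text > body > div (head + p elements)."""
    tei = ET.Element("TEI")
    body = ET.SubElement(ET.SubElement(tei, "text"), "body")
    div = ET.SubElement(body, "div", {XML_LANG: lang})
    ET.SubElement(div, "head").text = title
    for paragraph in paragraphs:
        ET.SubElement(div, "p").text = paragraph
    return tei


def to_utf8_bytes(root: ET.Element) -> bytes:
    """Serialise the tree as UTF-8 bytes, including the XML declaration."""
    return ET.tostring(root, encoding="utf-8")


# Invented Spanish sample, used only to exercise non-ASCII characters.
doc = build_section("¿Qué hacemos?", ["Empezamos con doce caravanas."], "es")
data = to_utf8_bytes(doc)
print(data.decode("utf-8"))
```

Keeping the language code on the section and the encoding explicit is one simple way to let a single tool handle corpora in several languages.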

45 Developing annotation solutions for online data-driven learning. Contact information: Pascual Pérez-Paredes pascualf@um.es; Jose María Alcaraz jmalcaraz@gmail.com. Universidad de Murcia, Spain
