Learner Corpus Research Conference, Bergen, Norway, September 27, 2013. Discriminating CEFR levels in Greek L2: a corpus-based study of young learners’ written narratives. Giagkou Maria, Kantzou Vicky, Stamouli Spyridoula, Tzevelekou Maria
Background Specification of CEFR functional descriptions: criterial features (specific lexical and grammatical features used differently by L2 learners at different proficiency levels). Cambridge English Profile Programme (Hawkins & Filipović 2012, Hawkins & Buttery 2010). CEFR proficiency levels and L2 acquisition of various linguistic features: SLATE network (Second Language Acquisition and Testing in Europe). Different languages: Dutch, Italian and Spanish (Kuiken, Vedder & Gilabert 2010), Finnish (Alanen, Huhta & Tarnanen 2010, Martin et al. 2010), French (Forsberg & Bartning 2010, Prodeau et al. 2012), Norwegian (Carlsen 2010). Mostly adult learners; smaller number of studies on young L2 learners (Pallotti 2010).
Research objectives Identification of criterial properties for Greek as L2 -> specification of the CEFR proficiency levels with respect to the linguistic features of Greek -> help educators and researchers discriminate the language production of each level from the production of adjacent levels. Focus: young L2 learners of Greek enrolled in Greek state schools (immigrants and indigenous minorities): 18% of students nowadays are immigrants or repatriated Greeks learning Greek as L2 (Gropas & Triandafyllidou 2011); written narratives, because children are familiar with them from an early age and because narratives have been widely investigated in L1 and L2 acquisition. Investigation of the developing narrative ability at micro- and macro-level, as indicated by: narrative length, clause subordination, discourse markers, modifiers, grammatical accuracy, lexical density. Previous research findings in Greek L1 and L2 acquisition (Kantzou 2010, 2012, Stamouli 2010, Varlokosta & Triantafillidou 2003), and in evaluation of Greek text difficulty (Giagkou 2012).
Elicitation task and level allocation Two writing tasks performed by ca immigrant and repatriated children (October 2011 to February 2012): a narrative based on the Cat Story picture series (Hickmann 2003), and a letter or diary entry. Two evaluators placed each student at a CEFR level on the basis of the two written productions. Rating was based on CEFR descriptors, more specifically on the Overall Written Production, Creative Writing, and Lexical, Grammatical and Orthographic Competence scales.
Corpus Narratives based on the Cat Story picture series (letters and diary entries excluded from further analysis). Only narratives placed at the same level by both evaluators are included. Corpus of 150 scripts (9742 tokens). Levels A2, B1 and B2 are represented in the corpus, with 50 scripts in each level.
Tokens and clauses per script, mean (std); the integer part of most means is missing:
A2: 50 scripts; tokens ,68 (14,34); clauses ,22 (3,19), min 4, max 18
B1: 50 scripts; tokens ,86 (13,2); clauses ,08 (3,2), min 8, max 22
B2: 50 scripts; tokens ,30 (22,19); clauses ,84 (4,11), min 9, max 33
Total: 150 scripts; tokens 64,95 (22,37); clauses ,38 (4,43), min 4, max 33
Sample 150 primary school pupils: 83 boys, 67 girls; grades 3 to 6 (aged 8-14); different linguistic backgrounds, mainly Albanian (49%) and Russian (15%); residing in geographically diverse regions of Greece.
Transcription and annotation Manual transcription: a) a version preserving the learner’s spelling, and b) a corrected version. Clause separation: a clause expresses a single situation and has one predicate (Berman and Slobin 1994). Annotation: type of clause; clitics within the verb frame; adjectives and adverbs; discourse markers.
Transcription and annotation Type of clause: independent; dependent (relative clauses, complement clauses, clauses of purpose, clauses of cause, clauses of time); center-embedding. Example: mia mera // mia γata citaksa kala kala ta mikra pulacia [pu i mitera iχe pai] [na vri trofi] (One day // a cat looked at the little birdies [that the mother had left] [to find food])
Transcription and annotation Clitics Clitics within the verb frame Appropriate use: ce o scilos tis δaγose tin ura tis (A2) and the dog bit its (=the cat’s) tail Inappropriate use: ce i γata arχize na treχi ce o scilos ton ciniγuse (B1) and the cat started running and the dog was chasing him* (inappropriate gender marking)
Transcription and annotation Adjectives and adverbs Adjectives Descriptive: itan ena mikro spitaci pu iχe mikra pulacia (A2) there was a little house that had little birdies Evaluative: i kakurγa γata pinuse (B2) the wicked cat was hungry Adverbs Descriptive: i γata skarfani apano sto δendro (A2) the cat was climbing up the tree Evaluative: ta pulacia citusan ti γata paraxena (B2) the birdies were looking at the cat strangely
Transcription and annotation Discourse markers Additive Temporal Contrastive Inferential Other
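The annotation layers listed on these slides could feed per-script metrics along the following lines. This is a minimal sketch, not the authors' annotation scheme: the `Clause` fields, the `script_metrics` function and the toy script are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Clause:
    """One annotated clause (hypothetical field names, for illustration only)."""
    tokens: int                       # number of tokens in the clause
    dependent: bool = False           # relative, complement, purpose, cause or time clause
    embedded: bool = False            # center-embedded inside another clause
    clitics_total: int = 0            # clitics within the verb frame
    clitics_correct: int = 0          # clitics with appropriate agreement
    discourse_markers: List[str] = field(default_factory=list)  # e.g. ["additive"]


def pct(part: float, whole: float) -> Optional[float]:
    """Percentage, or None when the denominator is zero (metric undefined)."""
    return 100.0 * part / whole if whole else None


def script_metrics(clauses: List[Clause]) -> dict:
    """Frequency-based metrics for one script."""
    n_clauses = len(clauses)
    n_tokens = sum(c.tokens for c in clauses)
    n_markers = sum(len(c.discourse_markers) for c in clauses)
    return {
        "tokens": n_tokens,
        "clauses": n_clauses,
        "pct_dependent": pct(sum(c.dependent for c in clauses), n_clauses),
        "pct_embedded": pct(sum(c.embedded for c in clauses), n_clauses),
        "pct_correct_clitics": pct(
            sum(c.clitics_correct for c in clauses),
            sum(c.clitics_total for c in clauses),
        ),
        "markers_per_clause": n_markers / n_clauses if n_clauses else None,
        "pct_markers_to_tokens": pct(n_markers, n_tokens),
    }


# Toy script of four clauses (invented data).
script = [
    Clause(tokens=6, discourse_markers=["temporal"]),
    Clause(tokens=5, dependent=True, embedded=True),
    Clause(tokens=4, dependent=True, clitics_total=1, clitics_correct=1),
    Clause(tokens=5, discourse_markers=["additive"], clitics_total=1, clitics_correct=0),
]
metrics = script_metrics(script)
```

Returning `None` for an empty denominator keeps a script without clitics from being counted as 0% correct.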
Analysis Annotated linguistic features -> metrics based on frequency of occurrence per level. Means comparison: one-way ANOVA. Post-hoc multiple comparisons between levels A2, B1 and B2: Bonferroni tests.
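As a rough illustration of the analysis named on this slide, the sketch below computes a one-way ANOVA F statistic by hand and applies a Bonferroni threshold to pairwise p-values. It is not the authors' procedure or software; the group values and the p-values are invented, and the function names are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values, alpha=0.05):
    """Flag each comparison whose raw p-value is below alpha / number of comparisons."""
    threshold = alpha / len(p_values)
    return {pair: p < threshold for pair, p in p_values.items()}


# Invented metric values for three tiny groups standing in for A2, B1 and B2.
a2, b1, b2 = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_stat, df_b, df_w = one_way_anova([a2, b1, b2])

# Invented raw p-values for the three pairwise comparisons.
significant = bonferroni({"A2-B1": 0.030, "B1-B2": 0.004, "A2-B2": 0.001})
```

With three pairwise comparisons the Bonferroni threshold is 0.05 / 3, so a raw p-value of 0.030 would not count as significant here even though it is below 0.05.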
Results Narrative length Main effect and all post-hoc comparisons were significant for: number of tokens [F(2, 147)=54.673, p<0.001] and number of clauses [F(2, 147)=44.000, p<0.001]. The longer the narrative, the higher the level.
Results Clause subordination (1/2) Percentage of dependent clauses: main effect is significant [F(2, 147)=40.172, p<0.001] and so are all post-hoc comparisons. A successful discriminator both between A2-B1 and between B1-B2. Zero occurrences are possible in A2 and B1 (though rarer), but at least one dependent clause is expected in B2. A script with no dependent clauses is most likely to be below B2.
Results Clause subordination (2/2) Percentage of the different types of dependent clauses: complement, relative, purpose and causal clauses did not significantly discriminate levels. Only temporal clauses achieved a significant main effect [F(2, 114)=6.109, p=0.003], and only in discriminating A2 from B1 and A2 from B2. In A2 narratives sequential events are not subordinated. Temporal clauses are used from B1 onwards.
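The degrees of freedom in these F statistics follow from the design: with k levels and N scripts, a one-way ANOVA has k - 1 and N - k degrees of freedom. The short sketch below (hypothetical helper names) checks that the full corpus of 150 scripts gives F(2, 147) and works backwards from F(2, 114). The result, 117 scripts, presumably reflects that only scripts containing at least one dependent clause entered this comparison, although the slides do not say so.

```python
def anova_df(n_scripts, n_levels=3):
    """Degrees of freedom (between, within) for a one-way ANOVA."""
    return n_levels - 1, n_scripts - n_levels


def scripts_implied(df_within, n_levels=3):
    """Number of scripts implied by the within-group degrees of freedom."""
    return df_within + n_levels


full_corpus_df = anova_df(150)           # all 150 scripts, three levels
temporal_scripts = scripts_implied(114)  # scripts behind F(2, 114)
```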
Results Center-embedding Percentage of embedded clauses: significant main effect [F(2, 147)=6.417, p=0.001]; post-hoc tests significant for A2-B2. Embedding used by: A2: 3 learners; B1: 9 learners; B2: 29 learners. More than one embedding in the same script indicates a B2 learner.
Results Clitics Percentage of correct clitics out of all clitics: significant main effect [F(2, 120)= , p<0.001] and all post-hoc comparisons. A2: minimum=0%, maximum=100%. B1: more than half of the learners have all their clitics correct. B2: occasional inappropriate uses by only 3 learners. A B2 learner should be expected to use clitics correctly in terms of gender, number, person and case agreement.
Results Discourse markers: general metrics All features were found statistically significant: average number of discourse markers per clause [F(2, 147)=14.141, p<0.001]; percentage of discourse markers to tokens [F(2, 147)=19.958, p<0.001]. Both are successful discriminators of A2-B2 and B1-B2.
Results Discourse markers: type of marker Mean number of the different types of markers per clause, statistically significant for: additive markers (all levels); temporal markers (A2-B2 and B1-B2); contrastive markers (A2-B1 and A2-B2); inferential markers (A2-B2); other markers (B1-B2). Exclusive use of the additive και /ce/ (= and) and the temporal μετά /meta’/ (= then) is expected in A2. All other additive or temporal markers should indicate a learner above A2. B1 learners reduce the use of και and μετά, and they start marking contrast. Inference marking is never encountered in A2. It should be expected from learners at B1 or above.
Results Verb and noun modifiers Not statistically significant: average number of adjectives per clause, percentage of adjectives to tokens, average number of adverbs per clause, and percentage of adverbs to tokens. Main effect statistically significant: percentage of evaluative adjectives to adjectives (B1-B2 and A2-B2); percentage of evaluative adverbs to adverbs (all level pairs). Systematic use of evaluative adjectives and adverbs indicates a learner above level A2, and most likely at level B2.
Results Lexical density Not statistically significant
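The slides do not say how lexical density was computed. A common definition is the proportion of content words (nouns, verbs, adjectives, adverbs) among all tokens, and the sketch below assumes that definition. The tag set, the function name and the hand-assigned tags on the example clause are assumptions for illustration only.

```python
# Assumed set of content-word tags (Universal POS style); not taken from the slides.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def lexical_density(tagged_tokens):
    """Share of content words among all tokens; None for an empty input."""
    if not tagged_tokens:
        return None
    content = sum(1 for _, pos in tagged_tokens if pos in CONTENT_TAGS)
    return content / len(tagged_tokens)


# Example clause from the annotation slides, with illustrative tags.
clause = [
    ("i", "DET"), ("γata", "NOUN"), ("skarfani", "VERB"),
    ("apano", "ADV"), ("sto", "ADP"), ("δendro", "NOUN"),
]
density = lexical_density(clause)  # 4 content words out of 6 tokens
```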
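Results at a glance Level pairs discriminated by each metric, as reported on the preceding results slides:
Narrative length: number of tokens and clauses: A2-B1, B1-B2, A2-B2
Subordination: percentage of dependent clauses: A2-B1, B1-B2, A2-B2
Subordination: percentage of temporal clauses: A2-B1, A2-B2
Subordination: percentage of embedded clauses: A2-B2
Clitics: percentage of correct clitics: A2-B1, B1-B2, A2-B2
Discourse markers: mean number of discourse markers per clause: B1-B2, A2-B2
Discourse markers: percentage of discourse markers to tokens: B1-B2, A2-B2
Discourse markers: percentage of temporal discourse markers: B1-B2, A2-B2
Discourse markers: percentage of contrastive discourse markers: A2-B1, A2-B2
Discourse markers: percentage of additive discourse markers: A2-B1, B1-B2, A2-B2
Discourse markers: percentage of inferential discourse markers: A2-B2
Modifiers: percentage of evaluative adjectives: B1-B2, A2-B2
Modifiers: percentage of evaluative adverbs: A2-B1, B1-B2, A2-B2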
Criterial features at a glance
Subordination: A2: temporal clauses are not expected. B1: systematic use of temporal clauses. B2: at least one dependent clause; embedding is encountered more than once.
Discourse: A2: exclusive use of the additive και and the temporal μετά is expected; no inference. B1: start marking contrast; start marking inference. B2: systematize inference marking.
Grammatical accuracy: B2: clitics used correctly in terms of gender, number and case agreement.
Evaluation: B2: systematic use of evaluative adjectives and adverbs.
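To show how such criterial features might be applied to a single script, here is a deliberately crude sketch that treats three of the tendencies above as hard exclusion rules. This is an illustration only, not a classifier proposed by the authors: the slides describe likelihoods ("most likely", "expected"), and the function name and parameters are hypothetical.

```python
def plausible_levels(dependent_clauses, embeddings, inferential_markers):
    """Return the levels among A2, B1 and B2 not ruled out by three simplified rules."""
    levels = {"A2", "B1", "B2"}
    if dependent_clauses == 0:
        # At least one dependent clause is expected in B2.
        levels.discard("B2")
    if inferential_markers > 0:
        # Inference marking is never encountered in A2.
        levels.discard("A2")
    if embeddings > 1:
        # More than one embedding in the same script indicates a B2 learner.
        levels &= {"B2"}
    return sorted(levels)


no_subordination = plausible_levels(dependent_clauses=0, embeddings=0, inferential_markers=0)
with_inference = plausible_levels(dependent_clauses=2, embeddings=0, inferential_markers=1)
multiple_embedding = plausible_levels(dependent_clauses=3, embeddings=2, inferential_markers=1)
```

A realistic use would weight these cues rather than apply them as strict filters, since every one of them admits exceptions in the corpus.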
Further research… Larger sample of A2-B2 learners and C1-C2 More fine-grained analysis of indices, e.g. temporal clauses denoting simultaneity New indices, e.g. verbal morphology, vocabulary growth Different discourse types and modalities
References
Alanen, Riikka, Huhta, Ari & Tarnanen, Mirja (2010) Designing and assessing L2 writing tasks across CEFR proficiency levels. In Bartning, Martin & Vedder (eds.).
Bartning, Inge, Martin, Maisa & Vedder, Ineke (eds.) (2010) Communicative proficiency and linguistic development: intersections between SLA and language testing research. Eurosla Monographs Series 1. Available at: (date accessed 21/05/2013).
Carlsen, Cecilie (2010) Discourse connectives across CEFR-levels: A corpus based study. In Bartning, Martin & Vedder (eds.).
Forsberg, Fanny & Bartning, Inge (2010) Can linguistic features discriminate between the communicative CEFR-levels? A pilot study of written L2 French. In Bartning, Martin & Vedder (eds.).
Giagkou, Maria (2012) A readability statistical model for pedagogically relevant text retrieval. In Papadopoulou & Recythiadou (eds.), Proceedings of the 32nd Annual Meeting Department of Linguistics, AUTH (pp 65-76). Thessaloniki: Institute of Modern Greek Studies.
Gropas, R. & Triandafyllidou, A. (2011) Greek education policy and the challenge of migration: an ‘intercultural’ view of assimilation. Race Ethnicity and Education 14(3).
Hawkins, John A. & Buttery, Paula (2010) Criterial Features in Learner Corpora: Theory and Illustrations. English Profile Journal 1(1).
Hawkins, John A. & Filipović, Luna (2012) Criterial Features in L2 English: Specifying the Reference Levels of the Common European Framework (English Profile Studies). Cambridge: Cambridge University Press.
Hickmann, Maya (2003) Children’s discourse: Person, space and time across languages. Cambridge: Cambridge University Press.
Kantzou, Vicky (2010) The temporal structure of narrative in the acquisition of Greek as a first and as a second language. PhD Thesis. Athens: National and Kapodistrian University of Athens. [in Greek]
Kantzou, Vicky (2012) The temporal structure of narratives in second language acquisition of Greek. In: Gavriilidou Ζoi, Efthymiou Αggeliki, Thomadaki Εvangelia & Kambakis-Vougiouklis Penelope (eds.) Selected Papers – The 10th International Conference of Greek Linguistics (pp ). Komotini/Greece: Democritus University of Thrace. Available at: (date accessed 21/05/2013).
Kuiken, Folkert, Vedder, Ineke & Gilabert, Roger (2010) Communicative adequacy and linguistic complexity in L2 writing. In Bartning, Martin & Vedder (eds.).
Pallotti, Gabriele (2010) Doing interlanguage analysis in school contexts. In Bartning, Martin & Vedder (eds.).
Prodeau, Mireille, Lopez, Sabine & Véronique, Daniel (2012) Acquisition of French as a Second Language: Do developmental stages correlate with CEFR levels? Journal of Applied Language Studies 6(1): 47–68.
Stamouli, Spyridoula (2010) Narrative development in Greek L1 and child L2. PhD Thesis. Athens: National and Kapodistrian University of Athens. [in Greek]
Varlokosta, Spyridoula & Triantafillidou, Leda (2003) Proficiency Levels in Greek as a Second Language. Athens: Centre for Intercultural Education, University of Athens. [in Greek]
Thank you! Part of this work, data collection and rating, was funded by the educational project “Education of Repatriate and Immigrant Students”, Action 1 “Linguistic and Educational Support for Reception Classes”, Aristotle University of Thessaloniki (National Strategic Reference Framework and the Ministry of Education and Religious Affairs)