LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions Ciara R. Wigham & Thierry Chanier Clermont Université.


LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions
Ciara R. Wigham & Thierry Chanier, Clermont Université
4th WorldCALL Conference, July 2013, Glasgow 1

Simuligne (2001): UK-FR; fre
Copéas (2005): UK-FR; eng
Tridem ( ): UK-FR-USA; eng, fre
Ecofralin (2008): CO-FR; fre, spa
VMT-teamC (2006): UK-USA-SG; math
INFRAL (2009): DE-FR; deu, fra
FAVI ( ): FR; fra
ARCHI21 (2011): FR; eng, fra
SLIC (2013): USA-FR; fra 2

Data validity & reliability in CALL research? Problem in Social Sciences and CALL: ▫visibility, accessibility and validity of research data ▫data representative / anecdotal? ▫no access to data when reading a publication ▫links between data and publications 3

CALL data from online learning situations CALL data is often: ▫not contextualised – pedagogical & technological situations (Kern et al., 2004) ▫tangled in specific software using proprietary formats Replication for interaction analysis in online learning near impossible: ▫variables that are difficult to control ▫replication does not imply that phenomenon previously observed will reoccur (Reffay et al., 2012) 4

Mulce project & LETEC Multimodal Corpora Exchange 5

Research data quality: Mulce project
Interoperability: ▫ Structured and coherent data sets => analyses can be completed by researchers who did not participate in the course
Sustainability: ▫ Independent from online platforms ▫ Stored in independent formalisms
Open access to research data & appropriate licences
Accessibility: ▫ Finding the research data through standard metadata: OLAC (Open Language Archives Community) 6

Learner Corpora / LETEC Learner Corpora (see Granger, 2002; Meunier et al., 2011) ▫SLA research ▫learners' productions ▫test situations (Reffay et al., 2008) ▫learner- native speaker comparative studies (Boulton et al., 2012) LEarning and TEaching Corpora ▫all participants considered (learners, tutors, etc.) ▫interaction data ▫context 7

LETEC Components: Instantiation; Pedagogical scenario; Research protocol; Public licence; Private licence; Analyses; ethics & rights
"A LETEC corpus collects in a systematic and structured way all the data from interactions which occur during a course which is partially or entirely online. These data are enriched by technical, pedagogical and scientific information as well as information about the participants and are organized to allow contextualized analyses to be performed." (Mulce-documentation, 2013) 8

Staged process stages= Data analyses 10

Illustration of methodology: European project KA2 Languages
CLIL approach (Content and Language Integrated Learning) ▫ Architecture + French / English L2
Hybrid course "Building Fragile Spaces": 5-day studio Feb students, 2 architecture tutors, 1 EFL tutor, 1 FFL tutor
-> Working with external partners: exchanges 11

Elaboration of research areas Interplay between verbal and non verbal modes Role of nonverbal in identity construction Interplay between textchat & voicechat modalities Support for L2 verbal participation and production Wigham (2012) – PhD Thesis Stage 1: Design 13

Pedagogical Design Macro-task– collaboratively elaborate a model in a synthetic world (Second Life) as a response to an architectural problem brief Architectural studio, hybrid CLIL approach 4 workgroups Stage 1: Design Learning design Online environments Participants’ roles Learning & support activities 14

Learning & support activities
Activity | Architecture objectives | L2 objectives
Introduction to Second Life | Introduce students to multimodal nature of SL | Establish a communication protocol
Collaborative building activity | Introduce students to building techniques to aid them develop their model | Develop L2 communication techniques concerning the referencing of objects
Group reflective session | Develop critical thinking by negotiation; distinguish pertinent information for overall problem identification in their design brief | Help students to skill-up their L2; acquire domain-specific vocabulary; develop a professional discourse
Stage 1: Design. Detailed in: Rodrigues et al., in press; Wigham & Chanier,

Research protocol Research protocol design ▫Protocol for data collection ▫Researchers' roles ▫Timetable of research activities Stage 1: Design researcher 16 Wigham & Chanier, 2013 ReCALL

Data collection & coverage
Data collected | Pre-questionnaires | Session data | Session data | Post-questionnaires | Semi-directive interviews
Environment | Kwiksurveys | Second Life | VoiceForum | Kwiksurveys | Skype
Data type | Spreadsheet file | Video screen captures | Audio recordings | Spreadsheet file | Audio recordings
Quantity & coverage of data | 17 student questionnaires | 20 group sessions & 2 presentation sessions, 19h40m | 64 forum messages | 16 student questionnaires | 5 student interviews, 2h30
Phase | pre-course | during course | during course | post-course | post-course
Stage 2: Data collection 18

LETEC global corpus: content packaging
Manifest: structured data, XML, following the Structured Interaction Data Model (Mce_sid, 2011)
Information about each component of the corpus:
▫ General metadata (OLAC standards)
▫ Environments used
▫ Information on participants: language biographies and group organisation
▫ Description of the environment, course length, participants, tools
▫ Activities described in the pedagogical scenario
Primary data (anonymised): each resource has an ID and a description
Stage 3: Data organisation 20
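As a minimal sketch of the content-packaging idea on this slide, the following Python builds a small XML manifest in which each resource carries an ID and a description. The element and attribute names (`manifest`, `resources`, `resource`, `identifier`, `description`) and the sample resources are illustrative assumptions, not the actual Mce_sid schema.

```python
import xml.etree.ElementTree as ET


def build_manifest(corpus_id, resources):
    """Build a manifest element; `resources` is a list of (id, description) pairs.

    All element and attribute names are hypothetical, chosen for illustration.
    """
    manifest = ET.Element("manifest", {"identifier": corpus_id})
    resources_el = ET.SubElement(manifest, "resources")
    for res_id, description in resources:
        res_el = ET.SubElement(resources_el, "resource", {"identifier": res_id})
        ET.SubElement(res_el, "description").text = description
    return manifest


# Hypothetical corpus and resource identifiers.
manifest = build_manifest(
    "example-corpus",
    [
        ("res-001", "Video screen capture of a group session"),
        ("res-002", "Pre-course questionnaire answers (spreadsheet)"),
    ],
)
manifest_xml = ET.tostring(manifest, encoding="unicode")
print(manifest_xml)
```

Keeping the description next to the identifier means a researcher who did not take part in the course can still tell what each file is before opening it.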

Corpus deposit Mulce corpus repository (Mulce-repository, 2013) Stage 3: Data organisation 21

Corpus diffusion Description of corpus; interface to browse structure; zip file to download Stage 3: Data organisation 22

Multimodal data transcription (figure)
Verbal mode: audio, textchat; proxemic transmission, radio transmission; public, private
Nonverbal mode: not detailed here, see Wigham & Chanier (2013), ReCALL 25(1)
Stage 4: Data transcription & diffusion 24

Elaboration of transcription methodology
Characterized by communication modes & modalities ▫ Systematic approach to studying online environments
New environments = new modalities ▫ Added to transcription methodology
Communication mode | Communication modality | Act type and transcription code | Explanation
verbal | audio | audio act (tpa) | verbal turn in the public audio channel
verbal | audio | silence (sil) | interval between two audio acts greater than three seconds
verbal | textchat | textchat act (tpc) | message entered in the textchat window
nonverbal | proxemics | movement (mvt) | avatar movement in the environment, e.g. avatar sits down, flies, walks backwards
nonverbal | proxemics | entrance into / exit from environment (es) | avatar enters or exits the synthetic world
nonverbal | kinesics | kinesic (kin) | avatar gestures and movements made by an avatar's body part, e.g. nod, point, clap
nonverbal | production | production (prod) | production or display of an object in the SL environment
Stage 4: Data transcription & diffusion 25
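The silence rule in the transcription codes (an interval between two audio acts greater than three seconds) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Act` record, its field names, the participant labels and the timings are hypothetical and do not come from the Mulce tools or from ELAN.

```python
from dataclasses import dataclass

SILENCE_THRESHOLD_S = 3.0  # "greater than three seconds", as stated on the slide


@dataclass
class Act:
    participant: str  # hypothetical participant label
    act_type: str     # transcription code: tpa, sil, tpc, mvt, es, kin, prod
    start: float      # seconds from the start of the recording
    end: float


def find_silences(acts, threshold=SILENCE_THRESHOLD_S):
    """Return 'sil' acts for gaps between audio acts (tpa) longer than threshold."""
    audio = sorted((a for a in acts if a.act_type == "tpa"), key=lambda a: a.start)
    silences = []
    latest_end = None  # latest end time seen so far, to cope with overlapping turns
    for act in audio:
        if latest_end is not None and act.start - latest_end > threshold:
            silences.append(Act("", "sil", latest_end, act.start))
        latest_end = act.end if latest_end is None else max(latest_end, act.end)
    return silences


# Invented timings for illustration only.
acts = [
    Act("tutor", "tpa", 0.0, 4.0),
    Act("student1", "tpc", 4.5, 4.5),   # textchat acts do not interrupt a silence here
    Act("student1", "tpa", 9.0, 12.0),
    Act("student2", "tpa", 13.5, 15.0),
]
silences = find_silences(acts)
print(silences)
```

Whether a textchat act should break a silence is a design decision; this sketch follows the slide's wording and considers audio acts only.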

Multimodal transcription using ELAN (screenshot labels): video screen capture; multimodal transcription aligned using timeline; participants & modality; view of annotations for one participant in one modality
Max Planck Institute for Psycholinguistics (2001). ELAN [software]. The Netherlands: Max Planck Institute for Psycholinguistics.
Stage 4: Data Analyses 26

Production & deposit of LETEC distinguished corpus
Particular analysis of a selected part of the global LETEC corpus
Chanier, T., Saddour, I. & Wigham, C.R. (2012). (dir.) Distinguished Corpus: Transcription of Verbal and Nonverbal Interactions of the Second Life Reflection archi21-slrefl-av-j2. Mulce.org: Clermont Université. [oai : mulce.org:mce-archi21-slrefl-av-j2]
Only contains transformed data (= the transcriptions)
Refers to a selection of the original data in global corpus (= videos)
Software used for transcription cited (= ELAN)
Stage 4: Data transcription & diffusion 27

Why does structuring a corpus help analysis? Common technical structures to hold interaction data ▫Data linked ▫Analyses at different levels, in context whilst maintaining a global view of the course XML structure allows standard forms of annotation / coding & different analysis software to be used ▫Tatiana (2008) ▫Calico (2009) Stage 4: Research Analyses 28
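To illustrate why a common XML structure makes analysis easier, here is a minimal Python sketch that counts acts per participant and modality from a small XML fragment. The `session` and `act` elements, their attributes, and the sample content are hypothetical; they stand in for whatever structure the corpus actually uses.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: element names, attributes and content are invented.
SAMPLE_SESSION = """
<session id="example-session">
  <act type="tpa" participant="tutor" start="0.0" end="4.0"/>
  <act type="tpc" participant="student1" start="4.5" end="4.5">hello</act>
  <act type="tpa" participant="student1" start="9.0" end="12.0"/>
  <act type="tpc" participant="tutor" start="12.5" end="12.5">well done</act>
  <act type="tpc" participant="student1" start="14.0" end="14.0">thanks</act>
</session>
"""


def count_acts(xml_text):
    """Count acts per (participant, act type) pair in a session fragment."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        (act.get("participant"), act.get("type")) for act in root.iter("act")
    )


counts = count_acts(SAMPLE_SESSION)
for (participant, act_type), n in sorted(counts.items()):
    print(participant, act_type, n)
```

Because the query relies only on the shared structure, the same few lines would work on any session stored in that form, whichever platform the data originally came from.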

An analysis example: Interplay between textchat & voicechat
Textchat modality acts as an adjunct to the audio modality ▫ e.g. when technical problems exist, in opening & closing sequences of sessions (Liddicoat, 2011; Palomeque, 2011)
Monomodal textchat environments: auto-correction, negotiation of meaning and corrective feedback
Learner overload (Deutschmann & Panichi, 2009) -> Multimodal environments? (Hampel & Stickler, 2012)
-> Can the textchat serve for L2 feedback provision?
Stage 4: Research Analyses. Wigham & Chanier (in print) CALL Journal 29

An example of modality interplay 30

Characterisation of textchat functions Wigham & Chanier (in print) CALL Journal Stage 4: Research Analyses 31

Characterisation of textchat functions Data coding facilitated by XML schemas Stage 4: Research Analyses 32

Feedback in textchat
17% of acts contain feedback (49 acts)
Primarily concerns lexical and grammatical non-target-like forms (cf. Tudini, 2003)
Predominant use of recasts (32/49 instances)
Table: EFL Session | Technical | Socialisation | Conversation management | Task | Form (rows: Es-j, Sc-j, Sc-j)
Stage 4: Research Analyses 33
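The figures on this slide can be cross-checked with a short worked calculation. This is a sketch under the assumption that the 17% share is a rounded value, so the derived total is approximate rather than a count reported by the authors.

```python
feedback_acts = 49      # acts containing feedback (from the slide)
feedback_share = 0.17   # reported share of acts, presumably rounded
recasts = 32            # recasts among the feedback acts (from the slide)

# Back out the approximate number of acts the 17% refers to.
estimated_total_acts = round(feedback_acts / feedback_share)

# Share of feedback acts that are recasts.
recast_share = recasts / feedback_acts

print(f"Estimated total acts: about {estimated_total_acts}")
print(f"Recasts among feedback acts: {recast_share:.0%}")
```

Under that assumption the feedback share corresponds to roughly 288 acts, and recasts make up about 65% of the feedback acts.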

Results of textchat feedback study
EFL tutor's strategic choice to use textchat: reduces cognitive load ▫ Non-expertise in content matter
Language form vs communicative meaning ▫ Recasts, as they remain in the textchat window ▫ Recasts, so as not to interrupt content communication
Students' management of multiple modalities
Stage 4: Research Analyses 34

Publication of analyses & deposit of associated distinguished corpus
Production of distinguished corpus: ▫ Wigham, C.R. (2013). (dir.) Distinguished Corpus: Interplay between textchat and audio modalities during the Second Life Reflective Sessions. Mulce.org: Clermont Université. [oai : mulce.org:mce-archi21-modality-textchat]
Analysed data presented in parallel with results: ▫ Wigham, C.R. & Chanier, T. (in print). Interactions between text chat and audio modalities for L2 communication and feedback in the synthetic world Second Life. CALL Journal
Distinguished corpora can be cited in articles
Explicit connections between data and publications enhance the quality of CALL research
Stage 4: Publication 35

Conclusion: Sustaining CALL research
Reuse of data for cumulative or contrastive analyses ▫ Rodrigues & Wigham (in print): text chat & problematic vocabulary points ▫ Natural language processing techniques
Facilitated by: ▫ structured XML formalisms, which make online interaction data independent of any platform and tool-agnostic ▫ interactions described by modes & modalities -> not specific to an online environment
Reuse of LETEC in corpus linguistics (TEI-CMC)
Conclusion 36

Perspectives Documented and selected materials in their original context –basis for reflection in pedagogical corpora Integration of pedagogical corpora into teacher- training classrooms 37 Conclusion

Contact: Website: Mulce-documentation: Mulce-repository: Thank you! 38

Corpus metadata
Inform researchers about: ▫ conditions under which the corpus was built ▫ how to use the corpus ▫ the corpus's content ▫ licences for re-using the corpus
Used for web harvesting ▫ corpora become visible to the whole community (OLAC, CLARIN) ▫ corpora can be cited
Stage 3: Data organisation 39

Characterisation of textchat functions Analyses 40 Data coding facilitated by XML schemas Wigham & Chanier (in print) CALL Journal

Data coverage: 6 sessions (3 FFL, 3 EFL), 4h30m of screen recordings
Groups analysed | Audio acts | Textchat acts
EFL FLE38664
Analyses 41

Perspectives Documented and selected materials in their original context –basis for reflection Inter-disciplinary project 42