Consortium Corpus-écrits SIG TEI-CMC Open Resources and TOols for LANGuage Thierry Chanier, Céline.


1 Consortium Corpus-écrits SIG TEI-CMC Open Resources and TOols for LANGuage http://comere.org http://hdl.handle.net/11403/comere Thierry Chanier, Céline Poudat, Ciara Wigham

2 Objective: a kernel corpus assembling existing corpora of different CMC genres and new corpora built on data extracted from the Internet. These heterogeneous corpora will be structured and processed in a uniform way and complemented with metadata. CoMeRe will be released as open data through the national infrastructure Ortolang, following constraints which will be reused for the forthcoming “Corpus de Référence du Français”. The project is supported by the national consortium Corpus-écrits, a sub-part of Huma-Num, and by Ortolang. Variety + Standards + Open Access


5 Open data criteria


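9 Columns: Ref, Tokens, Participants, Posts, Environment
- (Antoniadis, 2014): 449 313 tokens; 359 participants; 22 052 posts; SMS
- (Falaise, 2014): 35 M tokens; 25 000 participants; 3 M posts; textchat
- (Ledegen, 2014): 357 000 tokens; 850 participants; 22 000 posts; SMS
- (Reffay et al., 2014): 600 000 tokens; 67 participants + 4 groups; posts: textchat 6 790, emails 2 030, forums 2 686; LMS
- (Yun, Chanier, 2014): 77 605 tokens; 31 participants + 2 courses; 7 750 posts; textchat
- (Abendroth-Timmer et al., 2014): 273 546 tokens; 26 participants + 4 groups; 1 200 posts; Blog
- (Longhi, Marinica, 2014): 567 851 tokens; 205 participants; 34273 posts; Tweet
- (Poudat et al., 2015): 489 000 tokens; participants/posts: 39714456; Wiki discussions
Domains: Informal, business, Informal, education, politics, science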

10 Columns: Ref, Tokens, Participants, Posts / U / Prod, Environment
- (Chanier & Audras, 2015): 184 594 tokens; 62 participants + 12 groups; audio 2 809, chat 248, non-verbal 1 058, blog 779; Conference system
- (Chanier & Wigham, 2015): 27 912 tokens; 18 participants + 4 groups; audio 1 690, chat 669, non-verbal 2 452; 3D environment
- (Chanier, Reffay et al., 2015): 127 228 tokens; 16 participants + 2 groups; audio 7 718, chat 1 566, non-verbal 5 790; Conference system


12 Mono-mode, mono-modality:
- Textchat
- Forum
- SMS
- Tweets
- Email
- Blogs (image not a means of interaction)
Multi-modality (LMS):
- Email
- Forum
- Chat
Multi-mode, verbal (conference system):
- Audiochat
- Textchat
Multi-mode, verbal and non-verbal (conference system, 3D environment, etc.):
- Audiochat
- Textchat
- Icons
- Collective production: whiteboard, word processor, semantic maps
- Avatars
- …

13 Interaction space: time(s), locations, participants, environments, author, addressee(s), group, network, course, session, channel, simultaneity
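The notions listed on this slide can be pictured as a small data model. The following is only a minimal sketch in Python: the class and field names (`InteractionSpace`, `Environment`, `Participant`, `channels`) are hypothetical, chosen for illustration, and are not taken from the CoMeRe or TEI-CMC schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Participant:
    ident: str                    # hypothetical identifier of a person
    group: Optional[str] = None   # optional group the person belongs to


@dataclass
class Environment:
    name: str                                           # e.g. "LMS"
    channels: List[str] = field(default_factory=list)   # e.g. email, forum, chat


@dataclass
class InteractionSpace:
    """Illustrative container: time(s), locations, participants, environments."""
    times: List[str]
    locations: List[str]
    participants: List[Participant]
    environments: List[Environment]

    def channels(self) -> List[str]:
        # Collect the distinct channels offered by all environments, sorted.
        return sorted({c for env in self.environments for c in env.channels})
```

For instance, an `InteractionSpace` holding one `Environment("LMS", ["email", "forum", "chat"])` would report three channels through `channels()`.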

14 New macro-level elements http://wiki.tei-c.org/index.php/SIG:CMC/Draft:_A_basic_schema_for_representing_CMC_in_TEI

15 Title, label, comment, message; contents / body

16 Response to what? Sent to whom? Read by whom? May contain HTML, tables, etc. Attached documents.
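The questions on this slide (response to what, sent to whom) suggest that each post carries pointers to other posts and to participants. The sketch below builds such an element with Python's standard `xml.etree.ElementTree`. It is an illustration under stated assumptions: the element name `post` and the attributes `who`, `when`, `replyTo` and especially `addressee` are used here as placeholders and should be checked against the schema drafts linked in this presentation; the function `make_post` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI behind the xml: prefix (needed to write xml:id).
XML_NS = "http://www.w3.org/XML/1998/namespace"


def make_post(post_id, author, when, text, reply_to=None, addressees=()):
    """Build an illustrative <post> element; attribute names are assumptions."""
    post = ET.Element("post")
    post.set(f"{{{XML_NS}}}id", post_id)   # serialized as xml:id
    post.set("who", f"#{author}")          # author of the post
    post.set("when", when)                 # time of posting
    if reply_to:
        # "Response to what?": pointer to an earlier post.
        post.set("replyTo", f"#{reply_to}")
    if addressees:
        # "Sent to whom?": hypothetical attribute listing the addressees.
        post.set("addressee", " ".join(f"#{a}" for a in addressees))
    body = ET.SubElement(post, "p")        # contents / body of the post
    body.text = text
    return post


if __name__ == "__main__":
    reply = make_post("p2", "alice", "2014-03-01T10:00:00", "Hello",
                      reply_to="p1", addressees=["bob"])
    print(ET.tostring(reply, encoding="unicode"))
```

Keeping the reply target and the addressees as pointers rather than free text makes it possible to rebuild threads and recipient lists later, whatever the original environment was.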

17 Computer-Mediated Communication in TEI: What Lies Ahead, TEI-MM 2013 (Rome). Modality interplay: 1.5 min video. * Paper: (Wigham & Chanier, 2013), CALL journal * Data: (Wigham, 2013), LETEC corpus

18 Multimodality: verbal and non-verbal (Wigham & Chanier, 2013)

19 Audio, kinesics, chat


22 LMS: textchat, email, forum

23 Many more examples here:
http://wiki.tei-c.org/index.php/SIG:CMC/Draft:_A_metadata_schema_for_CMC
http://wiki.tei-c.org/index.php/SIG:CMC/CoMeRe_schema_draft_for_representing_CMC_in_TEI_%282014%29

24 CoMeRe team

