Hong Kong English in Students’ Writing

Hong Kong English in Students’ Writing 2018-03-24 Holly Chung (ENG), David Ling (DLC)

Contents
Background: the team, the problem
System approaches: statistical methods, traditional rule-based methods, deep learning methods
Preliminary results
Summary

Background – The team
Project-in-charge: Dr. Anora Wong, Dr. Holly Chung, Dr. Amy Kong
Parties/collaborators involved: HSMC English Department, HSMC Deep Learning Center (RGC funded), eClass
Project website: https://dlc.hsmc.edu.hk/students-english/

Are you suffering from these common headaches as an English-language teacher?
A few classes’ writing flooding in on the same date?
Need to grade writing assignments from different forms?
One pile of writing JUST done but then another pile coming right after?
Students submitting extra writing to you?

And we are only human…

Messy to the eyes! Frustrating to the mind!

Unfamiliar word choice / unnatural collocations
“Ignorance of the ingredients in pearl milk tea can kill consumers in a taciturn way.”
Taciturn (adj): speaking very little

Even MS Word may not be able to detect them…
“People may select digital wallets as the main payment manner.”
“The appearance of digital wallet may alter the lending payment method in Hong Kong.”

And what about some good practices?

Background – the problem
Grammatical errors and semantic errors (HK-style English) often appear in students’ writing.
They engages the audience by different tactics. (engages → engage)
Both of them done a good job. (done → did)

Background – the problem
Grammatical errors and semantic errors (HK-style English) often appear in students’ writing.
I had a causal chat with Tim yesterday. (causal → casual)
Math lessons use English. (use → are conducted in)
He can say Chinese. (say → speak)

Background – the problem
A system which facilitates teachers’ work by highlighting both grammatical and semantic errors and suggesting corrections.
[Diagram: input → system → output]
Example inputs: “I have two Persian cat.” “Math lessons use English.”

I. Statistical methods – Comparing the writing with corpora
Online data allow machines to learn English.
Digital corpus: books, online newspaper articles, Wikipedia articles, online books, essays, papers, magazines, …

I. Statistical methods – Comparing the writing with corpora
Detection by tri-gram frequency
A tri-gram is a contiguous sequence of 3 words/tokens.
Examples: “Peter goes to school every day.” “He can say Chinese.”

TRI-GRAM              COUNTS
_start_ peter goes    1955
peter goes to         1443
goes to school        67559
school every day      35998
_start_ he can        2956493
he can say            99710
can say chinese
say chinese .         57

COUNTS: the frequency in corpora
Low frequency: uncommon (bad style) or problematic
Highlighted tri-gram: can say chinese
Corpora: Wikipedia 2007 articles (10GB), Google Books N-gram (20GB)
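
The tri-gram check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-sentence corpus, the function names and the threshold are hypothetical stand-ins, not the project's data or code, and a real system would count over the much larger corpora listed on the slide.

```python
from collections import Counter

def tokenize(sentence):
    """Lowercase, split off full stops, and prepend a start marker."""
    words = sentence.lower().replace(".", " .").split()
    return ["_start_"] + words

def trigrams(tokens):
    """Return every contiguous sequence of three tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]

def build_counts(corpus_sentences):
    """Count how often each tri-gram occurs in the corpus."""
    counts = Counter()
    for sentence in corpus_sentences:
        counts.update(trigrams(tokenize(sentence)))
    return counts

def flag_rare(sentence, counts, threshold=1):
    """Return the tri-grams of `sentence` seen fewer than `threshold` times."""
    return [t for t in trigrams(tokenize(sentence)) if counts[t] < threshold]

# Toy stand-in for the corpora (hypothetical sentences).
corpus = [
    "He can speak Chinese.",
    "He can speak English.",
    "She can speak Chinese.",
    "He can say that again.",
]
counts = build_counts(corpus)
flagged = flag_rare("He can say Chinese.", counts)
print(flagged)
```

With this toy corpus, “he can say” is attested but “can say chinese” is not, so the latter is among the flagged tri-grams. The threshold would need tuning against real corpus counts.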

I. Statistical methods – Comparing the writing with corpora
Detection by dependency relation frequency
A dependency relation is the relation between two words in a sentence.
Example: “Those special viral video can gain the attentions of the audiences.”

TOKEN PAIR          RELATION             COUNTS
video   Those       determiner
        special     adjective modifier   135
        viral                            770
gain                nominal subj         1
        can         auxiliary verb       1732
        attentions  direct object        5
        .           punctuation          4820
        the                              762
…

Can “see” beyond tri-gram
Extracts and counts the relations in corpora
Corpora: Wikipedia 2018 articles, BNC, ANC, BBC news, Reuters
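
A minimal sketch of the dependency-relation check, assuming the sentences have already been parsed into (head, dependent, relation) triples. The triples, relation labels and threshold below are hypothetical toy data; in practice a dependency parser would produce them and the counts would come from the corpora listed above.

```python
from collections import Counter

def count_relations(parsed_corpus):
    """Count (head, dependent, relation) triples over a parsed corpus."""
    return Counter(parsed_corpus)

def rare_relations(sentence_relations, counts, threshold=1):
    """Return the triples of a sentence seen fewer than `threshold` times."""
    return [r for r in sentence_relations if counts[r] < threshold]

# Hypothetical triples, as a parser might extract them from corpus text.
parsed_corpus = [
    ("video", "viral", "amod"),
    ("video", "viral", "amod"),
    ("video", "special", "amod"),
    ("videos", "those", "det"),
    ("gain", "videos", "nsubj"),
    ("gain", "can", "aux"),
    ("gain", "can", "aux"),
    ("gain", "attention", "dobj"),
]

# Hypothetical triples for the student sentence
# "Those special viral video can gain the attentions of the audiences."
student_relations = [
    ("video", "those", "det"),
    ("video", "special", "amod"),
    ("video", "viral", "amod"),
    ("gain", "video", "nsubj"),
    ("gain", "can", "aux"),
    ("gain", "attentions", "dobj"),
]

counts = count_relations(parsed_corpus)
rare = rare_relations(student_relations, counts)
print(rare)
```

Because the pair (gain, video) is related through the parse rather than by adjacency, it is checked even though “can” sits between the two words, which a tri-gram window centred elsewhere could miss.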

II. Rule-based methods
Detection by matching error patterns
Implemented with LanguageTool (open-source software)
Examples:
Agreement errors (e.g., a plural noun followed by a singular verb)
Wrong prepositions (e.g., happened + to her/him)
[Figure: agreement error]
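
To illustrate the idea of matching error patterns, here is a toy rule matcher using Python regular expressions. It is not LanguageTool and does not use LanguageTool's rule format; the rule names, patterns and messages are hypothetical and deliberately narrow.

```python
import re

# Hypothetical hand-crafted rules: (rule id, pattern, message).
RULES = [
    (
        "AGREEMENT",
        # "they" followed by a word ending in a single "s" (e.g. "they engages").
        re.compile(r"\bthey\s+\w+[^s\W]s\b", re.IGNORECASE),
        "Plural subject followed by a singular verb.",
    ),
    (
        "PREPOSITION",
        # "happened" followed by a preposition other than "to" before her/him.
        re.compile(r"\bhappened\s+(?:on|with|for)\s+(?:her|him)\b", re.IGNORECASE),
        "Use 'happened to her/him'.",
    ),
]

def check(text):
    """Return (rule id, matched text, message) for every rule match."""
    hits = []
    for rule_id, pattern, message in RULES:
        for match in pattern.finditer(text):
            hits.append((rule_id, match.group(0), message))
    return hits

hits = check("They engages the audience. Something happened on her yesterday.")
for rule_id, span, message in hits:
    print(rule_id, "|", span, "|", message)
```

Rules like these are precise for the patterns they cover, but each new error type needs its own rule, and a surface pattern such as the agreement rule here will also fire on some correct text.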

III. Deep learning methods (In progress)
Previous methods: detection only; very labour intensive; many exceptions
Deep learning methods: understand the sentence meaning; rewrite the sentence or provide corrections

III. Deep learning methods (In progress)
Possible approaches
Neural network classifier: to resolve words that are easily confused, e.g., “causal” and “casual”
Neural network machine translation: an unedited sentence is “translated” into a corrected sentence, e.g., “The models words.” => “The models work.”
A lot of data is needed for training and validation.
The model tries to capture the sentence’s meaning as a vector and then attempts to generate more likely text.
Schmaltz et al., “Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction”, Harvard University, 2016

Preliminary results – An essay from HSMC students
Results marked by the system:
Blue: by hand-crafted rules
Green: by tri-gram detection
Purple: by dependency relation
Marked text in the example: an assumptions, tastemarkers statistics, Those … videos, … than in the past.
The system flags (video, gain) as an uncommon pair.
“the unexpectedness one” is rare.

Preliminary results: Compared with the teacher’s marking
[Figure: human-marked script]

Preliminary results: Compared with the teacher’s marking
[Figure: marking by the system alongside marking by the teacher]
The system catches some mistakes that the teacher overlooked.

Summary and the future
Summary:
Correct errors and highlight good practices
Provide data and statistics for teachers
On-going:
Gather more students’ scripts (both annotated and raw scripts)
Attempt different deep-learning methods, analyze gathered data
Up-coming:
Invite teachers and schools to participate
Apply for the Quality Education Fund
For more: https://dlc.hsmc.edu.hk/

The End Thank you very much!