Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts

Khairun-nisa Hassanali (1), Yang Liu (1) and Thamar Solorio (2)
nisa@hlt.utdallas.edu, yangl@hlt.utdallas.edu, solorio@uab.edu
(1) The University of Texas at Dallas
(2) University of Alabama at Birmingham
INTERSPEECH 2012

1. Summary
- Evaluate various features for automatic prediction of language impairment (LI) from child language transcripts
- General word and text features and syntactic features perform better for spontaneous narratives
- Referential and semantic features and entity grid features perform better for story telling narratives
- Adding narrative structure and quality features to the baseline raised F-1 by 8.7 points (0.778 to 0.865) for story telling narratives

2. The Larger Problem
- Detecting language impairment (LI) in children
- Traditional methods of detecting LI
  - Cutoff methods on norm-referenced tests
  - Time consuming
  - May result in over- and under-identification of LI
- Automatic detection of LI is faster and allows for exploring more features beyond norm-referenced tests
- Given a child language transcript, answer the following questions:
  - Does the transcript belong to a typically developing (TD) child or a child with LI?
  - What features are useful in detecting LI?

3. Data
- Transcripts of adolescents aged 14 years, for two tasks:
  - Story telling task
  - Spontaneous personal narrative
- 118 speakers (99 TD children, 19 LI children)
- 118 transcripts for each task
- Annotated story telling transcripts for coherence and narrative structure features

4. Features

| No | Feature category | Example features |
|----|------------------|------------------|
| 1 | Gabani et al. (Gb) (baseline) | Probabilities of language model, morphosyntactic features, language productivity features |
| 2 | Readability (RM) | Flesch reading ease score |
| 3 | Situational Model (SM) | Repetition score for tense and aspect, number of causal verbs in the text, ratio of causal verbs to causal particles |
| 4 | General word and text (GWT) | Number of utterances, mean frequency of content words, mean concreteness of content words, mean hypernym value of nouns in text |
| 5 | Syntactic (Syn) | Number of noun phrases, syntactic similarities between utterances, number of personal pronouns per 1000 words, ratio of pronouns to noun phrases |
| 6 | Referential and Semantic (RS) | Number of anaphora references, proportion of content words in adjacent utterances that share a content word |
| 7 | Entity Grid (EG) | Fractions of each entity distribution |
| 8 | Narrative structure and quality (NSQ) | Coherence, number of cognitive references, number of social engagement devices, instantiation of story, resolution of story, presence of search episodes in story, maintenance of search theme, number of affective devices and intensifiers |

5. Experiment Setup
- Treat prediction of LI as a binary classification task
- Used the features from prior work by Gabani et al. as a baseline
- Feature categories 2-6: computed with the Coh-Metrix tool
- Entity grid model features: Brown coherence toolkit
- Narrative structure and quality features: based on manual annotation of the story telling narratives
- Used leave-one-out cross validation
- Classification experiments: WEKA

6. Experimental Results
- The Bayesian network classifier performed the best
- Performed feature selection
- Instantiation of story, number of cognitive references and number of social engagement devices were the top-scoring NSQ features

| Task | Feature | Precision | Recall | F-1 |
|------|---------|-----------|--------|-----|
| Spontaneous narrative | Gb (baseline) | 0.65 | 0.684 | 0.667 |
| Spontaneous narrative | GWT | 0.571 | 0.421 | 0.485 |
| Spontaneous narrative | Entity grid | 0.162 | 0.579 | 0.253 |
| Spontaneous narrative | RS + GWT | 0.423 | 0.579 | 0.489 |
| Spontaneous narrative | Gb + All | 0.706 | 0.632 | 0.667 |
| Spontaneous narrative | Gb + All - RS | 0.706 | 0.632 | 0.667 |
| Story telling | Gb (baseline) | 0.824 | 0.737 | 0.778 |
| Story telling | Narrative | 0.385 | 0.263 | 0.313 |
| Story telling | GWT | 0.25 | 0.316 | 0.279 |
| Story telling | Entity grid | 0.304 | 0.368 | 0.333 |
| Story telling | RS + GWT | 0.353 | 0.632 | 0.453 |
| Story telling | Gb + All | 1 | 0.79 | 0.882 |
| Story telling | Gb + All - RS | 1 | 0.842 | 0.914 |
| Story telling | Gb + Narrative | 0.889 | 0.842 | 0.865 |

7. Conclusion
- Deeper NLP features are explored for automatic prediction of language impairment (LI)
- Narrative structure and narrative quality features are also explored for the automatic prediction of LI for story telling tasks
- Narrative structure and quality features, combined with other features, are helpful in the prediction of LI on story telling narratives
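The F-1 column in the results can be reproduced from the precision and recall columns. The sketch below is not part of the poster: it assumes F-1 is the usual harmonic mean of precision and recall, and it takes the LI group size to be the total number of speakers minus the TD count (118 - 99). The names `f1_score`, `ROWS`, `N_LI`, `max_f1_gap` and `implied_true_positives` are illustrative only.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (task, feature set) -> (precision, recall, reported F-1), copied from the results.
ROWS: Dict[Tuple[str, str], Tuple[float, float, float]] = {
    ("spontaneous", "Gb (baseline)"): (0.65, 0.684, 0.667),
    ("spontaneous", "GWT"): (0.571, 0.421, 0.485),
    ("spontaneous", "Entity grid"): (0.162, 0.579, 0.253),
    ("spontaneous", "RS + GWT"): (0.423, 0.579, 0.489),
    ("spontaneous", "Gb + All"): (0.706, 0.632, 0.667),
    ("spontaneous", "Gb + All - RS"): (0.706, 0.632, 0.667),
    ("story", "Gb (baseline)"): (0.824, 0.737, 0.778),
    ("story", "Narrative"): (0.385, 0.263, 0.313),
    ("story", "GWT"): (0.25, 0.316, 0.279),
    ("story", "Entity grid"): (0.304, 0.368, 0.333),
    ("story", "RS + GWT"): (0.353, 0.632, 0.453),
    ("story", "Gb + All"): (1.0, 0.79, 0.882),
    ("story", "Gb + All - RS"): (1.0, 0.842, 0.914),
    ("story", "Gb + Narrative"): (0.889, 0.842, 0.865),
}

# Assumption: LI group size = total speakers minus TD speakers.
N_LI = 118 - 99


def max_f1_gap() -> float:
    """Largest absolute difference between recomputed and reported F-1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in ROWS.values())


def implied_true_positives(recall: float, n_positive: int = N_LI) -> int:
    """Number of LI transcripts correctly flagged, implied by a recall value."""
    return round(recall * n_positive)


if __name__ == "__main__":
    print(f"largest F-1 gap: {max_f1_gap():.4f}")
    for (task, feature), (p, r, f1) in ROWS.items():
        tp = implied_true_positives(r)
        print(f"{task:12s} {feature:15s} F-1={f1_score(p, r):.3f} TP={tp}/{N_LI}")
    baseline = ROWS[("story", "Gb (baseline)")][2]
    with_nsq = ROWS[("story", "Gb + Narrative")][2]
    print(f"F-1 gain from narrative features: {with_nsq - baseline:.3f}")
```

Under these assumptions every reported F-1 agrees with the recomputed value to within rounding, and each recall value corresponds to a whole number of LI transcripts.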

