Statistical Estimation of Word Acquisition with Application to Readability Prediction. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing.

Statistical Estimation of Word Acquisition with Application to Readability Prediction
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 900–909
Authors: Paul Kidwell (Department of Statistics, Purdue University), Guy Lebanon (College of Computing, Georgia Institute of Technology), Kevyn Collins-Thompson (Microsoft Research)
Presenter: Hsiao-Pei Chang

Outline
Introduction
A Model for Document Readability and Word Acquisition
Experimental Results
▫ Estimation of Word Acquisition Distributions
▫ Comparison with Oral Studies
▫ Global Readability Prediction
Discussion
2011/02/18

Introduction
Word acquisition refers to the temporal process by which children learn and come to understand the meaning of new words. A related concept to acquisition age is document grade level readability, which refers to the school grade level of the document's intended audience. It applies in situations where documents are written with the express intent of being understood by children in a certain school grade.

Introduction
They develop and evaluate a novel statistical model that draws a connection between document grade level readability and age of acquisition distributions.
▫ They define a model for document readability using a logistic Rasch model and the quantiles of the age of acquisition distributions. They then infer the age of acquisition distributions for different words from document readability data collected by crawling the web.
▫ The inferred distributions are used from two perspectives:
1. They are analyzed and contrasted with previous studies on oral word acquisition.
2. They serve as parameters for the readability model.

A Model for Document Readability and Word Acquisition
For a fixed word w and a fixed population of individuals T, the age of acquisition (AoA) distribution p_w represents the age at which word w was acquired by the population. The AoA distribution of a word w is estimated in terms of mean μ_w and standard deviation σ_w parameters using the (truncated) normal distribution.

Definition 1. A document d = (w_1, ..., w_m) is said to have (1 − ε_1, 1 − ε_2)-readability level t if by age t no less than a fraction 1 − ε_1 of the words in d have each been acquired by no less than a fraction 1 − ε_2 of the population.
We denote by q_w the quantile function of the cdf corresponding to the acquisition distribution p_w; q_w(r) represents the age at which a fraction r of the population T has acquired word w.
Following Definition 1 we define a logistic Rasch readability model (3), in which q_d(s, r) is the s quantile of {q_{w_i}(r) : i = 1, ..., m}.

An equivalent formulation of (3) makes the probability model more explicit. In other words, the probability of a document d being (s, r)-readable increases exponentially with q_d(s, r), which is the age at which a fraction s of the words in d have each been acquired by a fraction r of the population.
▫ The parameter r = 1 − ε_2 determines what it means for a word to be acquired and is typically set to a high value such as 0.8.
▫ The parameter s = 1 − ε_1 determines how many of the document's words need to be acquired for it to be readable.

The figure illustrates (s, r)-readability for a document consisting of 5 words with r = 0.8 and s = 0.7. The probability density functions p_w for the five words appear as dashed lines. We illustrate the function q_d(s, r) by plotting the corresponding cdf as a solid piecewise constant function. The horizontal line indicates that grade 5.9 corresponds to the 0.7 quantile of {q_{w_i}(0.8) : i = 1, ..., m}.

In the case of a normal distribution (1), a word w_i is acquired by a fraction r of the population at age
q_{w_i}(r) = μ_{w_i} + Φ^{-1}(r) σ_{w_i},
where Φ is the cumulative distribution function (cdf) of the standard normal distribution. The parameters are modeled as
μ_w ~ G(α_1, β_1), σ_w ~ G(α_2, β_2),
so that
Φ^{-1}(r) σ_w ~ G(α_2, Φ^{-1}(r) β_2).
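The quantile formula above and the document-level quantile q_d(s, r) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the (μ, σ) values are hypothetical, and the s quantile is taken as the smallest age by which at least a fraction s of the words are acquired, which is one reasonable reading of Definition 1.

```python
import math
from statistics import NormalDist


def word_quantile(mu, sigma, r):
    """Age by which a fraction r of the population has acquired the word,
    under a normal AoA distribution: mu + Phi^{-1}(r) * sigma."""
    return mu + NormalDist().inv_cdf(r) * sigma


def document_quantile(word_params, s, r):
    """s quantile of the per-word ages {q_w(r)}: the smallest age by which
    at least a fraction s of the document's words have been acquired."""
    ages = sorted(word_quantile(mu, sigma, r) for mu, sigma in word_params)
    # Number of words that must be acquired; the small offset guards
    # against floating-point noise in s * m.
    k = max(1, math.ceil(s * len(ages) - 1e-12))
    return ages[k - 1]


# Hypothetical (mu, sigma) pairs, in grade units, for a five-word document.
doc = [(2.0, 0.5), (3.0, 1.0), (4.5, 1.0), (5.0, 1.5), (7.0, 2.0)]
q_d = document_quantile(doc, s=0.7, r=0.8)
print(round(q_d, 2))
```

With s = 0.7 and five words, four words must be acquired, so q_d(s, r) is the fourth smallest of the per-word ages.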

The distribution f_W of the acquisition ages is the convolution of the distributions of μ_w and Φ^{-1}(r) σ_w, which reverts to a Gamma when β_1 = β_2.
The distribution of the s-percentile of f_W, which amounts to (s, r)-readability of documents, can be analyzed by combining f_W with a standard normal approximation of order statistics, where m is the document length and F_Q is the cdf corresponding to f_Q.

Figure 1 shows the relationship between document length and confidence interval (CI) width in readability prediction. It contrasts the CI widths of model-based intervals and empirical intervals.
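The effect of document length on interval width can be checked with a small Monte Carlo experiment. The sketch below is an assumption-laden stand-in for the analytic normal approximation of order statistics: it draws μ_w and σ_w from Gamma distributions with hypothetical shape and scale values, forms q_w(r) for each of m words, and records how widely the s quantile varies across simulated documents.

```python
import math
import random
from statistics import NormalDist


def quantile_interval_width(m, s=0.7, r=0.8, alpha1=9.0, beta1=0.5,
                            alpha2=4.0, beta2=0.25, trials=1000, seed=0):
    """Width of the central 95% interval of the s quantile of q_w(r)
    over simulated documents of m words. All Gamma parameters are
    hypothetical; beta1 and beta2 are scale parameters."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(r)
    k = max(1, math.ceil(s * m - 1e-12))  # index of the s quantile
    draws = []
    for _ in range(trials):
        ages = sorted(
            rng.gammavariate(alpha1, beta1) + z * rng.gammavariate(alpha2, beta2)
            for _ in range(m)
        )
        draws.append(ages[k - 1])
    draws.sort()
    lower = draws[int(0.025 * trials)]
    upper = draws[int(0.975 * trials) - 1]
    return upper - lower


# Interval width for short, medium, and long documents.
widths = {m: quantile_interval_width(m) for m in (10, 40, 160)}
for m, width in widths.items():
    print(m, round(width, 2))
```

Longer documents give narrower intervals, roughly in proportion to 1/sqrt(m), which is the behavior the normal approximation of order statistics predicts.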

Experimental Results
Our experimental study is divided into three parts:
1. Examining the word acquisition distributions that were estimated based on readability data.
2. Comparing the estimated (written) acquisition ages with oral acquisition ages obtained from interview studies reported in the literature.
3. Using the estimated word acquisition distributions to predict document readability.

Experimental Results
In our experiments we used three readability datasets:
1. The Web data
▫ contains 373 documents, with each document written for a particular school grade level in the range 1-12.
2. The Weekly Reader (WR) dataset
▫ contains a total of 1780 documents, with 4 readability levels ranging from 2 to 5 indicating the school grade levels.
3. The Reading A-Z dataset
▫ contains a set of 215 documents, spanning grade 1 through grade [...].

Estimation of Word Acquisition Distributions
Figure: a comparison of empirical word appearances and AoA distributions for three words: thought (left), multitude (middle), and assimilation (right). The vertical line indicates the 0.8 quantile of the AoA distribution, which corresponds to the grade by which 80% of the children have acquired the word.

Comparison with Oral Studies
Related work in the linguistics community includes several studies concerning the oral acquisition of words. These studies estimate the age at which a word is acquired for oral use based on interviews with participating adults.
Figure 3 displays the relationship between the GL age of acquisition (AoA) and the acquisition ages obtained from readability data based on the s = 0.8 quantile. Some correlation is present (r^2 = 0.34), but the two measures differ considerably. As expected, the acquisition ages obtained from written readability data tend to be higher than those from the oral studies.

Comparison with Oral Studies
The distribution of the difference between the GL AoA and the AoA inferred from Web 1-12 is skewed to the right, as would be expected since written AoA is higher than oral AoA. Values of s in [0.5, 0.9] produced reasonable results, with s = 0.65 achieving the smallest mean absolute error.

Global Readability Prediction
Once acquisition age distributions are available, whether estimated statistically from data or obtained from a survey, they may be used to predict the grade level of novel documents.
▫ The model predicts readability level t* for a novel document d if it is the minimal grade for which readability is established.
▫ β(t) is a parameter describing the strictness of the readability requirement. Note that we allow β(t) to vary as a function of time (grade level).
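The "minimal grade for which readability is established" rule can be sketched in a simplified, deterministic form. The version below is an assumption, not the model from the slides: it drops the Rasch probability and the β(t) threshold and applies Definition 1 directly, returning the smallest grade t at which at least a fraction s of the words have q_w(r) ≤ t. The function names and word parameters are hypothetical.

```python
from statistics import NormalDist


def fraction_acquired(word_params, t, r):
    """Fraction of words whose r quantile of the normal AoA
    distribution, mu + Phi^{-1}(r) * sigma, is at most age t."""
    z = NormalDist().inv_cdf(r)
    acquired = sum(mu + z * sigma <= t for mu, sigma in word_params)
    return acquired / len(word_params)


def predict_grade(word_params, s, r, grades=range(1, 13)):
    """Smallest grade at which the document satisfies Definition 1,
    or None if no grade in the range qualifies."""
    for t in grades:
        if fraction_acquired(word_params, t, r) >= s:
            return t
    return None


# Hypothetical (mu, sigma) pairs, in grade units, for a five-word document.
doc = [(2.0, 0.5), (3.0, 1.0), (4.5, 1.0), (5.0, 1.5), (7.0, 2.0)]
print(predict_grade(doc, s=0.7, r=0.8))
```

A grade-dependent strictness could be layered on by replacing the fixed threshold s with a function of t, which is the role β(t) appears to play in the slides.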

First, we use the Web 1-12 corpus to learn optimal parameter values for a, b, r, and s, and then assess prediction error using a test-training paradigm for the proposed model, Naive Bayes, and support vector regression (SVR). Second, the trained model is applied to the Reading A-Z corpus and the results are compared with alternative semantic variables.

Figure 6 is a scatter plot comparing predicted grades vs. actual grades, with a strong correlation of [...].

A comparison of mean absolute error (MAE) across prediction algorithms shows that the age of acquisition model compares favorably. The confidence bounds (LB, UB) were computed by repeating each model building procedure 100 times. The results show that SVR and the dynamic threshold prediction rule perform similarly well, suggesting that Definition 1 and the Rasch model are suitable models for readability prediction.
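For reference, MAE and repetition-based bounds can be computed as in the sketch below. The slides do not say how LB and UB were derived from the 100 repetitions, so the percentile rule used here is an assumption, and both function names are hypothetical.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual grades."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def percentile_bounds(values, coverage=0.95):
    """Lower and upper bounds (LB, UB) covering the central share of
    repeated results, e.g. MAE values from 100 repetitions.
    Assumption: bounds are taken as empirical percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1 - coverage) / 2
    lower = ordered[math.floor(tail * (n - 1))]
    upper = ordered[math.ceil((1 - tail) * (n - 1))]
    return lower, upper


# Toy example: three predicted grades against three actual grades.
print(mean_absolute_error([3, 5, 8], [4, 5, 6]))
```

In practice each repetition would refit the model on a fresh train/test split and contribute one MAE value to the list passed to `percentile_bounds`.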

Discussion
While there have been several recent studies regarding word acquisition and readability, our work is the first to provide a quantitative connection between these two concepts in a statistically meaningful way. The connection between word acquisition and readability is both intuitive and useful. Experiments confirm that the proposed model is effective at predicting the readability level of documents.

Comments
A different way to measure a document's readability
▫ How to implement it?
Word acquisition vs. CEEC & GEPT word levels
Derivation of the unknown parameters