
Linear Model Incorporating Feature Ranking for Chinese Documents Readability
Gang Sun, Zhiwei Jiang, Qing Gu and Daoxu Chen
State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing
9th International Symposium on Chinese Spoken Language Processing (ISCSLP 2014)
Presenter: 劉憶年 2015/4/7

Outline
Introduction
Related Work
Linear Model Incorporating Feature Ranking
Empirical Study
Conclusions

Introduction (1/2)
Given a document, its readability refers to how difficult the document is for its readers to read and understand. Readability assessment has been studied for nearly a century, with fruitful results on automatically predicting a document's reading level (score). Recent research has applied machine learning methods and developed new, more complex text features, drawing on achievements in other research areas such as information theory and natural language processing. These newly developed methods have shown superiority over the classical readability formulae.

Introduction (2/2)
In this paper, we apply linear regression models to readability assessment of Chinese documents, and incorporate feature ranking to select the most appropriate features for building the linear function. Our method, named LiFR (Linear model incorporating Feature Ranking), resembles the process of building a traditional readability formula.

Related Work (1/2)
Generally, the readability (or reading difficulty) of a document can be measured by predefined reading levels or by a readability score. Text features are typically used to determine a document's readability; based on these features, readability formulae can be built to calculate readability directly. Recently, building on advances in natural language processing (NLP) and machine learning, many complex features have been incorporated and new methods developed in readability research. Researchers have shown that machine learning methods are superior to traditional readability formulae for readability assessment. Progress has also been made on Chinese readability assessment.

Related Work (2/2)
Our work differs from previous research in two ways. First, we put forward LiFR (Linear model incorporating Feature Ranking), which proves useful for readability assessment of Chinese documents. Second, we develop text features specific to Chinese, and use feature ranking to select the most valuable ones.

Linear Model Incorporating Feature Ranking
We put forward a method that uses a linear model incorporating feature ranking (LiFR) to assess the readability of Chinese documents.

Linear Model Incorporating Feature Ranking -- Feature Extraction
Surface features count statistics of the grammatical units in a document, such as average sentence length in words or characters, and the number of syllables or characters per word. Part-of-speech (POS) features also count statistics of the grammatical units, the difference being that each word class (e.g., noun, verb, adjective, adverb or onomatopoeia) is treated separately. Parse tree features count statistics of the parse trees extracted from a document. Entropy features are borrowed from information theory and measure the transfer of information in a document.
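The surface-feature category above can be sketched as follows. This is a minimal illustration, assuming the document has already been segmented into sentences of words (e.g., by a segmenter such as ICTCLAS); the feature names are illustrative and not necessarily the paper's exact feature set.

```python
def surface_features(sentences):
    """Compute simple surface statistics from a segmented document.

    `sentences` is a list of sentences, each a list of word strings.
    """
    n_sents = len(sentences)
    words = [w for sent in sentences for w in sent]
    n_words = len(words)
    n_chars = sum(len(w) for w in words)  # character count per word summed
    return {
        "avg_sentence_len_words": n_words / n_sents,
        "avg_sentence_len_chars": n_chars / n_sents,
        "avg_chars_per_word": n_chars / n_words,
    }
```

POS, parse tree, and entropy features would be computed analogously, but over tagged words, parse trees, and character/word distributions respectively.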

Linear Model Incorporating Feature Ranking -- Feature Ranking (1/2)
Feature ranking ranks features according to their informativeness, i.e., their importance for discriminating the reading levels of documents. An evaluation function is required to compute the informativeness score of each feature, and features are assumed to be independent of each other. Through feature ranking, the performance of the regression models may improve, since irrelevant features are eliminated.

Linear Model Incorporating Feature Ranking -- Feature Ranking (2/2)
Information gain (IG), also called Kullback-Leibler divergence, measures the difference (or distance) between two distributions: one is the class label (i.e., the reading level); the other is the selected feature. Chi-square (Chi) is a non-parametric statistical measure of the correlation (or dependence) between two distributions: one is the class label; the other is the selected feature.
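The IG criterion can be sketched as follows, assuming each feature has been discretized into bins so that IG = H(level) - H(level | feature) can be computed from counts. The function and variable names are illustrative, not the paper's.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_bins, levels):
    """IG between a discretized feature and the reading levels."""
    n = len(levels)
    cond = 0.0  # conditional entropy H(level | feature bin)
    for b in set(feature_bins):
        subset = [lvl for fb, lvl in zip(feature_bins, levels) if fb == b]
        cond += len(subset) / n * entropy(subset)
    return entropy(levels) - cond

def rank_features(feature_table, levels):
    """Return feature names sorted by descending information gain."""
    scores = {name: information_gain(bins, levels)
              for name, bins in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A Chi-based ranker would have the same shape, with a chi-square statistic over the feature-by-level contingency table as the scoring function.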

Linear Model Incorporating Feature Ranking -- Linear Regression Models
Linear regression models the relationship between a scalar dependent variable and one or more explanatory (independent) variables. In LiFR, we use both linear regression and log-linear regression to capture the relationship between reading levels and the selected high-rank features. A linear regression model (LR) computes the reading level (or readability score) of a document as a linear function of the selected feature values collected from the document, while a logarithmic linear regression model (LogLR) computes the readability score by a log-linear function.
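The two variants can be sketched with ordinary least squares. The slide does not give LogLR's exact parameterization, so the sketch below assumes one plausible reading: fit a linear function to log(level) and predict with the exponential.

```python
import numpy as np

def fit_lr(X, y):
    """Ordinary least squares: level ≈ w·x + b. Returns a predict function."""
    A = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xnew: np.hstack([Xnew, np.ones((len(Xnew), 1))]) @ w

def fit_loglr(X, y):
    """Log-linear variant: log(level) ≈ w·x + b, so level ≈ exp(w·x + b)."""
    predict_log = fit_lr(X, np.log(y))
    return lambda Xnew: np.exp(predict_log(Xnew))
```

With feature ranking, `X` would contain only the columns for the top-k ranked features.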

Empirical Study
RQ1. Based on the designed text features, can the linear regression models achieve performance comparable to other commonly used machine learning methods (e.g., SVR) for readability assessment of Chinese documents?
RQ2. For LiFR, can feature ranking further improve the performance of the linear models, and what is the effect of the selected feature set size on readability assessment?

Empirical Study -- Readability Corpus
For readability assessment of Chinese, we collect textbooks commonly used in mainland China, from grade 1 to grade 6 of primary school. In total we obtain 637 documents, which are nearly balanced across the six reading (grade) levels.

Empirical Study -- Experimental Design
We use the ICTCLAS tool to perform both word segmentation and POS tagging of the sentences in a document. In addition, we use the Stanford Parser to build one parse tree per sentence. The training set consists of 90% of the documents at each level, and the remaining 10% form the test set. The process is repeated 100 times to obtain statistically confident results. RMSE measures the deviation between predicted levels and the real ones. Acc measures the ratio of documents whose levels are correctly predicted. As regression models usually give decimal predictions, ±Acc measures the ratio of predictions that fall within one level of the real ones.
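The three evaluation measures described above can be sketched as follows; the sketch assumes decimal predictions are rounded to the nearest level for Acc and ±Acc, which the slide implies but does not state explicitly.

```python
import math

def rmse(pred, true):
    """Root mean squared error between predicted and real levels."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def acc(pred, true):
    """Fraction of predictions that round to the real level."""
    return sum(round(p) == t for p, t in zip(pred, true)) / len(true)

def adj_acc(pred, true):
    """±Acc: fraction of predictions within one level of the real one."""
    return sum(abs(round(p) - t) <= 1 for p, t in zip(pred, true)) / len(true)
```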

Empirical Study -- Performance of Linear Models (1/2)
For RQ1, we first investigate the performance of both linear regression (LR) and log-linear regression (LogLR) with the text features already used in available Chinese readability formulae. We build both LR models and LogLR models with the three feature sets respectively, and measure the performance of each model on our dataset. The results suggest that using more features does not always lead to better results.

Empirical Study -- Performance of Linear Models (2/2)
As the LogLR model shows superiority over the LR model for readability assessment, we next build both models using all the designed features (Section 3.1), and compare them with SVR (Support Vector Regression). The results suggest that SVR cannot supersede the linear models for readability assessment. On the other hand, it is not always true that the more features used, the better the model performs. Some form of feature selection is essential for building good regression models.

Empirical Study -- Performance of LiFR (1/2)
For RQ2, we implement LiFR to explore whether feature ranking can further improve the performance of the linear models. The results suggest that feature selection may be specific to the dataset used, and that neither IG nor Chi alone may be a good choice for LogLR. This confirms that IG and Chi by themselves are not enough to obtain the best performance of LiFR.

Empirical Study -- Performance of LiFR (2/2)
Among the text features developed for Chinese, some are ranked highly by the feature ranking techniques. On the other hand, the top-ranked features are not consistently selected, which suggests that the value of a feature may depend on both the ranking technique and the dataset. Overall, we can see that feature ranking is essential to improving the performance of linear regression models. A decision has to be made to select the most appropriate metric for real-world cases.

Conclusions (1/2)
In this paper, we present a method that uses a linear model incorporating feature ranking (LiFR), and develop four categories of text features to assess the readability of Chinese documents. We build both LR and LogLR models on the designed text features, and compare them with the commonly used machine learning method SVR. For LiFR, feature ranking can further improve the performance of the linear models. The experimental results indicate that some form of feature selection is essential to build useful linear functions.

Conclusions (2/2)
Based on the above, our future work includes designing new features for readability assessment, applying LiFR to different datasets and languages, and developing suitable feature selection methods beyond feature ranking.