Automatic Text Processing: Cross-Lingual Text Categorization (Dipartimento di Ingegneria dell’Informazione)

Presentation transcript:

Automatic Text Processing: Cross-Lingual Text Categorization
Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Siena
Dottorato di Ricerca in Ingegneria dell’Informazione, XVII ciclo
Candidate: Leonardo Rigutini
Advisor: Prof. Marco Maggini

Artificial Intelligence Research Group of Siena
Leonardo Rigutini – Dipartimento di Ingegneria dell’Informazione

Outline
− Introduction to Cross-Lingual Text Categorization:
  - Relationships with Cross-Lingual Information Retrieval
  - Possible approaches
− Text Categorization:
  - Multinomial Naive Bayes models
  - Distance distribution and term filtering
  - Learning with labeled and unlabeled data
− The algorithm:
  - The basic solution
  - The modified algorithm
− Experimental results and conclusions

Cross-Lingual Text Categorization
− The problem has emerged in recent years due to the large amount of documents available in many different languages
− Many companies want to categorize new documents according to an existing class structure without building a separate text management system for each language
− CLTC is closely related to Cross-Lingual Information Retrieval (CLIR):
  - Many works in the literature deal with CLIR
  - Very little work deals with CLTC

Cross-Lingual Information Retrieval
a) Poly-Lingual:
  - The data are composed of documents in different languages
  - The dictionary contains terms from the different languages
  - A large learning set with sufficient documents for each language is needed
  - A single classifier is trained
b) Cross-Lingual:
  - The language of each document is identified and the document is translated into a different one
  - A new classifier is trained for each language

a) Poly-Lingual
− Drawbacks:
  - Requires many learning documents for each language
  - High dimensionality of the dictionary:
    - n vocabularies
    - Many terms shared between languages
  - Feature selection is difficult due to the coexistence of many different languages
− Advantages:
  - Conceptually simple method
  - A single classifier is used
  - Quite good performance

b) Cross-Lingual
− Drawbacks:
  - Requires a translation step:
    - Very low translation quality
    - Named Entity Recognition (NER) issues
    - Time consuming
  - In some approaches, experts for each language are needed
− Advantages:
  - In general, it does not need experts for each language
− Three different approaches:
  1. Training set translation
  2. Test set translation
  3. “Esperanto”

1. Training set translation
− The classifier is trained with documents in language L2 translated from the L1 learning set:
  - L2 is the language of the unlabeled data
  - The learning set is highly noisy, so the classifier could show poor performance
− The system works on documents in language L2:
  - The number of translations is lower than in the test set translation approach
− Not much used in CLIR

2. Test set translation
− The model is trained using documents in language L1 without translation:
  - Training uses data not corrupted by noise
− The unlabeled documents in language L2 are translated into language L1:
  - The translation step is highly time consuming
  - Translation quality is very low and introduces much noise
  - A filtering phase on the test data after the translation is needed
− The translated documents are categorized by the classifier trained on language L1:
  - Possible inconsistency between training and unlabeled data

3. “Esperanto”
− All documents in every language are translated into a new universal language, Esperanto (LE):
  - The new language should preserve all the semantic features of each language
  - Very difficult to design
  - A large amount of knowledge about each language is needed
− The system works in this new universal language:
  - It needs the translation of both the training set and the test set
  - Very time consuming
− Rarely used in CLIR

From CLIR to CLTC
Following CLIR:
a) Poly-Lingual approach:
  - n mono-lingual text categorization problems, one for each language
  - It requires a labeled set for each language: experts who label the documents in each language
b) Cross-Lingual:
  1. Test set translation:
    - It requires the test set translation: time consuming
  2. Esperanto:
    - Very time consuming, and it requires a large amount of knowledge about each language
  3. Training set translation:
    - No proposals using this technique

CLTC problem formulation
− “Given a predefined category organization for documents in the language L1, the task is to classify documents in language L2 according to that organization without having to manually label any data in L2, since that requires experts in that language and is expensive.”
− The Poly-Lingual approach is not usable in this case, since it requires a learning set in the unknown language L2
− The “Esperanto” approach is not possible either, since it needs knowledge about all the languages
− Only the training set and test set translation approaches can be used for this type of problem

Outline
− Introduction to Cross-Lingual Text Categorization:
  - Relationships with Cross-Lingual Information Retrieval
  - Possible approaches
− Text Categorization:
  - Multinomial Naive Bayes models
  - Distance distribution and term filtering
  - Learning with labeled and unlabeled data
− The algorithm:
  - The basic solution
  - The modified algorithm
− Experimental results and conclusions

Naive Bayes classifier
− The two most successful techniques for text categorization:
  - Naive Bayes
  - SVM
− Naive Bayes:
  - A document di is assigned to the class Cj that maximizes P(Cj | di)
  - Using Bayes’ rule, this probability can be expressed as:

      P(Cj | di) = P(di | Cj) P(Cj) / P(di)

Multinomial Naive Bayes
− Since P(di) is a common factor over the classes, it can be neglected
− P(Cj) can be easily estimated from the document distribution over the classes in the training set, or otherwise it can be considered constant
− The naive assumption is that the presence of each word in a document is an independent event that does not depend on the other words. It allows us to write:

      P(di | Cj) = ∏t P(wt | Cj)^N(wt, di)

  where N(wt, di) is the number of occurrences of word wt in the document di.

Multinomial Naive Bayes
− Assuming that each document is drawn from a multinomial distribution of words, the probability of wt in class Cr can be estimated as:

      P(wt | Cr) = Σ(di ∈ Cr) N(wt, di) / Σs Σ(di ∈ Cr) N(ws, di)

− This method is very simple and is one of the most used in text categorization
− Despite the strong naive assumption, it yields good performance in most cases

Smoothing techniques
− A typical problem in probabilistic models is zero values:
  - If a feature was never observed during training, its estimated probability is 0. When the feature is then observed during classification, the 0 value cannot be used, since it makes the likelihood null
− The two main methods to avoid the zeros are:
  - Additive smoothing (add-one or Laplace):

      P(wt | Cr) = (1 + Σ(di ∈ Cr) N(wt, di)) / (|V| + Σs Σ(di ∈ Cr) N(ws, di))

  - Good-Turing smoothing
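The multinomial estimation and Laplace smoothing described above can be sketched in a few lines of code. This is a minimal illustration under the slides' formulas, not the thesis implementation; all function names, variable names and the toy data are my own.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, vocab):
    """Estimate multinomial Naive Bayes parameters with add-one smoothing."""
    # Prior P(C_r): document distribution over classes in the training set
    class_counts = Counter(labels)
    prior = {c: class_counts[c] / len(docs) for c in class_counts}
    # N(w_t, C_r): word counts per class, restricted to the vocabulary
    word_counts = defaultdict(Counter)
    for doc, c in zip(docs, labels):
        word_counts[c].update(w for w in doc if w in vocab)
    # Laplace smoothing: P(w_t|C_r) = (1 + N(w_t,C_r)) / (|V| + sum_s N(w_s,C_r))
    cond = {}
    for c in class_counts:
        total = sum(word_counts[c].values())
        cond[c] = {w: (1 + word_counts[c][w]) / (len(vocab) + total) for w in vocab}
    return prior, cond

def classify(doc, prior, cond):
    """argmax_r  log P(C_r) + sum_t N(w_t, d) log P(w_t | C_r)."""
    def log_score(c):
        return math.log(prior[c]) + sum(
            n * math.log(cond[c][w]) for w, n in Counter(doc).items() if w in cond[c])
    return max(prior, key=log_score)

# tiny illustration on hypothetical toy data
docs = [["goal", "match"], ["cpu", "ram"], ["match", "team"]]
labels = ["sport", "hw", "sport"]
vocab = {"goal", "match", "cpu", "ram", "team"}
prior, cond = train_multinomial_nb(docs, labels, vocab)
print(classify(["match", "goal"], prior, cond))  # → sport
print(classify(["cpu"], prior, cond))            # → hw
```

Thanks to the smoothing, even a word never seen in a class (e.g. "cpu" for sport) contributes a small non-zero probability instead of annihilating the likelihood.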

Distance distribution
− The distribution of documents in the feature space is almost uniform and does not form clouds
− The distance between two similar documents and the distance between two different documents are very close
− This depends on:
  - The high number of dimensions
  - The high number of non-discriminative words, which dominate the discriminative ones in the evaluation of the distances

Distances distribution
[Figure: plot of the pairwise document distance distribution; not included in the transcript]

Information Gain
− Term filtering:
  - Stopword list
  - Luhn reduction
  - Information gain
− Information gain of a term t:

      IG(t) = −Σj P(Cj) log P(Cj) + P(t) Σj P(Cj | t) log P(Cj | t) + P(¬t) Σj P(Cj | ¬t) log P(Cj | ¬t)
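A possible implementation of information-gain scoring, treating a term's presence/absence as the split and using base-2 logarithms (a sketch with names of my own choosing, not the thesis code):

```python
import math
from collections import Counter

def entropy(label_counts):
    """H(C) = -sum_j P(C_j) log2 P(C_j)."""
    total = sum(label_counts.values())
    return -sum(n / total * math.log2(n / total) for n in label_counts.values() if n)

def information_gain(docs, labels, term):
    """IG(t) = H(C) - P(t) H(C | t) - P(~t) H(C | ~t)."""
    gain, n = entropy(Counter(labels)), len(labels)
    for part in ([c for d, c in zip(docs, labels) if term in d],
                 [c for d, c in zip(docs, labels) if term not in d]):
        if part:
            gain -= len(part) / n * entropy(Counter(part))
    return gain

# a perfectly discriminative term carries IG = H(C) = 1 bit here;
# a term present in every document carries no information about the class
docs = [{"win", "the"}, {"win", "the"}, {"cpu", "the"}, {"cpu", "the"}]
labels = ["sport", "sport", "hw", "hw"]
print(information_gain(docs, labels, "win"))  # → 1.0
print(information_gain(docs, labels, "the"))  # → 0.0
```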

Learning from labeled and unlabeled data
− A new research area in Automatic Text Processing:
  - Building a large labeled dataset is usually time consuming and expensive
− Learning from labeled and unlabeled examples:
  - Use a small initial labeled dataset
  - Extract information from a large unlabeled dataset
− The idea is:
  - Use the labeled data to initialize a labeling process on the unlabeled data
  - Use the newly labeled data to build the classifier

Learning from labeled and unlabeled data
− EM algorithm:
  - E step: the data are labeled using the current parameter configuration
  - M step: the model is updated assuming the labels to be correct
− The model is initialized using the small labeled dataset
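The E/M loop above can be sketched with a multinomial Naive Bayes model as the classifier. This is a toy re-statement under add-one smoothing, not the thesis code; all names and the toy data are my own.

```python
import math
from collections import Counter

def nb_fit(docs, labels, vocab, classes):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    prior = {c: (1 + labels.count(c)) / (len(classes) + len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for d, c in zip(docs, labels):
        counts[c].update(w for w in d if w in vocab)
    cond = {c: {w: (1 + counts[c][w]) / (len(vocab) + sum(counts[c].values()))
                for w in vocab} for c in classes}
    return prior, cond

def nb_predict(doc, prior, cond):
    return max(prior, key=lambda c: math.log(prior[c])
               + sum(math.log(cond[c][w]) for w in doc if w in cond[c]))

def em_classify(labeled, labels, unlabeled, vocab, classes, iters=5):
    prior, cond = nb_fit(labeled, labels, vocab, classes)  # init from labeled data
    for _ in range(iters):
        guessed = [nb_predict(d, prior, cond) for d in unlabeled]    # E step
        prior, cond = nb_fit(labeled + unlabeled, labels + guessed,  # M step
                             vocab, classes)
    return [nb_predict(d, prior, cond) for d in unlabeled]

labeled = [["ball"], ["chip"]]
labels = ["sport", "hw"]
unlabeled = [["ball", "team", "ball"], ["chip", "ram"]]
vocab = {"ball", "team", "chip", "ram"}
print(em_classify(labeled, labels, unlabeled, vocab, ["sport", "hw"]))  # → ['sport', 'hw']
```

Note how words that appear only in the unlabeled data (here "team", "ram") get absorbed into the model through the M step, which is exactly the mechanism the cross-lingual algorithm exploits.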

Outline
− Introduction to Cross-Lingual Text Categorization:
  - Relationships with Cross-Lingual Information Retrieval
  - Possible approaches
− Text Categorization:
  - Multinomial Naive Bayes models
  - Distance distribution and term filtering
  - Learning with labeled and unlabeled data
− The algorithm:
  - The basic solution
  - The modified algorithm
− Experimental results and conclusions

Cross-Lingual Text Categorization
− The problem can be stated as:
  - We have a small labeled dataset in language L1
  - We want to categorize a large unlabeled dataset in language L2
  - We do not want to use experts for the language L2
− The idea is:
  - Translate the training set into the language L2
  - Initialize an EM algorithm with these very noisy data
  - Reinforce the behavior of the classifier using the unlabeled data in language L2

Notation
− With L1, L2 and L1→2 we indicate the languages 1 and 2, and L1 translated into L2
− We use these subscripts for the training set Tr, the test set Ts and the classifier C:
  - C1→2 indicates the classifier trained on Tr1→2, that is, the training set Tr1 translated into language L2

The basic algorithm
[Diagram] Tr1 is translated (1→2) into Tr1→2, which initializes the classifier C1→2; EM iterations (E step / M step) are then run on the test set Ts2, producing the results.

The basic algorithm
− Once the classifier is trained, it can be used to label a larger dataset
− This algorithm can start from a small initial dataset, which is an advantage since our initial dataset is very noisy
− Problems:
  - Data
  - Translation
  - Algorithm

Data
− Temporal dependency:
  - Documents about the same topic at different times deal with different themes
− Geographical dependency:
  - Documents about the same topic in different places deal with different persons, facts, etc.
− Goal: find the discriminative terms for each topic, independent of time and place

Translation
− The translator performs very poorly, especially when the text is badly written:
  - Named Entity Recognition (NER):
    - Words that should not be translated
    - Different words referring to the same entity
  - Word-sense disambiguation:
    - A fundamental problem in translation

Algorithm
− The EM algorithm has some important limitations:
  - The trivial solution is a good solution:
    - All documents in a single cluster
    - All the other clusters empty
  - It usually tends to form a few large central clusters and many small peripheral clusters:
    - This depends on the starting point and on the noise in the data added to the clusters at each EM step

Improved algorithm using IG
[Diagram] Tr1→2 is filtered with an information-gain filter IG(k1) before initializing the classifier C1→2; during the EM iterations over Ts2, a second filter IG(k2) is applied between the E step and the M step.

The filter k1
− Highly selective, since the data are composed of translated text and are very noisy
− It initializes the EM process by selecting the most informative words in the data

The filter k2
− It has a regularization effect on the EM algorithm:
  - It selects the most discriminative words at each EM iteration
  - The non-significant words do not influence the updating of the centroids in the EM iterations
− The parameter k2 should be higher than k1:
  - It works on the original (untranslated) data
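One way such a filter might work is to keep only the k top-scoring terms by information gain and project every document onto them before the model update. This is a sketch of that mechanic with made-up names, not the thesis implementation:

```python
import math
from collections import Counter

def entropy(label_counts):
    total = sum(label_counts.values())
    return -sum(n / total * math.log2(n / total) for n in label_counts.values() if n)

def info_gain(docs, labels, term):
    """IG of a term using presence/absence as the split."""
    gain, n = entropy(Counter(labels)), len(labels)
    for part in ([c for d, c in zip(docs, labels) if term in d],
                 [c for d, c in zip(docs, labels) if term not in d]):
        if part:
            gain -= len(part) / n * entropy(Counter(part))
    return gain

def select_vocab(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocab = set().union(*(set(d) for d in docs))
    return set(sorted(vocab, key=lambda w: info_gain(docs, labels, w), reverse=True)[:k])

def project(doc, vocab):
    """Drop the non-selected words so they cannot influence the model update."""
    return [w for w in doc if w in vocab]

# "gol"/"cpu" separate the classes; "il" appears everywhere and is filtered out
docs = [["gol", "il"], ["gol", "il"], ["cpu", "il"], ["cpu", "il"]]
labels = ["sport", "sport", "hw", "hw"]
print(select_vocab(docs, labels, 2))  # the two discriminative terms
```

In the experiments reported later in the deck, the two filters use k1 = 300 (on the noisy translated training set) and k2 = 1000 (re-applied to the current labeling at each EM iteration).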

Outline
− Introduction to Cross-Lingual Text Categorization:
  - Relationships with Cross-Lingual Information Retrieval
  - Possible approaches
− Text Categorization:
  - Multinomial Naive Bayes models
  - Distance distribution and term filtering
  - Learning with labeled and unlabeled data
− The algorithm:
  - The basic solution
  - The modified algorithm
− Experimental results and conclusions

Previous works
− Nuria et al. used the ILO corpus and two languages (English and Spanish) to test three different approaches to CLTC:
  - Poly-lingual
  - Test set translation
  - Profile-based translation
− They used the Winnow (ANN) and Rocchio algorithms
− They compared the results with a monolingual test
− Low performance: 70%–75%

Multi-lingual dataset
− Very few multi-lingual datasets are available:
  - None of them includes Italian
− We built the dataset by crawling newsgroups
− Newsgroups offer:
  - The same groups in different languages
  - A large number of available messages
  - Different levels for each topic

Multi-lingual dataset
− Multi-lingual dataset composition:
  - Two languages: Italian (LI) and English (LE)
  - Three groups: auto, hardware and sport
[Table: number of documents in TrI, TrE and TsI for Auto, Hw, Sports and the total; the values did not survive the transcript]

Multi-lingual dataset
− Drawbacks:
  - Short messages
  - Informal documents:
    - Slang terms
    - Badly written words
  - Often transversal topics:
    - Advertising, spam, other current topics (e.g. elections)
  - Temporal dependency: the same topic at two different moments deals with different problems
  - Geographical dependency: the same topic in two different places deals with different persons, facts, etc.

Monolingual test
− No translation
− Training set and test set in the Italian language: the classifier CI is trained on TrI and evaluated on TsI
− Results are averaged over a ten-fold cross-validation

  TsI test set   Recall           Precision
  Auto           …,01 ± 1,03%     93,76 ± 1,09%
  Hw             96,21 ± 0,93%    93,01 ± 0,45%
  Sports         92,89 ± 1,12%    96,74 ± 1,24%
  total          …,43 ± 0,90%

Baseline multilingual test
− Translation from English to Italian: the classifier CE→I is trained on TrE→I (TrE translated into Italian) and evaluated on TsI
− Results are averaged over a ten-fold cross-validation

  TsI test set   Recall           Precision
  Auto           …,56 ± 5,34%     66,56 ± 4,76%
  Hw             87,24 ± 2,02%    63,35 ± 3,72%
  Sports         50,95 ± 6,28%    88,22 ± 4,36%
  total          …,26 ± 4,22%

Simple EM algorithm
− Translation from English to Italian: the classifier CE→I is initialized on TrE→I and refined with EM iterations (E step / M step) on TsI
− Results are averaged over a ten-fold cross-validation

  TsI test set   Recall           Precision
  Auto           …,32 ± 1,05%     51,40 ± 1,00%
  Hw             98,04 ± 1,01%    61,55 ± 0,98%
  Sports         0,73 ± 0,41%     65,41 ± 0,05%
  total          …,32 ± 1,10%

Filtered EM algorithm
− Translation from English to Italian; IG filters with k1 = 300 and k2 = 1000
− Results are averaged over a ten-fold cross-validation

  TsI test set   Recall           Precision
  Auto           …,59 ± 1,05%     87,07 ± 1,02%
  Hw             87,88 ± 0,98%    92,78 ± 0,88%
  Sports         91,01 ± 1,03%    92,28 ± 0,90%
  total          …,64 ± 0,96%

Conclusions
− The filtered EM algorithm performs better than the other algorithms in the literature
− It does not need an initial labeled dataset in the target language:
  - No other proposed algorithm has this feature
− It achieves good results starting from a few translated documents:
  - It does not require much time for translation