
1 Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm
Chen, Yi-wen (陳憶文), Graduate Institute of Computer Science & Information Engineering, National Taiwan Normal University
Main Reference: Z. Xu, C. Hogan, R. Bauer, "Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm," ICDM Workshops, pp. 326-331, 2009.

2 I. INTRODUCTION
1. Effective active learning algorithms reduce human labeling effort and produce better learning results.
2. However, efficient active learning algorithms for real-world, large-scale data have not yet been well addressed, either in the machine learning community or in practical industrial applications.
3. The existing batch mode active learning algorithms cannot get past the computational bottleneck of the greedy algorithm, which takes O(KN) time, where K is the number of examples in the batch and N is the total number of unlabeled examples in the collection.
We prove that the selection objective function is submodular, i.e., it exhibits the diminishing returns property (stated formally below): labeling a datum when we have only a small amount of labeled data yields more learnable information for the underlying classifier than labeling it when we already have a large amount of labeled data.
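For reference, the diminishing returns property above is exactly the standard definition of submodularity; a minimal statement in LaTeX (notation ours, not taken from the slides):

```latex
% Submodularity (diminishing returns), stated for the reward R over the
% unlabeled pool N: for all A \subseteq B \subseteq N and x \in N \setminus B,
R(A \cup \{x\}) - R(A) \;\ge\; R(B \cup \{x\}) - R(B)
% i.e., the marginal reward of labeling x can only shrink as the
% labeled (selected) set grows.
```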

3 II. RELATED WORK
1. Several active learning algorithms [3], [4] have been proposed to improve the support vector machine classifier. Coincidentally, these approaches use the same selection scheme: choose the next unlabeled example closest to the current decision hyperplane in the kernel space.
2. Brinker [5] incorporated a diversity measure into the batch mode support vector machine active learning problem. His algorithm scores each unlabeled example by combining its distance to the decision boundary [4] with its diversity, i.e., its distance to the already selected data (sketched below).
3. Batch mode active learning with diversity has also been applied to relevance feedback in information retrieval.
Common ground:
1. They explicitly or implicitly model the diversity of the selected set.
2. They solve the NP-hard combinatorial optimization problem with a greedy algorithm.
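As an illustration of Brinker's combined scoring scheme described in item 2, here is a minimal sketch; the weighting parameter `lam`, the cosine-similarity diversity measure, and all function names are our assumptions, not taken from [5]:

```python
import numpy as np

def brinker_scores(margins, X, selected_idx, lam=0.5):
    """Combined score: closeness to the SVM hyperplane plus diversity.

    margins      -- |f(x)| for each candidate (distance to hyperplane)
    X            -- row-normalized feature vectors of the candidates
    selected_idx -- indices already chosen for the batch
    lam          -- uncertainty/diversity trade-off (assumed parameter)
    """
    # Uncertainty term: smaller margin => more uncertain => higher score.
    uncertainty = -np.asarray(margins, dtype=float)
    if not selected_idx:
        return uncertainty
    # Diversity term: penalize similarity to the closest selected example.
    sim = X @ X[selected_idx].T       # cosine similarity (rows are normalized)
    diversity = -sim.max(axis=1)      # far from all selected => high score
    return lam * uncertainty + (1 - lam) * diversity
```

A greedy loop would call this after each selection, rescoring the candidates against the growing selected set.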

4 Submodular Objective Function
In the batch mode active learning problem, we aim to select a subset A of unlabeled examples from the full unlabeled pool N to acquire labels. We formulate batch mode active learning as a constrained optimization problem: select the set of data that maximizes the reward objective function while staying within the defined cost constraint:

\max_{A \subseteq N} R(A) \quad \text{subject to} \quad C(A) \le B \qquad (1)

R(A): the reward function of a candidate unlabeled set A
C(A): the cost of labeling A
B: the cost constraint (budget)
The informativeness of unlabeled examples to the classifier is well captured by their uncertainty and diversity.

5 Submodular Objective Function
Uncertainty is a widely used selection criterion for pool-based active learning algorithms. It can be measured by different heuristics, including uncertainty sampling in the logistic regression classifier [9], query by committee in the Naïve Bayes classifier [10], and version space reduction in the support vector machine classifier [4]. We focus only on support vector machine classifiers in this paper.
Among the SVM criteria, the MaxMin margin and ratio margin algorithms need to retrain the SVM classifier, which requires significant computational overhead. So we use the simple margin algorithm, which measures the uncertainty of an unlabeled example by its distance to the current separating hyperplane (see the sketch below).
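A minimal, runnable sketch of the simple margin criterion; the synthetic data and LinearSVC setup are illustrative assumptions, not the paper's configuration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Illustrative data: a small labeled set and a larger unlabeled pool.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_lab, y_lab, X_unl = X[:100], y[:100], X[100:]

clf = LinearSVC().fit(X_lab, y_lab)

# Simple margin: |f(x)| is proportional to the distance to the separating
# hyperplane; the smallest values mark the most uncertain examples.
margins = np.abs(clf.decision_function(X_unl))
most_uncertain = np.argsort(margins)[:10]   # indices of the 10 most uncertain
```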

6 Submodular Objective Function
[Equation (2), presented as an image in the original slide: the reward function R(A), combining the uncertainty and diversity of the selected examples.]

7 Submodular Objective Function
[Derivation presented as an image in the original slide.]

8 More formally, based on the proof above, we obtain the following theorem.
[Theorem presented as an image in the original slide; per the conclusions, it states that the batch mode active learning objective function is submodular.]

9 Lazy Active Learning Algorithm
The greedy algorithm first selects the example with the largest uncertainty, then calculates the diversity of the remaining examples and selects the example with the largest combined score, repeating until the batch is full (a sketch follows below). However, the total complexity of the greedy algorithm is O(KN) when selecting a subset of K examples from a pool of N candidates.
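A sketch of this naive greedy selection, under the same assumed `lam`-weighted combination of uncertainty and min-distance diversity as above; each of the K rounds scans all N candidates, which is where the O(KN) cost comes from:

```python
import numpy as np

def greedy_batch(uncertainty, X, K, lam=0.5):
    """Naive greedy batch selection: K rounds, each scanning all N candidates.

    uncertainty -- per-example uncertainty scores (higher = more uncertain)
    X           -- row-normalized feature vectors (cosine sim = dot product)
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    n = len(uncertainty)
    selected = [int(np.argmax(uncertainty))]       # seed with the most uncertain
    min_div = np.full(n, np.inf)                   # distance to nearest selected
    for _ in range(K - 1):
        d = 1.0 - X @ X[selected[-1]]              # O(N) distances per round
        min_div = np.minimum(min_div, d)
        score = lam * uncertainty + (1 - lam) * min_div
        score[selected] = -np.inf                  # never reselect an example
        selected.append(int(np.argmax(score)))     # K rounds => O(KN) overall
    return selected
```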

10 Lazy Active Learning Algorithm
We further exploit the submodularity of the objective function to reduce the number of pairwise distance calculations. We first find the example with the largest cached marginal reward. If the distances between this example and any of the previously selected examples have not yet been calculated, we update its diversity by calculating those distances. If the updated marginal reward of this example is still the largest, we select it.
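A hedged sketch of this lazy evaluation idea, in the style of the well-known CELF lazy greedy technique; this is not the paper's exact pseudocode, and `lam`, the cosine distance, and the `D_MAX` bound (valid for nonnegative row-normalized tf-idf vectors) are our assumptions:

```python
import heapq
import numpy as np

D_MAX = 1.0  # max cosine distance for nonnegative row-normalized vectors

def lazy_greedy_batch(uncertainty, X, K, lam=0.5):
    """Lazy greedy batch selection: refresh a candidate's stale score only
    when it surfaces at the top of the heap, skipping most distances."""
    uncertainty = np.asarray(uncertainty, dtype=float)
    n = len(uncertainty)
    selected = [int(np.argmax(uncertainty))]
    min_div = np.full(n, D_MAX)                    # diversity upper bound
    # Max-heap entries: (negated cached score, index, #selected accounted for)
    heap = [(-(lam * uncertainty[i] + (1 - lam) * D_MAX), i, 0)
            for i in range(n) if i != selected[0]]
    heapq.heapify(heap)
    while len(selected) < K and heap:
        neg_score, i, upto = heapq.heappop(heap)
        if upto == len(selected):                  # cache is current: best pick
            selected.append(i)
            continue
        # Stale cache: pay only for the missing distance calculations.
        for j in selected[upto:]:
            min_div[i] = min(min_div[i], 1.0 - X[i] @ X[j])
        score = lam * uncertainty[i] + (1 - lam) * min_div[i]
        heapq.heappush(heap, (-score, i, len(selected)))
    return selected
```

Because the min-distance diversity term can only shrink as the selected set grows, every cached score is an upper bound on the true score; by submodularity, a refreshed entry that stays on top of the heap is therefore safe to select without touching the other candidates.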

11 Lazy Active Learning Algorithm
[Algorithm pseudocode presented as an image in the original slide.]

12 IV. EXPERIMENTS
The algorithm behaves differently on different datasets, so we selected 3 datasets in our experiments to cover a wide range of properties.
1. We consider the binary text classification task CCAT from the Reuters RCV1 collection [13]. This task has an almost balanced class ratio.
2. We use the C11 category from the RCV1 collection [13], since it has an unbalanced class ratio.
3. We include topic 103 from the TREC Legal 2008 interactive task [1]. This task models a real-world e-discovery task, which aims to find information relevant to a legal subject matter.
The final TREC Legal dataset we use contains 6421 labeled documents: 3440 non-relevant and 2981 relevant. We randomly sample 3000 documents as the test set and use the remaining 3421 documents as the training set. We use the three text-only fields (text body, title, brand).

13 IV. EXPERIMENTS
[Result figures presented as images in the original slide.]

14 How Efficient is the Active Learning Algorithm?
We compare the running time of our lazy active learning algorithm with two versions of the greedy algorithm: a greedy algorithm using an inverted index, and a greedy algorithm using pairwise cosine distance calculation.

15 How Efficient is the Active Learning Algorithm?
We fixed the number of feedback documents at 100 and varied the total number of training documents in the pool. For all three datasets, we use 12.5%, 25%, 50%, and 100% of the training data as the sampling pool, and compare the speed of lazy active learning, inverted-index greedy active learning, and pairwise greedy active learning.

16 V. CONCLUSIONS
To summarize, the major contributions of this paper are:
1. We propose a generalized objective function for batch mode active learning and show that it is submodular. Based on this submodularity, we propose an extremely fast lazy active learning algorithm.
2. We extensively evaluate our new approach on several real-world text classification tasks in terms of classification accuracy and computational efficiency.

