Ensemble methods with Data Streams


1 Ensemble methods with Data Streams
Jungbeom Lee CS240B

2 Outline
Intro
Ensemble in machine learning
Online ensemble algorithms
Future work

3 Intro
Previous class: Data Streams
Classifiers
Ensemble methods
Online algorithm

4 Classifiers
The batch classification problem: given a finite training set D = {(x, y)}, where y ∈ {y1, y2, …, yk} and |D| = n, find a function y = f(x) that can predict the y value for an unseen instance x.
The data stream classification problem: given an infinite sequence of pairs of the form (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x.
Example applications:
Fraud detection in credit card transactions
Topic classification in a news aggregation site, e.g. Google News
Translator for foreign languages
Supervised learning
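To make the stream setting concrete, here is a minimal sketch of a test-then-train loop: each incoming pair (x, y) is first used to test the current model and only then to update it. The `OnlineCentroid` learner and the `prequential_accuracy` helper are hypothetical names introduced for illustration, and x is assumed to be a tuple of numbers.

```python
from collections import defaultdict


class OnlineCentroid:
    """Hypothetical learner: keeps one running mean per class (memory does not grow with the stream)."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = {}

    def predict(self, x):
        if not self.mean:
            return None  # nothing learned yet

        def dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.mean[label]))

        return min(self.mean, key=dist)

    def learn(self, x, y):
        self.count[y] += 1
        n = self.count[y]
        if y not in self.mean:
            self.mean[y] = list(x)
        else:
            # Incremental mean update: constant time per sample.
            self.mean[y] = [m + (a - m) / n for m, a in zip(self.mean[y], x)]


def prequential_accuracy(stream, model):
    """Test on each example before training on it; return the fraction predicted correctly."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        total += 1
        model.learn(x, y)
    return correct / total if total else 0.0
```

Because `stream` is consumed one pair at a time, it can be any iterator, including one that never ends if the loop is adapted to report accuracy periodically.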

5 Motivations
Online mining differs from static mining:
Data volume: it is impossible to mine the entire data at one time; we can only afford constant memory per data sample.
Changing data characteristics: previously learned models become invalid.
Cost of learning: model updates can be costly; we can only afford constant time per data sample.

6 Ensemble
A set of classifiers whose individual decisions are combined in some way to classify new examples.
An ensemble of classifiers can be more accurate than any of its individual members.
One key to a successful ensemble is to use individual classifiers with error rates below 0.5.
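The role of the 0.5 threshold can be illustrated with a small calculation. The sketch below assumes a two-class problem, an odd number n of classifiers, and errors that are independent across classifiers (an idealization); `majority_vote_error` is a hypothetical helper name.

```python
from math import comb


def majority_vote_error(n, p):
    """Probability that more than half of n independent classifiers,
    each with error rate p, are wrong at the same time."""
    k_min = n // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


# With p below 0.5 the majority vote errs less often than a single member;
# with p above 0.5 the vote makes things worse.
print(majority_vote_error(3, 0.3))   # about 0.216, lower than 0.3
print(majority_vote_error(21, 0.3))
print(majority_vote_error(21, 0.6))
```

Under these assumptions the ensemble error shrinks as n grows when p < 0.5 and grows when p > 0.5. Real classifiers are rarely independent, so the actual gain is usually smaller.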

7 Reasons

8 Ensemble methods
Manipulating the training examples: Bagging, AdaBoost
Injecting randomness: C4.5 decision tree algorithm

9 Bagging algorithm

10 Bagging algorithm
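As a reminder of the general technique, here is a minimal sketch of standard batch bagging: train each base model on a bootstrap sample (drawn with replacement, same size as the training set) and combine predictions by majority vote. This is not necessarily the formulation shown on the slide. The function names and the 1-nearest-neighbour base learner on scalar inputs are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def train_1nn(sample):
    """Hypothetical base learner: 1-nearest-neighbour on scalar inputs."""
    def model(x):
        return min(sample, key=lambda point: abs(point[0] - x))[1]
    return model


def bagging_fit(data, train, n_models, rng):
    """Train n_models base models, each on its own bootstrap sample of data."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # sampling with replacement
        models.append(train(sample))
    return models


def bagging_predict(models, x):
    """Majority vote over the base models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


data = [(i / 20, "a") for i in range(20)] + [(9 + i / 20, "b") for i in range(20)]
models = bagging_fit(data, train_1nn, n_models=25, rng=random.Random(0))
print(bagging_predict(models, 0.5), bagging_predict(models, 9.5))
```

Note that `bagging_fit` needs the whole training set in memory to resample from it, which is exactly what the stream setting rules out.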

11 Online bagging algorithm
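One well-known way to adapt bagging to streams is the online bagging of Oza and Russell: instead of resampling a stored training set, each base model is updated k times with every arriving example, where k is drawn from a Poisson(1) distribution. The sketch below assumes that this is the variant meant here. The class names, the scalar inputs, and the centroid base learner are hypothetical choices for illustration.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


class CentroidLearner:
    """Hypothetical incremental base learner: one running mean per class, scalar x."""

    def __init__(self):
        self.n = {}
        self.mean = {}

    def learn(self, x, y):
        self.n[y] = self.n.get(y, 0) + 1
        m = self.mean.get(y, x)
        self.mean[y] = m + (x - m) / self.n[y]

    def predict(self, x):
        if not self.mean:
            return None
        return min(self.mean, key=lambda c: abs(x - self.mean[c]))


class OnlineBagging:
    def __init__(self, make_learner, n_models, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_learner() for _ in range(n_models)]

    def learn(self, x, y):
        # Each model sees the example k ~ Poisson(1) times (possibly zero).
        for model in self.models:
            for _ in range(poisson(1.0, self.rng)):
                model.learn(x, y)

    def predict(self, x):
        preds = (model.predict(x) for model in self.models)
        votes = Counter(p for p in preds if p is not None)
        return votes.most_common(1)[0][0] if votes else None
```

The Poisson(1) draw mimics the number of times an example would appear in a bootstrap sample of a large data set, so no example has to be stored after it has been processed.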

12 Online weighted bagging algorithm

13 AdaBoost algorithm

14 AdaBoost algorithm
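For reference, a minimal sketch of standard discrete AdaBoost for two classes (labels +1 and -1) follows; it is not necessarily the exact version shown on the slide. Threshold stumps on scalar inputs are assumed as the weak learner, and `train_stump`, `adaboost`, and `predict` are hypothetical names.

```python
import math


def train_stump(xs, ys, w):
    """Return (weighted error, threshold, sign) of the best stump.
    The stump predicts `sign` when x >= threshold and `-sign` otherwise."""
    best = None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if (sign if x >= thr else -sign) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def adaboost(xs, ys, rounds):
    n = len(xs)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = train_stump(xs, ys, w)
        if err >= 0.5:  # weak learner no better than chance: stop
            break
        err = max(err, 1e-12)  # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * y * (sign if x >= thr else -sign))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * (sign if x >= thr else -sign) for alpha, thr, sign in ensemble)
    return 1 if score >= 0 else -1


xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]  # no single stump separates these labels
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

The stopping test `err >= 0.5` is the same condition noted for ensembles in general: a weak learner is only useful while its weighted error stays below 0.5.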

15 Adaptive boosting algorithm

16 Experimental Results

17 Type of Data

18 Experimental Results

19 Experimental Results

20 Experimental Results

21 Future work
Better online algorithm for Bagging
Dealing with multiple data types


