
1 Transfer Learning Task

2 Problem Identification
Dataset A (Year: 2000, Features: 48): Training -> Model ‘M’ -> Testing: 98.6%; Training -> Model ‘M’ -> Testing: 97%
Dataset B (Year: 2006, Features: 96): Model ‘M’ -> Testing: 60.9%; Training -> ??

3 Transfer Learning
Transfer learning is the improvement of learning in a new task through the transfer of knowledge from a related task that has already been learned.

4 Traditional Machine Learning vs. Transfer Learning
[Figure, two panels. Traditional Machine Learning: Different Tasks, each with its own Learning System. Transfer Learning: Source Task -> Knowledge -> Learning System for the Target Task.]

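5 Transfer Learning Definition
Given a source domain $\mathcal{D}_S$ and source learning task $\mathcal{T}_S$, a target domain $\mathcal{D}_T$ and a target learning task $\mathcal{T}_T$, transfer learning aims to help improve the learning of the target predictive function $f_T(\cdot)$ using the source knowledge, where $\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$.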

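6 Transfer Definition
Therefore, transfer learning applies if either of the following holds:
Domain differences: the source and target domains differ, $\mathcal{D}_S \neq \mathcal{D}_T$
Task differences: the source and target tasks differ, $\mathcal{T}_S \neq \mathcal{T}_T$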
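The two conditions can be unpacked a little further. The block below is a sketch in the notation commonly used in the transfer learning survey literature; the symbols $\mathcal{X}$ (feature space), $P(X)$ (marginal distribution), $\mathcal{Y}$ (label space) and $f(\cdot)$ (predictive function) are assumptions here, because the slide's own formulas did not survive in the transcript.

```latex
\begin{align*}
\mathcal{D} &= \{\mathcal{X},\, P(X)\}, & \mathcal{T} &= \{\mathcal{Y},\, f(\cdot)\} \\
\mathcal{D}_S \neq \mathcal{D}_T &\iff \mathcal{X}_S \neq \mathcal{X}_T \ \text{or}\ P(X_S) \neq P(X_T) \\
\mathcal{T}_S \neq \mathcal{T}_T &\iff \mathcal{Y}_S \neq \mathcal{Y}_T \ \text{or}\ f_S(\cdot) \neq f_T(\cdot)
\end{align*}
```

Read this way, a change in the feature set is a domain difference, and a change in the set of class labels is a task difference.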

7 Examples: Cancer Data
Features in one data set: Age, Smoking
Features in the other data set: Age, Height, Smoking

8 Examples: Cancer Data
Source task: classify into cancer or no cancer
Target task: classify into cancer level one, cancer level two, cancer level three

9 Settings of Transfer Learning
Transfer learning setting        Labelled data in source domain   Labelled data in target domain   Tasks
Inductive Transfer Learning      ×                                √                                Classification, Regression, …
                                 √                                √
Transductive Transfer Learning   √                                ×                                Classification, Regression, …
Unsupervised Transfer Learning   ×                                ×                                Clustering, …
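The table can be read as a simple decision rule on label availability. The following is a minimal sketch of that rule; the function name `transfer_setting` and its string return values are hypothetical, chosen only for illustration.

```python
def transfer_setting(source_labelled: bool, target_labelled: bool) -> str:
    """Map label availability to the transfer learning setting in the table."""
    if target_labelled:
        # Both inductive rows of the table have labelled target data,
        # with or without labelled source data.
        return "inductive"
    if source_labelled:
        # Labels exist only in the source domain.
        return "transductive"
    # No labels in either domain.
    return "unsupervised"


if __name__ == "__main__":
    for src in (False, True):
        for tgt in (False, True):
            print(f"source={src}, target={tgt} -> {transfer_setting(src, tgt)}")
```

The rule only encodes the label columns; the task column (classification, regression, clustering) follows from the setting and is not checked here.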

10 Questions to answer when transferring
What to transfer? Instances? Model? Features?
How to transfer? Map model? Unify features? Weight instances?
When to transfer? In which situations?

11 What to Transfer?
Transfer learning approaches and their descriptions:
Instance-transfer: re-weight some labeled data in a source domain for use in the target domain.
Feature-representation-transfer: find a “good” feature representation that reduces the difference between a source and a target domain or minimizes the error of models.
Model-transfer: discover shared parameters or priors of models between a source domain and a target domain.
Relational-knowledge-transfer: build a mapping of relational knowledge between a source domain and a target domain.

12 Inductive Transfer Learning (Instance-transfer)
Assumption: the source domain and target domain data use exactly the same features and labels.
Motivation: although the source domain data cannot be reused directly, some parts of the data can still be reused by re-weighting.
Main idea: discriminatively adjust the weights of data in the source domain for use in the target domain.

13 Instance-transfer Assumptions
Source and target tasks have the same feature space: $\mathcal{X}_S = \mathcal{X}_T$
Marginal distributions are different: $P(X_S) \neq P(X_T)$
Not all source data might be helpful!

14 Algorithms
What to transfer? Instances
How to transfer? Weight instances

15 Algorithm: TrAdaBoost
Idea: iteratively reweight source samples so as to:
  reduce the effect of “bad” source instances
  encourage the effect of “good” source instances
Requires:
  a labeled source task data set
  a very small labeled target task data set
  an unlabeled target data set
  a base learner
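The reweighting idea can be sketched as a single update step. The code below is a minimal illustration based on the commonly published form of the TrAdaBoost update, not a full implementation: the base learner, the training loop and the final vote are left out, and the function name `tradaboost_reweight` and its arguments are hypothetical. It assumes per-sample errors |h(x) - y| in [0, 1] have already been computed from the current base learner.

```python
import math


def tradaboost_reweight(w_src, w_tgt, err_src, err_tgt, n_rounds):
    """One TrAdaBoost-style weight update (sketch).

    w_src, w_tgt     -- current weights of source / labeled target samples
    err_src, err_tgt -- per-sample errors |h(x) - y| in [0, 1]
    n_rounds         -- total number of boosting rounds planned
    """
    n = len(w_src)  # must be at least 1
    # Weighted error of the current learner, measured on target data only.
    eps = sum(w * e for w, e in zip(w_tgt, err_tgt)) / sum(w_tgt)
    # Guard (an assumption of this sketch): keep eps inside (0, 1/2)
    # so that beta_t is positive and below 1.
    eps = min(max(eps, 1e-10), 0.499)
    beta_t = eps / (1.0 - eps)
    # Fixed factor for source samples, in (0, 1].
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / n_rounds))
    # Misclassified source samples shrink: they look "bad" for the target.
    new_src = [w * beta ** e for w, e in zip(w_src, err_src)]
    # Misclassified target samples grow, as in ordinary AdaBoost.
    new_tgt = [w * beta_t ** (-e) for w, e in zip(w_tgt, err_tgt)]
    return new_src, new_tgt


if __name__ == "__main__":
    src, tgt = tradaboost_reweight(
        w_src=[1.0, 1.0], w_tgt=[1.0, 1.0, 1.0, 1.0],
        err_src=[1.0, 0.0], err_tgt=[1.0, 0.0, 0.0, 0.0],
        n_rounds=10,
    )
    print("source weights:", src)
    print("target weights:", tgt)
```

In a full loop the weights would be normalized before each round and the base learner retrained on the combined, weighted source and target data.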

16 Our Case
[Figure: D1 -> M -> D2 (%); Transfer Learning for D2.]

17 Self-taught clustering
Unsupervised transfer learning
Co-clustering, no labelled data
Feature-based transfer learning
Features are not the same
Tasks may not be the same
First applied to image clustering
Key idea: find high-level shared features that give a new feature representation

18 Self Taught Learning

19 Self taught learning

20 Latent Dirichlet Allocation (LDA)
LDA is a generative probabilistic model of a corpus. The basic idea is that documents are represented as random mixtures over latent topics, where a topic is characterized by a distribution over words.
Typically used for topic modeling: forums, Twitter messages, text corpora.
Does not consider word order.
Can be viewed as a dimension reduction technique.
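The generative view can be illustrated by sampling one document. This is a minimal sketch using only the Python standard library; the two toy topics, the vocabulary, the `alpha` values and the function names are all made up for illustration, and no inference (fitting topics to real text) is shown.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, n_words, rng):
    """Generate one document as a random mixture over latent topics."""
    # Per-document topic mixture.
    theta = sample_dirichlet(alpha, rng)
    doc = []
    for _ in range(n_words):
        # Pick a latent topic for this word position.
        z = rng.choices(range(len(topics)), weights=theta)[0]
        # Pick a word from that topic's distribution over words.
        words, probs = zip(*topics[z].items())
        doc.append(rng.choices(words, weights=probs)[0])
    return theta, doc


# Hypothetical toy topics: each is a distribution over words.
TOPICS = [
    {"tumor": 0.5, "cell": 0.3, "smoking": 0.2},
    {"model": 0.5, "feature": 0.3, "training": 0.2},
]

if __name__ == "__main__":
    rng = random.Random(0)
    theta, doc = generate_document(TOPICS, alpha=[0.5, 0.5], n_words=8, rng=rng)
    print("topic mixture:", theta)
    print("document:", doc)
```

Each word is drawn independently given the topic mixture, which is why word order carries no information in this model, and the mixture `theta` is the low-dimensional representation of the document.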

