Mapping Between Taxonomies Elena Eneva 30 Oct 2001 Advanced IR Seminar.


1 Mapping Between Taxonomies
Elena Eneva
30 Oct 2001
Advanced IR Seminar

2 Idea Review
By country
–German
–French
By industry
–Textile
–Automobile

3 Learning Algorithms
2 separate learners for the documents
–Old doc category -> new doc category
–Doc contents -> new category
Put together
–Weighted average based on confidence
–Final result determined by a decision tree
One combined learner – uses both old category and contents as features
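The "weighted average based on confidence" option could be sketched roughly as follows. This is only an illustration: the slides do not say how confidence is defined, so the sketch assumes each learner emits a probability distribution over new categories plus a single scalar confidence. The function name `weighted_average` and all numbers are hypothetical.

```python
def weighted_average(dist_a, conf_a, dist_b, conf_b):
    """Blend two category distributions, weighting each by its learner's confidence."""
    total = conf_a + conf_b
    if total == 0:
        raise ValueError("at least one confidence must be positive")
    categories = set(dist_a) | set(dist_b)
    blended = {
        c: (conf_a * dist_a.get(c, 0.0) + conf_b * dist_b.get(c, 0.0)) / total
        for c in categories
    }
    # The predicted new category is the one with the highest blended score.
    best = max(blended, key=blended.get)
    return best, blended


# Hypothetical outputs: one learner sees only the old category, the other only the text.
from_old_category = {"Textile": 0.7, "Automobile": 0.3}
from_contents = {"Textile": 0.4, "Automobile": 0.6}
best, blended = weighted_average(from_old_category, 0.9, from_contents, 0.3)
```

With these made-up inputs the more confident learner dominates, so the blend favours its preferred category even though the two learners disagree.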

4 Data Sets
Hoovers – 4285 documents
–28 categories
–255 categories
Reuters 2001 – 810597 documents
–Topics
–Industry categories

5 Current System
Simple Decision Tree (C4.5) – learns probabilities of new categories based on old categories (doesn’t know about documents/words)
Naïve Bayes (rainbow) – word-based classification into the new categories (doesn’t know about old categories)
Combination (Decision Tree) – takes the outputs and confidences of the two, predicts new category
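A minimal sketch of how the pieces of such a system might fit together. It does not use C4.5 or rainbow: the old-category learner is replaced by a plain conditional frequency table (a stand-in, on the assumption that a tree over a single categorical feature learns much the same thing), and the Naïve Bayes output is supplied as a hand-written distribution. The function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_old_to_new(pairs):
    """Estimate P(new category | old category) from (old, new) training pairs."""
    counts = defaultdict(Counter)
    for old, new in pairs:
        counts[old][new] += 1
    return {
        old: {new: n / sum(c.values()) for new, n in c.items()}
        for old, c in counts.items()
    }


def predict_with_confidence(dist):
    """Return the most probable category and its probability as the confidence."""
    label = max(dist, key=dist.get)
    return label, dist[label]


def combiner_row(old_cat_dist, word_dist):
    """Build one input row for the combining learner from the two base outputs."""
    old_label, old_conf = predict_with_confidence(old_cat_dist)
    word_label, word_conf = predict_with_confidence(word_dist)
    return {
        "old_cat_pred": old_label, "old_cat_conf": old_conf,
        "word_pred": word_label, "word_conf": word_conf,
    }


# Toy (old category, new category) pairs; not taken from Hoovers or Reuters.
model = fit_old_to_new([("German", "Automobile"), ("German", "Automobile"),
                        ("German", "Textile"), ("French", "Textile")])
# Hypothetical word-based distribution standing in for the Naive Bayes output.
row = combiner_row(model["German"], {"Automobile": 0.55, "Textile": 0.45})
```

Rows like `row`, paired with the true new category, would form the training set for the combining learner; any classifier that accepts mixed categorical and numeric inputs could play that role.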

6 Current Results
Accuracy (%), five-fold cross-validation

          NB tr   NB te   DT tr   DT te   Comb tr   Comb te
28p255    ?       21.14   30.02   26.19   67.72     30.26
255p28    ?       ?       100
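For readers unfamiliar with the evaluation protocol, training and test accuracy under five-fold cross-validation can be computed along these lines. This is a generic sketch, not the code behind the numbers above: the fold assignment is contiguous and unshuffled, and the majority-label learner is only a placeholder for NB, DT, or the combination.

```python
from collections import Counter


def k_fold_indices(n, k=5):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(examples, labels, fit, predict, k=5):
    """Return mean training and test accuracy (in %) over k folds; needs n >= k."""
    train_acc, test_acc = [], []
    for fold in k_fold_indices(len(examples), k):
        held_out = set(fold)
        train = [i for i in range(len(examples)) if i not in held_out]
        model = fit([examples[i] for i in train], [labels[i] for i in train])
        for idx, scores in ((train, train_acc), (fold, test_acc)):
            hits = sum(predict(model, examples[i]) == labels[i] for i in idx)
            scores.append(100.0 * hits / len(idx))
    return sum(train_acc) / k, sum(test_acc) / k


# Placeholder learner: always predicts the most frequent training label.
def fit_majority(xs, ys):
    return Counter(ys).most_common(1)[0][0]


def predict_majority(model, x):
    return model


labels = ["a"] * 8 + ["b"] * 2
tr, te = cross_validate(list(range(10)), labels, fit_majority, predict_majority)
```

In practice the examples would be shuffled (or stratified by category) before being split, since contiguous folds can leave a whole category out of a training set.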

7 Work in Progress
Naïve Bayes for 255 predict 28 (expect higher accuracies)
Use one classifier only (taking both kinds of features - words & old categories) – NB
An additional single simple classifier – KNN (and VNC-Light, if there is time in the end)
Run everything on Reuters 2001 (in addition to Hoovers)
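One way the single-classifier idea might look: feed the old category to Naïve Bayes as one extra pseudo-word next to the document's real words. The sketch below is a small multinomial Naïve Bayes with add-one smoothing written from scratch; it is not rainbow, and the `OLDCAT=` token prefix, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def featurize(words, old_category):
    """Treat the old category as one extra token alongside the document's words."""
    return list(words) + ["OLDCAT=" + old_category]


def fit_nb(docs, labels):
    """Collect per-class document and token counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
    vocab = {t for counts in token_counts.values() for t in counts}
    return class_counts, token_counts, vocab


def predict_nb(model, tokens):
    """Return the class with the highest log-posterior, using add-one smoothing."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        denom = sum(token_counts[label].values()) + len(vocab)
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in vocab:  # ignore tokens never seen in training
                score += math.log((token_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data; the words and categories are invented for illustration.
train = [
    (["engine", "sedan"], "German", "Automobile"),
    (["fabric", "wool"], "French", "Textile"),
    (["engine", "export"], "German", "Automobile"),
    (["fabric", "export"], "German", "Textile"),
]
docs = [featurize(w, old) for w, old, _ in train]
model = fit_nb(docs, [new for _, _, new in train])
prediction = predict_nb(model, featurize(["export"], "German"))
```

Here the word "export" alone is uninformative, so the old-category token decides the outcome. A single pseudo-word carries little weight in long documents, so a real system might need to repeat or reweight it.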

8 Comments?
http://www.cs.cmu.edu/~eneva/tax.htm
The end.

