Intelligent Database Systems Lab Presenter : Chang,Chun-Chih Authors : CHRISTOS BOURAS, VASSILIS TSOGKAS 2012, KBS A clustering technique for news articles.


1 Intelligent Database Systems Lab. A clustering technique for news articles using WordNet. Presenter: Chang, Chun-Chih. Authors: Christos Bouras, Vassilis Tsogkas (2012, KBS).

2 Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Motivation: Document clustering is a powerful and widely used technique. However, it still suffers from problems such as synonymy, ambiguity, and the lack of descriptive labels for the generated clusters.

4 Objectives: We propose enhancing the standard k-means algorithm with external knowledge from WordNet hypernyms. The proposed method significantly improves k-means and also generates useful, high-quality cluster labels.

5 Methodology - Framework

6 Methodology - Euclidean Distance & City-block Distance

7 Methodology - Pearson Distance

8 Methodology - Cosine Distance

9 Methodology - Spearman-rank Distance

10 Methodology - Kendall Distance

11 Methodology - Comparison of various methods: Euclidean, City-block, Cosine, Kendall, Spearman, Pearson

12 Methodology - heuristic function. Examples: for ‘fruit’, d=9, f=2, so W=0.9954; for ‘edible fruit’, d=7, f=1, so W=0.8915; for ‘food’, d=5, f=1, so W=0.6534.

13 Methodology - Enriching news articles using WordNet hypernyms
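The content of this slide was not captured in the transcript, so the sketch below is only one plausible reading of "enriching" an article: collect the hypernyms of each keyword, count how often each hypernym is reached, and add those counts to the article's bag of words. The dictionary `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet, and `enrich` is an illustrative name; neither the data nor the procedure is taken from the authors, and the weighting by the heuristic function W is deliberately left out because its formula is not given here.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: each keyword maps to some of its
# hypernyms. The entries and their order are illustrative only.
TOY_HYPERNYMS = {
    "apple": ["edible fruit", "fruit", "food"],
    "banana": ["edible fruit", "fruit", "food"],
    "bread": ["food"],
}


def enrich(keywords, hypernyms=TOY_HYPERNYMS):
    """Return (enriched bag of words, hypernym frequencies) for one article."""
    bag = Counter(keywords)
    hypernym_counts = Counter()
    for word, count in bag.items():
        # Every occurrence of a keyword contributes once to each hypernym.
        for hypernym in hypernyms.get(word, []):
            hypernym_counts[hypernym] += count
    # The enriched representation keeps the original keywords and adds
    # the hypernyms as extra terms.
    return bag + hypernym_counts, hypernym_counts


enriched, counts = enrich(["apple", "banana", "bread", "apple"])
print(counts.most_common())
```

Two articles that mention different fruits share no keywords but do share hypernyms such as ‘fruit’ and ‘food’ after enrichment, which is the kind of synonymy problem the Motivation slide points to.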

14 Methodology - Labeling clusters using WordNet hypernyms

15 Methodology - Clustering news articles using W-k means

16 Experiments

17 Experiments

18 Experiments [Figure: results with WordNet use versus without WordNet use; annotated values 0.0001, 0.0010, 1.000]

19 Experiments

20 Experiments

21 Experiments

22 Conclusions: Of the many similarity measures tested, k-means with Euclidean and cosine distance produced the best results. We have also presented a novel algorithmic approach for enhancing the k-means algorithm using knowledge from an external database, WordNet.

23 Comments. Advantages: the resulting labels have high precision. Applications: news clustering, cluster labeling.

