南台科技大學 資訊工程系 Automatic Website Summarization by Image Content: A Case Study with Logo and Trademark Images Evdoxios Baratis, Euripides G.M. Petrakis, Member,


1 南台科技大學 資訊工程系 (Southern Taiwan University of Science and Technology, Department of Computer Science and Information Engineering)
Automatic Website Summarization by Image Content: A Case Study with Logo and Trademark Images
Evdoxios Baratis, Euripides G.M. Petrakis, Member, IEEE, and Evangelos Milios, Senior Member, IEEE
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 20, NO. 9, SEPTEMBER 2008
Date: 2009/10/29
Speaker: Chin-Yen Yang

2 Outline
1. INTRODUCTION
2. IMAGE FEATURE EXTRACTION
3. PROPOSED METHOD
4. EXPERIMENTAL RESULTS
5. CONCLUSIONS

3 1. INTRODUCTION
• We introduce the concept of image-based summarization.
• A fully automated image-based summarization approach is proposed.
• The evaluation of the method on corporate Websites is presented.

4 1. INTRODUCTION (cont.)
• Logos and trademarks are important characteristic signs of corporate Websites.
• A recent contribution reports that logos and trademarks comprise 32.6 percent of the total number of images on the Web.

5 2. IMAGE FEATURE EXTRACTION
• Intensity histogram
• Radial histogram
• Angle histogram
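The slide only names the three histograms, so the following is a minimal sketch under stated assumptions rather than the authors' feature extractor. It assumes a grayscale image given as a list of rows of 0-255 values, treats pixels darker than a hypothetical `threshold` as foreground, and measures radius and angle relative to the foreground centroid. The function names, bin counts, and threshold are illustrative choices, not taken from the paper.

```python
import math


def normalize(counts):
    """Scale counts so they sum to 1; an all-zero list stays all zero."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def intensity_histogram(image, bins=8, max_value=255):
    """Distribution of gray levels over all pixels of the image."""
    counts = [0] * bins
    for row in image:
        for value in row:
            counts[min(value * bins // (max_value + 1), bins - 1)] += 1
    return normalize(counts)


def radial_and_angle_histograms(image, bins=8, threshold=128):
    """Distance and angle distributions of foreground pixels.

    Assumption: foreground pixels are those darker than `threshold`,
    and both quantities are measured from the foreground centroid.
    """
    points = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value < threshold]
    radial, angle = [0] * bins, [0] * bins
    if not points:
        return normalize(radial), normalize(angle)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    max_distance = max(distances) or 1.0  # avoid dividing by zero
    for (x, y), d in zip(points, distances):
        radial[min(int(d / max_distance * bins), bins - 1)] += 1
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        angle[min(int(theta / (2 * math.pi) * bins), bins - 1)] += 1
    return normalize(radial), normalize(angle)
```

Normalizing each histogram to sum to 1 is a design choice made here so that images of different sizes can be compared directly.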

6 2. IMAGE FEATURE EXTRACTION (cont.)
2.1 Image Representation

7 3. PROPOSED METHOD

8 3. PROPOSED METHOD (cont.)
3.1 Image Information Extraction
1. Link information
2. Text information
• This information is displayed together with the images or can be used for searching the Web.

9 3. PROPOSED METHOD (cont.)
3.2 Logo and Trademark Detection
• Training the decision tree using histogram features outperforms training using raw histograms.

10 3. PROPOSED METHOD (cont.)
Similarity detection
• Each pair of images is described by three attributes corresponding to three histogram intersections, and one attribute corresponding to the Euclidean distance between their vectors of moment invariants.
• The decision tree was pruned with a confidence value of 0.1 and achieved a 93.89 percent average classification accuracy.
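The four attributes described on this slide can be sketched as follows. This is an illustration under assumptions, not the authors' code: each image is assumed to be a dictionary holding three normalized histograms and a vector of moment invariants, and the key names (`intensity`, `radial`, `angle`, `moments`) are hypothetical. The decision tree that consumes these attributes is not shown.

```python
import math


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def euclidean_distance(v1, v2):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))


def similarity_attributes(image_a, image_b):
    """Four attributes for one image pair: three intersections, one distance."""
    return [
        histogram_intersection(image_a["intensity"], image_b["intensity"]),
        histogram_intersection(image_a["radial"], image_b["radial"]),
        histogram_intersection(image_a["angle"], image_b["angle"]),
        euclidean_distance(image_a["moments"], image_b["moments"]),
    ]
```

For two identical images the first three attributes reach their maximum and the distance is zero, which is the pattern a classifier would learn to associate with similar pairs.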

11 3. PROPOSED METHOD (cont.)
Image clustering
3.3 Duplicate Logo and Trademark Detection
• From each cluster, one image is selected to represent the cluster in the summary.

12 3. PROPOSED METHOD (cont.)
3.4 Logo and Trademark Ranking
• Probability
• Instances
• Depth
Image Importance = Probability * Depth * Instances
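The ranking formula can be sketched directly. The slide does not say how the probability, depth, and instance factors are computed, so this sketch assumes they arrive as ready-made numbers on each image record; the dictionary keys and function names are hypothetical.

```python
def image_importance(probability, depth, instances):
    """Image Importance = Probability * Depth * Instances, as on the slide."""
    return probability * depth * instances


def rank_images(images):
    """Return the image records sorted from most to least important."""
    return sorted(
        images,
        key=lambda img: image_importance(
            img["probability"], img["depth"], img["instances"]
        ),
        reverse=True,
    )
```

Because the factors are multiplied, an image scoring zero on any one of them gets zero importance regardless of the other two.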

13 3. PROPOSED METHOD (cont.)
3.5 Image-Based Summarization
Cluster Importance = Image Importance
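One way to read this step is that each cluster inherits the importance of its most important image, and the summary lists one representative per cluster in decreasing importance. That reading is an assumption, as are the `summarize` name, the `(image_id, importance)` pair layout, and the optional `max_images` cut-off in the sketch below.

```python
def summarize(clusters, max_images=None):
    """Pick the most important image of each cluster and order the picks.

    `clusters` is a list of clusters, each a list of (image_id, importance)
    pairs. Empty clusters are skipped.
    """
    representatives = [
        max(cluster, key=lambda item: item[1])
        for cluster in clusters
        if cluster
    ]
    # Assumption: a cluster is as important as its representative image.
    representatives.sort(key=lambda item: item[1], reverse=True)
    image_ids = [image_id for image_id, _ in representatives]
    return image_ids if max_images is None else image_ids[:max_images]
```

Keeping a single image per cluster is what prevents duplicate logos from appearing more than once in the summary.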

14 4. EXPERIMENTAL RESULTS

15 4. EXPERIMENTAL RESULTS (cont.)

16 5. CONCLUSIONS
• Websites are summarized by first extracting images with a high probability of being logos or trademarks,
• then clustering similar images together and ranking the images in each cluster by importance.
• The most important image from each cluster is included in the summary.

17 5. CONCLUSIONS (cont.)
• 76 percent detection accuracy
• 85 percent classification accuracy
• 64 percent summarization accuracy
• Future work includes experimentation with larger training data sets and more image types to improve the performance of the machine learning.

18 南台科技大學 資訊工程系

