Intelligent Database Systems Lab. Presenter: YU-TING LU. Authors: Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee. 2011, IPM. Multilingual document mining and navigation using self-organizing maps.

1 Intelligent Database Systems Lab. Presenter: YU-TING LU. Authors: Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee. 2011, IPM. Multilingual document mining and navigation using self-organizing maps

2 Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Motivation: Web directories are generally constructed manually and may suffer from narrow coverage and inconsistency. Most existing directories provide only monolingual hierarchies that organize Web pages in terms a user may not be familiar with.

5 Data mining (資料探勘)

6 Objectives: This work proposes an approach that automatically arranges multilingual Web pages into a multilingual Web directory, breaking the language barrier in Web navigation.

7 Methodology

8 Methodology – Web directory generation. Web page preprocessing and encoding – English: word segmentation, stop-word elimination, stemming, keyword selection – Chinese: select only nouns as keywords
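The English preprocessing steps listed on this slide can be sketched roughly as below. This is only an illustration, not the authors' implementation: the stop-word list, the toy suffix-stripping `stem` function (a stand-in for a real stemmer such as Porter's), the document-frequency criterion in `select_keywords`, and the binary encoding in `encode` are all assumptions. The Chinese branch (keeping only nouns) needs a part-of-speech tagger and is not shown.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on"}

def segment(text):
    """Word segmentation: lowercase the text and keep runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

def stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def extract_terms(text):
    """Segment, drop stop words, then stem what is left."""
    return [stem(w) for w in segment(text) if w not in STOP_WORDS]

def select_keywords(docs_terms, min_df=1):
    """Keep terms that occur in at least min_df documents (assumed criterion)."""
    df = Counter(t for terms in docs_terms for t in set(terms))
    return sorted(t for t, n in df.items() if n >= min_df)

def encode(terms, vocabulary):
    """Binary vector: 1 if the keyword occurs in the page, else 0 (assumed encoding)."""
    present = set(terms)
    return [1 if t in present else 0 for t in vocabulary]

pages = ["Mining the Web pages", "Web directories are mined"]
terms = [extract_terms(p) for p in pages]
vocab = select_keywords(terms)
vectors = [encode(t, vocab) for t in terms]
```

The resulting `vectors` are the kind of fixed-length input a self-organizing map can be trained on.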

9 Methodology – Web directory generation: Feature map generation
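Feature map generation refers to training a self-organizing map (SOM) on the encoded Web pages. The slide gives no training details, so the following is a minimal sketch of the standard SOM update in NumPy. The function names, grid size, linearly decaying learning rate, and Gaussian neighborhood are assumptions for illustration rather than the settings used in the paper.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns a (rows*cols, dim) weight matrix."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_samples, dim = data.shape
    weights = rng.random((rows * cols, dim))  # random initial weights in [0, 1)
    # Grid coordinates of each neuron, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # assumed linear decay
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        for i in rng.permutation(n_samples):
            x = data[i]
            # Best-matching unit: the neuron whose weights are closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the BMU on the 2-D grid.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            # Pull every neuron toward x, scaled by its neighborhood weight.
            weights += lr * h[:, None] * (x - weights)
    return weights

def label_documents(data, weights):
    """Assign each document to the index of its best-matching neuron."""
    data = np.asarray(data, dtype=float)
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dists.argmin(axis=1)

docs = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
som = train_som(docs, rows=2, cols=2)
clusters = label_documents(docs, som)
```

Documents that land on the same neuron form one cluster; identical vectors, such as the first two rows of `docs`, always share a neuron.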

10 Methodology – Web directory generation. Web directory generation – Super cluster construction – Determining dominating clusters – Constructing hierarchy – Parameter setting and discussions

11 Methodology – Web directory generation: Evaluation of the quality of generated hierarchies

12 Methodology – Multilingual Web directory generation. Alignment of monolingual Web directories – Calculating semantic similarity – Incorporating structural similarity – Overall similarity
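The slide names three ingredients (semantic similarity, structural similarity, overall similarity) without giving formulas. One common way to combine two such scores is a weighted sum, sketched below. The weight `alpha`, the linear combination itself, and the best-match rule in `align` are assumptions for illustration and are not taken from the paper; the category labels are hypothetical.

```python
def overall_similarity(semantic, structural, alpha=0.5):
    """Weighted sum of two scores in [0, 1]; alpha is an assumed mixing weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * semantic + (1.0 - alpha) * structural

def align(semantic, structural, alpha=0.5):
    """For each source category, pick the target category with the highest overall score.

    semantic and structural map (source, target) pairs to precomputed scores.
    A pair missing from structural is treated as having structural score 0.
    """
    best = {}
    for (src, tgt), sem in semantic.items():
        score = overall_similarity(sem, structural.get((src, tgt), 0.0), alpha)
        if src not in best or score > best[src][1]:
            best[src] = (tgt, score)
    return best

# Hypothetical scores between one English category and two Chinese categories.
semantic = {("en:sports", "zh:A"): 0.5, ("en:sports", "zh:B"): 0.75}
structural = {("en:sports", "zh:A"): 1.0, ("en:sports", "zh:B"): 0.25}
alignment = align(semantic, structural)
```

In this toy case `zh:B` has the higher semantic score, but the structural term tips the alignment toward `zh:A`, which is the effect that incorporating structural similarity is meant to have.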

13 Methodology – Multilingual Web directory generation: Alignment of monolingual Web directories

14 Methodology – Multilingual Web directory generation: Multilingual Web directory generation

15 Experiments - SOM training

16 Experiments - SOM training

17 Experiments - Hierarchy generation

18 Experiments - Hierarchy generation

19 Experiments - Hierarchy generation

20 Experiments - Hierarchy alignment and Web directory generation

21 Conclusions: The multilingual hierarchy alignment method is fully automated and requires no human intervention. A Web directory that provides multilingual category labels and categorizes multilingual Web pages will be convenient for users.

22 Comments. Advantages: a multilingual hierarchy alignment method; fully automated. Applications: SOM

