Intelligent Database Systems Lab
Multilingual document mining and navigation using self-organizing maps
Authors: Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee
IPM
Presenter: YU-TING LU
Outline
- Motivation
- Objectives
- Methodology
- Experiments
- Conclusions
- Comments
Motivation
- Web directories are generally constructed manually, which can lead to narrow coverage and inconsistency.
- Most existing directories provide only monolingual hierarchies that organize Web pages under terms a user may not be familiar with.
資料探勘 (Chinese for "Data mining")
Objectives
- Propose an approach that automatically arranges multilingual Web pages into a multilingual Web directory, breaking the language barrier in Web navigation.
Methodology
Methodology - Web directory generation
Web page preprocessing and encoding
- English: word segmentation, stop-word elimination, stemming, keyword selection
- Chinese: select only nouns as keywords
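The English preprocessing steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the crude suffix-stripping stemmer (a stand-in for a real stemmer such as Porter's), the document-frequency threshold `min_pages`, and the binary encoding are all assumptions made for the example, and every function name is hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "with"}


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text):
    """Segment one English page into words, drop stop words, and stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def build_vocabulary(pages, min_pages=1):
    """Keyword selection: keep stems that occur in at least min_pages pages."""
    doc_freq = Counter()
    for page in pages:
        doc_freq.update(set(extract_keywords(page)))
    return sorted(term for term, n in doc_freq.items() if n >= min_pages)


def encode(text, vocabulary):
    """Binary vector: 1 if the keyword occurs in the page, else 0."""
    present = set(extract_keywords(text))
    return [1 if term in present else 0 for term in vocabulary]


# Toy pages, invented for this example.
pages = ["Clustering Web documents", "Maps of document clusters"]
vocab = build_vocabulary(pages)
vectors = [encode(p, vocab) for p in pages]
```

Each page ends up as a fixed-length vector over the selected keywords, which is the kind of input a self-organizing map expects.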
Methodology - Web directory generation
Feature map generation
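As a rough picture of feature map generation, the sketch below trains a small self-organizing map on encoded page vectors and then assigns each page to its winning neuron. It is a generic textbook SOM under stated assumptions: the map size, the linearly decaying learning rate and radius, the Gaussian neighborhood, and all names (`train_som`, `best_matching_unit`) are illustrative choices, not parameters or code from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the neuron whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - xi) ** 2 for w, xi in zip(weights[node], x)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighborhood shrinks over time
        for x in rng.sample(vectors, len(vectors)):  # shuffled pass over the pages
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * radius * radius))
                for i in range(dim):
                    # Pull the neuron toward the input, scaled by grid distance.
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy binary page vectors, invented for this example.
vectors = [[1, 1, 0, 1], [1, 1, 1, 0]]
weights = train_som(vectors, rows=2, cols=2)
page_to_neuron = [best_matching_unit(weights, v) for v in vectors]
```

Pages that land on the same or neighboring neurons form the clusters that later steps group into a hierarchy.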
Methodology - Web directory generation
Web directory generation
- Super cluster construction
- Determining dominating clusters
- Constructing hierarchy
- Parameter setting and discussions
Methodology - Web directory generation
Evaluation of the quality of generated hierarchies
Methodology - Multilingual Web directory generation
Alignment of monolingual Web directories
- Calculating semantic similarity
- Incorporating structural similarity
- Overall similarity
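One way to picture the alignment step is sketched below. The slides do not give the formula for the overall similarity, so the weighted sum with a mixing weight `alpha` is an assumption, as are the greedy best-match rule, the function names, and the toy category labels and scores.

```python
def overall_similarity(semantic, structural, alpha=0.5):
    """Weighted sum of the two scores; alpha is a hypothetical mixing weight."""
    return alpha * semantic + (1.0 - alpha) * structural


def align(source_nodes, target_nodes, semantic, structural, alpha=0.5):
    """Map each source category to the target category with the highest score."""
    alignment = {}
    for s in source_nodes:
        alignment[s] = max(
            target_nodes,
            key=lambda t: overall_similarity(
                semantic[(s, t)], structural[(s, t)], alpha
            ),
        )
    return alignment


# Toy categories and similarity scores, invented for this example.
english = ["Sports", "Finance"]
chinese = ["財經", "體育"]
semantic = {
    ("Sports", "體育"): 0.9, ("Sports", "財經"): 0.2,
    ("Finance", "體育"): 0.1, ("Finance", "財經"): 0.8,
}
structural = {
    ("Sports", "體育"): 0.7, ("Sports", "財經"): 0.4,
    ("Finance", "體育"): 0.3, ("Finance", "財經"): 0.6,
}
alignment = align(english, chinese, semantic, structural)
```

With these toy scores, "Sports" aligns with 體育 and "Finance" with 財經; shifting `alpha` toward 0 or 1 would let structure or semantics dominate the match.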
Methodology - Multilingual Web directory generation
Multilingual Web directory generation
Experiments - SOM training
Experiments - Hierarchy generation
Experiments - Hierarchy alignment and Web directory generation
Conclusions
- The multilingual hierarchy alignment method is fully automated and requires no human intervention.
- A Web directory that provides multilingual category labels and categorizes multilingual Web pages makes navigation more convenient for users.
Comments
Advantages
- Development of a multilingual hierarchy alignment method
- Fully automated
Applications
- SOM