I-Shou University, Department of Information Engineering (義守大學資訊工程學系). Authors: 郭東黌, 張佑康. Presenter: 徐碩利. Date: 2006/11/01


Presentation transcript:

1 Implementation and Analysis of an FTP-specific Full-text Search System
2018/2/15
I-Shou University, Department of Information Engineering (義守大學資訊工程學系)
Authors: 郭東黌, 張佑康
Presenter: 徐碩利
Date: 2006/11/01

2 Outline
- Introduction
- Proposed architecture
- Experimental results and analysis
- Conclusion

3 Introduction
- FTP sites with a large number of files
- Filename search
- Full-text search

4 Problem
- Indexing a large amount of data
- Improving search performance
- Refining the ranking of results

5 Purpose
- FTP-specific full-text search system
- Performance benchmark
- Precision evaluation

6 Current systems
- Google
  - Architecture proposed in 1998
  - Large-scale web search engine
  - Inverted index scheme
- Gais
- FTPLocate
- SmartArchie
- An intelligent file search engine
- ProxyLog-based file search engine
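
An inverted index maps each term to the set of documents that contain it, so a query only has to read the posting lists of its own terms. The following is a minimal Python sketch of that idea, not the implementation of any system listed above; the whitespace tokenization and the file names are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # .get() avoids creating empty posting lists for unknown terms.
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings)


# Hypothetical file descriptions, for illustration only.
docs = {
    "a.tar.gz": "linux kernel source",
    "b.iso": "gentoo linux install image",
    "c.txt": "rfc index",
}
index = build_inverted_index(docs)
```

Calling `search_all(index, "gentoo linux")` intersects the two posting lists and leaves only the files that mention both terms.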

7 Proposed architecture
- Data indexing (data crawler)
  - File list retrieval and analysis
  - File description extraction from FTP
  - File name and description indexing
- Query processing
  - Query pre-parsing
  - Data searching
  - Result sorting

8 (no slide text in the transcript)

9 (no slide text in the transcript)

10 Data indexing (data crawler)
- File list retrieval and analysis
  - File list retrieval in many ways
    - FTP: LIST
    - Rsync
    - Unix: find, ls
- File description extraction
  - Specific formats of file descriptions
    - XML Package Metadata
    - RFC-index
    - 00_index
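
File list retrieval over FTP LIST can be sketched with Python's standard `ftplib`. This is an illustrative sketch, not the crawler from the slides: the LIST output format is not standardized, so the parser assumes the common Unix `ls -l` layout, and `fetch_listing` assumes an anonymous login. Both function names are hypothetical.

```python
from ftplib import FTP


def parse_list_line(line):
    """Parse one Unix-style LIST line into (is_dir, size, name), or None."""
    # Fields: perms, links, owner, group, size, month, day, time/year, name.
    parts = line.split(None, 8)
    if len(parts) < 9:
        return None  # not a file entry, e.g. a "total 12" line
    perms, size, name = parts[0], parts[4], parts[8]
    if not size.isdigit():
        return None
    if perms.startswith("l") and " -> " in name:
        name = name.split(" -> ", 1)[0]  # drop the symlink target
    return perms.startswith("d"), int(size), name


def fetch_listing(host, path="/"):
    """Retrieve and parse one directory listing (anonymous login assumed)."""
    entries = []
    with FTP(host) as ftp:
        ftp.login()
        ftp.cwd(path)
        ftp.retrlines("LIST", lambda l: entries.append(parse_list_line(l)))
    return [e for e in entries if e is not None]
```

Limiting the split to eight separators keeps file names that contain spaces in one piece.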

11 Data indexing (data crawler)
- File name and description indexing
  - Simple string segmentation method
    - Separating runs of Chinese text into individual characters
    - Separating English words at spaces or symbols
  - Full-text indexing
    - Based on a modified version of MySQL's full-text parser
    - Multi-language full-text index support
    - Stopwords
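
The simple segmentation method described above (one token per Chinese character, English split at spaces or symbols) can be sketched in a few lines of Python. This sketch is independent of the modified MySQL parser mentioned on the slide; the character range covers only the basic CJK block, and the stopword list is a hypothetical placeholder.

```python
import re

# One CJK ideograph (basic block only), or a run of ASCII letters/digits.
# Anything else (spaces, punctuation, symbols) acts as a separator.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")

# Hypothetical stopword list; the slides do not give the real one.
STOPWORDS = {"the", "of", "and"}


def segment(text):
    """Split text into lowercase index terms, dropping stopwords."""
    terms = (t.lower() for t in _TOKEN.findall(text))
    return [t for t in terms if t not in STOPWORDS]
```

For example, `segment("全文檢索 full-text_search.tar.gz")` yields four single-character Chinese terms followed by `full`, `text`, `search`, `tar`, and `gz`.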

12 Query processing
- Query pre-parsing
  - Chinese segmentation with exhaustive search using the Libtabe dictionary
- Term frequency clustering
  - Using k-means algorithm
- Score computation
  - Group A is required, Group B is optional
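
Exhaustive dictionary-based segmentation enumerates every way to split a query into known words. The Python sketch below illustrates the idea under stated assumptions: it does not call Libtabe, the three-word `DICTIONARY` is a hypothetical stand-in for its word list, and single characters are always accepted as a fallback. How the system picks among the candidate segmentations is not stated on the slide and is not modeled here.

```python
def all_segmentations(text, dictionary):
    """Enumerate every way to split text into dictionary words.

    Single characters are always allowed, so every input has at
    least one segmentation. The number of results can grow
    exponentially with the length of the text.
    """
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if end == 1 or word in dictionary:
            for rest in all_segmentations(text[end:], dictionary):
                results.append([word] + rest)
    return results


# Hypothetical stand-in for a Libtabe-style word list.
DICTIONARY = {"全文", "檢索", "全文檢索"}
candidates = all_segmentations("全文檢索", DICTIONARY)
```

With this dictionary the query has five candidate segmentations, ranging from four single characters to the whole string as one word.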

13 k-means clustering
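
As an illustration of clustering query terms by frequency, here is a minimal one-dimensional k-means (Lloyd's algorithm) in Python. It is a sketch under assumptions: the term frequencies are made up, the centroids start evenly spaced between the minimum and maximum, and the slides do not say which cluster becomes Group A or Group B, so the clusters are only labeled by frequency level.

```python
def kmeans_1d(values, k=2, iterations=100):
    """Cluster numbers with Lloyd's algorithm (requires k >= 2).

    Returns (labels, centroids); labels[i] is the cluster of values[i].
    """
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly from min to max.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: nearest centroid for each value.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


# Hypothetical term frequencies for a four-term query.
freqs = {"linux": 120000, "iso": 95000, "gentoo": 800, "stage3": 300}
labels, centroids = kmeans_1d(list(freqs.values()))
low_freq = [t for t, l in zip(freqs, labels) if l == 0]
high_freq = [t for t, l in zip(freqs, labels) if l == 1]
```

Because the centroids start in ascending order, cluster 0 holds the low-frequency terms and cluster 1 the high-frequency ones in this example.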

14 Query results cache method
- Cache method
  - Query segmentation and clustering
  - MD5
- Benefits
  - Reducing search time
  - Avoiding duplicated search
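
One way to read the cache method is that the processed query is hashed with MD5 and the digest is used as the cache key. The Python sketch below follows that reading. The `QueryCache` class, the in-memory dictionary, and the sort-and-lowercase normalization are assumptions for illustration, not details taken from the slides.

```python
import hashlib


class QueryCache:
    """In-memory result cache keyed by the MD5 of the normalized query terms."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(terms):
        # Assumption: sort and lowercase so equivalent queries share one key.
        normalized = " ".join(sorted(t.lower() for t in terms))
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    def get_or_compute(self, terms, search):
        """Return cached results, running search(terms) only on a miss."""
        k = self.key(terms)
        if k not in self._store:
            self._store[k] = search(terms)
        return self._store[k]
```

A repeated query, or the same terms in a different order, then skips the search entirely, which matches the stated benefits of reducing search time and avoiding duplicated searches.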

15 Experimental results and analysis
- Platform
  - CPU: Dual P4 2.6 GHz
  - RAM: 8 GBytes
  - OS: Gentoo Linux (Linux )
  - Storage: GB used
- Statistics
  - Files: 5,414,639
  - Segmented terms: 219,167,352
  - Terms in the dictionary: 236,275

16 Frequency distribution

17 Average search time

18 Search result (1)

19 Search result (2)

20 Performance evaluation
- Search and download times
- Hit rate

21 Search and download times

22 Hit rate

23 Conclusion
- We implemented the method on the I-Shou University FTP server.
- Full-text search on the FTP server gives better results than traditional search.
- The average hit rate is greater than 0.6.
Thank you for your attention!

