Min Li; Jun Zhao; Tinglei Huang Intelligent Computing and Integrated Systems (ICISS), 2010 International Conference Publication Year: 2010, Page(s): 790.

1 Min Li; Jun Zhao; Tinglei Huang. Intelligent Computing and Integrated Systems (ICISS), 2010 International Conference. Publication Year: 2010, Page(s): 790-792. Speaker: Chang, Kun-Hsiang

2
• Abstract
• MODEL AND FRAMEWORK DESIGN OF THE CRAWLER SYSTEM
  ◦ Workflow diagram of a vertical search engine
  ◦ Main business logic in the crawler system
  ◦ Main design patterns in the crawler system
  ◦ Projects and their dependency diagram of the crawler system

3 The crawler system in a vertical search engine should first format a representative sample web page so as to make sure that the page meets the W3C standards. The processed page can then be parsed by the visual XPath generator, which is used to find the desired XPath value.
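The format-then-query step described on this slide can be sketched roughly as follows. This is only an illustration, not the crawler system from the paper: it uses Python's standard `html.parser` as a stand-in for a real HTML tidying tool, and `WellFormedBuilder`, `to_well_formed`, `VOID_TAGS`, and `SAMPLE_HTML` are hypothetical names introduced here. Note that `xml.etree.ElementTree` supports only a subset of XPath.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Assumption: a small set of HTML elements that never have a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class WellFormedBuilder(HTMLParser):
    """Builds a well-formed element tree from sloppy HTML (illustrative only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of currently open element names

    def handle_starttag(self, tag, attrs):
        attrib = {k: (v if v is not None else "") for k, v in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self.builder.end(tag)  # void elements are closed immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # ignore a stray closing tag
        # Close any unclosed children first, then the element itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.builder.end(top)
            if top == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text outside the root element
            self.builder.data(data)

    def finish(self):
        self.close()  # flush any buffered text
        while self.open_tags:  # close whatever is still open
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def to_well_formed(html):
    """Return the root element of a well-formed tree built from `html`."""
    parser = WellFormedBuilder()
    parser.feed(html)
    return parser.finish()


# A sample page with unclosed <li>, <b> and a bare <br>.
SAMPLE_HTML = (
    '<html><body><ul><li class="item">Apple<li class="item">Pear</ul>'
    "<p>Price:<br><b>42</p></body></html>"
)

if __name__ == "__main__":
    root = to_well_formed(SAMPLE_HTML)
    # Once the page is well-formed, XPath-style queries can be evaluated.
    print([li.text for li in root.findall(".//li[@class='item']")])
    print(root.find(".//p/b").text)
```

The sketch closes unclosed tags rather than inferring implied end tags, so the second `<li>` ends up nested inside the first; a production tidy step would follow the HTML parsing rules more closely.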

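4
• Vertical search engine
• A vertical search engine is a specialized search engine for a particular industry. It is a subdivision and extension of the general search engine: it consolidates one category of specialized information from the web page repository, extracts the required data field by field, processes it, and returns it to the user in some form.
• XPath - XML Path Language
• The path language for XML
• http://studiesweb.wikidot.com/xml:xpath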

6 The task configuration parameters are divided into 4 parts:
• task basic attributes
• path configuration
• retrieving rules
• schedule configuration
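One way to picture the four-part task configuration is as a nested record. The sketch below is an assumption-laden illustration: only the four part names come from the slide, while every class name, field name, and default value (`seed_urls`, `max_depth`, `interval_minutes`, and so on) is hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicAttributes:
    """Task basic attributes (fields are hypothetical)."""
    name: str
    description: str = ""


@dataclass
class PathConfig:
    """Path configuration (fields are hypothetical)."""
    seed_urls: List[str]
    max_depth: int = 1


@dataclass
class RetrievingRule:
    """One retrieving rule: a field name and the XPath that extracts it."""
    field_name: str
    xpath: str


@dataclass
class ScheduleConfig:
    """Schedule configuration (fields are hypothetical)."""
    interval_minutes: int = 60


@dataclass
class CrawlTask:
    """A crawl task made of the four parts named on the slide."""
    basic: BasicAttributes
    paths: PathConfig
    rules: List[RetrievingRule] = field(default_factory=list)
    schedule: ScheduleConfig = field(default_factory=ScheduleConfig)


# Example task; the URL and XPath are placeholders.
example_task = CrawlTask(
    basic=BasicAttributes(name="demo-task"),
    paths=PathConfig(seed_urls=["http://example.com/list"]),
    rules=[RetrievingRule(field_name="title", xpath=".//h1")],
)
```

Grouping the parameters this way keeps the retrieving rules, which would hold the XPath values found by the generator, separate from scheduling and path settings.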

9 Speaker: Chang, Kun-Hsiang

