
1 Supporting Ranked Search in Parallel Search Cluster Networks
Fang Xiong, Qiong Luo, Dyce Jing Zhao
{xfang, luo, zhaojing}@cs.ust.hk
Hong Kong University of Science and Technology

2 Introduction
Environment: P2P
– Unstructured, super-peer, Parallel Search Cluster Network (PSCN)
Task: search
– Data object ID
– Filename
– Content: ranked keyword search
Previous work on ranked search in P2P
– PlanetP: in the unstructured P2P network
– Shen et al.: in the super-peer network

3 [Figure legend]
FSL: Forwarding Search Link
NIL: Non-forwarding Index Link
FIL: Forwarding Index Link
Both are unstructured P2P networks

4 The Process of Ranked Search in a PSCN
Indexing time
– Build the local indexes
– Transmit the local indexes across clusters through NILs
Querying time
– Forward the query within a cluster through FSLs
– Collect the Local Aggregate Information (LAI), and merge it into the Global Aggregate Information (GAI)
– The document-level index vs. the peer-level index
    With the document-level index, additional steps include: calculating the Local Ranking (LR), and merging into the Global Ranking (GR)
    With the peer-level index, additional steps include: calculating the Local Peer Ranking (LPR), merging into the Global Peer Ranking (GPR), calculating the Local Document Ranking (LDR), and merging into the Global Document Ranking (GDR)
– Merge the locally ranked query results into globally ranked ones and return all or the top-K of them to the user
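The query-time steps for the document-level index can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the slides do not say what the LAI contains or which scoring function is used, so the sketch assumes each LAI is a (document count, per-term document frequency) pair and uses a simple TF-IDF-style score. All function names (local_aggregate, merge_lai, local_ranking, global_ranking) are hypothetical.

```python
import heapq
import math
from collections import Counter
from itertools import islice


def local_aggregate(docs):
    """Build one peer's LAI (assumed form): (number of docs, document frequency per term)."""
    doc_freq = Counter(term for term_counts in docs.values() for term in term_counts)
    return len(docs), doc_freq


def merge_lai(lais):
    """Merge the per-peer LAIs into the GAI by summing counts (assumed merge rule)."""
    total_docs = 0
    doc_freq = Counter()
    for n_docs, df in lais:
        total_docs += n_docs
        doc_freq.update(df)  # Counter.update adds counts rather than replacing them
    return total_docs, doc_freq


def local_ranking(docs, query, gai):
    """Compute one peer's LR: its documents scored with global statistics, best first."""
    total_docs, doc_freq = gai
    ranking = []
    for doc_id, term_counts in docs.items():
        # Assumed TF-IDF-style score; the slides do not name the ranking function.
        score = sum(
            term_counts.get(term, 0) * math.log(1 + total_docs / doc_freq[term])
            for term in query
            if doc_freq[term] > 0
        )
        if score > 0:
            ranking.append((score, doc_id))
    ranking.sort(reverse=True)
    return ranking


def global_ranking(local_rankings, k):
    """Merge the already-sorted LRs into the GR and keep only the top-K entries."""
    merged = heapq.merge(*local_rankings, reverse=True)
    return list(islice(merged, k))
```

Each peer would call local_aggregate on its own documents, the results would be combined with merge_lai, every peer would then rank locally with the shared GAI, and global_ranking would produce the top-K list returned to the user. Because each local ranking is already sorted, the final merge only needs to read as far as the K-th result.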

5 [Figure] Average processing time spent on each step, using the document-level index
The majority of processing time is spent on local processing
This suggests that it is necessary to distribute the search workload evenly over multiple peers
[Figure] Average processing time spent on each step, using the peer-level index

6 [Figure] Average processing time in three overlays, using the document-level index
The processing time in the unstructured network is much larger than in the other two
The processing time in the super-peer network is about 30% larger than in the PSCN
The processing time is slightly higher with the document-level index than with the peer-level index
[Figure] Average processing time in three overlays, using the peer-level index

7 [Figure annotations: 22~31% less; 41~47% less; 22~25% lower; 4.5~7.7 times higher]

8 Summary
– The majority of processing time is spent on local processing. Therefore, it is beneficial to distribute the search workload over peers; otherwise, the bottleneck will be at the super-peers in a super-peer network or at the querying peer in an unstructured network.
– The processing time and the storage cost per peer in a PSCN are the lowest among the three overlays.
– The downside of a PSCN is the flooding communication within a cluster and the index replication cost across clusters. The super-peer network wins on network bandwidth usage and total storage cost.
– Compared with document-level indexes, peer-level indexes save 70% of the processing time, 30% of the network bandwidth usage, and 30% of the storage space, with a slight decrease in precision.

