1 Hypersearching the Web
  Soumen Chakrabarti, Byron Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins
  Jacob Kalakal Joseph
  CS 572 (Spring 2011) | Class Presentation | June 21, 2011

2 Outline
  - Characteristics of the WWW
  - Motivation for building search engines
  - Traditional SEs and the challenges
  - Improvements and the associated problems
  - CLEVER
  - Power of hyperlinks
  - Hubs and Authorities Algorithm
  - Evaluate CLEVER
  - Future scope
  - Answer questions and class discussion
  CS572-Joseph 2 June 21, 2011

3 WWW ~ Universe

4 Motivation for search engines

5 Initial Attempts
  Ranking functions based on simple heuristics

6 Challenges: Synonymy

7 Challenges: Polysemy

8 Challenges: Spamming
  Cheap airtickets Cheap airtickets Cheap airtickets Cheap airtickets Cheap airtickets
  (white font on white background)
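A minimal sketch of why this kind of spam works against a simple ranking heuristic. The slides do not say which heuristics early engines used, so plain term counting is an assumption here, and `term_count_score` and the two sample pages are hypothetical names made up for illustration.

```python
def term_count_score(query, page_text):
    """Naive heuristic (an assumption, not from the slides):
    count how often each query term occurs in the page text."""
    words = page_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


# A page that mentions the query terms once, in context.
honest_page = "Compare cheap airtickets from major airlines and book online"

# A page that repeats the query terms, as on the slide. The hidden
# (white-on-white) text is invisible to readers but not to the scorer.
stuffed_page = "Cheap airtickets " * 5 + "unrelated content"

query = "cheap airtickets"
honest_score = term_count_score(query, honest_page)
stuffed_score = term_count_score(query, stuffed_page)
```

Under this heuristic the stuffed page outranks the honest one, even though a human visitor never sees the repeated text.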

9 Improvements
  - Semantic networks: help with synonymy but worsen polysemy
  - Human selectors: impractical

10 Hyperlinks - What a CLEVER idea!

11 Hubs & Authorities

12 How it works

13 Clever vs. Google
  - Google is faster!
  - Clever also looks back

14 Pros
  - Rapid convergence (5 iterations for a root set of 3000 pages)
  - Result is independent of the initial hub (H) and authority (A) scores
  - Get information even before we actually crawl
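The first two points can be illustrated with a small sketch of the hub/authority iteration, in the textbook form described in Kleinberg's paper listed in the references. This is not claimed to be CLEVER's actual implementation. The names `hits`, `normalize`, and `init_hub`, and the six-page toy graph, are assumptions made up for this example.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=5, init_hub=None):
    """links maps each page to the list of pages it links to.
    init_hub optionally overrides the starting hub score of a page."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    init_hub = init_hub or {}
    hub = {p: float(init_hub.get(p, 1.0)) for p in pages}
    auth = {p: 0.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        auth = normalize(auth)
        # Hub score: sum of authority scores of the pages linked to.
        hub = normalize(
            {p: sum(auth[d] for d in links.get(p, ())) for p in pages}
        )
    return hub, auth


# Toy graph (hypothetical): three hub-like pages pointing at a, b, c.
links = {
    "hub1": ["a", "b"],
    "hub2": ["a", "b", "c"],
    "hub3": ["a"],
}
hub, auth = hits(links)
```

In this toy graph, page `a` receives the most in-links and ends up with the highest authority score, while `hub2`, which points to all three, ends up as the strongest hub. Starting from different positive hub scores via `init_hub` leads to the same normalized result after enough iterations, which is the sense in which the outcome does not depend on the initial scores.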

15 Segregation of web into clusters

16 Cons
  - The underlying assumption (Web links confer authority) could be incorrect! Links may exist for:
    - Navigation
    - Advertisement
    - Disapproval

17 Cons
  - Ignores the anchor text
  - Not every page needs to be either a hub or an authority
  - Universally popular websites like Wikipedia will be an authority on almost everything
  - May return a general result for a narrow-topic search

18 What's next?

19 References
  - S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Hypersearching the Web. Scientific American, June 1999.
  - CLEVER project (http://www.almaden.ibm.com/projects/clever.shtml)
  - J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
  - S. Brin, L. Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, Vol. 30, No. 1-7, pp , 1998.
  - WordNet Project (http://wordnet.princeton.edu/)

20 Group Discussion

