CSE 450 – Web Mining Seminar Professor Brian D. Davison Fall 2005 A PRESENTATION on What is this Page Known for? Computing Web Page Reputations D. Rafiei.


1 CSE 450 – Web Mining Seminar Professor Brian D. Davison Fall 2005 A PRESENTATION on What is this Page Known for? Computing Web Page Reputations D. Rafiei & A.O. Mendelzon WWW9 Conference, Amsterdam, May 2000 by Osama Ahmed Khan

2 OVERVIEW  Introduction  Motivation  Random Walks On The Web Graph  One-level Influence Propagation  Two-level Influence Propagation  Issues  Evaluation  Limitations

3 Introduction  To find a ranked set of pages which have a ‘reputation’ on a topic  To find a ranked set of topics on which a webpage has a ‘reputation’  ‘Reputation’: Evaluated on the basis of:  Navigation  Subsumption  Relatedness  Refutation  Justification

4 Motivation  Organizational Review  Page Classification  Personal Review

5 Random Walks On The Web Graph  Given a set S = {s_1, s_2, ..., s_n} of states, a ‘Random Walk’ at each step either:  Switches to another state  Stays in the current state
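The switch-or-stay behaviour on this slide can be sketched as a small simulation. This is only an illustration: the function name `random_walk`, the dictionary layout of `transitions`, and the fixed seed are assumptions made for the example, not something the slides specify.

```python
import random


def random_walk(transitions, start, steps, seed=0):
    """Simulate a random walk and count visits per state.

    transitions[s] maps each possible next state to its probability.
    A nonzero transitions[s][s] means the walk may stay in s.
    (Hypothetical helper; the data layout is an assumption.)
    """
    rng = random.Random(seed)
    visits = {s: 0 for s in transitions}
    state = start
    for _ in range(steps):
        visits[state] += 1
        candidates = list(transitions[state])
        weights = [transitions[state][s] for s in candidates]
        # Either switch to another state or stay, according to the weights.
        state = rng.choices(candidates, weights=weights, k=1)[0]
    return visits


# Example: from "b" the walk always switches to "a"; in "a" it always stays.
print(random_walk({"a": {"a": 1.0}, "b": {"a": 1.0}}, start="b", steps=5))
```

Counting visits this way is what the later slides turn into a reputation score: the more often the walk lands on a state, the higher its rank.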

6 One-level Influence Propagation  ‘Random Surfer’:  Selects a page  Follows a link  Reputation  Total number of visits

7 One-level Model  Reputation  Probability that random surfer looking for topic ‘t’ will visit page ‘p’ at step ‘n’ of the walk

8 One-level Model (Contd.)  One-level Reputation Rank  Equilibrium Probability of visiting page ‘p’ for topic ‘t’
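The slide's formula did not survive in this transcript, so the following is a minimal sketch of how a one-level reputation rank could be computed, assuming a PageRank-style recurrence: with probability `d` the surfer jumps to a uniformly chosen page containing the topic, and otherwise follows a random outgoing link. The value `d = 0.15`, the function name `one_level_rank`, and the input layout are assumptions for illustration, not the presenter's code.

```python
def one_level_rank(links, pages_with_topic, d=0.15, iterations=50):
    """Approximate the equilibrium visit probability per page for one topic.

    links: dict mapping a page to the list of pages it links to.
    pages_with_topic: set of pages that contain the topic term.
    d: assumed probability of jumping to a random on-topic page.
    """
    if not pages_with_topic:
        raise ValueError("topic must appear on at least one page")
    pages = set(links) | {p for outs in links.values() for p in outs}
    n_t = len(pages_with_topic)
    # Start with all probability mass spread over the on-topic pages.
    rank = {p: (1.0 / n_t if p in pages_with_topic else 0.0) for p in pages}
    for _ in range(iterations):
        # Jump term: only on-topic pages can be selected directly.
        nxt = {p: (d / n_t if p in pages_with_topic else 0.0) for p in pages}
        # Link term: each page passes (1 - d) of its rank to its out-links.
        for q, outs in links.items():
            if outs:
                share = (1 - d) * rank[q] / len(outs)
                for p in outs:
                    nxt[p] += share
        rank = nxt
    return rank


graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
print(one_level_rank(graph, pages_with_topic={"a"}))
```

Pages without outgoing links simply lose their mass in this sketch; a fuller implementation would have to decide how to redistribute it.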

9 Two-level Influence Propagation  ‘Random Surfer’:  Selects a page  Follows a link: Forward or Backward  Reputation  Authority: Total number of Forward visits  Hub: Total number of Backward visits

10 Two-level Model  Reputation  Authority: Probability that random surfer looking for topic ‘t’ makes a forward visit to page ‘p’ at step ‘n’ of the walk

11 Two-level Model (Contd.)  Reputation (Contd.)  Hub: Probability that random surfer looking for topic ‘t’ makes a backward visit to page ‘p’ at step ‘n’ of the walk

12 Two-level Model (Contd.)  Two-level Reputation Rank  Equilibrium Probability of visiting page ‘p’ for topic ‘t’ in direction associated to ‘r’
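As with the one-level model, the slide's equations are missing from this transcript. The sketch below assumes a hubs-and-authorities style recurrence in which the surfer alternates direction: authority mass arrives over forward links from hub mass, hub mass arrives over backward links from authority mass, and with probability `d` the surfer jumps to a random on-topic page, split evenly between the two directions. The jump probability, the even split, and all names are assumptions for illustration.

```python
def two_level_rank(links, pages_with_topic, d=0.15, iterations=50):
    """Approximate authority and hub reputation of each page for one topic.

    links: dict mapping a page to the list of pages it links to.
    pages_with_topic: set of pages that contain the topic term.
    Returns (authority, hub) dictionaries keyed by page.
    """
    if not pages_with_topic:
        raise ValueError("topic must appear on at least one page")
    pages = set(links) | {p for outs in links.values() for p in outs}
    n_t = len(pages_with_topic)
    out_deg = {p: len(links.get(p, ())) for p in pages}
    in_deg = {p: 0 for p in pages}
    for outs in links.values():
        for p in outs:
            in_deg[p] += 1
    # Assumed jump term: d is split evenly between the two directions.
    jump = {p: (d / (2 * n_t) if p in pages_with_topic else 0.0) for p in pages}
    auth, hub = dict(jump), dict(jump)
    for _ in range(iterations):
        new_auth, new_hub = dict(jump), dict(jump)
        for q, outs in links.items():
            for p in outs:
                # Forward visit: hub mass of q flows to the authority of p.
                new_auth[p] += (1 - d) * hub[q] / out_deg[q]
                # Backward visit: authority mass of p flows to the hub of q.
                new_hub[q] += (1 - d) * auth[p] / in_deg[p]
        auth, hub = new_auth, new_hub
    return auth, hub


graph = {"h1": ["a1", "a2"], "h2": ["a1"]}
auth, hub = two_level_rank(graph, pages_with_topic={"a1", "a2"})
print(auth)
print(hub)
```

In the example, `a1` is pointed to by both hubs and so ends up with more authority than `a2`, while `h1` points to both on-topic pages and ends up with more hub reputation than `h2`.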

13 Issues 1. Access to large crawl of Web: Computation  Set of pages where ranks are computed Algorithm 1 (One-level) Algorithm 2 (Two-level)  Set of topics on which ranks are computed Algorithm 1 (One-level) Algorithm 2 (Two-level)

14 Issues (Contd.) 2. No access to large crawl of Web: Approximation  Set of pages where ranks are computed Generalization of PageRank Model (One-level) Generalization of Hubs and Authorities Model (Two-level)  Set of topics on which ranks are computed Algorithm 3 (One-level) Algorithm 4 (Two-level)

15 Evaluation  No access to large crawl of Web: Approximation  Set of topics on which ranks are computed Algorithm 3 (One-level) Algorithm 4 (Two-level)

16 Evaluation (Contd.) 1. Known Authoritative Pages

17 Evaluation (Contd.) 2. Personal Home Pages

18 Evaluation (Contd.) 3. Unregulated Websites

19 Limitations  Topic representation on Web  Page connectivity

20 Thank You

