Amy N. Langville Mathematics Department College of Charleston Math Meet 2/20/10.


1 Amy N. Langville Mathematics Department College of Charleston langvillea@cofc.edu Math Meet 2/20/10

2 Outline: Short History of Web Search; Link Analysis and Google’s PageRank; The Random Surfer; Google-opoly; March Madness; Conclusion

3 Thesis 1998

4 Pre-1998 Web Trip back in time to 1995: How did you find information then? Better question: how old were you then?

8 Inverted Index Main tool of pre-1998 search engines
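
As a rough illustration of the inverted index idea, the Python sketch below maps each word to the pages that contain it, so answering a query becomes a lookup rather than a scan of every page. The three toy pages and the function name `build_inverted_index` are invented for this example; real engines also deal with punctuation, stemming, and word positions.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase word to the sorted list of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical toy pages, invented for illustration only.
pages = {
    1: "math meet at the college",
    2: "college basketball rankings",
    3: "web search and rankings",
}

index = build_inverted_index(pages)
print(index["college"])   # [1, 2]
print(index["rankings"])  # [2, 3]
```

Note that the index only says which pages contain a word, not which of them is best; that gap is what ranking has to fill.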

9 Problems with the Inverted Index Too many pages

11 Problems with the Inverted Index Too many pages Spam

12 Problems with the Inverted Index Too many pages Spam: human eyes vs. spider eyes


17 Problems with the Inverted Index Too many pages Spam: human eyes vs. spider eyes Learn how to make millions Win an iPod Text 8 if you’re awake

18 Link Analysis Pre-1998 engines used only text analysis. Link analysis saved search from SEOs and built companies like Google, Yahoo, Ask. Nearly every major search engine uses link analysis. Timeline: text analysis before 1998, link analysis after.

21 Moral #1 Sometimes being perceived as an expert forces you to become one.

22 What happens when you google? All the old text analysis + the new link analysis

23 What happens when you google? ranked list 1 2 3 4 5 6 7 8

24 Why are rankings so important?

25 Web as a graph Each node is a webpage. Each arrow is a hyperlink.

26 In-links vs. Out-links
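
To make the distinction concrete, here is a minimal Python sketch that stores a web graph as a dictionary of out-links and counts in-links and out-links for each page. The six-page graph is a hypothetical example chosen for illustration; it is not necessarily the graph pictured on the slides.

```python
# Hypothetical six-page web: each key is a page, each value lists the
# pages it links to (its out-links).
links = {
    1: [2, 3],
    2: [],
    3: [1, 2, 5],
    4: [5, 6],
    5: [4, 6],
    6: [4],
}

# Out-links: hyperlinks leaving a page.
out_degree = {page: len(targets) for page, targets in links.items()}

# In-links: hyperlinks pointing at a page from other pages.
in_degree = {page: 0 for page in links}
for targets in links.values():
    for target in targets:
        in_degree[target] += 1

print(out_degree)  # {1: 2, 2: 0, 3: 3, 4: 2, 5: 2, 6: 1}
print(in_degree)   # {1: 1, 2: 2, 3: 1, 4: 2, 5: 2, 6: 2}
```

A page controls its own out-links but not its in-links, which is why in-links are the more trustworthy signal of importance.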

27 A Trip to Google-topia Emmie Randy, the Random Surfer video clip

28 A Random Walk on the Web graph
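
A random walk of this kind can be sketched in a few lines of Python: from the current page, pick one of its out-links uniformly at random and follow it. The graph and the helper name `random_walk` are assumptions made for illustration, and this simple version just stops if it reaches a page with no out-links.

```python
import random


def random_walk(links, start, steps, rng):
    """Follow uniformly random out-links; stop early at a page with none."""
    path = [start]
    for _ in range(steps):
        targets = links[path[-1]]
        if not targets:
            break
        path.append(rng.choice(targets))
    return path


# Hypothetical six-page web, invented for illustration.
links = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

path = random_walk(links, start=1, steps=20, rng=random.Random(0))
print(path)
```

Pages that the walk visits often are the ones many paths lead to, which is the intuition behind ranking pages by visit frequency.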

33 Matrix Notation
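
The matrix on this slide is not preserved in the transcript. A common convention, assumed here, is a hyperlink matrix H in which row i spreads probability evenly over the out-links of page i, so H[i][j] = 1/(number of out-links of page i) when page i links to page j and 0 otherwise. The sketch below builds that matrix for a hypothetical six-page graph using exact fractions.

```python
from fractions import Fraction


def hyperlink_matrix(links):
    """Row i holds the probabilities of moving from page i to each page."""
    pages = sorted(links)
    column = {page: j for j, page in enumerate(pages)}
    H = [[Fraction(0)] * len(pages) for _ in pages]
    for i, page in enumerate(pages):
        for target in links[page]:
            H[i][column[target]] = Fraction(1, len(links[page]))
    return H


# Hypothetical six-page web, invented for illustration.
links = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

H = hyperlink_matrix(links)
for row in H:
    print([str(entry) for entry in row])
```

Every row with at least one out-link sums to 1; a page with no out-links produces a row of zeros.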

34 BUT THERE ARE SOME PROBLEMS!

37 The surfer gets stuck! This is called a dangling node. How does Google fix this?

38 The surfer can “teleport” We add a link from the dangling node to every other node. When web surfing, this is equivalent to typing an address in the URL bar.

39 Probability Matrix We must also take this into consideration for our probability matrix.
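
One way to express the dangling-node fix in the probability matrix is to replace each all-zero row with a uniform row. The sketch below assumes the probability is spread over all n pages, which is a common convention; reading "every other node" literally would exclude the dangling page itself and use 1/(n-1) instead. The matrix and the name `fix_dangling_rows` are illustrative.

```python
from fractions import Fraction


def fix_dangling_rows(H):
    """Replace each all-zero row with a uniform row (assumed 1/n convention)."""
    n = len(H)
    uniform = [Fraction(1, n)] * n
    return [list(row) if any(row) else list(uniform) for row in H]


half = Fraction(1, 2)
# Hypothetical three-page example: page 2 (the middle row) is dangling.
H = [
    [0, half, half],
    [0, 0, 0],
    [1, 0, 0],
]

S = fix_dangling_rows(H)
print([str(entry) for entry in S[1]])  # ['1/3', '1/3', '1/3']
```

After the fix every row sums to 1, so the surfer always has somewhere to go.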

40 Dangling nodes and teleportation video clip

41 Let’s look at another problem.

49 Our surfer gets stuck in the webpages 4, 5, and 6. This is called a cycle. How do we fix this?

50 Cycling video clip

51 Full Teleportation At any time, the surfer might type an address into the URL bar. To model this, we add an extra link from every vertex to every other vertex.

52 Surfing vs. teleporting Do people use the URL bar as much as they use hyperlinks? Google doesn’t think so. Its model assumes you use the URL bar only about 15% of the time.
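
The 85/15 split can be written as a weighted blend: with probability 0.85 the surfer follows a hyperlink, and with probability 0.15 the surfer teleports to a page chosen uniformly at random. The sketch below builds the blended matrix from a row-stochastic matrix S; the names `google_matrix` and `alpha` and the three-page matrix are assumptions for illustration.

```python
def google_matrix(S, alpha=0.85):
    """Blend link-following (weight alpha) with uniform teleportation."""
    n = len(S)
    return [
        [alpha * S[i][j] + (1 - alpha) / n for j in range(n)]
        for i in range(n)
    ]


# Hypothetical three-page matrix whose rows already sum to 1.
S = [
    [0.0, 0.5, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
    [1.0, 0.0, 0.0],
]

G = google_matrix(S)
for row in G:
    print([round(entry, 4) for entry in row])
```

Every entry of G is positive, so the surfer can reach any page from any other page and can no longer be trapped in a cycle.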

53 Computing PageRank by observing Randy video clip
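
The idea of computing PageRank by watching the surfer can be imitated with a simulation: let the surfer wander for many steps and record the fraction of time spent on each page. This is a sketch under assumptions, not the method shown in the video clip: the graph is hypothetical, teleportation is uniform over all pages, and a surfer on a dangling page always teleports.

```python
import random
from collections import Counter


def observe_surfer(links, steps, alpha=0.85, seed=0):
    """Estimate page importance as the fraction of steps spent on each page."""
    rng = random.Random(seed)
    pages = sorted(links)
    current = pages[0]
    visits = Counter()
    for _ in range(steps):
        visits[current] += 1
        targets = links[current]
        if targets and rng.random() < alpha:
            current = rng.choice(targets)   # follow a hyperlink
        else:
            current = rng.choice(pages)     # teleport via the URL bar
    return {page: visits[page] / steps for page in pages}


# Hypothetical six-page web, invented for illustration.
links = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

frequencies = observe_surfer(links, steps=50_000)
for page, share in frequencies.items():
    print(page, round(share, 3))
```

Longer observation gives steadier estimates; the exact values depend on the random seed.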

54 Summary of Ranking (1) Search query. (2) Pull out relevant webpages from inverted index. (3) Use PageRank and other information to rank webpages.

55 Creators of Google Sergey Brin and Larry Page. Computer Science majors. Now entire PhD programs in information retrieval. The world’s largest eigenvector computation.
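
On a small scale, the eigenvector computation can be sketched with the power method: start from equal ranks and repeatedly push each page's rank along its out-links until the numbers stop changing. The code below is a toy version under stated assumptions (a hypothetical six-page graph, alpha = 0.85, dangling pages spreading their rank uniformly); it is not a description of Google's production system.

```python
def pagerank(links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a small graph of out-links."""
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(max_iter):
        # Rank sitting on dangling pages is shared equally by all pages.
        dangling = sum(rank[page] for page in pages if not links[page])
        new = {page: (1 - alpha) / n + alpha * dangling / n for page in pages}
        for page in pages:
            for target in links[page]:
                new[target] += alpha * rank[page] / len(links[page])
        change = sum(abs(new[page] - rank[page]) for page in pages)
        rank = new
        if change < tol:
            break
    return rank


# Hypothetical six-page web, invented for illustration.
links = {1: [2, 3], 2: [], 3: [1, 2, 5], 4: [5, 6], 5: [4, 6], 6: [4]}

ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The ranks always sum to 1, so each value can be read as the long-run share of the surfer's time spent on that page.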

57 Moral #2 Take a leave of absence for brilliant ideas.

58 More on PageRank SIAM’s WhydoMath? Project – url = http://dev.whydomath.org/node/google/index.html DDL on PageRank – url = http://spinner.cofc.edu/~langvillea/DISSECTION-LAB/ClarePageRankModule/1_WebLetter.html?referrer=webcluster& LOCI: Google-opoly – url = http://mathdl.maa.org/mathDL/23/?pa=content&sa=viewDocument&nodeId=3355

59 Moral #3 The more ways you can view a problem, the more likely you are to truly understand it, and hence, solve it.

60 Google-opoly applets

61 March Madness How should teams vote? Losing teams give one vote to each team that beats them. Losing teams vote with margin of victory. Both winning and losing teams vote with # points scored.

62 Point Differential Voting
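
As a sketch of point differential voting, each losing team can cast a vote for the team that beat it, weighted by the margin of victory, and the votes are then normalized so that each team's votes sum to 1. The team names, scores, and the function name `point_differential_votes` are invented for illustration; the data on the slides is not preserved in the transcript.

```python
def point_differential_votes(games):
    """Return, for each team, its normalized votes for teams that beat it."""
    votes = {}
    for winner, winner_pts, loser, loser_pts in games:
        votes.setdefault(winner, {})
        row = votes.setdefault(loser, {})
        row[winner] = row.get(winner, 0) + (winner_pts - loser_pts)
    normalized = {}
    for team, row in votes.items():
        total = sum(row.values())
        normalized[team] = {w: pts / total for w, pts in row.items()} if total else {}
    return normalized


# Hypothetical results: (winner, winner points, loser, loser points).
games = [
    ("A", 70, "B", 60),
    ("A", 80, "C", 50),
    ("B", 65, "C", 60),
]

votes = point_differential_votes(games)
print(votes["B"])  # {'A': 1.0}
print(votes["C"])
print(votes["A"])  # {}
```

An undefeated team casts no votes, so it plays the same role as a dangling node on the web graph and needs the same kind of fix before ranks are computed.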

66 Moral #4 Now is a great time to do math.

67 Conclusion PageRank is a sophisticated algorithm that set Google apart. The Web can be represented with graphs and matrices. PageRank’s idea of voting has many applications.

68 Acknowledgements Tim Chartier, Carl Meyer, Emmie Douglas, Kathryn Pedings, Clare Rodgers, Erich Kreutzer, Ben Kovanich, Ryan Dumville, Luke Ingram, Anjela Govan, Nick Dovidio, Yoshi Yamamoto, Neil Goodson, Colin Stephenson

