Search engine note. Search Signals “Heuristics” which allow for the sorting of search results – Word based: frequency, position, … – HTML based: emphasis,


1 Search engine note

2 Search Signals
“Heuristics” which allow for the sorting of search results
– Word based: frequency, position, …
– HTML based: emphasis, header
– URI based: server name, URL
– Page based: not dependent on the search term, but on the page features; PageRank is the most important
Search results are a combination of these

3 Anchor text
Other pages, images, documents, etc. are linked via “anchors”
– E.g., …, etc.
Text around the anchor describes the linked page
– UFOs are stealing our cows!
These words index to the LINKED page
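A minimal Python sketch of the anchor-text idea, using only the standard library. It makes one simplifying assumption: the descriptive text is taken to be the text enclosed by the `<a>` tag, not the wider text around it. The names `AnchorCollector` and `index_anchor_text`, the sample page, and the example.com URL are all hypothetical and do not come from the slides.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def index_anchor_text(html):
    """Map each anchor-text word to the set of LINKED URLs (not the linking page)."""
    collector = AnchorCollector()
    collector.feed(html)
    index = defaultdict(set)
    for href, text in collector.links:
        for word in text.lower().split():
            word = word.strip("!?.,")
            if word:
                index[word].add(href)
    return dict(index)


# Hypothetical page and URL, for illustration only.
page = '<p>News: <a href="http://example.com/cows">UFOs are stealing our cows!</a></p>'
anchor_index = index_anchor_text(page)
```

Here the words of the link text are credited to the linked URL, while text outside the anchor ("News:") is not indexed at all.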

4 Search “algorithm”
Single or multi-word
– For every word in the query, find the pages the word occurs on and compute
  – Group 1: pages with all those words (intersection)
  – Group 2: pages with any of those words (union)
– For every page in the returned set, sort by the formula
  – k1 * signal1 + k2 * signal2 + … + kn * signaln
  – (having the k’s sum to 1 is advantageous computationally)
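The steps on this slide can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the deck's implementation: signals are assumed to be precomputed per-page numbers (real word-based signals would depend on the query), Group 2 is returned without the pages already in Group 1, and the function name, signal names, weights, and sample data are hypothetical.

```python
def search(query, inverted_index, signals, weights):
    """Return (pages with ALL query words, remaining pages with ANY query word),
    each list sorted by the weighted sum k1*signal1 + ... + kn*signaln."""
    words = query.lower().split()
    postings = [inverted_index.get(word, set()) for word in words]
    if not postings:
        return [], []

    group1 = set.intersection(*postings)        # all words (intersection)
    group2 = set.union(*postings) - group1      # any word (union), minus group 1

    def score(page):
        return sum(k * signals[page][name] for name, k in weights.items())

    def rank(pages):
        # Highest score first; page id breaks ties so the order is stable.
        return sorted(pages, key=lambda page: (-score(page), page))

    return rank(group1), rank(group2)


# Hypothetical data, for illustration only.
inverted_index = {"ufo": {"a", "b"}, "cows": {"b", "c"}}
signals = {
    "a": {"frequency": 0.9, "pagerank": 0.2},
    "b": {"frequency": 0.5, "pagerank": 0.8},
    "c": {"frequency": 0.1, "pagerank": 0.4},
}
weights = {"frequency": 0.4, "pagerank": 0.6}   # the k's, chosen to sum to 1

all_words, some_words = search("ufo cows", inverted_index, signals, weights)
```

With this data, page "b" is the only page containing both words, and pages "a" and "c" follow in order of their weighted scores.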

5 Indexes
Search index
– For every page, what words occur on that page
  Plus “features” of word occurrence (location, HTML, etc.)
Inverted (reverse) index
– For every word, what pages it occurs on
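As a rough sketch, both indexes can be built in one pass over the pages. The example below assumes whitespace tokenization and records only one "feature" per occurrence, the word position; the function name `build_indexes` and the sample pages are hypothetical.

```python
from collections import defaultdict


def build_indexes(pages):
    """Build a search index (page -> word -> positions) and an
    inverted index (word -> set of pages) from {page_id: text}."""
    forward = {}
    inverted = defaultdict(set)
    for page_id, text in pages.items():
        forward[page_id] = {}
        for position, word in enumerate(text.lower().split()):
            # Feature stored per occurrence: the word's position on the page.
            forward[page_id].setdefault(word, []).append(position)
            inverted[word].add(page_id)
    return forward, dict(inverted)


# Hypothetical pages, for illustration only.
pages = {"p1": "ufos steal cows", "p2": "cows eat grass"}
forward, inverted = build_indexes(pages)
```

Looking up `inverted["cows"]` answers "which pages contain this word", while `forward["p1"]` answers "which words occur on this page, and where".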

6 Summary
http://www.youtube.com/watch?v=fnSJBpB_OKQ

