Mark Levene, An Introduction to Search Engines and Web Navigation © Pearson Education Limited 2005 Slide 4.1 Chapter 4 : Searching the Web The mechanics.

Mark Levene, An Introduction to Search Engines and Web Navigation © Pearson Education Limited 2005
Slide 4.1: Chapter 4: Searching the Web. The mechanics of a typical search. Search engines as information gatekeepers. The search engine wars. Statistics from search engine logs. The architecture of a search engine. The search index. The query engine. Crawling the web.

Slide 4.2: Mechanics of a Typical Search. Figure 4.1: Query submitted to Google

Slide 4.3: Mechanics of a Typical Search. Figure 4.2: Google results for the query

Slide 4.4: Mechanics of a Typical Search. Figure 4.3: Category of first result

Slide 4.5: Mechanics of a Typical Search. Figure 4.4: Result for phrase query

Slide 4.6: Search Engines as Information Gatekeepers. Search engines are becoming the primary entry point for discovering web pages. Ranking of web pages influences which pages users will view. Exclusion of a site from search engines will cut off the site from its intended audience. The privacy policy of a search engine is important.

Slide 4.7: Search Engine Wars. The battle for domination of the web search space is heating up! The competition is good news for users! The way in which advertising is combined with search results is crucial! There are serious implications if one of the search engines manages to dominate the space!

Slide 4.8: Google. The verb "google" has become synonymous with searching for information on the web. Has raised the bar on search quality. Has been the most popular search engine in the last few years. Had a very successful IPO in August 2004. Is innovative and dynamic.

Slide 4.9: Yahoo! Synonymous with the dot-com boom, probably the best known brand on the web. Started off as a web directory service. Has very strong advertising and e-commerce partnerships. Acquired leading search engine technology in 2003.

Slide 4.10: MSN Search. Synonymous with PC software. Remember its victory in the browser wars with Netscape. Developed its own search engine technology only recently, officially launched in Feb. 2005. May link web search into its next version of Windows.

Slide 4.11: Others
Ask Jeeves
– Specialises in natural language question answering.
– Search driven by Teoma.
Looksmart
– Has its own directory service.
– Search driven by Wisenut.
…

Slide 4.12: Statistics from search engine logs

Statistic (Year)                    AltaVista (1998)   AlltheWeb (2002)   Excite (2001)
Average terms per query             2.35               2.30               2.60
Average queries per session         2.02               2.80               2.30
Average result pages viewed         1.39               1.55               1.70
Usage of advanced search features   20.4%              1.0%               10.0%

Slide 4.13: Experiment with search engine query syntax. The default is AND, e.g. computer chess is normally interpreted as computer AND chess, i.e. both keywords must be present in every hit. +chess in a query means the user insists that chess be present in every hit. computer OR chess means that at least one of the keywords must be present in every hit. "computer chess" (in quotes) means that the exact phrase computer chess must be present in every hit.
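The query syntax rules on this slide can be illustrated with a minimal Python sketch. It is a simplification, not how any real search engine parses queries: the function names (`tokenize`, `matches`, `hits`) and the four sample documents are hypothetical, and a query containing OR is treated as a pure OR of all its terms.

```python
import re

def tokenize(text):
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(words, phrase_words):
    """True if phrase_words occurs as a consecutive run inside words."""
    n = len(phrase_words)
    return any(words[i:i + n] == phrase_words
               for i in range(len(words) - n + 1))

def term_present(term, words):
    """Check a single query part: a quoted phrase, +keyword, or keyword."""
    if term.startswith('"'):
        return contains_phrase(words, tokenize(term))
    return term.lstrip("+").lower() in words

def matches(query, text):
    """Apply the simplified rules: OR if present, otherwise default AND."""
    words = tokenize(text)
    # Split the query into quoted phrases and bare terms.
    parts = re.findall(r'"[^"]+"|\S+', query)
    if "OR" in parts:
        terms = [p for p in parts if p != "OR"]
        return any(term_present(t, words) for t in terms)
    return all(term_present(p, words) for p in parts)

# Hypothetical sample documents, for illustration only.
docs = {
    "a": "Computer chess programs now beat grandmasters",
    "b": "A chess opening guide",
    "c": "How to build a computer",
    "d": "Chess on a computer",
}

def hits(query):
    """Return the ids of all sample documents matching the query."""
    return sorted(doc_id for doc_id, text in docs.items()
                  if matches(query, text))

print(hits('computer chess'))     # default AND
print(hits('computer OR chess'))  # either keyword
print(hits('"computer chess"'))   # exact phrase
```

Document "d" matches the AND query but not the phrase query, because its two keywords are not adjacent.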

Slide 4.14: The most popular search keywords. AltaVista (1998), AlltheWeb (2002), Excite (2001): sexfree appletsex pornodownloadpictures mp3softwarenew chatuknude

Slide 4.15: Architecture of a Search Engine

Slide 4.16: Search Index - Inverted File. The index also stores the position of each word in the web page and information on HTML structure.
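As a rough illustration of an inverted file that records word positions, here is a minimal Python sketch. The function name `build_inverted_file`, the page ids, and the sample texts are assumptions made for the example. HTML structure information, which the slide also mentions, is left out.

```python
import re
from collections import defaultdict

def build_inverted_file(pages):
    """Map each word to {page id: [positions of the word in that page]}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(page_id, []).append(position)
    return index

# Hypothetical pages, for illustration only.
pages = {
    "p1": "computer chess",
    "p2": "chess is a game; chess clubs",
}
index = build_inverted_file(pages)

# The posting list for a word gives every page containing it,
# together with the word offsets inside each page.
print(index["chess"])
```

Storing positions is what makes phrase queries possible: two words form a phrase in a page when their positions there differ by exactly one.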

Slide 4.17: The query engine. The interface between the search index, the user and the web. Algorithmic details of commercial search engines are kept as trade secrets. The first step is retrieval of potential results from the index. The second step is the ranking of the results based on their relevance to the query.
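The two steps described here, retrieval followed by ranking, can be sketched in Python as below. Since the real ranking algorithms are trade secrets, the score used (summed term counts) is only a stand-in chosen for the example, and all function names and sample pages are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to {page id: number of occurrences in that page}."""
    index = {}
    for page_id, text in pages.items():
        for word, count in Counter(tokenize(text)).items():
            index.setdefault(word, {})[page_id] = count
    return index

def retrieve(index, terms):
    """Step 1: collect the pages that contain every query term (AND)."""
    postings = [set(index.get(term, {})) for term in terms]
    return set.intersection(*postings) if postings else set()

def rank(index, terms, candidates):
    """Step 2: order candidates by a toy relevance score.

    The score is the summed count of the query terms in the page,
    an assumption for illustration; ties are broken by page id.
    """
    def score(page_id):
        return sum(index[term][page_id] for term in terms)
    return sorted(candidates, key=lambda page_id: (-score(page_id), page_id))

def search(index, query):
    """Run both steps for a keyword query."""
    terms = tokenize(query)
    return rank(index, terms, retrieve(index, terms))

# Hypothetical pages, for illustration only.
pages = {
    "p1": "computer chess",
    "p2": "chess computer plays chess against a computer",
    "p3": "chess club",
}
index = build_index(pages)
print(search(index, "computer chess"))
```

Page "p3" is dropped in the retrieval step because it lacks one of the keywords, so ranking only has to order the remaining candidates.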

Slide 4.18: Portal User Interface (See also yahoo.com)

Slide 4.19: Crawling the Web

Slide 4.20: Delivering a global search service. See: Web Search for a Planet: The Google Cluster Architecture (IEEE Micro, 2003).

