Search Engines Jan Damsgaard Dept. of Informatics Copenhagen Business School


1 Search Engines
Jan Damsgaard, Dept. of Informatics, Copenhagen Business School
http://www.cbs.dk/staff/damsgaard

2 Introduction
EBUSS, Jan Damsgaard, 2004
- Finding relevant information on the web is a major problem
- Size, growth, and the lack of universal semantic organization are the major impediments
- Two major strategies:
  1. Improve users' search capability by using raw computer power: search engines
  2. Help organize user-relevant information into meaningful categories and bundles of services: portals

3 Definitions
- Search engine
  - Specific information retrieval software that returns URLs and descriptions of web pages as results
- Portal
  - Site that serves as a major starting point for users when they connect to the web; portals combine directories, services, search capabilities, and personalization

4 Search Engines
- Technical and business solutions that provide these services on a mass scale are important internet phenomena for two reasons:
  1. They obtain immense hit rates and are therefore major points of origin for any internet activity
  2. They are the most important means to channel user search and retrieval
- They are therefore strategically important, as reflected in the market valuations of the search engine companies
  - www.mediametrix.com
  - www.nielsen-netratings.com

5 FDIM (top ten)
msn.dk              1,365,657
dr.dk                 863,095
krak.dk               794,017
tv2.dk                540,977
eniro.dk              496,780
ekstrabladet.dk       480,470
ofir.dk               454,639
tdconline.dk          412,923
bt.dk                 336,704
sol.dk                317,125
netdoktor.dk (26)     103,669

6 Look at the stickiness
- Top 10 sites in November 2000 in terms of minutes spent per month

7 Where Do Search Engines Develop Market Value?
- Market recognition, leading to
  - popular use and adoption
  - selling ad impressions
  - long-term contracts for search engine functionality
- Market assessment of real options associated with the recognition of the tool in the marketplace
  - future value-added alliances and spin-offs

8 Search engine basics
- Basic information retrieval techniques
- Market trends and capabilities
- Awareness of popular assessment metrics for search engine performance
- Search engine business models

9 How Search Engines Work
- Three components:
  - spider or link crawler software agent
  - index or catalog database of content
  - search engine software or combined meta-search engine
- Require significant hardware horsepower, server connectivity, and database capabilities
- If your site is not linked from elsewhere, you submit its links to the search engine yourself
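
The three components on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any production engine is built: the `TOY_WEB` dictionary, its URLs, and the function names `crawl`, `build_index`, and `search` are all hypothetical, and the "web" is an in-memory dictionary rather than real HTTP fetching.

```python
from collections import defaultdict, deque

# Hypothetical toy "web": URL -> (page text, outgoing links). Not real data.
TOY_WEB = {
    "a.example/home": ("search engines index web pages",
                       ["a.example/about", "b.example/"]),
    "a.example/about": ("portals combine directories and search",
                        ["a.example/home"]),
    "b.example/": ("web crawlers follow links between pages", []),
    "c.example/": ("this page has no inbound links", []),
}


def crawl(seed_urls):
    """Spider: breadth-first traversal of links, starting from the seeds."""
    seen, queue = set(seed_urls), deque(seed_urls)
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        yield url, text
        for link in links:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)


def build_index(pages):
    """Index: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages:
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Search software: return the URLs that contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(crawl(["a.example/home"]))
print(search(index, "web pages"))  # ['a.example/home', 'b.example/']
print(search(index, "inbound"))    # [] because nothing links to c.example/

# Submitting the unlinked page as an extra seed makes it findable.
index = build_index(crawl(["a.example/home", "c.example/"]))
print(search(index, "inbound"))    # ['c.example/']
```

The last two queries illustrate the final bullet: a page that no crawled page links to stays invisible until its URL is submitted as a seed.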

10 How do search engines work
- Add keywords to text fields
- Critical are the choice of keywords, the possibilities for combining them, and how the search engine exploits the results
- Multilingual support
- Another issue is how the engine organizes search results

11 The most popular search engines
Search Engine   Total from Dec. 2002   Total from March 2002   Total from Aug. 2001
Google          9,732                  8,371                   6,567
AlltheWeb       6,757                  4,388                   4,969
AltaVista       5,419                  3,432                   3,112
WiseNut         4,664                  5,009                   4,587
HotBot          3,680                  2,869                   3,277
MSN Search      3,267                  2,523                   3,005
Teoma           3,259                  1,839                   2,219
NLResearch      2,352                  3,610                   3,321
Gigablast       2,352                  NA

12 Popularity over time
- March 2002: Google, WiseNut, AlltheWeb
- August 2001: Google, Fast, WiseNut
- April 2001: Google, Fast, MSN (Inktomi)
- Oct. 2000: Fast, Google, Northern Light
- July 2000: iWon, Google, AltaVista
- April 2000: Fast, AltaVista, Northern Light
- Feb. 2000: Fast, Northern Light, AltaVista
- Jan. 2000: Fast, Northern Light, AltaVista
- Nov. 1999: Northern Light, Fast, AltaVista
- Sept. 1999: Fast, Northern Light, AltaVista
- Aug. 1999: Fast, Northern Light, AltaVista
- May 1999: Northern Light, AltaVista, Anzwers
- March 1999: Northern Light, AltaVista, HotBot
- January 1999: Northern Light, AltaVista, HotBot
- August 1998: AltaVista, Northern Light, HotBot
- May 1998: AltaVista, HotBot, Northern Light
- February 1998: HotBot, AltaVista, Northern Light
- October 1997: AltaVista, HotBot, Northern Light
- September 1997: Northern Light, Excite, HotBot
- June 1997: HotBot, AltaVista, Infoseek
- October 1996: HotBot, Excite, AltaVista
http://searchengineshowdown.com/stats/size.shtml

13 Also specific services
- E.g. Google provides
  - Find PDF files
  - Stock quotes
  - Cached links
  - Similar pages
  - Who links to you
  - Specific site
  - Dictionary definitions
  - Find maps

14 Major design issues: completeness and relevance
- The set of relevant replies
- The set of obtained results
- The larger the overlap, the better in terms of completeness
- The smaller the set of non-relevant replies, the more relevant the search
- How to organize the results for fast reviewing
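
The two sets on this slide correspond to the standard information retrieval measures of recall (completeness: the share of relevant replies that were obtained) and precision (relevance: the share of obtained results that are relevant). A minimal sketch, with a hypothetical function name and made-up page identifiers:

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for a set of relevant and retrieved items."""
    relevant, retrieved = set(relevant), set(retrieved)
    overlap = relevant & retrieved
    # Precision: fraction of the obtained results that are relevant.
    precision = len(overlap) / len(retrieved) if retrieved else 0.0
    # Recall: fraction of the relevant replies that were obtained.
    recall = len(overlap) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four relevant pages, three results returned.
relevant = {"p1", "p2", "p3", "p4"}
retrieved = {"p2", "p3", "p5"}
p, r = precision_recall(relevant, retrieved)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

Two of the three results are relevant (precision 2/3), but only two of the four relevant pages were found (recall 2/4).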

15 Page Ranking for Relevance
- Biased or unbiased by the search engine?
- The size of the search space (pages; e.g. Google currently addresses 1,346,966,000 pages)
- Use of keywords: in the title, in meta-tag information in the HTML code, or near the top of the page
- Use of other facilities like semantic nets or reliability indices (e.g. Google uses page ranks and filtering)
- Daily, weekly, or monthly refresh by the web crawler software
- For an analysis see http://www.notess.com/search/
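
The slide mentions that Google uses page ranks. The sketch below is the simplified textbook formulation of link-based ranking by power iteration, not Google's actual system; the link graph `TOY_LINKS`, the damping value 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Every link target must itself be a key of the dict.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share of the total rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page without outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page link graph.
TOY_LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(TOY_LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page C comes out on top because three pages link to it, and D comes last because nothing links to it. A keyword-based score (title, meta tags, position on the page) would be combined with such a link score in practice; how any given engine weighs the two is not stated in the slides.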

16 Special features of search engines
- Multilingual searches
- Natural language interfaces
- Image searches
- Agents (specific crawlers and service providers, e-mail, news agents, shopping and trading agents)

17 Search Assistance Features
- Phrase Searching
  - Finds the terms you enter into the search box as a phrase; the results tell you whether any full or partial matches were found
- Stemming
  - Ability of the search engine to search for variations of a word based on its stem
    - Entering "swim" might also find "swims" and maybe "swimming," depending on the search engine; stemming is more important in some other languages
    - Some search engines have stemming switched on by default
- Clustering
  - Allows only one page per site to be represented in the results
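
The three features above can each be illustrated with a few lines of Python. These are deliberately naive sketches under stated assumptions: `naive_stem` handles only the "-s" and "-ing" endings from the slide's example (real engines use far more complete stemmers), and the function names and example URLs are hypothetical.

```python
from urllib.parse import urlparse


def phrase_match(phrase, text):
    """Phrase searching: the words must appear adjacent and in order."""
    words, tokens = phrase.lower().split(), text.lower().split()
    return any(
        tokens[i:i + len(words)] == words
        for i in range(len(tokens) - len(words) + 1)
    )


def naive_stem(word):
    """Stemming: strip a trailing "-ing" or "-s" (toy rules only)."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        stem = word[:-3]
        # Undo a doubled final consonant: "swimm" -> "swim".
        if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
            stem = stem[:-1]
        return stem
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def cluster_by_site(ranked_urls):
    """Clustering: keep only the best-ranked page from each site."""
    seen_sites, clustered = set(), []
    for url in ranked_urls:
        site = urlparse(url).netloc
        if site not in seen_sites:
            seen_sites.add(site)
            clustered.append(url)
    return clustered


print(phrase_match("search engine", "a fast search engine index"))  # True
print([naive_stem(w) for w in ["swim", "swims", "swimming"]])
# ['swim', 'swim', 'swim']
print(cluster_by_site([
    "http://a.example/1",
    "http://a.example/2",
    "http://b.example/1",
]))
# ['http://a.example/1', 'http://b.example/1']
```

Stemming both the indexed words and the query words with the same function is what lets a query for "swim" match pages containing "swimming".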

19 Conclusions
- Search engines are key elements of Internet business
- The next wave will integrate new interfaces and new access channels (digital TV, wireless)
- Mass-scale business, with value in the installed base

