GOOGLE SEARCH ENGINE Presented By Richa Manchanda
What is Google search? Google search is a Web search engine owned by Google Inc., and is the most used search engine on the Web. Google receives several hundred million queries each day through its various services.
History Google began in January 1996 as a research project by Larry Page, a Ph.D. student at Stanford. He was soon joined by Sergey Brin, a fellow Stanford Ph.D. student. In search of a dissertation theme, Page considered, among other things, exploring the mathematical properties of the World Wide Web and understanding its link structure.
Left to right: Eric E. Schmidt, Sergey Brin and Larry Page of Google.
How does a web search engine work? PageRank, indexing, searching.
Page Rank PageRank is a link analysis algorithm used by the Google Internet search engine that assigns a numerical weight to each element of a hyperlinked set of documents, such as the Web.
A PageRank results from a "ballot" among all the other pages on the World Wide Web about how important a page is. A hyperlink to a page counts as a vote of support. Google assigns a numeric weighting from 0 to 10 for each webpage on the Internet; this PageRank denotes a site's importance in the eyes of Google.
Example Assume four web pages: A, B, C and D. Each document would begin with an estimated PageRank of 0.25. If pages B, C, and D each link only to A, they would each confer 0.25 PageRank to A. All of their PageRank would thus gather at A, because all links point to A, giving A a PageRank of 0.75 in this simplified calculation. A Web crawler may use PageRank as one of a number of importance metrics to determine which URL to visit next during a crawl of the web.
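The four-page example above can be checked with a few lines of Python. This is only a sketch of the simplified calculation on the slide: the function name `pagerank_step` is made up for illustration, and the damping factor and repeated iteration used by the full algorithm are left out.

```python
def pagerank_step(ranks, links):
    """One simplified step: each page splits its rank evenly among the pages it links to."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, outlinks in links.items():
        if not outlinks:
            continue  # assumption: a page with no outgoing links passes nothing on
        share = ranks[page] / len(outlinks)
        for target in outlinks:
            new_ranks[target] += share
    return new_ranks

# Four pages, each starting with an estimated PageRank of 0.25.
ranks = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
# B, C and D each link only to A; A links to nothing.
links = {"A": [], "B": ["A"], "C": ["A"], "D": ["A"]}

after = pagerank_step(ranks, links)
print(after["A"])  # 0.75
```

A receives 0.25 from each of B, C and D, which is the 0.75 stated in the example.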
Indexing Search engine indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query.
Without an index, the search engine would have to scan every document in the corpus for each query, which would require considerable time and computing power.
Index design factors: merge factors, storage techniques, index size, lookup speed, maintenance.
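The difference between scanning and indexing can be sketched as follows. The three-document corpus and the function names are invented for illustration, and an inverted index (word to document ids) is assumed here as one common index structure; the slides do not say which structure Google uses.

```python
from collections import defaultdict

# Hypothetical three-document corpus, used only for illustration.
docs = {
    1: "custom wheels and chrome rims",
    2: "the wheel is a circular device",
    3: "wheel of fortune",
}

def linear_scan(docs, term):
    """Without an index: look inside every document for the term."""
    return {doc_id for doc_id, text in docs.items() if term in text.lower().split()}

def build_index(docs):
    """Build an inverted index mapping each word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

index = build_index(docs)          # built once, ahead of any query
hits = index.get("wheel", set())   # a single dictionary lookup per query term
print(sorted(hits))                # [2, 3]
```

Both approaches return the same documents; the index simply moves the expensive pass over the corpus from query time to build time.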
Web search engine Web search engines work by storing information about many web pages, which they retrieve from the WWW itself. These pages are retrieved by a Web crawler, an automated Web browser which follows every link it sees.
The contents of each page are then analyzed to determine how it should be indexed. Data about web pages are stored in an index database for use in later queries.
When a user enters a query into a search engine, the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title.
Search syntax It accepts queries as a simple text area, and breaks up the user's text into a sequence of search terms. These are usually words that are to occur in the results, but may also be phrases delimited by quotation marks ("), or qualified terms with a prefix such as "+", "-", or one of several advanced operators, such as "site:". Google's Advanced Search web form gives several additional fields which may be used to qualify searches by such criteria as date of first retrieval.
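The crawl-then-store loop described above can be sketched without touching the network. Everything here is an assumption for illustration: the `web` dictionary stands in for the real Web, the `.example` URLs are placeholders, and a real crawler would also handle fetching, politeness rules and parsing.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
web = {
    "a.example/": ("home page", ["a.example/about", "b.example/"]),
    "a.example/about": ("about us", ["a.example/"]),
    "b.example/": ("another site", ["c.example/missing"]),
    "d.example/": ("unlinked page", []),
}

def crawl(seed, web):
    """Follow every link reachable from the seed, breadth first, and keep each page's text."""
    seen = {seed}
    queue = deque([seed])
    fetched = {}
    while queue:
        url = queue.popleft()
        page = web.get(url)
        if page is None:
            continue  # link points to a page that cannot be retrieved
        text, links = page
        fetched[url] = text  # stored for later analysis and indexing
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

pages = crawl("a.example/", web)
print(list(pages))  # ['a.example/', 'a.example/about', 'b.example/']
```

The page at `d.example/` is never stored because no crawled page links to it, which is why a crawler only sees what is reachable from its starting points.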
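To make the term types above concrete, here is a rough sketch of how such a query string could be split up. It is not Google's parser: the regular expression, the `parse_query` name and the output layout are assumptions chosen for illustration, and only the `+`, `-`, quoted phrase and `site:` forms mentioned above are handled.

```python
import re

# An optional +/- prefix, followed by either a "quoted phrase" or a run of non-space characters.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query string into plain terms, phrases, required/excluded terms and a site filter."""
    parsed = {"terms": [], "phrases": [], "required": [], "excluded": [], "site": None}
    for sign, phrase, word in TOKEN.findall(query):
        text = phrase or word
        if not text:
            continue  # empty quotes carry no search term
        if not sign and word.lower().startswith("site:"):
            parsed["site"] = word[len("site:"):]
        elif sign == "+":
            parsed["required"].append(text)
        elif sign == "-":
            parsed["excluded"].append(text)
        elif phrase:
            parsed["phrases"].append(text)
        else:
            parsed["terms"].append(text)
    return parsed

result = parse_query('wheel "custom wheels" +rims -chrome site:en.wikipedia.org')
print(result)
```

For this input the sketch yields one plain term (`wheel`), one phrase (`custom wheels`), one required term (`rims`), one excluded term (`chrome`) and the site filter `en.wikipedia.org`.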
Query expansion Query expansion (QE) is the process of reformulating a seed query to improve retrieval performance in information retrieval operations. Query expansion involves techniques such as:
> Finding synonyms of words, and searching for the synonyms as well.
> Finding all the various morphological forms of words by stemming each word in the search query.
> Fixing spelling errors and automatically searching for the corrected form or suggesting it in the results.
> Re-weighting the terms in the original query.
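The four techniques listed above can be combined in a toy expander. All of the data and rules here are assumptions made for illustration: the synonym table and vocabulary are invented, the stemmer is a crude suffix stripper rather than a real stemming algorithm, spelling correction relies on the standard library's `difflib.get_close_matches`, and the weights 1.0 and 0.5 are arbitrary.

```python
import difflib

# Hypothetical resources; a real system would use far larger ones.
SYNONYMS = {"wheel": ["rim"], "car": ["automobile"]}
VOCABULARY = ["car", "wheel", "wheels", "custom", "chrome"]

def stem(word):
    """Crude stemming: strip a few common English suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def correct(word):
    """Replace an unknown word with its closest vocabulary entry, if one is close enough."""
    if word in VOCABULARY:
        return word
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else word

def expand(query):
    """Return a term -> weight mapping: original terms weigh 1.0, added variants 0.5."""
    weights = {}
    for raw in query.lower().split():
        word = correct(raw)                      # spelling correction
        weights[word] = 1.0                      # original (corrected) term keeps full weight
        base = stem(word)                        # morphological form
        for variant in [base] + SYNONYMS.get(base, []):
            weights.setdefault(variant, 0.5)     # re-weighting: variants count for less
    return weights

expanded = expand("whel wheels")
print(expanded)
```

Here the misspelled "whel" is corrected to "wheel", "wheels" is stemmed to the same base form, and the synonym "rim" is added with a lower weight.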
Search engine optimization It is the process of improving the volume and quality of traffic to a web site from search engines via search results. The higher a site's "page rank", the more visitors it tends to receive from the search engine.
Search engine result page A search engine results page, or SERP, is the listing of web pages returned by a search engine in response to a keyword query. The results normally include a list of web pages with titles, a link to the page, and a short description showing where the keywords have matched content within the page. A SERP usually also contains some ads alongside the results of the search.
Example Google results page (search: wheel)
Wheel - Wikipedia, the free encyclopedia. A wheel is a circular device that is capable of rotating on its axis, facilitating movement or transportation whilst supporting a load (mass), or performing... en.wikipedia.org/wiki/Wheel - 61k
Custom wheels, chrome wheels, chrome rims, truck wheels, car rims... Custom wheels and rims distributor for wholesale and retail customers. We offer custom wheels and tire packages, chrome wheels, chrome rims, truck wheels,... customwheel.com/ - 54k
Wheel of Fortune. Will Daniel make the grade on Wheel? Find out! These teachers deserve a break from the classroom and the chance to win thousands in... www.wheeloffortune.com/ - 18k
The Wheel. The Wheel is a leading support and representative network for the community and voluntary sector in Ireland. www.wheel.ie/ - 18k
Error messages "We're sorry... but your query looks similar to automated requests from a computer virus or spyware application. To protect our users, we can't process your request right now. We'll restore your access as quickly as possible, so try again soon. We apologize for the inconvenience, and hope we'll see you again on Google."
Conclusions Google is designed to be a scalable search engine. The primary goal is to provide high quality search results over a rapidly growing World Wide Web. Google employs a number of techniques to improve search quality, including PageRank, anchor text, and proximity information.
Future Work A large-scale web search engine is a complex system and much remains to be done. Our immediate goals are to improve search efficiency and to scale to approximately 100 million web pages. Some simple improvements to efficiency include query caching and smart disk allocation. Another area which requires much research is updates.