How to build a better Google? Adam Bak IST 497E November 21, 2002.


1 How to build a better Google? Adam Bak IST 497E November 21, 2002

2 Google Timeline  1995 March-December – Ph.D. candidates Sergey Brin and Larry Page meet at Stanford University and discuss ideas about new search technology  1996-1997 January 1996-December 1997 – Brin and Page create BackRub

3 Google Timeline  1998 August-December – Sergey and Larry raise one million dollars in funding and create Google Inc.; 10,000 search queries per day  1999 February-June – 500,000 search queries per day

4 Google Timeline  1999 August-December – 3 million searches per day  2000 May-June – 18 million search queries per day; November-December – 60 million searches per day

5 Google Timeline  2002 May – 150 million searches per day

6 Google’s Current Technology  PageRank  Does not simply count direct links; it also weighs the rank of the pages that cast them  Page A would have a lower rank if pages B and C, which link to it, did not have a high weighting
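
The idea on this slide can be sketched with the textbook power-iteration form of PageRank. This is a minimal illustration, not Google's production method: the damping factor of 0.85, the iteration count, and the four-page link graph (pages A to D) are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical graph: B and C link to A, and D links to both B and C.
ranks = pagerank({"B": ["A"], "C": ["A"], "D": ["B", "C"], "A": []})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph A ends up ranked highest because B and C pass their own rank on to it, and B and C in turn outrank D because D links to them.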

7 Google’s Current Technology  Hypertext-Matching Analysis  Font size – The larger and bolder the font, the higher the weight  Capitalization – Higher weight  Relative distance – How close the query words appear to each other on the page; example: “Peanut Butter”
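
The relative-distance signal can be illustrated with a small proximity score. This is only a sketch under assumptions: the function names and the 1/distance scoring are invented for the example, and the slide does not say how Google actually combines distance with font size or capitalization.

```python
def min_term_distance(text, term_a, term_b):
    """Smallest gap, in words, between any occurrence of the two terms.

    Returns None if either term is missing. Splits on whitespace only,
    so punctuation attached to a word is not stripped.
    """
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


def proximity_score(text, term_a, term_b):
    """Hypothetical score: 1.0 for adjacent terms, falling as they drift apart."""
    distance = min_term_distance(text, term_a, term_b)
    if distance is None:
        return 0.0
    return 1.0 / max(distance, 1)


print(proximity_score("Peanut butter sandwich recipes", "peanut", "butter"))
print(proximity_score("Peanut oil and butter substitutes", "peanut", "butter"))
```

A page where "peanut" and "butter" sit side by side scores higher than one where the two words merely appear somewhere in the same text.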

8 Google’s Search Capabilities  Images  Usenet  Search by language  File Types (key word filetype:)  News (new feature)

9 Google’s Key Words  cache: Will retrieve the page that Google has stored in its cache  link: Will display pages that link to the given page  related: Will display pages that are similar to the specified page  info: Will show information about a particular page

10 Google’s Key Words  stocks: Will treat the query as stock ticker symbols  site: Will restrict the search to the given domain  allintitle: Will search words found only in the title  intitle: Will display results with the first word appearing in the title  allinurl: Will search words found only in the URL  inurl: Will display results with the first word appearing in the URL
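
To show how such key words might be separated from ordinary search terms, here is a small parsing sketch. It is an assumption-laden illustration, not Google's query parser: the function name is hypothetical, example.com is a placeholder domain, and the special behavior of allintitle: and allinurl: (which apply to every word that follows) is deliberately not modeled.

```python
# Key words listed on the slides, plus filetype: from the search-capabilities slide.
OPERATORS = {
    "cache", "link", "related", "info", "stocks", "site",
    "allintitle", "intitle", "allinurl", "inurl", "filetype",
}


def parse_query(query):
    """Split a query into (operators, plain_terms).

    Simplification: an operator only captures the value attached to it,
    so "allintitle:foo bar" treats "bar" as a plain term.
    """
    operators = {}
    terms = []
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and value and name.lower() in OPERATORS:
            operators.setdefault(name.lower(), []).append(value)
        else:
            terms.append(token)
    return operators, terms


ops, terms = parse_query("site:example.com filetype:pdf search engines")
print(ops)
print(terms)
```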

12 The big question  Can any improvements be made to make Google better than it already is?

13 Google’s Programming Contest  Started this year  Winner – Daniel Egnor  His idea – A geographic search  “Converted street addresses found within a large corpus of documents to latitude-longitude-based coordinates”  Would allow the user to specify a query such as “What are the closest movie theaters to my house?”
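
Once addresses have been converted to latitude-longitude coordinates, a "closest to me" query reduces to a distance ranking. The sketch below uses the standard haversine great-circle formula; it is not Egnor's implementation, and the function names, the place names, and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(places, lat, lon, limit=3):
    """Return up to `limit` (name, lat, lon) tuples, closest first."""
    return sorted(places, key=lambda p: haversine_km(lat, lon, p[1], p[2]))[:limit]


# Hypothetical theaters with made-up coordinates.
theaters = [
    ("Theater North", 40.90, -77.85),
    ("Theater Downtown", 40.79, -77.86),
    ("Theater Far Away", 39.95, -75.16),
]
for name, lat, lon in nearest(theaters, 40.80, -77.86, limit=2):
    print(name)
```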

14 Personalized results based on location  The server knows your IP address  Find the server closest to you by running a traceroute  http://www.calweb.com/cgi-bin/traceroute  The relative geographic location of your computer can be found by doing a whois query on that server  http://dns411.com/cgi-bin/whois.pl  Once your location is found, your results can be customized based on where you live

15 Personalized results from Cookies  Google could ask the user to answer a one-time survey and store the results as a cookie  For example: age, sex, education  A query for “rock” by a 60-year-old man might give back different results than the same search done by a teenager
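
Storing the survey answers as cookies could look roughly like this sketch, which uses Python's standard http.cookies module. The cookie names (age, sex, education) and the helper function names are assumptions made for the example; how the stored profile would then influence ranking is left open, as on the slide.

```python
from http.cookies import SimpleCookie


def make_profile_cookies(age, sex, education):
    """Build one "name=value" cookie string per survey answer."""
    cookie = SimpleCookie()
    cookie["age"] = str(age)
    cookie["sex"] = sex
    cookie["education"] = education
    return [morsel.OutputString() for morsel in cookie.values()]


def read_profile(cookie_header):
    """Parse a Cookie request header back into a plain dict of strings."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {name: morsel.value for name, morsel in cookie.items()}


# The server would send these once, after the survey is answered.
set_cookie_values = make_profile_cookies(60, "male", "college")
print(set_cookie_values)

# On later searches the browser sends them back in one Cookie header.
print(read_profile("; ".join(set_cookie_values)))
```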

16 Linguistic Approach  Google could tailor results based on the language used  For example, the English word “Java” has many meanings: the programming language, the coffee, the Indonesian island

17 File type restriction  Google already has the ability to search for file types with its keyword filetype:  What if the user does not want a file of a certain type, but instead needs a page that either embeds that file type or links to it?  For example: Find me only pages that have audio files and Java applets
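
One way to picture this proposal is a crawler-side check that records which file types a page embeds or links to. The sketch below uses Python's standard html.parser; the class and function names are hypothetical, and treating the attributes href, src, data, and code as the places where files are referenced is an assumption of this example.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse


class FileTypeCollector(HTMLParser):
    """Collects the file extensions a page links to or embeds."""

    # Attributes assumed to carry file references (code= is used by <applet>).
    URL_ATTRS = {"href", "src", "data", "code"}

    def __init__(self):
        super().__init__()
        self.extensions = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                path = urlparse(value).path
                ext = posixpath.splitext(path)[1].lower().lstrip(".")
                if ext:
                    self.extensions.add(ext)


def page_file_types(html):
    """Return the set of extensions referenced anywhere in the page."""
    collector = FileTypeCollector()
    collector.feed(html)
    collector.close()
    return collector.extensions


def page_has_all(html, wanted):
    """True if the page references every extension in `wanted`."""
    return set(wanted) <= page_file_types(html)


sample = '<a href="song.mp3">Listen</a><applet code="Game.class"></applet>'
print(page_has_all(sample, ["mp3", "class"]))
```

An index built this way could answer the slide's example query by keeping only pages for which the check succeeds for both an audio extension and an applet class file.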

18 Authorities and Hubs  Authorities – Highly cited pages  Hubs – Pages that link to many authorities  Difference between search on www.Google.com and www.inquirus.com when searching for “Pasta”
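
Hub and authority scores are usually computed with the HITS iteration, sketched below. This is the textbook form of the algorithm, not a description of what either search engine on the slide runs; the iteration count and the small link graph (h1 to h3, a1, a2) are assumptions for the example.

```python
import math


def hits(links, iterations=50):
    """Textbook HITS: returns (hub_scores, authority_scores).

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for t in targets:
                auth[t] += hub[source]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Hypothetical graph: three hub-like pages pointing at two candidate authorities.
hub, auth = hits({"h1": ["a1", "a2"], "h2": ["a1", "a2"], "h3": ["a1"]})
print(max(auth, key=auth.get), max(hub, key=hub.get))
```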

19 Business Improvements  Develop Google software for the PC market  A single query with the built-in search tool on a Windows machine is relatively slow compared to a Google search done online

20 P2P  If Google created software for the PC market, the number of searchable documents might increase drastically.  Perhaps with this P2P technology one would be able to find a computer science document about search engine technology that sits on a professor’s computer at Stanford

21 B2B  Business to Business  Google could act as an intermediary between corporations that are looking for the business of other corporations  Coupled with the geographic technology, a business could perform a sample query: Find me all the businesses that sell paper in the Philadelphia region

22 Other ideas  Include commercial databases (library catalogs, ProQuest)  Cluster documents by topic: after a search for the keyword “Law,” Google should cluster the documents by the type of law they pertain to (property law, banking law, criminal law)
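
The clustering idea can be pictured with a deliberately simple stand-in: grouping result titles by which known subtopic phrase they mention. This is not a real clustering algorithm (which would discover the groups on its own); the function name, the "other" bucket, and the sample titles are all hypothetical.

```python
from collections import defaultdict


def group_by_subtopic(titles, subtopics):
    """Group titles under each subtopic phrase they contain.

    Titles matching no subtopic go into an "other" bucket. Matching is a
    plain case-insensitive substring test, which is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for title in titles:
        lowered = title.lower()
        matched = [s for s in subtopics if s in lowered]
        for subtopic in matched or ["other"]:
            groups[subtopic].append(title)
    return dict(groups)


# Made-up result titles for the query "Law".
results = [
    "Introduction to Property Law",
    "Banking Law Basics",
    "Criminal Law Casebook",
    "Law School Rankings",
]
groups = group_by_subtopic(results, ["property law", "banking law", "criminal law"])
for subtopic, titles in groups.items():
    print(subtopic, titles)
```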

23 Resources  http://www.google.com/corporate/tech.html  http://www.google.com/corporate/timeline.html  http://www.google.com/programming-contest/winner.html  http://citeseer.nj.nec.com/borodin01finding.html  http://www.calweb.com/cgi-bin/traceroute  http://dns411.com/cgi-bin/whois.pl  Aaron Steward – Finance Major

24 Any Questions?

