MGS Computer Special Interest Group (SIG) 4 Manatee Genealogical Society.

Presentation on theme: "MGS Computer Special Interest Group (SIG) 4 Manatee Genealogical Society."— Presentation transcript:

1 MGS Computer Special Interest Group (SIG) 4 http://www.colket.org/genealogy/MGS/ Manatee Genealogical Society

2 Overview
o History of Browsing
o Problem of Searching
o Solution to Search Problem
o Google Search Basics
o Search Results

3 Internet Search
o Indexable Nodes (Static Searches)
  o Use Google, Bing, or other search engine
  o Every word on the page is indexed with a web crawler
o Non Indexable Nodes (Dynamic Searches)
  o Private databases: fee/membership (e.g., Ancestry, Professional, News); many available with library membership
  o Commercial databases: shopping, or limited to employees and customers only
  o Public databases: city, county, state, and federal records
  o Dark Web

4 Static Searches Have Web Crawlers Visit Each Node For “Public Domains”

5 Who Invented the Internet?

6 History of Browsing
o Early on, browsing was very cumbersome
  o Generally you logged in to a desired computer and searched based on its directory
  o Every computer had its own directory structure and search application(s)
o In 1980, Tim Berners-Lee proposed and prototyped ENQUIRE, a system to share documents
o In 1990, he collaborated with Robert Cailliau on a joint proposal for the World Wide Web (WWW) or W3 project, a protocol to share information using hypertext
  o This became HyperText Markup Language (HTML), defined using text
  o It allowed people to organize information they wanted to share, with links to the information or files, which could then be downloaded
  o It requires a browser that can read these HTML files using a protocol called HyperText Transfer Protocol (HTTP)
o Many commercial browsers are available today: Internet Explorer (IE), Safari, Netscape, Mozilla Firefox, etc.
  o Even Google has its own browser, called “Google Chrome”
o You need a current browser to access the latest information

7 Problem with Searching
o Many search applications were developed based on HTML, BUT:
  o A search on Coke returns 117,000,000 hits
  o Many of these are menu items at restaurants, so much of the information is useless: you get hits from every restaurant that has Coke on its menu
  o If you are interested in the Coca-Cola headquarters in Atlanta, it may not appear until item 23,672,344
o How do you get RELEVANT hits?
o How do you get the relevant hits ordered in a way that facilitates use?
o Google found a way to “solve” this problem.

8 What’s a Google?

9 Solution to Search Problem - 1
o In 1995, Sergey Brin and Larry Page, while students at Stanford, came up with a concept that used the strength of the Internet community.
o Their technology evaluated a site primarily on how many other sites linked to it and ranked search results accordingly.
o The technology was called PageRank. It was named for Larry Page, although it does also rank pages by which page is most important.
o PageRank tended to return results that people found useful, resulting in a surprisingly valuable system.
o PageRank was patented by Stanford University.
o In 1997, BackRub was a PageRank application, so called because the technology analyzed what was going on behind the scenes.
o In fall 1997, BackRub became Google: http://infolab.stanford.edu/~backrub/google.html
o Sergey Brin and Larry Page purchased the exclusive licensing rights to PageRank from Stanford for 1,800,000 shares of Google ($1.56B).
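The link-counting idea on this slide can be illustrated with a toy calculation. The Python sketch below is a simplified, textbook-style version of PageRank, not Google's production algorithm. The four-page link graph, the damping factor of 0.85, and the function name `pagerank` are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: "home" is linked to by every other page.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:8s} {score:.3f}")
```

In this made-up graph, "home" ends up with the highest score because every other page links to it, while "orphan", which nobody links to, ends up with the lowest.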

10 Solution to Search Problem - 2
o Google is an adaptation of googol. A googol is the number 1 followed by 100 zeros (10^100) (from Hitchhiker's Guide to the Galaxy). This reflects the number of WWW pages it searches.
o In 1998, they dropped out of Stanford to develop Google and set up shop in the Menlo Park garage of Susan Wojcicki.
o 1998: 50 employees, 7 million searches a day. By 2005, Google was handling 250 million web searches per day.
o Sergey Brin’s net worth is 29.9 billion dollars (17th richest in the world in 2014).
o Larry Page’s net worth is 29.8 billion dollars (18th richest in the world in 2014).
o Google headquarters, the Googleplex, is located in Mountain View, California.
o As of March 31, 2009, the company had 19,786 full-time employees; 46,170 by May 2014, in 68 worldwide locations.

11 Solution to Search Problem - 3: Most Relevant Results First

12 Google Search Basics - 0
o Ready to do some Google searching
o Still a big problem: a simple surname search yields millions of results
  Colket => 89,600 results
  Pelot => 477,000 results
  Reger => 7,650,000 results
  Sparrow => 63,900,000 results
  Johnson => 978,000,000 results
  Smith => 1,500,000,000 results
o Need to find a way to reduce results
o Google Basics discusses ways to do this in the search query
o Google Results discusses ways to do this on the results page

13 Google Search Basics - 1
o Google cares about:
  o Singular versus plural: “apple” versus “apples”
  o Order of words, which is important for ranking
    “brown bear”: things named “Brown Bear” first (20,800,000 hits)
    “bear brown”: emphasis on bears (87,000,000 hits)
  o Spelling. Names originating in another alphabet have many valid transliterations: Mohamed, Mohammed; Pelot, Pelote, Pelotte
o Google does not care about:
  o Case sensitivity: hence “Samuel Pelot” = “samuel pelot”
  o Little words, which are ignored: I, where, how, the, of, an, for, from, it, in, is, single digits, single letters. If desired, use quotes: “The Who” is a band
  o Punctuation: MOST PUNCTUATION IS IGNORED
o Suggest putting surnames first: Pelot Samuel
o Sometimes you get spelling suggestions
o Sometimes use misspelled queries
o There are exceptions to these rules

14 Google Search Basics - 2
o Apostrophes are meaningful: hence Pauls, Paul’s, and Pauls’ require 3 different searches
o A “-” before a word excludes terms (covered later)
o A “-” between 2 or more words strongly connects the words. Example: twelve-year-old dog is almost like “twelve year old”
o A “-” by itself is ignored
o A “_” between 2 or more words also strongly connects the words
  o Use an underscore between 2 words that form a formal name: Quick_Sort, Mary_Beth
  o The underscore is treated as a search for MaryBeth | Mary Beth | Mary_Beth
o Quotes require an exact match (covered later)
o Exceptions:
  o Punctuation in proper names: Google+, AB+, C++, A#
  o $ is understood to be dollars: “Nikon $400” ≠ “Nikon 400”. Ditto for ¢, £, ¥, etc.
  o @ is understood to be an email address, e.g., colket@colket.org
  o Hashtags are understood to be trending topics: #newenglandpatriots

15 Google Search Basics - 3
o Exact order, exact phrase: use quotation marks
o This technique is especially useful for genealogy; results are very different for
  Samuel George Pelot versus “Samuel George Pelot”
  George Samuel Pelot versus “George Samuel Pelot”
  Huh??? Should get the same number. Why???
o What about the middle name? Some sources report it as an initial or as no middle initial (nmi):
  “Samuel Pelot” “Samuel G Pelot” “Samuel G. Pelot” “Samuel nmi Pelot”
o Hit counts shown on the slide: 11,000 Hits; 37 Hits; 0 Hits; 8,670 Hits; 231 Hits; 24 Hits; 0 Hits
o Most punctuation is ignored
o Remember, a search for “Alexander Bell” will miss hits for “Alexander G Bell”: 410,200 Hits; 3,390,000 Hits with Graham; 87,200 Hits with G.; Does not exist
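Because an exact-phrase search misses records that spell the middle name differently, it can help to list every quoted form before searching. The Python sketch below is only an illustration of that bookkeeping; the function name `name_variants` and the choice of four forms are assumptions, not part of the slides.

```python
def name_variants(first, middle, last):
    """Return quoted exact-phrase queries covering common middle-name forms."""
    initial = middle[0]
    forms = [
        f"{first} {last}",             # no middle name
        f"{first} {initial} {last}",   # bare middle initial
        f"{first} {initial}. {last}",  # middle initial with period
        f"{first} {middle} {last}",    # full middle name
    ]
    # Wrap each form in quotation marks so it is searched as an exact phrase.
    return [f'"{form}"' for form in forms]


for query in name_variants("Samuel", "George", "Pelot"):
    print(query)
```

Each printed line can be pasted into the search box as a separate query.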

16 Google Search Basics - 4
o Search within a site/domain: identify the site in the query
  iraq site:nytimes.com returns hits on “Iraq” in the NY Times only
  iraq site:.gov returns hits only from a .gov domain
  iraq site:.iq returns hits only from an Iraq domain
o Good for genealogy research:
  Pelot site:nytimes.com (NY Times only) => 157 hits
  Pelot (worldwide) => 394,000 hits
  Pelot site:.fr (French domain) => 14,700 hits
  Pelot site:.ch (Swiss domain) => 1,070 hits
  Pelot site:.ca (Canadian domain) => 2,900 hits
  Pelot site:.us (US domain, not null) => 2,410 hits
  Pelot site:.mil (US military domain) => 89 hits
  Pelot site:.gov (US government domain) => 947 hits
  Pelot site:.biz (US business domain) => 5,480 hits
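The site: queries on this slide follow one pattern, so they can be generated rather than typed one by one. The Python sketch below builds the query text and a search link for each domain. The helper names `site_query` and `search_url` are hypothetical, and the link format (a `q` parameter on `https://www.google.com/search`) is an assumption about how the search page accepts a query.

```python
from urllib.parse import urlencode


def site_query(term, site):
    """Build a query that restricts `term` to one site or domain."""
    return f"{term} site:{site}"


def search_url(query):
    """Turn a query into a search link (assumed URL format)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Domains taken from the slide's genealogy examples.
for site in ["nytimes.com", ".fr", ".ch", ".ca", ".us", ".mil", ".gov", ".biz"]:
    query = site_query("Pelot", site)
    print(query, "->", search_url(query))
```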

17 Google Search Basics - 5
o Exclude terms: use “-” preceded by a blank
o Say you are searching for anti-virus stuff for humans (anti-virus includes antivirus, anti virus, and anti-virus):
  anti-virus -software
o Another example: jaguar -cars -football
o And for the poor fellow with the surname of “Sparrow”:
  Sparrow
  Sparrow -bird
  Sparrow -bird -book
o Hit counts shown on the slide: 132,000,000 Hits; 79,100,000 Hits; 63,400,000 Hits; 60,400,000 Hits; 45,500,000 Hits
o Can use multiple negations
o Note: “-” is part of the word for “anti-virus” (strongly connected)
o Note: combinations of search terms can be effective
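The exclusion syntax is easy to get wrong because the “-” must follow a blank and touch the excluded word. A minimal Python sketch of building such a query is shown here; the function name `exclude_query` is a hypothetical helper, not a Google feature.

```python
def exclude_query(term, excluded):
    """Build a query for `term` that excludes each word in `excluded`."""
    # Each excluded word gets a leading "-" and is separated by a blank.
    return " ".join([term] + [f"-{word}" for word in excluded])


print(exclude_query("Sparrow", []))
print(exclude_query("Sparrow", ["bird"]))
print(exclude_query("Sparrow", ["bird", "book"]))
```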

18 Google Search Basics - 6
o OR operator: sometimes you want hits for either/or. Use capitalized “OR” or the OR operator “|”
  Tampa Bay Buccaneers => 2,620,000 hits
  Tampa Bay Buccaneers 2004 => 298,000 hits
  Tampa Bay Buccaneers 2005 => 409,000 hits
  Tampa Bay Buccaneers 2004 2005 => 206,000 hits
  Tampa Bay Buccaneers 2004 OR 2005 => 726,000 hits
  Tampa Bay Buccaneers 2004 | 2005 => 726,000 hits
o Exceptions: phrases such as “FOR BETTER OR FOR WORSE”
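The difference between listing two terms and joining them with OR can be seen on a tiny made-up collection. The Python sketch below is only an analogy using sets: the four page texts are invented, and real search engines report estimated counts rather than exact set sizes.

```python
# Four invented page texts standing in for web pages.
pages = {
    "page1": "Tampa Bay Buccaneers 2004 season recap",
    "page2": "Tampa Bay Buccaneers 2005 schedule",
    "page3": "Tampa Bay Buccaneers 2004 and 2005 draft picks",
    "page4": "Tampa Bay Buccaneers history",
}


def matching(word):
    """Return the names of pages whose text contains `word`."""
    return {name for name, text in pages.items() if word in text.split()}


both = matching("2004") & matching("2005")    # like: 2004 2005
either = matching("2004") | matching("2005")  # like: 2004 OR 2005

print("both terms:", sorted(both))
print("either term:", sorted(either))
```

Requiring both terms narrows the results to pages mentioning both years, while OR widens them to pages mentioning at least one.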

19 Google Search Basics - 7
o Feeling Lucky: gives you the first page
o Wild cards
  o Use a “*”: works on words, not parts of words
  o Use a “?”: single characters (officially not in Google)
  o For questions such as “How often does Halley's comet appear?”, pose as: Halley’s Comet appears every * years (it’s 76 years)
  o Also for unknown middle names:
    Samuel * Pelot => 10,700,000 hits
    Difference for “Samuel * Pelot” => 7,910,000 hits
    Difference for “Samuel ? Pelot” => 624 hits
    Note: Samuel Pelot => 801,000 hits and “Samuel Pelot” => 616 hits
o Ten word limit: search terms over 10 are ignored

20 Google Search Basics - 8
o Misspellings: try alternative spellings
  o Thousands of web sites mention Arnold Schwarznegger => 70,000 hits
  o though the governator spells his name “Schwarzenegger” => 34,500,000 hits
o Google recognizes some misspellings and provides alternatives (new since Mar 2010)

21 Google Search Basics - 9
o Proximity search, e.g., “Samuel Pelot”~3
o Hits for:
  Samuel Pelot => 801,000 hits
  “Samuel Pelot” => 616 hits
  “Samuel George Pelot” => 27 hits
  “Samuel G Pelot” => 73 hits
  “Samuel Pelot”~2 => 351 hits (catch initial)
  “Samuel Pelot”~3 => 190 hits
  “Samuel Pelot”~4 => 158 hits
  “Samuel Pelot”~7 => 126 hits
  “Samuel Pelot”~10 => 173 hits
o Not an advertised Google tool, but a common search tool (e.g., Archive Grid); seems to be useful with Google

22 Google Search Basics - 10: Keep Search Terms Simple
o Most queries do not require advanced operators or unusual syntax
o Simply enter a name, place, product, or concept
o Simple is good
o Think of terms likely to be on result pages
  o Don’t use My Head Hurts
  o Instead use Headache {term likely found on medical page}
o Describe what you want in as few words as possible
  o Use Weather Cancun
  o Instead of Weather Report for Cancun Mexico
o Choose descriptive terms
  o Use Celebrity Ringtones
  o Instead of Celebrity Sounds

23 Google Results - 1
[Annotated screenshot of a Google results page. Callout labels: Search Term(s); Start Search; Advanced Search (Controls For Advanced Search Options); Result Statistics; Sponsored Links; Result Links; Link; Uniform Resource Locator (URL); Snippet; Sometimes Similar Pages; Cached Pages; Filters]

24 Google Results - 2
o Ordered by relevance [indented: same site, less relevant]
o Also sponsored links, links to news stories, ads
  o True, unpaid results are on the lower left
  o Ads are on the right (no more than 10 per page)
  o Sponsored links are on top (ads, at a higher rate; colored background)
o True unpaid search results =>
  o Title
  o Text from the site with snippets of your search terms (in bold)
  o URL => Uniform Resource Locator
  o Size
  o Date: NOT created/updated, but when last crawled
    o Dataset in the Jul crawl of 2014 is over 266TB, containing 4.05 billion webpages
  o Indication if cached: a good place to go if the page was removed
    o URL goes to the current page
    o Cached link goes to the cached page, handy if the page was deleted or the link is broken
    o Cached version is used to highlight key words
  o File format
    o .html: use browser
    o .pdf: read with Adobe’s free reader at www.adobe.com
    o .doc: read with Microsoft’s free reader at www.microsoft.com
    o .ppt: read with Microsoft’s free reader at www.microsoft.com
  o Similar results

25 Google Results - 3
o Location feature: sets the default for searches
  o Location is auto-detected by IP address or entered into the Google Toolbar
  o Can be changed if you are looking for stuff in a different location
  o **Only works in your selected country**
  o A manually set location is stored in a “cookie”
  o Can also be turned off
o Type of content: limit results to a particular type of web content: Images, Videos, News, Shopping, Books, Discussions, Places, Blogs, Real-time (e.g., updates from Twitter), or select the default, Everything
  o This is a big recent change
  o Five years ago one had to search each database; the databases were not integrated
  o They are now; these are called Filters

26 Note on URLs
o Results of a Google search are provided as a Uniform Resource Locator (URL)
o URL format: http://www.google.com.uk
  o Callout labels on the slide: URL (Uniform Resource Locator); HyperText Transfer Protocol (http); World Wide Web (www); Domain Name Extension (.com); Domain Name Country Extension (.uk)
o Domain names: http://www.networksolutions.com/whois/index.jsp
o URL for my domain name is: http://www.colket.org
o Domain name extensions include: .com .mobi .mil .gov .edu .net .info .org .biz .bz .tv
o Domain name extensions (including country): http://www.networksolutions.com/glossary/glossary-d.jsp#domainnameextensions
o Domain name country extensions: .be .ca .cn .de .es .ru.com .se.com .us
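The parts of a URL named on this slide can be pulled apart with Python's standard `urllib.parse` module. The sketch below is a small illustration; the function name `describe_url` and the returned dictionary keys are assumptions chosen for this example.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into its protocol, host name, and dot-separated labels."""
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    return {
        "protocol": parts.scheme,  # e.g. "http" (HyperText Transfer Protocol)
        "host": parts.hostname,    # full host name
        "labels": labels,          # pieces between the dots
        "extension": labels[-1],   # last label, e.g. "org" or a country code
    }


print(describe_url("http://www.colket.org"))
print(describe_url("http://www.networksolutions.com/whois/index.jsp"))
```

The last label is the domain name extension; for country domains it is the country code.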

27 Note on IP Addresses
o Every URL maps into a unique number called an IP (Internet Protocol) address
  o http://www.google.com => 216.239.51.99
o IPV4, in the format xxx.xxx.xxx.xxx (e.g., 208.77.188.166)
  o 2^32 can handle 4,294,967,296 addresses
  o Expected to run out in early 2000s
o IPV6, in the format x:x:x:x:x:x:x:x, in late 1990s (e.g., 2001:db8:0:1234:0:567:1:1)
  o 2^128 (or 340,282,366,920,938,463,463,374,607,431,768,211,456) addresses
o IP addresses still work, as IPV4 addresses all map to IPV6
o Operating systems are migrating to IPV6 (e.g., Vista uses IPV6; XP uses IPV4)
  o Go to help/support on your computer and search for IPV6
o Need current browser
o Google crawls over 8,000,000,000 pages each month
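The address counts on this slide can be checked directly, and Python's standard `ipaddress` module can parse both address formats. The sketch below uses the example addresses from the slide; the IPv4-mapped form `::ffff:216.239.51.99` is one standard way an IPV4 address is written inside IPV6, added here as an illustration.

```python
import ipaddress

# Address space sizes: 32-bit IPV4 versus 128-bit IPV6.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
print(f"IPV4 addresses: {ipv4_total:,}")
print(f"IPV6 addresses: {ipv6_total:,}")

# Parse the example addresses from the slide.
v4 = ipaddress.ip_address("216.239.51.99")
v6 = ipaddress.ip_address("2001:db8:0:1234:0:567:1:1")
print(v4, "is version", v4.version)
print(v6, "is version", v6.version)

# An IPV4 address embedded in IPV6 (IPv4-mapped form).
mapped = ipaddress.ip_address("::ffff:216.239.51.99")
print(mapped, "maps back to", mapped.ipv4_mapped)
```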

28 Static versus Dynamic Searches - 1
o “Relevancy” might not be relevant to researchers and genealogists.
o Google’s use of relevancy is not useful for doing many types of searches:
  o Dynamic databases
  o Genealogy
  o Searches on family surnames
  o Obscure information
  o Much non-business oriented information
  o Rather unique information

29 Dynamic Searches
o Indexable Nodes (Static Searches)
  o Use Google, Bing, or other search engine
  o Every word on the page is indexed with a web crawler
o Non Indexable Nodes (Dynamic Searches)
  o Private databases: fee/membership (e.g., Ancestry, Professional, News); many available with library membership
  o Commercial databases: shopping, or limited to employees and customers only
  o Public databases: city, county, state, and federal records
  o Dark Web

30 Static Versus Dynamic Searches - 2
o Desired information is in a separate database
  o Auction sites: Ebay | Craig’s List | UBid | Bid Start | Ebid | US Seek
o Web pages are private and not available to Google
  o Most businesses have a public web site and a private web site
  o Only the data companies want to share is available via Google
o Limited access web sites, typically for-profit sites, e.g.,
  o ACM’s Digital Library: no Google access at all
  o Ancestry.com: Google provides “teaser” results to entice membership
  o Chicago Tribune: get “teaser” hits on Google, but have to pay to access data
  o Many models
o Later we will discuss:
  o The dark web
  o Archive Grid
  o New York Times database

31 Future Plans
Future plans for Computer SIGs:
o Finding Pictures of Your Ancestor on the Internet: 3 Feb 2015
o Using Google for Genealogical Searches: scheduled for 3 March 2015
o Manipulating Photos for Genealogy: scheduled for April 2015
o Using Ancestry.com, requested by Dunham Swift: maybe November 2015
o Need inputs; see sheet
What else would you like to have addressed at future Computer SIG meetings?

32 MGS Computer Special Interest Group (SIG)
WHAT IS IT? A meeting of genealogists interested in using their personal computers to enhance their research.
WHEN? Monthly, on the first Tuesday of the month (October through May), following the main topic speaker.
TIME: About 11:15 AM to 12:15 PM, following the meeting break period after the main MGS speaker.
PLACE: The Central Library Auditorium, Bradenton, FL (same location as our MGS monthly meeting).
WHO: Open to all those interested in using their personal computers to enhance their genealogical research.
PROGRAM: Each month we will discuss and view what's new in genealogy on the Internet. We'll have demonstrations of software and hardware that will facilitate our research. Tips and techniques will be shared by and among those attending each meeting. Genealogically related computer, Internet, digital photography, and research questions will be fielded during the sessions. We'll look at the newest technology but will keep the discussions as low tech as possible.
What topics would you like to hear?

