Searching The Web
Search engines rely on computer programs (variously called robots, crawlers, spiders, or worms) that automatically visit Web sites. Starting with the home page, they follow all the internal links in the site, visit every Web page at the site, read every word on every page, and create an index of these words.
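The crawl described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SITE` dictionary is a hypothetical in-memory stand-in for a real Web site (no network access), and `crawl_and_index` is a name invented for this example, not part of any real search engine.

```python
from collections import deque
import re

# Hypothetical in-memory "site": page path -> (page text, internal links).
SITE = {
    "/": ("Welcome to the home page", ["/about", "/courses"]),
    "/about": ("About the university", ["/"]),
    "/courses": ("Courses in web searching", ["/about", "/courses/web"]),
    "/courses/web": ("Searching the web with keywords", []),
}

def crawl_and_index(site, start="/"):
    """Visit every page reachable from `start` and index every word found."""
    index = {}              # word -> set of page paths that contain it
    seen = {start}          # pages already queued, so no page is visited twice
    queue = deque([start])
    while queue:
        path = queue.popleft()
        text, links = site[path]
        # Read every word on the page and record where it occurs.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(path)
        # Follow internal links that have not been seen yet.
        for link in links:
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl_and_index(SITE)
print(sorted(index["web"]))   # pages whose text contains the word "web"
```

Looking up a keyword then becomes a dictionary lookup in `index` rather than a fresh scan of every page.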
Running Effective Searches
Browsing and searching are not the same. When you browse, you navigate from one Web page to another by following links. When you search, you enter keywords in a search engine to display a list of pages that match the keywords.
Word Index versus Subject Directory
A Word Index database contains billions of pages and, from each page, hundreds or even thousands of words, because a Word Index contains every main word from every page it finds at a Website (small words such as in, at, and on are not indexed). The Google database contains every main word for 228,000 pages at the Baskent University websites. A Subject Directory is far smaller: it contains only basic subject headings for a few main pages at each Website. For example, the Google Subject Directory database contains only the subject category and page titles of about 54 pages at the Baskent University websites.
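To make the contrast concrete, here is a minimal Python sketch of what each kind of database might keep for a single page. The stop-word list, the sample page, and the function name `main_words` are assumptions made for illustration; real engines use their own, much larger lists.

```python
import re

# Assumed list of "small words" that are not indexed.
STOP_WORDS = {"in", "at", "on", "the", "a", "an", "of", "and", "to"}

def main_words(text):
    """Return the words of `text` that a word index would keep."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

# Hypothetical page used only for this example.
page = {
    "title": "Library Opening Hours",
    "body": "The library is open on weekdays and at weekends in term.",
}

word_index_entry = main_words(page["body"])   # every main word on the page
directory_entry = page["title"]               # only the page title

print(word_index_entry)
print(directory_entry)
```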
Searching for Data
Use a search engine to find data by keying in a word or phrase. The word or phrase is called a keyword and represents the topic you want to find.
Ranking
The position of a Web page on the results page is called its ranking.
– The order of the ranking will vary according to which search engine is used.
– Search engines only examine their own databases.
Search Engines Differ
Because they:
– use different Web robots (spiders) to collect information
– choose different Web pages to index
– interpret search expressions differently
– store a different amount of text from a Web page in the database
Word Limiters
– A minus ( - ) sign before a word means that word must not appear on the results page.
– To be sure that a word is found in the results, put a plus ( + ) sign before it.
Phrase Matching (" ")
Putting quotes around a set of words finds only results that match the words in that exact sequence.
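The effect of these limiters can be imitated with a small Python filter. This is a sketch, not how any particular engine works: the function names are invented for the example, and it assumes that unsigned, unquoted words are optional and therefore never exclude a page.

```python
import re

def contains(words, seq):
    """True if the word list `seq` occurs in `words` as a contiguous run."""
    n = len(seq)
    return any(words[i:i + n] == seq for i in range(len(words) - n + 1))

def matches(query, text):
    """Apply +word, -word and "exact phrase" limiters to one page's text."""
    words = re.findall(r"\w+", text.lower())
    # Each term is an optional sign followed by a quoted phrase or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        seq = (phrase or word).lower().split()
        found = contains(words, seq)
        if sign == "-" and found:
            return False          # excluded word is present
        if (sign == "+" or phrase) and not found:
            return False          # required word or exact phrase is missing
    return True

animal = "The jaguar is a big cat native to the Americas"
vehicle = "The Jaguar car has a big engine, said the cat owner"

print(matches("+jaguar -car", animal))    # True
print(matches("+jaguar -car", vehicle))   # False
print(matches('"big cat"', vehicle))      # False
```

The last call fails because "big" and "cat" both occur in the second text but not next to each other, which is exactly what phrase matching checks.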
Document Section Limiters
– intitle: finds pages that contain one specified word in the page title, which appears in the title bar of the browser.
– allintitle: finds pages containing several words in the title, e.g. allintitle: ataturk education requires both words to be in the page title.
– inurl: finds pages with one specific word in the URL.
– allinurl: if you start a query with allinurl:, Google will restrict the results to those pages with all of the query words in the URL. (google-search)
– allintext: searches only the text in the body of the Web page for the words.
– filetype: finds only a specified file type, such as MS-Word (.doc) or MS-Excel (.xls).
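The idea behind these limiters, checking the words against one part of the document only, can be sketched in Python. The page records, the example.edu addresses, and the function `section_match` are all hypothetical; this imitates the behavior described above and is not Google's implementation.

```python
import re

# Hypothetical page records used only for this example.
PAGES = [
    {"url": "http://example.edu/ataturk/education.html",
     "title": "Ataturk and Education",
     "body": "Reforms in schooling"},
    {"url": "http://example.edu/history/notes.doc",
     "title": "History Notes",
     "body": "Ataturk education reforms"},
]

# Which part of the page each limiter looks at.
FIELD = {"intitle": "title", "allintitle": "title",
         "inurl": "url", "allinurl": "url", "allintext": "body"}

def words_of(text):
    """Split text (or a URL) into a set of lowercase words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def section_match(page, operator, query_words):
    """True if `page` satisfies the limiter for all the given words."""
    if operator == "filetype":
        # Compare the file extension at the end of the URL.
        return page["url"].lower().endswith("." + query_words[0].lower())
    present = words_of(page[FIELD[operator]])
    return all(w.lower() in present for w in query_words)

hits = [p["url"] for p in PAGES
        if section_match(p, "allintitle", ["ataturk", "education"])]
print(hits)
```

Only the first page is returned: the second page mentions both words in its body, but not in its title.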
Web Directory
Search engines index words in Web pages and then add them to their databases by employing automated programs, such as Web robots. Real people develop Web directories and decide which Web sites should be added to the directory.
The content in Yahoo’s Web Site Directory is organized by topic
Drill down through directory levels to find Web sites
Some Web directories also include search engine features
Natural Language Searches
A conceptual query is one where the search engine returns only Web pages that are relevant to the topic, even if the words don’t precisely match your keywords.
Concept-based Search Engines
– www.excite.com
– www.askjeeves.com
Can also be queried by natural language.
Metasearch Engines
Metasearch engines will query several engines simultaneously:
– the search will pull results from several search engines
– www.infospace.com
– www.mamma.com
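A metasearch step can be sketched in Python as "send the same query to several engines at once, then merge the answers". The two engine functions below are hypothetical stubs that return fixed lists; a real metasearch engine would call remote services, and its merging rules may differ from the simple round-robin used here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest

# Hypothetical stand-ins for real search engines.
def engine_a(query):
    return ["a.example/1", "shared.example/x", "a.example/2"]

def engine_b(query):
    return ["shared.example/x", "b.example/1"]

def metasearch(query, engines):
    """Query all engines concurrently and merge their results."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    # Take the first result of each engine, then the second, and so on,
    # skipping any page that another engine has already contributed.
    for rank_row in zip_longest(*result_lists):
        for url in rank_row:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

results = metasearch("web searching", [engine_a, engine_b])
print(results)
```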
Other Electronic Research Resources
The Web is not the only electronic source of information. Another source is the Başkent library website, which gives students access to hundreds of quality databases that search services like Google or Yahoo cannot find, because they are open to registered subscribers only. Başkent pays a fee for these services, which are then offered at no cost to our students.