SEARCH ENGINE By Ms. Preeti Patel Lecturer School of Library and Information Science DAVV, Indore E mail:


Search Engine
Introduction
Components
Types
Functions
Subject directories vs. search engines

Introduction: Search engine
Search engines came into existence in [year missing]. According to the Yahoo search engine directory (2003), there are over 448 major search engines.
A SE is a searchable database of Internet files collected by a computer program (called a wanderer, crawler, robot, worm or spider).

An index is created from the collected files, e.g. title, full text, size, URL, etc. There are no selection criteria for the collection of files. A SE allows the user to enter keywords, and it retrieves from its database the Web documents that match the keywords entered by the searcher.

A SE doesn’t wait for someone to submit information about a site. It sends a spider (also called a crawler or web crawler) to visit publicly accessible websites, following all the links it comes across and collecting data for the search engine's indexes.

A spider discovers new sites and updates information from sites previously visited. A spider can also be used to check links within websites.
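The link-following behaviour of a spider can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works: the `TOY_WEB` dictionary and the `crawl` function are hypothetical names, and the "web" is a small in-memory map from each page to the links it contains, so no network access is involved.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "a.example/home": ["a.example/about", "b.example/home"],
    "a.example/about": ["a.example/home"],
    "b.example/home": ["c.example/home"],
    "c.example/home": [],
    "d.example/orphan": [],  # no page links here, so the spider never reaches it
}

def crawl(seed, web):
    """Visit pages breadth-first from the seed, following each link once."""
    visited = []          # pages in the order the spider read them
    seen = {seed}         # pages already queued, to avoid revisiting
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in web.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

pages = crawl("a.example/home", TOY_WEB)
```

Starting from one seed page, the sketch reaches every page that is linked from somewhere, and it never discovers the unlinked page.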

Components of SE
A SE might well be called a search engine service or a search service. The components of a SE are the following:
Spider: A program that traverses the Web from link to link, identifying and reading pages.
Index: A Web database containing a copy of each web page gathered by the spider.
SE Mechanism: Software that enables users to query the index and that usually returns results in relevancy-ranked order.

Types: SE
A SE downloads all the information that a page contains and then examines that information to index keywords and phrases that can be used to categorize the sites. SE can be categorized into three types on the basis of the indexing techniques they employ:

Active SE: It collects all information by itself. It uses a program called a ‘spider’ or ‘Web robot’ to index and categorize web pages as well as websites. The spider travels around the WWW in search of new sites and adds entries to the SE's catalogue.

Passive search engines or subject directories:
This type of SE is possibly more accurately referred to as a directory. It doesn’t seek out information by itself; it relies on WWW users to submit details of their favorite sites in order to build up a database. For example, the Yahoo directory has 14 main subject categories; each category has many subcategories, and the subcategories have their own subcategories, and so on almost ad infinitum.

Due to the size of the Web and its constant transformation, keeping up with important sites in all subject areas is humanly impossible.

Meta search engine:
The increasing number of search engines has led to the creation of ‘meta’ search tools. A meta search engine does not catalogue any web pages by itself. It searches multiple search engines simultaneously: when a query is put to this type of search engine, it forwards that query to other search engines.

Types of meta search engine
There are two types of meta search engine:
1. One type provides a separate list of results from each engine that was searched. With this type of meta SE, one can retrieve comprehensive, and sometimes overwhelming, results.
2. The other type is more common and returns a single list of results, often with the duplicate hits removed. This type of meta SE always brings the results back to its own site for viewing.
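Both styles of meta search can be sketched together. The two "engines" below are hypothetical stand-ins that return fixed result lists; a real meta SE would forward the query over the network. The function name `metasearch` and the `.example` addresses are assumptions made for illustration.

```python
# Hypothetical stand-ins for real search engines: each returns a ranked URL list.
def engine_a(query):
    return ["site1.example", "site2.example", "site3.example"]

def engine_b(query):
    return ["site2.example", "site4.example"]

def metasearch(query, engines):
    """Forward the query to every engine; return per-engine lists and a merged list."""
    # Type 1: keep a separate result list for each engine searched.
    separate = {name: search(query) for name, search in engines.items()}
    # Type 2: merge into a single list, removing duplicate hits.
    merged, seen = [], set()
    for results in separate.values():
        for url in results:
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return separate, merged

separate, merged = metasearch("bird migration", {"A": engine_a, "B": engine_b})
```

The merged list keeps the first occurrence of each hit, so a page returned by both engines appears only once.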

Examples: Metacrawler, SurfWax, Zapmeta

According to scope, SE can be divided into the following categories:
General search engine: It covers a range of services and facilities and supports Boolean search. Examples: Google, AltaVista, etc.
Regional search engine: It refers to a country-specific search engine for locating varied resources region-wise. Examples: Euro Ferret (Europe), Excite UK, etc.

Subject-specific search engine:
It does not attempt to index the entire Web. It focuses on searching for websites or pages within a defined subject area, geographical area or type of resource. Because its scope is narrow, this kind of search engine aims for depth of coverage within its subject.

Examples:
1. Regional: WWW.123india.com
2. Regional: WWW.in.altavista.com
3. Employment: WWW.nauri.com
4. Weather: WWW.zipcode.com
5. India specific: www.khoj.com

Features of SE
When a searcher enters more than one word into a Web search engine, the space between the words has a logical meaning that directly affects the results of the search. This is known as the default syntax. Example: in AltaVista, Infoseek and Excite, a search for ‘bird migration’ means that the searcher will get back documents that contain either the word ‘bird’ or the word ‘migration’, or both.

The space between the words defaults to the Boolean OR. This is probably not what the searcher wants, which is documents that contain both the words ‘bird’ and ‘migration’.
SE return results in ranked order. Most SE use various criteria to construct a relevancy rating for each hit and present the search results in that order.
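The difference between the default OR and the AND the searcher usually wants can be shown on a toy collection. The four documents and the `search` function are hypothetical, and matching is on whole lowercase words only; this is a sketch of the logic, not of any real engine's query syntax.

```python
# Hypothetical toy collection: document id -> text.
DOCS = {
    1: "bird watching guide",
    2: "bird migration routes",
    3: "human migration history",
    4: "garden design",
}

def search(query, docs, operator="OR"):
    """Return ids of documents matching any (OR) or all (AND) query words."""
    terms = query.lower().split()
    matches = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        hits = [term in words for term in terms]
        if (any(hits) if operator == "OR" else all(hits)):
            matches.append(doc_id)
    return matches

or_results = search("bird migration", DOCS, operator="OR")
and_results = search("bird migration", DOCS, operator="AND")
```

With OR, the query returns every document that mentions either word; with AND, only the document that mentions both.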

Criteria can include: the search term appearing in the title, URL, first heading or HTML META tag; the number of times the search terms appear in the document; search terms appearing early in the document; search terms appearing close together; etc.
SE technology is still developing. Today, SE technology tends to organize search results by concept, site, domain, popularity and linking rather than by relevancy alone.
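A relevancy rating built from a few of these criteria might look like the sketch below. The weights (3 for a title match, 2 for a URL match, 1 per occurrence, 1 for appearing in the first ten words) are arbitrary assumptions chosen for illustration, as are the `relevancy` function and the two sample pages; real engines do not publish their formulas.

```python
def relevancy(terms, page):
    """Score a page against search terms using simple, made-up weights."""
    title_words = page["title"].lower().split()
    url = page["url"].lower()
    words = page["text"].lower().split()
    score = 0.0
    for term in terms:
        if term in title_words:
            score += 3                 # term appears in the title
        if term in url:
            score += 2                 # term appears in the URL
        score += words.count(term)     # number of occurrences in the text
        if term in words[:10]:
            score += 1                 # term appears early in the document
    return score

# Hypothetical pages used only to exercise the scoring function.
page_a = {
    "title": "Bird migration",
    "url": "example.org/bird-migration",
    "text": "bird migration happens every year as bird flocks move south",
}
page_b = {
    "title": "Travel notes",
    "url": "example.org/notes",
    "text": "we saw a bird once",
}

terms = ["bird", "migration"]
ranked = sorted([page_a, page_b], key=lambda p: relevancy(terms, p), reverse=True)
```

Sorting by the score puts the page whose title, URL and opening words all mention the search terms ahead of the page that mentions one term in passing.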

The following services are provided by SE:
Direct Hit ranks according to the sites other searchers have chosen from their results for similar queries.
Google ranks by the number of links from pages ranked high by the service.
Inference Find ranks by concept and top-level domain.
MetaFind sorts results by keyword, alphabetically or by domain.

SE do not index all the documents available on the Web. For example, most SE cannot index files on password-protected sites, behind firewalls, or on servers configured by the host to be left alone. Other web pages may not be picked up if they are not linked from other pages.
SE rarely contain the most recent documents posted to the Internet; do not look for yesterday's news on a search engine.
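One common way a host asks spiders to leave parts of a site alone is a robots.txt file, which the slides do not name explicitly; treating it as the mechanism here is an assumption. The sketch below uses Python's standard `urllib.robotparser` module on a made-up rule set. The spider name `ExampleSpider`, the helper `may_index` and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content: all spiders are asked to skip /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the rules without any network access

def may_index(url, agent="ExampleSpider"):
    """Return True if the rules allow this spider to fetch the URL."""
    return parser.can_fetch(agent, url)

public_ok = may_index("http://site.example/public/index.html")
private_ok = may_index("http://site.example/private/report.html")
```

A well-behaved spider would check each URL this way before fetching it, so pages under the disallowed path never enter the index.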

The contents of databases will generally not show up in a search engine's results. A growing amount of valuable information on the Web is now generated from databases.
Some SE allow users to view the retrieved Web sites and Web pages clustered under different topics related to the search terms.

FUNCTIONS OF SE
They search the Internet using specialized software called a crawler or robot; these software agents find web pages by following hyperlinks.
These agents send a cached version of each web page to the repository of the search engine, and the SE keeps an index of the words they find and where (at which URL) they find them.

They allow users to look for words or combinations of words found in that index.
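An index of words and the URLs where they were found is often built as an inverted index. The sketch below is a minimal version under simplifying assumptions: pages are plain lowercase-able text split on whitespace, and the names `build_index`, `lookup` and the sample pages are hypothetical.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, *words):
    """Return the URLs that contain every one of the given words."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()

# Hypothetical pages, as a crawler might have stored them in the repository.
pages = {
    "a.example/birds": "Bird migration in autumn",
    "b.example/travel": "Human migration and travel",
    "c.example/garden": "Garden birds",
}
index = build_index(pages)
single = lookup(index, "migration")
combined = lookup(index, "bird", "migration")
```

Looking up one word returns every page that contains it, while a combination of words returns only the pages that contain them all.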

Diagrammatic representation of a search engine
[Figure labels: Different Websites; Crawlers; Switch; Indexing Software in search engine; Database of search engine; Search; User Interface]

Subject Directories vs. Search Engine
A subject directory is a service that offers a collection of links to Internet resources, submitted by site creators or evaluators and organized into subject categories.