Search Engines By: Faruq Hasan.


1 Search Engines By: Faruq Hasan

2 Today's Coverage
Introduction; Types of Search Engines; Components of a Search Engine; Semantics and Relevancy; Search Engine Algorithm; Search Engine Optimization; Others

3 Introduction A web search engine is a software system designed to search for information on the World Wide Web. The results are generally presented as a list, often referred to as search engine results pages. Search engines look through their own databases of information to find what you are looking for…

4 Search engine
Definition: An internet-based tool that searches an index of documents for a particular term, phrase or text specified by the user.
Common characteristics: spider, indexer, database, algorithm. It finds matching documents and displays them according to relevance. Both the documents searched and the ranking algorithm are updated frequently.

5 Types of Search Engine
Crawler-powered indexes: Guruji.com, Google.com
Human-powered indexes
Hybrid models: submitted URLs to a search engine?
Semantic indexes: Hakia.com,

8 How Does a Search Engine Work?
Copyleft (ɔ) 2009 Sudarsun Santhiappan

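9 How Search Engines Work (Sherman 2003)
[Figure: a crawler fetches URLs (URL1 to URL4) from the Web and hands the pages to the indexer, which fills the search engine database. The browser sends the query "Eggs?" and receives ranked matches: Eggs 90%, Eggo 81%, Ego 40%, Huh? 10%. Example document: "All About Eggs" by S. I. Am.]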

10 Search Engine Internals

11 Search Engine Internals
Crawlers, Indexers, Searching, Semantics, Ranking

12 Crawlers A crawler is a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index. The major search engines on the Web all have such a program, which is also known as a "spider" or a "bot."
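The crawler described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary (`TOY_WEB`) rather than real HTTP requests, and the names `crawl` and `max_pages` are made up for this example. A real crawler would also respect robots.txt, rate limits, and so on.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to (page text, outgoing links).
TOY_WEB = {
    "a.example/": ("all about eggs", ["a.example/recipes", "b.example/"]),
    "a.example/recipes": ("egg recipes", ["a.example/"]),
    "b.example/": ("farm news", ["c.example/"]),
    "c.example/": ("contact us", []),
}

def crawl(seed, web, max_pages=100):
    """Breadth-first crawl from seed; returns {url: text} for visited pages."""
    visited = {}
    queue = deque([seed])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        # Skip pages already read and links that lead nowhere.
        if url in visited or url not in web:
            continue
        text, links = web[url]
        visited[url] = text  # this becomes an entry for the index
        for link in links:
            if link not in visited:
                queue.append(link)
    return visited

pages = crawl("a.example/", TOY_WEB)
```

Starting from one seed URL, the crawler reaches every page that is linked, directly or indirectly, from the seed.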

13 Indexers A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and the use of more storage space to maintain the extra copy of data.
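The trade-off in that definition (faster reads, extra writes and storage) can be shown with a small sketch. The table layout, the column name `url`, and all function names here are assumptions chosen for illustration; a real database uses structures such as B-trees rather than a plain dictionary.

```python
# Hypothetical table: a list of rows, as a database might store them.
rows = [
    {"id": 1, "url": "a.example/", "title": "All About Eggs"},
    {"id": 2, "url": "b.example/", "title": "Farm News"},
    {"id": 3, "url": "c.example/", "title": "Contact Us"},
]

def find_by_scan(table, column, value):
    """Without an index: look at every row."""
    return [row for row in table if row[column] == value]

def build_index(table, column):
    """The index is an extra copy of one column, mapping value -> row positions."""
    index = {}
    for position, row in enumerate(table):
        index.setdefault(row[column], []).append(position)
    return index

def find_by_index(table, index, value):
    """With an index: jump straight to the matching rows."""
    return [table[position] for position in index.get(value, [])]

def insert_row(table, index, column, row):
    """Every write now costs two updates: the table and the index."""
    table.append(row)
    index.setdefault(row[column], []).append(len(table) - 1)

url_index = build_index(rows, "url")
insert_row(rows, url_index, "url", {"id": 4, "url": "d.example/", "title": "Egg Prices"})
```

Both lookup functions return the same rows; the indexed one avoids scanning the whole table, and `insert_row` shows where the additional write goes.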

14 Semantics Semantics is the study of meaning. It focuses on the relation between signifiers, such as words, phrases, signs, and symbols, and what they stand for (their denotation). It is used for understanding human expression through language.

15 How Does a Search Engine Work?
A spider “crawls” the web to find new documents (web pages, other documents), typically by following hyperlinks from websites already in its database.
The search engine indexes the content (text, code) in these documents by adding it to its database, and then periodically updates this content.
When a user enters a search, the search engine searches its own database to find related documents; it does not search web pages in real time.
The search engine ranks the resulting documents using an algorithm (mathematical formula) that assigns various weights and ranking factors.
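The index, search, and rank steps above can be sketched as follows. This is a minimal illustration, not how any particular search engine works: the documents are made up, the index is a simple inverted index (term to documents), and the ranking "formula" is just the number of times the query terms occur.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(documents):
    """Map each term to a Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index

def search(index, query):
    """Look only in the index (never the live pages) and rank by term count."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken alphabetically by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical crawled documents.
documents = {
    "eggs.html": "All about eggs: boiled eggs, fried eggs",
    "eggo.html": "Eggo waffles and eggs",
    "ego.html": "The ego and the id",
}
index = build_inverted_index(documents)
results = search(index, "eggs")
```

The query never touches `documents` directly; it is answered entirely from `index`, which mirrors the point that search engines search their own databases rather than the live web.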

16 How it works

17 Anatomy of a Search Engine

18 Search Engine algorithm
Unique to every search engine, and just as important as keywords, search engine algorithms are the why and the how of search engine rankings. A search engine algorithm is a set of rules, or a unique formula, that the search engine uses to determine the significance of a web page, and each search engine has its own set of rules. These rules determine whether a web page is real or just spam, whether it has any significant data that people would be interested in, and many other features used to rank and list results for every search query, producing an organized and informative search engine results page. Because the algorithms differ for each search engine, they are also closely guarded secrets, but there are certain things that all search engine algorithms have in common.

19 1. Relevancy – One of the first things a search engine algorithm checks for is the relevancy of the page. Whether it is just scanning for keywords, or looking at how these keywords are used, the algorithm will determine whether this web page has any relevancy at all for the particular keyword. Where the keywords are located is also an important factor to the relevancy of a website. Web pages that have the keywords in the title, as well as within the headline or the first few lines of the text will rank better for that keyword than websites that do not have these features. The frequency of the keywords also is important to relevancy. If the keywords appear frequently, but are not the result of keyword stuffing, the website will rank better.
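A rough sketch of such a relevancy check follows. The weights (3.0 for the title, 2.0 for the opening words, the 20% stuffing threshold, the penalty) are invented for illustration; real engines keep their formulas secret, so none of these numbers should be read as actual ranking factors.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def relevancy_score(keyword, title, body, lead_words=20, stuffing_density=0.2):
    """Score a page for one keyword using location and frequency (hypothetical weights)."""
    keyword = keyword.lower()
    body_terms = tokenize(body)
    score = 0.0
    if keyword in tokenize(title):
        score += 3.0  # keyword appears in the title
    if keyword in body_terms[:lead_words]:
        score += 2.0  # keyword appears in the first few lines
    if body_terms:
        density = body_terms.count(keyword) / len(body_terms)
        if density <= stuffing_density:
            score += 10.0 * density  # frequent use helps
        else:
            score -= 5.0  # treated as keyword stuffing
    return score

good = relevancy_score(
    "eggs", "All About Eggs",
    "Eggs are a staple food. This page covers how to boil, fry and poach eggs at home.",
)
stuffed = relevancy_score("eggs", "Eggs eggs eggs", "eggs eggs eggs eggs buy eggs")
unrelated = relevancy_score("eggs", "Farm News", "Weather and crop prices")
```

With these assumed weights, the page that uses the keyword naturally scores higher than both the stuffed page and the unrelated page.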

20 Algorithm 2. Individual Factors – A second part of search engine algorithms is the set of individual factors that make a particular search engine different from every other search engine. Each search engine has unique algorithms, and the individual factors of these algorithms are why a search query turns up different results on Google than on MSN or Yahoo!. One of the most common individual factors is the number of pages a search engine indexes. An engine may have more pages indexed, or index them more frequently, and this can give different results for each search engine. Some search engines also penalize spamming, while others do not.
3. Off-Page Factors – Another part of the algorithm that is still individual to each search engine is off-page factors, such as click-through measurement and linking. Click-through rates and the number of links can indicate how relevant a web page is to actual users and visitors, and this can cause an algorithm to rank the web page higher. Off-page factors are harder for webmasters to craft, but can have an enormous effect on page ranking depending on the search engine algorithm.
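As an illustration of off-page factors, the sketch below counts inbound links from a hypothetical link graph and combines them with a click-through rate. The formula, the weights, and the function names are assumptions made for this example only; they are not taken from any real search engine.

```python
import math

def count_inbound_links(link_graph):
    """Count how many distinct other pages link to each page."""
    counts = {page: 0 for page in link_graph}
    for source, targets in link_graph.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts

def off_page_score(inbound_links, clicks, impressions,
                   link_weight=1.0, click_weight=2.0):
    """Combine linking and click-through into one score (hypothetical weights)."""
    click_through_rate = clicks / impressions if impressions else 0.0
    # log1p dampens the effect of very large link counts.
    return link_weight * math.log1p(inbound_links) + click_weight * click_through_rate

# Hypothetical link graph: page -> pages it links to.
link_graph = {"a": ["b", "c"], "b": ["c"], "c": []}
inbound = count_inbound_links(link_graph)
score_c = off_page_score(inbound["c"], clicks=30, impressions=100)
score_a = off_page_score(inbound["a"], clicks=30, impressions=100)
```

Here page `c` has more inbound links than page `a`, so with equal click-through rates it receives the higher off-page score.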

21 Search engine algorithms are the mystery behind search engines, sometimes even amusingly called the search engine’s “Secret Sauce”. Beyond the basic functions of a search engine, the relevancy of a web page, the off-page factors, and the unique factors of each search engine help make the algorithms of each engine an important part of the search engine optimization design.

22 What is SEO? SEO is an abbreviation for search engine optimization. SEO is the process of improving the volume or quality of traffic to a web site from search engines via search results. SEO aims to improve rankings for relevant keywords in search results.

