
A Training Manual By Sapience Infosolutions

Did You Know…(Revelation)  What is a Search Engine?  What is Google?  How Does Google Work?  What is a Web Spider/Crawler?  What is PR?  What is SEO?  What is a Hyperlink?  How Do We Optimize the Web and Use It for Our Benefit?

LOST………??  Lots of questions, right?  Don’t worry, we’ll answer everything!  Just don’t forget to take notes.

How Do Search Engines Work?  Basically there are two types of search engines; the first type uses robots, called crawlers or spiders, to index websites. As soon as you submit your website pages to a search engine through its submission page, the search engine spider will index your entire site. A ‘spider’ is an automated program run by the search engine system. The spider visits a web site, reads the content on the actual site and the site’s meta tags, and follows the links that the site connects to. The spider then returns all that information to a central depository, where the data is indexed. It will visit each link you have on your website and index those sites too. Some spiders will only index a specific number of pages on your site, so don’t create a site with 500 pages!
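The visit–read–follow loop described above can be sketched in a few lines. This is a toy illustration only: the in-memory `web` dictionary, the example URLs, and the regex-based link extraction stand in for real HTTP fetching and HTML parsing.

```python
import re
from collections import deque

# A hypothetical, in-memory "web": URL -> HTML. A real spider would
# fetch these pages over HTTP instead.
web = {
    "http://a.example": '<a href="http://b.example">B</a> welcome page',
    "http://b.example": '<a href="http://a.example">A</a> about spiders',
}

def crawl(seed):
    """Visit the seed page, read its content, follow its links,
    and return an index of URL -> page text."""
    queue, seen, index = deque([seed]), {seed}, {}
    while queue:
        url = queue.popleft()
        html = web.get(url, "")
        index[url] = re.sub(r"<[^>]+>", "", html).strip()  # strip tags, keep text
        for link in re.findall(r'href="([^"]+)"', html):   # follow links on the page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index

index = crawl("http://a.example")
```

Starting from one seed page, the spider discovers and indexes every page it can reach by links, exactly the behavior the slide warns about when it says linked sites get indexed too.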

How Do Search Engines Work?  The spider will frequently come back to the sites to check for any information that has changed. The frequency with which this happens is determined by the moderators of the search engine.  A spider is rather like a book: it contains a table of contents, the actual content, and the links and references for all the websites it finds during its search, and it may index up to a million pages a day.

How Do Search Engines Work?

 When you ask a search engine to find information, it is actually searching through the index it has created, not truly searching the Web. Different search engines produce different rankings because not every search engine uses the same algorithm to search through the index.  One of the things a search engine algorithm scans for is the frequency and position of keywords on a web page, but it can also identify artificial keyword stuffing, or spamdexing. The algorithms then analyze the way that pages link to other pages on the Web. By examining how pages link to each other, an engine can both determine what a page is about and check whether the keywords of the linked pages are related to the keywords on the original page.
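The frequency-and-position signal, together with a crude stuffing check, can be illustrated with a toy scoring function. The weights and the 0.3 density threshold are invented for illustration and are not values used by any real engine:

```python
# Toy ranking signal: score a page for one keyword by frequency,
# add a bonus when the keyword appears in the title, and zero the
# score when keyword density looks like spamdexing. All numbers here
# are illustrative assumptions.
def keyword_score(keyword, title, body):
    words = body.lower().split()
    freq = words.count(keyword.lower())
    density = freq / max(len(words), 1)
    if density > 0.3:            # crude keyword-stuffing check
        return 0.0
    score = float(freq)
    if keyword.lower() in title.lower():
        score += 2.0             # position bonus: keyword in the title
    return score

honest = keyword_score("seo", "SEO Basics", "learn seo the right way")
stuffed = keyword_score("seo", "SEO", "seo seo seo seo seo")
```

A naturally written page scores positively, while a page that repeats the keyword in nearly every position is discarded, which is the behavior the slide describes as detecting spamdexing.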

Some Examples…

How Google Works

 Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:  Googlebot, a web crawler that finds and fetches web pages.  The indexer that sorts every word on every page and stores the resulting index of words in a huge database.  The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.

How Google Works 1. Googlebot, Google’s Web Crawler  Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.  Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.  Googlebot finds pages in two ways: through an add URL form, and through finding links by crawling the web.
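The deliberate throttling described above can be sketched as a per-host politeness delay. The class name, the fixed one-second minimum delay, and the explicit `now` parameter are all illustrative assumptions, not Googlebot's real policy:

```python
# Sketch of per-host politeness: never hit the same server more often
# than once per `min_delay` seconds. Time is passed in explicitly so
# the logic is easy to test without sleeping.
class PoliteScheduler:
    def __init__(self, min_delay=1.0):
        self.min_delay = min_delay
        self.last_fetch = {}          # host -> time of last request

    def wait_time(self, host, now):
        """Seconds to wait before `host` may be fetched again."""
        last = self.last_fetch.get(host)
        if last is None:
            return 0.0                # never fetched: go immediately
        return max(0.0, self.min_delay - (now - last))

    def record_fetch(self, host, now):
        self.last_fetch[host] = now

sched = PoliteScheduler(min_delay=1.0)
sched.record_fetch("example.com", now=10.0)
```

A crawler that fetches thousands of pages in parallel would consult such a scheduler per host, so one server is never crowded out even while the overall fetch rate stays high.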

How Google Works  Unfortunately, spammers figured out how to create automated bots that bombarded the add URL form with millions of URLs pointing to commercial propaganda. Google rejects those URLs submitted through its Add URL form that it suspects are trying to deceive users by employing tactics such as including hidden text or links on a page, stuffing a page with irrelevant words, cloaking (aka bait and switch), using sneaky redirects, creating doorways, domains, or sub-domains with substantially similar content, sending automated queries to Google, and linking to bad neighbors. So now the Add URL form also has a test: it displays some squiggly letters designed to fool automated “letter-guessers”; it asks you to enter the letters you see — something like an eye-chart test to stop spambots.  When Googlebot fetches a page, it culls all the links appearing on the page and adds them to a queue for subsequent crawling. Googlebot tends to encounter little spam because most web authors link only to what they believe are high-quality pages. By harvesting links from every page it encounters, Googlebot can quickly build a list of links that can cover broad reaches of the web. This technique, known as deep crawling, also allows Googlebot to probe deep within individual sites. Because of their massive scale, deep crawls can reach almost every page in the web. Because the web is vast, this can take some time, so some pages may be crawled only once a month.

How Google Works  Although its function is simple, Googlebot must be programmed to handle several challenges. First, since Googlebot sends out simultaneous requests for thousands of pages, the queue of “visit soon” URLs must be constantly examined and compared with URLs already in Google’s index. Duplicates in the queue must be eliminated to prevent Googlebot from fetching the same page again. Googlebot must determine how often to revisit a page. On the one hand, it’s a waste of resources to re-index an unchanged page. On the other hand, Google wants to re-index changed pages to deliver up-to-date results.  To keep the index current, Google continuously recrawls popular frequently changing web pages at a rate roughly proportional to how often the pages change. Such crawls keep an index current and are known as fresh crawls. Newspaper pages are downloaded daily, pages with stock quotes are downloaded much more frequently. Of course, fresh crawls return fewer pages than the deep crawl. The combination of the two types of crawls allows Google to both make efficient use of its resources and keep its index reasonably current.

How Google Works 2. Google’s Indexer  Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.  To improve search performance, Google ignores (doesn’t index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, as well as converting all letters to lowercase, to improve Google’s performance.
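The index structure the slide describes can be sketched as a positional inverted index: lowercase everything, drop stop words, and record, for each term, the documents and word positions where it occurs. The stop-word list here is a small illustrative subset, and the dictionary format is an assumption, not Google's real data structure:

```python
# Minimal positional inverted index: term -> {doc_id: [positions]}.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why", "a"}

def build_index(docs):
    """docs: {doc_id: text} -> inverted index with word positions."""
    index = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            if word in STOP_WORDS:
                continue           # stop words are not indexed
            index.setdefault(word, {}).setdefault(doc_id, []).append(pos)
    return index

idx = build_index({
    "d1": "How the spider crawls the web",
    "d2": "The web is big",
})
```

Looking up a query term is now a single dictionary access that immediately yields every document containing it, which is what makes answering a query against the index so much faster than scanning pages.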

How Google Works 3. Google’s Query Processor  The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.  PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.  Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page; see SEOmoz.org’s report for an interpretation of the concepts and practical applications contained in that application.
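The core PageRank idea, that a page's importance is the sum of the importance passed to it by its inbound links, can be sketched with power iteration on a tiny invented link graph. The 0.85 damping factor is the value from the original PageRank paper; everything else here is illustrative:

```python
# Minimal PageRank power iteration. links: {page: [pages it links to]}.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}         # start uniform
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:
                continue
            share = damping * ranks[p] / len(outs)  # vote split over outlinks
            for q in outs:
                new[q] += share
        ranks = new
    return ranks

ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this graph, page c is linked by both a and b, so it ends up ranked above b, which receives only half of a's vote; the ranks always sum to one.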

How Google Works  Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling- correcting system uses such techniques to figure out likely alternative spellings. Google closely guards the formulas it uses to calculate relevance; they’re tweaked to improve quality and performance, and to outwit the latest devious techniques used by spammers.spelling- correcting system  Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page, options offered by Google’s Advanced Search Form and Using Search Operators (Advanced Operators).Google’s Advanced Search FormUsing Search Operators (Advanced Operators)

Let’s see how Google processes a query…

This is How it Looks…!!

Link Value and Page Rank  Link value is the worth of a specific link on a specific page, and it is determined by combining each of the important factors of hyperlink valuation.  By understanding these factors we can work out the value of a link to our web page and, consequently, benefit from all existing inbound links.  Link valuation is important because only a handful of links may be counted as valuable, and a link can fail on two of the three factors and still carry some worth.

Link Value and Page Rank 1. Page Value  The first thing that matters is the PR of the linking page. The higher the PR, the better the inbound link, or so it would seem. All other things being equal, a link that carries a high page value is pretty good, even great, as a vote of confidence.  On the other hand, PR is not everything you need to know about link value…

Link Value and Page Rank 2. Back Link Diluting  Inbound link dilution considers how many links are placed on the page in question. A page with lots of PageRank is worth a good deal, but the individual links divide that value: the larger the number of links, the less value each one passes.  Having a back link from a PageRank 10 page is pretty nice, but if that page has 2,000 links on it, then all the back links are sharing some heavily divided PR.
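To a first approximation, the dilution above is a simple division: each outbound link passes roughly the page's rank divided by its number of links. This linear model is an illustrative simplification, not Google's actual formula:

```python
# Toy link-dilution model: value passed by one link on a page.
def link_value(page_rank, outbound_links):
    """Approximate value one back link receives from the page."""
    return page_rank / outbound_links

crowded = link_value(10, 2000)   # high-PR page stuffed with links
focused = link_value(4, 10)      # modest-PR page with few links
```

Under this model a link from a PR 4 page carrying only 10 links is worth far more than a link from a PR 10 page carrying 2,000, which is exactly the slide's point.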

Link Value and Page Rank 3. Counted Links  There is no point in having an inbound link on an inner page of a website if that page is not counted (indexed) by the search engines.  If a page cannot be found in the search engines, then any back links it carries are worthless to you. So if you want those pages to count, you have to build some links to them and get them indexed.

Link Value and Page Rank

 A link that achieves ultimate value satisfies all of the basic factors of link evaluation: it sits on a good-PR page with few other links, and that page is indexed. Implement all three of these strategies in your work and you will own a link worth more than an assortment of non-indexed, low-PageRank, or diluted links.

Thanks