Presentation on theme: "Search Engine Optimization (SEO)"— Presentation transcript:
1 Search Engine Optimization (SEO) Most SEO techniques are common sense; search engine success comes from applying that common sense consistently. This session surveys search engines and search engine optimization, with the goal of increasing sales, traffic, and conversions.
2 Agenda What is a search engine? Examples of popular search engines. Search engine statistics. Why is search engine marketing important? What is an SEO algorithm? Steps to developing a good SEO strategy. Ranking factors. Basic tips for optimization. My agenda for today covers these high-level topics, and I have allocated 4-8 minutes for a Q&A session. I will also list resources if you are interested in exploring SEO and other internet marketing techniques such as PPC ads, in-line text ads, etc.
3 Examples of popular search engines. Today the undisputed leader in search engine usage is Google. Yahoo claims to have the largest index of all search engines, covering over 5 billion pages. MSN is a distant third; they have done a tremendous job of catching up to Google and Yahoo, but they still have a long way to go. This is a true example of the first-mover advantage. I don't usually talk about Ask Jeeves, but I have been seeing an increased level of traffic from Ask.com, and according to recent statistics Ask now represents approximately 7-8% of all searches. Every search engine has a different algorithm and ranking. Our main focus today will be on the common, basic elements that every search engine company uses to rank and index pages.
5 How Do Search Engines Work? Mechanics of a typical search. If you take nothing else away from this session today but understand how spiders (crawlers) work, you will have understood 40% of search engines.
9 How Do Search Engines Work? A spider "crawls" the web to find new documents (web pages and other files), typically by following hyperlinks from websites already in its database. The search engine indexes the content (text and code) of these documents by adding it to its database, then periodically refreshes that content. When a user enters a search, the engine searches its own database to find related documents; it does not search web pages in real time. Finally, it ranks the resulting documents using an algorithm (a mathematical formula) that assigns various weights and ranking factors. If you take nothing else away from this session today but understand how spiders (crawlers) work, you will have understood 40% of search engines.
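The four steps above (crawl, index, search, rank) can be sketched with a toy in-memory inverted index. The pages and the simple term-frequency scoring below are illustrative assumptions for the sketch, not any real engine's corpus or algorithm; real engines combine many weighted ranking factors.

```python
from collections import defaultdict

# A toy corpus standing in for pages a spider has already crawled.
pages = {
    "a.html": "chess openings and chess strategy",
    "b.html": "computer chess engines",
    "c.html": "computer hardware reviews",
}

# Index step: add each page's terms to an inverted index (term -> page -> count).
index = defaultdict(dict)
for url, text in pages.items():
    for term in text.split():
        index[term][url] = index[term].get(url, 0) + 1

def search(query):
    """Search the engine's own database (the index), not the live pages,
    and rank results by a naive term-frequency score."""
    scores = defaultdict(int)
    for term in query.split():
        for url, count in index.get(term, {}).items():
            scores[url] += count
    return sorted(scores, key=scores.get, reverse=True)

print(search("computer chess"))
```

Note that the query never touches the pages themselves: only the pre-built index is consulted, which is why results return in milliseconds.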
10 Search on the Web. Corpus: the publicly accessible Web, static + dynamic. Goal: retrieve high-quality results relevant to the user's need (not just matching docs!). Types of need: informational, want to learn about something ("low hemoglobin", "Mars surface images"); navigational, want to go to a specific page ("United Airlines"); transactional, want to do something web-mediated, such as access a service ("Tampere weather"), download, or shop ("Nikon CoolPix", "car rental Finland"). Gray areas: find a good hub, or exploratory search to "see what's there" ("abortion morality").
12 100+ Billion Searches / Month, according to Nielsen/NetRatings.
13 Search Engine Wars. The battle for domination of the web search space is heating up! The competition is good news for users. Crucially, advertising is combined with search results. What if one of the search engines manages to dominate the space?
14 Yahoo! Synonymous with the dot-com boom, and probably the best-known brand on the web. Started as a web directory service in 1994 and acquired leading search engine technology in 2003. Has very strong advertising and e-commerce partners. Acquired Inktomi (a search engine) in 2003, as well as the search-marketing firm Overture (which had itself acquired AllTheWeb and AltaVista in 2003), and at that time dropped its collaboration with Google.
15 Lycos! One of the pioneers of the field. Introduced innovations that inspired the creation of Google, including the concept of the popularity of web sites.
16 Google. The verb "google" has become synonymous with searching for information on the web. Google has raised the bar on search quality and has been the most popular search engine in the last few years. It had a very successful IPO in August 2004, and it remains innovative and dynamic.
17 Live Search (was: MSN Search). Microsoft is synonymous with PC software; remember its victory in the browser wars with Netscape. It developed its own search engine technology only recently, officially launched in February 2005, and may link web search into its next version of Windows. It used Yahoo's Inktomi and Overture until 2005.
18 Why is search engine marketing important? 80% of consumers find your website by first typing a query into a search engine (Google, Yahoo, Bing). 90% choose a site listed on the first page of results. 85% of all traffic on the internet is referred by search engines. The top three organic positions receive 59% of user clicks. It is cost-effective advertising with a clear and measurable ROI, and it operates under this assumption: More (relevant) traffic + Good Conversion Rate = More Sales/Leads.
19 Experiment with query syntax. The default operator is AND: "computer chess" is normally interpreted as "computer AND chess", i.e. both keywords must be present in all hits. "+chess" means the user insists that "chess" be present in all hits. "computer OR chess" means either keyword may be present in a hit. Putting "computer chess" in quotation marks means the exact phrase "computer chess" must be present in all hits.
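The AND/OR/phrase semantics above can be mimicked with set operations over a tiny hypothetical corpus (the three documents below are invented for illustration):

```python
docs = {
    1: "computer chess is a classic AI problem",
    2: "chess strategy for beginners",
    3: "computer hardware guide",
}

def hits(term):
    """Set of document ids containing the term."""
    return {d for d, text in docs.items() if term in text.split()}

# Default AND: both keywords must be present in every hit.
and_hits = hits("computer") & hits("chess")
# OR: either keyword may be present.
or_hits = hits("computer") | hits("chess")
# Phrase query: the exact sequence "computer chess" must appear.
phrase_hits = {d for d, t in docs.items() if "computer chess" in t}

print(and_hits)     # {1}
print(or_hits)      # {1, 2, 3}
print(phrase_hits)  # {1}
```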
20 The most popular search keywords, AltaVista (1998), AlltheWeb (2002), Excite (2001): sex, free, applet, porno, download, pictures, mp3, software, new, chat, uk, nude.
21 Free Keyword Research Tools: https://adwords.google.com/o/Targeting/Explorer?ideaRequestType=KEYWORD_IDEAS. Use the Keyword Tool and Traffic Estimator to identify competitive phrases and search frequencies, and to compare search patterns across specific regions, categories, time frames and properties.
22 Web search users. Ill-defined queries: short length, imprecise terms, sub-optimal syntax (80% of queries have no operator), low effort in defining queries. Wide variance in needs, expectations, knowledge, and bandwidth. Specific behavior: 85% look over one result screen only, mostly above the fold; 78% of queries are not modified (1 query/session); users follow links, "the scent of information"...
26 Q: How does a search engine know that all these pages contain the query terms? A: Because all of those pages have been crawled.
27 Crawling picture (Sec. 20.2). Diagram: starting from seed pages, URLs move from the URL frontier into the set of URLs crawled and parsed; beyond them lies the unseen Web.
28 Motivation for crawlers. Support universal search engines (Google, Yahoo, MSN/Windows Live, Ask, etc.) and vertical (specialized) search engines, e.g. news, shopping, papers, recipes, reviews, etc. Business intelligence: keep track of potential competitors and partners. Monitor Web sites of interest. Evil: harvest e-mail addresses for spamming, phishing... Can you think of some others?
29 A crawler within a search engine. Diagram: the crawler (googlebot) fetches pages from the Web into a page repository; text & link analysis feeds a text index and PageRank, which the ranker combines to answer a query with hits.
30 One taxonomy of crawlers. Many other criteria could be used: incremental, interactive, concurrent, etc.
31 Basic crawlers. This is a sequential crawler. Seeds can be any list of starting URLs. The order of page visits is determined by the frontier data structure. The stop criterion can be anything.
32 Graph traversal (BFS or DFS?). Breadth-first search is implemented with a QUEUE (FIFO) and finds pages along shortest paths; if we start with "good" pages, this keeps us close to them, and maybe to other good stuff. Depth-first search is implemented with a STACK (LIFO) and tends to wander away ("lost in cyberspace").
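The point that the frontier data structure alone determines the visit order can be shown on a small invented link graph; the same loop does BFS or DFS depending only on which end of the frontier it pops:

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": [],
    "E": [],
}

def crawl(seed, depth_first=False):
    """Generic crawler loop; the frontier discipline sets the visit order."""
    frontier = deque([seed])
    visited = []
    while frontier:
        # STACK (LIFO) for DFS, QUEUE (FIFO) for BFS.
        url = frontier.pop() if depth_first else frontier.popleft()
        if url in visited:
            continue
        visited.append(url)
        frontier.extend(links[url])
    return visited

print(crawl("A"))                    # BFS: ['A', 'B', 'C', 'D', 'E']
print(crawl("A", depth_first=True))  # DFS: ['A', 'C', 'E', 'B', 'D']
```

BFS visits pages level by level (shortest paths from the seed), while DFS plunges down one branch before backtracking.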
33 Universal crawlers support universal search engines. They are large-scale: the huge cost (network bandwidth) of a crawl is amortized over many queries from users, with incremental updates to the existing index and other data repositories.
34 Large-scale universal crawlers. Two major issues: performance (need to scale up to billions of pages) and policy (need to trade off coverage, freshness, and bias, e.g. toward "important" pages).
35 Large-scale crawlers: scalability. Need to minimize the overhead of DNS lookups and to optimize utilization of network bandwidth and disk throughput (I/O is the bottleneck). Use asynchronous sockets: multi-processing or multi-threading does not scale up to billions of pages. Non-blocking I/O keeps hundreds of network connections open simultaneously, polling sockets to monitor completion of network transfers.
36 Universal crawlers: policy. Coverage: new pages get added all the time; can the crawler find every page? Freshness: pages change over time, get removed, etc.; how frequently can a crawler revisit them? There is a trade-off! Should the crawler focus on the most "important" pages (crawler bias)? "Importance" is subjective.
37 Web coverage by search engine crawlers. This assumes we know the size of the entire Web. Do we? Can you define "the size of the Web"?
38 Maintaining a "fresh" collection. Universal crawlers are never "done". There is high variance in the rate and amount of page changes, and the relevant HTTP headers (Last-Modified, Expires) are notoriously unreliable. Solution: estimate the probability that a previously visited page has changed in the meanwhile, and prioritize revisits by this probability estimate.
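One common way to estimate that probability is a Poisson change model: if a page changes at rate lambda per day, the chance it has changed t days after the last visit is 1 - exp(-lambda * t). The pages and rates below are invented for illustration; real crawlers learn the rates from past revisits.

```python
import math

# Hypothetical per-page change rates (changes per day) and revisit ages.
change_rate = {"news.html": 2.0, "about.html": 0.01, "blog.html": 0.3}
days_since_visit = {"news.html": 0.5, "about.html": 30, "blog.html": 3}

def p_changed(url):
    """Poisson model: probability the page changed since the last visit."""
    return 1 - math.exp(-change_rate[url] * days_since_visit[url])

# Revisit the pages most likely to have changed first.
revisit_order = sorted(change_rate, key=p_changed, reverse=True)
print(revisit_order)
```

Note how the frequently-changing news page outranks the stale about page even though the about page was visited far longer ago: the rate and the age both matter.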
39 Do we need to crawl the entire Web? If we cover too much, it will get stale. There is an abundance of pages on the Web, and for PageRank, pages with very low prestige are largely useless. What is the goal? General search engines want pages with high prestige; news portals want pages that change often; vertical portals want pages on some topic. What are appropriate priority measures in these cases? Approximations?
40 Complications. Web crawling isn't feasible with one machine; all of the above steps must be distributed. Malicious pages: spam pages and spider traps, including dynamically generated ones. Even non-malicious pages pose challenges: latency and bandwidth to remote servers vary; webmasters have stipulations (how "deep" should you crawl a site's URL hierarchy?); site mirrors and duplicate pages abound; and politeness demands that you don't hit a server too often.
41 robots.txt: your guide for the search engines.
42 What is robots.txt? It's a file in the root of your website that can either allow or restrict search engine robots from crawling pages on your website.
43 How does it work? Before a search engine robot crawls your website, it will first look for your robots.txt file to find out where you want it to go. There are 3 things you should keep in mind. Robots can ignore your robots.txt: malware robots scanning the web for security vulnerabilities, or address harvesters used by spammers, will not care about your instructions. The robots.txt file is public: anyone can see which areas of your website you don't want robots to see. Search engines can still index (but not crawl) a page you've disallowed if it's linked to from another website; in the search results it will then show only the URL, usually with no title or information snippet. To keep a page out of the index, use the robots meta tag on that page instead.
44 What to put in your robots.txt file. User-agent: the line where you define which robot you're talking to; it's like saying hello to the robot. User-agent: * addresses all robots (specific names include Googlebot for Google and Slurp for Yahoo). Disallow: tells the robots what you don't want them to crawl on your site, e.g. Disallow: / (do not crawl anything on my site) or Disallow: /images/. Allow: tells the robots what you do want them to crawl, e.g. Allow: /
45 What to put in your robots.txt file. * (asterisk / wildcard): with the * symbol, you tell the robots to match any number of any characters; very useful, for example, when you don't want your internal search result pages to be indexed. Disallow: *contact* (do not crawl any URLs containing the word contact). $ (dollar sign / ends with): the dollar sign tells the robots that it is the end of the URL, e.g. Disallow: *.pdf$. # (hash / comment): you can add comments after the "#" symbol, either at the start of a line or after a directive.
46 What to put in your robots.txt file. Crawl-delay: asks the robot to wait a certain number of seconds after each time it crawls a page on your website, e.g. Crawl-delay: 5. Request-rate: tells the robot how many pages you want it to crawl within a certain number of seconds; the first number is pages, the second is seconds, e.g. Request-rate: 1/5 # load 1 page per 5 seconds. Visit-time: like opening hours, i.e. when you want the robots to visit your website; useful if you don't want robots visiting during busy hours (when you have lots of human visitors), e.g. Visit-time: 2100-0500 # only visit between 21:00 (9 PM) and 05:00 (5 AM) UTC (GMT).
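You can check how a crawler would read such rules with Python's standard-library parser. The robots.txt content and the example.com URLs below are hypothetical; note that urllib.robotparser handles prefix Disallow rules and Crawl-delay, but not the * and $ wildcard extensions some engines support.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt for the sketch.
rules = """\
User-agent: *
Disallow: /images/
Crawl-delay: 5
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "https://example.com/index.html"))    # True
print(rp.can_fetch("*", "https://example.com/images/x.png"))  # False
print(rp.crawl_delay("*"))                                    # 5
```

A polite crawler calls can_fetch before every request and sleeps crawl_delay seconds between requests to the same host.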
47 Test your page: https://www.google.com/webmasters/
49 What is SEO? SEO = Search Engine Optimization. It refers to the process of "optimizing" both on-page and off-page ranking factors in order to achieve high search engine rankings for targeted search terms. It also refers to the "industry" that has grown up around using keyword searching as a means of increasing relevant traffic to a website.
51 What is an SEO algorithm? Top secret! Only select employees of a search engine company know for certain. Reverse engineering, research and experiments give SEOs (search engine optimization professionals) a "pretty good" idea of the major factors and approximate weight assignments. The algorithm is constantly changed, tweaked and updated, and the websites and documents being searched are also constantly changing. It varies by search engine: some give more weight to on-page factors, some to link popularity.
53 A good SEO strategy: research desirable keywords and search phrases (WordTracker, Overture, Google AdWords); identify search phrases to target (they should be relevant to your business/market, obtainable and profitable); "clean" and optimize the website's HTML code for appropriate keyword density, title-tag optimization, internal linking structure, headings and subheadings, etc.; write copy that appeals to both search engines and actual website visitors; study competitors (competing websites) and the search engines; implement a quality link-building campaign; add quality content; and constantly monitor rankings for targeted search terms.
54 Ranking factors. On-page factors (code & content): #1 content, content, content (body text, <body>); #2 keyword frequency & density; #3 title tags <title>; #4 ALT image tags; #5 header tags <h1>; #6 hyperlink text. Off-page factors: #1 anchor text; #2 link popularity ("votes" for your site), which adds credibility. Anchor text is the visible hyperlinked text on a page, usually used to indicate the subject matter of the page it links to. For example, the text "7th world congress ebusiness" indicates to visitors that they can expect to see content about a conference pertaining to ebusiness if they follow the link. This pattern of usage has been applied in search engine algorithms to enhance the relevance of the "target" or "landing page" URL for the keywords appearing within the anchor text.
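These on-page factors are exactly what a spider extracts when it parses your HTML. As a sketch, the standard-library parser below pulls the title, header, ALT and anchor text out of a small hypothetical page (the page markup and the OnPageFactors class are invented for illustration):

```python
from html.parser import HTMLParser

# A hypothetical page for the sketch.
page = """<html><head><title>7th World Congress on eBusiness</title></head>
<body><h1>eBusiness Conference</h1>
<img src="logo.png" alt="congress logo">
<a href="/program.html">7th world congress ebusiness</a>
</body></html>"""

class OnPageFactors(HTMLParser):
    """Collects the on-page signals the slide lists: title, headers,
    image ALT text, and anchor (hyperlink) text."""
    def __init__(self):
        super().__init__()
        self.factors = {"title": "", "h1": "", "alt": [], "anchor": []}
        self._in = None                      # tag whose text we are reading

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "a"):
            self._in = tag
        if tag == "img":                     # ALT text lives in an attribute
            self.factors["alt"] += [v for k, v in attrs if k == "alt"]

    def handle_endtag(self, tag):
        self._in = None

    def handle_data(self, data):
        if self._in in ("title", "h1"):
            self.factors[self._in] += data.strip()
        elif self._in == "a":
            self.factors["anchor"].append(data.strip())

p = OnPageFactors()
p.feed(page)
print(p.factors["anchor"])  # ['7th world congress ebusiness']
```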
55 What a Search Engine Sees View > Source (HTML code)
56 Pay Per Click. PPC ads appear as "sponsored listings". Companies bid on the price they are willing to pay "per click". PPC platforms typically have very good tracking tools and statistics, give you the ability to control the ad text, and let you set budgets and spending limits. Google AdWords and Overture are the two leaders.
57 PPC vs. "Organic" SEO. Pay-Per-Click: results in 1-2 days; easier for a novice or one with little knowledge of SEO; ability to turn on and off at any moment; generally more costly per visitor and per conversion; fewer impressions and less exposure; easier to compete in a highly competitive market space (but it will cost you); ability to generate exposure on related sites (AdSense); ability to target "local" markets; better for short-term and high-margin campaigns. "Organic" SEO: results take 2 weeks to 4 months; requires ongoing learning and experience to achieve results; very difficult to control the flow of traffic; generally more cost-effective, and does not penalize you for more traffic; organic SERP listings are more popular than sponsored ads; very difficult to compete in a highly competitive market space; ability to generate exposure on related websites and directories; more difficult to target local markets; better for long-term and lower-margin campaigns.
58 Keys to a Successful SEO Strategy. 1. Do not underestimate the importance of keyword research. 2. Be sure to include the proper tags in your page coding. 3. You must have optimized content! (3-5 uses of the keyword per 250 words.) 4. Use content marketing.
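The 3-5 uses per 250 words rule of thumb from point 3 is easy to check mechanically. The helper below is a hypothetical sketch (whole-word matching only, no stemming):

```python
def keyword_uses_per_250(text, keyword):
    """Occurrences of the keyword, normalized to a 250-word page section."""
    words = text.lower().split()
    count = sum(1 for w in words if w == keyword.lower())
    return count / max(len(words), 1) * 250

# A synthetic 250-word sample with the keyword used 4 times.
sample = "seo " * 4 + "word " * 246
print(round(keyword_uses_per_250(sample, "seo")))  # 4
```

A real audit would also strip markup and punctuation and count phrase variants, but the normalization idea is the same.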
59 Keyword Selection. Competition: how much competition (large, authority sites) is there for the particular keyword? Marketing/brand relevance: how closely does the keyword match your product/service offering, messaging, goals and objectives? Optimization opportunity: is there already a logical place on the site to optimize for the particular keyword? Search frequency: how many people are searching on the particular keyword? Together these four factors determine the recommended keywords.