
1 Unraveling URLs and Demystifying Domains
presented by Stephan Spencer, Founder & President, Netconcepts
© 2008 Stephan M Spencer, Netconcepts | www.netconcepts.com | sspencer@netconcepts.com

2 Subdomains vs. Subdirectories
 Matt's/Google's announcement: they'd essentially treat the two as the same (www.mattcutts.com/blog/subdomains-and-subdirectories/)
 You shouldn't treat subdomains as a means of creating tons of easy thin-content microsites; they're being viewed as subdirectories. Yes, use them for managing your website and doing load balancing. No, don't use them purely for SEO reasons.

3 Microsites
 Can be bad for your SEO if overly numerous or if they contain substantial amounts of duplicate content (merely changing the UI doesn’t count)
 Can be good when you’ll get more link love
 – Hypothetical example: stayinghealthy.com vs. stayinghealthy.metlife.com
 Can also be beneficial in terms of demographic targeting and focused keyword targeting

4 Keywords in URLs
 Beneficial in Google regardless of whether they appear in filename/directory/subdirectory names or as variable values in query strings.
 In other search engines, it’s more important that the keyword be in the filename/directory/subdirectory. And the closer the keyword(s) are to the root domain name, apparently the more weight they lend.
 Just because a keyword is bolded in the SERP doesn’t mean it’s given extra weight in the ranking algo.

5 Word Separators in URLs
 Hyphens are best, and preferred over underscores.
 – Historically, Google did not treat underscores as word separators
 – Bare spaces cannot be used in URLs; the character-encoded equivalents for a space are + or %20 (e.g. blue%20widgets.htm). Regardless, the hyphen is preferred.
 Too much of a good thing looks like keyword stuffing
 – Aim for fewer than a half dozen words (i.e. fewer than 5 hyphens)
 – See my Matt Cutts interview (stephanspencer.com/search-engines/matt-cutts-interview)

6 URL Stability
 An annually recurring feature, like a Holiday Gift Buying Guide, should have a stable URL (see the sketch after this slide)
 – When the current edition is retired and replaced with a new edition, assign a new URL to the archived edition
 Otherwise the link juice earned over time is not carried over to future years’ editions
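One way to wire this up with mod_rewrite, as a minimal sketch (the paths and filenames here are hypothetical, not from the deck): the public URL stays stable while the file behind it changes each year.

  RewriteEngine on
  # /gift-guide/ is the permanent, link-accruing URL; point it at whichever
  # edition is current, and publish retired editions at new archive URLs
  # (e.g. /editions/gift-guide-2007.htm) so the stable URL keeps its juice
  RewriteRule ^gift-guide/?$ /editions/gift-guide-2008.htm [L]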

7 Domain Age and Expiry
 Crusty old domains (and crusty old sites) are more trusted by Google, as alluded to in Google’s "Information retrieval based on historical data" patent
 – Parked domains aren’t as trusted; get a real site up to start the clock running.
 The number of years before your domain name expires may very well be a big quality indicator.
 – Suggest increasing the registration period for your domain so the expiration date will be further in the future
 – Particularly for newer domains

8 Domain Age and Expiry
 – Domainers have often been known to do "tasting" (i.e. registering domains for just a couple of days to see what keyword traffic they get)
 – Google just announced that it will stop displaying AdSense ads on domain-tasting sites as a measure to try to fight the practice (www.informationweek.com/news/showArticle.jhtml?articleID=205918984)

9 Rewriting Your Spider-Unfriendly URLs
 3 approaches:
 1) Use a “URL rewriting” server module/plugin, such as mod_rewrite for Apache or ISAPI_Rewrite for IIS Server
 2) Recode your scripts to extract variables out of the “path_info” part of the URL instead of the “query_string” (see the sketch after this slide)
 3) Or, if IT department involvement must be minimized, use a proxy server based solution (e.g. Netconcepts' GravityStream)
 – With (1) and (2), replace all occurrences of your old URLs in links on your site with your new search-friendly URLs. 301 redirect the old URLs to the new ones too, so no link juice is lost.
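A minimal sketch of the Apache side of approach (2). AcceptPathInfo is a real Apache 2 directive; the get_product.php script and URL shape are the deck's hypothetical example:

  # In httpd.conf or .htaccess: permit trailing path segments on scripts,
  # so a URL like /get_product.php/123 is served by get_product.php
  AcceptPathInfo On
  # The script then reads "/123" from the PATH_INFO CGI variable instead
  # of reading id=123 from QUERY_STRING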

10 Let’s Geek Out!

11 URL Rewriting – Under the Hood
 If running Apache, place “rules” within .htaccess or your Apache config file (e.g. httpd.conf, sites_conf/…)
 – RewriteEngine on
 – RewriteBase /
 – RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1 [L]
 – RewriteRule ^([^/]+)/([^/]+)\.htm$ /webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&langId=-1&categoryID=$1&productID=$2 [QSA,P,L]
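To make the first rule concrete: a request for /products/123/ is rewritten internally to /get_product.php?id=123, so visitors and spiders only ever see the clean URL. The second rule does the same for two-segment category/product URLs, proxying ([P]) to a WebSphere-Commerce-style servlet and preserving any additional query parameters ([QSA]).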

12 URL Rewriting – Under the Hood
 The magic of regular expressions / pattern matching (combined in the sketch after this slide):
 – * means 0 or more of the immediately preceding character
 – + means 1 or more of the immediately preceding character
 – ? means 0 or 1 occurrences of the immediately preceding character
 – ^ means the beginning of the string, $ means the end of it
 – . means any character (i.e. a wildcard)
 – \ “escapes” the character that follows, e.g. \. means a literal dot
 – [ ] is for character ranges, e.g. [A-Za-z]
 – ^ inside [ ] brackets means “not”, e.g. [^/]
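A small made-up rule (the /article.php target is hypothetical) that exercises most of these metacharacters at once:

  # Matches e.g. news/2008/my-story.htm: ^ and $ anchor the pattern,
  # [0-9]+ captures the year, [^/]+ captures the slug (anything but a
  # slash), and \. matches a literal dot before "htm"
  RewriteRule ^news/([0-9]+)/([^/]+)\.htm$ /article.php?year=$1&slug=$2 [L]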

13 URL Rewriting – Under the Hood
 – ( ) puts whatever is wrapped within it into memory
 – Access what’s in memory with $1 (what’s in the first set of parens), $2 (what’s in the second set), and so on
 Regular expression gotchas to beware of (illustrated after this slide):
 – “Greedy” expressions: use a negated character class such as [^/]+ instead of .*
 – .* can match on nothing; use .+ instead
 – Unintentional substring matches because ^ or $ wasn’t specified
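An illustration of the greedy-match gotcha, using a hypothetical three-segment path:

  # Against shop/widgets/blue.htm, the greedy version captures too much,
  # because .* grabs as many characters as it can:
  #   ^(.*)/(.*)\.htm$        ->  $1 = "shop/widgets", $2 = "blue"
  # The negated-class version simply refuses the three-segment path,
  # surfacing the problem instead of silently mis-capturing:
  #   ^([^/]+)/([^/]+)\.htm$  ->  no match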

14 URL Rewriting – Under the Hood
 Proxy a page using the [P] flag
 – RewriteRule /blah\.html$ http://www.google.com/ [P]
 The [QSA] flag is for when you don’t want query string params dropped (like when you want a tracking param preserved)
 The [L] flag saves on server processing: no further rules are evaluated for that request
 Got a huge pile of rewrites? Use RewriteMap and keep a lookup table in a text file (see the sketch after this slide)
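A minimal RewriteMap sketch (the map file and paths are hypothetical; note that RewriteMap must be declared in the server config, e.g. httpd.conf, not in .htaccess):

  # Declare a plain-text lookup table mapping slugs to product IDs
  RewriteMap productmap txt:/etc/apache2/productmap.txt
  # productmap.txt contains lines like:
  #   blue-widgets 123
  #   red-gadgets 456
  RewriteRule ^products/([^/]+)/?$ /get_product.php?id=${productmap:$1} [L]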

15 If You’re on Microsoft IIS Server
 ISAPI_Rewrite is not that different from mod_rewrite
 In httpd.ini:
 – [ISAPI_Rewrite]
   RewriteRule ^/category/([0-9]+)\.htm$ /index.asp?PageAction=VIEWCATS&Category=$1 [L]
 – This lets you publish a URL like http://www.example.com/category/207.htm in place of http://www.example.com/index.asp?PageAction=VIEWCATS&Category=207; requests for the static-looking URL are internally mapped to the dynamic one

16 301 Redirects – Under the Hood
 In .htaccess (or httpd.conf), you can redirect individual URLs, the contents of directories, entire domains…:
 – Redirect 301 /old_url.htm http://www.example.com/new_url.htm
 – Redirect 301 /old_dir/ http://www.example.com/new_dir/
 – Redirect 301 / http://www.example.com
 Pattern matching can be done with RedirectMatch 301
 – RedirectMatch 301 ^/(.+)/index\.html$ http://www.example.com/$1/
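To make the RedirectMatch example concrete: a request for any directory's index.html (e.g. /widgets/index.html) gets 301-redirected to the bare directory URL (http://www.example.com/widgets/), consolidating the two duplicate URLs' link juice onto one.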

17 301 Redirects – Under the Hood
 Or use a rewrite rule with the [R=301] flag; this pair canonicalizes any non-www hostname onto www.example.com:
 – RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
 – RewriteRule ^(.*)$ http://www.example.com/$1 [L,QSA,R=301]
 The [NC] flag makes the rewrite condition case-insensitive

18 Conditional Redirects – Under the Hood
 Selectively redirect bots that request URLs with session IDs to the URL sans session ID:
 – RewriteCond %{QUERY_STRING} PHPSESSID
   RewriteCond %{HTTP_USER_AGENT} Googlebot.* [OR]
   RewriteCond %{HTTP_USER_AGENT} ^msnbot.* [OR]
   RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
   RewriteCond %{HTTP_USER_AGENT} Ask\ Jeeves
   RewriteRule ^/(.*)$ /$1? [R=301,L]
 – (The trailing ? on the substitution is needed to drop the query string; without it mod_rewrite re-appends the original query string, session ID and all, and the redirect loops)
 Utilize browscap.ini instead of having to keep up with each spider’s name and version changes

19 URLs that Lead to Error Pages
 The traditional approach is to serve up a 404, which drops that obsolete or wrong URL out of the search indexes. This squanders the link juice flowing to that page.
 But what if you return a 200 status code instead, so that the spiders follow the links? Then include a meta robots noindex so the error page itself doesn’t get indexed.
 Or do a 301 redirect to something valuable (e.g. your home page) and dynamically include a small error notice?
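For reference, the conventional setup being argued against, as a minimal Apache sketch (/error-page.php is a hypothetical script):

  # Serve a custom page for missing URLs; Apache still returns the 404
  # status, so search engines drop the URL. Implementing the slide's
  # 200-plus-noindex alternative would require the script itself to
  # override the status and emit <meta name="robots" content="noindex">
  ErrorDocument 404 /error-page.php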

20 Thanks!
 This PowerPoint can be downloaded from www.netconcepts.com/learn/unraveling-urls.ppt
 For a 180-minute screencast (including 90 minutes of Q&A) on SEO for large dynamic websites, taught by Chris Smith and me – including transcripts – email seo@netconcepts.com
 Questions after the show? Email me at stephan@netconcepts.com

