IDK0040 Võrgurakendused I Building a site: Publicising Deniss Kumlander.


1 IDK0040 Võrgurakendused I Building a site: Publicising Deniss Kumlander

2 How to make a site truly public, or how search engines’ algorithms work

3 Targets
We are going to look at:
–how search services are built
–how to ensure that your site is included in the search engines' databases
–how to improve a site's chances of being selected by a search engine in response to a query string

4 High level description
[Diagram labels: Crawler; scans; Database of URLs, ranks, relations, etc.; Searchers processor]

5 High level description
Notice that a search engine creates either a general description of the web or a private view as the answer to a search string.
Either way, you are not searching the Internet but an index created by a “search engine”.
–Simply storing one billion pages of 10 kbytes each (compressed) requires 10 TB, plus another 10 TB for indexes.
–Moreover, a public search engine needs far more resources than that to calculate query results and to provide high availability.
–Crawling 1B pages with 10 machines, each crawling 100 pages/second, would take 1M seconds, or 11.6 days, on a very high-capacity Internet connection.
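
The storage and crawl-time figures above can be re-derived with a few lines of arithmetic. This is only a sketch using the slide's own numbers; it assumes decimal units (1 TB = 10^12 bytes) and that the 100 pages/second rate applies to each of the 10 machines, which is the reading that yields the quoted 1M seconds.

```python
# Back-of-the-envelope check of the slide's figures.
# Assumptions: decimal units, and 100 pages/second PER machine.
PAGES = 1_000_000_000             # one billion pages
PAGE_SIZE_BYTES = 10_000          # 10 kbytes per compressed page
MACHINES = 10
PAGES_PER_SECOND_PER_MACHINE = 100
SECONDS_PER_DAY = 86_400

# Raw page storage, in terabytes.
storage_tb = PAGES * PAGE_SIZE_BYTES / 1e12

# Total crawl time when all machines crawl in parallel.
crawl_seconds = PAGES / (MACHINES * PAGES_PER_SECOND_PER_MACHINE)
crawl_days = crawl_seconds / SECONDS_PER_DAY

print(f"page storage: {storage_tb:.0f} TB")
print(f"crawl time: {crawl_seconds:.0f} s = {crawl_days:.1f} days")
```

The index storage (another 10 TB) is quoted on the slide as a separate estimate and is not derived here.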

6 Adding a site
Just wait, if an already indexed site links to yours
–The Web is growing much faster than any present-technology search engine can possibly index (see distributed web crawling). In 2006, some users found that major search engines had become slower to index new webpages.
Use the “Add my URL” function offered by many search engines
–A website developer has to be more proactive than ever before about getting listed by search engines and directories. In many cases, this (unfortunately) means that you have to pay a fee to get listed.

7 Searching process
All words are ranked by their prevalence in standard language X
–“Rude” is a more important word than “all”
–Common words (“and”, “or”) are thrown away unless you are looking for an exact phrase
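
The idea on this slide can be sketched as follows: drop common words unless an exact phrase is requested, then weight the remaining words so that rarer words count more. This is not any real engine's algorithm. The frequency table, the stop-word set, the default frequency and the function names are all hypothetical and chosen only for illustration.

```python
# Hypothetical relative word frequencies in general English text.
# Illustrative numbers only, not measured data.
WORD_FREQUENCY = {"all": 0.003, "rude": 0.00001, "and": 0.028, "or": 0.005}
STOP_WORDS = {"and", "or"}          # assumed list of "common words"
DEFAULT_FREQUENCY = 0.0001          # assumed frequency for unknown words


def query_terms(query: str, exact_phrase: bool = False) -> list[str]:
    """Split a query into words, dropping stop words unless an exact phrase is wanted."""
    words = query.lower().split()
    if exact_phrase:
        return words
    return [word for word in words if word not in STOP_WORDS]


def term_weight(word: str) -> float:
    """Rarer words get a larger weight (inverse of the assumed frequency)."""
    return 1.0 / WORD_FREQUENCY.get(word, DEFAULT_FREQUENCY)


terms = query_terms("all rude and noisy")
ranked = sorted(terms, key=term_weight, reverse=True)
print(ranked)
```

With these made-up frequencies, "rude" outweighs "all", matching the slide's example, and "and" never reaches the ranking step.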

8 Standard indexing process
… is an important element to look at (is not used by Google)
Headlines
–
–…

9 Preventing discovery
robots.txt
–http://www.robotstxt.org/wc/norobots.html
Security restrictions
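
A minimal sketch of how a well-behaved crawler reads robots.txt, using Python's standard urllib.robotparser module. The rules, the directory names, the "ExampleBot" user agent and the example.com URLs are made up for illustration. Note that robots.txt is only a request to cooperating robots; it does not restrict access, which is why security restrictions are listed separately.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep all robots out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
# parse() takes the file's lines; a live crawler would use set_url() and read().
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks before fetching each URL.
public_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
private_ok = parser.can_fetch("ExampleBot", "http://www.example.com/private/report.html")
print(public_ok, private_ok)
```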

10 Improving the site rank in Google
Have other relevant sites link to yours.
–Make sure all the sites that should know about your pages are aware your site is online.
–Submit your site to relevant directories such as the Open Directory Project and Yahoo!, as well as to other industry-specific expert sites.
Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.
Offer a site map to your users with links that point to the important parts of your site.
Create a useful, information-rich site, and write pages that clearly and accurately describe your content.
Think about the words users would type to find your pages, and make sure that your site actually includes those words within it.
Try to use text instead of images to display important names, content, or links.
Check for correct HTML, formatting, etc.
If you decide to use dynamic pages (i.e., the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages.
Keep the links on a given page to a reasonable number (fewer than 100).
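
Two of the guidelines above (fewer than 100 links per page, and caution with dynamic URLs containing "?") are easy to check mechanically. The following is a rough sketch using Python's standard html.parser module. The names LinkCollector and audit_links, the report keys and the sample page are hypothetical.

```python
from html.parser import HTMLParser

MAX_LINKS = 100  # the slide's "fewer than 100" guideline


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def audit_links(html: str) -> dict:
    """Report the link count and any dynamic links (URLs containing '?')."""
    collector = LinkCollector()
    collector.feed(html)
    return {
        "link_count": len(collector.links),
        "too_many_links": len(collector.links) >= MAX_LINKS,
        "dynamic_links": [url for url in collector.links if "?" in url],
    }


# Made-up sample page for illustration.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<a href="/catalog?item=12">Item 12</a>
<a href="/sitemap.html">Site map</a>
</body></html>
"""

report = audit_links(PAGE)
print(report)
```

A check like this only flags candidates for review; whether a dynamic link is actually a problem depends on the search engine.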

11 Improving the site rank
Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Visit http://www.robotstxt.org/wc/faq.html to learn how to instruct robots when they visit your site.
–Use Google Sitemaps
Don't use "&id=" as a parameter in your URLs.
Provide high-quality content on your pages, especially your homepage. This is the single most important thing to do. If your pages contain useful information, their content will attract many visitors and entice webmasters to link to your site.
–Links tell a search engine how important your site is and what it is about.
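
As an illustration of the sitemap suggestion, here is a minimal sketch that writes a sitemap file listing a site's pages. It assumes the XML format of the sitemaps.org protocol (a urlset element containing url/loc entries); the slides do not say which format version they refer to. The function name build_sitemap and the example.com URLs are made up.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed format for this sketch).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return sitemap XML with one <url><loc>…</loc></url> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages of a small site.
sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/about.html",
])
print(sitemap_xml)
```

The resulting text would typically be saved as sitemap.xml at the site root and submitted to the search engine.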

12 Improving the site rank: be honest
Basic principles
–Make pages for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."
–Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you.
–Don't participate in link schemes designed to increase your site's ranking or PageRank. In particular, avoid links to web spammers or "bad neighborhoods" on the web, as your own ranking may be affected adversely by those links.
–Don't use unauthorized computer programs to submit pages, check rankings, etc.
Some specific guidelines
–Avoid hidden text or hidden links.
–Don't send automated queries to Google.
–Don't load pages with irrelevant words.
–Don't create multiple pages, subdomains, or domains with substantially duplicate content.
–Don't create pages that install viruses, trojans, or other badware.
–If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.

13 Final notes
Keep the structure of the site static
–It takes considerable time for search engines to revisit your site, especially if it is not top-ranked
–Links from other sites that start referring to yours could otherwise produce a “Page not found” error

