
E-insights, LLC © 2000 All rights reserved. www.e-insights.com Understanding Web Traffic Michael Whelan part - 2.


1 Understanding Web Traffic, part 2 (Michael Whelan)

2 What can complicate things
Customer side:
– ISP dynamic IP assignments
– AOL/ISP proxies
– Lack of inverse DNS
– Browser & proxy caching
– Browser weirdnesses & bugs
Server side:
– Multiple servers
– Load balancing
– Time zone & clock skew
– Re-directs
– Scripts
– Naming conventions
– Log rolling

3 Real World Complications
– Site coding conventions
– Caching
– Server farms
– Load balancers

4 Complications - Site Coding
HTML redirects
– e.g. index.html has an HTML redirect to NEWindex.html: which one should be counted?
– (the redirect can be delayed, but the delay does not show up in the log)
Frames
– index.html contains body.html, nav.html and adbox.html
– what should be counted as the page? Parts can be refreshed separately & used in other ‘pages’
Scripts
– things like gettodaysimage.cgi are not ‘obviously image’ types and can get counted
– scripts can do different things at different times, or given different input

5 Complications - Site Coding continued
What is asked for is what gets logged
– references to a bare directory (a URL ending in /) are fulfilled by invoking the server defaults. You may need to count both / and /home.asp as one page in one place, and / and /default.htm in another. Consistency when coding helps a lot.
Case sensitivity
– Are frontpage.html and FrontPage.html the same thing? On some servers the answer is yes and on some it is no. Browsers can’t take the chance, and normally neither can log analysis. Only use case WhEn It MaTtErs.
Coding data in URLs
– Can result in vast numbers of ‘once viewed pages’.
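The default-document and case issues on this slide can be handled by mapping every logged URL to one counting key before tallying. The following is a minimal sketch in Python, not the presenter's method. The function name `canonical_path` and the `DEFAULT_DOCS` list are hypothetical, and dropping the query string is an assumption that only suits sites where the data coded in the URL does not identify a distinct page.

```python
from urllib.parse import urlsplit

# Hypothetical list of default documents; a real site would take these
# from its own server configuration.
DEFAULT_DOCS = ("home.asp", "default.htm", "index.html")


def canonical_path(url, case_sensitive=False):
    """Map a logged URL to a single key for counting.

    Assumptions: the query string is dropped, and on a case-insensitive
    server the path is lower-cased before comparison.
    """
    path = urlsplit(url).path or "/"
    if not case_sensitive:
        path = path.lower()
    # Fold a request for the default document into its directory.
    head, _, last = path.rpartition("/")
    if last in DEFAULT_DOCS:
        path = head + "/"
    return path


print(canonical_path("/home.asp"))
print(canonical_path("/News/Default.htm"))
print(canonical_path("/FrontPage.html", case_sensitive=True))
```

With `case_sensitive=True` the path is left untouched, which is the safe choice when the server type is unknown.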

6 Complications - Caching
– Caching can occur in the client’s browser, in their network provider’s ‘proxy’, or in the site provider’s “cache engine” => be careful to count all but not to double count.
– Without explicit ‘Cache-Control’ headers, default rules (which can vary) apply.
– Images are particularly susceptible to caching => not good to count.
– ‘Cache-busting’ tags can make a single ‘page’ appear as many distinct pages => filter carefully.
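One way to filter cache-busting tags is to remove the known throwaway parameters from each URL before counting, so that every variant collapses back to one page. This is a sketch under stated assumptions: the parameter names in `CACHE_BUSTERS` and the function name are hypothetical, and a real site would list whatever tags its own pages emit.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical cache-busting parameter names; adjust to the site's own tags.
CACHE_BUSTERS = {"rnd", "cb", "nocache"}


def strip_cache_busters(url):
    """Return path plus query with cache-busting parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in CACHE_BUSTERS
    ]
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")


print(strip_cache_busters("/story.html?id=7&rnd=83412"))
print(strip_cache_busters("/story.html?rnd=99"))
```

Parameters that carry real meaning (such as `id` here) are kept, so distinct pages stay distinct.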

7 Complications - Server Farms
– Multiple servers to handle load & provide redundancy
– May be different types
– Configurations (what is logged, for example) may differ
– System times may not match
– Logs may be ‘rolled’ at different times

8 Load Balancers
– Spread user requests across multiple servers in an attempt to optimize response time.
– User activity can be ‘smeared’ across multiple servers (note the issues on time/config in server farms).
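When one user's requests are smeared across servers whose clocks disagree, the per-server logs have to be corrected and merged into a single time-ordered stream before visits can be reconstructed. The sketch below assumes each log is already parsed into (timestamp, line) pairs sorted by time, and that the clock offset of each server is known. Both are assumptions, and the names `corrected` and `merge_logs` are hypothetical.

```python
import heapq
from datetime import datetime, timedelta


def corrected(entries, offset_seconds):
    """Shift every timestamp in one server's log by its clock offset."""
    for timestamp, line in entries:
        yield timestamp + timedelta(seconds=offset_seconds), line


def merge_logs(logs_with_offsets):
    """Merge per-server logs (each already time-sorted) into one ordered list."""
    streams = [corrected(entries, offset) for entries, offset in logs_with_offsets]
    return list(heapq.merge(*streams, key=lambda entry: entry[0]))


t = datetime(2000, 1, 1, 12, 0, 0)
web1 = [(t, "GET /index.html"), (t + timedelta(seconds=10), "GET /nav.html")]
# web2's clock is assumed to run 5 seconds fast, so subtract 5 seconds.
web2 = [(t + timedelta(seconds=12), "GET /body.html")]

for timestamp, line in merge_logs([(web1, 0), (web2, -5)]):
    print(timestamp.time(), line)
```

Without the correction, /body.html would sort after /nav.html. With it, the request lands where it actually occurred.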

9 Idealized Analysis Flow
[Flow diagram. Stages as labelled: Format Convert, Time Order, Merge, Virtual Log, Bogus Traffic Filter, Re-write Rules, Traffic Database, Operations and Marketing reports. Annotations: report-level filter rules; server set: std. configs, clock synch.]

10 What to Count?
[Diagram. Categories as labelled: PageView, ALL, Ignored, UnCounted.]

11 Standard Excludes
– .jpg/.jpeg: JPEG images
– .gif/.giff: GIF images
– .bmp: bitmap
– .css: Cascading Style Sheets
– .class/.jar: Java class & archive files
– .js: JavaScript
– .ico: IE icon files

12 Portions of Counting Spec.
IGNORE ^/$ ^/accipiter4/.* ^/inc/.* ^/white_bg.html ^/home.html .*css$ .*class$ .*jpg$ .*jpeg$ .*gif$ .*giff$ .*class$ .*jar$ .*js$ .*ico$ ^.*_frame_.* AS SUPPORT
IGNORE /robots.txt$ AS ROBOTS
IGNORE ^/unique.html$ AS UU
COLLECT ^/home4.html ^/home4.asp AS HP
COLLECT .*_story_.* AS STORY
,,,
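A spec of this shape can be read as an ordered list of (action, label, patterns) rules applied to each requested path. The sketch below is one possible interpretation in Python, not the tool the slide's syntax belongs to. It assumes the first matching rule wins and that a path matching no rule falls into an "UNCOUNTED" bucket. It copies only a subset of the patterns, exactly as written on the slide.

```python
import re

# (action, label, patterns), in spec order. Patterns are copied from the
# slide as-is; the first-match-wins evaluation order is an assumption.
RULES = [
    ("IGNORE", "SUPPORT", [r"^/$", r"^/inc/.*", r".*gif$", r".*js$", r"^.*_frame_.*"]),
    ("IGNORE", "ROBOTS", [r"/robots.txt$"]),
    ("COLLECT", "HP", [r"^/home4.html", r"^/home4.asp"]),
    ("COLLECT", "STORY", [r".*_story_.*"]),
]

COMPILED = [
    (action, label, [re.compile(pattern) for pattern in patterns])
    for action, label, patterns in RULES
]


def classify(path):
    """Return (action, label) for a request path, or ("UNCOUNTED", None)."""
    for action, label, patterns in COMPILED:
        if any(pattern.search(path) for pattern in patterns):
            return action, label
    return "UNCOUNTED", None


for path in ["/home4.html", "/news_story_12.html", "/img/logo.gif", "/about.html"]:
    print(path, classify(path))
```

Keeping the rules as data rather than code makes it cheap to change what is counted, which matters given how costly a re-run over old logs can be.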

13 Scale
Say you do 1,000,000 page views per day.
– => 10,000,000 lines in a log file
– => 1 GB of data
– => at 1,000 lines/sec => 2.8 hours
Log processing can be time consuming.
You can get into trouble fast if you need to re-run analysis.
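The arithmetic behind these figures can be reproduced as follows. The 10 hits per page view and the 1,000 lines/sec processing rate come from the slides. The 100 bytes per log line is an assumption, implied by 1 GB spread over 10,000,000 lines.

```python
page_views_per_day = 1_000_000
hits_per_page_view = 10      # rule of thumb from the slides
bytes_per_log_line = 100     # assumption: implied by 1 GB / 10,000,000 lines
lines_per_second = 1_000     # processing rate quoted on the slide

log_lines = page_views_per_day * hits_per_page_view
log_bytes = log_lines * bytes_per_log_line
processing_hours = log_lines / lines_per_second / 3600

print(f"{log_lines:,} lines, {log_bytes / 1e9:.1f} GB, {processing_hours:.1f} hours")
```

Re-running a month of such logs at the same rate would take roughly 30 times as long, which is why a re-run gets painful quickly.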

14 Rules of Thumb
– Hits should normally be 10x page views.
– Peak daily traffic should be 4x to 6x average daily traffic.
– Bandwidth should be something like 10 x (total bytes per page) x (pages per sec); use either peak or average.
– Visits should be 0.5x to 0.1x page views.
– Visit duration can vary, as can uniques/visits.
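These ratios are easy to turn into a quick sanity check against reported numbers. The sketch below applies them as stated. The function name is hypothetical, and the slide gives no unit for the bandwidth result, so reading the factor of 10 as roughly "bits per byte plus overhead", which would make the result bits per second, is an assumption.

```python
def rule_of_thumb_estimates(page_views_per_day, bytes_per_page):
    """Apply the slide's rules of thumb to an average day's page views."""
    avg_pages_per_sec = page_views_per_day / 86_400  # seconds in a day
    return {
        "hits_per_day": 10 * page_views_per_day,
        # Visits: 0.1x to 0.5x page views, as (low, high).
        "visits_per_day_range": (page_views_per_day // 10, page_views_per_day // 2),
        # Peak traffic: 4x to 6x average.
        "peak_multiplier_range": (4, 6),
        # Unit is not stated on the slide; bits per second is an assumption.
        "avg_bandwidth": 10 * bytes_per_page * avg_pages_per_sec,
    }


# bytes_per_page=50_000 is a made-up illustrative value.
estimates = rule_of_thumb_estimates(page_views_per_day=1_000_000, bytes_per_page=50_000)
for name, value in estimates.items():
    print(name, value)
```

For peak bandwidth, multiply the average figure by the 4x to 6x peak factor, following the slide's note to use either peak or average.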

15 Miscellaneous Thoughts
– Because of ‘management pressure’, counting errors always involve overcounting.
– Re-running analysis over an extended period can be pure hell.
– Reconciliation is VERY hard & never close to 100%.
– If no one is looking at the performance on a daily basis, is it worth running the site?
– Some ‘event’ WILL occur to force the issue.

16 Do it RIGHT from the beginning & continuously MONITOR.

