September/2007 ECE/UBC - Predictable Computing Systems Prof. Sathish Golapakrishnan
Google, we’ve got a problem
Elizeu Santos-Neto
Spam
  Multiple variants: web spam, link spam, tag spam, RSS feed spam, blog spam, etc.
  Blogs are an easy target and tool
  How?
    A spam blog (example)
    Comment spam (example)
What are the effects?
  Search ranking manipulation
    Link farms
    Keyword spoofing
  User frustration: survey (Schroeder et al.)
    25% have seen colleagues kicking their computers
    2% confess to having hit the person next to them
How to tame spam?
  Content analysis
  nofollow attribute
  Spam-proof ranking strategies
  “Report Spam” buttons
  Hybrid solutions
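As an illustration of the nofollow item: a link marked rel="nofollow" asks search engines not to count it toward the target's ranking, which removes much of the payoff of comment spam. The sketch below is not Blogger's implementation; it is a minimal example, using only the Python standard library, of how a blog platform might force the attribute onto every link in a user comment. The names NofollowRewriter and add_nofollow are hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit HTML, forcing rel="nofollow" on every <a> tag.

    Sketch only: HTML comments and doctype declarations are dropped,
    which is acceptable for short user-submitted blog comments.
    """

    def __init__(self):
        # Keep entities such as &amp; unconverted so they pass through as-is.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, attrs, closing=">"):
        if tag == "a":
            # Discard whatever rel value the commenter supplied.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.out.append(f"<{tag}{rendered}{closing}")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" set on all links."""
    parser = NofollowRewriter()
    parser.feed(comment_html)
    parser.close()
    return "".join(parser.out)
```

Calling add_nofollow on a comment before storing or rendering it means the decision does not depend on whether the comment is later classified as spam, which is why this measure combines well with the other items on the slide.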
Google, we’ve got a problem!
  “Unusual” posts appeared
  Design was completely changed
  Several spam links and comments
Why did it happen?
  Hypothesis: operators ignored the messages about spam detection.
  How does Blogger's spam detection work? (intuition)
[Diagram: Spam Detector, Blogs, Blog Owner. Caption: “Where is my blog?”]
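The slides give only the intuition behind the diagram. One plausible reading, stated here as an assumption and not as Blogger's documented behavior, is a flag, notify, and timeout loop: the detector flags a blog, the owner is notified, and the blog is disabled if the owner does not respond in time. The sketch below models that reading; the Blog class, the function names, and the seven-day GRACE_PERIOD are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical value: the slides do not say how long owners get to respond.
GRACE_PERIOD = timedelta(days=7)


@dataclass
class Blog:
    name: str
    flagged_at: Optional[datetime] = None  # when the detector flagged it
    owner_replied: bool = False            # owner contested the flag
    disabled: bool = False


def flag_as_spam(blog, now):
    """Detector step: mark the blog; a real system would also notify the owner."""
    blog.flagged_at = now
    blog.owner_replied = False


def owner_reply(blog):
    """Owner step: contest the flag before the grace period runs out."""
    blog.owner_replied = True


def enforce(blog, now):
    """Periodic step: disable flagged blogs whose owners stayed silent."""
    if blog.flagged_at is None or blog.owner_replied:
        return
    if now - blog.flagged_at >= GRACE_PERIOD:
        blog.disabled = True
```

Under this reading, a false positive is harmless only if the owner reads and answers the notice within the grace period, so the scheme's correctness depends on how fast operators respond.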
Conclusions and Final Comments
  Even Google is not immune to operator failures
  Also, the mechanism seems to make wrong assumptions about the speed of operator feedback
  Spam handling turnaround time should be proportional to the volume of visitors?
  Prefixed trust set of blogs?
References
  Schroeder et al. Collecting, Analysing, and Exploiting Failure Data from Real, Large Systems. Google Tech Talks, October.
  Spam Blog: mortgage-california-ca.html
  Spam Comment: simply a link to a spam web page in the comments
  NetworkWorld.com: blog-for.html
  Risks Digest