{"id":218,"date":"2006-03-25T23:44:44","date_gmt":"2006-03-25T12:44:44","guid":{"rendered":"http:\/\/www.somethinkodd.com\/oddthinking\/2006\/03\/25\/a-snowballs-chance-of-beating-spam\/"},"modified":"2006-03-25T23:44:44","modified_gmt":"2006-03-25T12:44:44","slug":"a-snowballs-chance-of-beating-spam","status":"publish","type":"post","link":"https:\/\/www.somethinkodd.com\/oddthinking\/2006\/03\/25\/a-snowballs-chance-of-beating-spam\/","title":{"rendered":"A Snowball&#8217;s Chance of Beating Spam"},"content":{"rendered":"<p>Subscribers to my RSS comments feed may have noticed a recent increase in the number of &#8220;obvious&#8221; spams getting through. I clean them up quickly, but they sometimes get into the RSS feed.<\/p>\n<div class=\"aside\">WordPress improvement idea: A 2-hour grace period on the RSS feed for new posts, and a 36-hour grace period for comments. That would give me time to correct formatting errors in the posts and manually filter the false-negative spams before they get published to the 5-star readers.<\/div>\n<p>I chased down why these spams were getting through the highly-regarded <a href=\"http:\/\/unknowngenius.com\/blog\/wordpress\/spam-karma\/\">SpamKarma 2.2<\/a> plugin. SpamKarma has a filter called the Snowball filter. Looking at the code, it seems to have two key pieces of functionality. <\/p>\n<p>The first is based on the realisation that if you have provided a legitimate comment before, then this comment is probably legitimate, but if you have spammed before then this comment is probably spam.<\/p>\n<p>The second is based on the realisation that if you have sent a lot of comments recently, and the first few get through, but a later one is detected as spam, then the earlier ones are more likely to be spam than was first thought. The bad spam karma &#8220;snowballs&#8221;, and is applied retroactively. 
To quote a section of the code, if it detects one of the comments is spam, it will &#8220;unleash all minions of Hell on that bad boy&#8217;s company&#8230;&#8221;<\/p>\n<p>The trouble is that the detection of the comment author is a little naive. It uses a number of factors including the email address, the IP address and the domain name of the URL provided by the commenter. It is the last of these that is the problem. The Snowball filter gives too much credence to matching domain names.<\/p>\n<p>Remember when <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/2005\/12\/08\/dealing-with-common-blog-usability-problems\/#comment-2113\">Sunny Kalsi defended not having his own domain name<\/a>? Well, an unexpected downside of him using blogspot.com is that so do spammers. The spammers post to OddThinking including links to their spam blogs, also hosted on blogspot. SpamKarma looks at the &#8220;blogspot.com&#8221; domain, and notices that this is the same domain as a regular, highly-valued commenter, and figures that therefore this new comment should be let through.<\/p>\n<div class=\"aside\">I should make it clear, <a href=\"http:\/\/quaddmg.blogspot.com\">Sunny<\/a> is totally innocent here. He is the good guy in this story.<\/div>\n<p>The good news is I reported it to drDave, the author of SpamKarma, and, <a href=\"http:\/\/www.somethinkodd.com\/oddthinking\/2006\/02\/09\/blog-software-upgrades\/#comment-2898\">once again<\/a>, he was incredibly responsive. He has a solution in mind, and hopes to implement it very shortly.<\/p>\n<div class=\"aside\">So it seems that <a href=\"http:\/\/unknowngenius.com\/blog\/\">drDave<\/a> is the other good guy in this story.<\/div>\n<div class=\"aside\">For the record, the SpamKarma SnowBall filter puts the following text in the log file &#8220;Commenter granularity (based on URL):&#8221;, followed by an inappropriately high level of karma. 
I only mention this so that people searching on the web for this issue can find this description.<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Subscribers to my RSS comments feed may have noticed a recent increase in the number of &#8220;obvious&#8221; spams getting through. <\/p>\n<p>The good news is I reported it to the author of SpamKarma. He has a solution in mind, and hopes to implement it very shortly.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","footnotes":""},"categories":[32],"tags":[],"class_list":["post-218","post","type-post","status-publish","format-standard","hentry","category-about-oddthinking"],"_links":{"self":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/218","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/comments?post=218"}],"version-history":[{"count":0,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/posts\/218\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/media?parent=218"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/categories?post=218"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.somethinkodd.com\/oddthinking\/wp-json\/wp\/v2\/tags?post=218"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}