Wednesday, April 8, 2009

Cursed comment spam

From"I recently came across your blog and have been reading along. i thought i would leave my first comment. i don't know what to say except that i have enjoyed reading. nice blog. i will keep visiting this blog very often."

Sounds very genuine, doesn't it? "My first comment". Makes you feel very - touched. How nice.

A search for this phrase on Google returns 249,000 matches.

This. Exact. Phrase.

So despite the fact that it got past the Google comment anti-spammer check, it is not a real comment.
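
Because the text is identical everywhere, a blog owner could screen for it directly. The following is only a minimal sketch, not something any blogging platform is known to do: the names `SPAM_PHRASE`, `normalize` and `looks_like_known_spam` are made up for illustration, and the phrase list would have to be maintained by hand.

```python
import re

# The opening of the boilerplate comment quoted at the top of this post.
SPAM_PHRASE = (
    "I recently came across your blog and have been reading along. "
    "i thought i would leave my first comment."
)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def looks_like_known_spam(comment: str, phrases=(SPAM_PHRASE,)) -> bool:
    """Return True if the comment contains any known phrase after normalization."""
    body = normalize(comment)
    return any(normalize(phrase) in body for phrase in phrases)
```

Normalizing first means trivial changes in capitalisation, punctuation or spacing do not hide the phrase. A spammer who rewords the text would still get through, so this only helps against exact copy-and-paste runs like the one described here.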

I don't know if this is a script, bot, virus or just a plain army of low-paid turks. Whatever, it is very annoying.

I went looking at some of the 249,000 victim websites.

Many that I looked at had this exact "post" inline with many other actual real posts.

In some cases the website owners/original posters had responded back thanking "Elaina" or whoever for their thoughtful comment.

In all of those 249,000 cases, the author of the original article and the other commenters likely have no idea.

The weird thing is that, for a fairly sophisticated exploit, the websites the URLs point to are very plain - they seem to follow this pattern:
  • a basic-looking WordPress site with 2-3 posts
  • no real obvious Google traps, or large collections of advertisements
  • a commercial sort of angle, real estate and so on - but no immediate money spinner
  • site purportedly run by a single person, with a personal profile
  • some JavaScript links, all looking to have something to do with WordPress's K2 sidebar
My current theory for what is going on is that the captchas are being beaten by a "Turk" type setup.

It could work like this. I want to promote my dodgy website, and pay a service called "Evil Promotions" that promises to raise it up in Google rankings.

The service recruits a small army of low-paid workers - or perhaps they are paid in kind with gambling credits or pornography. The workers log into a web interface on the "Evil Promotions" website and click a button.

Behind the scenes Evil Promotions run a set of scripts that go around millions of websites, looking for ones that have a comment facility.

Then the automated script finds such "Victim" websites, in their hundreds of thousands, and it "clicks" on the buttons and makes a post. The script drops the very human-sounding text above into the comment field.

The "Victim Site" sends back a captcha - an image with some distorted text, which only a human could read - and the script passes the captcha on to the workers via the Evil Promotions website. The worker types the response to the captcha and clicks a button.

The Evil Promotions script sends the captcha response back to the victim website, which faithfully logs it as a human comment.

The worker just types responses, and clicks, over and over again, never seeing the websites that the responses are going to. I don't know what they could get out of such mindless, repetitive work, but I hope for their sakes it's worth it.

To me, if I'm half right, the whole thing stinks.

A low act.

Now - I have had an idea - maybe I can fight back. I just left comments on two of the websites I found, explaining what I just found. I asked the authors to do two things:
  • delete the comment
  • contact two other sites that had been spammed and suggest they do the same
Let's see what happens.


  1. I don't get it. Was there something else in their post? A URL or something that was supposed to get in there?

    There was a problem with comment spam containing links, but when Google started ignoring links with rel="nofollow", pretty much all the blogging platforms started putting that attribute into links found in comments.

    The idea being that leaving a comment on someone's site with a URL won't automatically get you some of the credibility of the original site (though you'd think Google might have a better way to distinguish author-inserted links and comment links by now).

    I get few enough actual comments on my blog that I just turned on moderation. It means I need to explicitly allow every comment to hit my blog (not something you can do if you have a high volume of commenting, I guess) but the volume I get is small enough that it is no big burden (only 2 clicks if I'm at home or work).

    I think the big scary "your comment must be moderated" messages on my pages also serve to stop the comment spammers because (while I haven't tried actual counting) I seem to get far fewer comment spam attempts than I used to get.

    Oh and for the record. I hate captchas because I seem to have more trouble doing them than a computer. I don't know why that is, something about them is hard for me to decode.

  2. I don't get it either.

    I don't understand what they're getting out of it.

    Unless the bot/script is amazingly broken, I think all it tries to do is get the URL in there.

    You're right that most of the links have 'nofollow' in them - put in there by the blogging software.

    This is an example of one that doesn't:

    This site seems to use "CodeCanvas", and since the post commented on was 2005 maybe the software is behind the times.

    Evil Promotions might count on that.

    You'd think that if running their turks has some cost, they'd be a bit more discerning about which sites they target.

    I'm going to keep digging, ask a few friends and see what I can come up with.

  3. Definitely frustrating.

    This recent article on Slashdot is somewhat apropos: while I am not sure about his people-are-too-lazy thing, I think the economics of breaking captchas is pretty compelling.

    I wonder if actual authentication of users would help? I notice that as I post here I can choose an identity from one of several sources. I suppose such a system can only be as good as the sign-up process.
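
The rel="nofollow" point raised in the comments can be checked mechanically. As a rough sketch only, using Python's standard `html.parser` module, the code below lists the links in a piece of comment HTML that lack rel="nofollow", which are the ones a spammer would presumably be hoping for. The names `FollowedLinkFinder` and `followed_links` are invented for this example, and it assumes you already have the comment's HTML as a string.

```python
from html.parser import HTMLParser


class FollowedLinkFinder(HTMLParser):
    """Collects the href of every <a> tag that lacks rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel is a space-separated list of tokens; a missing rel yields none.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.followed.append(href)


def followed_links(comment_html: str) -> list:
    """Return the hrefs in comment_html that search engines may still follow."""
    finder = FollowedLinkFinder()
    finder.feed(comment_html)
    finder.close()
    return finder.followed
```

An empty result for a comment means every link in it carries nofollow. Anything returned is a link that older or unpatched blogging software has let through untouched.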

