An audacious comment spam hack
Even when they are doing something as annoying as spamming comments, I’m always impressed when someone goes beyond the obvious and comes up with what fans of Blackadder will recognize as a cunning plan.
http://www.wblogs.com without an e is a hosting domain for the Typepad weblog service from Six Apart: rather than having whatever.typepad.com you can have whatever.wblogs.com.
Update: www.wblogs.com is actually a doubly-cunning site that pretends to be Typepad hosting, as well as pretending to be Radio hosting. Thanks for correcting me, Ben. Can’t believe I was dumb enough not to keep checking. Into the blacklist with you, wblogs.com!
Rod Kratochwill runs a Radio-powered weblog on the radio.weblogs.com hosting server.
Early this morning, someone left a couple of comments on my old Redirecting RSS redirection entry, with the name Rod Kratochwill and the URL, yep, you guessed it, http://radio.wblogs.com/0100146/ without an e. At the moment, that Typepad-hosted page looks just like Rod’s blog. Suspicious for some reason? Knock his usernumber off the URL, and radio.wblogs.com looks just like the page you’re used to seeing at radio.weblogs.com with an e, a list of the most-subscribed-to feeds. View source, and you’ll see that it’s some JavaScript cloaking the fact that it’s 10 links to incest-related porn subdomains at cykanax.com, but who views source while checking out the link someone leaves on a comment?
Sadly (for him) the spammer didn’t catch on to all the flurry of discussion of pubsub.com and other things that allow you to subscribe to an RSS feed of a search, or missed the fact that everyone with a name more unusual than “Bob Smith” is now subscribed to at least one search feed for their name. So after all that work setting up a site where he could pretend to be various Radio Userland weblogs, it only took a couple of hours after he commented here before Rod was seeing himself pop up in his aggregator as having left a couple of not-very-grammatical comments on my blog.
Most times, in a situation like this, I would fire off a few emails to webmaster@ and abuse@, but given how nicely the loosely-connected aspects of subscribing to search for names worked, and how I suspect that someone who either works for Six Apart, or will talk to someone who works for Six Apart today, reads my weblog, I think I’ll just let this entry serve as notification that the person who has radio.wblogs.com is surely violating the Typepad TOS. Well, I was dumb enough not to spot the Typepad-spoofing, but at least this part worked.
When will you blogie boys stop crying about comments? It’s the in thing, I know. Implement comments, then whine about them. Turn em on…turn em off…but for God’s sake stop crying. They’re comments…not world peace.
Hurry up and delete this comment before someone sees it…hurry up..
Oh, shoot. I forgot that I left the cable directly connecting my weblog to your brain plugged in. So sorry to have forced you to read something you didn’t want to see.
No, I delete comments that try to do something I don’t want to have done, like push a particular incest site up in SERPs, not comments that just make you look like a lackwit with too much time on your hands.
I can tell you have a lot of drive, that’s good. Now if you only had some smarts to help you along. All things considered you actually do quite well.
That is clever. Alas, the page is already 404. And even finding out who’s behind this clever scam will have to wait, as
whois -h whois.OnlineNIC.com wblogs.com
times out as the whois server is currently down.
Perhaps we need to revisit PGP-signed comments?
I don’t get it. What are they doing? Please explain slowly.
Can’t go slow, no time!
Unfortunately, they aren’t doing it anymore, or not right now for the fake-Rod URL, but, before they switched to just the actual HTML for his blog, if you went to the URL that looks like it’s Rod’s blog, but is at the “weblogs.com without an e domain” (which I now can’t say in my own comments since I blacklisted it) instead of weblogs.com, and viewed source (and put it in an editor with word-wrap), there was a ton of nonsense-looking JavaScript at the start, which your browser interprets and displays as something looking like a Radio weblog, but down below that, in the only part of the page that a search engine would see, there was nothing but ten links to porn. They comment on weblogs, leaving the weblogs.com-without-an-e URL. Careful people actually follow the URL to be sure it’s a real comment and not just spam, and see a weblog, so they leave the comment linked. Search engines crawl the comments, see a link, fetch the page, give it higher PageRank, and it transfers that PageRank on to the porn.
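To make the trick concrete, here is a minimal sketch in Python of looking at a page the way a crawler that does not run scripts would: it ignores whatever the script would draw and lists only the links present in the plain markup. The sample page, the example.com-style hostnames, and the names StaticLinkParser and static_view are all made up for illustration; the real pages were not preserved in this thread.

```python
from html.parser import HTMLParser


class StaticLinkParser(HTMLParser):
    """Collect what a non-JavaScript crawler would see: plain <a href> links."""

    def __init__(self):
        super().__init__()
        self.links = []          # hrefs found in the static markup
        self.script_chars = 0    # how much inline script text the page carries
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as raw text; they are counted, never executed.
        if self._in_script:
            self.script_chars += len(data)


def static_view(html):
    """Return (static links, number of characters of inline script)."""
    parser = StaticLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links, parser.script_chars


# Hypothetical page in the shape described above: script on top, bare links below.
SAMPLE = """<html><head><script>
document.write(unescape('%3Ch1%3ESomeone%27s%20Radio%20Weblog%3C/h1%3E'));
</script></head><body>
<a href="http://one.spam.example/">x</a>
<a href="http://two.spam.example/">y</a>
</body></html>"""

links, script_chars = static_view(SAMPLE)
print(links)
print(script_chars > 0)
```

A page whose static links have nothing to do with what the browser shows is a reasonable thing to be suspicious of, though this sketch only reports the two views and makes no judgment itself.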
If anyone is really interested I found another site that hasn’t reverted just yet and you can see the wacky script stuff when you view source in a browser:
asdf dot wblogs dot com
I also saved the wacky stuff to a file in case it’s useful to anyone.
Rod
Ah, thanks, I should have thought of that (since it’s actually the first wblogs URL that spammed me, though I wasn’t sharp enough to realize it was spam at the time). I did save a copy of not-you, too, but I doubt it’s actually useful: I’m betting that it’s the standard output of an “obscure your source so people can’t steal your valuable HTML techniques” tool.
Got it. Thanks for the explanation. I couldn’t figure out why they would go to all that trouble. It’s to fool someone who’s been comment-spammed into thinking the poster is a real blogger, and they won’t see the links to porn sites.
BTW, Radio/Manila have been relatively untouched by comment spam, until the last few days. So we’re going to be joining you soon, brother-in-spam-to-be. ;->
Yup. I follow every link out of every comment (unless I think I recognize the URL, which is a dangerous habit of mine: I wouldn’t follow scriptng.com with your name and email), and either delete the link or the whole comment if I don’t think that what I find fits. But the first one of these got me, since even though it didn’t make sense that someone using Typepad would be importing YACCS comments in Movable Type, it looked like the blog of someone whose name was vaguely familiar to me, and www dot wblogs dot com looked like a Typepad signup page, so I left it. D’oh, Garrett Rooney wouldn’t be needing my crappy little hack to be getting anything done, but I was in a hurry, and got fooled.
“Registrar Name: ONLINENIC, INC. Address: 8/F, Huiyuan Group Building 267th Jiahe Road, Xiamen, Fujian 361012, CN”. Which means that they are outside the reach of law, I suppose.
This one is neither audacious, nor particularly clever. But it is online for all to use.
What a thoughtful, public-spirited spammer!
Somehow, I’m just a little reluctant to pop a URL and entry_id in there, to find out what it does.
Well, sure, I know without asking or trying it what it does, but…
The URL you pop in is the URL of a MovableType comment CGI script. The entry_id is the entry_id of the post you are going to spam.
The result of posting the form is 3 clickable links which, when clicked, POST 3 different SPAM comments to that entry of that MovableType blog.
For example, putting in the URL of my comment-entry form yields three links which point to:
Very, very, simple-minded, but a real public service for those spammers too stupid to write their own spambot, no?
Not entirely a public service, unless the source is available somewhere so you can plug in your own spam URLs, but interesting in that it points out that MT will accept a GET for comment submission.
Let me say that again, to be sure I really meant it: MT will accept a GET for comment submission.
Want to spam someone’s comments, but they’ve banned your IP address, and getting set up to use anonymous proxies is too much work? Just put the URLs in a web page, and let Googlebot spam them by GETting what it thinks are pages that should be got.
Sheesh. Sheesh on a shingle.
Let me try that one more time to be sure I still mean it: MT will accept a GET for comment submission.
Oy. That thing about requiring people to decode a CAPTCHA piped in from a Yahoo Mail signup in order to get to free pr0n? Well, they’d be happy to have a 1×1 iframe that GETs a random MT weblog’s comment script, too. Hell, put one on every page.
{mtdir}/lib/MT/App/Comments.pm line 60, after the start of sub post:
I’d simplify that to
If that environment variable isn’t set, then something very screwy is going on …
But, yeah, this is a bit worrisome.
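As a rough illustration of the idea under discussion (refusing comment submissions that do not arrive by POST), here is a minimal sketch in Python rather than MT’s Perl. It assumes a CGI-style environment in which the standard REQUEST_METHOD variable carries the HTTP method; the function name require_post and the error text are hypothetical, and this is not the patch discussed in this thread.

```python
import os


def require_post(environ=None):
    """Return None if the request is a POST, otherwise an error message.

    `environ` defaults to the process environment, which is where a CGI
    script finds REQUEST_METHOD. A missing variable is treated as a failure.
    """
    environ = os.environ if environ is None else environ
    method = environ.get("REQUEST_METHOD", "")
    if method.upper() != "POST":
        return "Invalid request: comments must be submitted with POST."
    return None


# A crawler following a link sends GET, so it is turned away.
print(require_post({"REQUEST_METHOD": "GET"}))
# A real comment form submission sends POST and passes the check.
print(require_post({"REQUEST_METHOD": "POST"}))
```

Treating an unset variable as a refusal is a deliberate choice in this sketch: if the server is not saying how the request arrived, accepting the comment anyway seems like the wrong default.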
You’ll note that trackback pings, too, can be sent with a GET. The Spec says,
By my calendar, it’s February of 2004. Support for sending Trackback pings using HTTP GET was removed from my MT installation only this evening.
I’ll never get used to Perl. I started with !=, like any PHP hack would, and of course it didn’t work.
My vague memory on the Trackback-GET thing is that the unexpected fact that IIS doesn’t bother to give Perl the PATH_INFO is why GET is still supported. If you use URLs like /mt-tb.cgi/500 on IIS, as far as it’s concerned you didn’t supply a Trackback ID, so you have to switch it over to the old-style query-string URL, and then to be backwards-compatible with installations that haven’t upgraded, MT has to switch to sending GETs when it sees a TB URL with a query string, so new installations on IIS have to accept GETs, at which point any reasonable person gives up on the whole thing. I can’t imagine that refusing GETs will cause you any troubles, and if I’m right it really only needs to be enabled if ($ENV{SERVER} eq 'Crap').
In Perl, “!=” is for comparing numbers, “ne” is for comparing strings. Go figure …
I suppose that, if you are running IIS, you have bigger things to worry about than being trackback-spammed by the GoogleBot.
Anyway, I posted a patch for disallowing GET in both cases.
And a good thing you did, too, since I’d forgotten to patch this install, so my test blog was nicely protected but this one would be… oh, yeah, getting forced previews ;)
In the Trackback patch, shouldn’t it fail with
return $app->_response(Error => "Please use POST to send a Trackback Ping.");
so it returns a proper XML Trackback error? I admit, I’ve never looked at the code, but I assumed that was why comments just return $app->error, but pings return $app->_response(Error => "Whatever")
Guess I should look.
Ack! I always screw that one up.
Good catch! Fixed.
That’s actually a pretty fair nutshell of the difference between Perl and PHP, for me. PHP uses == and != for what you mostly want to say, “if these two things were showing on the screen, would I say they are the same?” and has less common operators, === and !==, for “are they equal and of the same type”, which you sometimes but not often need.
I don’t doubt for a second that Perl results in more correct code, but then, my Perl is more correct because I usually give up on it before I even start.
Not Wonderyak: http://www.blogstudio.net/wonderyak/index.html
Wonderyak: http://www.blogstudio.com/wonderyak/index.html
jayallen::Blacklist->add('wblogs.com');
jayallen::Blacklist->add('blogstudio.net');
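For anyone wondering what blacklisting a domain has to catch here, a minimal sketch in Python: the check needs to match the bare domain and any subdomain of it (radio.wblogs.com, asdf.wblogs.com) without also hitting the legitimate look-alike. The names BLACKLIST and is_blacklisted are hypothetical, and this is not how any particular blacklist plugin is implemented.

```python
from urllib.parse import urlsplit

# The two domains added above; purely illustrative data.
BLACKLIST = {"wblogs.com", "blogstudio.net"}


def is_blacklisted(url, blacklist=BLACKLIST):
    """True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Require an exact match or a match on a dot boundary, so that
    # "weblogs.com" is not caught by the entry for "wblogs.com".
    return any(host == d or host.endswith("." + d) for d in blacklist)


print(is_blacklisted("http://radio.wblogs.com/0100146/"))
print(is_blacklisted("http://radio.weblogs.com/0100146/"))
print(is_blacklisted("http://www.blogstudio.net/wonderyak/index.html"))
print(is_blacklisted("http://www.blogstudio.com/wonderyak/index.html"))
```

Matching on the dot boundary is the point of the sketch: a plain substring test on "blogs.com" or similar would take out the real sites along with the impostors.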
Further audacity!
http://www.sixapart.org
Trying to cash in on good names all over!
But… but… that’s evil!
And then Shell’s going to want Ben to fix some bugs in exchange for her signature on your copy, and who knows what Cory and Scott and Rael will want.
I suppose the lesson is to register whatever’s around you, certainly com/org/net, whether you think you need to or not. I don’t feel any need for philringnalda.org (I’d rather have philringalda.com and my other usual misspellings), and can’t imagine anyone else wanting it, but that’s just it: I can’t imagine it. And I never would have thought to turn radio.wblogs.com into a cloaked comment spam target. (Now, other than the fact that it’s going to be unsearchable and blacklisted in comments for years to come, I’d rather like to have it, though. There must be some slightly-less-evil use for it.)
Don’t GET it!
Quite by accident, I discovered that one can post comments to MovableType blogs using HTTP GET requests (instead of the…