URLs including zipcode are prohibited

Among the email that I was absolutely certain would be awaiting my return was a comment notification telling me that my comments had been spammed by a zipcode lookup program salesman. Again. It seems remarkable, given the sky-is-falling feeling I had the first time, but virtually all of my comment spam comes from just one person, spamming one thing. (And I’m far from alone: Luke, if you leave comment spam undeleted, God kills twenty kittens.)

So, no more. In /{your MT directory}/lib/MT/App/Comments.pm, at line 95 (in an unhacked 2.64) is


    if ($url) {
        require MT::Util;
        if (my $fixed = MT::Util::is_valid_url($url)) {
            $url = $fixed;
        } else {
            return $app->handle_error($app->translate(
                "Invalid URL '[_1]'", $url));
        }
    }

Alter that to:


    if ($url) {
        require MT::Util;
        if (my $fixed = MT::Util::is_valid_url($url)) {
            $url = $fixed;
        } else {
            return $app->handle_error($app->translate(
                "Invalid URL '[_1]'", $url));
        }
        # Reject any (already validated) URL that contains the spammer's keyword.
        if ($url =~ m/zipcode/) {
            return $app->handle_error($app->translate(
                "URLs including zipcode are prohibited"));
        }
    }

If he adds a new URL that includes zip-code or postalcode, it’s easy enough to add that to the regex. As horrible as the potential for comment spam seemed nine months ago, it’s pretty amazing that such a simple hack would have blocked nearly every bit I’ve gotten since. I just hope nobody’s registering blogging-my-zipcode.com while I type this, since I’m no longer accepting any URL including those seven deadly letters.
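For what it’s worth, here is roughly what adding to the regex might look like, as a standalone script you can run outside of MT. It is only a sketch: the helper name url_is_banned and the .example test URLs are made up for illustration, and the case-insensitive /i flag and the optional hyphen are my own assumptions rather than anything in the hack above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed keyword list: "zipcode" from the hack above, plus the
# "zip-code" and "postalcode" variants. The /i flag (ignore case) and
# the optional hyphen are additions for illustration only.
my $banned = qr/zip-?code|postal-?code/i;

# Hypothetical helper, not part of Movable Type: returns 1 when the
# URL contains one of the banned keywords, 0 otherwise.
sub url_is_banned {
    my ($url) = @_;
    return $url =~ $banned ? 1 : 0;
}

# Try it against a few sample URLs (the .example hosts are invented).
for my $url (
    'http://example.com/',
    'http://moronzipcode.com',
    'http://www.Zip-Code-lookup.example/',
    'http://postalcode.example/',
) {
    printf "%-40s %s\n", $url, url_is_banned($url) ? 'blocked' : 'ok';
}
```

In Comments.pm itself, the equivalent change would presumably be just swapping m/zipcode/ for the longer pattern; nothing else in the block needs to move.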

24 Comments

Comment by franCk #
2003-07-30 01:44:11

I bet the solution here is ‘graphical’ registration: you know, those distorted graphics that display a series of numbers or letters.
You would need to type the series before your post is accepted – et voila!
No need for registration, and only a human brain can interpret the graphics… of course this only solves the problem of ‘repeated’ spam – but I bet it is the most annoying.

Comment by Kafkaesquí #
2003-07-30 04:52:49

“There’s none so blind as they that won’t see.”

An issue with using a method like this is how it affects the visually impaired. So it’s a solution, but one that would lock out a portion of the non-spamming audience. Never a good thing.

Comment by Phil Ringnalda #
2003-07-30 09:03:55

Exactly: the whole comment spam thing started just a few months after Mark’s Dive Into Accessibility series, so at that point I instantly knew that those things (which have a name that escapes me) were out the window. When you are a big company with lots of employees offering a free service, you can maybe get away with forcing blind users to follow an alternate link that involves a human verifying that they are human (hrm, do they actually do that, and how?), but for comments, where any extra steps mean that you’ll get fewer comments, it’s just not an option.

 
 
 
Comment by Luke Hutteman #
2003-07-30 07:54:18

Thanks Phil – I remember when that comment came in I just wondered what kind of moron would respond “count me in please” to a 3-month-old post, but I just shrugged and forgot all about it, not realizing it was link-spam. I’ve removed that comment now and blocked the IP.

By the way, your spam-block code can easily be circumvented by simply linking to some spam site from the comment body. I guess the only way around that is to add code to scan for links to forbidden URLs in the comment (or to disallow HTML in comments altogether, of course).

Comment by Phil Ringnalda #
2003-07-30 09:11:56

Could be circumvented, but not exactly easily. I assume that ZipcodeBoy has scripted his comment spamming (because it would be beyond insane not to have), and even for a human it’s damn near impossible to find out whether HTML is allowed in comment bodies. For a script it would be miserable (you would have to script in preview parsing, bleah), and while pretty much every comment form accepts a URL and converts it to a link, a huge number don’t accept HTML, so switching to the body loses you a ton of potential victims.

The other reason URL is better for spamming is exactly why it very nearly worked with you: if you got a comment on an old entry saying “Count me in. <a href="http://moronzipcode.com">postal code</a>” you would have instantly known it was spam, but most people don’t actually look at the URL of new senseless comments at all.

Comment by Luke Hutteman #
2003-08-26 14:01:14

ZipCodeBoy is not alone – I recently had to remove 2 more spam comments – one of which included a ton of links in the description to a bunch of porn sites, the other being a standard “me too!” type entry (but not to a zipcode site).

Thankfully I have not been spam-bombed yet – all spam comments on my site so far have been for single entries. I wish there was an easier solution than changing the MT source-code every time this happens tho…

Maybe Ben and Mena could add banned URLs to the MT Config for a future version of Movable Type.

 
 
 
Comment by Bill Kearney #
2003-07-30 08:49:03

Another way to handle this is to prefix the outbound links when you render the pages. This way you can control (and track) what URLs are being used. Choke off the ones trying to point to problematic sites at that point as well.

Comment by Phil Ringnalda #
2003-07-30 09:20:30

Prefix as in run them all through a redirecting script? It’s an idea with some nice benefits, and would make for a smoother hack, but I don’t like the loss of referer data (I’ve been pulled into lots of interesting comment threads when someone links to one of my entries), and unless you make an effort to shed Googlebot along the way (robots.txt, or as I read the other day, just include an “id=” variable in the query string), you would probably end up passing even more PageRank, with every comment link pointing to one page on your site, which then appears to Googlebot to point to one other site at a time. Dunno.

 
 
Comment by Pat #
2003-10-07 15:12:32

Thanks for the comment Phil. I installed Stepan’s hack as recommended and it works a treat!

 
Trackback by Neil's World #
2003-07-30 11:03:43

Blocking Zip Code Spam

Phil Ringnalda is offering an MT Hack which blocks Zip Code spam in post comments. I was hit by this myself in the dim and distant past, but it hasn’t happened recently, so that’s one reason why I haven’t done this myself.

 
Trackback by padawan.info #
2003-08-28 13:21:55

Get your zipcode away from my weblog

Enough is enough. I have received so far only five spams through dummy comments on this site, one was some…

 
Trackback by padawan.info/fr #
2003-09-02 00:20:00

Antispam for Movable Type

This post is a translation of Phil Ringnalda’s URLs including zipcode are prohibited, which I have extended a little to include the dismal idiot who pollutes my comments with his viagra. This method lets you exclude certain words from URLs…

 
2003-09-24 07:44:40

Killing Comment Spam Dead

In response to a LazyWeb request, I present a new low-impact high-efficacy solution for preventing comment spam and killing ones after the fact. Movable Type and three MT plugins are required, but no source code hacks are used.

 
Trackback by Jeremy Zawodny's blog #
2003-09-28 17:54:08

Blog Comment Spam on the Rise

Yesterday Jay noticed that I was having a comment spam problem. A few low-life moron assholes have been using my blog to try to boost the PageRank of their various businesses: search engine optimization, porn, and cheap prescription drugs. He suggested…

 
Trackback by Brainstorms and Raves #
2003-10-04 15:03:03

Friday Feast #61: Unwanted Comments

I was absolutely horrified when I read Phil Ringnalda’s comment spam alert story last year in which a Las Vegas real estate agent used a script to try to autogenerate comments to every single one of Phil’s entries, including links to the spammer’s real…

 
Trackback by MT Extensions #
2003-10-06 16:26:07

Avoid Comment Spam

Use a blacklist that contains spamming URLs to automatically prevent a comment spammer from using your blog as a link reference to their site.

 
Trackback by A Blog's Life #
2003-10-07 01:51:00

Terminator spree

Okay, enough is enough. When the spammers are posting to your blog more often than you are then it is

 
Trackback by You Who? #
2003-10-08 09:40:21

Die Comment Spam, Die!

For like the last two weeks I have woken up every morning to a piece of comment spam on my…

 
2003-10-12 07:30:11

Blogging: Comment Spam

Like practically everybody else in the blogsphere at the moment, I’m suffering quite a bit of comment spam: I had to block my first IP address yesterday – and now I’m blocking the following 7 IP addresses: 209.210.176.19 209.210.176.20 209.210.176.21 2…

 
2003-10-13 17:31:02

Killing Comment Spam Dead

In response to a LazyWeb request, I present a new low-impact high-efficacy solution for preventing comment spam and killing ones after the fact. Movable Type and three MT plugins are required, but no source code hacks are used.

 
Trackback by tUP | Blog #
2003-10-16 01:21:25

Moving Forward with Movable Type

As many of you who maintain public weblogs with Movable Type know, there’s an increasing problem with comment spam on… [Movable Type News] I’ve been using this hack from phil ringnalda dot com (link via Padawan.info) on this weblog…

 
Trackback by Jeremy Zawodny's blog #
2003-11-03 07:25:49

Blog Comment Spam on the Rise

Yesterday Jay noticed that I was having a comment spam problem. A few low-life moron assholes have been using my blog to try to boost the PageRank of their various businesses: search engine optimization, porn, and cheap prescription drugs. He suggested…

 
Trackback by Yes, I know it's boring #
2004-11-11 09:45:00

Blocking some BLOG SPAM

This handy URL gave me the initial code to start with. The code samples started off with an older version of Movable Type, but I’ve updated it a bit and come up with something that will block the BLOG SPAM…

 
Trackback by A Day After Yesterday #
2004-11-22 02:59:42
 
