Spammers are lazy

Last July, I wanted to prove that simplistic protection of email links by just escaping them as numeric character references (&#97;&#64;&#98;&#46;&#99;&#111;&#109; to produce a@b.com) was a lousy idea. How could it not be? Even without any economic incentive, it wouldn’t take me long to write the code needed to harvest them just fine. So I put an encoded SpamMotel address in my sidebar, along with a fresh address in the unprotected part of my accessibly spamproofed address. I figured it wouldn’t take long before the encoded address was getting just as spammed as the other.

This morning, when I got my third actual email through the encoded one (I guess the “Harvester Test” headline wasn’t quite clear enough), I finally remembered to turn it off and take it out. The final tally, for the encoded address: 46 spams, 3 actual emails; for the unencoded address: 2632 spams. Apparently, if you don’t have time to really harden an address, it’s worth taking the time to at least convert it to NCRs. Lazy spammers.
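
For anyone who wants to try the same thing, here is a minimal sketch of the encoding, plus the few lines a harvester would need to undo it. The function names `toNcr` and `fromNcr` are made up for this illustration; they are not from any library or from the original test.

```javascript
// Encode every character of a string as a decimal numeric character reference.
function toNcr(text) {
  return Array.from(text, (ch) => "&#" + ch.codePointAt(0) + ";").join("");
}

// Decode decimal (&#97;) and hexadecimal (&#x61;) references back to characters.
// This is roughly all a harvester would have to add to defeat the encoding.
function fromNcr(markup) {
  return markup.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (match, ref) => {
    const isHex = ref[0].toLowerCase() === "x";
    const codePoint = parseInt(isHex ? ref.slice(1) : ref, isHex ? 16 : 10);
    // Leave out-of-range references untouched instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

const encoded = toNcr("a@b.com");
console.log(encoded);          // &#97;&#64;&#98;&#46;&#99;&#111;&#109;
console.log(fromNcr(encoded)); // a@b.com
```

The decoder is one regular expression, which is why the result of the test is surprising: the work needed to harvest encoded addresses is small, and most harvesters apparently still don’t do it.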

16 Comments

Comment by Rob... #
2005-03-07 01:00:40

I’m amazed! I expected the same as you in that the encoded one should have received around the same number of spams.

How do you “really harden an address” though?

Comment by Phil Ringnalda #
2005-03-07 08:10:16

The “accessibly spamproofed” link goes to a post I wrote almost three years ago (which means I should have looked at how embarrassing my JavaScript was before linking to it), about writing in the real address with JavaScript from three out-of-order pieces. You do have to be willing to have a throwaway address for non-JavaScript browsers (and spambots), but while I’ve gone through I think five of those, throwing them away after they’ve been harvested enough to get a couple thousand spams, I’ve gotten exactly one spam at the protected address, and only two real emails that used the throwaway. It’s an ugly hack, but it beats having to have my public address be “2004JanToMarOnly@”.
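
A rough sketch of the out-of-order-pieces idea, not the script from the linked post. The address, the element id `contact`, and the helper name `assembleAddress` are all placeholders invented for the example; it assumes the static markup already holds a throwaway address for browsers without JavaScript.

```javascript
// Hypothetical pieces, stored in a different order than they are used,
// so the full address never appears as one string in the page source.
const pieces = ["example.com", "someone", "@"];
const order = [1, 2, 0]; // user, separator, domain

function assembleAddress(parts, indexes) {
  return indexes.map((i) => parts[i]).join("");
}

// In a browser, overwrite the throwaway address with the real one.
if (typeof document !== "undefined") {
  const link = document.getElementById("contact"); // hypothetical element id
  if (link) {
    const address = assembleAddress(pieces, order);
    link.href = "mailto:" + address;
    link.textContent = address;
  }
}
```

Anything that reads the markup without running scripts only ever sees the throwaway address.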

 
 
Comment by Anne #
2005-03-07 03:24:19

I have used hexadecimal character references on several websites and received only a couple of spam e-mails. (On some no spam at all.)

(Somehow this entry feels like deja vu. I believe you posted something similar before. Or was it in the comments?)

Comment by Phil Ringnalda #
2005-03-07 08:16:10

Or was it in your comments, talking about using NCRs? It’s only a matter of time until I start repeating myself: when I go back to read posts from three or four years ago, I’m always amazed by how enthusiastic and naive the guy who wrote them was, but I don’t really remember being him.

 
Comment by jon #
2005-03-11 13:12:28

anne, i was thinking the same thing (about deja vu).

 
 
Comment by Shawn #
2005-03-07 08:07:33

How do you “really harden an address” though?

You don’t use it…
Hardly a good solution though. Amazing that the simple encoding worked as well as it did. Thanks for sharing the results - great exercise.

 
Comment by Greg #
2005-03-07 08:16:59

There are plenty of methods to encode or obfuscate an email address - I prefer contact forms myself, but combining two or more, such as javascript plus css reversal, works pretty damn well.
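
One possible reading of “javascript plus css reversal”, as a sketch only: the address is stored backwards in the markup, and CSS flips it for display. The function names are invented here, and the example assumes the address contains no characters that need HTML escaping.

```javascript
// Reverse a string by code point, so the markup never holds the address in reading order.
function reverseText(text) {
  return Array.from(text).reverse().join("");
}

// Wrap the reversed text in a span whose inline CSS renders it right-to-left,
// which makes a browser display it in the original order.
function reversedSpan(address) {
  return '<span style="unicode-bidi: bidi-override; direction: rtl;">' +
    reverseText(address) + "</span>";
}

console.log(reversedSpan("someone@example.com"));
```

The known cost is that copying the displayed text gives the reversed string, so visitors have to retype the address.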

Comment by Michael Koziarski #
2005-03-07 18:25:34

Greg,

The last comment form I used got spammed to hell anyway. Do you use CAPTCHAs or anything like that?

 
 
Comment by Pete Prodoehl #
2005-03-08 06:26:00

You do realize this is a trick right? Those dang spammers are just waiting… waiting for someone to say “Hey, use numeric character references, it’s pretty darn safe!” And now that you have done so, they will unleash hell and the harvesting will begin.

 
Comment by Clay Dowling #
2005-03-09 06:35:07

While I’ve used the email address encoding, my real address is in wide circulation and I get one to two hundred pieces of spam per day. My user id is also a common male name, so I get hit by all the dictionary attacks. The solution for me was to install dspam, a Bayesian filter that works at the mail server level. My spam volume hasn’t gone down, but I receive considerably less in my inbox. dspam is currently running at 98%+ accuracy, which means I get on average two spams a day that make it through to my inbox.

 
Comment by Recipher #
2005-03-09 19:11:21

That sounds pretty good, Clay. Does anyone know about using JavaScript encoding functions instead? I’m pretty sure harvesters can’t decode those YET.

I’ve been using this and I barely get any spam.

http://automaticlabs.com/products/enkoderform/

 
Comment by iwanttokeepanon #
2005-03-14 15:17:18

I have used http://www.iki.fi/petterik/jpgind.html from time to time to create image galleries.

Of course there is no real hardening, but if an NCR encoding works well, this one should work even better. The harvester could not use a regex to translate this… it’d need a full-blown JS engine.

If I ask jpgind to encode my email address, this is what I get (BTW, visit http://popfile.sourceforge.net/) :

<script language="JavaScript" type="text/javascript">
<!--
var Rh = new Array('p','%','a','r','u','0','y','f','m','t','b','2','e','n','n','2','c','W','?','d','n','i','.','d','o','a','j','s','E','l','t','i','o','W','I','o','2','%','e','g','0','l','v','0','e','=','0','e','r','n','l','@','t','n','e','a','n',':','o','%','%','a','l','S','g','k','0','w','t','r','e','e','u','n','a',':','%','a','2','2','b','i','d','e','n','t','i');

var E = new Array(8,77,81,50,9,24,57,31,67,61,49,52,30,35,65,44,71,0,55,20,58,53,51,7,69,32,73,85,22,48,4,18,63,72,80,26,38,16,68,45,17,47,10,76,15,43,39,2,41,62,70,3,6,75,60,11,40,28,21,29,83,12,13,59,78,46,74,84,82,1,36,66,34,42,25,56,27,37,79,5,33,54,19,23,86,14,64);

document.write('<a href="');
for (var i = 0; i < Rh.length; i++) {
  document.write(Rh[E[i]]);
}
document.write('">Feedback</a>');
// -->
</script>


L8r.
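
The script above is only the decoding half. A sketch of how such a pair of arrays could be generated follows; this is a guess at the general technique, not jpgind’s actual code, and the names `scramble` and `unscramble` are invented for the example.

```javascript
// Shuffle the characters of a string and record where each one went.
// `shuffled` plays the role of Rh above and `order` the role of E.
function scramble(text) {
  const chars = Array.from(text);
  const order = chars.map((_, i) => i);
  // Fisher-Yates shuffle of the slot numbers.
  for (let i = order.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  // The k-th original character is stored in slot order[k].
  const shuffled = new Array(chars.length);
  order.forEach((slot, k) => {
    shuffled[slot] = chars[k];
  });
  return { shuffled, order };
}

// Same lookup as the loop above: read shuffled[order[k]] for each k in turn.
function unscramble(shuffled, order) {
  return order.map((slot) => shuffled[slot]).join("");
}

const result = scramble("mailto:someone@example.com");
console.log(unscramble(result.shuffled, result.order));
```

A harvester that doesn’t run scripts sees only the shuffled letters, but anything that does run the page gets the address back immediately.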

 
Trackback by Flashes of Panic #
2005-03-15 09:00:56

Confirmation

Phil Ringnalda ran a trial obfuscating his posted email address with the same entity-encoding method I spelled out here, and discovered that spammers are lazy: apparently this quickie obfuscation method is remarkably effective.

 
Comment by Anonymous #
2005-04-01 04:53:41

Some JS like this works fine:

var a = "lto:some";
var b = "one@b";
var c = "l";
var d = "ah.c";
var e = "om";
// Writes "mailto:someone@blah.com" into the page
document.write("mai" + a + b + c + d + e);

Unless the spammer is evaluating web content the JS will be meaningless. For extra protection, you could do this evaluation from the onclick handler instead of loading. Unless the spammer clicks on the link, they just won’t get your addr.

For those few people with no JS, put a NOSCRIPT version of the addr that contains a few junk chars and instructions for the person on how to remove them.
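
A sketch of the onclick variant described above, under some assumptions: the link starts out with a placeholder href such as `#`, the id `contact-link` and the address are hypothetical, and `buildMailto` and `protectLink` are names made up for this example.

```javascript
// Hypothetical fragments; no single string contains "mailto:" or a full address.
const fragments = ["mai", "lto:", "someone", "@", "example", ".com"];

function buildMailto(parts) {
  return parts.join("");
}

// Fill in the real href only when the link is clicked, so the address
// never exists in the document until a person asks for it.
function protectLink(link, parts) {
  link.addEventListener("click", () => {
    link.href = buildMailto(parts);
  });
}

if (typeof document !== "undefined") {
  const link = document.getElementById("contact-link"); // hypothetical id
  if (link) {
    protectLink(link, fragments);
  }
}
```

The click handler runs before the browser follows the link, so the visitor’s mail program opens with the assembled address.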

 
Comment by Anonymous #
2005-04-01 05:03:17

A second comment (I mentioned the JS above). Don’t use enkoder or at least rename the hiveware_enkoder function to something random. Why? A spammer need only google for hiveware_enkoder, strip out and execute the JS and they’ll get a whole bunch of supposedly hardened email addresses. Hell, a malicious spammer might even do it for the amusement value alone.

 
Comment by Greg #
2005-07-01 11:56:24

Well, four months later, here’s my response. :-)
I don’t use any CAPTCHAs or anything - part of my success against comment spam may be that I coded my own blog from scratch rather than using WordPress or Movable Type. Perhaps the bots can’t figure out which fields to fill. And I’ve got hidden links in the page which don’t display to humans but lead here - to a page I wrote which spits out fake email addresses. That one’s intended to keep my contact form clean by sending spambots there before they reach my contact page. Works so far - no comment spam and no contact form spam.
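
A guess at what a fake-address page might do, not the page mentioned above. The helper names are invented, and the reserved `.invalid` top-level domain is an assumption chosen here so that no real mailbox can ever receive the resulting spam.

```javascript
// Build a random lowercase word of the given length.
function randomWord(length) {
  const letters = "abcdefghijklmnopqrstuvwxyz";
  let word = "";
  for (let i = 0; i < length; i++) {
    word += letters[Math.floor(Math.random() * letters.length)];
  }
  return word;
}

// Produce `count` throwaway addresses under the reserved .invalid domain.
function fakeAddresses(count) {
  const list = [];
  for (let i = 0; i < count; i++) {
    list.push(randomWord(8) + "@" + randomWord(10) + ".invalid");
  }
  return list;
}

console.log(fakeAddresses(5).join("\n"));
```

A server-side page could print a fresh batch on every request, with each entry wrapped in a mailto link for harvesters to collect.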

 