No unread items

In the past, when I got back from my annual offline vacation, my biggest problem was plowing through email and spam. This time, that wasn’t too bad, but the unread item count in Bloglines was a killer: 5250 unread items. A few hundred, certainly less than a thousand, were just things like stale weather reports, or Wired articles that someone else would have linked if they were worth reading, but still… Most of a week of solid reading, to get to that blessed “No unread items.” Herewith, a nice fat linkdump of some (though certainly not all) of the things that caught my attention:

Anne on the sensible step of taking a site with no use for XHTML back to HTML. What struck me, though, was the assertion that using numeric character references to encode an email link worked perfectly for avoiding spam harvest. After I told Luke the other day that they don’t work, I put one in my main page with a throwaway address (got to work on that pontificate-experiment order thing), and got spammed in 9.5 hours. Other than an unknown difference in how often each page gets crawled by harvesters, the only difference I can see is that his is encoded with hex character references, and mine with decimal. Curious.
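For the curious, here is a minimal sketch of the two encodings in Python. The address and the `encode_ncr` name are made up for illustration; this is not how either page was actually built. It only shows that anything running a standard HTML unescape gets the same address back from both forms, so the encoding alone would not obviously explain the difference.

```python
import html


def encode_ncr(text: str, hexadecimal: bool = False) -> str:
    """Encode every character as an HTML numeric character reference."""
    if hexadecimal:
        # Hex form: &#x6d; for "m"
        return "".join(f"&#x{ord(ch):x};" for ch in text)
    # Decimal form: &#109; for "m"
    return "".join(f"&#{ord(ch)};" for ch in text)


address = "mailto:throwaway@example.com"  # hypothetical throwaway address
decimal_form = encode_ncr(address)
hex_form = encode_ncr(address, hexadecimal=True)

# An HTML-aware consumer decodes both forms to the identical string.
assert html.unescape(decimal_form) == address
assert html.unescape(hex_form) == address
```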
Comment Spam, Again
“In line commenting is an essential element of the emerging transparency which makes online communication interesting, and possibly revolutionary. So I’ll deal with the spam protections, though I’d rather just see them shot.” Well said!
Feeds Without Dates and Being Too Clever for Your Own Good
Well, sometimes the workarounds for feeds with missing info work, sometimes they don’t. I probably wouldn’t have ever thought to adjust the faked date of an item without a real date to fit with another item that links to it, but if you told me you were going to try it, I also wouldn’t have guessed the pitfall.
Atom and Cool URIs: dogma, idealism, expediency
Every time I read about Atom Co-Chair Tim Bray saying essentially that if your permalinks aren’t absolutely certainly fixed for all time and thus suitable for Atom ids, you should make it so, as though that was all it takes, I worry for Atom’s future. A tiny shred of understanding about how things are for mere mortals would do wonders.
RSS Scaling Problems: How Can We Help?
Some good and some other ideas. Maybe with our next feed format, we’ll really realize that it’s all about entries, and come up with a workable way for a client to say “I know about entry {id} and previous as of {datetime}, got anything newer or any changes for me?”
Sleeping better and blogging more
Sounds like Russ is quite happy with comment moderation. Of course, the fact that one of the comments on that entry looks to me to actually be comment spam is a bit worrisome, but still…
GMail Notifier Extension 0.3.2
Handy Firefox extension for those with GMail who aren’t using it to subscribe to the atom-syntax list (as Bill de hÓra put it in Freak Atom Occurence, “There’s been no mail from the atom-syntax list in the last 90 minutes or so. How odd.”). atom-syntax means never having to wonder whether you’ve got unread mail.
There Be Dragons Here
“Phil’s ‘well-engineered stuff under basic presentation’” – damn, that’s the nicest thing anyone’s ever said about my total lack of design ability or sense!
The trouble with comments
“Every time I have to wade through a pile of comment spam pointing to sites that sell degradation and the sexualization of misery I feel a little more depressed. At some point in the past few months, I passed out of the relativistic bubble I’d sealed myself into as that sort of stuff passed through my inbox and over my pages and into a state of anger and sadness.” Me, I rarely think about the actual what of what they’re spamming, but, “Maybe it’s about having Ben in the house now and the involuntary process I go through, as have other parents I’ve talked to, when I’m exposed to something I might have previously blown off.”
WordPress Gotcha
Mostly just a note to myself, to look into why using HTML entities in entries would be making Dorothea’s XML feeds not-well-formed. Surely they would be escaped, wouldn’t they?
Keeping Technorati up to date with Apache log analysis
As usual with Ben Hammersley’s stuff, this is clearly either brilliant or insane. Anyone with a good idea of who is linking to them knows that Technorati misses lots of things, so Ben has a handy Perl script to help them out: it runs through your Apache log, looking for referers that Technorati doesn’t know about, and when it finds one it pings Technorati on their behalf. Hmm.
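The log-reading half of that idea can be sketched roughly like this, in Python rather than Ben's Perl. It assumes Apache's combined log format, and `external_referers` and `own_host` are names invented here. The Technorati lookup and ping are deliberately left out, since guessing at their API would not help anyone.

```python
import re
from urllib.parse import urlparse

# Tail of a combined-format line: "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')


def external_referers(lines, own_host):
    """Return the set of referer URLs that point at hosts other than own_host."""
    found = set()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # not a combined-format line, skip it
        referer = match.group("referer")
        host = urlparse(referer).hostname  # None for "-" or other non-URLs
        if host and host != own_host:
            found.add(referer)
    return found
```

Each URL it returns is only a candidate: checking whether Technorati already knows about it would be the next step.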
Attacking spam methods is useless
Jay’s staying on message, saying that there’s no way to combat comment spam except by targeting the URLs they want to have linked in your comments. Unfortunately, targeting URLs is targeting methods, too. URL-space isn’t quite infinite, but it’s too bloody big to block all the possible bad parts for each individual. If you block, and a thousand URLs that redirect to it, then the spammer just needs to find someone who doesn’t block it, spam them, and then spam their blog in your comments, to transfer some of your PR on to them. The only thing that isn’t targeting methods is when Google spots comment spam, and penalizes the spammer, and does that so often that all possible comment spammers know that the risk is far greater than the possible reward. Oddly enough, looking at backlinks with and at toolbar PageRank for random spammed domains, it looks like they already are working on it. Until that happy day, we’re all targeting methods, some good (forcing preview, blocking URLs, selective moderation), some not (CAPTCHAs are evil, referrers don’t work, IP banning’s naive).
PHP in contrast to Perl
For me, the most telling thing is the quote at the bottom: “Comparing PHP to Perl is like comparing pears to newspapers.” But, I would note that PHP, with its 3079 functions that you have to look up every time, makes it very easy to look them up and has very good, clear documentation, whereas Perl, with its 260 functions, jolly well better be easy to memorize, because if you forget how to do something you’ll be googling for a comprehensible answer for hours.
Wishlist: the million monkeys at a million typewriters plugin
Matt wants an MT plugin that will let a select circle of friends correct typos in his posts, without having to email him saying “Matt, old stick, ‘you are’ is still you’re, not your” every time. I can’t quite picture it as a plugin, since you’re really talking about adding users with a new class of permissions, probably to edit posts (by choice, with a revision history that can be rolled back), and edit comments and pings or send them to moderation, so it’s going to take pretty serious hacking for anyone but 6A to do it, but having a half-dozen friends using a dozen eyes to make all typos and spam comments shallow would be very nice.
A more technical note on Blogger’s implementation of WYSIWYG editing in the browser
Despite having no real interest in using a WYSIWYG editor, I’m always interested in seeing how people do them.
XML on the Web has Failed
Well, like so many things, it’s failed miserably to fulfill its wild promises, but it still sometimes sort of works, and every so often we notice some broken parts, and fix a few of them. Next time? Next time something will promise even more (certainly including “the tools will save us”), and maybe as a result of delivering on the same small percentage of its promises, will deliver more.
XHTML Frequently Answered Questions
I keep hoping that the answer to “why XHTML?” will have a real reason, for people who aren’t doing MathML or SVG. Someday, maybe. Or not.
User Authentication on the World Wide Web
The basics of cookies and alternate methods of authenticating users, from Tom Pike.
Not dead yet
Jim says his Whole Wheat Radio blog isn’t dead, just resting while he’s distracted by things like the Wheat Hole House Concert building and the whole novelty of having more than a few minutes of sunlight. “Maybe I’ll make blog posts more frequently during the dark winter months when things slow down.” More than anything else, that’s what I like about RSS. It will cost me absolutely nothing, and Bloglines and Jim nearly nothing, for me to patiently wait until he feels like saying something through his blog, instead of somewhere else, again. Checking a bookmark, or remembering to visit the web page some other way? I’d probably forget all about it long before winter sets in.
In-ter-esting. Morbus is working on a Drupal module to catalog your movies, books, comics, whathaveyou, according to FRBR (in PDF, but it’s still one of the best ideas to come out of library cataloging in years), using RDF, and aiming to be usable even if you not only don’t know what either one is, you very strongly don’t want to ever know; all you want is to know that yes, you do own The River Why, and further that you own a copy of the 1983 Sierra Club hardcover and a copy of the 1984 Bantam Windstone paperback (spine broken twice, top shelf in the guest bathroom).
Wishlist notifier
Found via the presence of Ed Summers in the LibDB wiki, Wishlist is a Perl script you call from cron, which checks your Amazon wishlist for things that are either heavily discounted (50% off list, by default) or cheap (less than $5, by default), and emails you when it finds them. Sweet and simple: the code is very nearly shorter than the documentation.
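The rule itself is small enough to sketch. This is a Python illustration of the two defaults as described above, not Ed's Perl: the `WishlistItem` fields and the `worth_an_email` name are assumptions, and fetching the wishlist from Amazon and sending the mail are left out.

```python
from dataclasses import dataclass


@dataclass
class WishlistItem:
    title: str
    list_price: float
    current_price: float


def worth_an_email(item, min_discount=0.50, max_price=5.00):
    """True if the item is heavily discounted or simply cheap."""
    # Discount as a fraction of list price; guard against a zero list price.
    discounted = (
        item.list_price > 0
        and (item.list_price - item.current_price) / item.list_price >= min_discount
    )
    cheap = item.current_price < max_price
    return discounted or cheap
```

Run from cron over the parsed wishlist, anything that passes the check would go into the notification email.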
Script-killer comments
Oddly enough, one of the last questions I answered in the MT forums before I went offline was about exactly this: if you insist on putting your Javascript in HTML comments, which blows up in XHTML and was intended to keep it from being displayed in, what?, IE 2.0?, then you have to be sure you have a newline after the opening comment tag, or you don’t have any Javascript at all.
Table rows…revealed!
Scott Andrew on how to hide and reveal table rows with Javascript and CSS. I wouldn’t have thought to set display to the empty string to get back to the default behavior, no matter what various browsers think that default behavior is. Nice!
Interesting. What’s Ben up to, that he needed to write a way to access the results of parsing either RSS (through XML::RSS) or Atom (through XML::Atom) without worrying about which format it was?
RSS Scaling Issues
Mark Fletcher is interested in suggestions on how Bloglines can help syndicated feeds scale. I know what I want, but I’m not sure I’ll ever get it: if Atom ends up in a form where a feed can include items from multiple other feeds, with the feed-level data intact, then Bloglines could provide both their browser-based frontend, and also an Atom feed of all your unread items that you could access from any desktop reader that understands Atom. A little fiddling with preferences and an extension element to say whether an item is read or not, and you could read through the browser on the road or at work, then download the items for local storage or search or whatever later, without having to see them again.
JRoller and SharpReader
The pain of doing things right when something goes wrong: SharpReader properly supports Last-Modified headers, so when JRoller returned them with dates in 2028, that left SharpReader poised to wait 24 years before getting updates again. Same sort of thing goes for other HTTP guff: properly support 410 Gone by immediately removing the subscription, and any time someone drops a Redirect gone / in the wrong .htaccess, you mistakenly unsubscribe. HTTP is hard.
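One defensive option, sketched in Python under the assumption that a reader is allowed to distrust the server: treat a Last-Modified value that is unparseable or in the future as if there were no header at all. The `usable_last_modified` name is made up here, and this is not a description of what SharpReader does.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def usable_last_modified(header_value, now=None):
    """Return the parsed Last-Modified time, or None if it can't be trusted."""
    now = now or datetime.now(timezone.utc)
    try:
        stamp = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return None  # unparseable header: behave as if it were absent
    if stamp.tzinfo is None:
        # HTTP dates are GMT; treat a zone-less result as UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    if stamp > now:
        return None  # a date in the future (2028, say) is not believable
    return stamp
```

When it returns None, the client would fall back to an unconditional GET instead of sitting on a bogus date.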
DOM scripting book
Stuart Langridge on modern Javascript, the DOM, and unobtrusive DHTML. I am so looking forward to this.
Global worming
Google VP of Operations Urs Hoelzle on the perhaps more widely reported than felt DoS of Google by MyDoom.O: “A very small percentage of our users and networks–most notably, a few media outlets that write about us–were heavily infected with MyDoom, so our systems temporarily blocked their queries.” Heh.

