Who knows a <title> from a hole in the ground?

Results from a few testcases of various forms of escaping an Atom title which is not markup. Not exactly overwhelmingly impressive. In every case (unless I’ve screwed up) the title should be displayed as <title>.

Aggregator | HTML CDATA | HTML entity | HTML NCR | Text CDATA | Text entity | Text NCR | XHTML entity | XHTML NCR
My Yahoo!
Newsgator Online
Google Reader
RSS Bandit
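
To make the eight column names concrete, here is a rough sketch of what each kind of test title might look like and how a consumer could turn it into display text. The markup in `TEST_TITLES` is a reconstruction, not copied from the actual test feeds, and `displayed_title` is a hypothetical helper. It also assumes the HTML titles contain no real tags, so unescaping stands in for a proper HTML parser and sanitizer.

```python
import html
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed shapes of the eight test titles (illustrative only).
TEST_TITLES = {
    "HTML CDATA": '<title type="html"><![CDATA[&lt;title>]]></title>',
    "HTML entity": '<title type="html">&amp;lt;title></title>',
    "HTML NCR": '<title type="html">&#38;lt;title></title>',
    "Text CDATA": '<title type="text"><![CDATA[<title>]]></title>',
    "Text entity": '<title type="text">&lt;title></title>',
    "Text NCR": '<title type="text">&#60;title></title>',
    "XHTML entity": '<title type="xhtml"><div xmlns="%s">&lt;title></div></title>' % XHTML_NS,
    "XHTML NCR": '<title type="xhtml"><div xmlns="%s">&#60;title></div></title>' % XHTML_NS,
}


def displayed_title(xml_source):
    """Return the text a reader should see for one Atom title element."""
    elem = ET.fromstring(xml_source)
    kind = elem.get("type", "text")
    if kind == "xhtml":
        # The wrapping div is not part of the content; only its text is shown.
        div = elem.find("{%s}div" % XHTML_NS)
        return "".join(div.itertext())
    text = elem.text or ""
    if kind == "html":
        # After XML parsing, the value is still escaped HTML: decode once more.
        return html.unescape(text)
    # type="text": the XML parser has already produced the literal string.
    return text


RESULTS = {name: displayed_title(src) for name, src in TEST_TITLES.items()}
for name, shown in RESULTS.items():
    print(f"{name}: {shown}")
```

Under those assumptions every line prints `<title>`, which is the pass condition for each column.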

Some notes:

If you are using Internet Explorer on Windows, all those results may well look like identical empty squares, instead of checkmarks for pass and Xes for fail. May I recommend some fonts and a better browser that will make use of them? Or another better browser? That’s gotta be better than digging in the source for the classnames.

Bloglines has one “?” because I failed to subscribe to one feed when I started this last night, and they apparently take eight or ten or twelve hours before they first fetch a feed. Maybe the new datacenter will improve things. Passed, when it finally showed up.

Rojo doesn’t appear because either they take more than 24 hours to first fetch a feed, or they don’t support Atom 1.0 and express that by claiming the feed hasn’t been fetched yet. Can’t tell, really.

Kinja doesn’t appear because it says that it’s down for “routine maintenance” and has been all day. Has it been for longer than all day? Is it gone forever? Don’t ask me.

“Windows” Live fails to support Atom 1.0 in a most unamusing way: it pretends that nothing happened. Import the OPML file of test cases, and you get a blank content-hole for a second, then you’re back where you started. Tell it to subscribe to a particular URL, and it starts talking about the results of your search for words in the URL.

My Yahoo! found the most amusing way to fail: it would have been fine on the three HTML tests, except that it fails to realize that > only matters when it follows ]] — I don’t bother escaping it otherwise, and so they stripped it, while allowing through the <title. It also gets points for being eager: despite the fact that it wasn’t willing to actually show me content from the feeds I added until the next morning, during the ten minutes after I added them, it requested robots.txt alone 47 times, with four different user-agents. Don’t worry, Yahoo!, the rest of the world has plenty of bandwidth and server threads to make up for you using three different feed fetchers as well as your regular spider, and we’re happy to have you all over us like an untrained Saint Bernard puppy. No, really, we don’t mind your slobber at all.
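
The point about > can be checked with any conforming XML parser. The sketch below uses two hypothetical helpers, `escape_text_minimally` and `round_trip`, and assumes plain element content (not attribute values): it escapes only what XML requires and confirms that a bare > survives parsing.

```python
import xml.etree.ElementTree as ET


def escape_text_minimally(text):
    """Escape element content with the minimum XML requires."""
    return (
        text.replace("&", "&amp;")       # always required
            .replace("<", "&lt;")        # always required
            .replace("]]>", "]]&gt;")    # > matters only right after ]]
    )


def round_trip(text):
    """Serialize text into a title element, then parse it back out."""
    xml_source = "<title>" + escape_text_minimally(text) + "</title>"
    return ET.fromstring(xml_source).text


print(escape_text_minimally("<title>"))
print(round_trip("<title>"))
```

The first line printed is `&lt;title>`, with the > left alone, and the second is `<title>`, so a consumer that strips or mangles the unescaped > is changing well-formed content.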

Netvibes is quite nice, as Flash-based portals with aggregators go. Shame that it seems to fail on the text testcases not by stripping them, but by putting them in as markup, which is the sort of thing that winds up with me hollering about security holes. Though, the opacity of the interface may foil me.
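
For what it's worth, the safe handling of a text title is small. This is only a sketch, assuming the aggregator already has the parsed title string and is about to drop it into an HTML page; `render_text_title` is a made-up name, and titles of type html would need a real sanitizer, which is not shown here.

```python
import html


def render_text_title(value):
    """Turn a type="text" title into an HTML-safe fragment.

    The value is literal text, so every markup character is escaped
    rather than handed to the browser as tags.
    """
    return html.escape(value, quote=True)


print(render_text_title("<title>"))
print(render_text_title("<script>alert(1)</script>"))
```

The first call gives `&lt;title&gt;`, which a browser displays as <title>. The second shows why skipping this step is a security problem: without the escaping, the script element would run in the reader's page.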

The Firefox bugs I’ve known about forever, and Rob’s on ’em with a patch waiting on review, but Thunderbird’s problems with the two CDATA tests worry me: in both cases, it just gives a blank title, rather than just fumbling the escaping. I even built it for the first time in months, to catch up with the trunk, but still got the same result. Bad enough that it’s an odd and worrisome bug that I might have to file, but worse yet that the competition for non-sucky Windows browser managed fine with every test.

Oh, and Luke? Pretty nice showing for a one-person unpaid hobby aggregator, mate ;)

Sam sensibly put it in the Atom wiki, without the need for decent Unicode font support and with the opportunity to add your own results.


Comment by travis
2005-12-20 07:37:38

Results for JetBrains Omea Reader 2.0 (build 671.6):

html cdata: ✘
html entity: ✘
html NCR: ✘
text entity: ✔
text in CDATA: ✔
text in NCR: ✔
xhtml entity: ✘
xhtml NCR: ✘

Comment by Jacques Distler
2005-12-20 10:09:23