Who knows a <title> from a hole in the ground?

Results from a few testcases of various forms of escaping an Atom title which is not markup. Not exactly overwhelmingly impressive. In every case (unless I’ve screwed up) the title should be displayed as <title>.

Aggregator | HTML CDATA | HTML entity | HTML NCR | Text CDATA | Text entity | Text NCR | XHTML entity | XHTML NCR
My Yahoo!
Newsgator Online
Google Reader
RSS Bandit
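
For readers who want to see what the eight columns mean, here is a minimal sketch in Python of one possible form of each test title, plus a small decoder showing that every one should come out as <title>. The exact markup of the real test feeds is an assumption (in particular, which character gets the NCR treatment in the HTML case), and `CASES` and `displayed_title` are hypothetical names made up for this illustration. The HTML branch only unescapes character references, which is enough for a title with no real markup in it, but is not a full HTML-to-text conversion.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XHTML = "{http://www.w3.org/1999/xhtml}"
DIV_OPEN = '<div xmlns="http://www.w3.org/1999/xhtml">'

# Assumed shapes of the eight test titles (not copied from the real feeds).
CASES = {
    "html/CDATA": '<title type="html"><![CDATA[&lt;title>]]></title>',
    "html/entity": '<title type="html">&amp;lt;title></title>',
    "html/NCR": '<title type="html">&#38;lt;title></title>',
    "text/CDATA": '<title type="text"><![CDATA[<title>]]></title>',
    "text/entity": '<title type="text">&lt;title></title>',
    "text/NCR": '<title type="text">&#60;title></title>',
    "xhtml/entity": '<title type="xhtml">' + DIV_OPEN + '&lt;title></div></title>',
    "xhtml/NCR": '<title type="xhtml">' + DIV_OPEN + '&#60;title></div></title>',
}


def displayed_title(title_xml):
    """Return the plain text a reader should see for one Atom title element."""
    wrapped = '<entry xmlns="http://www.w3.org/2005/Atom">' + title_xml + "</entry>"
    title = ET.fromstring(wrapped).find(ATOM + "title")
    kind = title.get("type", "text")
    if kind == "xhtml":
        # The text lives inside the wrapping XHTML div.
        return "".join(title.find(XHTML + "div").itertext())
    text = title.text or ""
    if kind == "html":
        # The XML parser has removed one layer of escaping; HTML adds a second.
        return html.unescape(text)
    return text


if __name__ == "__main__":
    for name, xml in CASES.items():
        print(name, "->", displayed_title(xml))
```

An aggregator that fails the text columns is usually applying the HTML step to something that was never HTML, and one that fails the HTML columns is usually skipping it.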

Some notes:

If you are using Internet Explorer on Windows, all those results may well look like identical empty squares, instead of checkmarks for pass and Xes for fail. May I recommend some fonts and a better browser that will make use of them? Or another better browser? That’s gotta be better than digging in the source for the classnames.

Bloglines has one “?” because I failed to subscribe to one feed when I started this last night, and they apparently take eight or ten or twelve hours before they first fetch a feed. Maybe the new datacenter will improve things. Passed, when it finally showed up.

Rojo doesn’t appear because they either take more than 24 hours to first fetch a feed, or they don’t support Atom 1.0, and don’t support it by claiming that it hasn’t yet been fetched. Can’t tell, really.

Kinja doesn’t appear because it says that it’s down for “routine maintenance” and has been all day. Has it been for longer than all day? Is it gone forever? Don’t ask me.

“Windows” Live fails to support Atom 1.0 in a most unamusing way: it pretends that nothing happened. Import the OPML file of test cases, and you get a blank content-hole for a second, then you’re back where you started. Tell it to subscribe to a particular URL, and it starts talking about the results of your search for words in the URL.

My Yahoo! found the most amusing way to fail: it would have been fine on the three HTML tests, except that it fails to realize that > only matters when it follows ]] — I don’t bother escaping it otherwise, and so they stripped it, while allowing through the <title. It also gets points for being eager: despite the fact that it wasn’t willing to actually show me content from the feeds I added until the next morning, during the ten minutes after I added them, it requested robots.txt alone 47 times, with four different user-agents. Don’t worry, Yahoo!, the rest of the world has plenty of bandwidth and server threads to make up for you using three different feed fetchers as well as your regular spider, and we’re happy to have you all over us like an untrained Saint Bernard puppy. No, really, we don’t mind your slobber at all.
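
The escaping rule behind that complaint can be sketched in a few lines of Python. This is only an illustration of the XML rule that, in character data, & and < must always be escaped while > needs it only as part of ]]>; `escape_xml_text` is a hypothetical helper, not anything the feeds or My Yahoo! actually use.

```python
import xml.etree.ElementTree as ET


def escape_xml_text(s):
    """Escape a string for use as XML character data, leaving lone > alone."""
    # Ampersand first, so the entities added below are not escaped again.
    s = s.replace("&", "&amp;").replace("<", "&lt;")
    # The only place a bare > is forbidden in character data is after ]].
    return s.replace("]]>", "]]&gt;")


if __name__ == "__main__":
    for raw in ("<title>", "a]]>b", "AT&T"):
        escaped = escape_xml_text(raw)
        # Round-trip through a real XML parser to confirm nothing was lost.
        parsed = ET.fromstring("<t>" + escaped + "</t>").text
        print(repr(raw), "->", repr(escaped), "->", repr(parsed))
```

A consumer that strips the unescaped > is treating well-formed XML as if it were broken, which is how a correct title ends up displayed as <title with no closing bracket.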

Netvibes is quite nice, as Flash-based portals with aggregators go. Shame that it seems to fail on the text testcases not by stripping them, but by putting them in as markup, which is the sort of thing that winds up with me hollering about security holes. Though, the opacity of the interface may foil me.
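
To make the worry concrete, here is a minimal Python sketch of the step that seems to be missing, assuming the aggregator builds its page by string concatenation. `title_to_html` and the `<h2>` wrapper are hypothetical, chosen only for illustration; nothing here is known about how Netvibes is actually built.

```python
import html


def title_to_html(title_text):
    """Wrap a plain-text title in markup, escaping it so it stays text."""
    # A type="text" title is plain text, so it must be escaped on the way in.
    return "<h2>" + html.escape(title_text, quote=False) + "</h2>"


if __name__ == "__main__":
    print(title_to_html("<title>"))
    print(title_to_html("<script>alert(1)</script>"))
```

Skip the `html.escape` call and the second title stops being a headline and becomes a script running in the reader’s page.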

The Firefox bugs I’ve known about forever, and Rob’s on ’em with a patch waiting on review, but Thunderbird’s problems with the two CDATA tests worry me: in both cases, it just gives a blank title, rather than just fumbling the escaping. I even built it for the first time in months, to catch up with the trunk, but still got the same result. Bad enough that it’s an odd and worrisome bug that I might have to file, but worse yet that the competition for non-sucky Windows browser managed fine with every test.

Oh, and Luke? Pretty nice showing for a one-person unpaid hobby aggregator, mate ;)

Sam sensibly put it in the Atom wiki, without the need for decent Unicode font support and with the opportunity to add your own results.


Comment by travis #
2005-12-20 07:37:38

Results for JetBrains Omea Reader 2.0 (build 671.6):

html cdata: ✘
html entity: ✘
html NCR: ✘
text entity: ✔
text in CDATA: ✔
text in NCR: ✔
xhtml entity: ✘
xhtml NCR: ✘

Comment by Jacques Distler #
2005-12-20 10:09:23

NetNewsWire has a 3-paned interface.

In the “headlines” pane, 3 of your test cases display incorrectly: text/entity, xhtml/entity and xhtml/ncr. The rest display correctly.

In the “entry” pane, all the test cases display the title correctly.

P.S.: Thanks for “fixing” your Entry Feed. I will await, eagerly, the return of your comment feed.

Comment by Phil Ringnalda #
2005-12-20 22:17:45

’kay, should be “fixed”: I just commented out the adding of the comments, since I don’t really feel any need to constantly debug something that pretty much seems to just work.

Comment by Alastair #
2005-12-20 16:39:25

Another good, complete Unicode font for Windows is the updated Arial supplied with MS Word 2002 (and later). You need to install it separately, as described by this KB article.

Installing this (or any other comprehensive Unicode font) fixes Firefox, but not IE. And here is an attempt to explain why.

2005-12-25 13:18:46

A small test of RSS readers

Phil Ringnalda has published a small test of RSS readers. SharpReader, which I have been using for quite a while, did not let me down this time either, passing every test with flying colors. Well, it always pleases me when my choice of software turns out to be …

2005-12-25 23:31:11

[…] I will continue to be in awe of Phil and posts like these. […]

Comment by Arve #
2006-01-10 08:49:40

Bloglines (and probably even a few other aggregators) fail miserably with < and > elsewhere.

See this example to experience what I have come to call the “Bloglines experience”: My posting on <canvas> renders the canvas element literally, instead of displaying the correct TEXT.

That’s the reward for using JavaScript where it’s not supposed to be used. Perhaps I should prepend every entry with <marquee> when Bloglines accesses the URL?

[ Oh, and Phil: Could you get WP to recognize the fact that I’m not a spammer if I have turned off sending of the referer header? I’m seeing Error: This file cannot be used on its own. every time I try to submit a comment. ]

Comment by Phil Ringnalda #
2006-01-10 09:23:22

Sure, I’m willing to let you not send referrers. Just give me something that will work as well: so far this morning, I’ve gotten 3 comments, and 90 foiled attempts at directly posting spam to wp-comments-post.php (not counting your false positives). What do you have in mind as an alternative? I thought about allowing Opera through, but one of the spammers is using a UA rotator that includes Opera. Maybe only allow Opera 9.xx on Linux? That shouldn’t be in spammers’ rotation lists for a while.

Comment by Jacques Distler #
2006-01-10 11:31:09

Just give me something that will work as well: so far this morning, I’ve gotten 3 comments, and 90 foiled attempts at directly posting spam to wp-comments-post.php


Where’s that “forced-comment-preview” when you need it? Heck, where’s that “comment-preview”?

Comment by Jacques Distler #
2006-01-10 11:36:48

More seriously, back, years ago, when I last worried about comment-spambots, they were smart enough to fake the referer header.

Now, some can’t even do HTTP correctly. But, surely one can’t rely on that.

(Sorry for forgetting to sign the last comment.)

Comment by travis #
2006-01-13 06:59:52

Results for JetBrains Omea Reader 2.1 (build 914.3):

html cdata: ✘
html entity: ✘
html NCR: ✘
text entity: ✔
text in CDATA: ✔
text in NCR: ✔
xhtml entity: ✘
xhtml NCR: ✘

Looks like not much changed in their rendering engine.
