Your forthcoming feed errors

Back in July, Sam Ruby said of the “Obsolete Version” warnings the feedvalidator produces for Atom 0.3 feeds:

Possibly as early as October, and certainly no later than the end of the year, these warnings will be converted over to errors.

According to the cvs commit mailing list, that time was a couple of hours ago, so once the public site syncs that’ll be that, and feedvalidator.org will no longer tell you your Atom 0.3 is anything but obsolete.

And as Ben de Groot notes, that means that as of now, WordPress 2.0 is going to ship with an Atom template that produces a feed which will not validate.

That’s not the end of the world: I’d guess very few consumers will remove Atom 0.3 support in the near-term, there may well be other validators that will continue to approximately validate 0.3 feeds, and if nothing else, having just recently removed the template we offered to users of old versions of Movable Type that shipped with invalid feeds, we can certainly find room on the error page to offer WordPress users an Atom 1.0 template.

Anyway, I don’t have any room to get up in arms about it, since I don’t have a working patch in the bug (and, I’m a selfish bastard with a workable if personalized Atom 1.0 template of my own), but I would like to understand why that’s the way it is.

Of course, at the most basic level, there is no patch.

As far as I’m concerned, that’s enough: if someone who understands both Atom and WordPress cares enough to provide a patch which will actually work (the current patch assumes incorrectly that WordPress can guarantee that posts and titles are well-formed XML, and given the crappy state of support for type="xhtml" in aggregators at the moment, that WordPress users want to be foot-soldiers in the battle to improve that support, and also does some rather odd special-case ignoring of the user’s express wishes when Technorati is fetching the feed), and does it while there’s still enough testing time before 2.0 ships, and it still doesn’t go in, that would be interesting.

And if nobody provides a workable patch, well, the most basic tenet of open source is that you’re the only one you can order around.

Luckily for me, Trac (or at least WordPress’s installation of it) is so unfriendly, and WordPress’s process is so opaque (and remember, this is coming from someone whose baseline is Bugzilla and Mozilla, fer gossake!) that I don’t feel at all involved in WordPress development or in its fate, so I can just watch how it plays out — by remembering that I’ve got the bug page bookmarked, since if Trac will actually send bugspam, it’s beyond me how to persuade it to do so.

60 Comments

Comment by Manuzhai #
2005-12-14 02:48:36

Apparently, you need to login first. On the login box, it says you need a forum account. So you go to wordpress.org/support/ and register for a forum account (if you don’t already have one). You receive a password in the mail, log back in to the forum, edit your profile to change the password to something you can remember, now you go back to the bug, click Login, enter your forum username and password, and then you can add your email-address to the CC-list: bugspam!

You’re welcome. ;) (Please don’t stop persuading them.)

Comment by Phil Ringnalda #
2005-12-14 07:30:15

Ah, I was thrown off by the fact that my forum login knows my email address, and then again by the fact that my Trac settings include space for yet another email address, and finally by the fact that I just didn’t believe that Trac would recreate one of Bugzilla’s most hated features, email addresses splattered hither and yon, when it already had a username-to-address mapping.

Comment by Manuzhai #
2005-12-19 03:21:45

Well, yes, the coupling of email addresses to sessions in Trac should provide enough information, except that not everyone commenting on a ticket necessarily wants to receive email, too. I’ve done some Trac development, though; I might look into getting this fixed. It’s enough of a problem that it bugs me too, sometimes.

 
 
 
Comment by Darryl #
2005-12-14 06:39:28

Rowr. I searched around last week for an official-looking Atom 1.0 template for my WordPress 1.5.2 installation. No luck….

Comment by Phil Ringnalda #
2005-12-14 07:56:56

Yeah, I know. Worst comes to worst, I’ll make something available, but I really don’t know my way around WP well enough to get things right. For instance, while looking at the broken patch I noticed that it will use the same values for /feed/id/ and /feed/link/@rel="self" even when it’s creating a category feed rather than the main feed. I don’t know about any other clients, but when Bloglines sees a @rel="self" that isn’t the URL the user told it about, it subscribes to the @rel="self" instead, making category feeds unsubscribable. And, er, I seem to have that exact same bug, so I guess I have to figure that out, this weekend maybe.

 
 
2005-12-14 09:15:20

Why No Atom 1.0 in WP 2.0?

From Phil Ringnalda, I learn that Ben de Groot has been working to see if Atom 1.0 support will come out in WP 2.0. Apparently, WP 2.0 won’t support Atom 1.0, and as Sam Ruby promised he would, the feed validator will now declare Atom 0.3 feeds …

 
Comment by Aristotle Pagaltzis #
2005-12-14 09:22:02

Of course, it’s not like there was no advance warning.

The competition, i.e. Movable Type 3.2, already ships with an Atom 1.0 template by default. And LiveJournal now serves Atom 1.0 feeds (though it’s type="html" tagsoup). So there is already some weight besides WordPress in the arena to establish Atom 1.0 support as a requirement for clients. Hopefully Blogger will switch away from 0.3 sometime soon, now that we have an RFC number for 1.0 and the validator does not like 0.3 anymore.

So I wouldn’t mind making WordPress users guinea pigs for a short period if that hastens along the perfect bug storm in client implementations.

But then, I haven’t been known to be conservative about good standards and about standards support, have I?

 
Comment by kellan #
2005-12-14 10:03:31

Of course, I don’t yet have a shipping version of Magpie that supports Atom 1.0, especially all of its luvly ”simplifications”. (and certainly the 0.5.x version that is embedded in WP won’t support it)

Comment by Robert Sayre #
2005-12-14 10:33:01

There’s nothing in Atom 1.0 that would be difficult for an RSS parser that handles a decent percentage of the RSS feeds that are out there.

 
 
Comment by kellan #
2005-12-14 14:45:53

Umm, Robert, having written and maintained a handful of feed parsers I respectfully disagree.

Comment by Robert Sayre #
2005-12-14 15:00:12

Well, having written and maintained some feedparsers myself, I can’t think of anything that would make it harder than RSS. What makes Atom harder in your experience?

Comment by Mike Mariano #
2005-12-14 15:31:59

I don’t know about parsing, but Atom 1.0 makes it a little bit harder to generate a feed.

As Phil describes above, the current experimental Atom 1.0 patch for WordPress can sometimes display incorrect feed ids and rel=selfs.

This is partially because (as far as I can tell) WordPress has no sense of self. WordPress can tell you where the main weblog page is and where entry permanent links are, but it can’t tell you what current page or feed you are looking at.

Atom 1.0 asks for something WordPress never even considered providing, even if it does seem basic. I wish feed parsers, aggregators, and generators could just make a few template changes to update themselves, but true changes will require deeper digging.

Comment by Phil Ringnalda #
2005-12-14 15:43:31

WordPress has no sense of self.

Heh. I wish I’d seen that opening.

And since we’re talking PHP (and 4.x to boot), Kellan’s not going to get anything for free: I’d guess that the biggest two problems would be xml:base and type="xhtml", which is considerably less fun with a SAX parser than with (at least some) DOM parsers, but there’s also the fact that MagpieRSS’s existing architecture doesn’t seem too happy with repeated elements (which are pretty thin on the ground in RSS, and quite common in Atom).

Comment by Robert Sayre #
2005-12-14 17:52:36

I’d guess that the biggest two problems would be xml:base and type=”xhtml”

xml:base is two or three lines of code, maybe five or six if you get anal about string allocation and keep a depth integer. Dealing with type=”xhtml” is a matter of writing a SAXRecorder. There are lots out there, and they’re generally useful. In Python, you stream the results of the SAXRecorder to a CStringIO. If you want to HTMLize the result, you grab only the localNames of the elements. Browsing the Magpie source reveals that it doesn’t enable namespaces when it invokes expat… that makes parsing RSS1, RSS2, Atom, and OPML a lot less fun.
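
In PHP’s expat wrapper, that depth-plus-stack approach would look roughly like the sketch below. It is only a sketch: resolve_uri() is a made-up helper (PHP 4 has no built-in URL joiner, which is exactly the complaint that follows), and the feed URI is a placeholder.

  // Keep a stack of in-scope xml:base values while parsing.
  // resolve_uri() is a hypothetical helper, not a PHP built-in.
  function resolve_uri($base, $ref) {
      if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $ref)) return $ref;   // already absolute
      if (substr($ref, 0, 1) == '/') {                               // absolute path
          if (preg_match('#^([a-z][a-z0-9+.-]*://[^/]+)#i', $base, $m)) return $m[1] . $ref;
          return $ref;
      }
      return preg_replace('#[^/]*$#', '', $base) . $ref;             // naive relative join
  }

  $base_stack = array('http://example.org/feed.xml');   // placeholder: the feed's own URI

  function start_element($parser, $name, $attrs) {
      global $base_stack;
      $base = end($base_stack);
      if (isset($attrs['xml:base'])) {
          $base = resolve_uri($base, $attrs['xml:base']);
      }
      $base_stack[] = $base;   // push even if unchanged, so end_element can pop blindly
  }

  function end_element($parser, $name) {
      global $base_stack;
      array_pop($base_stack);
  }

  $parser = xml_parser_create('UTF-8');
  xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);   // keep 'xml:base' lowercase
  xml_set_element_handler($parser, 'start_element', 'end_element');
  // ...then xml_parse() as usual; relative URLs in the content get resolved
  // against end($base_stack) as they come through the handlers.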

Comment by Phil Ringnalda #
2005-12-14 19:25:27

Two or three lines, eh? Did I mention that we’re in PHP here?

php.net/urljoin

(I’d be delighted to be told how stupid I am, missing PHP’s equivalent, since that might stick it in my memory, and I know it doesn’t really matter, since it’s just a function you write once (and then fix the bugs two or three times), but it still adds up when you don’t much feel like starting.)

The namespace handling… um. PHP has supported xml_parser_create_ns since April 2001, so I dunno why Magpie pretends namespace prefixes are significant instead.

Comment by Robert Sayre #
2005-12-14 20:46:26

Did I mention that we’re in PHP here?

Can we help those who refuse to help themselves? :) Anyway, this is the best I found:
http://pear.php.net/package/HTTP/docs/1.3.4/HTTP/HTTP.html#methodabsoluteURI

Also, I found this: http://cvs.php.net/viewcvs.cgi/pear/XML_Feed_Parser/

Comment by Phil Ringnalda #
2005-12-14 23:02:15

Yeah, I’ve been watching XML_Feed_Parser for a while now, though I’d forgotten that in his case he does get xml:base for free, by virtue of going PHP5-only. Not a bad idea, though I see that laggard Dreamhost is still fobbing us off with 5.0.4, so to get truly evil things like recover="true" (or Tidy, which I can’t believe they don’t include) I’d need to go back to compiling my own, at which point you’re no longer talking meaningfully about something for distribution.

 
Comment by kellan #
2005-12-15 05:45:18

Yes, DOM based parsing is wonderful, and fun.

expat’s proto-SAX and no PEAR are my particular crosses to bear, unfortunately.

 
 
Comment by kellan #
2005-12-15 05:21:28

Because xml_parser_create_ns throws away all the information about prefixes, which means that if you encounter a namespace the parser has never seen before, you have no option but to provide access to the data at:

$item[HTTP://WWW.W3.ORG/2005/02/22-FDR-FOOGLE-NS#][somedata]

People have enough trouble figuring out how to use Magpie as is.

Should I be explicitly mapping the namespaces to a controlled vocab of short names that match the most prefixes? Yes, of course.

Could we resort to regex trickery like we do to pull out the charset for doing re-encoding? Probably.

I will happily accept patches that do this, and sleep better at night.
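
A patch along those lines might be as small as the sketch below: fold the expanded names xml_parser_create_ns hands back into a controlled vocabulary of short prefixes. The namespace table and the '#' separator are illustrative choices, not anything Magpie actually ships.

  // Assumes the parser was created with xml_parser_create_ns('UTF-8', '#'),
  // so element keys arrive as "namespace-URI#localname".
  $known_ns = array(
      'http://www.w3.org/2005/Atom'               => 'atom',
      'http://purl.org/dc/elements/1.1/'          => 'dc',
      'http://purl.org/rss/1.0/modules/content/'  => 'content',
  );

  function short_name($expanded) {
      global $known_ns;
      $pos = strrpos($expanded, '#');
      if ($pos === false) return $expanded;               // no namespace at all
      $uri   = substr($expanded, 0, $pos);
      $local = substr($expanded, $pos + 1);
      if (isset($known_ns[$uri])) return $known_ns[$uri] . ':' . $local;
      return $expanded;   // unknown namespace: you're stuck with the ugly key
  }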

Comment by kellan #
2005-12-15 05:37:39

Hi Robert,

Your statement was: ”There’s nothing in Atom 1.0 that would be difficult for an RSS parser that handles a decent percentage of the RSS feeds.”

Off the top of my head, the short list: the content constructs, frequent use of nested tags, recurring elements, explicit inheritance (xml:base), and implied inheritance (of date and author constructs).

Doable? Of course. More complicated than RSS? Absolutely.

Comment by Robert Sayre #
2005-12-15 06:13:42

Doable? Of course. More complicated than RSS? Absolutely.

I think I disagree pretty strongly here. More complicated than an idealized subset of RSS? I guess. I think this attitude is fallout from numerous tutorials that suggest RSS feeds are a good subject for ”My First SAX Handler”. Well, XML isn’t simple, RSS isn’t simple, and Atom isn’t simple to the same extent, only it’s documented.

xml_parser_create_ns throws away all the information about prefixes

You have to handle the start_prefix_mapping events.

Comment by Phil Ringnalda #
2005-12-15 08:55:33

Eh, at this point you’re denying Kellan’s actual experience.

It was possible to write by far the most widely used PHP feed parsing library for RSS with inelegant to no handling of several things that cannot be omitted for Atom. All my feed reading for almost a year now has been through things using Magpie, and I’m utterly indifferent to the things it doesn’t do to RSS, but not at all indifferent to the way it concatenates multiple different titles and links together in Atom.

Comment by Robert Sayre #
2005-12-15 09:33:30

Hmm. Well, magpie doesn’t have unit tests, so it seems like what you’re saying is conjecture. I will say I find it completely strange that someone parsing XML would complain about ”frequent use of nested tags” and ”recurring elements”.

 
Comment by kellan #
2005-12-15 12:41:02

the way it concatenates multiple different titles and links together in Atom.

Phil, have you checked if the problems you’re seeing are fixed in the current dev release?

Comment by Phil Ringnalda #
2005-12-15 14:41:33

Not yet: so far, I’ve been bad, and just been a user of Gregarius; I don’t even know whether it’s using an unmodified MagpieRSS, much less which. Todo. My biggest problem was with PubSub subscriptions, and I seem to have dodged around it by switching to RSS for them, where you don’t get multiple links that might get munged into one.

 
Comment by Phil Ringnalda #
2005-12-15 15:15:40

Hmm, and the feed which is currently annoying me most with concatenated titles is quite noncompliant.

Comment by Sam Ruby #
2005-12-15 18:48:18

Not any more.

Comment by Phil Ringnalda #
2005-12-15 20:00:13

I’ll bet I don’t want to go back in the archives to where you presented your use case and went either unheard or inadequately understood, leaving you with the need to both fake metadata you don’t have and also use something that’s unlikely to get much interop, do I?

Comment by Sam Ruby #
2005-12-15 20:17:58

Three classes: ipaddr, foaf, and other.

I did try to make a case for ipaddr, and failed.

The foaf elements were valid, but the feedvalidator doesn’t like extensions in known namespaces unless explicitly allowed. I would classify this as a bug in the feedvalidator. But I removed it from my feed anyway.

As for the rest, it truly was data I keep track of for which there has been no expressed interest from others. Should that interest emerge, I’d gladly switch this from an intertwingly namespace to a more generic one.

Comment by Phil Ringnalda #
2005-12-15 21:04:05

The part that particularly interested me was Trackbacks (or excerpts or pingbacks, since they have essentially the same data). My undersized comments feed doesn’t currently contain one, but I’d bet without even looking at my code that I call a weblog title a person’s name, and a post permalink URL the URL for that ”person.” Even after three years of repeating my mantra that ”a Trackback is a comment that lives on someone else’s server” calling their weblog a person still doesn’t taste quite right.

 
 
Comment by Robert Sayre #
2005-12-15 22:03:21

leaving you with the need to both fake metadata you don’t have and also use something that’s unlikely to get much interop

Phil, you’re going to have to stop talking yourself into these corners. I don’t think any feed format gets out of that sentence alive. Here, have a default title? :)

Comment by Phil Ringnalda #
2005-12-15 22:46:36

Yeah, that’s some atrocious UI, combined with a title-substitute chosen by someone who probably shouldn’t be allowed to write copy that the public will see. So?

Not only does it have nothing to do with how to express the authorship of a Trackback, but even if it were a good idea to use an entry title to expand and collapse items in the page, the better to fight against muscle memory of it being linked to the permalink page, a sane person would use [Collapse] for expanded untitled items, and, for collapsed untitled items, the first however many characters of the content that fit the display.

Surely you’ve seen the counter-screenshot, with a UI that doesn’t need the title, but still has to display faked titles because it doesn’t have any way of knowing that

3 movies show
3 movies show the devastation of Biloxi and Gulfport

isn’t actually a title and a post, it’s forced faked data and a post.

Comment by Sam Ruby #
2005-12-15 18:47:36

re: ”denying Kellan’s actual experience”

Every once in a while, I wonder… if we held the Atom process open for a while longer, would someone like Kellan have chosen to share his experience with us?

Comment by Robert Sayre #
2005-12-15 19:21:32

Seems like the XHTML is what burns people most. Try removing that from a standards committee (ha!).

Comment by Sam Ruby #
2005-12-15 20:08:14

I was the second person on the planet to add xhtml:body to my RSS 2.0 feeds. This was quickly supported by all the major feedreaders.

But somehow it is harder in Atom.

 
 
Comment by kellan #
2005-12-16 05:41:25

would someone like Kellan have chosen to share his experience with us?

I did, Sam. And gave feedback on the process. Apparently it went unnoticed? I know any number of early Echo enthusiasts didn’t have the stamina or force of will (a loud enough voice) to stay engaged with the process.

Wasn’t really planning to come back this morning, but I got a little alert about ”kellan” (uncommon name).

Can’t say I’m thrilled with where this thread has gone: after some initial grousing, I was told the added complexity was in my imagination. I thought I presented an uncontroversial list of added complexity, at which point my competence was questioned.

Comment by Robert Sayre #
2005-12-16 07:55:48

I thought I presented an uncontroversial list of added complexity, at which point my competence was questioned.

I’m sorry you got offended, but you’re the one who turned the discussion to the parser you wrote. After looking it over, I think it’s a very nice library with a parser that reached the limits of its current design a long time ago. Bad software with bad Atom support doesn’t worry me, but Magpie is good software, so it needs fixing. Expect patches.

 
 
 
 
Comment by kellan #
2005-12-15 12:36:35

You have to handle the start_prefix_mapping events.

If expat had such a thing. Alas, it doesn’t. (we’re dealing with a parser which pre-dates SAX by several years, or in the case of PHP5, we’re intentionally crippling libxml to pretend to be said ancient parser).

In *certain* versions of PHP you can get the namespace declarations using the default_handler, as long as you use the non-namespace-aware version of the parser.

Comment by Robert Sayre #
2005-12-15 13:05:21

Sorry, wrong name. I meant xml_set_start_namespace_decl_handler… or is that broken in certain versions as well?
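
For what it’s worth, the wiring would be roughly the sketch below; whether the handler actually fires on a given PHP/expat build is exactly the open question here, and the feed URL is a placeholder.

  // Collect declared prefixes while parsing with the namespace-aware parser.
  $prefix_map = array();

  function ns_decl($parser, $prefix, $uri) {
      global $prefix_map;
      $prefix_map[$uri] = $prefix;   // remember the prefix the feed author used
  }

  $feed_xml = file_get_contents('http://example.org/atom.xml');   // placeholder feed URL
  $parser = xml_parser_create_ns('UTF-8');
  xml_set_start_namespace_decl_handler($parser, 'ns_decl');
  xml_parse($parser, $feed_xml, true);
  xml_parser_free($parser);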

Comment by kellan #
2005-12-15 17:13:53

Robert, now you’ve got me doubting my sanity. I’ll check.

 
 
 
 
Comment by Roger Benningfield #
2005-12-19 02:12:08

Kellan: I feel your pain, brother. Contrary to some folks’ predictions, I’m actually feeling pretty positive about Atom these days… but supporting it ain’t as simple as supporting RSS.

For example, my xml:base implementation in ColdFusion… it’s around 150 lines of code, and only allows the publisher to set the base on feed, entry, content, and summary elements. Just to make things more difficult, it won’t work at all for users whose webhosts have their accounts sandboxed, since sandboxing locks out access to the JVM, and I’m relying on java.net.URI to make it possible at all. Fun fun.

 
 
 
 
Comment by Pete Prodoehl #
2005-12-14 22:57:42

Oh sure, just when I thought I had all my WordPress sites running with nice little (valid!) Atom 1.0 feeds, you come along and pee in my pool…

Ah well, I’ll keep complaining and keep hacking the PHP to do what appears to be the right thing. (Even if no one listens…)

 
Comment by Ben de Groot #
2005-12-15 03:57:17

As far as I know, WordPress’s Atom feed has only ever worked correctly for feeds on the blog root, not for categories or individual posts. I would be satisfied if it at least produces a valid Atom 1.0 feed for this — the categories and single post feeds and so on can be hacked later. Although, if we could provide all this as a patch in the current bug, there would be no reason not to include it in the next release!

As you mentioned, WordPress cannot guarantee valid X(HT)ML out-of-the-box (now if all hosters provided PHP 5 with Tidy we could do some interesting things, but dream on…) so we need the feed to use plain text or html. Do I understand it right that we can use non-well-formed HTML (”tagsoup” if you like) in the feed without escaping?

Comment by Sam Ruby #
2005-12-15 07:55:00

Tagsoup needs to be escaped.

If you have access to an XML parser, you can do what blosxom does:

  if (eval {$parser->parse("<div>$$body_ref</div>")}) {
    $type = 'xhtml';
    $body = "<div xmlns=\"http://www.w3.org/1999/xhtml\">$$body_ref</div>";
  } else {
    $type = 'html';
    if (index($$body_ref,']]>')<0) {
      $body = "<![CDATA[$$body_ref]]>";
    } else {
      ($body = $$body_ref) =~ s/($escape_re)/$escape{$1}/g;
    }
  }

Comment by Sam Ruby #
2005-12-15 08:01:26

Bits of the example got eaten. In pseudo-code instead:

  1. if the body, wrapped in a div, can be parsed, declare it as xhtml and include it inline with an xhtml:div element.
  2. if the body does not contain the sequence ”]]>”, declare it as html and wrap it in CDATA.
  3. otherwise, declare it as html and escape everything
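
In WordPress’s PHP, the same decision might look something like the sketch below. It isn’t the patch from the bug; it just leans on the expat extension for the well-formedness check, the way blosxom leans on XML::Parser.

  function atom_content($body) {
      // 1. If the body, wrapped in a div, parses as XML, send it as type="xhtml".
      $p = xml_parser_create();
      $well_formed = xml_parse($p, '<div>' . $body . '</div>', true);
      xml_parser_free($p);
      if ($well_formed) {
          return array('xhtml',
              '<div xmlns="http://www.w3.org/1999/xhtml">' . $body . '</div>');
      }
      // 2. Otherwise it's type="html": wrap in CDATA if the body can't end the section early.
      if (strpos($body, ']]>') === false) {
          return array('html', '<![CDATA[' . $body . ']]>');
      }
      // 3. Last resort: entity-escape everything.
      return array('html', htmlspecialchars($body, ENT_QUOTES));
  }

The template would then do list($type, $content) = atom_content($post) and print the content element accordingly. Same caveat as blosxom: an undeclared entity like &nbsp; will fail the parse and fall through to html, which is the safe direction to fail in.
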
Comment by Phil Ringnalda #
2005-12-15 08:27:45

I think I escaped it back to wholeness; preview this weekend, I hope, and perhaps a fix for those silly non-working scrollbars on overflowing comments.

I wonder how expensive creating a PHP 4 parser actually is (or for that matter, whether things like simplexml_load_string() are as cheap as they appear to be), and whether we can afford to create one even if we can’t have PHP 5’s lightweight DOM parsers. Probably not, given the way Dreamhost’s on a CPU-usage rampage, and most of many people’s usage comes from feeds.
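
One way to find out is to just time a few thousand throwaway parses; a quick PHP 4-safe sketch (the sample string and iteration count are arbitrary):

  function now() {
      list($usec, $sec) = explode(' ', microtime());
      return (float)$sec + (float)$usec;
  }

  $sample = '<div><p>Hello, <em>world</em> &amp; friends.</p></div>';
  $t = now();
  for ($i = 0; $i < 5000; $i++) {
      $p = xml_parser_create();
      xml_parse($p, $sample, true);
      xml_parser_free($p);
  }
  echo round(now() - $t, 3) . " seconds for 5000 parses\n";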

Then if only I could convince myself that that ordered list is a hierarchy of desirability, rather than ocean-boiling followed by a weakly-held aesthetic preference in machine-readable data (if you think other people’s computers will be happier eating CDATA than they will be eating entity references, why not escape the string ]]> and stick with CDATA?)

Comment by Sam Ruby #
2005-12-15 08:43:38

re: ”why not escape the string ]]> and stick with CDATA?”

Well, then because you will end up with the escaped string instead of the string you want.

As to the blosxom algorithm and CPU usage, I do the same thing for my feeds (see ”def content” in template/__init__.py), but then again, my weblog is half baked and a little fried, so CPU usage is not much of a concern.

Comment by Phil Ringnalda #
2005-12-15 09:24:25

D’oh. I wonder what taught me that misunderstanding of CDATA sections in XML. MT, I’m looking at you.

I knew correctly that people massively over-use &gt; while entity-escaping element content, but what I failed to realize was that the only time you need to escape > is when it appears in the string ]]> outside a CDATA section, for compatibility with SGML, which doesn’t allow that string to appear anywhere other than as the end of a CDATA section. I somehow got the idea that you only needed to use it to put that string inside a CDATA section, and that… well, it wasn’t too reasonable to think that that, and only that, would be unescaped, was it? Learn something new every day, and usually it’s something you should have learned thousands of days before.
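
Or, in code, the whole rule comes down to something like this trivial sketch (nothing Atom-specific about it):

  // Minimal escaping for XML element content: & and < always, and > only
  // when it shows up as part of "]]>" outside a CDATA section.
  function minimal_escape($text) {
      $text = str_replace('&', '&amp;', $text);
      $text = str_replace('<', '&lt;', $text);
      return str_replace(']]>', ']]&gt;', $text);
  }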

 
Comment by Phil Ringnalda #
2005-12-15 23:37:22

Hmm. Actually, will you?

If you are using an actual CDATA section in a post, then you must be serving application/xhtml+xml, since otherwise your CDATA section will be turned into a comment (Mozilla) or simply removed completely (IE). And if you are serving application/xhtml+xml but your post is not well-formed, you’re already hosed in many more ways than having your CDATA section’s end delimiter escaped to tunnel your broken markup until you fix it.

On the other hand, if you are serving text/html, and you just happen to use ]]> in a post, then you are breaking the rules of the underlying SGML, and escaping that to ]]&gt; is just doing you a favor, cleaning up after you.

Since the most popular browser I know of that treats a CDATA section in text/html as CDATA is Opera, the most harmful situation that escaping and CDATA wrapping could cause would be if you are an Opera user publishing ill-formed HTML which includes CDATA sections that you expect to be published as unparsed text (assuming that Opera’s feed reader behaves the same as the browser does with CDATA sections), but instead you would see your escaped end delimiter, and everything to the end of the post treated as CDATA. And then, who knows?, you might actually look at how it was working in another browser, and see that you were completely broken all along.

Or am I missing some situation where an unescaped ]]> could occur in ill-formed HTML without it being either an error, or a cross-browser nightmare?

Comment by Aristotle Pagaltzis #
2005-12-16 04:21:41

What I don’t understand is how any of these situations are supposed to affect the production of a valid feed. They’d affect the construction of valid HTML from said feed, maybe, but in no case is the validity of the Atom envelope put in question, unless I’m completely missing your point.

Comment by Phil Ringnalda #
2005-12-16 07:43:20

I think maybe you are: there are lots of ways you can make a valid, well-formed feed, some of which (simply remove every <, >, and &) are less likely to please the user than others (use CDATA sections so they can pretend there’s no escaping).

If your goal is to use inline XML when you can, and when you can’t to prefer CDATA escaping to entity escaping, I don’t think there’s a situation where you can’t use CDATA escaping, because either your user has already blown his chance of using a CDATA section in the post, so you won’t be illegally nesting CDATA sections, or he wasn’t trying to use one at all, just happened to use the end delimiter unescaped, which he’s not allowed to do in HTML, or XHTML, or XML, or SGML, so you can get away with escaping it even though it won’t then be unescaped in the output. If you are code in a protocol client (or editing code on the server) that’s probably overstepping your bounds, but for feed-generating code I don’t think that lack of fidelity can ever make things worse.

Comment by Aristotle Pagaltzis #
2005-12-16 15:00:51

I’m still not sure I follow. Is the following case what you’re trying to cover?

  • The user wants to include a CDATA section in the content.
  • The content is not well-formed.

Comment by Phil Ringnalda #
2005-12-16 19:08:59

Nearly right: it’s not well-formed (or it would have gone through as type="xhtml"), and it includes an unescaped ]]> for an unknown reason.

It might have that ]]> not as an actual CDATA section end delimiter, just as an accidental part of some ASCII art or as an example: in that case, it should have been escaped for the user some time prior to feed generation, because that’s invalid in anything descended from SGML, and escaping it now won’t hurt.

It might have that ]]> as the end of an intentional CDATA section. If so, WTF is the user up to? If he serves application/xhtml+xml then that gets treated like CDATA should, but nobody knows whether his XHTML fragment will eventually be served as text/html or not, so his CDATA section should have already been replaced with entity escaping so it would work either way (plus, remember, he’s not well-formed, so he’s got bigger problems already, that we expect will soon be fixed). Escaping his end delimiter and then wrapping the post in a CDATA section will break it, but it’s already broken, and when he unbreaks it it will go through as type="xhtml" again. Or, he’s serving text/html, where using a CDATA section is a rather odd way of commenting things out, and escaping his end delimiter ought to alert him to how broken his behavior is (particularly since if his fragment is later Tidied and served as application/xhtml+xml, his CDATA-as-comments suddenly becomes CDATA-as-text). Or, in either mime-type, he’s using a combination of SGML comments, JavaScript comments, and a CDATA section to hide JavaScript, in which case he should be looked at sternly while his entire <script … </script> is stripped out, because one of the reasons it shouldn’t appear in a feed is that you don’t know what will happen to it once it’s neutralized on the other end (Gregarius seems to display it, so the result of those Structured Blogging plugins that try to tunnel XML in <script> is a blob of crap at the end of the post).

I don’t really see a case where you would get down to the entity escaping alternative that doesn’t involve utterly broken garbage that should have already been dealt with anyway.

Comment by Ben de Groot #
2005-12-15 12:53:16

So my Atom 1.0 feed for WP 1.5.2 is correct then? If I adjust this for WP2, we should be okay. Note that this is just an ”all posts feed” and doesn’t work for categories and so on. But then WP only gives you this feed link.

 
Comment by Pete Prodoehl #
2005-12-15 13:00:09

Oh, and don’t get me started on how WordPress handles enclosures in Atom feeds… (I just hacked it in myself, but last I checked, they just used all the RSS enclosure code in the Atom feed, which was no good…)

 
Comment by Ben de Groot #
2005-12-15 15:50:21

OK, new patch attached to the bug report. Let’s see what’s gonna happen now…

 
Comment by IO ERROR #
2005-12-16 10:26:36

Ben, you could at least try working with us, rather than against us.

Comment by Ben de Groot #
2005-12-16 12:46:44

No, what’s that supposed to mean?

 
 
2006-01-18 09:51:21

[…] Apparently, 2.0 is going to ship with an invalid feed on account of Atom 0.3 is no longer a valid or supported spec and Matt is refusing to support the new spec in the next release, or explain his reasons for this decision. […]

 
2006-01-22 16:02:01

[…] Owen, I don’t know enough about the ins and outs of these feeds to be of much use, but there are people who do. [Phil Ringnalda seems to care, and you could probably talk Sam Ruby into helping with test cases—and if nothing else, Sam’s focus on it would get some interesting comments from Mark Pilgrim on the subject, and that would make enough of a kerfluffle to get some attention. ] But I am certainly arguing that we need to take time and figure out what testing needs to be done—fishbone diagrams, whatever. […]

 
