Resolving relative URLs in RSS items

Brent asks “should RSS aggregators and newsreaders parse (expand) relative URLs in RSS items?” (Before anyone else heads down the wrong track, he doesn’t mean in <link> or <url>, just in HTML in <description> or <content:encoded>.) His first inclination is to say “no”, and most of the people leaving comments say “no”, but Aaron Swartz says “yes”, and in a rare state of affairs I see that Aaron is right even before someone has to explain it to me.

Using relative URLs in your HTML is a good thing: with relative URLs it’s easy to mirror your site (whether it’s locally, on a server running on your PC, or on a public server because you’re just too popular), it removes one painful step in moving to a new domain, and if you do a lot of linking to yourself, it can even cut down on your bandwidth.

Expanding relative URLs is relatively easy[1]: <channel><link> is the URL for the HTML version of the channel, so getting an absolute URL should be as simple as chopping off the filename if the link URL has a path and doesn’t end in a slash, and slapping that in front of the relative URL. If that doesn’t work, then there’s probably something wrong with either the link or the relative URL, and a bunch of 404s should let the feed author know that something’s wrong. I’m more than a little tired of seeing feeds that are filled with broken image placeholders.
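A minimal sketch of that chop-and-prepend step, in Python. The function names and the example.com URLs are made up for illustration, and nothing here is taken from any particular aggregator. The first function does the literal "chop off the filename" described above. The second swaps in the standard library's urljoin, which also copes with root-relative (/images/...) and parent-relative (../) URLs that simple prepending would get wrong.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def chop_filename(channel_link: str) -> str:
    """Return the channel link with any trailing filename removed.

    Hypothetical helper: mirrors the "chop off the filename if the link
    has a path and doesn't end in a slash" step from the post.
    """
    parts = urlsplit(channel_link)
    path = parts.path
    if not path:
        # A bare host such as http://example.com has no path at all.
        path = "/"
    elif not path.endswith("/"):
        # Keep everything up to and including the last slash.
        path = path[: path.rfind("/") + 1]
    # Query string and fragment are dropped; they are not part of the base.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def expand_relative(channel_link: str, url: str) -> str:
    """Resolve url against the channel link.

    Uses urljoin rather than string concatenation, so absolute URLs pass
    through untouched and "/x" or "../x" resolve the way a browser would.
    """
    return urljoin(channel_link, url)


if __name__ == "__main__":
    link = "http://example.com/blog/index.html"  # assumed <channel><link>
    print(chop_filename(link))                       # http://example.com/blog/
    print(expand_relative(link, "images/cat.png"))   # http://example.com/blog/images/cat.png
    print(expand_relative(link, "/images/cat.png"))  # http://example.com/images/cat.png
    print(expand_relative(link, "../about.html"))    # http://example.com/about.html
```

For the plain "images/cat.png" case, chop_filename(link) + "images/cat.png" and expand_relative give the same answer; the two only diverge on the less common forms.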

[1] Oops, in the original I claimed that it was harder in RSS 1.0, where you would have to use the channel’s rdf:about, which might not be the URL for the HTML page. Thanks to Morten for reminding me that RSS 1.0 also has a <channel><link>. D’oh!

14 Comments

Comment by Bryce #
2002-10-25 22:31:46

My take is that RSS generators are in the position of knowing how to resolve a given relative URL, while an RSS consumer can merely make an educated guess. That the guess has a high probability of success doesn’t change the fact that it could be wrong.

 
Comment by Phil Ringnalda #
2002-10-25 23:02:58

Nope, sure doesn’t. On the other hand, while Brent is in control of one of the most popular RSS readers, none of us has been granted King of the World status yet, so while he can choose to be liberal with what he receives, we can’t choose to be strict with what the whole world produces. If Ev and Steve choose to expand relative URLs in RSS the way they do in emailed posts from Blogger, that’s cool, but Brent can’t force them to, and Blogger users can’t control their RSS output. If Ben chooses to add an MT global attribute expand_urls (or someone writes it as a plugin), again that’s cool, but I doubt that every existing MT blog will add it to their template. If Radio… sigh. There will be feeds with relative URLs, no matter what we think of them. Even if I didn’t approve of them, I don’t see this as an appropriate thing for the ”throw an error in your user’s face, and let them argue with the feed producer” style of RSS improvement. If you are talking about invalid XML, or clear violation of the spec, sure, but since it’s not at all clear whether or not it’s right, and for most feeds it’s dead simple to fix (RSS 1.0 feeds with an rdf:about that’s not the HTML page and with relative URLs in description/content have to be somewhere between a tiny minority and a microscopic minority), and trying to fix it but failing won’t change the user experience but will alert the producer that something’s wrong, I can’t see why Brent (and all the other reader developers) shouldn’t go ahead and fix it. If you want to start evangelizing Blogger, MT, and Radio to change what they produce, more power to you, and I’ll start making a list of other RSS producing products that don’t expand relative URLs for when you’re done with those, but I think I’ll save my IOUs for other things (and add relative URL expansion to my own aggregator code).

 
Comment by Kafkaesquí #
2002-10-26 06:08:04

Automatic relative URL expansion. There’s a phrase I never imagined I would see back when I was teaching myself HTML[null]. Progress is a wonderful, dreary little thing. Any chance we can just go back to WAIS? Nah, I didn’t think so.

 
Comment by Sam Ruby #
2002-10-26 07:47:10

Phil, there certainly are kinder and gentler approaches that should be considered. One thing that many validators do is have levels of validation. If you are doing something that, while valid, may cause some consumers some grief, then perhaps a warning is in order.

 
Comment by Phil Ringnalda #
2002-10-26 08:57:41

+1

Especially if some of the aggregators start to expand relative URLs, so that someone producing RSS with relative URLs might not even know that their images don’t display and their links don’t work for other people.

BTW, I did log expanding relative URLs as a feature request for Blogger Pro, since people on Blog*Spot really should be using relative URLs, anticipating the day when they’ll move out, and they’ve got absolutely no control over their RSS generation.

Other than worrying that people don’t understand the precise meanings, I rather like Ben’s take: SHOULD NOT produce, and MAY expand.

 
Comment by Morten Frederiksen #
2002-10-26 09:29:45

Phil,

Why would you want to use the channel’s rdf:about for this with RSS 1.0? The description of the channel’s link element says:

”The URL to which an HTML rendering of the channel title will link, commonly the parent site’s home or news page.”

… which I read essentially the same way as your description of the RSS 2.0 link element?

 
Comment by Phil Ringnalda #
2002-10-26 09:56:36

Premature (it is premature) senility isn’t very pretty, is it? I was confusing the fact that we don’t know the URL for the feed with not knowing the URL for the HTML. Thanks for setting me straight.

 
Comment by michel v #
2002-10-29 11:47:51

So, I take it that it would be nice if RSS producers (this mostly reads ’weblog tools authors’) implemented a way to automatically expand URLs in href and src attributes so that the RSS gives absolute URLs to make the aggregator happy.
I’m all for it, I’m going to try to add that to b2 :)
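
A rough sketch of what that producer-side expansion might look like, in Python. The function name, the regular expression, and the example.com URLs are all assumptions for illustration; this is not b2's code or any weblog tool's actual implementation. It assumes the HTML has already been unescaped, and it only handles quoted href and src values; a real tool would probably want a proper HTML parser.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src='...' (quoted values only). The lookbehind
# keeps attributes such as data-src from being treated as src.
ATTR_RE = re.compile(
    r"""(?<![\w-])(href|src)\s*=\s*(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def absolutize(html_fragment: str, base_url: str) -> str:
    """Rewrite relative href/src values in html_fragment against base_url."""

    def fix(match):
        name, quote, value = match.group(1), match.group(2), match.group(3)
        # urljoin leaves already-absolute URLs (http:, mailto:, ...) alone.
        return f"{name}={quote}{urljoin(base_url, value)}{quote}"

    return ATTR_RE.sub(fix, html_fragment)


if __name__ == "__main__":
    base = "http://example.com/blog/"  # assumed URL of the weblog's HTML page
    item = '<a href="archives/1.html"><img src="/img/a.png"></a>'
    print(absolutize(item, base))
    # <a href="http://example.com/blog/archives/1.html"><img src="http://example.com/img/a.png"></a>
```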

 
2002-10-26 06:33:35

Expansion of relative URLs

An interesting debate is going on over at Brent’s and at Phil’s. It’s over the question of whether or not

 
Trackback by Sam Ruby #
2002-10-26 16:02:10

RSS Best Practices

Brent, Phil, and Ben are discussing whether RSS feeds should have relative or absolute URLs in encoded HTML. This has bothered me in the past, particularly when viewing Joel On Software’s RSS feed through the lens of the Radio Aggregator. His feed has

 
Trackback by Mike @ Home #
2002-11-02 14:10:04

http://gibolin.dnsalias.org/archives/000021.html

phil ringnalda dot com: Resolving relative URLs in RSS items: ”should RSS aggregators and newsreaders parse (expand) relative URLs in

 
Trackback by Too Much News #
2002-11-03 02:49:00

Resolving relative URLs in RSS items

phil ringnalda: ”should RSS aggregators and newsreaders parse (expand) relative URLs in RSS items?”

 
Comment by Dave #
2006-03-16 10:30:27

I don’t know much about RSS specs, but I know that WordPress has a plugin to fix the whole relative URL problem: AbsoluteRSS. I wrote about it on my site here.

 