Since Bill mentioned supporting the RSS syndication module, I’ve been looking at the way it is actually used, and I have to say that I hope no aggregator I use supports it, at least not until people start using it right.
mod_syndication borrows three elements from Ian Davis’s OCS format: updatePeriod, updateFrequency, and updateBase. In OCS, all three are defined as optional, though the RSS spec doesn’t mention whether they are or not. If updatePeriod is omitted, it’s assumed to be “daily”, and if updateFrequency is omitted, it’s assumed to be 1.
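To make those defaults concrete, here is a minimal sketch of how an aggregator might turn updatePeriod and updateFrequency into a polling interval. The function name `poll_interval` and the lengths used for "monthly" and "yearly" are assumptions made for illustration, not anything the module defines.

```python
from datetime import timedelta

# Period lengths. "monthly" and "yearly" are rough assumptions (30 and 365 days).
PERIODS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "yearly": timedelta(days=365),
}


def poll_interval(update_period=None, update_frequency=None):
    """Return the suggested time between polls (hypothetical helper).

    A missing updatePeriod is treated as "daily" and a missing
    updateFrequency as 1, as described above.
    """
    period = PERIODS["daily" if update_period is None else update_period]
    frequency = 1 if update_frequency is None else int(update_frequency)
    if frequency < 1:
        raise ValueError("updateFrequency must be a positive integer")
    # updateFrequency counts updates per period, so divide the period by it.
    return period / frequency
```

With nothing specified, this gives one day; "daily" with a frequency of 4 gives six hours.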
If you give only updatePeriod and updateFrequency, then you are just saying “I think it’s appropriate for an aggregator to poll me this often” (to which I say “I know more about how I use my aggregator than you do, so keep your hints to yourself: if I’m only online for an hour and fifteen minutes before work, I want to poll all my subscriptions twice, and you can jolly well take that bandwidth hit”).
However, if you include an updateBase, you are saying that you update on a schedule, and that aggregators shouldn’t expect any content outside that schedule. Suppose I updated my site religiously at midnight, 6 am, noon, and 6 pm (stop that giggling about the idea of me updating regularly, you!). By saying:
I would tell compliant aggregators “after you do your first update after noon, whenever that might be, you should then wait until after 6 pm to update again,” and the same for an update after 6 pm, midnight, and 6 am. You don’t have to check at noon precisely (in fact, you’d be foolish to risk cutting it that close), but having checked once between noon and 6 pm, I guarantee that you won’t find anything new until after 6.
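The midnight, 6 am, noon, 6 pm schedule can be sketched in code. The feed values in the comments (a daily period, a frequency of 4, and a base of noon UTC on an arbitrary date) are hypothetical stand-ins rather than the post's original markup, and `next_poll_after` is an illustrative name for how a compliant aggregator might work out when it is next allowed to check.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical feed values for the four-times-a-day schedule:
#   updatePeriod    = daily
#   updateFrequency = 4
#   updateBase      = 2003-01-01T12:00:00+00:00  (arbitrary date, noon UTC)
UPDATE_BASE = datetime.fromisoformat("2003-01-01T12:00:00+00:00")
INTERVAL = timedelta(days=1) / 4  # six hours between scheduled updates


def next_poll_after(last_poll, update_base=UPDATE_BASE, interval=INTERVAL):
    """Return the first schedule boundary strictly after last_poll."""
    # Floor division of two timedeltas yields a whole number of slots.
    slots_passed = (last_poll - update_base) // interval
    return update_base + (slots_passed + 1) * interval


if __name__ == "__main__":
    polled = datetime(2003, 6, 1, 12, 5, tzinfo=timezone.utc)
    # Having polled at five past noon, wait for the 6 pm boundary.
    print(next_poll_after(polled))
```

The exact moment of the first poll does not matter; only the slot it falls into does, which is why checking at 12:05 or at 17:59 leads to the same next poll time.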
However, when it’s misused by a site which doesn’t update on a regular schedule, updateBase says foolish and damaging things. To pick on an anonymous site I was just looking at, using:
says “this site is updated once per day, at noon in Western Europe.” Once an aggregator which supports mod_syndication has gotten the feed at five after noon, it shouldn’t try again until after noon the next day. In fact, as is typical of the sites I’ve seen using updateBase, it updates at any old time: some days there’s an update at 9-something in the morning and another at 3-ish in the afternoon, other days there’s nothing at all. Say one day the site updates twice in the afternoon, and then once the next morning. If I use an aggregator which supports mod_syndication, and the first day I fire it up shortly after noon, then leave it running all afternoon, I won’t see a single one of the afternoon updates, since they told me that there wouldn’t be anything new until noon the next day. Next day, I start my aggregator when I get up, and shut it down shortly before noon. I still won’t see the updates from the day before, nor will I see the updates from that morning, because my aggregator is still waiting for noon to roll around so it can check again.
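That scenario is easy to replay as a small simulation. Everything specific here is an assumption made for illustration: the dates, the exact posting times, the hourly polling attempts, a UTC+01:00 offset standing in for "Western Europe", and the helper names. It only models a strictly compliant aggregator as described above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "noon in Western Europe" is treated as 12:00 at UTC+01:00.
CET = timezone(timedelta(hours=1))


def next_allowed(last_poll, base, interval):
    """First schedule boundary strictly after last_poll."""
    return base + ((last_poll - base) // interval + 1) * interval


def compliant_fetches(wanted_polls, base, interval):
    """Keep only the polls a strictly compliant aggregator would perform."""
    performed, not_before = [], None
    for when in sorted(wanted_polls):
        if not_before is None or when >= not_before:
            performed.append(when)
            not_before = next_allowed(when, base, interval)
    return performed


def scenario():
    base = datetime(2003, 1, 1, 12, 0, tzinfo=CET)  # hypothetical updateBase
    interval = timedelta(days=1)                     # daily, frequency 1
    day1 = datetime(2003, 6, 2, tzinfo=CET)          # arbitrary example date
    day2 = day1 + timedelta(days=1)
    # Two afternoon posts on day one, one morning post on day two.
    posts = [day1 + timedelta(hours=15),
             day1 + timedelta(hours=16, minutes=30),
             day2 + timedelta(hours=9)]
    # The reader's aggregator would like to poll hourly while it is running:
    # all afternoon on day one, then from 7 am until just before noon on day two.
    wanted = [day1 + timedelta(hours=h, minutes=5) for h in range(12, 19)]
    wanted += [day2 + timedelta(hours=h, minutes=5) for h in range(7, 12)]
    fetches = compliant_fetches(wanted, base, interval)
    # A post is seen only if some fetch happened at or after it was published.
    seen = [p for p in posts if any(f >= p for f in fetches)]
    return fetches, seen


if __name__ == "__main__":
    fetches, seen = scenario()
    print(len(fetches), "fetch performed;", len(seen), "of 3 posts seen")
```

Under these assumptions only the first poll at five past noon goes through, and none of the three posts is ever fetched.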
So: for those rare sites that are updated by the clock, rather than by when content is available, mod_syndication with an updateBase is a good thing, but for most sites, it’s either an unwelcome suggestion about what the owner thinks is an appropriate polling period, or a positive nuisance, denying you updates if you happen to use your aggregator at the wrong time of the day. I’m pretty sure there are vastly more sites that can say “I never update during these hours,” which is what skipHours says, than there are sites that can say “I update with this exact frequency, starting at this time of day.” Using RSS 1.0 rather than 0.9x/2.0? No problem, just use mod_rss091, add the namespace, and you can use
Erm. Note to implementers of skipHours: could you please support both skipHours and http://purl.org/rss/1.0/modules/rss091#skipHours?
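Supporting both spellings need not be much work. The sketch below uses Python's standard `xml.etree.ElementTree` and accepts either a plain skipHours element or one in the mod_rss091 namespace. The function names are hypothetical, and the assumption that the namespaced form carries `hour` children just like the plain one is exactly that, an assumption.

```python
import xml.etree.ElementTree as ET

RSS091_NS = "http://purl.org/rss/1.0/modules/rss091#"
# ElementTree spells namespaced tags as "{namespace}localname".
SKIP_TAGS = ("skipHours", "{%s}skipHours" % RSS091_NS)


def local_name(tag):
    """Strip any "{namespace}" prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def skipped_hours(feed_xml):
    """Return the set of hours (as integers) the feed asks not to be polled.

    Assumption: both forms list their hours as child elements whose local
    name is "hour", whatever namespace those children happen to be in.
    """
    root = ET.fromstring(feed_xml)
    hours = set()
    for element in root.iter():
        if element.tag in SKIP_TAGS:
            for child in element:
                if local_name(child.tag) == "hour" and child.text:
                    hours.add(int(child.text.strip()))
    return hours


def may_poll(feed_xml, hour):
    """True if the given hour is not listed in the feed's skipHours."""
    return hour not in skipped_hours(feed_xml)
```

An aggregator would call `may_poll` with the current hour before fetching, and a feed with no skipHours at all simply yields an empty set.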