phil ringnalda : Twisty paths to RSS archives and import/export

I haven’t actually seen it myself, since I still haven’t caved and bought a Mac just so I can run Brent’s (RSS Archive) Blog Browser, but thanks to Robert Barksdale, you can see a screenshot of my MT blog’s RSS archive being BlogBrowsered.

Once I get a fresh copy of Radio downloaded, I think I’ll try importing my MT blog, just to see what happens. Since the import-from-RSS feature is a few minutes old at the moment, and wasn’t built expecting to import my ad-hoc MT stuff, I expect I’ll break it, but if it works… oh, I love the smell of interop.

This entry was posted on Wednesday, November 27th, 2002 at 8:56 pm and is filed under blogging tech.

9 Comments

Comment by Phillip Pearson #

2002-11-27 21:13:04

I’m writing a decoder for all this in Python and may have time to do a wxWindows blog browser so Windows and Linux people can join in … the importer for bzero will come first though ;-)

Comment by Phillip Pearson #

2002-12-03 03:09:39

Update: it works!

http://www.myelin.co.nz/gazer/

Comment by Mark Gardner #

2002-12-03 09:09:56

I’m now storing my Blogger-driven site in RSS 2.0 and the “files.xml” format. Could one of you guys test it with your apps? I’ve got a main RSS file and a list of archive files. On the last one, I’ve added titles to each file (using the Dublin Core title element); I hope this is legal.

Comment by Phil Ringnalda #

2002-12-03 09:48:42

Bravo! I’d idly thought about Blogger (it would work for an MT import format for more than 999 entries, where the current scheme fails), but hadn’t got past idle thought. BlogGazer 0.01 completely fails to read your blog, since it’s a pretty radical departure from the files.xml format and 0.01 doesn’t even tolerate my pubDates that don’t have the weekday in the date, but it’s still early days.
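Tolerating a missing weekday doesn’t take much. Here is a minimal sketch in Python (not BlogGazer’s actual code, which I haven’t read): the standard library’s `email.utils.parsedate_to_datetime` reads RFC 822 style dates with or without the leading weekday. The function name `parse_pub_date` and the sample dates are made up for illustration.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


def parse_pub_date(text: str) -> Optional[datetime]:
    """Parse an RSS pubDate string, returning None when it is unreadable."""
    try:
        return parsedate_to_datetime(text.strip())
    except (TypeError, ValueError):
        # Older Python versions raise TypeError on garbage, newer ones ValueError.
        return None


# The weekday is optional in this date format, so both forms should parse.
with_weekday = parse_pub_date("Wed, 27 Nov 2002 20:56:00 GMT")
without_weekday = parse_pub_date("27 Nov 2002 20:56:00 GMT")
```

Returning `None` instead of raising lets a reader skip or flag one bad entry rather than refuse the whole feed.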

Dave: could you make the size="" part of files.xml optional in the spec, so apps won’t depend on it being there, or do you need it for Radio’s implementation? It’ll be somewhere between hard and impossible to do in other apps. I just hand-coded my files.xml to get around the problem, but that keeps me from doing it as a live archive. If I have to, I can do files.xml as a PHP script that reads the directory and fills in the file sizes, but that really complicates the process.
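For a sense of what the directory-reading script would involve, here is a rough sketch, in Python rather than PHP. Since there is no written spec for files.xml, the `files` and `file` element names and the `path`, `size` and `modified` attributes are my guesses for illustration, not the real format; `build_files_xml` is a made-up name too.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_files_xml(archive_dir: str) -> str:
    """List every .xml archive under archive_dir with its size and mod date."""
    # Element and attribute names are assumptions, not a published format.
    root = ET.Element("files")
    for dirpath, dirnames, filenames in os.walk(archive_dir):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            if not name.endswith(".xml") or name == "files.xml":
                continue
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)
            relative = os.path.relpath(full_path, archive_dir).replace(os.sep, "/")
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            ET.SubElement(root, "file", {
                "path": relative,
                "size": str(info.st_size),
                "modified": modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
            })
    return ET.tostring(root, encoding="unicode")


# Demo against a throwaway directory holding one six-byte archive file.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "2002"))
    with open(os.path.join(demo_dir, "2002", "11.xml"), "w", encoding="ascii") as f:
        f.write("<rss/>")
    listing = build_files_xml(demo_dir)
```

It works, but it means running code on every request (or on every publish) just to report numbers a reader may not need, which is the complication I’d rather avoid.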

Comment by Dave Winer #

2002-12-03 12:40:52

Phil, I just got back from NY, so this has to be quick. There is no spec for files.xml. However I’m sure when there is one, size will be optional. Radio doesn’t need it, particularly, other than to be able to tell if a file changed, but the mod date should do the trick.

Comment by Mark Gardner #

2002-12-03 13:12:44

I can’t really put modification dates, sizes, or what have you in my Blogger-generated archive file. Just the file paths and titles.

Also, looking at the screen shot of BlogBrowser, it looks like entry dates are inferred from the file paths listed. Is this really a good idea? What if someone stores things using a different directory hierarchy? (OK, forget the “what if”: that’s what I’m actually doing, by necessity of how Blogger dumps its files loose into one directory.)

And, well, if the intent is to express hierarchy, shouldn’t we be using OPML?

Comment by Phillip Pearson #

2002-12-03 16:06:17

BlogGazer uses the pubDate element in the RSS to determine date … I assume that’s how BlogBrowser does it as well.

Comment by Phil Ringnalda #

2002-12-03 20:07:49

Are you guys talking about the same dates? I’m pretty sure Mark means the years and months in the tree menu on the left side, which surely aren’t coming from pubDate (since otherwise you would have to download and parse every bit of RSS before you could display anything, and it doesn’t look like BlogGazer’s doing that). Don’t those come from files.xml?

And isn’t he screwed for that, since Blogger is a complete bitch to work with in the archive template? The best thing I can think of is to publish the Blogger archive template as PHP, with a bit of code to parse Blogger’s date ranges and turn them into creation dates, tell your server to parse it as PHP in .htaccess, and then use mod_rewrite to serve up Blogger’s archive files for requests for /2002/10.xml. It’s not very satisfactory, but then we are talking about Blogger-produced archives, so you can’t expect to be very happy with it no matter what.
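To make the problem concrete, here is a small Python sketch of how a browser might build its year and month tree from paths alone. This is a guess at the approach, not what BlogBrowser or BlogGazer actually do; `build_tree` is a made-up name, and the flat file name in the demo is a hypothetical Blogger-style name, not a real one.

```python
import re
from collections import defaultdict

# Matches paths ending in /YYYY/MM.xml, such as /2002/10.xml.
ARCHIVE_PATH = re.compile(r"(?:^|/)(\d{4})/(\d{2})\.xml$")


def build_tree(paths):
    """Group archive paths by year and month; collect the ones that don't fit."""
    tree = defaultdict(dict)  # year -> {month -> path}
    unplaced = []
    for path in paths:
        match = ARCHIVE_PATH.search(path)
        month = int(match.group(2)) if match else 0
        if 1 <= month <= 12:
            tree[int(match.group(1))][month] = path
        else:
            unplaced.append(path)
    return dict(tree), unplaced


tree, unplaced = build_tree([
    "/2002/10.xml",
    "/2002/11.xml",
    "2002_10_01_archive.xml",  # hypothetical flat name, all files in one directory
])
```

Anything that lands in `unplaced` has no spot in the tree unless the dates come from somewhere else, which is why rewriting requests to a /2002/10.xml shape is about the only way to keep a path-based browser happy.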

Comment by Mark Gardner #

2002-12-04 14:22:24

Yes, I mean the tree menu, which as Phil said, clearly isn’t coming from pubDate.

I really don’t want to have Blogger generating a PHP file. I like the fact that I can use the same XSLT to generate both content and archive pages. (Though it could use a lot of refactoring… not enough time lately. :-( )
