Sharing’s good, mmmmkay?

Like Sam, I ping weblogs.com because it’s the big daddy, and if something reads any changes.xml it’ll read that one, and I ping blo.gs because it’s fast and feature filled, and even though it will pick up weblogs.com pings once an hour, there are features (like IM notification) that you only get if you are pinging it directly, and I ping Movable Type because it puts me in good company, and now I ping blogrolling.com, too, just to see what Jason can do with the pings, but I shouldn’t have to do all that pinging, and they shouldn’t have to do all their cross-polling of changes.xml files.

If weblogs.com just registered for notifications from blo.gs’s cloud, then with a little trust it could just treat pings received from blo.gs as pre-verified, and cut down on all the traffic load that I presume is what’s slowing it down so much. Pinging programs would only need to ping until one went through: ping blo.gs, and if the ping succeeds you’re done; if it fails because blo.gs is down, fall back to weblogs.com; if that doesn’t go through, then try blogrolling.com. But it’s going to require a bit of cooperation, and quite a bit of trust. Why, if Blogger would cooperate on what constitutes a change (more than just clicking the Publish button), and weblogs.com and blo.gs would trust them to provide a clean update list without having to check for themselves, we could finally have update notification for more than just the privileged few Blogger Pro weblogs.
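For what it's worth, the ping-until-one-goes-through idea might look something like the rough Python sketch below. The function names and the endpoint list are my own placeholders, not anything the services publish, and I'm assuming the reply to a `weblogUpdates.ping` call is a struct with an `flerror` flag.

```python
import xmlrpc.client


def xmlrpc_ping(endpoint, name, url):
    """Send one weblogUpdates.ping call; True if the server reports no error.

    Assumes the reply is a struct with an 'flerror' boolean.
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    reply = server.weblogUpdates.ping(name, url)
    return not reply.get("flerror", True)


def ping_until_accepted(endpoints, name, url, send=xmlrpc_ping):
    """Try each endpoint in order and stop at the first one that accepts.

    Returns the endpoint that took the ping, or None if every one failed.
    """
    for endpoint in endpoints:
        try:
            if send(endpoint, name, url):
                return endpoint
        except (OSError, xmlrpc.client.Error):
            # Server down, timeout, or an XML-RPC fault: try the next one.
            continue
    return None
```

This only pays off if the services really do share pings among themselves; without that, stopping at the first success just means the others never hear about the update.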

12 Comments

Comment by Sam Ruby #
2003-03-16 11:41:47

There’s a Russian saying: Trust but Verify. These people don’t actually have to trust each other that much: they always have the option to verify. To my knowledge, most if not all of these applications already verify that pages identified by pings have, in fact, changed since the last ping.

Comment by Phil Ringnalda #
2003-03-16 12:02:26

Trust but Verify works for us, letting us just ping once and have it quickly propagate, but it doesn’t offer any real advantage to them: maybe they pick up a few more pings, but they still have to work just as hard with those pings, and add on a bit more burden having to sort out which cloud pings are things they’ve already seen. A weblogs.com cloud would be good for blo.gs, but since most things that ping blo.gs but not weblogs.com are the banned things that were the impetus for blo.gs, there’s no real benefit to weblogs.com if it has to treat pings from blo.gs’s cloud just like any other ping.

I can’t seem to track down any mention of how many Blogger-powered weblogs update per minute, but I know it would be a serious number compared to what weblogs.com is processing now, with the added burden of having to try to get pages out of Blog*Spot to verify the pings. If Blogger provided an update cloud that could be trusted without verification, then I’d think there would be enough clear benefit to make it worthwhile to all four, plus whoever else is out there, to start sharing.

Comment by jim winstead #
2003-03-16 14:02:05

on the other hand, one reason that a service like blo.gs may want to still download information from the pinging site is to be able to do what the aim bot does — include the most recent rss item in the update sent via im. (or sniff out the rss feed from sites using the non-rss-including weblogs.com interface.)

but i’m surprised none of the rss search engine and link analysis sites has really tapped into the blo.gs cloud yet. although only about a third of the sites listed ping blo.gs directly (and are thus forwarded via the cloud interface), that gets you 30% closer to real-time updates.

what really needs to happen is for a nice peer-to-peer distributed network of rss item flow to spring up. i’m not sure how you bootstrap it, though.

(another question looming is “what will google do?”)

 
 
 
Comment by ruzz #
2003-03-16 12:41:14

Hey, if you can think of a way bbt can make this cleaner send me an email. I’m happy to work towards pinging efficency. even if i cant spell it right now :)

Comment by Phil Ringnalda #
2003-03-16 12:58:07

Unless the pingees are willing to share, I don’t think there’s anything the pingers can do to improve things beyond pinging everything in sight. You have to ping weblogs.com because it doesn’t read anything else, you have to ping blo.gs because otherwise you’re part of the once-an-hour crowd from weblogs.com, and you don’t get included in the IM notifications. Probably the place to clean things up is convincing Jason to register with the blo.gs cloud rather than taking direct pings, since unless he’s got something planned for direct pings only, that ought to cover the same people.

 
 
Comment by Morten Frederiksen #
2003-03-16 14:12:22

Hmm, perhaps someone should invoke the LazyWeb for a standalone personal ping proxy, for asynchronous pings…

 
Comment by Julian Bond #
2003-03-17 07:24:49

As well as the standard (REST+XMLRPC+SOAP) for Pings, and the FanOut server (that the client site pings and which then pings all the MetaPing sites), when are the search engines (Google) going to get involved?

As a web developer, I want to ping one place, really fast, to say “I’ve changed. Please index me”. I then want that ping to propagate to all the search engines and Meta aggregators without any further involvement from me.

How much evangelism and cooperation will this take? Or will it just take one site to fire up a FanOut server? Or would it be better to follow Morten’s suggestion and build a few personal ping proxy servents in various environments and distribute the problem?

ps Lazyweb

Comment by Phil Ringnalda #
2003-03-17 07:58:20

Is there actually any difference between a personal ping proxy server and a FanOut server?

If I was doing a PPPS, I would want it to accept a weblogUpdate ping, check the URL to be sure it had changed (to keep me from sending spam pings when-not-if I screw up somehow), fire off pings to everywhere I know of, log the ping, and mail error reports to webmaster@. If I was doing a FanOut server, I would …

Hrm. Having typed that, I realized that I would see that I was building a way to spam webmaster@. How do you handle errors async other than by email? Required registration and registration of each URL you’ll ping from would be a pain, but I don’t see a way around it.
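The proxy steps described above (accept a ping, check that the page really changed, fan out, log) could be sketched roughly as follows in Python. Everything here is hypothetical: `PingProxy`, `fetch` and `send` are names made up for illustration, change detection is just a content hash, and the error mail is left out, with failures only logged and returned.

```python
import hashlib
import logging

log = logging.getLogger("pingproxy")


class PingProxy:
    """Forward a ping to several targets, but only if the page changed."""

    def __init__(self, targets, fetch, send):
        self.targets = list(targets)
        self.fetch = fetch        # fetch(url) -> bytes of the page
        self.send = send          # send(target, name, url) -> bool
        self.last_digest = {}     # url -> hash of the last page forwarded

    def handle_ping(self, name, url):
        """Return {target: succeeded}; empty dict if the page is unchanged."""
        digest = hashlib.sha256(self.fetch(url)).hexdigest()
        if self.last_digest.get(url) == digest:
            log.info("unchanged, not forwarding: %s", url)
            return {}
        self.last_digest[url] = digest

        results = {}
        for target in self.targets:
            try:
                results[target] = bool(self.send(target, name, url))
            except OSError as exc:
                log.warning("ping to %s failed: %s", target, exc)
                results[target] = False
        return results
```

Returning the per-target results rather than mailing them sidesteps the webmaster@ problem for a personal proxy, but does nothing for a public one, where the caller may never see the return value.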

Then there’s the search engine question, which Jim asked as well. Even though the centralized nature of it makes me nervous, the best fit I can see for Googler working together without being evil is for Google to take over the accepting pings market, by accepting weblogUpdate pings, and allowing essentially unlimited (something like “no more than once a second” rather than “no more than once an hour”) access to their changes.xml file. If pinging them means that Google will instantly crawl and index your site, I would think that they could pretty quickly corner the market on being pinged, leaving Jim to work on value-add like favorites registration and IM notification powered by Google’s changes.xml.

If they do, I predict a serious explosion in pinging tools and in the sort of sites that ping, since being able to get your newspaper site (or your tshirt store site, for that matter) indexed as soon as you add something would be a huuuge thing. I’m not sure how (or if) you could deal with that: determining whether or not something’s a blog programmatically sounds tough, since I sometimes have trouble looking at it in a browser, and now that Barbie has a blog it’s only a matter of time before every business is saying the same things about how you have to have a blog to survive that they were saying about having a brochureware site last decade.

Comment by Morten Frederiksen #
2003-03-17 13:23:06

Phil,

With regard to the security/spam issues, I was thinking along the lines of a private and local “server”, meaning a somewhat simple script of sorts that lived with your weblog software of choice, and only accepted pings from your own weblog (software and machine).

This of course only solves half the problem, the posting delays and possible timeouts – the “fetching changes.xml” problem is, well, unchanged.

With regard to error handling, for this case I’d simply ignore a non-sent ping.

Hmmm, tangentially, I guess the concept could be extended to asynchronous plugins and be of general use, perhaps with updates to other local files, including search indexes etc. (the latter would of course not really be necessary if Google were to be instantly updated).

Comment by Phil Ringnalda #
2003-03-17 13:51:44

Sure, I understand what you mean for a personal proxy, and for all anyone knows I’m already using one with security by obscurity (I’m not, yet…). But you’ll want a PPPS that accepts weblogUpdate pings, even if you have just one weblog and already know the URL and name for it, just so you can easily hook it up with MT or anything else that lets you ping an arbitrary server. Having lived with both programs that ignore the response and ones that don’t, I really prefer ones that don’t, so I’ll want it to log results and tell me when there’s an error (since the error might be that I screwed something up, rather than just a timeout). Once I’ve got all that, it’s arguably more trouble to keep it personal than it is to make it public. To keep it private you have to do something to block outside pings, like look at the IP sending the ping (and beware when your host moves you without telling you), or maybe something cunning that I’m not sharp enough to think of. Or you can figure some way around the notification issue and just make it public. Not that I think there would be a huge demand for any particular person’s ping proxy, since anyone who can ping a particular URL probably also has the ability to run at least one of Perl/PHP/Python, and if you add in bookmark(let) pinging, you could proxy your Blogger-using friends around with a single click too.

It’s not something that I think has to be done, just sort of interested me once I got to thinking about it, how little difference in effort there would be between making a private one and a public one.

 
 
Comment by jim winstead #
2003-03-18 08:50:49

as far as accepting unlimited access to a hypothetical google.com changes.xml, it would need to be something smart enough that you could say “tell me the updates since <some date/time>”. pulling down three hours of updates every hour is already pretty wasteful.
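A “tell me the updates since” query could be as small as the Python sketch below on the server side. The `updates_since` name and the `(timestamp, url)` entry shape are assumptions made for illustration, not how any existing changes.xml service stores its data.

```python
from datetime import datetime, timezone


def updates_since(entries, since):
    """Return the (updated_at, url) entries strictly newer than `since`.

    `entries` is any iterable of (datetime, str) pairs; the result is
    sorted oldest first so a client can remember the last timestamp seen.
    """
    newer = (entry for entry in entries if entry[0] > since)
    return sorted(newer, key=lambda entry: entry[0])
```

A client would pass back the newest timestamp from its previous poll, so each request carries only what arrived in between instead of a fixed three-hour window.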

and i think if google entered the picture, nobody operating at my level would be able to handle the volume of pings they would attract.

Comment by Phil Ringnalda #
2003-03-18 09:17:18

We certainly could use some way of saying “I already know about everything up to this, what’s new” – see several dozen discussions on RSS here and elsewhere, where people who say that HTTP solves all problems come up against the fact that there’s no way to say in an HTTP header that you don’t want anything if the file hasn’t changed since datetime, but if it has you only want the new stuff since then.

The “what’s a blog?” problem just flat out baffles me, but it seems inescapable. In fact, right now I’m thinking that Google will have to be evil, and only instantly crawl Blogger-powered blogs, just because that’s the one sure way of saying that something’s a blog: we know it was published with blogging software. Whether it’s Google’s own ping-list or weblogs.com/changes.xml, if it’s known that Google will instantly crawl and index you if you appear there, there’s going to be a flood.

 
 
 
 