Meet the new tag soup

Of course Hixie’s pretty much right that we’ve now traded a tag soup of <table> and <font> for a tag soup of <div> and <span class="boldred" style="">, but it’s not all bad. While you could sometimes make some sense of the tables thanks to autoindenting, they were quite often tab-indented so far that with wrap on you sometimes had whole blank lines of tabs. The new soup is harder for a human to parse (“</div>? Which one, when there are 13 open?”), but the indenting’s a bit better.

Oh, the question, how do we get people to use semantic markup instead of soup? That’s simple, the same way we get them to use rich semantic RSS 1.n, or any other bit of abstract goodness: make it worth their while. Having a <link rel="alternate"> that points to another version of the page makes good semantic sense, and virtually nobody did it until Mark pushed and pulled us into using it for RSS autodiscovery. Had it been a push alone, “oh, it’s semantic and meta-filled,” probably nobody would have paid attention. But because it also pulled, with bookmarklets you could use right there and then to subscribe in multiple aggregators, we all said “cool, I can see that working!” and did it.
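
For anyone who hasn't seen the pull side of that bargain, here is a minimal sketch of what an autodiscovery client does, using only Python's standard library. The class name, the function name and the list of feed MIME types are my own assumptions for illustration, not anything the bookmarklets or aggregators of the day are known to have used.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect <link rel="alternate"> elements that advertise a feed."""

    # Assumption: these two MIME types are enough for a demonstration.
    FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []  # list of (title, href) tuples, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated list of tokens, compared case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rels and mime in self.FEED_TYPES and href:
            self.feeds.append((attr.get("title") or "", href))


def find_feeds(html):
    """Return the (title, href) pairs of every advertised feed in html."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds
```

Hand `find_feeds` the source of a page and it returns whatever feeds the author bothered to advertise; a page with no such <link> gets an empty list, which is exactly the "no carrot" case.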

If you want people to use sane and semantic headlines, you have to give them (er, actual ideas tend to be my downfall) a browser extension that shows a linked-up outline in the sidebar, or an on-hover outline in a search results page, or convince them that <h2>Blue Widgets</h2> will just absolutely kick <div style="font-size:14px">Blue Widgets</div> so far out of the results that they’ll be buying the competition’s leftover inventory and insisting that the former owner deliver it himself. On Saturday.
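
To make the outline idea concrete, here is a rough sketch of the extraction step such an extension would need, again in standard-library Python. The names `OutlineBuilder`, `outline` and `render` are hypothetical, and this is not how any particular extension works; it only shows that real headings give a tool something to grab, while a styled <div> gives it nothing.

```python
from html.parser import HTMLParser


class OutlineBuilder(HTMLParser):
    """Collect (level, text) pairs for h1..h6 in document order."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            # Collapse runs of whitespace so inline markup doesn't matter.
            text = " ".join("".join(self._text).split())
            self.outline.append((self._level, text))
            self._level = None


def outline(html):
    """Return the heading outline of html as a list of (level, text)."""
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.outline


def render(items):
    """Indent each heading by two spaces per level below h1."""
    return "\n".join("  " * (level - 1) + text for level, text in items)
```

Feed it a page where "Blue Widgets" appears once as an <h2> and once as a 14px <div>, and only the <h2> shows up in the outline.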

The people who are persuaded by elegance, and dreams of building the data that will somehow cause things that use it to spontaneously form, have already been persuaded, and didn’t quite form a critical mass. For the rest, I think we need an actual current benefit.


Comment by Lachlan Hunt #
2005-01-21 21:15:26

Outliner 0.2 is a Firefox extension for showing the document outline based on the headings that worked as a selectable table of contents. However, it claims to be compatible only with Firefox 0.8 to 0.9.1. I haven’t hacked the version numbers to test if it really is incompatible though.

Comment by Phil Ringnalda #
2005-01-21 21:25:32

Oh, FFS! I should have known that when it felt like I had half an idea, I was stealing it, and stealing from jgraham, not even a stranger. None too easy to link to a canonical page for it, but there is actually an Outliner 0.3 that’s 1.0+ compatible.

Comment by jake #
2005-01-22 23:03:03

Actually James updated his version back in November (I think) but noticed a bug so he didn’t release it. I don’t know the current status but I’m sure if he knew more people besides me were interested (seems there are) he’d find a little time to advance it. Maybe we can come up with some ideas on where he can take it. But for now I’m going to bed…

Comment by jgraham #
2005-01-25 07:26:39

So, it appears that I’m a long way behind on following up here. The latest status of the Outliner is that the 0.3 ’release’ is known to work with the 1.0 branch and seems to be good enough for day-to-day use. I tried to get the listing on updated but the relevant bug was reassigned to nobody and, despite getting several emails telling me the facility for authors to update extensions was ready for sign up, it doesn’t seem to work. So I can’t update the official listing. Instead you’ll have to grab it from

The lack of an independent web page is an issue that I shall try to address in the immediate future.

Bug reports, suggestions and, especially, patches are more than welcome :)

As for the general thread, I entirely agree that the only way to get people to write decent code is to provide some value for their efforts. That means client side tools that make browsing well-written web pages a more pleasant experience than poorly written ones. The problem, however, is that the majority of the web is so poorly written that users will reject the client-side tools themselves rather than the site when the tools fail to deliver the expected functionality.

Comment by Phil Ringnalda #
2005-01-25 07:56:25

All of update.m.o is a trainwreck right now, including using the “nobody@mozilla” default assignee where people are likely not to understand it. It’s fine in parts of the Core where everyone understands that it means “anybody@mozilla”, but for something so public-facing they really would have been better served by creating a default assignee like “update.listings@gmail”.

I hope that we are gaining some project memory that deploying a web app isn’t a trivial little thing, so we won’t do this again, but I fear we probably aren’t.

Comment by jgraham #
2005-01-25 08:42:02

I was under the impression that it really did mean “nobody’s working on this” whereas other aliases like general@dom.bugs actually represented real groups of people who actually receive the bugmail (which they then ignore). I assume there isn’t some sort of crazy system whereby nobody@mozilla can redirect bugmail depending on the component of the bug …

In any case I tried to update the listing almost a month ago and nothing’s happened. Which is just one more sign that update is, as you say, a trainwreck. But it’s brought webtools back into vogue in the project – at the very least we have the layout problem reporter tool (probably a good idea) and Gerv’s thing to take feedback and dump it in an unread newsgroup (eh?) both in development. No doubt we’ll have the same problems there…

Comment by Phil Ringnalda #
2005-01-25 09:10:46

Up to the moment when someone starts the few minutes it takes to install an extension, make sure it doesn’t completely break the browser and at least seems to work, it’s literally true that “nobody is working on this” and nobody really needs to read bugmail on it, either: whatever there may be, they can read in the bug when it gets to the top of the pile. So it’s “just” a cosmetic issue, but during this rather long lull in adding listings, it’s a rather large cosmetic issue.

Comment by Henri Sivonen #
2005-01-22 00:46:25

rel="alternate" is not necessarily semantically correct. The expert hindsight seems to agree on rel="feed" being more appropriate. But rel="alternate" is implemented, so…

Comment by Phil Ringnalda #
2005-01-22 07:41:46

Or, alternatively, it’s correct but people like the taste of the carrot so well they want to use it outside the semantics (people including me, I fear).

Since there is no single English word which expresses the relation between a single instance of the author’s writing and a sliding window of his most recent writing, changing from a working and deployed and sometimes semantic rel to a brand new and only vaguely semantic (and sometimes just as wrong: the rel between a five year old weblog post and the author’s current, nondynamic feed which will not allow the reader to work backward to the point where that post is included is, er, rel="feed-what-once-contained-me") rel strikes me as silly, but I don’t have enough code to maintain in that area to worry about it. If the four or five people I saw discussing it for Atom make it so that for all time, any discussion of how to implement feed discovery starts out “Well, it depends on which flavor of feed you have…” so be it.

Or will someone realize before last call that in fact, taking this page as an example, rel="alternate" is semantic for the entry-plus-comments feed, while rel="parent-feed" is semantic for the link back to the main feed? If we’re going to fork from well-known structure, we might as well fork ourselves while we are at it.

Comment by Firas #
2005-01-22 11:17:41

That’s what the title attribute is for! We can trust humans to make some decisions.

It’s like everyone’s itching to awaken Skynet.

Comment by Firas #
2005-01-22 11:22:43

I presume you’ve seen the W3C’s Semantic data extractor? Rather neat extension that would make.

Comment by Aankhen #
2005-01-24 01:28:03

No extension that I know of, but I did put together a small bookmarklet:

Extract semantic data

Right-click on the link and bookmark it.

Comment by seth #
2005-01-22 20:46:12

The new soup is harder for a human to parse (“</div>? Which one, when there are 13 open?”), but the indenting’s a bit better.

I always put comments with my </div>s to help with the human parsing. Like…

</div><!-- closes #subnav -->
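
The same habit can be automated. What follows is a deliberately naive sketch in Python that walks the <div> tags with a stack and appends a "closes" comment to each </div>. The function names and the label format are my own invention, and the regex approach is an assumption of convenience: it will be fooled by <div> text inside scripts or comments, and it will annotate a </div> a second time if it is already commented.

```python
import re

# Matches either an opening <div ...> (attributes in group 1) or a </div>.
TAG = re.compile(r"<div\b([^>]*)>|</div\s*>", re.IGNORECASE)
ID = re.compile(r"""(?<![\w-])id\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)
CLASS = re.compile(r"""(?<![\w-])class\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def label(attrs):
    """Name a div by its id, else its first class, else just 'div'."""
    match = ID.search(attrs)
    if match:
        return "#" + match.group(1)
    match = CLASS.search(attrs)
    if match and match.group(1).split():
        return "." + match.group(1).split()[0]
    return "div"


def annotate(html):
    """Append <!-- closes X --> after every </div> that has an opener."""
    stack = []  # labels of the divs that are currently open

    def replace(match):
        text = match.group(0)
        if not text.startswith("</"):
            stack.append(label(match.group(1)))
            return text
        if not stack:
            return text  # stray </div> with nothing open: leave it alone
        return f"{text}<!-- closes {stack.pop()} -->"

    return TAG.sub(replace, html)
```

Run over a page with 13 open divs, it answers "which one?" for every close, at least for markup that is balanced to begin with.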
Trackback by Antipixel #
2005-01-22 08:24:55

Some people do need convincing

Phil Ringnalda on the new tag soup

Trackback by Dare Obasanjo's WebLog #
2005-01-23 16:22:52

Taxonomies, Folksonomies and Metacrap

Trackback by #
2005-01-30 01:08:45

Metadata Musings

Summary: Random metadata musings, and a note about language information….
