Super-viral Creative Commons licenses

As Richard notes, the new Yahoo! Creative Commons search returns any matching result which includes a link to one of the Creative Commons licenses, including his posts (and mine) talking about someone else’s use of a particular license. Creative Commons’ own search apparently only looks for embedded RDF licenses, and as a result finds fewer pages, since more and more people are abandoning the hack of putting RDF in HTML comments, both for licenses and for TrackBack. So, since people indicate their licensing by just linking to the license, Yahoo! assumes that anyone linking to a license is licensing something under it, and is licensing the broadest possible URL: if they catch you with an entry linking to a license on your main page, they’ll return your main page as being what’s licensed. Makes for a rather viral license, if you can’t even mention it with a link without being infected.
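To make the difference concrete, here is a minimal sketch of the two ways a crawler might read a license link. It is not Yahoo!'s actual code, which isn't public; the function names and sample snippets are made up for illustration, and it assumes licenses live under creativecommons.org/licenses/. The naive check treats any link to a license as a declaration, while the stricter one only counts links marked rel="license".

```python
from html.parser import HTMLParser

# Assumption: CC license URLs all start with one of these prefixes.
CC_PREFIXES = (
    "http://creativecommons.org/licenses/",
    "https://creativecommons.org/licenses/",
)


class LicenseLinkFinder(HTMLParser):
    """Collect links to CC licenses, noting which carry rel="license"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, has_rel_license)

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        if not href.startswith(CC_PREFIXES):
            return
        rels = (attr.get("rel") or "").lower().split()
        self.links.append((href, "license" in rels))


def find_license_links(html):
    finder = LicenseLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links


def naive_is_licensed(html):
    # Any link to a license counts, even one that merely mentions it.
    return bool(find_license_links(html))


def declared_is_licensed(html):
    # Only links explicitly marked rel="license" count.
    return any(declared for _, declared in find_license_links(html))


# Hypothetical sample pages: one talks about a license, one declares it.
MENTION = ('<p>Gothamist links to <a href="http://creativecommons.org/'
           'licenses/by-nc/2.0/">a by-nc license</a>.</p>')
DECLARED = ('<p>This entry is under <a rel="license" href="http://'
            'creativecommons.org/licenses/by-nc/2.0/">by-nc</a>.</p>')
```

With these samples, naive_is_licensed() is true for both pages, while declared_is_licensed() is true only for the second. Even the stricter check says nothing about which parts of the page the license covers.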

Eiffel Tower picture by Norman Walsh

If everyone who was actually using a license followed the Creative Commons advice to use their logo, display it prominently, and clearly explain exactly what is and isn’t licensed, that wouldn’t be too big a deal. However, the example that Yahoo! gives in their documentation, searching for Eiffel Tower, returns as its first result an entry on Gothamist, which links to its license with a teeny tiny copyright symbol, and no discussion of what is and isn’t licensed. I happen to know that I’m looking at a weblog entry, and since the linked comment policy doesn’t mention licensing your comment under their license, whatever that copyright symbol intends to CC license, it doesn’t cover comments, but I’ll bet that J. Random “What’s a Blog?” Searcher doesn’t know that, and would think he could use text from a comment under the same license as any other text on the page. Then, there’s the photo: I suspect that it isn’t actually CC licensed, because it isn’t actually Gothamist’s to license. That’s why I grabbed a copy of Norm’s Eiffel Tower picture from which to create my derivative work. Even there, I’m on rather shaky ground: the HTML page links to a by-nc Creative Commons license (which I’m then required to link to as well, though only that one picture is under that license, no matter what a search engine tells you), but the metadata he extracts from the RDF he embeds in his JPEGs says “All rights reserved.” Luckily, I don’t expect him to sue me, and I’m at least reasonably sure that he does hold copyright in that picture; the Gothamist picture is certainly copyrighted by someone, but I have no idea by whom, or what rights they do or don’t reserve.

I don’t expect the ocean to boil, and I don’t expect people to add precise and accurate metadata: after all, even Flickr, which is quite capable of getting it right, claims in the embedded RDF of that page that the CC license applies to <Work rdf:about="">, which says that it is the HTML of the page including the photo that is licensed, rather than a <Work rdf:about="…"> naming the photo’s own URL, which would say that it’s the photo which is licensed.
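The reason an empty rdf:about points at the page is that it is a relative URI reference, resolved against the URL of the document it sits in. A small sketch of that resolution follows. The page and photo URLs are hypothetical stand-ins, not Flickr's real ones, and the RDF is a stripped-down version of the kind of block Creative Commons suggested embedding.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC_NS = "http://web.resource.org/cc/"

# Hypothetical URLs, for illustration only.
PAGE_URL = "http://example.com/photos/norm/123/"
PHOTO_URL = "http://example.com/photos/norm/123.jpg"

# Minimal license block; %s is filled in with the rdf:about value.
RDF_TEMPLATE = """\
<rdf:RDF xmlns="http://web.resource.org/cc/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <Work rdf:about="%s">
    <license rdf:resource="http://creativecommons.org/licenses/by-nc/2.0/" />
  </Work>
</rdf:RDF>
"""


def licensed_subjects(rdf_xml, page_url):
    """Return the absolute URL each <Work> claims to license."""
    root = ET.fromstring(rdf_xml)
    subjects = []
    for work in root.iter("{%s}Work" % CC_NS):
        about = work.get("{%s}about" % RDF_NS, "")
        # An empty rdf:about resolves to the enclosing page's own URL.
        subjects.append(urljoin(page_url, about))
    return subjects


page_claim = licensed_subjects(RDF_TEMPLATE % "", PAGE_URL)
photo_claim = licensed_subjects(RDF_TEMPLATE % PHOTO_URL, PAGE_URL)
```

Here page_claim comes out as the page URL and photo_claim as the photo URL, so the only difference between licensing the HTML and licensing the picture is what goes inside those quotes.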

But if I were the Yahoo! lawyer who vetted their Creative Commons search, and let it loose without any disclaimer that “Yahoo! makes no assertion about what, if any, content in these results is actually offered under a Creative Commons license,” I’d be hanging my head in shame.


Comment by Geodog #
2005-03-30 01:04:29

S/He will be hanging his/her head in shame as soon as your post makes its way through the blogosphere.

Comment by Norman Walsh #
2005-03-30 09:58:57

Hmm. More recent photographs explicitly give the CC license. I’ve made a note to update the license for all the old photos too.

You’re welcome to the picture under the Creative Commons Attribution-NonCommercial License. :-)

Comment by Firas #
2005-03-30 10:26:12

Where should the RDF go instead? I mean, if one insists on using RDF and not link rel="license" or meta name="copyright" etc.

Comment by Mike Linksvayer #
2005-03-30 11:32:40

The CC search engine finds far fewer licensed pages mostly because it is a small project and has only crawled a few million pages so far (the current index has 1.2 million licensed pages).

Of course you can link to a license without licensing the content containing the link. Any search result that includes such content when doing a CC-only query is inaccurate. The Y! search for CC is just a (very valuable) start.

Regarding imprecise metadata (subject is always the current page) — it’s the best we can do without deep integration into whatever publishing software a user is using. It’d be great to have explicit license statements about every image and other non-page resource licensed, but requiring users to generate such statements would prevent most users from publishing metadata at all. See my related comments here and here.

As a partial workaround, the CC search engine takes dcmitype assertions as hints that a page contains licensed images, video, audio, etc. (That’s what the “format” option, if chosen, restricts on.)

Firas: One can link to an external RDF file in a page’s head, but this is beyond most people. I hope to replace the admittedly ugly embedding of RDF/XML in HTML comments with RDF-A when that is finalized.

Comment by Firas #
2005-03-30 13:06:48

Ah. I was concerned about a tool I write; throwing a link to an RDF file in the head, and making it output the RDF upon a request on that link, would be quite simple. Will do.

Comment by Phil Ringnalda #
2005-03-31 00:49:20