…, glorious spam

Not much over two years after Andrew “I see the Googlebots walking among us” Orlowski incorrectly announced it as both imminent and likely to involve removing weblogs from the general search, Google Blog Search has launched.

I swear it’s not ego, just a useful testing set where I’m familiar with the possible results. I generally start testing anything that searches with my last name, and then maybe fiddle with it a bit. So, my second search was for ringnalda -site:philringnalda.com sorted by date. Yuck.

I was actually a little bit amused by the search engine spammer who quoted Shirley Kaiser’s

I was absolutely horrified when I read Phil Ringnalda’s comment spam alert story last year in which a Las Vegas real estate agent used a script to try to autogenerate comments to every single one of Phil’s entries, including links to the spammer’s real estate site.

in a Las Vegas real estate spam blog, when I saw it flicker through a PubSub feed the other day, since it rather nicely combines irony and nostalgia, but other than that? I got as far as page seven of the results, without seeing anything but page after page of identical mortgage spam posts. True, I haven’t been doing much lately to cause anyone to take my name other than in vain, but that’s just awful. Right now, the only way I’ll ever use Google’s Blog Search again is if I see someone’s post saying “finally, at last, Google has cleaned the spam out of their Blog Search.”


Comment by Phil Ringnalda #
2005-09-14 01:21:09

Getting a little less lazy and jumping ten pages at a time, actual content starts on page 51, so it’s “only” around 500 spam posts between the few real ones at the head and the rest of the real posts.

Comment by Stephen Duncan Jr #
2005-09-14 03:06:17

It does seem to work quite well when sorting by relevance though, which I imagine is Google’s main advantage, their ranking ability…

By the way, I think the site looks better at http://search.blogger.com

Comment by Adrian #
2005-09-14 04:36:06

As you say, the key is removing the spam from the search results, not the blogs.

Half the time when I am looking for, say, “information about how to change my iPod battery”, I’m as likely looking for a blog post on it as Apple’s official gumpf.

Orlowski seems to present the idea that blogs are the noise, which simply isn’t the case. Now comparison-bot-driven shopping sites and spam make up much more of the noise. More often than not blogs are part of the signal. Even blogs about lunch and cats. :-)

Comment by Shelley #
2005-09-14 07:00:51

Even without the spam, how viable is this? When I search on Missouri, I want to find out things about the state–the government sites, the maps, the tourist info, and so on. I don’t want to hear about a person who drove through the state on his way somewhere else.

Only good thing about it is you’ve posted. ’bout time ;-)

Comment by Adrian #
2005-09-14 07:26:42

But surely blogs of people who have driven through the state on holiday may link to maps, government sites, and that really top restaurant everyone should try, both giving you what you are looking for and increasing the value of those non-blog sites so they are higher up in the search anyway.

Comment by Shelley #
2005-09-14 07:35:11

If there was a level of sophistication, perhaps. If I had a way of searching on blogs for information related to Missouri among sites that primarily write on travel or food, the Ozarks or photography, then I can start to see some extra added value.

But I have enough blog reading I do during the day; when I search, I want to find fact, not opinion or throwaway asides.

Comment by Adrian #
2005-09-14 09:01:52

Oh, I agree. What we are all looking for is to get to the search results quicker. But how is the system (in this case Google) to know you are looking for fact and not opinion?

Let’s look at the case of restaurant review sites, of which there are loads. Do we move these to a special Google search page called foodreview? I mean, I often filter them out or add them in by putting the word review in. This works OK for food and pubs, but for consumer electronics it makes no difference, as all I get is price comparison sites either way. Or should these review sites be under blogs because they are opinion, not fact?

And then what about technical issues? When I’ve had a problem with coding something, I hit Google. I no longer bookmark anything because stuff changes so much. My solution is as likely to be on a blog (including yours or Phil’s sites) as to be in a web reference on the general net, or as likely to be linked from a blog to a resource with the answer.

I totally agree that it’s difficult to find things, as the web has grown really big really fast, and blogging has contributed to that just as much. However, the only use I can see for a blog-only search is to search for things when people talk about blogging. Which I barely ever do.

I’m not against ways of making Google more likely to find the right result. But I just don’t see how Google bots can tell the difference between “really useful factual blog post on how to code X” and “opinion about why coding X is a load of monkey bollocks”.

And I really don’t want to have to keep putting my searches into two Google pages to find what I want. I sometimes have to do that already with google.com and google.co.uk and it drives me nuts.

I think the better way to go is what Google seems to be doing: using people’s search history and behaviours to refine what people (and specifically me) are looking for with some sort of smart heuristics or analysis.

Comment by Phil Ringnalda #
2005-09-14 08:58:42

I would guess, exactly as viable as Technorati or Feedster. Probably I just travel in the wrong crowd, but I don’t actually know anyone who uses either one for search, only for research, answering questions like “who is talking about me,” “who is talking about my product,” or “who wrote about this that I can TrackBack.”

So far, I don’t see any sign that they’ve done the hard part: it’s Google Post Search, not Google Blog Search. They may do a wonderful job of finding the most relevant posts about a search term, but that’s not really a problem I need to have solved, because for that I want a general search, not one restricted to blogs (-and-things-with-feeds). If they could successfully tell me the most relevant blogs, in general and over the long haul, for a search term, that would be quite nice, but other than that all I’m likely to want from them is a feed of sorted-by-date hits on a term, to replace or supplement PubSub, and there the only way they can compete is by keeping it spam-free (though being able to see what you’ll get before you subscribe certainly competes very well with PubSub’s starting experience).

(And, yeah, I know, I’m sorry – Firefox release coming up, and I’m posting plenty in Bugzilla, but that doesn’t help people who don’t read every Firefox and Toolkit bug. I’ll try to do a little better than last fall, anyway.)

Comment by Shelley #
2005-09-14 10:45:47

I agree, I wouldn’t use this for any other reason than general weblog research. As for relevant blogs, I am wary of this because it comes down to link counts again, and yet another way of highlighting highly linked sites just leaves me cold.

Sounds like you’ve been busy and if the blog needs to slide, it needs to slide. I’m thinking of a long, long break myself.

Comment by Richard Evans Lee #
2005-10-16 15:13:41

I wasn’t really that surprised when I got some comment spam using “admin@philringalda.com” as the email. But the really surprising thing was the domain: microsoft-rape@…

Since the domain isn’t actually registered, this may have been a test to see if I block comment spam, or perhaps just a way to vilify you.

Unusually stupid, even for a comment spammer.

Comment by Phil Ringnalda #
2005-10-16 15:29:48