AP says "Web log" but real bloggers say "weblog"… and Google says "glarbifulous"

Well, Google didn’t say “glarbifulous” on its own, but I had a good reason to search the internet for a nonsense word.

To confirm my feeling that the Associated Press’s preference for “Web log” is far less popular online than the traditional “weblog,” I did a quick Google search.

12,900,000 Google hits for [“weblog”]

I expected that. For years, my own blog has ranked anywhere from 99 to about 180 out of however many hits there are for “weblog,” and I’ve been tracking that number every year when I submit my annual faculty report. The total seemed a little lower than I remembered, but I realize that Google’s numbers fluctuate as it re-indexes older sites.

I wasn’t surprised when I found only a paltry

250,000 Google hits for [“web log”]

… since only AP writers format the term that way. But when I tried to exclude the AP articles that use “web log,” I found…

24,700,000 Google hits for [“web log” -AP]

Why do I get roughly a hundred times more hits for what should be a more restrictive search?

The Googly weirdness does not stop there. When I require “AP” as well, why do I get 25,000 more hits than when I search for “web log” alone?

275,000 Google hits for [“web log” AP]
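To put numbers on the weirdness so far, here is a quick sketch in Python that does the arithmetic on the hit counts quoted above. The dictionary and the `ratio` helper are my own illustrative names, and the figures are Google's estimates as I saw them, not exact totals.

```python
# Hit counts as quoted in this post (Google's estimates, not exact totals).
hits = {
    '"weblog"': 12_900_000,
    '"web log"': 250_000,
    '"web log" -AP': 24_700_000,
    '"web log" AP': 275_000,
}

def ratio(query_a, query_b):
    """Return how many times more hits query_a reported than query_b."""
    return hits[query_a] / hits[query_b]

# Excluding a term should shrink the result set, yet the count grows.
exclusion_ratio = ratio('"web log" -AP', '"web log"')

# Requiring an extra term should also shrink it, yet the count grows again.
inclusion_extra = hits['"web log" AP'] - hits['"web log"']

print(f"Excluding AP: {exclusion_ratio:.1f} times as many hits")  # 98.8
print(f"Requiring AP: {inclusion_extra:,} more hits")             # 25,000
```

Both searches are narrower than the plain [“web log”] search, so if the counts were exact, neither number could come out larger than the original 250,000.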

The nonsense word “glarbifulous” appears nowhere on the internet (though that will change once Google notices this post). I was quite surprised, then, to see that after excluding “glarbifulous” from my search, I find…

175,000,000 hits for [weblog -glarbifulous]

That’s more than ten times as many hits as I get when I don’t exclude the nonsense word. How can so many more pages NOT have a word that doesn’t exist?
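Here is why this feels so wrong. If a search engine counted matches exactly, excluding a word could only ever remove pages from the results. The toy sketch below shows that logic in Python; the four "pages" and the `search` function are made up for illustration and have nothing to do with how Google actually works.

```python
# A made-up, four-page "internet" for illustration only.
pages = {
    1: "my weblog about teaching",
    2: "a weblog of links",
    3: "the web log format",
    4: "notes on journalism",
}

def search(include, exclude=None):
    """Return the ids of pages containing `include` but not `exclude`."""
    results = set()
    for page_id, text in pages.items():
        words = text.split()
        if include in words and (exclude is None or exclude not in words):
            results.add(page_id)
    return results

plain = search("weblog")
restricted = search("weblog", exclude="glarbifulous")

# With exact matching, the restricted results are always a subset.
assert restricted <= plain
print(len(plain), len(restricted))  # 2 2
```

Since no page contains the nonsense word, the two result sets are identical. An exact count could tie, but it could never go up.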

Maybe Google has paid closer attention to the quality of pages that contain the word “weblog,” removing a lot of junk results that it figures are pointless. But maybe when I ask Google to do a search for something less popular, it thinks I might actually be interested in some of the sites it would otherwise ignore. Suddenly, every single page in its database that doesn’t include “glarbifulous” becomes potentially relevant, since each of those pages has met a criterion that I have specified.

That seems to make sense, but it also seems, well, twisted. I just did a search for “the” by itself, and “the -glarbifulous,” and got similar results: about twice as many hits for the more restrictive search.