In a world where every self-respecting software product or service has an API, it’s surprising how convoluted it is to get simple result counts from the leading search engines.
While working on a recent coding project, I needed total result counts for a particular word or phrase. So I turned to the 800-pound gorilla, expecting that with its dozens of API projects, Google would be a walk in the park. Apparently not.
Jumping through hoops
First, you have to request an API key, then create a Custom Search engine. Considering how little data I wanted, that’s pure overkill. And it doesn’t get any better. Next, you have to specify at least one site to search, even though I don’t want to restrict the search at all; I want the entire web. Google says you can’t do that.
But wait, there’s actually an option to “Search the entire web but emphasize included sites.” Huh?
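For the curious, here is roughly what fetching a count looks like once the key and the custom engine exist. Treat it as a minimal sketch, not Google’s official client: it assumes the Custom Search JSON API endpoint (`https://www.googleapis.com/customsearch/v1`) with its `key`, `cx` and `q` parameters, and it assumes the count comes back as a string in `searchInformation.totalResults`. The function names are my own, purely for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(api_key, engine_id, query):
    """Build the request URL from the API key, the engine ID (cx) and the query."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return CSE_ENDPOINT + "?" + urlencode(params)


def parse_total_results(payload):
    """Pull the result count out of a decoded JSON response.

    The count is assumed to arrive as a string, so convert it to an int.
    """
    return int(payload["searchInformation"]["totalResults"])


def fetch_result_count(api_key, engine_id, query):
    """Run one search (this uses up one query of the daily quota) and return the count."""
    with urlopen(build_cse_url(api_key, engine_id, query)) as response:
        return parse_total_results(json.load(response))
```

With a real key and engine ID in hand, `fetch_result_count(my_key, my_engine_id, "tintin")` is the whole job. That is two sign-up steps and a custom engine for a single integer.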
What you see is not what you get.
Well, let’s select that option and compare results with regular search:
- Google search for ‘tintin’: 30,700,000 results
- Google CSE search for ‘tintin’: 2,080,000 results
What?! That’s less than 7 percent — not even remotely close. Going by comments from users in the API forums, Google supposedly uses different indexes for its custom search engines. Not cool. Yahoo, here we come.
Brother, can you spare a key?
At first, Yahoo seems promising, providing good ol’ RSS feeds for any keyword search without needing an API key, something Google does not offer. Unfortunately, no result count is available in the data returned.
Turning to Yahoo! Search BOSS, the equivalent of Google’s Custom Search, we run into a paywall immediately. That’s fine for a larger project, but overkill for programmatically fetching the occasional result count. At least Google gives you 100 queries per day free.
Oh well, on to Bing which, by the way, now powers Yahoo Search.
Salvation comes from Redmond
Microsoft surprises sometimes, in a good way. Then again, Bing itself got generally good reviews when it was released, and its “Cure for Search Overload Syndrome” ad campaign did hit the spot. Like Yahoo, Bing provides RSS versions of search results without an API key. (Good.) Like Yahoo, the feed is missing result counts. (Bad.) But unlike Yahoo, full API access is free (Very Good!), and unlike Google, the result count matches regular Bing Search. (Very Very Good!)
- Bing search for ‘tintin’: 3,820,000 results
- Bing API search for ‘tintin’: 3,820,000 results
Phew! Who knew getting a search count could be so complicated?
A better approach
To put this whole experience in perspective, let’s consider how two other services provide API functionality: Topsy (a Twitter search engine) and Tumblr (well, you know, Tumblr):
- Basic access is free and has reasonable limits: Topsy allows 3,000 free API calls per day, no questions asked, no API key needed.
- Graded access levels: Tumblr has three options. Open information needs no authentication; higher-level calls need an API key or OAuth authentication, depending on the request.
Done. Seems the smaller companies are thinking this through better.