Musematic
We all work for Google

Posted on Thursday, September 17, 2009

One way or another, it seems, Google’s got us all. I’m not claiming that’s evil (after all, their motto is “Don’t Be Evil”), just that it’s ironic.

The latest Google acquisition is reCAPTCHA, that little authentication program that you’ll find at the bottom of Musematic’s comment box. But why would Google (specifically, the Google Book Search program) want reCAPTCHA? I couldn’t figure it out until I read this:

“Since computers have trouble reading squiggly words like these, CAPTCHAs are designed to allow humans in but prevent malicious programs from scalping tickets or obtain[ing] millions of email accounts for spamming. But there’s a twist: the words in many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books. Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.

In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). This technology also powers large scale text scanning projects like Google Books and Google News Archive Search. Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users.”
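To make the idea concrete for myself, here is a minimal Python sketch of how crowd answers might be turned into a transcription. It is my own illustration, not Google’s or reCAPTCHA’s actual code. It assumes each challenge pairs a word whose answer is already known with a word the OCR software could not read, and that a guess for the unreadable word only counts if the person got the known word right. The names WordPool, submit and AGREEMENT_THRESHOLD, and the threshold of three matching answers, are all hypothetical.

```python
from collections import Counter

# Hypothetical threshold: how many matching human answers we require
# before trusting a transcription. Not a figure taken from reCAPTCHA.
AGREEMENT_THRESHOLD = 3


def normalize(answer):
    """Lowercase and trim an answer so trivial differences do not split votes."""
    return answer.strip().lower()


class WordPool:
    """Collects human guesses for scanned words that OCR could not read."""

    def __init__(self):
        self.votes = {}     # word image id -> Counter of normalized guesses
        self.resolved = {}  # word image id -> accepted transcription

    def submit(self, control_answer, control_truth, unknown_id, unknown_guess):
        """Check the known word; if it matches, count the guess for the unknown one.

        Returns True when the person passes the challenge.
        """
        if normalize(control_answer) != normalize(control_truth):
            return False  # failed the known word, so the other guess is ignored
        if unknown_id in self.resolved:
            return True  # already transcribed, nothing more to learn
        counter = self.votes.setdefault(unknown_id, Counter())
        counter[normalize(unknown_guess)] += 1
        guess, count = counter.most_common(1)[0]
        if count >= AGREEMENT_THRESHOLD:
            self.resolved[unknown_id] = guess
        return True


if __name__ == "__main__":
    pool = WordPool()
    # Four visitors all get the known word "upon" right and guess at the scan.
    for guess in ["morning", "Morning ", "mourning", "morning"]:
        pool.submit("upon", "upon", "scan-0042", guess)
    print(pool.resolved)
```

In this sketch the scanned word is accepted as “morning” once three visitors agree on it, and the lone “mourning” is outvoted. Anyone who misses the known word is turned away and their guess is thrown out, which is how the same box can both keep programs out and get the reading done.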

So we, the crowd, are now working for Google Book Search. Interesting twist. These guys are so smart it’s scary.

Another commentator has pointed out that reCAPTCHA was developed by a university researcher funded by a MacArthur grant, and that scanned words from the Internet Archive (among other sources) are used as “Captchas.” That would make this, ironically, a “privatization of public talent.”


Filed under: Random Musings
