Googlebooks crusade captures CAPTCHA king

Fights spam. Pumps OCR


Google has acquired reCAPTCHA, a free CAPTCHA service that also serves as a means of digitizing printed books and newspapers. Among other things, the Mountain View web giant is looking to juice its ever-controversial library-scanning Book Search project.

Google announced the acquisition this morning with a post to the Official Google Blog, and it couldn't help but trumpet the news with, yes, a CAPTCHA:

Google Acquires ReCaptcha

"The image above is a CAPTCHA — you can read it, but computers have a harder time interpreting the letters. We tried to make it hard for computers to recognize because we wanted to give humans the scoop first, but we're happy to announce to everybody now that Google has acquired reCAPTCHA, a company that provides CAPTCHAs to help protect more than 100,000 websites from spam and fraud," the post reads.

But it's not just spam and fraud protection that interests the Mountain View Chocolate Factory. ReCAPTCHA is also a way for Google to improve the OCR (optical character recognition) technology it uses to digitize printed materials for both its Book Search and News Archive Search services.

In providing websites with CAPTCHAs - visual Turing tests that separate humans from machines - reCAPTCHA often includes text scanned from books and newspapers that can't be read with OCR. It pairs this unknown text with a recognized word or phrase. Website visitors are asked to read both, and if they get the known word correct, reCAPTCHA can assume they also read the unknown text correctly.
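
The pairing scheme described above can be sketched in a few lines of Python. This is a rough illustration only, not reCAPTCHA's actual code: the function names, the case-insensitive comparison, and the idea of tallying votes per scanned word are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical store: for each scanned word that OCR could not read,
# count the readings submitted by visitors who passed the check.
votes = defaultdict(Counter)

def check_response(known_word, known_answer, unknown_id, unknown_answer):
    """Return True if the visitor passes; record their reading of the unknown word."""
    # The visitor is judged only on the word whose answer is already known.
    passed = known_answer.strip().lower() == known_word.lower()
    if passed:
        # A correct control word is taken as evidence that the reading of
        # the unknown word can be trusted, so it is counted as a vote.
        votes[unknown_id][unknown_answer.strip().lower()] += 1
    return passed

def best_guess(unknown_id):
    """Most common reading recorded so far, or None if there are no votes."""
    tally = votes[unknown_id]
    return tally.most_common(1)[0][0] if tally else None
```

Collecting several readings before accepting one is a design choice in this sketch: a single visitor can pass the control word and still mistype the unknown one.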

ReCAPTCHA - a Pittsburgh, Pennsylvania-based outfit that spun off from research originated at Carnegie Mellon University - is currently helping the New York Times to digitize its archive.

Luis von Ahn, the reCAPTCHA founder who co-authored Google's blog post, is one of the Carnegie Mellon researchers who coined the term CAPTCHA, short for Completely Automated Public Turing test to tell Computers and Humans Apart. ReCAPTCHAs first hit the web in 2007, and von Ahn founded the company in 2008. The Carnegie Mellon assistant computer science professor has not responded to our request for comment.

"Google is the best fit for reCAPTCHA," reads a canned statement from von Ahn tucked into a press release. "From the very start, people often assumed the project was connected to Google, so it only makes sense that reCAPTCHA Inc. ultimately would find a home within Google."

Von Ahn will remain on the Carnegie Mellon computer science faculty, but he will also work at Google's Pittsburgh engineering office, which is on the university's campus. In the press release, he indicated that reCAPTCHA already has close ties with Google. In 2006, Google licensed a von Ahn-developed game for use in its Google Image Labeler. Terms of Google's acquisition were not disclosed. ®


Biting the hand that feeds IT © 1998–2022