Ukraine uses Clearview AI to identify slain Russian soldiers

Plus: GPT-NeoX-20B, the largest open-source language model available yet, and a new AI Silicon Valley VC fund


In brief Ukraine's vice prime minister has confirmed that Clearview AI's controversial facial recognition system is being used to identify dead Russian soldiers, just weeks after the country started using the tech in the conflict.

"As a courtesy to the mothers of those soldiers, we are disseminating this information over social media to at least let families know that they've lost their sons and to then enable them to come to collect their bodies," Ukraine's vice prime minister and head of the ministry of digital transformation, Mykhailo Fedorov, told Reuters.

Clearview AI, a New York-based startup, initially made headlines when CEO Hoan Ton-That admitted to scraping billions of images from social media sites like Instagram and Twitter to build a large database. Given a photograph, its facial recognition algorithms are trained to find matching images in that database. Linking people's selfies to their social media accounts can reveal their identities.
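For readers who want a feel for how this kind of matching can work in principle, here is a minimal sketch. It is not Clearview AI's method, which has not been published. It assumes each face has already been reduced to a numeric embedding vector by some model, and it compares a probe photo's vector against a small gallery using cosine similarity. The account names, vectors, and the 0.9 threshold are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(probe, gallery, threshold=0.9):
    """Find the gallery entry most similar to the probe embedding.

    Returns (account, score) if the best score reaches the threshold,
    otherwise (None, score). The threshold is an arbitrary assumption.
    """
    best_id, best_score = None, -1.0
    for account, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_id, best_score = account, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score

# Hypothetical gallery: account name -> face embedding (toy 3-D vectors;
# real systems typically use hundreds of dimensions).
GALLERY = {
    "account_a": [0.9, 0.1, 0.0],
    "account_b": [0.0, 1.0, 0.0],
    "account_c": [0.1, 0.0, 0.9],
}

# A probe photo whose embedding sits close to account_a's.
match, score = best_match([0.88, 0.12, 0.02], GALLERY)
print(match, round(score, 3))

# A probe that resembles nobody in the gallery falls below the threshold.
no_match, low_score = best_match([1.0, 1.0, 1.0], GALLERY)
print(no_match, round(low_score, 3))
```

The threshold is the design choice that matters most here: set it too low and strangers get matched to the wrong account, set it too high and genuine matches are missed.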

The upstart's technology has raised alarm bells. The company faces international fines, and was ordered to cease operations in some countries. Despite this, the biz continues to grow and now its facial recognition is being used in the Russian invasion of Ukraine.

New $50m Silicon Valley VC firm to invest in AI

AIX, a fresh $50m VC fund for investing in AI startups, launched this week.

The new venture is spearheaded by a number of notable names. Richard Socher, ex-chief scientist at Salesforce and CEO of You.com, a machine learning-powered personalized search engine, and Shaun Johnson, former VP of product, design, and engineering at Lilt, a translation services company, are listed as co-founders.

Other founders include Kaggle CEO Anthony Goldbloom, UC Berkeley robotics professor and Covariant president Pieter Abbeel, and Stanford University NLP professor Chris Manning. Fang Yuan, a VC with stints at Baidu Ventures and Stripe, will be AIX's part-time general partner.

"It is clear that machines are still just beginning to learn and the next few decades are going to be an exciting time for AI and humanity. There is going to be generation after generation of AI entrepreneurs who fundamentally rethink our approach and enable step changes in the technology," Socher said in an announcement.

"These entrepreneurs are going to need a strong AI community to help them achieve the best outcomes. That's why we are launching AIX Ventures, a new AI-focused venture firm made up of some of the world's top AI practitioners."

A free and open 20-billion-parameter language model

The recent rise of large language models has unlocked new technological capabilities, but the best state-of-the-art systems are difficult to access and study.

Now there's GPT-NeoX-20B, an open-source 20-billion-parameter language model that anyone can use for free. It was built by EleutherAI, a group of developers and researchers collaborating to make the technology public.

"The current dominant paradigm of private models developed by tech companies beyond the access of researchers is a huge problem," Stella Biderman, a mathematician and artificial-intelligence researcher of the EleutherAI consortium, told IEEE. "We – scientists, ethicists, society at large – cannot have the conversations we need to have about how this technology should fit into our lives if we do not have basic knowledge of how it works."

There is a lot of interest in solving issues of bias, toxic language, or misinformation generated by these language models, but it's difficult to tackle them if you can't access the machine's inner workings. EleutherAI has been steadily building and releasing increasingly large models, relying on companies like Google and CoreWeave to donate hardware to train them.

GPT-NeoX-20B succeeds GPT-J-6B and is currently the largest open-source language model, although both are smaller than commercial systems that contain hundreds of billions of parameters. Separately, the BigScience team is training an open-source 176-billion-parameter language model that has not yet been released. ®


Biting the hand that feeds IT © 1998–2022