How Facebook uses public videos to train, deploy machine-learning models and harvest those eyeballs

Plus: Google Ethical AI team firings backlash worsens


In brief Facebook this week revealed an internal project to create machine-learning models that can understand visual, audio, and written content from videos publicly uploaded to its social network.

One of the models, known as Generalized Data Transformations (GDT), is now used on Instagram. Users viewing short video recordings, or Reels, can quickly find other Reels they might like to watch, thanks to an AI-powered recommender system that picks out similar clips.

For example, if someone on Instagram tends to watch videos of tractors, the GDT recommender system will highlight other videos of tractors. The model notices similar features; both videos will contain images of a vehicle with big wheels or the sound of an engine whirring away.
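
To make the idea of matching on similar features concrete, here is a minimal sketch of similarity-based recommendation. It is not Facebook's GDT code: the three-number feature vectors, the clip names, and the `recommend` helper are all made up for illustration, and cosine similarity stands in for whatever comparison the real system performs.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recommend(watched, catalog, top_k=2):
    """Hypothetical helper: rank catalog clips by similarity to a watched clip."""
    scored = [(cosine_similarity(watched, vec), name) for name, vec in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Assumed toy features per clip: [big wheels, engine sound, singing]
catalog = {
    "tractor_ploughing": [0.9, 0.8, 0.0],
    "combine_harvester": [0.8, 0.9, 0.1],
    "birthday_party": [0.0, 0.1, 0.9],
}
watched_tractor_clip = [1.0, 0.9, 0.0]

print(recommend(watched_tractor_clip, catalog))
```

In this toy setup the two farm-machinery clips score far higher than the birthday clip, because their visual and audio features point in nearly the same direction as the watched video's.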

Facebook hopes that the project, Learning from Videos, will help the company build more useful AI tools, such as one that can search for specific photos taken during a birthday party.

“Recalling memories in this way requires teaching systems how to match the phrase ‘happy birthday’ to cakes, candles, people singing various birthday songs, and more,” it explained this week.

Earlier this week Facebook announced a similar plan for facial recognition, using Instagram data. Don't blame Zuckerberg: you signed up for it.

Activists tell machine-learning community to turn down jobs from Google

Googlers appalled by the ousting of the two leaders of the web giant's Ethical AI unit have urged members of the machine-learning community to turn down job offers at the super-corp, and academic conferences to decline funding from the company.

Timnit Gebru and Margaret Mitchell were expelled from Google after they pushed back on management, who had asked them to remove their names from a paper scrutinizing the social and environmental impact of large language models, like the ones used by Google.

“Therefore, we call on members of the AI community, especially those who make their careers researching the social and ethical consequences of tech, to take the following actions in solidarity with the Ethical AI team,” the group, Google Walkout for Real Change, wrote on Medium.

It urged researchers not to cooperate with Google’s recruitment teams, and conference organizers not to accept sponsorship or any other type of funding from the biz. You can read the post in full here.

New AI toolkit to help scientists unpack the genomes of rare types of cells

Researchers at Nvidia and Harvard University have built software to help sequence the DNA of cells and help geneticists study the causes of some diseases.

The toolkit, named AtacWorks, was described in a paper published in Nature Communications this week. Essentially, it takes messy computational data describing the overall epigenetic profiles from cell sequencing experiments and cleans it up to show which parts of the genome are accessible.

“AtacWorks both denoises sequencing data and identifies areas with accessible DNA, and can run inference on a whole genome in just half an hour with NVIDIA Tensor Core GPUs,” according to an Nvidia blog post.

The software is most useful for rare cell types, where scientists can’t gather as many cells as they’d like for their experiments. “With AtacWorks, we’re able to conduct single-cell experiments that would typically require 10 times as many cells,” said Jason Buenrostro, co-author of the paper and an assistant professor at Harvard University.

AtacWorks is available to use here. ®

