Googlers and co offer video dataset-generating Kubric

How your computer-vision model learned to stop worrying and love the Python


Developers can create large datasets of synthetic videos to train computer-vision models using open-source software developed by a Google-led team of researchers.

Data is one of the most important ingredients of deep learning. You need good quality samples in large quantities to boost a model's performance, though datasets of this ilk are by their nature few and far between. If you need images or videos, you could scrape them from internet profile pages and galleries, though this is tedious to clean and process at large scale, and can give everyone privacy and legal headaches. If you need more complex data, such as video footage with depth and object annotations, your available sources dwindle further.

Fake computer-generated material avoids these problems by giving you exactly what you need, and there are various ways to generate it, including by using machine learning. And yes, there is the caveat that you need to know what you're doing when training a system on synthetic data: if the dataset doesn't match reality, your model is going to get a skewed view of the world.

Anyhow, here's a way to create terabytes of fake training data – specifically, video footage of objects interacting with each other – which could be useful for teaching models to understand what they can see around them in the real world.

More than 30 researchers from top AI research labs at Google, DeepMind, the University of Toronto, MIT, and elsewhere collaborated to develop Kubric, an open-source Python library that simulates scenes of objects and is aimed at deep-learning engineers. The synthetic datasets it generates can be fed directly into machine-learning models during training. The code is built on top of the PyBullet physics engine, with Blender used for rendering.

"Kubric is a high-level Python library that acts as glue between: a rendering engine, a physics simulator, and data export infrastructure," its developers said in an arXiv-hosted paper shared this month about the project. "Its main contribution is to streamline the process and reduce the hurdle and friction for researchers that want to generate and share synthetic data." 


They demonstrated how Kubric-made datasets can train AI systems to perform multiple computer-vision tasks, from segmenting objects in images to reconstructing video frames to estimating object pose.

Developers using Kubric can write scripts to produce scenes filled with objects, and run the code numerous times to generate footage from different viewing angles and lighting conditions. Kubric's documentation states it is "a data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow."
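The "run the code numerous times" part amounts to domain randomization: each run samples a fresh camera pose and lighting setup for the same scene. Here is a minimal, self-contained sketch of that sampling loop; the parameter names and ranges are assumptions for illustration, not Kubric's API:

```python
import math
import random

def sample_scene_params(rng):
    """Draw one random camera pose and lighting setup (illustrative only)."""
    azimuth = rng.uniform(0, 2 * math.pi)      # angle around the scene
    elevation = rng.uniform(0.1, math.pi / 3)  # camera height angle
    radius = 4.0                               # fixed distance from the origin
    camera_position = (
        radius * math.cos(azimuth) * math.cos(elevation),
        radius * math.sin(azimuth) * math.cos(elevation),
        radius * math.sin(elevation),
    )
    light_intensity = rng.uniform(0.5, 2.0)    # vary lighting per run
    return {"camera": camera_position, "light": light_intensity}

# Re-running the same scene script with different seeds yields many
# distinct views of the same object arrangement.
rng = random.Random(0)
variations = [sample_scene_params(rng) for _ in range(100)]
```

Each dictionary in `variations` would parameterize one rendered video, which is how a single scene description fans out into a large, varied dataset.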

These datasets won't be cheap to make, however. The researchers said it currently requires "substantial computational resources" to run, and they needed "[three] CPU-years of compute-time" to create one particular dataset.

"We hope that it will help the community by lowering the barriers to generating high-quality synthetic data, reduce fragmentation, and facilitate the sharing of pipelines and datasets," they concluded. 

You can download Kubric here. ®


Biting the hand that feeds IT © 1998–2022