Need to make some 3D models but lack the skill and talent? Say, have you tried... AI?
That's going to be a recurring question in 2024.
Luma, a generative AI startup building software that transforms text descriptions into corresponding 3D models, just raised $43 million (£34 million) in a series-B funding round led by Andreessen Horowitz, Nvidia, and others.
Founded in 2021 by CEO Amit Jain, a former systems engineer working on computer vision at Apple, and CTO Alex Yu, a graduate student from the University of California, Berkeley, Luma AI develops machine-learning software that goes a step beyond what we've seen from most existing generative neural networks.
Unlike text-to-image models that emit flat bitmaps of digital art, Luma uses AI to create three-dimensional models of objects from photos, videos, or text descriptions – models that can be downloaded, manipulated, edited, and rendered as needed.
The upstart, based in Palo Alto, California, has already launched this technology as an app called Genie – available via the web, as the Luma iOS app, and via Discord – which is capable of converting images and video into 3D scenes or producing 3D models of user-described objects. These machine-made models can be previewed on screen and exported to art packages like Blender, popular game engines like Unreal or Unity, and other tools for further use.
Screenshot of Genie's attempt at creating a vulture holding a cup of coffee for us ... This was generated from the simple prompt: a vulture with a cup of coffee
"We believe that multimodality is critical for intelligence," the upstart's founders gushed in a statement announcing their latest funding round on Monday. "To go beyond language models and build more aware, capable and useful systems, the next step function change will come from vision. So, we are working on training and scaling up multimodal foundation models for systems that can see and understand, show and explain, and eventually interact with our world to effect change."
Luma says it uses various proprietary computer vision techniques, from image segmentation to meshing, to generate these 3D models from footage and descriptions. The models could well end up being used in video games, virtual reality applications, simulations, or robotics testing.
Some folks may find this technology rather useful if they have an idea for something involving 3D graphics or a 3D scene, but lack the artistic talent or skill to create the necessary models. Now they can ask Luma to generate those assets and import them into whatever software suite or engine they have to hand.
The upstart's tech could be used for prototyping or testing 3D systems before a more skilled human artist is brought in to create better models – or Genie's output may be good enough for your particular production.
We toyed with the system and found Genie's output kinda cute, though it may not be for everyone at this stage. The examples given by the upstart looked better than what we could come up with, perhaps because they were produced via the startup's paid-for API while we were using the freebie version.
That paid-for interface costs a dollar a pop to construct a model or scene from supplied assets. The biz even makes the point that this is cheaper and faster than relying on a human designer. If you can out-draw Genie, you don't have anything to worry about right now.
Luma previously raised $20 million in a series-A round led by Amplify Partners, NVentures (Nvidia's investment arm), and General Catalyst. Other investors included Matrix Partners, South Park Commons, and Remote First Capital. Having raised over $70 million in total so far, the biz has a valuation estimated between $200 million and $300 million. ®