AI-driven creativity gives overpowered PCs something worthwhile to do, at last
Take your desktop rig out for a proper run, turning words into images, 3D models and videos – if it can
Column Until recently, personal computer hardware seemed to have leapt past any demands software could possibly place upon it. Even high-end games – traditionally the leading edge of user demands on performance – barely taxed the massively overpowered, top-end silicon available. Then AI art came along.
Apple's M1 Ultra microprocessor sports a transistor count north of 100 billion. Nvidia just released its flagship RTX 4090 GPU, with 76 billion transistors – nearly three times as many as the previous generation, the product of the latest process node, and a devil-may-care attitude toward power consumption. Nearly 500W TDP? Crank it up and heat your home this winter.
But to what purpose? A 300fps Fortnite battle royale? In April I wrote: "These monsters need to be tamed, trained, and put to work." Technology abhors a vacuum – four decades in the field has taught me that. Where there's capacity, something will come along to employ it.
That other shoe dropped at the beginning of September, when Stability.ai released Stable Diffusion.
Similar to systems such as DALL•E and Midjourney, Stable Diffusion hoovers up then reduces billions of images to symbolically weighted tokens that can be conjured back into visibility with an appropriately crafted text prompt. The whole thing sits just on this side of witchcraft – yet it works remarkably well.
Unlike DALL•E or Midjourney, Stable Diffusion is both entirely self-contained – able to run on any powerful-enough machine – and pure FOSS. This meant that although the initial release required some of Nvidia's highest-end GPUs, within a week project contributors had stripped back its code and reduced its hardware requirements. The current version can run quite comfortably on the beefy PC I bought six years ago to explore the newly reborn world of virtual reality – as well as on pretty much any M1-based Mac. Many gaming PCs and laptops can run Stable Diffusion well enough to use it for project-based creative needs – or just for fun.
Then a group of researchers published a paper on something they called Dreamfusion – capable of conjuring an infinite series of fully realized 3D models from text prompts. Type in "pineapple", and the computer will have a think, then generate its best approximation of what that model should look like. Although that group hasn't yet released its code, the paper provided enough of a blueprint for an ambitious coder to adapt the Stable Diffusion codebase to create Stable Dreamfusion – which, again, requires fairly powerful hardware.
An image produced by Stable Diffusion from the text prompt 'A robot painting a picture while running on a treadmill'
Not to be outdone, another group at Tel Aviv University astounded the world with the Human Motion Diffusion Model. This paper showed how researchers had used Diffusion-based AI techniques to convert a prompt such as "the person walks forward two steps and does a cartwheel" into a humaniform animation. A week later, the researchers themselves released their code as FOSS.
It's still too early in this exponential growth in AI capabilities to know where any of it will lead. Already, both Canva and Microsoft have integrated prompt-based image generators within their creative tools. Meta, Google, and others have demonstrated proprietary prompt-to-video generators. On current trends, we won't have to wait long until we have FOSS equivalents to play with.
The visual arts have powerful new tools that aren't the exclusive domain of giants like Google or OpenAI – the latter a firm that promised to democratize AI at its foundation, but perversely seems to have focused on creating its own proprietary empire with Microsoft as its unofficial owner.
In one of my first columns for The Register I pointed to the end of the endless upgrade cycle for PCs. No more treadmill: good enough, they would only be replaced when they wore out. With the exception of a flurry of upgrades to accommodate pandemic-driven videoconferencing, that prediction has proven correct.
But the personal computer has shed its skin, revealing its slick new form as a creative supercomputer: Diffusion-powered, and creatively capable in ways the PC of old couldn't begin to approach. Rather than offering another new stylus or paintbrush, these qualitatively different tools forge a new kind of creative partnership.
In June I bought a high-spec PC laptop – and immediately felt guilty for it, thinking I'd never really put it to work. Today, I make full use of a machine that can do both the quotidian and the incredible. In retrospect, that purchase looks like a clever bargain – a harbinger of a true renaissance – as the PC, reborn, gets to work. ®