Breaking the rules is in Big Tech's blood – now it's time to break the habit
Microsoft: All your data are belong to us? World: That's so last century
Opinion Microsoft's journey through intellectual property has been a multi-year saga that makes Game of Thrones look like a haiku.
From the young Bill Gates and Paul Allen plundering software companies' garbage bins for source code listings, Gates writing a snotty letter to computer clubs a few years later complaining about software copying, and Ballmer calling Linux a cancer, to Windows having an embedded Linux layer, Microsoft has shown a very variable attitude to intellectual property.
We're now back full circle, except that while Gates helped himself to a few sheets of fanfold, Redmond's AI CEO Mustafa Suleyman is claiming dibs on the open web and the content therein. It's "fair use" because of the "social contract" that anyone can copy anything – a novel legal argument of the sort only 1980s cyberpunks and 21st century giant corporations can make. Spoiler: Cyberpunk is dead and today's version rarely ends well either.
It's also remarkably short-sighted. It says that a huge body of work that absolutely does have copyright protection has had it removed by stealth. It hasn't; it is simply in Big AI's interests to strip legal protection from AI training data. Or so it thinks. In the indecent race to eat the world, those jockeying the apocalyptic horses either don't realize or don't care that lobbying for more anarchy in IP weakens protection they rely on.
Imagine, for example, an AI system that exercises third-party software and services, analyzes their behavior, and then generates new and directly competing products. Good luck arguing that this behavior, an area entirely outside intellectual property law, deserves special new protection if you've just eviscerated the concept.
To be fair, Suleyman did add: "There's a separate category where a website or publisher or news organization had explicitly said, 'do not scrape or crawl me for any other reason than indexing me,' so that other people can find that content." But he went on to say that this was a "gray area. And I think that's going to work its way through the courts."
We've been here before. Intellectual property law exists to strike a balance between protecting creators' rights to their work, including making money, and recognizing the benefits of new ideas to human culture. Whenever there's a new technology that upsets that balance, three things happen. There's a barrage of lawsuits that try to contain or amplify the consequences of the new technology within the law as it stands. Then there's a lot of lobbying and debate, and finally new law that rebalances what the old law cannot. The dogs bark, and the circus moves on.
This happened with photocopiers, phonograms, tape recorders, and, most aptly, the internet. You can't transport data through a packet switching network without making innumerable copies along the way. The law was changed to exempt this from copyright. Rules were set out on the rights of content providers and consumers over things such as digital rights management, and we all moved on. The special pleading of established businesses was heard but not always heeded. In the same way that home taping did not kill music, torrenting has not killed film.
We are moving apace through this process for AI, and we can use the experience of previous technologies to make good calls on why the protagonists are saying what they say. Companies committed to the AI wars want a wartime economy devoted to serving them, with little regulation and all resources available without restriction. Those who generate those resources see this as a huge heist in a flimsy disguise, one that threatens their existence. Should they be sacrificed in someone else's battle?
Add to that the fact that unregulated markets fail because corporate interests do not align with customer interests, and that we can't know all the ways AI can fail due to unforeseen factors in its training data. It's a very new technology feeding off, and directing into, our digital existence. The demands for maximum access and minimal oversight are not going to end well for anyone.
Until we get to the end of the process of creating new laws and regulations, there are things we can do. Although well-lawyered organizations are happy to use ambiguity and force to get their own way, they are peculiarly sensitive to acting illegally where there's no doubt or room for denial. That's how open source's use of copyleft licensing works – you can use this code, this content, but you must agree to share the results equally. It's not perfect, but it's worked well enough to create a global ecosystem that no large tech company can escape.
The same process will work for setting out the wishes of IP creators about how their work is used as training data. While the law gets its head around copyright and AI regulation, the existing protections can be used in licenses. You don't want your IP used for training data? Say so in your license. You're happy for it to be used, but only if all the other training data it's used with is published? Say so.
It's one thing for a seven-digit exec to claim that they have the right to anything they can Google (Bing having inexplicably failed to verb) because that's how people behave and AI's just like that. It's quite another to deliberately break an explicit licensing condition aimed exactly at you. That makes lawyers nervous. All it takes to get them there is enough such licenses out there to make contamination plausible. That's why Ballmer compared open source to cancer. Once established in even small amounts, it has very strong self-replication.
Although the current flavor of AI is both novel and extremely economically active, it will follow the arc of any IP-disruptive technology. We have the grandiose self-interest of the self-entitled big tech firms. We have the swarm of lawsuits. We have the growing recognition of the need for new regulation and new law. And those of us who create or distribute IP have the choice, right now, of making our own small but significant input to the process. It would be a crime not to use it. Pass it on. ®