Microsoft's boffins in its New York research lab are encouraging a Minecraft character to teach itself how to climb a hill.
Their work, described here, is an advertisement for AIX. No, not the Unix – Redmond's platform, due to be open-sourced this summer, that helps computer scientists test their machine-learning algorithms in the Microsoft-owned Minecraft world.
With AIX, you build a landscape and then test your AI software on it. This saves you having to create a physical arena to test code: can your neural network overcome obstacles? Find its way out of a maze? Build structures? Waste whole weekends mining stuff while shunning friends and family and avoiding contemplating the inevitability of death? And so on.
If there's one way to fuel a killer AI, it's to trap it in Minecraft – basically, SimCity for toddlers – and teach it frustration and boredom.
Fernando Diaz, a senior researcher in the New York lab and one of the people working on the hill-climbing project, said: "We're trying to program it to learn, as opposed to programming it to accomplish specific tasks."
This may be a misleading description, however. Reinforcement learning is the AI equivalent of a brute-force search. As Microsoft's Allison Linn noted, the agent "needs to endure a lot of trial and error, including regularly falling into rivers and lava pits. And it needs to understand – via incremental rewards – when it has achieved all or part of its goal."
That means the agent starts out knowing nothing at all about its environment, or even what it is supposed to accomplish. It has to make sense of its surroundings and work out what matters – going uphill – and what doesn't, such as whether it's light or dark.
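The trial-and-error loop described above – act, observe a reward, nudge the value estimates, repeat – can be sketched with tabular Q-learning on a toy one-dimensional "hill". To be clear, this is an illustrative sketch only, not Microsoft's actual setup: the five-state hill, the actions, the reward values, and all hyperparameters here are made up for demonstration.

```python
import random

random.seed(0)

# Toy 1-D "hill": states 0..4, summit (goal) at state 4. All invented for illustration.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # step downhill, step uphill

# Q-table: estimated return for each (state, action) pair; the agent starts knowing nothing.
Q = [[0.0, 0.0] for _ in range(N_STATES)]

alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

for episode in range(500):
    s = 0  # start at the bottom every episode
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the current estimates, sometimes explore.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        # Incremental reward: +1 only at the summit, a small cost per step otherwise.
        r = 1.0 if s2 == GOAL else -0.01
        # Standard Q-learning update toward reward plus discounted best next value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy policy walks straight up the hill.
s, path = 0, [0]
for _ in range(10):  # step cap, in case the policy hasn't converged
    if s == GOAL:
        break
    s = min(max(s + ACTIONS[max((0, 1), key=lambda i: Q[s][i])], 0), N_STATES - 1)
    path.append(s)
print(path)  # → [0, 1, 2, 3, 4]
```

The point of the sketch is that nothing in the code says "go uphill": the agent discovers that policy purely from the reward signal, which is exactly why the process needs so much falling into metaphorical lava pits first.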
A paper written by the team explains [PDF] that their work is based on addressing the issues of "high-dimensional observations and complex real-world dynamics" which "present major challenges in reinforcement learning for both function approximation and exploration."
Meanwhile, Facebook's director of AI research Yann LeCun hilariously tore into a "completely, utterly, ridiculously wrong" bit of AlphaGo coverage on Slashdot over the weekend, which claimed: "We know now that we don't need any big new breakthroughs to get to true AI."
LeCun said, "most of human and animal learning is unsupervised learning. If intelligence was a cake, unsupervised learning would be the cake, supervised learning would be the icing on the cake, and reinforcement learning would be the cherry on the cake. We know how to make the icing and the cherry, but we don't know how to make the cake."
"We need to solve the unsupervised learning problem before we can even think of getting to true AI," wrote LeCun, "and that's just an obstacle we know about. What about all the ones we don't know about?" ®