AI shoves all in: DeepStack, Libratus poker bots battle Texas Hold 'em pros heads up

Both robo-players ace humans – in one-on-one matches


DeepStack is the first AI computer programme to beat professional poker players in a game of heads-up no-limit Texas hold’em, a team of researchers claims in a research paper out this week.

Games are widely used to train and test AI. Surpassing human-level performance in a game is considered an impressive feat, and a mark of progress for machine intelligence. DeepStack had an average win rate of more than 450 mbb/g (milli big blinds per game) over 44,000 hands of poker – a high number considering 50 mbb/g is a respectable margin in professional poker.
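For readers unfamiliar with the unit, here is a minimal sketch of how a win rate in mbb/g could be worked out. The function name and the session figures are made up for illustration and do not come from the paper.

```python
def win_rate_mbb_per_game(total_chips_won, big_blind, games_played):
    """Return the average win rate in milli big blinds per game (mbb/g)."""
    if big_blind <= 0 or games_played <= 0:
        raise ValueError("big_blind and games_played must be positive")
    # Express winnings in big blinds, then scale to thousandths per game.
    big_blinds_won = total_chips_won / big_blind
    return 1000.0 * big_blinds_won / games_played


# Hypothetical session: 100-chip big blind, 3,000 games, 135,000 chips won.
rate = win_rate_mbb_per_game(135_000, 100, 3_000)
print(rate)  # 450.0
```

In other words, 450 mbb/g means winning a little under half a big blind per game on average.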

Last year, Mark Zuckerberg announced his AI team was building an agent to play the ancient game of Go, but was upstaged when news broke that Google’s AI arm, DeepMind, had already beaten Lee Sedol, a professional player, in a series of matches streamed on live TV with AlphaGo.

Now, researchers are battling it out over poker. The paper [PDF], written by researchers from the University of Alberta in Canada and the Charles University and Czech Technical University in the Czech Republic, is currently under peer review but has been published unofficially on arXiv.

It was released in the same week that academics from Carnegie Mellon University in Pennsylvania announced their AI poker bot, Libratus, will play against top Heads-Up No-Limit Texas Hold’em poker players at Rivers Casino in Pittsburgh, Pennsylvania.

A live stream of the Brains vs Artificial Intelligence tournament shows that Libratus is leading at the time of writing on Friday, after beating its human opponents on the first day of the competition.

Poker is a difficult game for machines to master. Unlike chess, Jeopardy!, Atari video games or Go, no-limit Texas hold’em is considered an imperfect information game. Players do not have identical information about the current state of the game, as they withhold private information from one another.

AI poker intuition

A game between two players betting any number of chips produces 10^160 possible situations – a number too large for a computer to handle. To skirt around the problem, DeepStack “squeezes” it down to 10^14 abstract situations that are learned by playing against itself.

Like DeepMind’s AlphaGo, DeepStack picks its move from a bank of possible moves by calculating which scenarios are more likely, something the researchers compare to intuition: “A gut feeling of the value of holding any possible private cards in any possible poker situation.”

The programme’s “intuition” has to be trained using two neural networks. One learns to estimate the counterfactual – or “what-if” – values after the first three public cards are dealt, and the other recalculates the values after the fourth public card is dealt.

Simplifying the number of situations means the decision tree computed by DeepStack is effectively pruned, and it’s easier to approximate the Nash equilibrium – a solution in game theory which states that no player has an incentive to change his or her strategy – continuously, after each round.
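To give a flavour of what approximating a Nash equilibrium through self-play looks like, here is a minimal sketch using regret matching on a tiny two-action zero-sum game. Regret matching is a standard textbook technique swapped in for illustration; this is not DeepStack's code, and the payoff matrix and function names are hypothetical.

```python
def regret_matching_strategy(regrets):
    """Mix actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)


def solve_zero_sum(payoff, iterations=50_000):
    """Approximate a Nash equilibrium of a zero-sum matrix game by self-play.

    payoff[i][j] is the row player's gain (and the column player's loss).
    Returns the average strategies of the row and column players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols

    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)

        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_expected = sum(row[i] * row_values[i] for i in range(n_rows))
        col_expected = sum(col[j] * col_values[j] for j in range(n_cols))

        # Regret: how much better each pure action would have done.
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_expected
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_expected
            col_sum[j] += col[j]

    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


# Hypothetical 2x2 game whose exact equilibrium is (0.4, 0.6) for both players.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
row_avg, col_avg = solve_zero_sum(PAYOFF)
print(row_avg, col_avg)
```

The average strategies, rather than the final ones, are what drift towards the equilibrium, where neither player gains by deviating. A full poker solver has to do this over a vastly larger tree of hidden-information decisions.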

Since it doesn’t have an overarching strategy decided before the game, it doesn’t need to keep tabs on all 10^14 abstract situations – it can solve the decision tree in under five seconds.

“The DeepStack algorithm is composed of three ingredients: a sound local strategy computation for the current public state, depth-limited lookahead using a learned value function over arbitrary poker situations, and a restricted set of lookahead actions,” the paper said.

DeepStack in luck

A closer look at the results shows that DeepStack did not win by a statistically significant amount against every player. Eleven of the 33 professional poker players completed the requested 3,000 games, and against all but one of those 11 players, DeepStack didn’t win by a remarkable amount – the top human poker player only lost by about 70 mbb/g.

DeepStack “was overall a bit lucky,” the researchers admitted, but after slight adjustments to account for that, its estimated win rate was still a sizeable 486 mbb/g instead of the initial 492 mbb/g.

Although DeepStack’s opponents aren’t the best poker players, the result is still impressive, Miles Brundage – who is not involved in this research and is an AI Policy Research Fellow at the Future of Humanity Institute at the University of Oxford – told The Register.

“It still very decisively beat new players – and is reasonably good – considering it’s not the best possible neural net. It seems very scalable and there are a lot of obvious ways to improve it, like adding more GPUs,” Brundage said.

With seven layers, each equipped with 500 nodes, it’s not the biggest neural network. Part of the success of DeepMind’s AlphaGo in mastering Go and beating the world’s best players is due to its larger and more powerful neural network, Brundage explained.

In comparison, AlphaGo was trained with four neural networks each with tens of layers and many nodes.
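To put "seven layers of 500 nodes" in perspective, here is a rough sketch that counts the weights and biases in a fully connected network of that shape. The article does not give the input and output sizes, so the 1,000-wide input and output used below are placeholders chosen purely for illustration.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Seven hidden layers of 500 nodes each, as described in the article.
HIDDEN = [500] * 7
hidden_only = count_parameters(HIDDEN)
print(hidden_only)  # 1503000

# Hypothetical input and output widths (assumptions, not from the article).
INPUT_SIZE = 1000
OUTPUT_SIZE = 1000
full_network = count_parameters([INPUT_SIZE] + HIDDEN + [OUTPUT_SIZE])
print(full_network)  # 2504500
```

On those assumptions the network has a few million trainable numbers, which is small by the standards of modern deep learning.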

DeepStack and Libratus seem to be making great strides in poker, but both AI poker bots can only play against one other player. The next challenge to conquer is to build an AI that can play against multiple players all at once, and maybe even bluff. ®

Biting the hand that feeds IT © 1998–2020