AlphaZero
AlphaZero is a computer program developed by artificial intelligence research company DeepMind to master the games of chess, shogi and go. The algorithm uses an approach similar to AlphaGo Zero.
On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours achieved a superhuman level of play in these three games by defeating world-champion programs, Stockfish, elmo, and the 3-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use. AlphaZero was trained solely via "self-play" using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel, with no access to opening books or endgame tables.
After four hours of training, DeepMind estimated AlphaZero was playing at a higher Elo rating than Stockfish 8; after 9 hours of training, the algorithm decisively defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws). The trained algorithm played on a single machine with four TPUs.
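As a rough sanity check on that match result, the standard logistic Elo model can turn a match score into an implied rating gap. The sketch below applies only that textbook formula to the 28-0-72 score quoted above. It is not DeepMind's own estimate, and the function name is made up for illustration.

```python
import math

def elo_gap_from_score(wins, draws, losses):
    """Rating gap implied by a match score under the standard logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games   # fraction of the available points scored
    if not 0.0 < score < 1.0:
        raise ValueError("gap is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

gap = elo_gap_from_score(wins=28, draws=72, losses=0)
print(f"implied gap: about {gap:.0f} Elo points")
```

A 64% score works out to a gap of roughly 100 Elo points under this model. A 100-game sample leaves a wide margin of error around that figure.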
DeepMind's paper on AlphaZero was published in the journal Science on 7 December 2018. In 2019, DeepMind published a new paper detailing MuZero, a new algorithm that generalizes the AlphaZero approach and plays both Atari and board games without knowledge of the rules or representations of the game.
Maybe AlphaZero or its approach can be used to design the ultimate bridge program?
We can of course take the Dreyfus line and say that bridge is different because, unlike chess and Go, it requires real human judgement and understanding. But given what's happened so far, this seems optimistic. From the point of view of the AI engineer, the thing that makes Bridge hard is that each player has only partial information, so the search space includes all the possible distributions of the unknown cards. That means a lot more to think about. But as we saw with Go, a very large search space doesn't mean that machines can't do it.
There have been a couple of false starts. GIB, which every bridge player knows, was supposed to become the world's best bridge player a little after the Deep Blue breakthrough. GIB can basically do double-dummy analysis perfectly. It handles partial information by generating a hundred or so layouts that fit what it already knows, doing double-dummy analysis on all of them, and then picking the choice that works in the largest number of layouts. It does bidding by using rules that tell it what the allowed bids are in a given situation, generating layouts that fit the bidding, then again making the choice that works in most layouts.
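The layout-sampling idea described above can be sketched in a few lines. This is a toy, not GIB's code. It looks at a single suit (ace-third opposite king-jack-fifth, missing the queen and four small cards), and the hypothetical `succeeds` function is a hand-written stand-in for a real double-dummy solver. It only asks whether a line of play avoids losing a trick to the queen. All names are invented for illustration.

```python
import random
from collections import Counter

SUIT = ["Q", "x1", "x2", "x3", "x4"]      # the five missing cards in the key suit
OTHER = [f"o{i}" for i in range(21)]      # the 21 other unseen cards

def sample_layout(rng):
    """Deal the 26 unseen cards at random; return West's cards in the key suit."""
    west = set(rng.sample(SUIT + OTHER, 13))
    return [card for card in SUIT if card in west]

def succeeds(action, west_suit):
    """Toy stand-in for a double-dummy solver: is no trick lost to the queen?"""
    west_len = len(west_suit)
    east_len = len(SUIT) - west_len
    queen_west = "Q" in west_suit
    if action == "drop":        # cash ace and king: the queen must fall in two rounds
        return (west_len if queen_west else east_len) <= 2
    if action == "finesse":     # cash the ace, then finesse the jack through West
        return (queen_west and west_len <= 3) or (not queen_west and east_len == 1)
    raise ValueError(f"unknown action: {action}")

def choose_action(actions, n_layouts=100, seed=0):
    """Pick the action that works in the largest number of sampled layouts."""
    rng = random.Random(seed)
    wins = Counter({action: 0 for action in actions})
    for _ in range(n_layouts):
        layout = sample_layout(rng)            # one layout consistent with what is known
        for action in actions:
            wins[action] += succeeds(action, layout)
    return max(actions, key=lambda a: wins[a]), dict(wins)

best, tally = choose_action(["drop", "finesse"], n_layouts=20000)
print(best, tally)
```

With only a hundred layouts the tally is noisy when two lines are close in value. A real system would also have to discard layouts that contradict the bidding and the cards already played, which this sketch does not attempt.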
As GIB's inventor Matthew Ginsberg discovered, this doesn't give you more than a strong amateur player. But if you applied deep learning methods and the same kind of hardware as AlphaGo uses (it runs on a network containing hundreds of processors), I think you would see a huge increase in strength. There are plenty of online hand records to train the neural nets. The 'move generation function' would be one net, which looks at the current situation and gives you the plausible candidates for next bid or play. The 'evaluation function' would be another net, which looks at a layout and estimates how likely each contract is with single-dummy play - basing everything on artificial double-dummy play is one of the reasons why GIB's judgement has never been that great. If you have enough processors to use, you wouldn't just be limited to creating a hundred layouts to model what you don't know. You could create more layouts to model the other players' uncertainties too, and in effect think about what they are thinking.
Of course, this sketch is simplistic. Building a world-class bridge AI would probably be a big software project that required dozens of person-years of expert effort. But all the pieces now seem to be there. It took 44 years to get from Turing's initial paper on computer chess to Deep Blue, and it took another 20 years to get from Deep Blue to AlphaGo. My guess is that it will take significantly less than 20 years to get to the point where a deep learning system will beat the best human bridge players. It's mainly a question of finding someone who has a strong enough desire to make it happen and enough money to pay for the work. Well: it isn't hard to think of a person who's very rich, has access to hundreds of highly talented AI experts, and likes bridge. I'm starting to wonder why this hasn't already happened.
What might be the effect on the bridge world, if a world-class bridge AI emerges? Looking at what's happened in chess, it probably would be more good than bad. Since everyone who can afford a basic laptop now has access to a world-class chess player, chess has taken off in many countries where the game was hardly played before. All grandmaster chess tournaments are now broadcast online with reliable real-time computer commentary, so amateurs can follow what's going on. And, in a development that should interest bridge players, chess AIs are good at unmasking cheats. Since the machines know what the right move is in most positions, they can spot when someone is playing too well and give statistically significant evidence that something funny is going on. The US chess master and computer expert Ken Regan has been a pioneer in this field. In fact, when you think more about it, a strong AI might be exactly what bridge needs...
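To give a flavour of what "statistically significant evidence" can mean here, the sketch below runs a plain one-sided binomial test: how likely is it that an honest player would match the engine's first choice at least this often by luck? This is a deliberately crude illustration and not Ken Regan's actual model. The function name, the baseline match rate and the counts are all hypothetical.

```python
from math import comb

def binomial_tail(matches, moves, baseline):
    """P(X >= matches) for X ~ Binomial(moves, baseline)."""
    return sum(
        comb(moves, k) * baseline**k * (1.0 - baseline) ** (moves - k)
        for k in range(matches, moves + 1)
    )

# Hypothetical numbers: peers of this strength match the engine on 55% of
# decisions, while the player under review matched on 52 of 60.
p_value = binomial_tail(matches=52, moves=60, baseline=0.55)
print(f"chance of matching this often by luck: {p_value:.2e}")
```

A tiny tail probability is a reason to look closer rather than proof of anything. A serious analysis would have to account for how difficult each decision was and for how many players are being screened.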
Links
Google's AlphaZero Destroys Stockfish In 100-Game Match
Leela Chess Zero: AlphaZero for the PC
Alpha Zero: Comparing "Orangutans and Apples"
MuZero figures out chess, rules and all
DeepMind's MuZero teaches itself how to win at Atari, chess, shogi, and Go
AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning
Game Changer, the Book on AlphaZero launched today
AlphaZero: Shedding new light on chess, shogi, and Go
'Creative' AlphaZero leads way for chess computers and, maybe, science
Is AlphaZero really a scientific breakthrough in AI?
Is AlphaZero really a scientific breakthrough in AI? Discussion on Bridgewinners
Monster Machine Cracks the Game of Go
Google Wants Robots to Acquire New Skills by Learning From Each Other
DeepMind's AI Shows Itself to Be a World-Beating World Builder
Hyper-Parameter Sweep on AlphaZero General
The entropy of AI and a case study of AlphaZero