Artificial intelligence plays Stratego and defeats humans: a breakthrough

By Christian J Meier

Acting strategically, accepting short-term disadvantages for long-term success and, yes, sometimes bluffing: qualities one would sooner attribute to people than to artificial intelligence (AI). But an algorithm from Deepmind, Google’s London subsidiary, now displays precisely these abilities, at least when playing “Stratego”, a highly complex board game. The AI “DeepNash” played masterfully against human players on the largest online platform for Stratego. It won 84 percent of its games and at times ranked third, report researchers led by Julien Perolat in the journal Science. Previous algorithms, the researchers write, played only at amateur level.

Stratego has been considered the next milestone for game-playing AIs for years. Computers have been better than humans at chess since 1997, when world champion Garry Kasparov lost to IBM’s chess computer Deep Blue. Chess, however, is much easier for a computer to calculate than, for example, the Asian board game Go, which allows vastly more possible positions of the pieces. Deepmind cracked that nut in 2016, when its AI “AlphaGo” defeated one of the world’s top Go players, Lee Sedol.

Stratego is in another league again, because the game is highly complex in two respects. As in chess, two sides face each other, each with 40 pieces. The pieces have different ranks, and one of them is the flag that must be captured. When two pieces meet, the higher-ranked one beats the lower-ranked one. The catch: a player cannot see the ranks of the opposing pieces. Only by attacking a piece does he learn its rank. He therefore has to gather this information bit by bit, at the risk of losing pieces. The opponent’s moves also provide clues, which can of course be exploited to bluff. Stratego thus not only has a vast number of possible moves, even larger than in Go, it also involves hidden information of the kind otherwise familiar from card games.

Above all, artificial intelligence is good at processing large amounts of data

A game like this cannot be won by running through all possible sequences of moves in one’s head, but only through experience, varied strategies and feints, in other words, typically human skills.

Artificial intelligence, on the other hand, is particularly good at finding patterns in large amounts of data, and this ability can certainly be applied to board games. A player who plays very often recognizes recurring patterns in the positions of the pieces and knows which strategy is promising in each case. The advantage of an AI is that it can play against itself, at the speed of a supercomputer. The DeepNash algorithm played against itself roughly ten billion times; a human could not play that often in a lifetime. In this way the AI learns patterns in the course of a game and the appropriate responses. Importantly, the AI does not encounter every possible game variant; there are far too many for that. Its ability can therefore reasonably be compared with intelligence.
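To make the idea of self-play concrete, here is a deliberately tiny sketch. It is not DeepNash’s actual method, which relies on deep neural networks and a game-theoretic update rule, but a toy agent that generates its own experience by playing rock-paper-scissors against a copy of itself and keeping track of which actions win:

```python
import random
from collections import defaultdict

# Toy self-play loop (illustrative only: DeepNash itself uses deep neural
# networks and a game-theoretic update rule, not this simple counting scheme).
# Two copies of the same agent play rock-paper-scissors against each other,
# and the shared statistics record how often each action wins.

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

wins = defaultdict(int)   # how often each action has won in self-play
plays = defaultdict(int)  # how often each action has been tried

def pick_action():
    # Mostly explore at random, occasionally exploit the best win rate so far.
    if random.random() < 0.8:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: wins[a] / plays[a] if plays[a] else 0.0)

for _ in range(100_000):          # the agent generates its own experience
    a, b = pick_action(), pick_action()
    for mine, theirs in ((a, b), (b, a)):
        plays[mine] += 1
        if BEATS[mine] == theirs:
            wins[mine] += 1

for action in ACTIONS:
    print(action, round(wins[action] / plays[action], 3))
```

After many self-play rounds the win rates of all three actions settle near one third: against an opponent exactly as good as yourself, only a balanced strategy remains stable, which leads directly to the idea discussed next.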

What distinguishes DeepNash from AlphaGo is that its self-training pursued the goal of reaching a so-called Nash equilibrium. A game in Nash equilibrium is stable, because unilaterally deviating from the strategy would put the deviating player at a disadvantage. In real life this can be seen among discount retailers: raising prices above the low-price strategy would cost customers, and the prices cannot be pushed any lower; a stable situation. In zero-sum games like Stratego, this strategy produces good results even against strong opponents, the Deepmind researchers write.
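A minimal numerical illustration of this stability, again using rock-paper-scissors rather than Stratego: if both players mix their three moves uniformly, no pure-strategy deviation earns the deviating player anything, which is exactly the equilibrium property described above.

```python
import numpy as np

# Row player's payoff matrix in rock-paper-scissors, a zero-sum game:
# rows and columns are (rock, paper, scissors); +1 win, -1 loss, 0 draw.
A = np.array([
    [ 0, -1,  1],
    [ 1,  0, -1],
    [-1,  1,  0],
])

opponent = np.array([1/3, 1/3, 1/3])  # the uniform mixed strategy

# Expected payoff of each pure deviation against an opponent at equilibrium.
deviation_payoffs = A @ opponent
print(deviation_payoffs)  # [0. 0. 0.] -> no unilateral deviation improves on 0
```

Against the uniform mix, every pure deviation yields an expected payoff of zero, exactly what the equilibrium strategy itself earns, so neither side has an incentive to change course unilaterally.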

The scientists report that DeepNash showed behavior in the game that is familiar from top players. For example, it sacrificed pieces in order to learn the positions of the opponent’s high-ranking pieces. DeepNash thus judged the value of this information to be greater than the disadvantage of having fewer pieces on the board, a trade-off that is not trivial, as the researchers write.
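The trade-off can be phrased as a simple expected-value comparison. The numbers below are entirely hypothetical and serve only to illustrate the kind of weighing involved; they do not come from the paper:

```python
# Hypothetical values, purely illustrative; not taken from the DeepNash paper.
p_win_keep_piece_no_info = 0.48    # assumed winning chance if the piece is kept
                                   # but the opponent's strong pieces stay unknown
p_win_lose_piece_with_info = 0.55  # assumed winning chance after sacrificing the
                                   # piece but learning where the strong pieces are

if p_win_lose_piece_with_info > p_win_keep_piece_no_info:
    print("Sacrificing the piece for the information is the better move.")
else:
    print("Keeping the piece is the better move.")
```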

The program can even successfully bluff

The algorithm even proved adept at bluffing. For example, it chased an opponent’s piece with a lower-ranking piece, leading the opponent to mistake it for a higher-ranking one. Because of this error, the opponent ultimately lost a valuable piece.

“This algorithm is impressive,” comments Marc Toussaint, head of the Intelligent Systems department at the Technical University of Berlin. “Methodologically, the DeepMind authors are making considerable progress toward optimal game strategies in zero-sum games,” says Toussaint. However, he emphasizes that Stratego is a game with fixed rules that can be simulated exactly. Everyday challenges, such as road traffic or household chores, are not of this kind. “These game-AI methods can hardly be applied directly to problems without an exact and efficient simulator,” says the researcher. “Nevertheless, the past has shown time and again that research on game algorithms can also advance basic research.” So with DeepNash, an AI once again outdoes humans in a particular domain, but by no means in all respects.