AlphaZero: When AI masters the game of chess

Chess has been useful for the development of technology, and the game itself has also benefited from technological development. Chess was notably an early computer programming challenge, which Alan Turing and D. G. Champernowne took up in 1948 with Turochamp, a chess program often cited as one of the first games designed for a computer. Conventional chess programs eventually became so strong that humans no longer found it interesting to compete against them directly, and instead used them as sparring partners to improve their own game. One particular chess program, AlphaZero, outperformed all the chess programs developed up to that point, which were already much stronger than humans. AlphaZero is built on a remarkable architecture: its artificial neural network was trained entirely through play… against itself.

A quick history of the power relations between human chess players and chess programs – then between the top chess programs themselves

  • In 1997, the Deep Blue program developed by IBM defeated the reigning world chess champion, Garry Kasparov. This event is pivotal, both in the history of artificial intelligence and in the history of chess. It symbolizes the end of the superiority of humans over machines – and the end of a long history of power relations between human chess players and chess programs.
  • A new balance of power followed, this time between chess programs themselves. It is embodied by the Top Chess Engine Championship, held regularly since 2010, in which the Stockfish program has won a majority of the editions.
  • It was AlphaZero’s victory over Stockfish in 2017 that put an end to any debate, proving the superiority of artificial intelligence in closed systems such as chess.

A system consistent with the ambitions of DeepMind (Google), whose projects are all built around “solving intelligence”

The AlphaZero self-learning system was developed by the British company DeepMind, which Google acquired in 2014. AlphaZero is part of the company’s artificial intelligence research. Its predecessor, AlphaGo, beat Lee Sedol (one of the world’s best players) at the game of Go in 2016.

DeepMind’s goal is not to commercialize these programs or to become a benchmark in chess and board game programs, although it became one anyway. Its ambitions go far beyond that, and its remarkable achievements in reinforcement learning lend the company a certain scientific credibility. With both AlphaGo and AlphaZero, DeepMind aimed to get as close as possible to human intuition. Because intuition is subjective, it is difficult, if not impossible, to say whether AlphaZero has it. Its play is nonetheless efficient, flexible and of a very high level, all the more so because AlphaZero learned everything on its own and therefore developed a technique and a style of its own. We can consequently speak of its creativity, in other words its capacity to discover something new or unexpected. In this sense, AlphaZero’s process of reinforcement learning, or self-learning, is the essence of creativity.

DeepMind’s goal is to “build intelligent systems that can learn to solve any complex task on their own,” and then apply that technology to help find solutions to some of the bigger challenges and unanswered questions.

The architecture of AlphaZero, a self-taught artificial intelligence

Unlike traditional chess programs, which usually rely on human knowledge such as game databases and hand-tuned heuristics, AlphaZero was trained through reinforcement learning without any prior knowledge beyond the rules of the game. Its training lasted nine hours. During these few hours, AlphaZero played a total of 44 million games against itself, more than 1,000 games per second on average, going from no chess ability at all to being able to beat Stockfish. As the games progressed, AlphaZero continuously adjusted the parameters of its neural network to “record” the moves it played against itself and their consequences, which over this volume of games amounts to thousands of analysed variations.
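
The “more than 1,000 games per second” figure follows directly from the two numbers quoted above. Here is a minimal check in Python; the variable names are ours, and the calculation assumes the games were spread evenly over the nine hours, which is a simplification.

```python
# Figures quoted in the article; everything else is derived from them.
TRAINING_HOURS = 9
SELF_PLAY_GAMES = 44_000_000

# Assumption: games are spread evenly over the whole training run.
seconds = TRAINING_HOURS * 60 * 60
games_per_second = SELF_PLAY_GAMES / seconds

print(f"{seconds:,} seconds of training")
print(f"about {games_per_second:,.0f} self-play games per second")
```

The average comes out a little above 1,300 games per second, consistent with the claim.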

The term ‘self-play’ refers to systems that learn on their own in a setting with more than one agent. In a game like chess, which requires two agents, self-play means that the system learns the game by playing against itself rather than against a real opponent. This kind of system is called a self-learning system, and AlphaZero implements it using neural networks and pattern recognition.
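
To make the idea of self-play concrete, here is a minimal sketch in Python. It is not AlphaZero’s code: chess is replaced by a toy game (a pile of stones from which each player removes one to three, the player taking the last stone wins), and the function names `legal_moves`, `self_play_game` and `random_policy` are hypothetical. The only point it illustrates is that one and the same policy chooses the moves for both sides.

```python
import random

def legal_moves(stones):
    """A player may remove 1, 2 or 3 stones, but never more than remain."""
    return [n for n in (1, 2, 3) if n <= stones]

def random_policy(stones, rng):
    """A policy with no knowledge yet: pick any legal move."""
    return rng.choice(legal_moves(stones))

def self_play_game(policy, stones=10, rng=random):
    """Play one game in which the same policy moves for both players.

    Returns the move history as (player, stones_before, move) tuples
    and the winner (the player who took the last stone).
    Assumes the pile starts with at least one stone.
    """
    history = []
    player = 0
    while True:
        move = policy(stones, rng)
        history.append((player, stones, move))
        stones -= move
        if stones == 0:
            return history, player
        player = 1 - player  # hand the turn to the "other" side

history, winner = self_play_game(random_policy, rng=random.Random(0))
for player, stones_before, move in history:
    print(f"player {player} sees {stones_before} stones, takes {move}")
print("winner: player", winner)
```

The recorded history is the kind of raw material a self-learning system can then learn from, since every move comes with the final result of the game it belonged to.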

A self-learning system cannot function without an associated evaluation system: in any complex system, the only way to deal with errors is to give the system the ability to correct them. This is how a system progresses. It starts from randomness (when it has no knowledge yet), becomes slightly better than random, then learns to recognize the mistakes it makes and correct them, which takes it to the next level, and so on. In principle, this can go on indefinitely.
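
The loop described above (play, evaluate, correct, repeat) can be sketched on the same kind of toy game: a pile of stones, one to three removed per turn, last stone wins. This is a deliberately simplified stand-in, not AlphaZero’s method: AlphaZero combines a neural network with tree search, whereas this sketch keeps a plain table of win estimates and nudges each estimate toward the result of the game. All names and parameter values (`train`, `epsilon`, `step`, 20,000 games) are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

MOVES = (1, 2, 3)

def legal_moves(stones):
    return [m for m in MOVES if m <= stones]

def choose(q, stones, rng, epsilon):
    """Usually play the best-rated move, occasionally try a random one."""
    moves = legal_moves(stones)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda m: q[(stones, m)])

def train(games=20_000, max_stones=10, epsilon=0.1, step=0.1, seed=0):
    rng = random.Random(seed)
    # Evaluation table: estimated chance of winning after playing `move`
    # with `stones` left. 0.5 everywhere means "no knowledge yet".
    q = defaultdict(lambda: 0.5)
    for _ in range(games):
        stones = rng.randint(1, max_stones)  # vary the starting position
        player, played = 0, []
        while stones > 0:
            move = choose(q, stones, rng, epsilon)
            played.append((player, stones, move))
            stones -= move
            if stones > 0:
                player = 1 - player
        winner = player  # whoever took the last stone
        # Error correction: move each estimate toward the observed result.
        for who, s, m in played:
            outcome = 1.0 if who == winner else 0.0
            q[(s, m)] += step * (outcome - q[(s, m)])
    return q

def greedy_move(q, stones):
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])

q = train()
learned = {s: greedy_move(q, s) for s in range(1, 11)}
print(learned)
```

In this toy game the winning strategy is known: always leave the opponent a multiple of four stones. With enough self-play games the table tends to settle on exactly those moves, although nothing in the code states that rule. The system starts from uniform estimates and improves only by comparing its evaluations with actual outcomes.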

The telling decline of dependence on human-generated data for training a system

AlphaZero’s success is a big step for artificial intelligence because it challenges the pre-eminence of human-generated data, especially in closed systems. Indeed, artificial neural networks and machine learning techniques have made it possible to build systems that learn and adapt to new tasks without collecting human-generated data or encoding human-generated heuristics. In the specific case of chess, and of other board games, reliance on human-generated data has declined because these systems outperform both human players and traditional artificial intelligence systems that rely on human-provided knowledge. Nevertheless, human-generated data remains an important part of many high-performance AI systems, especially in more complex and open-ended tasks such as image recognition.

 

Want to know more? Read “DeepMind: Using chess game to solve intelligence”

 

Sources:

  • Matthew Sadler, The Silicon Road to Chess Improvement, New In Chess, 2021
  • Matthew Sadler, Natasha Regan, Game Changer, New In Chess, 2019
  • IBM, “Qu’est-ce que les réseaux neuronaux ?” (What are neural networks?), 2021. https://www.ibm.com/fr-fr/cloud/learn/neural-networks
  • Lex Fridman, “David Silver: AlphaGo, AlphaZero, and Deep Reinforcement Learning”, Podcast #86, April 2020, 1:48:00, https://www.youtube.com/watch?v=uPUEq8d73JI&ab_channel=LexFridman
  • Illustration image: Photo by Hassan Pasha on Unsplash

 

About Karine Munschi