Candidates Tournament 2013: 
Computer Assessment of Quality of Play
By Matej Guid and Ivan Bratko
University of Ljubljana, Faculty of Computer and Information Science,
Artificial Intelligence Laboratory, Ljubljana, Slovenia.
In this article we attempt to answer the questions posed in the header
through a computer analysis of the individual chess moves played by the
players. Strong chess engines and increasingly powerful computer hardware
let us observe more than just game results. A single mistake may ruin a
well-played game, so game results do not necessarily reflect the quality of
play. Besides, quality of play seems to have improved greatly with the
emergence of huge databases of chess games and very strong chess engines.
In this article, we briefly present the results of a computer analysis of the
chess games played in the latest Candidates Tournament in London, using the
chess engine Houdini 1.5a (64-bit) at a search depth of 20 plies. In our
previous studies (see e.g. the Chessbase.com article “Using chess engines to
estimate human skill”) we demonstrated that (fallible) chess engines can
produce quite reliable rankings of the players. Perhaps surprisingly, it
turns out that different engines at different search depths produce the same
or at least very similar rankings. The method is briefly described below.
While the method could be debated, it can nevertheless give us some insight
into the quality of play in the games analysed.
FIDE Candidates Tournament 2013
According to the 20-ply Houdini, Magnus Carlsen achieved the best computer
score at the FIDE Candidates Tournament 2013. Probably the greatest surprise
is an excellent computer score by Alexander Grischuk, who finished the tournament
with less than 50% of the points in the tournament table! According to the analysis,
Vladimir Kramnik also played at a very high level.

FIDE Candidates 2013 computer scores:
(lower scores indicate better quality of play)
The results suggest that the quality of play demonstrated by the Candidates
is very high indeed. In particular, Carlsen’s score is the second best score
achieved in an individual tournament or match of all the top-level tournaments
and matches that we have analysed to the present day.
Note that both Carlsen’s and Kramnik’s computer scores deteriorated greatly
in the last few rounds. After Round 10 their scores were both under 3.00,
which is truly remarkable, as will become clear when we compare the results
in the graph above with the achievements of the 15 classical world champions
at the peaks of their careers, in the World Chess Championship matches.
The “classical” World Chess Championship matches (1886-2012)
In the graph below, we see corresponding results obtained with the same
program at the same level of search (i.e. Houdini at 20 plies).

The comparison of the world chess champions
(lower scores indicate better quality of play)
The results suggest that in terms of the computer scores, Vishy Anand and
Vladimir Kramnik did best of all the players in the World Chess Championship
matches. It should be noted that several players achieved rather similar
scores.
By comparing the two graphs we can observe that the top three 2013 Candidates
(Carlsen, Grischuk, and Kramnik) achieved better computer scores than the
average score of any of the fifteen “classical” world champions in the
“classical” World Championship matches!
What about the champions’ achievements in their individual World Championship
matches? Here is the Top-10 list of individual achievements, using the same
program at the same depth of search:
| Player | World Championship match | Player's score |
|---|---|---|
| Kramnik | Kramnik-Leko, 2004 | 3.43 |
| Kasparov | Kasparov-Anand, 1995 | 4.35 |
| Anand | Kramnik-Anand, 2008 | 4.49 |
| Anand | Anand-Gelfand, 2012 | 4.81 |
| Capablanca | Lasker-Capablanca, 1921 | 5.48 |
| Anand | Anand-Topalov, 2010 | 5.59 |
| Karpov | Karpov-Kasparov, 1984 | 5.71 |
| Kramnik | Kasparov-Kramnik, 2000 | 5.76 |
| Botvinnik | Botvinnik-Petrosian, 1963 | 5.84 |
| Kasparov | Kasparov-Kramnik, 2000 | 5.85 |
The top 10 scores in the “classical” World Championship matches
(lower scores indicate better quality of play)
The best quality of play (as judged by Houdini 1.5a at 20-ply search) was
therefore demonstrated by Kramnik in his World Championship match against
Leko. As noted above, both Carlsen and Kramnik were on course for an even
better score after the first ten rounds of the FIDE Candidates Tournament.
A brief description of the method
The method used to obtain the results presented in this article is described
in the following scientific paper:
M. Guid and I. Bratko: Using Heuristic-Search Based Engines for Estimating
Human Skill at Chess. ICGA Journal, Vol. 34, No. 2, pp. 71-81, 2011.
[Available as PDF]. An interested reader may also find more information in
the Chessbase.com article “Using chess engines to estimate human skill”.
Here is a summary of the method (see the above paper for explanations):
- The analysis of each game starts at move 12.
- The chess engine evaluates the best moves (according to the computer)
    and the moves played by the player.
- All of the engine’s evaluations are obtained at the same depth of search.
- The score is then the average difference between the evaluations of the
    best moves and the moves played.
- If the player’s mistake (as seen by the engine) at a particular move is
    greater than 3.00 pawns, the score for that move is capped at 300
    “centipawns” (to avoid unreasonably high penalties for gross mistakes).
- Moves where both the move played and the move suggested by the computer
    had an evaluation outside the interval [-2.00, 2.00] are discarded.
    (In clearly won positions players are tempted to play a simple safe
    move instead of a stronger but risky one. Such “inferior” moves are,
    from a practical viewpoint, perfectly justified. Similarly, in
    lost positions players sometimes deliberately play an objectively worse
    move.)
- In the graphs above, all the scores are given in “centipawns”.
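The steps summarised above could be sketched in code roughly as follows. This is only an illustrative sketch, not the program used for the analysis: the names `MoveEval` and `computer_score` are hypothetical, evaluations are assumed to be given in centipawns from the point of view of the player being scored, and clamping negative differences to zero is our assumption for the case where the played move happens to evaluate higher than the engine's choice.

```python
from typing import List, NamedTuple, Optional


class MoveEval(NamedTuple):
    """Engine output for one move of the player being scored (hypothetical type)."""
    move_number: int  # full-move number in the game
    best_cp: int      # evaluation of the engine's preferred move, in centipawns
    played_cp: int    # evaluation of the move actually played, in centipawns


FIRST_MOVE = 12       # the analysis of each game starts at move 12
MAX_PENALTY_CP = 300  # mistakes larger than 3.00 pawns count as 300 centipawns
DECIDED_CP = 200      # bound of the interval [-2.00, 2.00], in centipawns


def computer_score(evals: List[MoveEval]) -> Optional[float]:
    """Average capped evaluation loss per move; None if no move qualifies."""
    losses = []
    for e in evals:
        if e.move_number < FIRST_MOVE:
            continue  # opening moves are not analysed
        # Discard the move only when BOTH evaluations lie outside [-2.00, 2.00].
        if abs(e.best_cp) > DECIDED_CP and abs(e.played_cp) > DECIDED_CP:
            continue
        # Assumption: a played move that evaluates higher than the engine's
        # choice is treated as zero loss rather than a negative one.
        loss = max(0, e.best_cp - e.played_cp)
        losses.append(min(loss, MAX_PENALTY_CP))
    if not losses:
        return None
    return sum(losses) / len(losses)
```

Lower return values correspond to better computer scores. In practice the evaluations would come from a single engine at a single fixed depth, as the summary requires.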
We would like to emphasize that the scores obtained by the program only
measure the average differences between the players' choices of move and
the computer's choice. Although these scores are relative to the chess
engine used, several studies have shown that they are likely to produce
sensible rankings of the players.
A valid comment regarding the computer scoring method is that it does not
take into account the complexity of positions. As a consequence, players
who tend to prefer simple positions are a priori more likely to commit
fewer errors and therefore obtain better computer scores. To qualify the
computer score results from the perspective of complexity, we add the average
complexity estimates of each player's games. These complexity estimates
were computed by the method described in the scientific paper M. Guid and
I. Bratko: Computer analysis of world chess champions. ICGA Journal,
Vol. 29, No. 2, pp. 65-73, 2006.
Again, the chess engine Houdini 1.5a (64-bit) was used to compute the complexity
estimates, this time (in accordance with the algorithm) performing search
to various depths in the range between 2 and 15 plies.
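To give a rough idea of how a depth-dependent complexity estimate might be computed, here is a minimal sketch. It assumes, and this is our assumption rather than something stated in this article, that the estimate for a position grows each time the engine's preferred move changes from one search depth to the next, by the gap between the best and second-best evaluations at the new depth. The cited 2006 paper remains the authoritative definition; the names `DepthResult` and `position_complexity` are hypothetical.

```python
from typing import List, NamedTuple


class DepthResult(NamedTuple):
    """Engine output for one position at one search depth (hypothetical type)."""
    depth: int      # search depth in plies
    best_move: str  # the engine's preferred move at this depth
    best_cp: int    # evaluation of the preferred move, in centipawns
    second_cp: int  # evaluation of the second-best move, in centipawns


def position_complexity(results: List[DepthResult]) -> int:
    """Sum the best/second-best gap at every depth where the best move changes."""
    ordered = sorted(results, key=lambda r: r.depth)
    total = 0
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.best_move != prev.best_move:
            # Assumption: a change of mind by the engine signals a position
            # that is harder to judge, weighted by the evaluation gap.
            total += abs(cur.best_cp - cur.second_cp)
    return total
```

Under this reading, a player's figure would be the average of such values over the positions in that player's games, with `results` covering the depths from 2 to 15 plies.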

The average complexity estimates of each player's games in the World
Championship matches (left) and in the FIDE Candidates 2013 (right).
Lower scores indicate a tendency towards simple positions.
These results confirm previous observations that Capablanca’s outstanding
score in terms of low average differences in computer evaluations between
the player’s moves and the computer’s moves should be interpreted in the
light of his playing style that tended towards low complexity positions.
It is worth noting that according to the results demonstrated in the graph
above, Aronian dealt with the most complex positions (on average) of all
the Candidates. On the other hand, Carlsen’s outstanding computer score
(the estimated quality of play) does not in any way seem to be the
consequence of the level of complexity of positions that occurred in his
games.
The authors
Matej Guid received his Ph.D. in computer science at the University of
Ljubljana, Slovenia. His research interests include computer game-playing,
automated explanation and tutoring systems, heuristic search, and
argument-based machine learning. Some of his scientific works, including the
Ph.D. thesis titled Search and Knowledge for Human and Machine Problem
Solving, are available on Matej's Research page. Chess has been one of
Matej's favourite hobbies since his childhood. He was also a junior champion
of Slovenia a couple of times, and holds the title of FIDE master.
Ivan Bratko is professor of computer science at the University of Ljubljana,
Slovenia. He is head of the Artificial Intelligence Laboratory, Faculty of
Computer and Information Science of Ljubljana University, and has conducted
research in machine learning, knowledge-based systems, qualitative modelling,
intelligent robotics, heuristic programming and computer chess (do you know
the famous Bratko-Kopec test?). Professor Bratko has published over 200
scientific papers and a number of books, including the best-selling Prolog
Programming for Artificial Intelligence. Chess is one of his favourite
hobbies.
References
- M. Guid and I. Bratko. Using Heuristic-Search Based Engines for
    Estimating Human Skill at Chess. ICGA Journal, Vol. 34, No. 2,
    pp. 71-81, 2011.
- M. Guid, A. Pérez and I. Bratko. How Trustworthy is CRAFTY's Analysis of
    World Chess Champions? ICGA Journal, Vol. 31, No. 3, pp. 131-144, 2008.
- M. Guid and I. Bratko. Computer Analysis of World Chess Champions.
    ICGA Journal, Vol. 29, No. 2, pp. 65-73, 2006.
- M. Guid. Search and Knowledge for Human and Machine Problem Solving.
    Ph.D. Thesis, University of Ljubljana, 2010.
- Chessbase: Computers choose: who was the strongest player? 30.10.2006.
- Chessbase (reader's feedback): Computer analysis of world champions.
    2.11.2006.
- G. Haworth. Gentlemen, stop your engines! ICGA Journal, Vol. 30, No. 3,
    pp. 150-156, 2007.
- G. Di Fatta, G. Haworth and K. Regan. Skill rating by Bayesian inference.
    CIDM, pp. 89-94, 2009.
- G. Haworth, K. Regan and G. Di Fatta. Performance and Prediction: Bayesian
    Modelling of Fallible Choice in Chess. Advances in Computer Games,
    Vol. 6048 of Lecture Notes in Computer Science, pp. 99-110, Springer,
    2010.
- K. Regan and G. Haworth. Intrinsic chess ratings. In Proceedings of
    AAAI 2011, San Francisco, 2011.
- C. Sullivan. Truechess.com Compares the Champions. 2008.
- D. R. Ferreira. Determining the Strength of Chess Players based on actual
    Play. ICGA Journal, Vol. 35, No. 1, 2012.