Are Chess Improvers Causing a Lichess Tactic Rating Deflation?

15 Sept 2023 · 14,981 views · English (US)

A look into the puzzle data

Welcome! This post was originally published on saychess.substack.com. Since publishing, I have made a few small corrections thanks to excellent feedback from my readers. The newsletter goes out to 3,800+ chess players. New subscribers receive the ebook '100 Headachingly Hard Mate In Two Puzzles Composed By Sam Loyd'.

- Click here to sign up

While compiling the puzzles for the next Tactics Ladder book, a tactics book series based on the highest user-rated Lichess puzzles, I noticed something. The puzzle ratings shown online seem to be lower than those in the Lichess puzzle data I downloaded in December 2021 (it is not clearly specified whether the data itself is older). So I got curious and decided to collect some data to check whether I was right.

Each puzzle on Lichess has a unique puzzle ID. You can see it in the link structure when you solve puzzles:

lichess.org/training/S1k6e (the Puzzle ID)

This particular puzzle is one of the most solved puzzles on Lichess. It has been attempted over 220,000 times. In my data from 2021 it had a rating of 1515, but today it has dropped to 899!* That is a 616-point drop. (*After I published the newsletter yesterday, it jumped back up to 1264.)

Try puzzle S1k6e

Such a huge drop is of course an outlier. Maybe this Boden’s mate has simply reached tactical meme status?

To investigate, I drew two samples of puzzle IDs from my 2021 dataset: one with the most solved puzzles overall, and one with the most solved puzzles in each 100-point rating band from 1000 to 2400. Each sample contained more than 1,000 puzzles.
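The banded sample can be sketched roughly as follows. This is a minimal illustration, not the script actually used for this post: the field names (`id`, `rating`, `plays`), the helper names, and the example rows are all hypothetical, and treating the range as 1000 up to but not including 2400 is an assumption.

```python
from collections import defaultdict

def band_of(rating, width=100):
    """Return the lower edge of the rating band, e.g. 1549 -> 1500."""
    return (rating // width) * width

def most_solved_per_band(puzzles, per_band, low=1000, high=2400, width=100):
    """Pick the `per_band` most-played puzzle IDs in each band within [low, high)."""
    by_band = defaultdict(list)
    for p in puzzles:
        if low <= p["rating"] < high:
            by_band[band_of(p["rating"], width)].append(p)
    sample = {}
    for band, items in sorted(by_band.items()):
        # Most-played first, then keep only the top `per_band` entries.
        items.sort(key=lambda p: p["plays"], reverse=True)
        sample[band] = [p["id"] for p in items[:per_band]]
    return sample

# Hypothetical rows for illustration only, not real Lichess data.
puzzles = [
    {"id": "aaaaa", "rating": 1510, "plays": 900},
    {"id": "bbbbb", "rating": 1580, "plays": 1200},
    {"id": "ccccc", "rating": 1530, "plays": 300},
    {"id": "ddddd", "rating": 2210, "plays": 50},
    {"id": "eeeee", "rating": 2500, "plays": 9999},  # outside the 1000-2400 range
]
sample = most_solved_per_band(puzzles, per_band=2)
print(sample)  # {1500: ['bbbbb', 'aaaaa'], 2200: ['ddddd']}
```

Sampling per band rather than overall keeps the thinly populated high-rating bands from being crowded out by the very popular mid-rated puzzles.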

I then wrote a script that calls the Lichess API with each puzzle ID from the dataset and collects the current rating of the matching puzzle. The output looked like this:

PuzzleID,2021-rating,online-rating,rating change,game phase
0575T,1261,1284,23,endgame 
T3ef9,1258,1288,30,endgame 
0ABXA,1659,1714,55,middlegame 
04BH8,1424,1395,-29,middlegame 
01jKd,1749,1879,130,endgame 
0ARDw,1149,1192,43,opening 
04z7E,1184,1117,-67,opening 
0HdXY,1459,1486,27,endgame 
3urVs,2192,2245,53,endgame 
VWLmC,1962,1992,30,endgame 
Og1lv,2061,2077,16,endgame 
7IOmg,1845,1860,15,middlegame
...
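A script of this kind could look roughly like the sketch below. It is not the author's actual script. It assumes the public endpoint `https://lichess.org/api/puzzle/{id}` returns JSON with the rating under `puzzle` -> `rating`; the function names and the `delay` parameter are hypothetical. The demo at the bottom uses a stand-in lookup with two values from the output above, so it runs without touching the network.

```python
import csv
import json
import sys
import time
import urllib.request

# Assumed endpoint; check the Lichess API documentation before relying on it.
API_URL = "https://lichess.org/api/puzzle/{}"

def fetch_rating(puzzle_id):
    """Fetch the current rating of one puzzle (assumes a 'puzzle' -> 'rating' field)."""
    with urllib.request.urlopen(API_URL.format(puzzle_id), timeout=10) as resp:
        data = json.load(resp)
    return data["puzzle"]["rating"]

def compare(rows, fetch=fetch_rating, delay=0.0):
    """Yield (id, old rating, current rating, change, phase) for each input row.

    rows: iterable of (puzzle_id, old_rating, phase) tuples.
    delay: seconds to pause between requests, to be polite to the server.
    """
    for puzzle_id, old_rating, phase in rows:
        current = fetch(puzzle_id)
        yield puzzle_id, old_rating, current, current - old_rating, phase
        time.sleep(delay)

def write_csv(results, out=sys.stdout):
    """Write the comparison in the same column layout as the output above."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PuzzleID", "2021-rating", "online-rating", "rating change", "game phase"])
    writer.writerows(results)

# Offline demo: a dict lookup stands in for the real API call.
fake_ratings = {"0575T": 1284, "04BH8": 1395}
rows = [("0575T", 1261, "endgame"), ("04BH8", 1424, "middlegame")]
write_csv(compare(rows, fetch=fake_ratings.__getitem__))
```

For a real run, the default `fetch_rating` would be used together with a non-zero `delay`, since more than a thousand requests in quick succession may be rate-limited.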

Here is the plot of the most solved puzzles.

As you can see, most of them are in the 1400-1600 rating range. We can also see that most of the dots lie below 0, and that the trendline slopes downward as the puzzles get more difficult. After seeing how concentrated this plot was, I decided to make another dataset where the puzzles were divided into rating bands.

If we do that we get this result.

Again, the tendency is that puzzles have dropped more points the higher we go in puzzle rating. I then wondered whether the game phase of the puzzle had an effect on the drop, so I subdivided the puzzles into opening, middlegame, and endgame puzzles.

It seems that the opening puzzles have seen the largest rating drops, but other than that I’m unsure whether we can conclude much from this plot.

Finally, let us look at the average rating change by band:

This shows us that the lower rating bands actually saw a slight increase, while the largest drops were concentrated around the 1500-1600 rating range. So if you had a 1550 puzzle rating in December 2021 and did nothing until today, we could expect your puzzle rating to drop to around 1425.
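The averages by band and by game phase can be computed along these lines. This is a minimal sketch with a hypothetical helper, `average_change`, and it uses only the twelve rows from the sample output earlier in the post, so its numbers illustrate the calculation and are not representative of the full dataset or of the plots.

```python
from collections import defaultdict

# (puzzle ID, 2021 rating, online rating, game phase), copied from the sample output.
ROWS = [
    ("0575T", 1261, 1284, "endgame"),
    ("T3ef9", 1258, 1288, "endgame"),
    ("0ABXA", 1659, 1714, "middlegame"),
    ("04BH8", 1424, 1395, "middlegame"),
    ("01jKd", 1749, 1879, "endgame"),
    ("0ARDw", 1149, 1192, "opening"),
    ("04z7E", 1184, 1117, "opening"),
    ("0HdXY", 1459, 1486, "endgame"),
    ("3urVs", 2192, 2245, "endgame"),
    ("VWLmC", 1962, 1992, "endgame"),
    ("Og1lv", 2061, 2077, "endgame"),
    ("7IOmg", 1845, 1860, "middlegame"),
]

def average_change(rows, key):
    """Average (online rating - 2021 rating) per group, where key(old, phase) names the group."""
    groups = defaultdict(list)
    for _puzzle_id, old, new, phase in rows:
        groups[key(old, phase)].append(new - old)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}

# Group by 100-point band of the 2021 rating, e.g. 1459 -> 1400.
by_band = average_change(ROWS, key=lambda old, phase: old // 100 * 100)
# Group by game phase.
by_phase = average_change(ROWS, key=lambda old, phase: phase)

for band, avg in by_band.items():
    print(f"{band}-{band + 100}: {avg:+.1f}")
for phase, avg in by_phase.items():
    print(f"{phase}: {avg:+.1f}")
```

Bands are assigned from the 2021 rating rather than the current one, so a puzzle stays in the same group no matter how far its rating has moved since.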

What Can We Learn From the Changes in Puzzle Ratings?

1000-1200 Band: The ratings in this range have increased. While this could be interpreted in various ways, one possibility is that many new players are still grappling with foundational tactics.
Decline in Higher Bands: There's a notable decrease in puzzle ratings from the 1400-1600 to 2200-2400 bands. This could suggest a variety of things, including the possibility that more seasoned players are getting better at solving these particular puzzles.
Mild Decline in 1200-1400: There's a slight drop in this range. While the reasons for this aren't clear-cut, it could indicate players just past the beginner stage are gradually improving.

Points of Consideration

External Influences: Multiple factors can skew the data:
- Lichess might have altered its puzzle system.
- A shift in the main platform used by the chess community (e.g., from chess.com to Lichess).
- Instances of cheating, which might be more prevalent in tactics since some people don't take it as seriously as cheating in actual games. This could particularly depress ratings for harder puzzles.
Growing Player Base: The online chess community has expanded, spurred by "The Queen's Gambit" and the pandemic. These new entrants might be affecting puzzle ratings, though the exact impact is uncertain.

Solving by Theme: The introduction of the Lichess puzzle dashboard in January 2021 might also be a reason for the rating deflation. The new dashboard gave users the option to solve puzzles by theme. When you solve puzzles knowing the theme, you know what to look for. If a puzzle's rating is updated based on attempts by people who already know the theme, that would also distort the rating system. Whether that is the case, I do not know.

In light of the above, while the data provides some interesting patterns and tendencies, it's important to approach interpretations with some caution.

The data can hint at a potential rise of online chess improvers that might have contributed to pushing the ratings down, but we don’t know for sure.

One clear takeaway is that you should not obsess over your rating. While a player might improve their puzzle-solving skills, broader factors influencing the rating system can still lead to a drop in their actual rating. In essence, the rating system is a magic box and you do not control most of the variables.

I welcome your perspective on these results and observations.


/Martin
