Sunday, December 27, 2015

Database reliability: A cautionary tale

Databases have revolutionized the game of chess. Thanks to them, people have millions of games at their fingertips, all easily searchable. Contrast this with 1980, when I was but a lad of 12, living in the chess hinterlands of Orlando, Florida. I had exactly three books about the game: the then-current edition of the USCF Rules of Chess, Harry Golombek's Chess: A History, and Robert E. Burger's The Chess of Bobby Fischer.

The rule book was a rule book. Golombek's book had all of 55 unannotated games. Burger's book, which was and is a gem, was something of a textbook, and had only a few complete games, though many positions from Fischer's games. (Now that my health has improved, I plan on reviewing that book at some point, and I will explain then why I'm not linking to the book now.) But that was it. Probably fewer than 60 complete games, and I counted myself lucky! Books weren't as easy to come by, especially if you were a child and didn't have a USCF membership. These days anyone with an internet connection can access all manner of online databases for those millions of games mentioned earlier. And relatively cheap databases can be had for offline use. Astounding!

But these databases can have problems. Sometimes these collections haven't been checked well. In some sense, how could they be? Who could review millions of games for quality assurance? A few years back, I found the following game score in ChessBase's collection:



Starting with Black's 25th move, the game score becomes utter nonsense. For all I know this game score is still being used by ChessBase, though I sent them an email at the time. (Side note: An awful lot of the games between Petrosian and Tal were boring as dirt.)

The correct score can be found at ChessGames.com: LINK. That's where I found it then. You'll see that the last few moves of the game actually make sense! The prior score would only be believable if both players were so drunk they were wetting themselves at the board in their oblivion, and even that isn't believable. In that case neither would have resigned, and the most likely outcome would have been someone flagging after passing out.

So, the cautionary note is that you need to review the actual game scores in your database before taking them at face value. Trust no one! If you're a 1700 and find a bunch of stuff that you can determine is nonsense without even turning on a program, then you need to check the game score with another source!
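For anyone who would rather not do that cross-check by eye, here is a minimal sketch in Python of one way to compare two versions of a game score and report the first move where they disagree. The function names `san_moves` and `first_divergence` are my own invention, and the two short scores in the usage lines are made-up illustrations, not the game discussed above. It assumes both sources give plain PGN-style movetext in the same notation, and it only checks that the two sources agree, not that the moves are legal.

```python
import re
from itertools import zip_longest

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def san_moves(movetext):
    """Return the bare moves from PGN-style movetext.

    Drops {comments}, move numbers such as "12." or "12...",
    and the result marker at the end.
    """
    text = re.sub(r"\{[^}]*\}", " ", movetext)
    moves = []
    for token in text.split():
        token = re.sub(r"^\d+\.+", "", token)  # "1.e4" -> "e4", "1." -> ""
        if token and token not in RESULTS:
            moves.append(token)
    return moves

def first_divergence(score_a, score_b):
    """Return (move_number, side, move_a, move_b) at the first mismatch.

    A move is None when one score has already ended.
    Returns None when the two scores agree move for move.
    """
    pairs = zip_longest(san_moves(score_a), san_moves(score_b))
    for ply, (move_a, move_b) in enumerate(pairs):
        if move_a != move_b:
            side = "White" if ply % 2 == 0 else "Black"
            return ply // 2 + 1, side, move_a, move_b
    return None

# Hypothetical scores, for illustration only.
source_one = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 1-0"
source_two = "1. e4 e5 2. Nf3 Nf6 3. Nxe5 d6 1-0"
print(first_divergence(source_one, source_two))
```

A mismatch doesn't tell you which source is right, only where to start looking. You still have to play through the moves from that point and judge which version makes sense.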
