|(1) Posted by Marek Kwiatkowski [Sunday, Jan 24, 2010 13:54]; edited by Marek Kwiatkowski [10-02-02]|
IQ test for UCI engines
“Rybka 3” and other UCI engines (clones or not clones) are tested by chess matches.
For us, an important question should be: can a UCI engine think?
I prepared a test to compare this ability.
All 3 example positions have Black to move, because Black is faster.
I tested Fritz 9, Rybka 2.n2 w32 and Stockfish 1.6 JA.
How should this test be interpreted? It depends on your computer (RAM, CPU), engine and patience.
My conditions for engines:
Hash - 128
Clear Hash (before each solution)
Tablebases - no
MultiPV - 1 (one line of solution)
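The conditions above can be written as UCI commands. This is only a minimal sketch: the function name `uci_setup_commands` is made up for illustration, `Hash`, `MultiPV` and `Clear Hash` are the standard UCI option names (an individual engine may not offer all of them), and tablebase options are engine-specific, so "Tablebases - no" is modelled here simply by not setting any tablebase path.

```python
from typing import List


def uci_setup_commands(hash_mb: int = 128, multipv: int = 1) -> List[str]:
    """Build the UCI commands for the test conditions (hypothetical helper).

    No tablebase option is sent, which is assumed to leave tablebases off.
    """
    return [
        "uci",                                      # ask the engine to identify itself
        f"setoption name Hash value {hash_mb}",     # Hash - 128
        f"setoption name MultiPV value {multipv}",  # one line of solution
        "setoption name Clear Hash",                # clear hash before each solution
        "isready",                                  # wait until the engine is ready
    ]


if __name__ == "__main__":
    for command in uci_setup_commands():
        print(command)
```

The commands would be sent to the engine's standard input, one per line, before each position.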
by Pal Benko, Sakkelet 2000, 1st Pr
(= 3+3 )
black to move and win
(= 4+3 )
black to move and win
1...Qh1+ 2.Ka2 Qa8+ 3.Kb1 Qe4 4.Bf7 Qf3 5.Rxd2 Qf1+ (crucial move) -+
by Marek Kwiatkowski, Schach 1993
(= 9+6 )
Black to move and win
1...a1R 2.Ba8 (engine can choose a worse move e.g. 2.Bd5 Rd1) Rg1 3.Bb7 Rb1 ... 7.Bf3 Rf1 8.Be2 Rf2 9.Bf3 Rxh2+ -+
Only Stockfish satisfied me: it solved all the positions in a time that was reasonable for me (and my computer).
|(2) Posted by Marek Kwiatkowski [Friday, Jan 29, 2010 07:15]|
This topic found no response. That is a pity, because many of us should be interested in the ability of engines to solve positions with a subtle passive relationship between pieces.
The examples are not extremely hard, but I do not expect that an engine can solve the full version of Benko’s study.
(= 4+4 )
In addition, it is no joke that engines can work differently depending on the starting color of the same position. So the test should be done twice (white <-> black).
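For the second run (white <-> black), the position has to be mirrored top to bottom with all piece colors exchanged. A rough sketch of that transformation on a FEN string follows; the helper name `swap_colors` and the sample FEN are illustrative only (the diagrams of this thread are not reproduced here).

```python
def swap_colors(fen: str) -> str:
    """Return the color-reversed version of a FEN position (hypothetical helper).

    The board is flipped rank-wise, piece colors are exchanged and the
    side to move, castling rights and en passant square are adjusted.
    """
    board, side, castling, ep, halfmove, fullmove = fen.split()

    # Reverse the order of the ranks and exchange upper/lower case (white/black).
    flipped = "/".join(rank.swapcase() for rank in reversed(board.split("/")))

    # The other side is now to move.
    side = "b" if side == "w" else "w"

    # Castling rights change owner; keep the conventional KQkq order.
    if castling != "-":
        swapped = castling.swapcase()
        castling = "".join(c for c in "KQkq" if c in swapped)

    # The en passant square keeps its file; rank 3 becomes 6 and vice versa.
    if ep != "-":
        ep = ep[0] + str(9 - int(ep[1]))

    return " ".join([flipped, side, castling, ep, halfmove, fullmove])


if __name__ == "__main__":
    # Made-up sample position, not one of the test positions.
    print(swap_colors("4k3/8/8/8/8/8/P7/4K3 w - - 0 1"))
```

Applying the function twice gives back the original position, which is a quick sanity check before feeding both versions to an engine.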
The newest versions of engines are rated at >3000 Elo. This test allows us to verify it.
I want to extend FANCY to UCI engines, so some other observations would help (and encourage) me.
Stockfish is free (download): http://www.mediafire.com/?koinmvkm1mh
|(3) Posted by Siegfried Hornecker [Friday, Jan 29, 2010 07:45]|
I think the right person to ask about such things is Emil Vlasak, who has the beautiful column in EG. Even though I'm very familiar with internet things, I don't know of good tests for such things. There are a few tests for engines that one might be able to find on the internet.
However, an interesting case is to find bugs.
(= 9+7 )
White to move wins - well, if the engine isn't buggy, that is.
Fritz 11 gives as its top 5: 1.gxh8Q+, 1.gxh8R+, 1.e8R, 1.e8Q, 1.Se3 - all with a mate in up to 5 moves
Winning is 1.f8S+ Rxf8+
Now Fritz 11 sees 2.exf8S+ and 2.gxf8S+ as equal - even though one loses.
After 2.exf8S+! Qxf8 Fritz 11 finally finds 3.gxf8S+ and 4.Sxe6 with a clear win.
And this example could be made even better if one managed to find a position where in both cases a promotion to queen would seem stronger than promoting to knight and later to queen.
The bug appears if all of the following four conditions are fulfilled:
- The white king is in check
- White captures a black piece
- White promotes to a knight
- At the end of the move the black king is in check
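The four conditions can be written down as a simple predicate, for instance to filter candidate moves when hunting for further positions of this kind. This is only a sketch: the `MoveFacts` record and its field names are invented for illustration, and the facts about a move are assumed to be supplied by the caller (for example read off the notation).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MoveFacts:
    """Facts about one white move (hypothetical record, filled in by hand)."""
    white_king_in_check_before: bool  # the white king is in check
    captures_black_piece: bool        # White captures a black piece
    promotion_piece: Optional[str]    # "S" for knight, as in the notation above
    gives_check: bool                 # the black king is in check afterwards


def matches_bug_pattern(move: MoveFacts) -> bool:
    """True only if all four listed conditions hold."""
    return (
        move.white_king_in_check_before
        and move.captures_black_piece
        and move.promotion_piece == "S"
        and move.gives_check
    )


if __name__ == "__main__":
    # 2.exf8S+ after 1...Rxf8+: check evaded by capturing, knight promotion, check.
    print(matches_bug_pattern(MoveFacts(True, True, "S", True)))
```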
I don't know who first reported that bug; I found it myself some years ago.
|(4) Posted by Kevin Begley [Friday, Jan 29, 2010 13:11]|
Who created example 2? And, I wonder, why not play it just a little further:
5...Qf1+! 6.Kc2 6...Qxf7 (6...Qf5+? 7.Rd3! =) 7.Rd4+ 7...Kb5! (7...Kc5? 8.Ra4! =; 7...Ka5? 8.Kb1!! 8...Qg6+ 9.Ka2 =) -+
I've seen composer's names hanging above much less interesting problems... did nobody take credit for this?
|(5) Posted by Siegfried Hornecker [Friday, Jan 29, 2010 16:17]|
This is from the analysis of the Somov-Nasimovich study. See http://www.matplus.net/pub/start.php?px=1264778217&app=forum&act=posts&fid=gen&tid=664&pid=4510#n4510
|(6) Posted by Marek Kwiatkowski [Saturday, Jan 30, 2010 07:43]; edited by Marek Kwiatkowski [10-01-30]|
Kevin, I am glad you enjoyed analyzing the 2nd example.
It seems to be an important position for the theory of endgames. The crucial move is 5…Qf1+!, and I expect engines to “see” all further moves.
In addition, it is possible to lose the first move in other ways. I hope it is obvious.
Siegfried, maybe there are some other tests, but I believe that my test is very useful for us (composers).
Thanks to the test, I found a very strong and “clever” engine – Stockfish.
I tested Stockfish on a Pentium III, 256 MB RAM, Windows 98SE with my patience <5 minutes (per item).
I can’t test Rybka 3 or Firebird.
|(7) Posted by Marek Kwiatkowski [Tuesday, Feb 2, 2010 14:11]; edited by Marek Kwiatkowski [10-02-02]|
I am adding two new positions to my test.
by F. Simchowitsch, Pravda 1927, 1st Pr
(= 7+8 )
Black to play and draw
1…Bd7+ 2.Ka6 Kb8 3.Qxa1 Ka8 =
All further neutral White moves should be answered by king moves. The estimated score is ~0.00.
And the 5th one, which ended my good mood.
by F. Simchowitsch, 1/2nd Pr Shakhmaty 1926
(= 5+5 )
Black to play and win
I especially recommend testing this last, simple position.
|(8) Posted by Marek Kwiatkowski [Tuesday, Feb 2, 2010 18:49]; edited by Marek Kwiatkowski [10-02-02]|
I tested Firebird 1.0 beta on a Pentium IV 3 GHz, 512 MB RAM, Windows XP (my patience - 5 minutes, hash - 128M). Some say that it is a clone of Rybka 3 (the best playing engine).
Firebird can solve only the 1st example. I think that the commercial engines Rybka 3, Fritz 12 or Shredder 12 will not get a better result.
Stockfish fails only on the 5th example.
Toga II 2.0SE (free download http://www.chesslogik.com/toga.htm) can’t solve the 2nd and 5th examples.
|(9) Posted by Hauke Reddmann [Wednesday, Feb 3, 2010 10:56]|
I certainly flunked the IQ test :-)
Can someone explain Example 1 to me? Where does the right to castle come in?
|(10) Posted by Marek Kwiatkowski [Friday, Feb 5, 2010 01:43]; edited by Marek Kwiatkowski [10-02-05]|
Of course, castling in the first example is possible.
I was using 32-bit versions of the engines in each test. It is possible that 64-bit versions behave differently.
Here is the last test position.
R. Reti, Shakhmaty 1928, 1st Prize
(= 6+6 )
white to move and win
1. Kh6!! Be5 2. Kg7 Bh2 3. c4!! bxc4 4. e5!! Bxe5 5. bxc4 1-0
I hope it was good fun. John Henry is coming soon!
|(11) Posted by Siegfried Hornecker [Saturday, Apr 3, 2010 18:28]; edited by Siegfried Hornecker [10-04-03]|
(= 4+5 )
Bohemia 1906, 2nd to 5th prize
White to move and draw
I gave up on solving this after 18 months (!), and it's no shame to say this, since nobody ever would find the solution if he doesn't get any hint. However, a computer should be able to solve it in a few minutes.
After 3:30 minutes, Deep Fritz 11 (1968 MB) has the correct solution in 2nd place, directly after 1.Sg7+ (which loses), but is unable to see it as a clear draw (which every human then sees easily). After that, nothing changes for a long time. It's 10 minutes now and I am stopping it, seeing that Deep Fritz 11 finds the move but doesn't understand it's a draw. I guess the horizon effect will always show up somewhere, so it would take longer than is useful.
1.Kc6!! g1Q 2.Sxh4! and white draws since 3.Sef3 is not stoppable in a useful way.
|(12) Posted by Marek Kwiatkowski [Monday, Apr 5, 2010 09:04]|
It is a typical fortress. Programmers are not interested in preventing such extremely rare misbehavior of engines (the same goes for crazy pieces). Preventing it could decrease the playing strength (Elo). The lack of other solutions is also an important help for composers.
|(13) Posted by Bojan Basic [Monday, Jul 12, 2010 02:14]; edited by Bojan Basic [10-07-12]|
It seems that Rybka has a bug and cannot find the winning move in this position. (This is the end of a study by E. Pogosjanc, Springaren, 1965.)
(= 6+5 )
Could anybody else check whether Rybka finds the simple 1. a7-a8=B?
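Such a check can also be automated by comparing the engine's final `bestmove` line with the expected move in UCI coordinate notation (`a7a8b` for 1. a7-a8=B). A minimal sketch, with made-up helper names and made-up sample engine lines (they are not actual Rybka output):

```python
def long_algebraic_to_uci(move: str) -> str:
    """Convert e.g. "a7-a8=B" to UCI form "a7a8b" (hypothetical helper)."""
    squares, _, promo = move.partition("=")
    origin, target = squares.replace("x", "-").split("-")
    # UCI uses lowercase English letters; "S" (Springer) stands for the knight.
    promo = "n" if promo == "S" else promo.lower()
    return origin + target + promo


def engine_found(bestmove_line: str, expected: str) -> bool:
    """True if a UCI "bestmove ..." line names the expected move."""
    parts = bestmove_line.split()
    return (
        len(parts) >= 2
        and parts[0] == "bestmove"
        and parts[1] == long_algebraic_to_uci(expected)
    )


if __name__ == "__main__":
    # Illustrative lines only, not real engine output.
    print(engine_found("bestmove a7a8b", "a7-a8=B"))
    print(engine_found("bestmove a7a8q ponder h7h6", "a7-a8=B"))
```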
|(14) Posted by Vladimir Tyapkin [Monday, Jul 12, 2010 04:50]|
It's a feature of Rybka (at least it's not fixed in Rybka 4). Google for 'rybka bishop underpromotion'. One of many discussions on the topic: http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=1042