MatPlus.Net Forum / Competitions / Amazing John Nunn
(1) Posted by seetharaman kalyan [Tuesday, Aug 8, 2017 21:35]

Amazing John Nunn

John Nunn amazes again, scoring a perfect score in the individual solving contest and losing to the winner only by a few minutes (who else but Piorun)! Congratulations!

(2) Posted by Frank Richter [Wednesday, Aug 9, 2017 13:39]

Sorry, but the winner was Piotr Murcia. Nevertheless, big respect to both.

(3) Posted by seetharaman kalyan [Wednesday, Aug 9, 2017 15:13]

Sorry, my mistake. Congrats to Piotr Murcia! He seems invincible!

(4) Posted by Miodrag Mladenović [Thursday, Aug 10, 2017 15:19]

The great champion deserves to have his name spelled correctly: it is "Piotr Murdzia". By the way, something is very wrong with the rating system. I believe Piotr wins nine out of ten tournaments he participates in, yet he constantly loses rating points and never gets them back. I am not sure what should be done, but something should be changed in the rating rules. It makes no sense that he is ranked third on the rating list when he wins the majority of his tournaments.

(5) Posted by Piotr Murdzia [Monday, Aug 14, 2017 11:28]

Thank you very much, Misha!
You have touched on a very important issue.

It is indeed very strange. Before, I did not worry about my rating, but lately I have realized something is wrong with the system. Maybe for national championships it is somehow OK, but over the last years I have won many strong competitions (ISC, ECSC, WCSC, opens), and losing rating in such cases makes me angry.

Another strange thing is the possibility of reaching a very high rating after only two or three competitions besides the ECSC and WCSC. Everyone knows whom I am talking about, but I don't have anything against this solver personally. I simply would like to see a change in the system so that a new solver cannot obtain such a high rating after "one summer".

I think a serious discussion on this topic should take place.

Best wishes to all,
Piotr Murdzia

PS. And John Nunn is of course amazing, I agree. I have big respect for him.

(6) Posted by Georgy Evseev [Monday, Aug 14, 2017 15:57]

I would like to add my two cents to this discussion.
1. As far as I can tell, a more or less solid mathematical basis for the current rating system does not exist, even in comparison with the OTB Elo rating. Put simply, there are some arbitrary decisions in our current system, and nobody really knows all the consequences of those old mistakes. The main difference between our system and the Elo system is that ours is rating-based, while the Elo system is points-based. One consequence is that big deviations (a single very good or very poor result) may change a rating very significantly, while this is not so in OTB chess.

The main issue is that the difficulty level of the compositions is not adequately represented. It is much easier for a strong solver to achieve a good result (rating-wise) in a tournament with more difficult problems (when weaker solvers score really poorly) than when the problems are easier and all participants do well. For example, this year in both the ISC and the WCCC Open, the best performance rating was limited to the best of the participants' original ratings. And Piotr scored 100% of the points in the Open. How is he supposed to increase his rating? There are simply no more points for him to gain. So if everyone performs strongly, there is no real rating increase for you.

2. I never took this rating very seriously, but I have to agree that over the last ten years Piotr has performed better than me. When he had a higher rating than me, I considered that a correct representation of our skills. But one or two of Piotr's weaker results show that it is sometimes very easy to lose rating and very difficult to restore or raise it. I consider my high rating mainly a legacy of the early "rating days", when initial ratings were assigned to all solvers.

3. The issue of big first ratings is an obvious trapdoor in the system, one that had simply never been found before. Assigning an initial rating as the average of a solver's first two performances is just plain wrong. We should think about a system similar to OTB chess, where a new solver receives a standard initial rating (perhaps depending on the status or strength of his first tournament) and all calculations are made accordingly.

4. Unfortunately, a lot of work is required to check the current system and to propose an update. Still, maybe it is time to do it.

(7) Posted by Hauke Reddmann [Wednesday, Aug 16, 2017 10:26]

My 0.1€ (and I hope Piotr doesn't mean me :-)
Evidently there is a big difference between OTB and solving ratings. In OTB, you are rated against a player with a well-established rating, so the system stabilizes itself. In solving, as was already pointed out, you are rated against an unknown solving difficulty (and, as a proxy, against your competition).

Note that Elo is a zero-sum game. It works even when people get better at chess! (I guess by giving a higher weight to the current events of players with few rated games. So this is NO disadvantage.)

Couldn't the solving rating be switched completely to Elo? You would then need to define a nonlinear function converting solver points to Elo differences (just as in Elo), with a free parameter chosen such that the sum of the raw rating differences is zero, and then weigh the old and new Elo together.
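A minimal Python sketch of one possible reading of this conversion (my interpretation, not a worked-out proposal: score fractions are mapped to Elo-style differences via the inverse logistic curve, then shifted so they sum to zero; the function name and the clamping constant are hypothetical):

```python
import math

def zero_sum_elo_deltas(score_fractions):
    """Map each solver's score fraction (0..1) to an Elo-style rating
    difference via the inverse logistic curve, then shift the values so
    they sum to zero (the zero-sum property of OTB Elo).
    Hypothetical sketch: the conversion curve is the free choice here."""
    eps = 1e-6  # clamp to avoid division by zero at 0% or 100% scores
    raw = [-400.0 * math.log10(1.0 / min(max(p, eps), 1.0 - eps) - 1.0)
           for p in score_fractions]
    shift = sum(raw) / len(raw)  # free parameter: centre the field
    return [r - shift for r in raw]

# Example: three solvers scoring 90%, 60% and 30% of the points.
print(zero_sum_elo_deltas([0.9, 0.6, 0.3]))
```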
Couldn't this system simply be tested "in parallel"? (Problem 2: contests are MUCH less frequent than OTB tournaments, so wild statistical fluctuations are expected anyway.)

Hauke-Half-A-Mathematician

(8) Posted by Georgy Evseev [Wednesday, Aug 16, 2017 14:10]

@Hauke
Very good observations.
One of the valuable properties of the OTB rating is its direct representation of player results: a difference of 200 points was intended to mean that the better player scores about 75% of the points against the weaker opponent (over a long sequence of games). Unfortunately, the solvers' rating does not have this kind of direct connection with results. So we just have abstract numbers, which are only expected to follow the rule "the more the better", though we do not even know if that is so. At least I do not know of any mathematical basis for the current system of solvers' ratings beyond "look, the system seems to work more or less like the OTB ratings".

In the past I thought about how a "good" system should be organized, but I largely stopped when the current system was made official. Still, it seems worthwhile to refresh some of those thoughts.
Let's start again with the problems.
1. The results depend very much on the difficulty of the compositions.
2. "Grandmasters" and "amateurs" can participate in the same competition under the same conditions.
3. There is no general understanding of what the rating means.
I am not sure how to resolve all these difficulties, but we can, for example, postulate that a difference of 100 rating points means that the weaker solver generally (over a long series of compositions) scores 90% of the points of the better solver.

Then, if we somehow transform scores onto a percentage scale (is the tournament winner considered 100%?), the following formula should represent the expected results of two solvers ("better" and "weaker"):

PB/PW = 0.9^(-(RB-RW)/100), where PB, PW are the "percentage results" and RB, RW the current ratings.

We then need a way to transform a real result onto the percentage scale (PR) and to calculate the expected result on the same scale (PE). If we can do that, we can use the "classic" formula

RN = RO + K*(PR-PE), where RN is the new rating, RO the old rating, and K a multiplier which may depend on the category of the tournament, the number of compositions solved, the difficulty level, etc.

At least such a system will answer questions like: "My rating has increased by 50 points; how much have my results improved?"
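A minimal Python sketch of this scheme, assuming (since the normalisation above is left open) that the top-rated participant serves as the 100% reference, and with a purely hypothetical K = 0.5:

```python
def expected_percentage(r_solver, r_top):
    """Expected result on the percentage scale, taking the top-rated
    participant as the 100% reference: each 100 rating points below the
    reference costs a factor 0.9, per the postulate above."""
    return 100.0 * 0.9 ** ((r_top - r_solver) / 100.0)

def new_rating(r_old, actual_pct, expected_pct, k=0.5):
    """Classic update RN = RO + K*(PR - PE); the value of K is a
    hypothetical placeholder, to be tuned per tournament category."""
    return r_old + k * (actual_pct - expected_pct)

# Example: a 2600-rated solver in a field topped by a 2700-rated solver
# is expected to score 90%; scoring 95% gains rating.
pe = expected_percentage(2600, 2700)  # -> 90.0
print(new_rating(2600, 95.0, pe))     # -> 2602.5
```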
(9) Posted by Harri Hurme [Wednesday, Aug 23, 2017 00:12]

Rating of chess problem solvers

I wrote this new rating system paper in 2005 but never published it. Now it seems the proper time to reveal the idea. I apologize for my English; this may not be fluent reading.
Background
I have been a member of the Finnish players' Elo committee since its foundation in the 1970s. I formulated the first mathematically described solvers' rating system soon after 1985. This system, called RELO, is described (in Finnish) on the ST web pages at http://www.sci.fi/~stniekat/st/relo.htm, but it can readily be translated into any language, e.g. with Google Translate. The rating system used by the PCCC and WFCC does indeed have many shortcomings. In particular, it is based on the assumption that gained points and rating are linearly related. This is not true, because there are typically some very easy and some very difficult problems. I claim that both RELO and the new system presented here are superior. But of course an even better system can be developed; by no means do I claim that this is the end of the story.
Time factor or Lex Perkonoja
RELO has some special features, e.g. the time factor, or "Lex Perkonoja". This is because in the early solving contests the selected problems tended to be too easy for the best solver; he could typically solve them in half the given maximum time. Without the time factor the rating system could not do justice to the best solvers. Of course, sometimes he did not achieve the full maximum because of some small, silly error, which means the official winner could be lower rated; but the rating system should indicate the true ability of solvers, not only who officially came first. Whether the time factor should be included is an open question. Nowadays the problems tend to be very difficult, and omitting the time factor does not affect the ratings too much. Thus including it is not necessary, but it can be kept for the sake of completeness.
New RELO formulas
Basic principle
One can consider a single problem in a chess problem solving contest to be like an opponent in an OTB chess tournament. Because we do not know the exact difficulty of the problems beforehand, we must treat the problems like unrated opponent "players". In a game rating system we can compare players who have never played against each other; it is fully adequate that they have played against opponents whose ratings are based on a common rating pool. By analogy, we can rate solvers because we can first calculate the difficulty of the problems: all the solvers in a solving tournament tried to solve them, so we consider the problems as common opponents for all the solvers present. In game rating we can first calculate the ratings of new players, and after that we can calculate the rating change for those players who met the new players. (A footnote: presently FIDE does not rate games against unrated players, but that could be done if their ratings were calculated first.) This principle of treating problems as opponents automatically solves the basic rating problem of how to handle very easy or very difficult problems.

The first cornerstone of the rating system is the difficulty rating of the problems, calculated from the old ratings of the solvers and the solving results. We call this rating "Rdifficulty"; it tells how tough an "opponent" each problem is.
Rdifficulty = Rav - C*arctanh(2*Pav/Pmax - 1)

where
Rav = average rating of the solvers who have an old rating
C = 800/ln(10) ≈ 347.4
Pav = average points scored on the problem by the solvers who have an old rating
Pmax = maximum points for a problem (presently Pmax = 5)
I prefer using the inverse hyperbolic tangent, but a logistic function could be used as well.
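A small Python sketch of this formula (the clamping of 0% and 100% averages, where arctanh diverges, is my addition):

```python
import math

C = 800.0 / math.log(10.0)  # ~347.4, as defined above

def r_difficulty(solver_ratings, points_scored, p_max=5.0):
    """Difficulty rating of one problem, treated as an unrated
    'opponent' faced by every rated solver in the field.
    solver_ratings: old ratings of the rated solvers
    points_scored:  points each of them scored on this problem (0..p_max)"""
    r_av = sum(solver_ratings) / len(solver_ratings)
    p_av = sum(points_scored) / len(points_scored)
    eps = 1e-6  # clamp: arctanh is undefined at 0% and 100%
    frac = min(max(p_av / p_max, eps), 1.0 - eps)
    return r_av - C * math.atanh(2.0 * frac - 1.0)

# Example: a field averaging 2400 scores 2.0/5 on average, so the
# problem rates as a somewhat stronger "opponent" than the field.
print(round(r_difficulty([2500, 2400, 2300], [3, 2, 1])))  # -> 2470
```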
After calculating the difficulty of the problems, we can proceed with standard Elo rating methods. I give the required formulas here as well.

Pexpected = 0.5*(1 + tanh((Ro - Rdifficulty)/C))

This is calculated separately for each solver and each problem, and summed. The new rating is

Rnew = Ro + K*sum(Pachieved - Pexpected)

The coefficient K is normally 4, but it can differ, as in normal Elo rating calculations (4*Pmax = 20). The K factor can be different for different categories of solving tourneys, and the number of solvers should also affect it. These questions need practical testing, so they should be discussed after testing.
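A Python sketch of these two formulas; note that I scale Pexpected by Pmax so that expected and achieved points are on the same scale, a detail the formulas above leave implicit:

```python
import math

C = 800.0 / math.log(10.0)  # ~347.4

def p_expected(r_old, r_difficulty, p_max=5.0):
    """Expected score on one problem: the Elo expectancy written with
    tanh, scaled by p_max so that expected and achieved points are on
    the same scale (my reading; the post leaves the scaling implicit)."""
    return p_max * 0.5 * (1.0 + math.tanh((r_old - r_difficulty) / C))

def r_new(r_old, achieved, difficulties, k=4.0, p_max=5.0):
    """Rnew = Ro + K*sum(Pachieved - Pexpected) over all problems."""
    delta = sum(p - p_expected(r_old, d, p_max)
                for p, d in zip(achieved, difficulties))
    return r_old + k * delta

# Example: a 2500-rated solver scores 4, 5 and 2 points on problems of
# difficulty 2450, 2550 and 2650.
print(round(r_new(2500, [4, 5, 2], [2450, 2550, 2650]), 1))  # -> 2518.1
```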
Calculate the performance rating for each (new) solver. At this point we know the ratings of the "opponents" (the problems) and the results:

Rperformance = Rav + C*(arctanh(2*Pt/Pmaxt - 1) - arctanh(2*Pavt/Pmaxt - 1))

The index t means that total summed points are used here, not per-problem points as above.
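A Python sketch of the performance rating, again with my clamping of the 0%/100% endpoints:

```python
import math

C = 800.0 / math.log(10.0)  # ~347.4

def r_performance(r_av, p_total, p_av_total, p_max_total):
    """Performance rating of a (new) solver from total tourney points,
    relative to what the average rated solver scored.
    r_av:        average old rating of the rated solvers
    p_total:     this solver's total points
    p_av_total:  average total points of the rated solvers
    p_max_total: maximum obtainable total points"""
    def offset(points):
        eps = 1e-6  # clamp away from the 0%/100% singularities
        frac = min(max(points / p_max_total, eps), 1.0 - eps)
        return math.atanh(2.0 * frac - 1.0)
    return r_av + C * (offset(p_total) - offset(p_av_total))

# Example: scoring 45/60 in a field that averaged 30/60 and was rated
# 2400 on average.
print(round(r_performance(2400, 45.0, 30.0, 60.0)))  # -> 2591
```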
The classical Elo rating is based on the cumulative normal distribution. Instead of the normal distribution, Prof. Arpad Elo suggested using the so-called logistic function as an approximation; the basic reason seems to be the ease of practical calculation. The logistic function 1/(1 + 10^(-(Ro - Ropp)/400)) is a close approximation; actually, the US Chess Federation considers it more accurate. We transform the logistic function into a more practical form using the hyperbolic tangent and its inverse, via the identity 1/(1 + 10^(-x)) = 0.5*(1 + tanh(x*ln(10)/2)); with x = (Ro - Ropp)/400 and C = 800/ln(10) this is exactly the Pexpected formula above.

The hyperbolic tangent is commonly used in engineering because it is very simple to evaluate approximately: tanh(x) ≈ x for small arguments, tanh(x) ≈ 1 for x >> 1, and tanh(x) ≈ -1 for x << -1. This helps when approximating in the normal middle range.
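A quick numerical check of this identity:

```python
import math

# Check: 1/(1 + 10**(-x)) == 0.5*(1 + tanh(x*ln(10)/2)).  With
# x = (Ro - Ropp)/400 and C = 800/ln(10) this is exactly the
# Pexpected formula above.
for x in (-1.0, -0.25, 0.0, 0.5, 2.0):
    logistic = 1.0 / (1.0 + 10.0 ** (-x))
    via_tanh = 0.5 * (1.0 + math.tanh(x * math.log(10.0) / 2.0))
    assert abs(logistic - via_tanh) < 1e-12
print("identity holds")
```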
Now that this is published, I hope someone will seriously read it and even implement it in Excel. One open question is how to initialize the system. This is a separate question with a standard solution, as explained by Elo: iterate with old results, and the ratings will adjust to the right level. Of course many other practical things remain to be decided; this only describes the basic principles.

Harri Hurme

(10) Posted by Harri Hurme [Wednesday, Aug 23, 2017 00:12]; edited by Harri Hurme [17-08-23]

deleted

(11) Posted by Roland Ott [Wednesday, Aug 23, 2017 11:35]

@Harri,
Thank you very much for this very interesting post which "someone seriously read" this morning...
I will discuss this topic within the WFCC Solving Committee and contact you by email.
Roland Ott

(12) Posted by Georgy Evseev [Wednesday, Aug 23, 2017 15:46]

@Harri

A very well-thought-out post, thank you.
@All
There is a FIDE table for performance rating; see https://www.fide.com/component/handbook/?id=174&view=article, paragraph 1.48. It is probably a numerical approximation of a formula.
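For reference, that table can be approximated by the smooth logistic inverse; a small Python sketch (the function name is hypothetical, and the actual FIDE table is derived from the normal distribution, so its values differ slightly):

```python
import math

def dp_from_fraction(p):
    """Rating-difference equivalent of score fraction p via the smooth
    logistic inverse dp = -400*log10(1/p - 1). The FIDE table itself is
    derived from the normal distribution, so its values differ slightly.
    The function name is hypothetical."""
    p = min(max(p, 0.01), 0.99)  # clamp: the formula diverges at 0%/100%
    return -400.0 * math.log10(1.0 / p - 1.0)

print(round(dp_from_fraction(0.75)))  # -> 191 (FIDE's table gives 193)
```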