Abstract strategy games
Latest revision as of 04:47, 27 September 2025
An abstract strategy game is a board, card or other game in which gameplay has little or no theme and the outcome is determined by the players' decisions. Such games are combinatorial, i.e. they have perfect information and involve no non-deterministic elements (such as shuffled cards or dice rolls). This page mostly covers two-player zero-sum games.
Common examples of such games include:
- Chess
- Hive
- Xiangqi (Chinese chess)
- Games from the GIPF Project, such as YINSH
- Connect 4
Ratings
With perfect information and no luck involved, it makes sense to quantify a player's strength or skill level with a rating system. Each player is assigned a number, and from the ratings of two players we can estimate the outcome of either a single game or an entire match consisting of a series of games. There are different rating systems to achieve this, each with its own strengths and weaknesses. The best-known early rating system is Elo, a modified version of which is used by FIDE. Statistician Mark Glickman improved upon Elo, devising Glicko and Glicko-2, which are used by the popular chess websites Chess.com and Lichess respectively.
A popular formula to estimate a game or match outcome, from the original Elo system, is as follows:
Given two ratings <math>R_A</math> and <math>R_B</math> for players A and B, we can estimate the "expected score" of player A as <math>E_A = \frac{1}{1 + 10^{-(R_A - R_B)/400}}</math>. Note the use of "expected score". In games where draws do not exist, this is simply the same as winrate. However, in games like chess where draws exist, expected score is equivalent to winrate + 1/2 * drawrate. Additional formulas are required to separate winrate from drawrate in that case. The negative sign in the formula can theoretically be removed by swapping <math>R_A</math> and <math>R_B</math> in the exponent, but that gets a bit confusing, so it will be kept in this form.
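The calculation can be sketched in a few lines of Python. This is only an illustration, assuming the standard Elo logistic formula with a 400-point scale; the function names `expected_score` and `expected_score_from_rates` are made up for this example and do not come from any particular library.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (original Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating_a - rating_b) / 400.0))


def expected_score_from_rates(win_rate: float, draw_rate: float) -> float:
    """Expected score when draws exist: a win counts 1, a draw counts 1/2."""
    return win_rate + 0.5 * draw_rate


# Equal ratings give an even expectation.
print(round(expected_score(1500, 1500), 3))  # 0.5
# A 400-point advantage gives roughly 90.9%.
print(round(expected_score(1900, 1500), 3))  # 0.909
```

Note that the two players' expected scores always sum to 1, so only one of them needs to be computed.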
No matter which rating system is used, it is important to note that an "absolute rating", i.e. a particular rating number, has no meaning in and of itself. It is only meaningful when compared with other ratings within the same system, and even with the same settings ("relative ratings"). For instance, although Glicko and Glicko-2 are similar in many ways, the ratings they give cannot be directly compared. A regression can show that they correlate, but finding an exact 1-to-1 mapping is impossible.
In this vein, suppose we want to compare the rating spectrums (the range from lowest to highest rating) of different abstract strategy games. To do so we need a common baseline: let 0 Elo be the rating of a "random-mover". This is loosely equivalent to a human who knows the rules but knows nothing about strategy. Suppose a player B is rated 400 Elo above this baseline; B is then expected to score roughly 90.9% against the baseline according to the formula above (a 90.9% winrate in games without draws). The choice of 400 is completely arbitrary, but for terminology purposes let us define such a gap as a "thrashing", indicating that player B is an entire class above the baseline level. Rating spectrums can then be described as <math>k</math> thrashings, i.e. the number of 400-Elo divisions they span.
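As a rough sketch of this terminology, the snippet below converts a rating spectrum into a number of thrashings and shows the expected score across a gap of several thrashings. The helper names and the 0 to 2800 spectrum are hypothetical and chosen only for illustration; they are not measurements for any real game.

```python
THRASHING = 400.0  # Elo points per "thrashing", as defined above


def thrashings(lowest: float, highest: float) -> float:
    """Number of 400-Elo divisions spanned by a rating spectrum."""
    return (highest - lowest) / THRASHING


def expected_score_over(k: float) -> float:
    """Expected score of the stronger player across a gap of k thrashings."""
    return 1.0 / (1.0 + 10.0 ** (-k))


# Hypothetical spectrum from a random-mover (0 Elo) to a 2800-rated player.
print(thrashings(0, 2800))                # 7.0
print(round(expected_score_over(1), 3))   # 0.909
print(round(expected_score_over(2), 3))   # 0.99
```

Each additional thrashing multiplies the stronger player's odds by 10, so the expected score approaches 1 very quickly as the gap grows.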
