Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from cutechess-cli to fastchess #2119

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

vondele
Copy link
Member

@vondele vondele commented Jul 13, 2024

fixes #2106

This PR switches from cutechess-cli to fast-chess.

cutechess-cli has been serving us well in the past years, however, some issues have accumulated, namely the difficulty of compiling cutechess-cli, the observed timeouts at high concurrency and short TC, and e.g. slowness when indexing larger books. fast-chess https://github.com/Disservin/fast-chess has addressed these issues, and has now probably become mature enough to serve as the game manager for fishtest.

As an example of its ability to deal with short TC and high concurrency: https://dfts-0.pigazzini.it/tests/view/669249cdbee8253775cede32 with concurrency 25, and TC 1+0.01s no timeouts are observed.

fast-chess is built from sources, with the zip download as well as the binary cached as needed. There is fine-grained control over which version of fast-chess is used, so we can easily upgrade for new features.

In this PR, fast-chess is built in cutechess compatibility to facilitate integration, and to benefit from the existing fishtest checks. Once validated, we should be able to switch easily to its native mode, which can output trinomial and pentanomial results, and we should be able significantly simplify the worker's book-keeping.

@vondele vondele marked this pull request as draft July 13, 2024 11:15
@vondele
Copy link
Member Author

vondele commented Jul 13, 2024

The PR is ready for review from the fishtest side, still in a draft state to allow for a couple of fixes and updates to be made to fast-chess.

@vondele
Copy link
Member Author

vondele commented Jul 15, 2024

current pushed version runs correctly for fishtest AFAICT. Still not master fast-chess, but nearly there.

I think the most important thing now verifying that fast-chess is building with all the version of the compilers we need, and in all environments.

Current fishtest version check is gcc >= 7.3 and clang >= 8.0 (current online workers are gcc >= 9.3 and clang >= 13.0).

@vondele
Copy link
Member Author

vondele commented Jul 15, 2024

updated to master now.

@vondele
Copy link
Member Author

vondele commented Jul 15, 2024

updated to current fast-chess, verified to compile with gcc 7.3.0 and clang 8.0.0, the fishtest minimum required versions.

worker/games.py Outdated Show resolved Hide resolved
worker/worker.py Outdated Show resolved Hide resolved
@Disservin
Copy link
Member

there was some code somewhere which made high concurrency worker only pickup ltc tasks, arguably we don't need that bit anymore with fc, but can be done later

@vondele
Copy link
Member Author

vondele commented Jul 15, 2024

there was some code somewhere which made high concurrency worker only pickup ltc tasks, arguably we don't need that bit anymore with fc, but can be done later

OK, will remove that, it is a server change.

@vondele vondele force-pushed the fastchessPR branch 2 times, most recently from 4cdb5f2 to deaf11b Compare July 15, 2024 21:33
@Disservin
Copy link
Member

Should we also first compile the binary with tests, to make sure that everything works on the host ? make -j tests && ./fast-chess-tests
Not quite sure

@vondele
Copy link
Member Author

vondele commented Jul 16, 2024

To be considered... will the tests work if fast-chess is compiled with 'USE_CUTE' and how long does it take?
Are there additional dependencies if done (gtest or similar) ?

@Disservin
Copy link
Member

will the tests work if fast-chess is compiled with 'USE_CUTE'

it is a separate binary which can only run the tests

how long does it take?

the tests are rather quick, ~3s.. compiling them takes considerably more time... i need to check if that can be improved

Are there additional dependencies if done (gtest or similar) ?

there's a depedency on doctest, but the repo has source code for that

@vondele
Copy link
Member Author

vondele commented Jul 16, 2024

testing worked locally with gcc 7.3 and clang 8.0, and is simple to add. Trying in CI.

working. Timing added is not so important, this happens just once and the binary is reused.

@ppigazzini
Copy link
Collaborator

fishtest/worker/games.py

Lines 1157 to 1180 in 12981ff

# Run cutechess-cli binary.
# Stochastic rounding and probability for float N.p: (N, 1-p); (N+1, p)
idx = cmd.index("_spsa_")
cmd = (
cmd[:idx]
+ [
"option.{}={}".format(
x["name"], math.floor(x["value"] + random.uniform(0, 1))
)
for x in w_params
]
+ cmd[idx + 1 :]
)
idx = cmd.index("_spsa_")
cmd = (
cmd[:idx]
+ [
"option.{}={}".format(
x["name"], math.floor(x["value"] + random.uniform(0, 1))
)
for x in b_params
]
+ cmd[idx + 1 :]
)

if I recall correctly fast-chess does not need stochastic rounding.

@vondele
Copy link
Member Author

vondele commented Jul 16, 2024

if I recall correctly fast-chess does not need stochastic rounding.

Isn't that more a property of SF? I think parameters are integers in SF, or at least only that is well tested.

@ppigazzini
Copy link
Collaborator

cutechess accepts only integers, but it's a pleasure to start blaming SF devs for the convoluted SPSA parameters rounding.

@vondele
Copy link
Member Author

vondele commented Jul 16, 2024

They consulted their legal department, which points to the UCI spec

		* spin
			a spin wheel that can be an integer in a certain range

@ppigazzini
Copy link
Collaborator

ppigazzini commented Jul 16, 2024

No way a legal department takes can be as a specification.

@Disservin
Copy link
Member

working. Timing added is not so important, this happens just once and the binary is reused.

just for documentation, we get a one time addition of 2min - 4min

image

@vondele
Copy link
Member Author

vondele commented Jul 16, 2024

ah, you mean in fishtest CI. For workers, I think it is acceptable.
Also, this is CI, where I hardcoded concurrency to 1. Let me change that to 4, I recall that's the concurrency that is available to actions (and concurrency is used for the compilation with -j N)

@Disservin
Copy link
Member

I think the culprit is that fastchess_WLD_results["games"] is not computed as W+L+D (which would include the saved_stats).

Mh fastchess updates calculates this as w+l+d.. now what comes to my mind is that maybe some race could lead to flawed updates but this shouldn't happend because all relevant parts are behind a lock. Can you give more information perhaps i.e. was the assertion wrong by 1 or 2? meaning a pair was missed or something was not updated in pairs.. also are you on the latest version of this pr? I think there were issues in the past not sure if they were ever included here or fixed before

@vdbergh
Copy link
Contributor

vdbergh commented Jul 30, 2024

I think the culprit is that fastchess_WLD_results["games"] is not computed as W+L+D (which would include the saved_stats).

Mh fastchess updates calculates this as w+l+d.. now what comes to my mind is that maybe some race could lead to flawed updates but this shouldn't happend because all relevant parts are behind a lock. Can you give more information perhaps i.e. was the assertion wrong by 1 or 2? meaning a pair was missed or something was not updated in pairs.. also are you on the latest version of this pr? I think there were issues in the past not sure if they were ever included here or fixed before

There is nothing wrong with fastchess. In the case of SPSA the worker computes WLD of a task as the sum of the WLD for the "mini tasks" (corresponding to a choice of parameters). The same should be done for the total number of games.

@vdbergh
Copy link
Contributor

vdbergh commented Jul 30, 2024

One may wonder however what the point is of this summing of WLD data corresponding to different sets of parameters...

@Disservin
Copy link
Member

There is nothing wrong with fastchess. In the case of SPSA the worker computes WLD of a task as the sum of the WLD for the "mini tasks" (corresponding to a choice of parameters). The same should be done for the total number of games.

oh! i missed the fact that both assertion errors came from a spsa test, thanks

@vondele
Copy link
Member Author

vondele commented Aug 2, 2024

I got two assertion errors https://tests.stockfishchess.org/actions?max_actions=2&action=&user=&text=AssertionError&before=1722328311.350552&run_id=

I think the culprit is that fastchess_WLD_results["games"] is not computed as W+L+D (which would include the saved_stats).

I am bit surprised however that this issue (if present) does not cause more problems.

@vdbergh is this the proper fix (quick local test suggest it is):

diff --git a/worker/games.py b/worker/games.py
index 88a26c2..cc82c02 100644
--- a/worker/games.py
+++ b/worker/games.py
@@ -990,7 +990,12 @@ def parse_fastchess_output(
                 spsa["losses"] = fastchess_WLD_results["losses"]
                 spsa["draws"] = fastchess_WLD_results["draws"]
 
-            num_games_finished = fastchess_WLD_results["games"]
+            num_games_finished = (
+                fastchess_WLD_results["games"]
+                + saved_stats["wins"]
+                + saved_stats["losses"]
+                + saved_stats["draws"]
+            )
 
             fastchess_ptnml_results = None
             fastchess_WLD_results = None

@vdbergh
Copy link
Contributor

vdbergh commented Aug 2, 2024

Yes I think so.

@vdbergh
Copy link
Contributor

vdbergh commented Aug 8, 2024

For reference: there are some more assertion errors https://tests.stockfishchess.org/actions?max_actions=2&action=&user=&text=AssertionError&before=1723033368.540148&run_id=

I have not debugged it yet (it is again for SPSA tests).

@vdbergh
Copy link
Contributor

vdbergh commented Aug 8, 2024

Ok this is master

            assert num_games_finished == 2 * sum(rounds["pentanomial"])
            assert num_games_finished <= num_games_updated + batch_size
            assert num_games_finished <= games_to_play

This is this PR

            assert num_games_finished == 2 * sum(result["stats"]["pentanomial"])
            assert num_games_finished <= num_games_updated + batch_size
            assert num_games_finished <= games_to_play

num_games_finished is the number of games finished in the current mini task (so the fix in #2119 (comment) is actually wrong, sorry about this).

The difference between master and this PR is the right hand side of the first assert.

2 * sum(rounds["pentanomial"])

in master vs

2 * sum(result["stats"]["pentanomial"])

in this PR.

The difference it that the first expression only refers to the current mini task and the second expression refers to the aggregated results.

So I think the second expression should be replaced by

2*sum(fastchess_ptnml_results)

@vdbergh
Copy link
Contributor

vdbergh commented Aug 9, 2024

This would be the patch wrt the current PR. I am now testing it

diff --git a/worker/games.py b/worker/games.py
index cc82c02..9ef2796 100644
--- a/worker/games.py
+++ b/worker/games.py
@@ -990,15 +990,7 @@ def parse_fastchess_output(
                 spsa["losses"] = fastchess_WLD_results["losses"]
                 spsa["draws"] = fastchess_WLD_results["draws"]
 
-            num_games_finished = (
-                fastchess_WLD_results["games"]
-                + saved_stats["wins"]
-                + saved_stats["losses"]
-                + saved_stats["draws"]
-            )
-
-            fastchess_ptnml_results = None
-            fastchess_WLD_results = None
+            num_games_finished = fastchess_WLD_results["games"]
 
             assert (
                 2 * sum(result["stats"]["pentanomial"])
@@ -1006,10 +998,13 @@ def parse_fastchess_output(
                 + result["stats"]["losses"]
                 + result["stats"]["draws"]
             )
-            assert num_games_finished == 2 * sum(result["stats"]["pentanomial"])
+            assert num_games_finished == 2 * sum(fastchess_ptnml_results)
             assert num_games_finished <= num_games_updated + batch_size
             assert num_games_finished <= games_to_play
 
+            fastchess_ptnml_results = None
+            fastchess_WLD_results = None
+
             # Send an update_task request after a batch is full or if we have played all games.
             if (num_games_finished == num_games_updated + batch_size) or (
                 num_games_finished == games_to_play

worker/games.py Outdated Show resolved Hide resolved
@Viren6
Copy link

Viren6 commented Aug 17, 2024

Seems time losses can be a lot higher than games played:

image

Checking the pgn shows the time losses is exactly twice as high as it should be, so there is probably an error between pairs vs games somewhere.

@gahtan-syarif
Copy link

gahtan-syarif commented Aug 17, 2024

Seems time losses can be a lot higher than games played:

image

Checking the pgn shows the time losses is exactly twice as high as it should be, so there is probably an error between pairs vs games somewhere.

it seems i found out what went wrong, fastchess when encountering time losses displays a warning to the user like below:

Warning; Engine Engine2 loses on time
Finished game 3 (Engine1 vs Engine2): 1-0 {Black loses on time}

this warning does not exist on cutechess.
fishtest counts the occurrences of the string "loses on time" which means it counts twice for every timeloss on fastchess output. the same applies to crashes as fc also sends a warning for disconnects

@Disservin
Copy link
Member

Seems time losses can be a lot higher than games played:
image
Checking the pgn shows the time losses is exactly twice as high as it should be, so there is probably an error between pairs vs games somewhere.

it seems i found out what went wrong, fastchess when encountering time losses displays a warning to the user like below:

Warning; Engine Engine2 loses on time
Finished game 3 (Engine1 vs Engine2): 1-0 {Black loses on time}

this warning does not exist on cutechess. fishtest counts the occurrences of the string "loses on time" which means it counts twice for every timeloss on fastchess output. the same applies to crashes as fc also sends a warning for disconnects

Lest just get rid of that I don’t see any value in the warning report

@gahtan-syarif
Copy link

Seems time losses can be a lot higher than games played:
image
Checking the pgn shows the time losses is exactly twice as high as it should be, so there is probably an error between pairs vs games somewhere.

it seems i found out what went wrong, fastchess when encountering time losses displays a warning to the user like below:

Warning; Engine Engine2 loses on time
Finished game 3 (Engine1 vs Engine2): 1-0 {Black loses on time}

this warning does not exist on cutechess. fishtest counts the occurrences of the string "loses on time" which means it counts twice for every timeloss on fastchess output. the same applies to crashes as fc also sends a warning for disconnects

Lest just get rid of that I don’t see any value in the warning report

i suggest making it logger::trace so it could still show up on the log file

@vondele
Copy link
Member Author

vondele commented Aug 17, 2024

Updated fast-chess after the fix, thanks for analysis and fixes.

@gahtan-syarif
Copy link

gahtan-syarif commented Aug 25, 2024

        # Parse line like this:
        # Finished game 1 (stockfish vs base): 0-1 {White disconnects}
        if "disconnects" in line or "connection stalls" in line:
            result["stats"]["crashes"] += 1

        if "on time" in line:
            result["stats"]["time_losses"] += 1

i suggest changing to:

        # Parse line like this:
        # Finished game 1 (stockfish vs base): 0-1 {White disconnects}
        if "disconnect" in line or "stall" in line:
            result["stats"]["crashes"] += 1

        if "on time" in line or "timeout" in line:
            result["stats"]["time_losses"] += 1

might also want to rename timelosses to timeouts
based on: https://github.com/cutechess/cutechess/blob/780065637f9936bc29cc592c6f0b2007ccbf66de/projects/lib/src/board/result.cpp#L109-L132

@vondele vondele marked this pull request as draft September 2, 2024 05:31
@vondele
Copy link
Member Author

vondele commented Sep 2, 2024

converting to draft, awaiting a change in convention for mate in plies / moves in the pgn output:

Disservin/fastchess#696

Edit: Fixed, PR updated.

@vondele vondele marked this pull request as ready for review September 3, 2024 07:14
@gahtan-syarif
Copy link

                "-tournament",
                "gauntlet",

better to get rid of this since its unecessary and fastchess currently supports only roundrobin format

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace cutechess-cli by fast-chess in the worker?
7 participants