test-run sometimes doesn't kill server started by test #345
Labels: bug (Something isn't working)
Comments
locker added a commit to locker/tarantool that referenced this issue on Jul 8, 2022
The gh_6565 test doesn't stop the hot standby replica it started, because the replica should fail to initialize and exit eventually anyway. However, if the replica lingers until the next test due to tarantool/test-run#345, the next test may successfully connect to it, which is likely to lead to a failure, because UNIX socket paths used by luatest servers are not randomized. For example, here gh_6568 test fails after gh_6565, because it uses the same alias for the test instance ('replica'):

NO_WRAP
[008] vinyl-luatest/gh_6565_hot_standby_unsupported_> [ pass ]
[008] vinyl-luatest/gh_6568_replica_initial_join_rem> [ fail ]
[008] Test failed! Output from reject file /tmp/t/rejects/vinyl-luatest/gh_6568_replica_initial_join_removal_of_compacted_run_files.reject:
[008] TAP version 13
[008] 1..1
[008] # Started on Fri Jul 8 15:30:47 2022
[008] # Starting group: gh-6568-replica-initial-join-removal-of-compacted-run-files
[008] not ok 1 gh-6568-replica-initial-join-removal-of-compacted-run-files.test_replication_compaction_cleanup
[008] # builtin/fio.lua:242: fio.pathjoin(): undefined path part 1
[008] # stack traceback:
[008] # builtin/fio.lua:242: in function 'pathjoin'
[008] # ...ica_initial_join_removal_of_compacted_run_files_test.lua:43: in function 'gh-6568-replica-initial-join-removal-of-compacted-run-files.test_replication_compaction_cleanup'
[008] # ...
[008] # [C]: in function 'xpcall'
[008] replica | 2022-07-08 15:30:48.311 [832856] main/103/default.lua F> can't initialize storage: unlink, called on fd 30, aka unix/:(socket), peer of unix/:(socket): Address already in use
[008] # Ran 1 tests in 0.722 seconds, 0 succeeded, 1 errored
NO_WRAP

Let's fix this by explicitly killing the hot standby replica. Since it could have exited voluntarily, we need to use pcall, because server.stop fails if the instance is already dead.

This issue is similar to the one fixed by commit 8504016 ("test: stop server started by vinyl-luatest/update_optimize test").

NO_DOC=test
NO_CHANGELOG=test
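The socket collision described in the commit message can be illustrated outside Tarantool. The following is a minimal Python sketch, assuming a POSIX system; the helper names and the `replica.sock` file name are hypothetical and only stand in for a socket path derived from the instance alias. It does not reproduce Tarantool's exact code path, only the general effect of two servers sharing one non-randomized UNIX socket path.

```python
import errno
import os
import socket
import tempfile


def try_bind(path):
    """Bind a listening UNIX socket to `path`.

    Returns (sock, None) on success or (None, errno_code) on failure.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        sock.listen(1)
    except OSError as exc:
        sock.close()
        return None, exc.errno
    return sock, None


def demo_collision():
    """Return the bind results of two 'servers' sharing one socket path."""
    workdir = tempfile.mkdtemp()
    # Hypothetical path built only from the alias, so every 'replica'
    # instance ends up with the same one.
    path = os.path.join(workdir, "replica.sock")
    lingering, first_err = try_bind(path)   # server left over from the previous test
    newcomer, second_err = try_bind(path)   # server started by the next test
    for sock in (lingering, newcomer):
        if sock is not None:
            sock.close()
    if os.path.exists(path):
        os.unlink(path)
    os.rmdir(workdir)
    return first_err, second_err
```

In this sketch the first bind succeeds and the second one fails with `errno.EADDRINUSE`, the same error text as in the log above. Giving each server its own temporary directory would avoid the clash, though whether that suits luatest is a separate design question.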
locker added a commit to tarantool/tarantool that referenced this issue on Jul 8, 2022
locker added a commit to tarantool/tarantool that referenced this issue on Jul 8, 2022
NO_DOC=test NO_CHANGELOG=test (cherry picked from commit 6213907)
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue on Sep 9, 2022
If a test starts a server, it should stop it on completion. However, if it doesn't, the server should still be stopped by test-run or luatest. Normally, this is what happens, but sometimes, the test server somehow survives.
How to reproduce:
Run the vinyl-luatest/update_optimize test without the fix from tarantool#7359. The PR added a server stop to vinyl-luatest/update_optimize_test.lua. (You'll probably need to try a few times before you catch it.)
A stray instance is running normally: it can be connected to or killed with SIGTERM.
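The expected behavior, where the runner stops any server a test left behind, could look roughly like the Python sketch below. This is not how test-run or luatest implement cleanup; `stop_if_running` and `run_with_cleanup` are hypothetical names, and a sleeping Python child process stands in for a test server. The stop helper tolerates a process that has already exited, which mirrors the reason for wrapping server.stop in pcall.

```python
import subprocess
import sys


def stop_if_running(proc, timeout=5.0):
    """Stop a child process.

    Returns True if the process had to be stopped, False if it had
    already exited on its own.
    """
    if proc.poll() is not None:
        return False
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # escalate if SIGTERM was not enough
        proc.wait()
    return True


def run_with_cleanup(test_body):
    """Run `test_body` and stop every server it started, even on failure."""
    started = []

    def start_server():
        # Stand-in for a test server: a child that would outlive the test.
        proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )
        started.append(proc)
        return proc

    try:
        test_body(start_server)
    finally:
        for proc in started:
            stop_if_running(proc)
    return started
```

For example, `run_with_cleanup(lambda start: start())` starts one stand-in server, never stops it in the test body, and still leaves no stray process behind once the call returns.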