@cijohnson

Motivation

Overnight burn-in tests (AGHFC, RVS) should not fail because of one bad node; they should continue testing the other nodes so that eligible nodes can still be qualified.

Technical Details

stop_on_errors is an optional argument supported by the parallel-ssh library; it defaults to True.

This change lets callers of a Pssh instance pass an optional stop_on_errors value, which is forwarded to the run_command API in the exec and exec_cmd_list methods.
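A minimal sketch of the pass-through described above. Pssh, exec, and exec_cmd_list are named in this description, but the signatures below are assumptions. FakeClient and FakeHostOutput are hypothetical stand-ins for parallel-ssh's client and per-host output so the example runs without a cluster; only the run_command(..., stop_on_errors=...) keyword mirrors the real library call.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeHostOutput:
    """Hypothetical stand-in for parallel-ssh's per-host output object."""
    host: str
    stdout: List[str] = field(default_factory=list)
    exception: Optional[Exception] = None


class FakeClient:
    """Hypothetical stand-in for ParallelSSHClient; 'bad-node' always fails."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def run_command(self, command, stop_on_errors=True):
        results = []
        for host in self.hosts:
            if host == "bad-node":
                error = ConnectionError(f"cannot reach {host}")
                if stop_on_errors:
                    raise error  # the first failure aborts the whole run
                results.append(FakeHostOutput(host, exception=error))
            else:
                results.append(FakeHostOutput(host, [f"ran: {command}"]))
        return results


class Pssh:
    """Sketch of a wrapper that forwards stop_on_errors to run_command."""

    def __init__(self, client, stop_on_errors=True):
        self.client = client
        self.stop_on_errors = stop_on_errors  # default matches the library

    def exec(self, cmd):
        output = self.client.run_command(cmd, stop_on_errors=self.stop_on_errors)
        return {
            item.host: f"ERROR: {item.exception}" if item.exception
            else "\n".join(item.stdout)
            for item in output
        }

    def exec_cmd_list(self, cmds):
        # Assumption: run each command on every host; the real method may differ.
        return [self.exec(cmd) for cmd in cmds]
```

With stop_on_errors=False the failing host is reported in the result dictionary and the other hosts still return their output; with the default, the first failure raises.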

Test Plan

The AGHFC and RVS tests should be run with SSH disabled on one node in the cluster to confirm that the tests continue on the remaining nodes.

Test Result

TO BE EXECUTED

Submission Checklist

Overnight tests should not fail because of one bad node; they
should continue testing the other nodes to qualify eligible nodes.

stop_on_errors is an optional argument supported by the parallel-ssh
library; it defaults to True.

This change lets callers of a Pssh instance pass an optional
stop_on_errors value, which is forwarded to the run_command API in
the exec and exec_cmd_list methods.

Signed-off-by: Ignatious Johnson <[email protected]>
tests, so that these tests will continue to
run overnight even if one of the nodes is unresponsive.

Signed-off-by: Ignatious Johnson <[email protected]>
in virtual env conveniently.

Signed-off-by: Ignatious Johnson <[email protected]>
covers exec and exec_cmd_list methods

Signed-off-by: Ignatious Johnson <[email protected]>
in case of a pssh.exceptions.Timeout exception and
the node is unreachable. Unreachability is confirmed
by opening an SSH session to the specific set of nodes
that raised Timeout.

Added unit tests to cover these cases.

Signed-off-by: Ignatious Johnson <[email protected]>
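A rough sketch of the check this commit message describes: a host is treated as unreachable only if it raised Timeout and a follow-up probe also fails. The function names and the shape of host_exceptions are assumptions. The Timeout class is a local stand-in for pssh.exceptions.Timeout so the sketch runs without parallel-ssh, and the default probe is a plain TCP connect to port 22, which is a weaker check than the SSH session the commit message mentions.

```python
import socket


class Timeout(Exception):
    """Local stand-in for pssh.exceptions.Timeout."""


def tcp_probe(host, port=22, timeout=5.0):
    """Return True if a TCP connection to the SSH port succeeds (assumed probe)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_unreachable(host_exceptions, probe=tcp_probe):
    """Return hosts that raised Timeout and also fail the reachability probe.

    host_exceptions maps a host name to the exception recorded for it, or None.
    """
    return [
        host
        for host, exc in host_exceptions.items()
        if isinstance(exc, Timeout) and not probe(host)
    ]


def prune(hosts, unreachable):
    """Return the host list without the unreachable entries, order preserved."""
    bad = set(unreachable)
    return [host for host in hosts if host not in bad]
```

Passing the probe as a parameter keeps the logic testable without a network: a unit test can inject a function that reports chosen hosts as up or down.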
@cijohnson force-pushed the ichristo/support_optional_continue_on_failure branch from 1feb198 to 1e068cd on November 20, 2025 00:35
This ensures that the output dictionary reflects the status of pruned hosts.
"""
for host in self.unreachable_hosts:
    cmd_output[host] = "Host Unreachable"
Contributor

If the node became unreachable after partial test execution, or because of a fatal error from the tests, cmd_output[host] would already hold valid output. This line would overwrite that data, right?

Maybe just appending "\n\n\n !!!! Host Unreachable !!!! \n\n\n" to cmd_output[host] would help the test function that receives cmd_output.

Author

Sure, good catch. I will append the host-unreachable notice at the end.
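For illustration, the append-instead-of-overwrite behaviour discussed in this thread could look roughly like the following. It is written as a standalone function that takes unreachable_hosts as a parameter, which is an assumption; the method in the diff reads self.unreachable_hosts. The marker text is the one suggested in the review comment.

```python
UNREACHABLE_MARKER = "\n\n\n !!!! Host Unreachable !!!! \n\n\n"


def inform_unreachability(cmd_output, unreachable_hosts):
    """Append the marker to each unreachable host's output, keeping prior text."""
    for host in unreachable_hosts:
        # .get() covers hosts that produced no output before becoming unreachable.
        cmd_output[host] = cmd_output.get(host, "") + UNREACHABLE_MARKER
    return cmd_output
```

Any partial output collected before the host dropped stays in place ahead of the marker, so the test function can still inspect it.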

@@ -0,0 +1,43 @@
VENV_DIR = test_venv
PYTHON = python
Contributor

In some cases the interpreter could be python3.
Maybe handle this by checking which python or which python3.

Author

Will take care of it, thanks.


 import pytest
-import globals
+from . import globals
Contributor

I was getting this error:
==================================== ERRORS ====================================
___________________ ERROR collecting tests/health/rvs_cvs.py ___________________
ImportError while importing test module '/home/ssolaiya/work/11_CVS/cvs-CIgna/cvs/tests/health/rvs_cvs.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test_venv/lib/python3.12/site-packages/_pytest/python.py:507: in importtestmodule
    mod = import_path(
test_venv/lib/python3.12/site-packages/_pytest/pathlib.py:587: in import_path
    importlib.import_module(module_name)
/usr/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<frozen importlib._bootstrap>:1387: in _gcd_import
    ???
<frozen importlib._bootstrap>:1360: in _find_and_load
    ???
<frozen importlib._bootstrap>:1331: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:935: in _load_unlocked
    ???
test_venv/lib/python3.12/site-packages/_pytest/assertion/rewrite.py:197: in exec_module
    exec(co, module.__dict__)
tests/health/rvs_cvs.py:22: in <module>
    from utils_lib import *
lib/utils_lib.py:14: in <module>
    from . import globals
E   ImportError: attempted relative import with no known parent package

changing back to
import globals
works fine.

Author

will debug this today
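One common workaround for this kind of error, shown here only as a possibility and not as the fix chosen in this PR, is to try the relative import first and fall back to the absolute one. The sketch below builds a throwaway package (demo_lib is a hypothetical name) in a temporary directory and imports the same utils_lib file both as a package member and as a top-level module, which is the situation in the traceback where lib/ is on sys.path.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a stand-in utils module that accepts both import styles.
UTILS_SOURCE = textwrap.dedent(
    '''
    try:
        from . import globals as cfg   # loaded as part of a package
    except ImportError:
        import globals as cfg          # loaded as a top-level module

    LOADED_AS = "package" if "." in __name__ else "top-level"
    VALUE = cfg.VALUE
    '''
)


def import_both_ways():
    """Import the same file as demo_lib.utils_lib and as plain utils_lib."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_lib"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "globals.py").write_text("VALUE = 42\n")
        (pkg / "utils_lib.py").write_text(UTILS_SOURCE)

        added = [tmp, str(pkg)]
        sys.path[:0] = added
        importlib.invalidate_caches()
        try:
            as_package = importlib.import_module("demo_lib.utils_lib")
            as_top_level = importlib.import_module("utils_lib")
            return (
                (as_package.LOADED_AS, as_package.VALUE),
                (as_top_level.LOADED_AS, as_top_level.VALUE),
            )
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            for entry in added:
                sys.path.remove(entry)
            for name in ("demo_lib", "demo_lib.utils_lib", "demo_lib.globals",
                         "utils_lib", "globals"):
                sys.modules.pop(name, None)
```

The fallback keeps both invocation styles working, at the cost of hiding which one is in use; making the test tree a proper package or fixing sys.path in one place are alternatives.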

self.prune_unreachable_hosts(output)
self.inform_unreachability(cmd_output)

return cmd_output
Contributor

Do the test files need to check this cmd_output for the "Host Unreachable" string and take some action on their phdl object to remove the bad host?

Right now, it is executing the commands on the bad node as well and returning with ERROR.
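As a sketch of what such a check on the caller side might look like, the helper below separates hosts whose output contains the "Host Unreachable" string from the rest. The function name and return shape are hypothetical, and whether the phdl object exposes a way to drop hosts is not shown in this PR, so the sketch stops at producing the two groups.

```python
UNREACHABLE_TEXT = "Host Unreachable"


def split_by_reachability(cmd_output):
    """Return (healthy_output, bad_hosts) based on the unreachable marker."""
    healthy = {}
    bad_hosts = []
    for host, text in cmd_output.items():
        if UNREACHABLE_TEXT in text:
            bad_hosts.append(host)
        else:
            healthy[host] = text
    return healthy, bad_hosts
```

A test could then assert only on the healthy output and use the list of bad hosts to build its next command target list, for example by constructing a new handle from the remaining hosts.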

3 participants