[Ray Core] For the same python test, the results of pytest and bazel are inconsistent #51211

Open
Moonquakes opened this issue Mar 10, 2025 · 0 comments
Labels
bug (Something that is supposed to be working; but isn't), core (Issues that should be addressed in Ray Core), P2 (Important issue, but not time-critical)

What happened + What you expected to happen

Running the same Python test with pytest and with Bazel gives different results. With pytest the test always passes; with bazel test the Ray worker always crashes with a SIGSEGV and the test then fails on the 180 s timeout, as shown in the log below. What could be the cause?

Versions / Dependencies

Ray v2.38.0

Reproduction script

The two test commands are:
python -m pytest -v -s python/ray/tests/test_ray_debugger.py
bazel test --build_tests_only $(./ci/run/bazel_export_options) --config=ci --test_env=CI="1" --test_output=streamed -- //python/ray/tests:test_ray_debugger

The error message of bazel test is:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //python/ray/tests:test_ray_debugger
-----------------------------------------------------------------------------
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-7.4.4, pluggy-1.3.0 -- /opt/conda/envs/original-env/bin/python3
cachedir: .pytest_cache
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /root/.cache/bazel/_bazel_root/7b4611e5f7d910d529cf99d9ecdcc56a/execroot/com_github_ray_project_ray
configfile: pytest.ini
plugins: asyncio-0.17.0, forked-1.4.0, shutil-1.7.0, sugar-0.9.5, rerunfailures-11.1.2, timeout-2.1.0, httpserver-1.0.6, sphinx-0.5.1.dev0, docker-tools-3.1.3, anyio-3.7.1, virtualenv-1.7.0, lazy-fixture-0.6.3, benchmark-4.0.0
timeout: 180.0s
timeout method: signal
timeout func_only: False
collecting ... collected 10 items

python/ray/tests/test_ray_debugger.py::test_ray_debugger_breakpoint 2025-03-07 02:42:55,881	INFO worker.py:1807 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
(f pid=26195) RemotePdb session open at localhost:44791, use 'ray debug' to connect...
(f pid=26195) RemotePdb accepted connection from ('127.0.0.1', 48272).
(f pid=26195) *** SIGSEGV received at time=1741315376 on cpu 3 ***
(f pid=26195) PC: @     0x7f4ab74057fd  (unknown)  (unknown)
(f pid=26195)     @     0x7f4ab72aa520  (unknown)  (unknown)
(f pid=26195)     @     0x7f4ab04d3061      16544  (unknown)
(f pid=26195)     @     0x7f4ab04c9d20  (unknown)  _rl_set_mark_at_pos
(f pid=26195) [2025-03-07 02:42:56,386 E 26195 26195] logging.cc:440: *** SIGSEGV received at time=1741315376 on cpu 3 ***
(f pid=26195) [2025-03-07 02:42:56,386 E 26195 26195] logging.cc:440: PC: @     0x7f4ab74057fd  (unknown)  (unknown)
(f pid=26195) [2025-03-07 02:42:56,386 E 26195 26195] logging.cc:440:     @     0x7f4ab72aa520  (unknown)  (unknown)
(f pid=26195) [2025-03-07 02:42:56,386 E 26195 26195] logging.cc:440:     @     0x7f4ab04d3061      16544  (unknown)
(f pid=26195) [2025-03-07 02:42:56,386 E 26195 26195] logging.cc:440:     @     0x7f4ab04c9d20  (unknown)  _rl_set_mark_at_pos
(f pid=26195) Fatal Python error: Segmentation fault
(f pid=26195)
(f pid=26195) Stack (most recent call first):
(f pid=26195)   File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
(f pid=26195)   File "<frozen importlib._bootstrap_external>", line 1176 in create_module
(f pid=26195)   File "<frozen importlib._bootstrap>", line 571 in module_from_spec
(f pid=26195)   File "<frozen importlib._bootstrap>", line 674 in _load_unlocked
(f pid=26195)   File "<frozen importlib._bootstrap>", line 1006 in _find_and_load_unlocked
(f pid=26195)   File "<frozen importlib._bootstrap>", line 1027 in _find_and_load
(f pid=26195)   File "/opt/conda/envs/original-env/lib/python3.10/pdb.py", line 148 in __init__
(f pid=26195)   File "/data/ray/python/ray/util/rpdb.py", line 122 in listen
(f pid=26195)   File "/data/ray/python/ray/util/rpdb.py", line 269 in _connect_ray_pdb
(f pid=26195)   File "/data/ray/python/ray/util/rpdb.py", line 290 in set_trace
(f pid=26195)   File "/root/.cache/bazel/_bazel_root/7b4611e5f7d910d529cf99d9ecdcc56a/execroot/com_github_ray_project_ray/bazel-out/k8-opt/bin/python/ray/tests/test_ray_debugger.runfiles/com_github_ray_project_ray/python/ray/tests/test_ray_debugger.py", line 23 in f
(f pid=26195)   File "/data/ray/python/ray/_private/worker.py", line 917 in main_loop
(f pid=26195)   File "/data/ray/python/ray/_private/workers/default_worker.py", line 289 in <module>
(f pid=26195)
(f pid=26195) Extension modules: psutil._psutil_linux, psutil._psutil_posix, msgpack._cmsgpack, google.protobuf.pyext._message, setproctitle, yaml._yaml, charset_normalizer.md, ray._raylet, pvectorc (total: 9)

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++

~~~~~~~~~~~~~~~~~~ Stack of ray_print_logs (139687217845824) ~~~~~~~~~~~~~~~~~~~
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/data/ray/python/ray/_private/worker.py", line 939, in print_logs
    data = subscriber.poll()

~~~~~~~~~~~~~ Stack of ray_listen_error_messages (139687226238528) ~~~~~~~~~~~~~
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/envs/original-env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/data/ray/python/ray/_private/worker.py", line 2198, in listen_error_messages
    _, error_data = worker.gcs_error_subscriber.poll()

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
  File "/opt/conda/envs/original-env/lib/python3.10/site-packages/pytest_timeout.py", line 241, in handler
    timeout_sigalrm(item, settings.timeout)
  File "/opt/conda/envs/original-env/lib/python3.10/site-packages/pytest_timeout.py", line 409, in timeout_sigalrm
    pytest.fail("Timeout >%ss" % timeout)
  File "/opt/conda/envs/original-env/lib/python3.10/site-packages/_pytest/outcomes.py", line 198, in fail
    raise Failed(msg=reason, pytrace=pytrace)
Failed: Timeout >180.0s
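
One possible way to narrow this down, offered as a sketch and not a diagnosis: the Python stack shows the worker crashing while pdb.py line 148 is importing an extension module, and the last native frame is _rl_set_mark_at_pos, which looks like a readline symbol. That suggests, without proving it, that the readline library picked up under bazel differs from the one picked up under pytest. The helper below is hypothetical (describe_env and ENV_KEYS are names made up for illustration, and the list of environment variables is an assumption). It records which interpreter is running, where the readline module would be loaded from, and a few library-related environment variables, without actually importing readline.

```python
import importlib.util
import json
import os
import sys

# Environment variables that commonly influence which shared libraries load.
# This list is an assumption for illustration, not something from the report.
ENV_KEYS = ("LD_LIBRARY_PATH", "LD_PRELOAD", "PYTHONPATH", "TERM")


def describe_env():
    """Collect interpreter and library-path details without importing readline."""
    try:
        # find_spec locates the module but does not execute it, so it should
        # not trigger the crash that happens at import time.
        spec = importlib.util.find_spec("readline")
    except ValueError:
        spec = None
    return {
        "executable": sys.executable,
        "version": sys.version,
        "readline_origin": spec.origin if spec is not None else None,
        "env": {key: os.environ.get(key) for key in ENV_KEYS},
    }


if __name__ == "__main__":
    # Print as sorted JSON so the output of two runs can be diffed directly.
    print(json.dumps(describe_env(), indent=2, sort_keys=True))
```

Calling describe_env() from inside the remote function, before the breakpoint, once under pytest and once under bazel test, and diffing the two outputs should show whether the worker sees a different interpreter, readline location, or library path in the two runs.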

Issue Severity

None

@Moonquakes added the bug and triage labels on Mar 10, 2025
@jcotant1 added the core label on Mar 10, 2025
@jjyao added the P2 label and removed the triage label on Mar 10, 2025