TimeoutError on publish during a server restart #751

@superlevure

Observed behavior

During a server rollout, a client that publishes to a stream can hit an errors.TimeoutError:

Traceback (most recent call last):
  File "/workspace/app/nats_test.py", line 102, in publish_message
    await nats_client.publish(
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nats/js/client.py", line 201, in publish
    msg = await self._nc.request(
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nats/aio/client.py", line 1061, in request
    msg = await self._request_new_style(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nats/aio/client.py", line 1105, in _request_new_style
    raise errors.TimeoutError

Expected behavior

No exception should be raised. The publish operation should be robust to a server rollout, as the subscribe operation is.

Server and client version

  • Server: 2.11.8
  • Client: nats-py==2.11.0

Host environment

Server running in clustered mode with 3 nodes. The stream has 3 replicas.

Steps to reproduce

  • Deploy NATS with JetStream enabled on Kubernetes using the official Helm chart (nats-1.3.13) with 3 replicas.
  • Create a test R3 stream.
  • Run the following script:
import asyncio

import nats

NATS_SERVER_URL = "nats://nats.nats:4222"

async def main() -> None:
    nats_client = await nats.connect(
        servers=NATS_SERVER_URL,
    )
    jetstream = nats_client.jetstream()

    count: int = 0
    while True:
        subject = "test"
        message = b"Hello, World! %d" % count
        await jetstream.publish(
            subject=subject,
            payload=message,
        )
        print(
            f"Published {message} to {subject}.",
        )
        await asyncio.sleep(0.1)
        count += 1


if __name__ == "__main__":
    asyncio.run(main())
  • Perform a rollout of the server: kubectl rollout restart --namespace nats statefulset/nats
  • After a few attempts, you should experience the following error on the client side:
nats: encountered error
nats.errors.UnexpectedEOF: nats: unexpected EOF
Traceback (most recent call last):
  File "/workspace/app/minimal_test.py", line 19, in main
    await jetstream.publish(
  File "/usr/local/lib/python3.11/site-packages/nats/js/client.py", line 201, in publish
    msg = await self._nc.request(
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nats/aio/client.py", line 1061, in request
    msg = await self._request_new_style(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/nats/aio/client.py", line 1105, in _request_new_style
    raise errors.TimeoutError
nats.errors.TimeoutError: nats: timeout
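
Possible client-side workaround

Until publish survives a rollout on its own, the call can be wrapped in a retry with backoff. The following is only a minimal sketch: `call_with_retry` and its parameters (`retry_on`, `attempts`, `base_delay`) are hypothetical names, not part of nats-py, and the default values are arbitrary.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def call_with_retry(
    operation: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...] = (asyncio.TimeoutError,),
    attempts: int = 5,
    base_delay: float = 0.2,
) -> T:
    """Await operation(), retrying with exponential backoff on the given errors.

    Hypothetical helper for illustration; not part of nats-py.
    """
    for attempt in range(attempts):
        try:
            return await operation()
        except retry_on:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
            await asyncio.sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")


# Assumed usage inside the reproduction script's loop:
#
#     await call_with_retry(
#         lambda: jetstream.publish(subject=subject, payload=message),
#         retry_on=(nats.errors.TimeoutError,),
#     )
```

A timed-out publish may still have been stored by the server, so a retry can create a duplicate message. Sending a `Nats-Msg-Id` header with each publish should let JetStream deduplicate such retries within the stream's duplicate window, but I have not verified this during a rollout.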
