[fluvio-test] Add longevity test health checks #2390

Open
tjtelan opened this issue May 18, 2022 · 0 comments
Labels
bug Something isn't working needs-scope technical debt Test Infrastructure Testing infrastructure Test

Some of my automated tests are set to run for a full 24 hours. I'm noticing that test sessions eventually stop producing to their test topics, yet fluvio-test continues to run.

Restarting the tests seems to work well enough for my objective, so I just want to know earlier when a test session has failed so I can correlate logs more effectively.

We should bail early if we're unable to recover from random disconnections. We can probably monitor the offsets for our topics and time out after a reasonable time if offsets stop increasing.
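As a rough illustration of the offset-monitoring idea, here is a minimal sketch using only the Rust standard library. `OffsetStallDetector` and its methods are hypothetical names, not part of fluvio-test or the Fluvio client, and how the latest offset for a topic is fetched is left out as an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not an existing fluvio-test type): remembers the
/// highest offset seen for one topic and when it last increased.
#[derive(Debug)]
pub struct OffsetStallDetector {
    last_offset: i64,
    last_progress: Instant,
    timeout: Duration,
}

impl OffsetStallDetector {
    /// Start tracking from `initial_offset`, treating `now` as the last
    /// moment progress was made.
    pub fn new(initial_offset: i64, now: Instant, timeout: Duration) -> Self {
        Self {
            last_offset: initial_offset,
            last_progress: now,
            timeout,
        }
    }

    /// Record an observed offset at time `now`.
    /// Returns `Err(stalled_for)` once the offset has not increased for at
    /// least `timeout`; otherwise returns `Ok(())`.
    pub fn observe(&mut self, offset: i64, now: Instant) -> Result<(), Duration> {
        if offset > self.last_offset {
            // Progress was made: reset the stall clock.
            self.last_offset = offset;
            self.last_progress = now;
            return Ok(());
        }
        let stalled_for = now.saturating_duration_since(self.last_progress);
        if stalled_for >= self.timeout {
            Err(stalled_for)
        } else {
            Ok(())
        }
    }
}
```

A health check could poll each test topic's latest offset on an interval, feed it to `observe`, and fail the test run on the first `Err`, so the failure time lands close to the disconnection in the logs. Passing `now` in explicitly keeps the detector easy to unit test without sleeping.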

These are a small sample of the unique-looking errors I see in the logs:

2022-05-18T19:28:33.1399478Z 2022-05-18T19:28:33.139388Z ERROR run: fluvio::producer::partition_producer: Failed to flush producer: Socket(Io(Custom { kind: TimedOut, error: "Timed out waiting for response. API_KEY=0, CorrelationId=40358" }))
2022-05-18T19:28:33.1405832Z 2022-05-18T19:28:33.140284Z ERROR fluvio::consumer: error sending offset: Io(
2022-05-18T19:28:33.1406210Z     Custom {
2022-05-18T19:28:33.1406710Z         kind: TimedOut,
2022-05-18T19:28:33.1407363Z         error: "Timed out waiting for response. API_KEY=1005, CorrelationId=133453",
2022-05-18T19:28:33.1407507Z     },
2022-05-18T19:28:33.1407621Z )
2022-05-18T19:28:33.1408230Z 2022-05-18T19:28:33.140680Z ERROR fluvio::consumer: error sending offset: Io(
2022-05-18T19:28:33.1408354Z     Custom {
2022-05-18T19:28:33.1408487Z         kind: TimedOut,
2022-05-18T19:28:33.1408748Z         error: "Timed out waiting for response. API_KEY=1005, CorrelationId=169693",
2022-05-18T19:28:33.1408861Z     },
2022-05-18T19:28:33.1408971Z )
2022-05-18T19:28:33.1427376Z thread 'main' panicked at 'Producer Send failed: Producer(Internal("Fluvio socket error: Timed out waiting for response. API_KEY=0, CorrelationId=40360"))', crates/fluvio-test/src/tests/longevity/producer.rs:62:18
2022-05-18T19:28:33.1427988Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2022-05-18T19:28:33.1428542Z 2022-05-18T19:28:33.142007Z ERROR fluvio::consumer: error sending offset: Io(
2022-05-18T19:28:33.1428665Z     Custom {
2022-05-18T19:28:33.1428797Z         kind: TimedOut,
2022-05-18T19:28:33.1429056Z         error: "Timed out waiting for response. API_KEY=1005, CorrelationId=133088",
2022-05-18T19:28:33.1429168Z     },
2022-05-18T19:28:33.1429278Z )
2022-05-18T19:28:33.1429873Z 2022-05-18T19:28:33.142018Z  INFO run: fluvio::producer::partition_producer: partition producer end event received
2022-05-18T19:28:33.1430417Z 2022-05-18T19:28:33.142028Z  INFO run: fluvio::producer::partition_producer: partition producer end
2022-05-18T22:43:26.9243414Z 2022-05-18T22:43:26.923629Z ERROR dispatcher_loop{self=MultiplexDisp(8)}: fluvio_socket::multiplexing: error sending to socket, problem sending to queue socket: 4, err: sending into a closed channel
2022-05-18T22:43:39.2000960Z 2022-05-18T22:43:39.199849Z ERROR dispatcher_loop{self=MultiplexDisp(8)}: fluvio_socket::multiplexing: error sending to socket, problem sending to queue socket: 5, err: sending into a closed channel
2022-05-18T22:43:39.2111044Z 2022-05-18T22:43:39.210910Z ERROR dispatcher_loop{self=MultiplexDisp(8)}: fluvio_socket::multiplexing: error sending to socket, problem sending to queue socket: 5, err: sending into a closed channel
2022-05-18T22:43:40.3656432Z 2022-05-18T22:43:40.364758Z ERROR dispatcher_loop{self=MultiplexDisp(8)}: fluvio_socket::multiplexing: error sending to socket, problem sending to queue socket: 6, err: sending into a closed channel
2022-05-18T22:43:40.3744158Z 2022-05-18T22:43:40.372739Z ERROR dispatcher_loop{self=MultiplexDisp(8)}: fluvio_socket::multiplexing: error sending to socket, problem sending to queue socket: 6, err: sending into a closed channel
@tjtelan tjtelan added bug Something isn't working technical debt Test Infrastructure Testing infrastructure Test needs-scope labels May 18, 2022