Describe the bug
An ephemeral self-hosted runner hosted on AWS Fargate (Ubuntu 24.04) intermittently loses
communication with the GitHub Actions server shortly after a job starts. The failure is
intermittent, occurring in roughly 1 of every 50 job runs. When it fails, the Fargate container
logs are truncated immediately after the "Running job" line, and the runner never produces
a completion log. The GitHub UI reports a communication loss error.
To Reproduce
This is an intermittent issue and cannot be reliably reproduced on demand. General setup:
- Register an ephemeral self-hosted runner on AWS Fargate (Ubuntu 24.04, arm64).
- Trigger a workflow that dispatches a job to the runner.
- In approximately 1 out of 50 runs, the runner stops responding after picking up the job.
Expected behavior
The runner should complete the job and produce the following log sequence:
Job completed with result: Succeeded
√ Removed .credentials
√ Removed .runner
Runner listener exit with 0 return code, stop the service, no retry needed.
Exiting runner…
Runner exited with code: 0
Runner Version and Platform
- Runner version: actions-runner-linux-arm64-2.311.0
- OS: Ubuntu 24.04 (ARM64), running inside AWS Fargate
- Runner mode: Ephemeral (--ephemeral)
What's not working?
The runner loses communication with the GitHub server after picking up a job.
GitHub UI error:
The self-hosted runner lost communication with the server. Verify the machine is running
and has a healthy network connection. Anything in your workflow that terminates the runner
process, starves it for CPU/Memory, or blocks its network access can cause this error.
Notes:
- Memory and disk usage are logged every 5 seconds; no anomalies observed around failure time.
- No signs of OOM, CPU starvation, or network disruption in Fargate metrics.
- The runner container exits without producing a completion log or cleanup output.
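For context, the 5-second resource logging mentioned in the notes could look roughly like the sketch below. This is an illustration, not the exact script running in the container: the function names (`parse_meminfo`, `format_sample`, `sample_forever`) and the output field names are hypothetical, and it assumes `/proc/meminfo` is readable inside the task. Each sample is flushed immediately so the last line written before an abrupt exit still reaches the container log.

```python
import shutil
import time
from datetime import datetime, timezone


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: integer_value} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_sample(meminfo, disk_used, disk_total, now):
    """Build one log line; the field names here are illustrative only."""
    stamp = now.strftime("%Y-%m-%d %H:%M:%SZ")
    return (
        f"{stamp}: mem_available_kib={meminfo.get('MemAvailable', 0)} "
        f"mem_total_kib={meminfo.get('MemTotal', 0)} "
        f"disk_used_bytes={disk_used} disk_total_bytes={disk_total}"
    )


def sample_forever(interval_seconds=5.0, path="/"):
    """Print one memory/disk sample every interval_seconds until killed."""
    while True:
        with open("/proc/meminfo") as f:
            meminfo = parse_meminfo(f.read())
        usage = shutil.disk_usage(path)
        line = format_sample(meminfo, usage.used, usage.total,
                             datetime.now(timezone.utc))
        print(line, flush=True)  # flush so the sample survives an abrupt exit
        time.sleep(interval_seconds)
```

Running `sample_forever()` in the background next to the runner process gives a timeline to compare against the last "Running job" timestamp.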
Job Log Output
Failing run — Fargate container log ends abruptly at:
2026-05-21 01:03:53Z: Running job: ci-cd / dev-backend-ci-cd
(No further output. Container exits.)
Successful run — Fargate container log continues normally:
2026-05-21 01:03:53Z: Running job: ci-cd / dev-backend-ci-cd
2026-06-11 00:17:26Z: Job ci-cd / dev-backend-ci-cd completed with result: Succeeded
√ Removed .credentials
√ Removed .runner
Runner listener exit with 0 return code, stop the service, no retry needed.
Exiting runner…
Runner exited with code: 0
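To measure the failure rate across many runs, the two log shapes above can be told apart mechanically. The following is a minimal sketch, assuming each container log is available as a single string. The helper name `classify_runner_log` and its return labels are hypothetical, and the patterns match only the "Running job" and "completed with result" lines quoted in this report.

```python
import re

# Patterns based only on the two log lines quoted in this report.
RUNNING = re.compile(r"Running job: (?P<job>.+)$")
COMPLETED = re.compile(r"Job (?P<job>.+) completed with result: (?P<result>\w+)$")


def classify_runner_log(text):
    """Return 'completed', 'truncated', or 'no-job' for one container log."""
    started = False
    for raw in text.splitlines():
        line = raw.strip()
        if COMPLETED.search(line):
            return "completed"
        if RUNNING.search(line):
            started = True
    # A job was picked up but no completion line was ever written.
    return "truncated" if started else "no-job"
```

Counting the `truncated` results over a batch of exported container logs gives a firmer number than the estimated 1 in 50.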
Runner and Worker's Diagnostic Logs
Diagnostic logs from the _diag folder were collected and reviewed. However, the logs only
provide more detailed output up to the following line — nothing is logged after this point:
2026-05-21 01:03:53Z: Running job: ci-cd / dev-backend-ci-cd
The diagnostic logs confirm the runner picked up the job successfully, but provide no
information about what caused the runner to stop responding afterward. The root cause
of the hang/exit after job dispatch remains unidentified from the available logs.