diff --git a/platform-cloud/docs/compute-envs/preflight-checks.mdx b/platform-cloud/docs/compute-envs/preflight-checks.mdx
index 72ff72cbf..75bdd733d 100644
--- a/platform-cloud/docs/compute-envs/preflight-checks.mdx
+++ b/platform-cloud/docs/compute-envs/preflight-checks.mdx
@@ -1,11 +1,11 @@
 ---
 title: "Compute environment pre-flight checks"
 description: "How Seqera Platform continuously validates compute environments and what to do when a check fails"
-date: "23 Jun 2026"
+date created: "2026-06-23"
 tags: [compute environments, credentials, troubleshooting]
 ---
 
-Pre-flight checks validate that a compute environment (CE) is usable before and at the point of pipeline launch. They run in the background on a recurring schedule and synchronously at launch time, so problems are surfaced before pipelines are submitted rather than failing mid-run. Pre-flight checks only flag conditions that would cause a pipeline launch to fail.
+Pre-flight checks validate that a compute environment is usable before and at the point of pipeline launch. They run in the background on a recurring schedule and synchronously at launch time. Problems appear before pipeline submission rather than mid-run. Pre-flight checks only flag conditions that would block a pipeline launch.
 
 Pre-flight checks are enabled by default and cannot be disabled.
 
@@ -15,84 +15,84 @@ Before creating or deploying a compute environment, confirm the following:
 
 **Credentials**
 - The access keys, service account key, or managed identity are valid and have not been rotated or revoked.
-- The IAM role or service account has the permissions required by the target platform. See the relevant CE page for the minimum required policy.
+- The IAM role or service account has the permissions required by the target platform. See the relevant compute environment page for the minimum required policy.
 
 **Work directory**
 - The bucket or storage container exists in the same region as the compute environment (required for AWS Batch and AWS Cloud compute environments only).
-- The credential attached to the CE has read and write access to the work directory path.
+- The credential attached to the compute environment has read and write access to the work directory path.
 
-**Wave** (if enabled on the CE)
+**Wave** (if enabled)
 - The Wave service is running and reachable from the Platform instance.
 
-**Tower Agent** (HPC/grid CEs only)
+**Tower Agent** (HPC/grid compute environments only)
 - Tower Agent is reachable from Platform. See [Tower Agent](../getting-started/tower-agent) for installation and startup instructions.
 
-## What is checked and when
+## Validation process
 
 Platform runs three tiers of validation:
 
-### 1. Background credential sweep
+### 1. Credential validation
 
-Runs on a recurring schedule. For each cloud credential (AWS, GCP, Azure) in scope, Platform calls the provider API to verify the credential is still accepted and can authenticate. The credential sweep makes a live API call to verify the credential is still accepted by the cloud provider. For AWS role-based credentials and GCP Workload Identity Federation, this check confirms the credential is well-formed but cannot fully verify the underlying role or identity provider trust configuration.
+Runs on a recurring schedule. For each cloud credential (AWS, GCP, Azure) in scope, Platform calls the provider API to verify that the credential is still accepted. For AWS role-based credentials and GCP Workload Identity Federation, this check confirms the credential is well-formed but cannot fully verify the underlying role or identity provider trust configuration.
 
-When a credential fails this check, it is marked **INVALID** with the provider error recorded on the credential record. This error appears in the launch-time error when a pipeline is blocked, but not in the CE banner — to see the specific provider error, check the credential record directly.
+When a credential fails this check, Platform marks it **INVALID** and records the provider error on the credential record. This error appears in the launch-time error when a pipeline is blocked, but not in the compute environment banner. To see the specific provider error, check the credential record directly.
 
-### 2. Background CE sweep
+### 2. Compute environment validation
 
-Runs approximately every hour across all **AVAILABLE** compute environments. Each CE goes through two gated checks in sequence:
+Runs approximately every hour across all `AVAILABLE` compute environments. Platform validates:
 
-1. **Credential status** — reads the credential status already recorded in the database (no extra cloud call). If the credential is **INVALID**, the CE is immediately marked **INVALID** and the work directory check is skipped.
-2. **Work directory** — calls the cloud provider to verify the CE's configured work directory is accessible. If this fails, the CE is marked **INVALID** with the provider error appended.
+1. **Credential status**: If the credential is `INVALID`, the compute environment is marked `INVALID` immediately.
+2. **Work directory**: Platform calls the cloud provider to verify that the configured work directory is accessible. If this check fails, the compute environment is marked **INVALID** and the provider error appears in the card view and at the top of the form page.
 
-A CE marked **INVALID** by the background sweep displays a banner explaining the reason. A CE that passes returns to or stays **AVAILABLE** and its `lastValidated` timestamp is refreshed.
+A compute environment marked `INVALID` displays a banner with the error message. An `AVAILABLE` compute environment has its `lastValidated` timestamp refreshed.
 
 :::note
-The background CE sweep covers AWS Batch, AWS Cloud, Azure Batch, Azure Cloud, Google Cloud Batch, and Google Cloud compute environments.
+These checks cover AWS Batch, AWS Cloud, Azure Batch, Azure Cloud, Google Cloud Batch, and Google Cloud compute environments.
 :::
 
 ### 3. Pipeline launch-time checks
 
-Run immediately when a user submits a pipeline launch. If any check fails, the launch is blocked and a specific error is returned. Multiple failures are reported together.
+Runs immediately when a user submits a pipeline launch. If any check fails, the launch is blocked and a specific error is returned. Multiple failures are reported together.
 
 | Check | What it does |
 |---|---|
-| CE status | Blocks launch if the CE is marked **INVALID** |
-| Credential status | Blocks launch if the credential associated with the CE is marked **INVALID** |
+| Compute environment status | Blocks launch if the compute environment is marked `INVALID` |
+| Credential status | Blocks launch if the credential associated with the compute environment is marked `INVALID` |
 | Work directory override | If the user provided a different work directory at launch, validates that path with the cloud provider |
-| Wave connectivity | For CEs with Wave enabled, verifies the Wave service connection is active |
-| Tower Agent | For HPC/grid CEs, verifies a Tower Agent is online for the environment |
+| Wave connectivity | For compute environments with Wave enabled, verifies the Wave service connection is active |
+| Tower Agent | For HPC compute environments, verifies a Tower Agent is online for the environment |
 
-## Manually re-validating credentials
+## Manual credential validation
 
-When a credential is marked **INVALID** and you have rotated the keys or fixed the underlying issue, you can trigger an immediate re-validation:
+When a credential is marked `INVALID` and you have rotated the keys or fixed the underlying issue, you can trigger an immediate re-validation:
 
 1. Navigate to **Credentials** in your workspace.
 2. Find the credential and select **Validate**.
 
-Platform makes a live call to the cloud provider and updates the credential status immediately. If the check passes, the credential returns to **AVAILABLE**.
+Platform makes a live call to the cloud provider and updates the credential status immediately. If the check passes, the credential returns to `AVAILABLE`.
 
-Any compute environments that were marked **INVALID** due to this credential will not recover automatically — use **Validate** on each affected CE after the credential is restored.
+Compute environments marked `INVALID` because of this credential do not recover automatically. Use **Validate** on each affected compute environment after restoring the credential.
 
-## Manually re-validating a compute environment
+## Manual compute environment validation
 
-When a compute environment is marked **INVALID** and you have fixed the underlying issue, you can trigger an immediate re-validation without waiting for the next background sweep:
+When a compute environment is marked `INVALID` and you have fixed the underlying issue, you can trigger an immediate re-validation without waiting for the next background sweep:
 
 1. Navigate to **Compute environments** in your workspace.
-2. Find the CE and open its **⋮** (three-dot) dropdown menu.
+2. Find the compute environment and open its **⋮** (three-dot) drop-down.
 3. Select **Validate**.
 
-Platform runs the full check sequence (live credential probe + work directory check) and updates the CE status immediately. If all checks pass, the CE returns to **AVAILABLE**.
+Platform runs pre-flight checks and updates the compute environment status immediately. If all checks pass, the compute environment returns to `AVAILABLE`.
 
 ## Error reference
 
 ### Compute environment error messages
 
-These appear on the compute environment detail page when the CE is **INVALID**.
+These banners appear on the compute environment detail page when the compute environment is `INVALID`.
 
 | Banner | Meaning | Action |
 |---|---|---|
-| `Associated credentials are invalid or expired. Update the credentials and validate this compute environment, or contact your workspace maintainer to resolve this.` | The background sweep found the attached credential is no longer valid | Go to **Credentials**, update or rotate the credential, then use **Validate** on the CE |
-| `Work directory is invalid. {reason}. Fix it and validate this compute environment, or contact your workspace maintainer to resolve this.` | The background sweep could not access the configured work directory | Fix the bucket/path permissions or location, then use **Validate** on the CE |
+| `Associated credentials are invalid or expired. Update the credentials and validate this compute environment, or contact your workspace maintainer to resolve this.` | The background sweep found the attached credential is no longer valid | Go to **Credentials**, update or rotate the credential, then use **Validate** on the compute environment |
+| `Work directory is invalid. {reason}. Fix it and validate this compute environment, or contact your workspace maintainer to resolve this.` | The background sweep could not access the configured work directory | Fix the bucket/path permissions or location, then use **Validate** on the compute environment |
 
 ### Launch-time errors
 
@@ -100,14 +100,14 @@ These are returned immediately to the user when a launch is blocked.
 
 | Error | Cause | Resolution |
 |---|---|---|
-| `The selected compute environment '...' is in an invalid state` | CE is marked INVALID (see banner on the CE for the specific reason) | Fix the root cause shown in the CE banner, then use **Validate** on the CE |
-| `The credentials '...' used by this compute environment are invalid` | Credential is marked INVALID | Go to **Credentials**, update or rotate the credential, then use **Validate** on the CE |
+| `The selected compute environment '...' is in an invalid state` | Compute environment is marked `INVALID` (see banner for the specific reason) | Fix the root cause, then use **Validate** on the compute environment |
+| `The credentials '...' used by this compute environment are invalid` | Credential is marked `INVALID` | Go to **Credentials**, update or rotate the credential, then use **Validate** on the compute environment |
 | `Wave is required by the selected compute environment but the Wave service connection is not active. Verify that Wave is running and check for connectivity issues` | Platform cannot reach the Wave service | Contact your platform administrator. Once Wave is restored, retry the launch |
-| `No Tower Agent is online for the selected compute environment. Check that Tower Agent is running at your cluster.` | No Tower Agent is connected for this CE (HPC/grid only) | Start or restart Tower Agent on the cluster. See [Tower Agent](../getting-started/tower-agent) |
+| `No Tower Agent is online for the selected compute environment. Check that Tower Agent is running at your cluster.` | No Tower Agent is connected for this compute environment (HPC/grid only) | Start or restart Tower Agent on the cluster. See [Tower Agent](../getting-started/tower-agent) |
 
 ### Credential error messages
 
-When the credential sweep marks a credential INVALID, the provider-specific reason is stored on the credential record. It appears in the launch-time error when a pipeline is blocked, but not in the CE banner — to see the specific provider error, check the credential record directly.
+When the credential sweep marks a credential `INVALID`, the provider-specific reason is stored on the credential record. It appears in the launch-time error when a pipeline is blocked, but not in the compute environment banner. To see the specific provider error, check the credential record directly.
 
 | Provider | Example message |
 |---|---|