feat: CommunityToolkit.Aspire.Hosting.K3s — k3s Kubernetes cluster hosting integration#1322

Open
edmondshtogu wants to merge 43 commits into
CommunityToolkit:mainfrom
edmondshtogu:main

Conversation

@edmondshtogu

@edmondshtogu edmondshtogu commented May 14, 2026

Closes #1321

Overview of changes

Adds CommunityToolkit.Aspire.Hosting.K3s, a hosting integration that runs a lightweight Kubernetes cluster as an Aspire resource tree. Developers can declare a local Kubernetes cluster in Program.cs — with Helm charts, manifests, and exposed service endpoints — the same way they add Redis or PostgreSQL. No external tooling beyond a compatible container runtime (Docker or Podman) is required.

What's included

New package: src/CommunityToolkit.Aspire.Hosting.K3s/

| File | Responsibility |
| --- | --- |
| K3sClusterResource.cs | ContainerResource: k3s server; holds kubeconfig directory path and image settings |
| K3sClusterOptions.cs | Configuration (pod/service CIDR, disabled components, k3s image tag, helm/kubectl image overrides) |
| K3sBuilderExtensions.cs | AddK3sCluster, WithDataVolume, WithLifetime, WithReference(cluster), WithK3sVersion, … |
| K3sReadinessHealthCheck.cs | File-based health check: polls cluster/kubeconfig.yaml, writes local/ and container/ variants, probes nodes via KubernetesClient |
| HelmReleaseResource.cs | ContainerResource: runs alpine/helm; child of cluster; exits 0 on success |
| K3sBuilderExtensions.Helm.cs | AddHelmRelease, WithHelmValue, WithHelmValuesFile |
| K8sManifestResource.cs | ContainerResource: runs alpine/kubectl; child of cluster; exits 0 on success |
| K3sBuilderExtensions.Manifest.cs | AddK8sManifest with auto-detected Kustomize support |
| K3sServiceEndpointResource.cs | Resource: in-process port-forward; M1 passive health via IsReady flag |
| K3sBuilderExtensions.ServiceEndpoint.cs | AddServiceEndpoint, WithReference(endpoint) |
| K3sInProcessPortForwarder.cs | KubernetesClient WebSocket TCP forwarder; binds 0.0.0.0:{port} |
| K3sAgentResource.cs | Worker node support (K3sClusterOptions.AgentCount) |
| HelmContainerImageTags.cs / KubectlContainerImageTags.cs | Pinned image defaults; overridable via K3sClusterOptions |

New Tests & Examples:

  • Unit Tests: tests/CommunityToolkit.Aspire.Hosting.K3s.Tests/ — 87 unit tests covering resource registration, script generation, kubeconfig variants, Kustomize detection, values file injection, and public API null guards.
  • Examples: examples/k3s/CommunityToolkit.Aspire.Hosting.K3s.AppHost/
  • TypeScript playground: playground/polyglot/TypeScript/CommunityToolkit.Aspire.Hosting.K3s/ValidationAppHost/

Key design decisions

  • Health check via bind-mount, not docker exec. k3s writes its kubeconfig to K3S_KUBECONFIG_OUTPUT=/tmp/k3s-kubeconfig/kubeconfig.yaml, bind-mounted to AppHostDirectory/.k3s/{name}/cluster/ on the host. The health check polls File.Exists, rewrites server URLs into local/ and container/ variants, then confirms node readiness via IKubernetes.CoreV1.ListNodeAsync. No shell access, no docker exec, works with any container runtime.

  • Helm and kubectl run as containers. HelmReleaseResource and K8sManifestResource extend ContainerResource and are shown as children of the cluster in the Aspire dashboard. The install/apply script is injected via WithContainerFiles. They cannot use WaitFor(cluster) (Aspire forbids a child waiting for its parent), so their scripts poll for /root/.kube/kubeconfig.yaml — which only appears after the cluster health check passes — before proceeding. Consumers use WaitForCompletion(helmRelease) since these are run-to-completion containers.

  • Kubeconfig delivered via bind-mount to all containers. WithReference(cluster) on containers, the helm installer, and the kubectl applier all bind-mount AppHostDirectory/.k3s/{name}/container/ (server: https://{name}:6443) at a known in-container path and set KUBECONFIG. Bind-mount is used uniformly so the kubeconfig updates automatically if the cluster is recreated without restarting dependent containers. No KUBECONFIG_DATA base64 encoding — all standard Kubernetes tooling (kubectl, helm, KubernetesClient SDK) works without custom bootstrap code.

  • Kustomize auto-detected. AddK8sManifest checks for kustomization.yaml at configuration time: if present, it bind-mounts the directory (preserving relative base references) and uses kubectl apply -k; otherwise it uses an async WithContainerFiles callback to copy only the YAML files at container-start time and applies with kubectl apply -f --server-side. The callback approach avoids Aspire's build-time path validation on the string overload. The script auto-detects the mode at runtime.

  • Service exposure without NodePort. K3sServiceEndpointResource starts an in-process KubernetesClient WebSocket port-forward bound to 0.0.0.0:{hostPort}. Host resources receive services__{name}__url=http(s)://localhost:{port}; container resources receive services__{name}__url=http(s)://host.docker.internal:{port} with --add-host=host.docker.internal:host-gateway injected automatically via ContainerRuntimeArgsCallbackAnnotation (DCP does not inject this on Linux). The forwarder resolves targetPort from the service spec (not the service port) before opening the pod WebSocket, and only signals ready after a running pod is confirmed — not when the TCP listener starts.

  • Image overrides via K3sClusterOptions. The helm and kubectl container images are configurable via HelmImage/HelmTag/HelmRegistry and KubectlImage/KubectlTag/KubectlRegistry on K3sClusterOptions. Defaults: docker.io/alpine/helm:3.17.3 and docker.io/alpine/kubectl:1.36.0.

  • Robustness fixes from review. WithK3sVersion propagates the image tag to all agent nodes (prevents server/agent version skew). AddServiceEndpoint validates the port is in the range 1–65535. Helm values files are indexed as {i}-{filename} so declaration order is preserved and basename collisions are safe. All helm --set values and --values paths are POSIX single-quote escaped via ShellEscape().
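
The last two fixes in that bullet can be illustrated in a few lines. The sketch below is not the PR's code: `HelmArgs`, its `ShellEscape` body, and `IndexedValuesFileName` are hypothetical, and it assumes the conventional POSIX technique of wrapping a value in single quotes and replacing each embedded single quote with `'\''`.

```csharp
using System;
using System.IO;

// Hypothetical helper; the PR's actual ShellEscape() implementation is not shown in this thread.
public static class HelmArgs
{
    // Wraps a value in single quotes for a POSIX shell. Inside single quotes
    // nothing is special except the quote itself, so each ' becomes '\''
    // (close the quote, emit an escaped quote, reopen the quote).
    public static string ShellEscape(string value)
    {
        ArgumentNullException.ThrowIfNull(value);
        return "'" + value.Replace("'", "'\\''") + "'";
    }

    // Prefixes the declaration index so that order is preserved and two
    // files with the same basename do not overwrite each other.
    public static string IndexedValuesFileName(int index, string hostPath)
    {
        ArgumentNullException.ThrowIfNull(hostPath);
        return $"{index}-{Path.GetFileName(hostPath)}";
    }
}
```

With these helpers, two values files that are both named `values.yaml` but live in different directories would be staged as `0-values.yaml` and `1-values.yaml`.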

Usage

var cluster = builder.AddK3sCluster("k8s")
    .WithDataVolume()
    .WithK3sVersion("v1.36.0-k3s1");

var widgetCrd = cluster.AddK8sManifest("widget-crd", "./k8s/crds/");

var argocd = cluster.AddHelmRelease("argocd", "argo-cd",
    repo: "https://argoproj.github.io/argo-helm",
    version: "7.8.0",
    @namespace: "argocd")
    .WithHelmValuesFile("./deploy/argocd-values.yaml");

var ui = cluster.AddServiceEndpoint("argocd-ui", "argocd-server", 443, "argocd")
    .WaitForCompletion(argocd);

builder.AddProject<Projects.WidgetOperator>("operator")
    .WaitForCompletion(widgetCrd)
    .WithReference(cluster);

builder.AddProject<Projects.Api>("api")
    .WaitFor(ui)
    .WithReference(ui);

Example

image

PR Checklist

  • Created a feature/dev branch in your fork (vs. submitting directly from a commit on main)
  • Based off latest main branch of toolkit
  • PR doesn't include merge commits (always rebase on top of our main, if needed)
  • New integration
    • Docs are written
    • Added description of major feature to project description for NuGet package (4000 total character limit, so don't push entire description over that)
  • Tests for the changes have been added (for bug fixes / features) (if applicable)
  • Contains NO breaking changes
  • Every new API (including internal ones) has full XML docs
  • Code follows all style conventions

Other information

Security Assumptions
This change assumes local development only. k3s runs in privileged mode (required by k3s/containerd). The bind-mounted kubeconfig directory is readable only by the AppHost user. KubernetesClientConfiguration uses the embedded CA cert from the kubeconfig; no DangerousAcceptAnyServerCertificateValidator is used.

Remaining Follow-up Work

  • Publish-time diagnostic when a K3sClusterResource has no production counterpart configured.

Copilot AI review requested due to automatic review settings May 14, 2026 10:45
@github-actions

github-actions Bot commented May 14, 2026

Contributor

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/CommunityToolkit/Aspire/main/eng/scripts/dogfood-pr.sh | bash -s -- 1322

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/CommunityToolkit/Aspire/main/eng/scripts/dogfood-pr.ps1) } 1322"

Copilot AI left a comment

Contributor

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@edmondshtogu

Author

@dotnet-policy-service agree

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 13 comments.

Comment thread .github/workflows/tests.yaml Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.cs
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Helm.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sContainerImageTags.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/README.md Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.ServiceEndpoint.cs Outdated
edmondshtogu and others added 3 commits May 18, 2026 14:59
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 15 comments.

Comments suppressed due to low confidence (4)

src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Manifest.cs:95

  • This async callback has no await, which will produce CS1998 and be treated as an error in this repo. Use a completed Task return (or a synchronous overload) so the project builds cleanly.
            resourceBuilder.WithContainerFiles("/k8s-manifests", async (ctx, ct) =>
            {
                if (Directory.Exists(absolutePath))

src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Helm.cs:91

  • This async expression-bodied callback has no await, so CS1998 will be emitted and treated as an error. Return a completed task (or use a synchronous overload) instead of marking the lambda async.
            .WithContainerFiles("/helm-values", async (ctx, ct) =>
                release.ValuesFiles
                    .Select((hostPath, i) => (ContainerFileSystemItem)new ContainerFile

src/CommunityToolkit.Aspire.Hosting.K3s/K3sInProcessPortForwarder.cs:152

  • The same empty-selector case here can select the first ready pod in the namespace and forward traffic to an unrelated workload. Services without selectors are valid in Kubernetes, so the forwarder should not treat an empty selector as "all pods".
            var selector = string.Join(",",
                (svc.Spec.Selector ?? new Dictionary<string, string>()).Select(kv => $"{kv.Key}={kv.Value}"));

            var pods = await k8sClient.CoreV1
                .ListNamespacedPodAsync(@namespace, labelSelector: selector, cancellationToken: ct)

tests/CommunityToolkit.Aspire.Hosting.K3s.Tests/K3sClusterResourceTests.cs:253

  • This test name says it verifies the pod subnet argument, but the assertion only checks that a cluster resource exists. It would still pass if WithPodSubnet stopped adding --cluster-cidr, so it should assert the command-line args.
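
The empty-selector concern raised for K3sInProcessPortForwarder.cs can be illustrated without a cluster. This is a minimal sketch, assuming a hypothetical helper `TryBuildLabelSelector` (not part of the PR or of KubernetesClient) that refuses to produce a selector when the service has none, so the caller can fail fast instead of listing every pod in the namespace.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ServiceSelectors
{
    // Returns false when the service has no selector. An empty label selector
    // matches every pod in the namespace, so the caller should treat this case
    // as "no backing pods can be inferred" rather than list pods with it.
    public static bool TryBuildLabelSelector(
        IDictionary<string, string>? selector, out string labelSelector)
    {
        if (selector is null || selector.Count == 0)
        {
            labelSelector = string.Empty;
            return false;
        }

        labelSelector = string.Join(",", selector.Select(kv => $"{kv.Key}={kv.Value}"));
        return true;
    }
}
```

A caller would only pass `labelSelector` to the pod list call when the method returns true, and report a clear error otherwise.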

Comment thread tests/CommunityToolkit.Aspire.Hosting.K3s.IntegrationTests/K3sIntegrationTests.cs Outdated
Comment thread tests/CommunityToolkit.Aspire.Hosting.K3s.IntegrationTests/K3sIntegrationTests.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Manifest.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Helm.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sInProcessPortForwarder.cs Outdated
Comment thread tests/CommunityToolkit.Aspire.Hosting.K3s.Tests/K3sClusterResourceTests.cs Outdated
edmondshtogu and others added 2 commits May 18, 2026 17:10
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 37 out of 38 changed files in this pull request and generated 7 comments.

Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/README.md Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sInProcessPortForwarder.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sInProcessPortForwarder.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.cs Outdated
Comment thread src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.ServiceEndpoint.cs Outdated
edmondshtogu and others added 3 commits May 18, 2026 20:15
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@davidfowl

Contributor

Did we verify that the TypeScript path works end to end?

@edmondshtogu

Author

> Did we verify that the TypeScript path works end to end?

@davidfowl yes, I introduced additional changes since the callback (Action) was unusable in TypeScript polyglot apphosts — the generated Promise-based wrapper never resolved, making agentCount and the Helm/kubectl image overrides silently unreachable. Here is the TypeScript polyglot in action:
image

@edmondshtogu

Author

Can someone approve the tests so I can see if there is anything else to be improved?

@edmondshtogu

edmondshtogu commented Jun 8, 2026

Author

Can someone approve the workflows again so I can see if there is anything else to be improved?

Important

The timheuer/setup-aspire action installs the latest CLI build available in the staging channel, which may be a newer patch than the version pinned in Directory.Build.props. The previous check required an exact match (or the exact version with a +sha suffix), so it failed whenever the action installed, for example, 13.4.3+abc while the repo expected 13.4.0, even though a newer patch is always compatible. I introduced a fix that strips build metadata before comparing and allows any installed version ≥ the expected version. The .NET main workflow is now passing!
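
The workflow check itself is a shell script, but the comparison rule is easy to state in C#. A minimal sketch, assuming plain numeric major.minor.patch versions with an optional +metadata suffix; `CliVersionCheck` and `IsCompatible` are hypothetical names, and prerelease tags such as -preview.1 are not handled.

```csharp
using System;

// Hypothetical helper mirroring the rule described above, not the workflow's actual code.
public static class CliVersionCheck
{
    // Strips build metadata (everything from '+' onward) and compares numerically,
    // so 13.4.3+abc is accepted when 13.4.0 is expected.
    public static bool IsCompatible(string installed, string expected)
    {
        static Version Parse(string v)
        {
            int plus = v.IndexOf('+');
            return Version.Parse(plus >= 0 ? v[..plus] : v);
        }

        return Parse(installed) >= Parse(expected);
    }
}
```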

EndpointReference.IsAllocated caches its result on first call via a
nullable bool field (??= pattern). For persistent containers in polyglot
AppHosts, the health check can tick before DCP fires the endpoint
allocation event, permanently caching false and making the port
unresolvable for the lifetime of the process — leaving local/kubeconfig.yaml
stale with port 6443.

Replace the IsAllocated guard + GetValueAsync path with a direct read of
EndpointAnnotation.AllocatedEndpoint.Port. This property is non-cached
(checks IsValueSet on every call) and non-blocking (returns null if DCP
has not yet allocated), so it picks up DCP's allocation on any subsequent
health check tick regardless of when the first tick occurred.

Port resolution order:
1. annotation.AllocatedEndpoint.Port — set by DCP on container start
2. annotation.Port — static apiServerPort from AddK3sCluster

Remove the EndpointReference constructor parameter and the port hint file
mechanism, both of which are unnecessary with this approach.
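
The caching behaviour described in this commit message can be reproduced in isolation. The sketch below uses hypothetical stand-in types (`CachedProbe`, `LiveProbe`), not Aspire's `EndpointReference` or `EndpointAnnotation`; it only shows why a `??=` cache keeps an early `false` while a direct read with a fallback recovers on a later tick.

```csharp
using System;

// Hypothetical stand-ins for illustration; these are not Aspire types.
public sealed class CachedProbe
{
    private readonly Func<bool> _probe;
    private bool? _cached;

    public CachedProbe(Func<bool> probe) => _probe = probe;

    // ??= runs the probe only while the field is null, so a 'false'
    // observed on the first call is returned for the rest of the process.
    public bool IsAllocated => _cached ??= _probe();
}

public sealed class LiveProbe
{
    private readonly Func<int?> _read;

    public LiveProbe(Func<int?> read) => _read = read;

    // Re-reads the source on every call and uses the static port
    // only until an allocated value becomes available.
    public int ResolvePort(int fallbackPort) => _read() ?? fallbackPort;
}
```

If the probe returns false on the first health check tick and true afterwards, `CachedProbe.IsAllocated` still reports false, whereas `LiveProbe.ResolvePort` switches from the fallback to the allocated port as soon as one is set.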
@edmondshtogu

Author

@aaronpowell could you please take a look at the PR and share if there is something you would like to change, or if you are happy to approve the integration? I'd love to start using this in my projects. Currently, because microsoft/aspire#16878 is still open, I can't add third-party integrations via the Aspire CLI, so having this package available is the only way around.

@aaronpowell aaronpowell left a comment

Member

Tests that are going to require Docker to run will need to be marked with the attribute to indicate that, otherwise they'll fail on Windows during CI.

A !File.Exists check followed by File.WriteAllText is a Windows file-locking race when xunit runs tests in parallel. Using FileMode.CreateNew and swallowing the collision IOException makes the create atomic.
@edmondshtogu

Author

@aaronpowell The Windows CI failure was a file-locking race in EnsureKubeconfigPlaceholder, not a Docker dependency. AddK3sCluster writes placeholder files to {AppHostDirectory}/.k3s/{name}/ at configuration time. With xunit's default parallel execution, multiple tests calling AddK3sCluster("k8s") simultaneously hit a TOCTOU on Windows — both check !File.Exists, both pass, then the second File.WriteAllText fails with IOException: file is being used by another process.

Fixed by replacing the check-then-create pattern with FileMode.CreateNew and swallowing the IOException when the file already exists — atomic create-if-not-exists, safe under concurrent calls.
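
The pattern can be sketched as follows. This is an illustration under assumptions, not the PR's `EnsureKubeconfigPlaceholder`: `PlaceholderFiles.TryCreate` is a hypothetical name, and the exception filter that rethrows unrelated I/O errors is a choice made here.

```csharp
using System;
using System.IO;

// Hypothetical helper; the PR's EnsureKubeconfigPlaceholder is not shown in this thread.
public static class PlaceholderFiles
{
    // Creates the file only if it does not exist. FileMode.CreateNew makes the
    // existence check and the creation a single operation, so two concurrent
    // callers cannot both pass a separate File.Exists check.
    public static bool TryCreate(string path, string contents)
    {
        try
        {
            using var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None);
            using var writer = new StreamWriter(stream);
            writer.Write(contents);
            return true;
        }
        catch (IOException) when (File.Exists(path))
        {
            // Another caller won the race; the placeholder is already there.
            return false;
        }
    }
}
```

The filter keeps failures such as a missing parent directory visible, because in that case the file does not exist and the exception propagates.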

None of the tests actually require Docker: they all use either mocked IKubernetes via Moq or DistributedApplication.CreateBuilder() + resource inspection without ever calling RunAsync. Marking them [RequiresDocker] would skip valid platform coverage unnecessarily.

@aaronpowell aaronpowell left a comment

Member

Pretty solid integration, almost ready to merge in.

Couple of changes needed on this and we'll need to have a test that bootstraps the TypeScript app host (you'll find examples in the repo on how to do that).

Some of the questions I have might be due to my naivety about k3s (I've not used it myself).

Comment on lines +46 to +53
# Strip build metadata (e.g. 13.4.3+becb48e → 13.4.3) before comparing so that
# a newer patch release installed by the action does not fail this check.
INSTALLED_BASE="${INSTALLED_ASPIRE_CLI_VERSION%%+*}"
# Pass when installed base version >= expected version (sort -V = version ordering).
if ! printf '%s\n' "${EXPECTED_ASPIRE_CLI_VERSION}" "${INSTALLED_BASE}" | sort -V -C; then
  echo "Expected Aspire CLI version >= ${EXPECTED_ASPIRE_CLI_VERSION}, but found ${INSTALLED_ASPIRE_CLI_VERSION}."
  exit 1
fi

Member

I don't think this is needed, we already handle that in CI

Author

Reverting it. I added it because CI was failing on my fork due to a version mismatch.

Member

Can we move this file, and the others related to it, into the examples folder as the AppHost.TypeScript for the integration, to follow the pattern we have with other integrations?

Author

Sure

Member

Need to update this to match the latest design of the config - SDK version should be there.

Member

Can you delete this file? We auto-generate it.

Comment on lines +24 to +29
internal (string Registry, string Image, string Tag) HelmImageInfo { get; set; }
    = ("docker.io", "alpine/helm", "3.17.3");

/// <summary>Container image settings for the kubectl manifest applier, resolved from cluster options.</summary>
internal (string Registry, string Image, string Tag) KubectlImageInfo { get; set; }
    = ("docker.io", "alpine/kubectl", "1.36.0");

Member

The default values of the tuple should be driven from the types we have to ensure there's only a single place to update.
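
One way to read this suggestion is sketched below. The type names are hypothetical stand-ins (`HelmImageDefaults` for the PR's `HelmContainerImageTags`, `ClusterImageSettings` for the resource); the point is only that the property initializer refers to the constants instead of repeating the literals.

```csharp
// Hypothetical stand-in for HelmContainerImageTags; values copied from the snippet under review.
public static class HelmImageDefaults
{
    public const string Registry = "docker.io";
    public const string Image = "alpine/helm";
    public const string Tag = "3.17.3";
}

// Hypothetical stand-in for the resource that carries the image settings.
public sealed class ClusterImageSettings
{
    // The default is derived from the constants, so a version bump
    // only has to be made in HelmImageDefaults.
    public (string Registry, string Image, string Tag) HelmImageInfo { get; set; }
        = (HelmImageDefaults.Registry, HelmImageDefaults.Image, HelmImageDefaults.Tag);
}
```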

Comment on lines +84 to +86
if (servicePort is < 1 or > 65535)
    throw new ArgumentOutOfRangeException(nameof(servicePort),
        servicePort, "Service port must be in the range 1–65535.");

Member

We should leave this to Aspire to handle. While it's a valid check, no other integration does that so I'd prefer to be consistent.

Comment on lines +70 to +74
var resource = new K3sClusterResource(name)
{
    HelmImageInfo = (HelmContainerImageTags.Registry, HelmContainerImageTags.Image, HelmContainerImageTags.Tag),
    KubectlImageInfo = (KubectlContainerImageTags.Registry, KubectlContainerImageTags.Image, KubectlContainerImageTags.Tag),
};

Member

Given that K3sClusterResource is a container resource, why does it then also have container info for other containers? Should they not be modelled as their own resources?

// container/ — rewritten by the health check with server: https://{name}:6443
var kubeconfigDir = Path.Combine(builder.AppHostDirectory, ".k3s", name);
var clusterDir = Path.Combine(kubeconfigDir, "cluster");
Directory.CreateDirectory(clusterDir);

Member

Should we error-handle the folder already existing?

Member

Should this be deleted?

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: K3s Kubernetes cluster hosting integration

6 participants