feat: CommunityToolkit.Aspire.Hosting.K3s — k3s Kubernetes cluster hosting integration#1322
edmondshtogu wants to merge 43 commits into
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/CommunityToolkit/Aspire/main/eng/scripts/dogfood-pr.sh | bash -s -- 1322
Or
iex "& { $(irm https://raw.githubusercontent.com/CommunityToolkit/Aspire/main/eng/scripts/dogfood-pr.ps1) } 1322"
@dotnet-policy-service agree
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 37 out of 38 changed files in this pull request and generated 15 comments.
Comments suppressed due to low confidence (4)
src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Manifest.cs:95
- This `async` callback has no await, which will produce CS1998 and be treated as an error in this repo. Use a completed `Task` return (or a synchronous overload) so the project builds cleanly.
resourceBuilder.WithContainerFiles("/k8s-manifests", async (ctx, ct) =>
{
if (Directory.Exists(absolutePath))
src/CommunityToolkit.Aspire.Hosting.K3s/K3sBuilderExtensions.Helm.cs:91
- This `async` expression-bodied callback has no await, so CS1998 will be emitted and treated as an error. Return a completed task (or use a synchronous overload) instead of marking the lambda async.
.WithContainerFiles("/helm-values", async (ctx, ct) =>
release.ValuesFiles
.Select((hostPath, i) => (ContainerFileSystemItem)new ContainerFile
src/CommunityToolkit.Aspire.Hosting.K3s/K3sInProcessPortForwarder.cs:152
- The same empty-selector case here can select the first ready pod in the namespace and forward traffic to an unrelated workload. Services without selectors are valid in Kubernetes, so the forwarder should not treat an empty selector as "all pods".
var selector = string.Join(",",
(svc.Spec.Selector ?? new Dictionary<string, string>()).Select(kv => $"{kv.Key}={kv.Value}"));
var pods = await k8sClient.CoreV1
.ListNamespacedPodAsync(@namespace, labelSelector: selector, cancellationToken: ct)
tests/CommunityToolkit.Aspire.Hosting.K3s.Tests/K3sClusterResourceTests.cs:253
- This test name says it verifies the pod subnet argument, but the assertion only checks that a cluster resource exists. It would still pass if `WithPodSubnet` stopped adding `--cluster-cidr`, so it should assert the command-line args.
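For the empty-selector comment on K3sInProcessPortForwarder.cs above, one possible guard is to build the label selector only when the Service actually declares one, and to skip the pod lookup otherwise. The sketch below is not the PR's code: `SelectorHelper` and `TryBuildLabelSelector` are hypothetical names, and a plain dictionary stands in for the `svc.Spec.Selector` value from the snippet.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the PR.
public static class SelectorHelper
{
    // Returns false when the Service has no selector, so the caller can fail
    // fast instead of listing every pod in the namespace.
    public static bool TryBuildLabelSelector(
        IDictionary<string, string>? selector,
        out string labelSelector)
    {
        if (selector is null || selector.Count == 0)
        {
            labelSelector = string.Empty;
            return false;
        }

        // Same "key=value,key=value" form that the original snippet produces.
        labelSelector = string.Join(",", selector.Select(kv => $"{kv.Key}={kv.Value}"));
        return true;
    }
}
```

A caller would only invoke the pod listing when the method returns true, and report the service as not forwardable otherwise.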
Did we verify that the TypeScript path works end to end?
@davidfowl yes, I introduced additional changes because the callback (Action) was unusable in TypeScript polyglot apphosts: the generated Promise-based wrapper never resolved, making agentCount and the Helm/kubectl image overrides silently unreachable. Here is the TypeScript polyglot in action:
Can someone approve the tests so I can see if there is anything else to be improved?
Can someone approve the workflows again so I can see if there is anything else to be improved? Important The
EndpointReference.IsAllocated caches its result on first call via a nullable bool field (??= pattern). For persistent containers in polyglot AppHosts, the health check can tick before DCP fires the endpoint allocation event, permanently caching false and making the port unresolvable for the lifetime of the process. This leaves local/kubeconfig.yaml stale with port 6443.
Replace the IsAllocated guard + GetValueAsync path with a direct read of EndpointAnnotation.AllocatedEndpoint.Port. This property is non-cached (checks IsValueSet on every call) and non-blocking (returns null if DCP has not yet allocated), so it picks up DCP's allocation on any subsequent health check tick regardless of when the first tick occurred.
Port resolution order:
1. annotation.AllocatedEndpoint.Port: set by DCP on container start
2. annotation.Port: static apiServerPort from AddK3sCluster
Remove the EndpointReference constructor parameter and the port hint file mechanism, both of which are unnecessary with this approach.
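The caching problem described in this commit message can be reproduced in isolation. The sketch below is a simplified model, not Aspire's implementation: `FakeAnnotation`, `CachedReference`, and `PortResolver` are hypothetical stand-ins for the endpoint annotation, the cached `IsAllocated` check, and the health check's port lookup.

```csharp
#nullable enable

// Hypothetical stand-in for an endpoint annotation: AllocatedPort stays null
// until the orchestrator assigns a host port.
public sealed class FakeAnnotation
{
    public int Port { get; set; } = 6443;     // static fallback port
    public int? AllocatedPort { get; set; }   // set later, on allocation
}

// Models the ??= pattern: whatever the first call sees is cached for good.
public sealed class CachedReference
{
    private readonly FakeAnnotation _annotation;
    private bool? _isAllocated;

    public CachedReference(FakeAnnotation annotation) => _annotation = annotation;

    public bool IsAllocated => _isAllocated ??= _annotation.AllocatedPort.HasValue;
}

public static class PortResolver
{
    // Reads the annotation on every call, so a late allocation is picked up
    // on the next tick; falls back to the static port until then.
    public static int Resolve(FakeAnnotation annotation) =>
        annotation.AllocatedPort ?? annotation.Port;
}
```

If `IsAllocated` is read once before `AllocatedPort` is set, it keeps returning false afterwards, while `PortResolver.Resolve` returns the allocated port as soon as it exists.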
@aaronpowell could you please take a look at the PR and share if there is something you would like to change, or if you are happy to approve the integration? I'd love to start using this in my projects. Currently, because microsoft/aspire#16878 is still open, I can't add third-party integrations via the Aspire CLI, so having this package available is the only way around.
aaronpowell left a comment
Tests that are going to require Docker to run will need to be marked with the attribute to indicate that, otherwise they'll fail on Windows during CI.
A !File.Exists check followed by File.WriteAllText is a Windows file-locking race when xunit runs tests in parallel. Using FileMode.CreateNew and swallowing the collision IOException makes it atomic.
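The check-then-create race described here can be avoided with a pattern along these lines. This is a minimal sketch, not the PR's actual fix: `AtomicFile` and `TryCreate` are hypothetical names, and the exception filter is an extra assumption added so that unrelated I/O errors still surface.

```csharp
using System.IO;

// Hypothetical helper for illustration; not the PR's code.
public static class AtomicFile
{
    // Creates the file only if it does not exist yet. Returns true when this
    // call created it, false when another writer got there first.
    public static bool TryCreate(string path, string contents)
    {
        try
        {
            // FileMode.CreateNew throws IOException if the file already
            // exists, so the existence check and the creation are one step.
            using var stream = new FileStream(path, FileMode.CreateNew, FileAccess.Write, FileShare.None);
            using var writer = new StreamWriter(stream);
            writer.Write(contents);
            return true;
        }
        catch (IOException) when (File.Exists(path))
        {
            // Lost the race: the file is already there, which is acceptable.
            return false;
        }
    }
}
```

Parallel tests can all call `TryCreate` on the same path; exactly one of them creates the file and the others continue without failing.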
@aaronpowell The Windows CI failure was a file-locking race in Fixed by replacing the check-then-create pattern with None of the tests actually require Docker: they all use either mocked
aaronpowell left a comment
Pretty solid integration, almost ready to merge in.
Couple of changes needed on this and we'll need to have a test that bootstraps the TypeScript app host (you'll find examples in the repo on how to do that).
Some of the questions I have might be due to naivety on k3s (I've not used it myself).
# Strip build metadata (e.g. 13.4.3+becb48e → 13.4.3) before comparing so that
# a newer patch release installed by the action does not fail this check.
INSTALLED_BASE="${INSTALLED_ASPIRE_CLI_VERSION%%+*}"
# Pass when installed base version >= expected version (sort -V = version ordering).
if ! printf '%s\n' "${EXPECTED_ASPIRE_CLI_VERSION}" "${INSTALLED_BASE}" | sort -V -C; then
  echo "Expected Aspire CLI version >= ${EXPECTED_ASPIRE_CLI_VERSION}, but found ${INSTALLED_ASPIRE_CLI_VERSION}."
  exit 1
fi
I don't think this is needed, we already handle that in CI
Reverting it. I added it because CI was failing on my fork due to a version mismatch.
Can we move this file, and the others related to it, into the examples folder as the AppHost.TypeScript for the integration, to follow the pattern we have with other integrations.
Need to update this to match the latest design of the config - SDK version should be there.
Can you delete this file, we auto-generate it.
internal (string Registry, string Image, string Tag) HelmImageInfo { get; set; }
    = ("docker.io", "alpine/helm", "3.17.3");

/// <summary>Container image settings for the kubectl manifest applier, resolved from cluster options.</summary>
internal (string Registry, string Image, string Tag) KubectlImageInfo { get; set; }
    = ("docker.io", "alpine/kubectl", "1.36.0");
The default values of the tuple should be driven from the types we have to ensure there's only a single place to update.
if (servicePort is < 1 or > 65535)
    throw new ArgumentOutOfRangeException(nameof(servicePort),
        servicePort, "Service port must be in the range 1–65535.");
We should leave this to Aspire to handle. While it's a valid check, no other integration does that so I'd prefer to be consistent.
var resource = new K3sClusterResource(name)
{
    HelmImageInfo = (HelmContainerImageTags.Registry, HelmContainerImageTags.Image, HelmContainerImageTags.Tag),
    KubectlImageInfo = (KubectlContainerImageTags.Registry, KubectlContainerImageTags.Image, KubectlContainerImageTags.Tag),
};
Given that K3sClusterResource is a container resource, why does it then also have container info for other containers? Should they not be modelled as their own resources?
// container/ — rewritten by the health check with server: https://{name}:6443
var kubeconfigDir = Path.Combine(builder.AppHostDirectory, ".k3s", name);
var clusterDir = Path.Combine(kubeconfigDir, "cluster");
Directory.CreateDirectory(clusterDir);
Should we error-handle the folder already existing?
Closes #1321
Overview of changes
Adds CommunityToolkit.Aspire.Hosting.K3s, a hosting integration that runs a lightweight Kubernetes cluster as an Aspire resource tree. Developers can declare a local Kubernetes cluster in Program.cs — with Helm charts, manifests, and exposed service endpoints — the same way they add Redis or PostgreSQL. No external tooling beyond a compatible container runtime (Docker or Podman) is required.
What's included
New package: src/CommunityToolkit.Aspire.Hosting.K3s/
New Tests & Examples:
Key design decisions
Health check via bind-mount, not docker exec. k3s writes its kubeconfig to `K3S_KUBECONFIG_OUTPUT=/tmp/k3s-kubeconfig/kubeconfig.yaml`, bind-mounted to `AppHostDirectory/.k3s/{name}/cluster/` on the host. The health check polls `File.Exists`, rewrites server URLs into `local/` and `container/` variants, then confirms node readiness via `IKubernetes.CoreV1.ListNodeAsync`. No shell access, no `docker exec`, works with any container runtime.
Helm and kubectl run as containers. `HelmReleaseResource` and `K8sManifestResource` extend `ContainerResource` and are shown as children of the cluster in the Aspire dashboard. The install/apply script is injected via `WithContainerFiles`. They cannot use `WaitFor(cluster)` (Aspire forbids a child waiting for its parent), so their scripts poll for `/root/.kube/kubeconfig.yaml`, which only appears after the cluster health check passes, before proceeding. Consumers use `WaitForCompletion(helmRelease)` since these are run-to-completion containers.
Kubeconfig delivered via bind-mount to all containers. `WithReference(cluster)` on containers, the helm installer, and the kubectl applier all bind-mount `AppHostDirectory/.k3s/{name}/container/` (server: `https://{name}:6443`) at a known in-container path and set `KUBECONFIG`. Bind-mount is used uniformly so the kubeconfig updates automatically if the cluster is recreated without restarting dependent containers. No `KUBECONFIG_DATA` base64 encoding: all standard Kubernetes tooling (`kubectl`, `helm`, KubernetesClient SDK) works without custom bootstrap code.
Kustomize auto-detected. `AddK8sManifest` checks for `kustomization.yaml` at configuration time: if present, it bind-mounts the directory (preserving relative base references) and uses `kubectl apply -k`; otherwise it uses an async `WithContainerFiles` callback to copy only the YAML files at container-start time and applies with `kubectl apply -f --server-side`. The callback approach avoids Aspire's build-time path validation on the string overload. The script auto-detects the mode at runtime.
Service exposure without NodePort. `K3sServiceEndpointResource` starts an in-process KubernetesClient WebSocket port-forward bound to `0.0.0.0:{hostPort}`. Host resources receive `services__{name}__url=http(s)://localhost:{port}`; container resources receive `services__{name}__url=http(s)://host.docker.internal:{port}` with `--add-host=host.docker.internal:host-gateway` injected automatically via `ContainerRuntimeArgsCallbackAnnotation` (DCP does not inject this on Linux). The forwarder resolves `targetPort` from the service spec (not the service port) before opening the pod WebSocket, and only signals ready after a running pod is confirmed, not when the TCP listener starts.
Image overrides via K3sClusterOptions. The helm and kubectl container images are configurable via `HelmImage`/`HelmTag`/`HelmRegistry` and `KubectlImage`/`KubectlTag`/`KubectlRegistry` on `K3sClusterOptions`. Defaults: `docker.io/alpine/helm:3.17.3` and `docker.io/alpine/kubectl:1.36.0`.
Robustness fixes from review. `WithK3sVersion` propagates the image tag to all agent nodes (prevents server/agent version skew). `AddServiceEndpoint` validates the port is in the range 1–65535. Helm values files are indexed as `{i}-{filename}` so declaration order is preserved and basename collisions are safe. All helm `--set` values and `--values` paths are POSIX single-quote escaped via `ShellEscape()`.
Usage
Example
PR Checklist
Other information
Security Assumptions
This change assumes local development only. k3s runs in privileged mode (required by k3s/containerd). The bind-mounted kubeconfig directory is readable only by the AppHost user.
`KubernetesClientConfiguration` uses the embedded CA cert from the kubeconfig; no `DangerousAcceptAnyServerCertificateValidator` is used.
Remaining Follow-up Work
`K3sClusterResource` has no production counterpart configured.