
Allow configurable docker network for kind cluster nodes #273

Closed
foolusion opened this issue Feb 5, 2019 · 27 comments · Fixed by #1538
Labels
kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@foolusion

It would be nice to be able to specify the network that the cluster uses.

@BenTheElder
Member

/kind feature
/priority important-longterm
/assign

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Feb 5, 2019
@BenTheElder BenTheElder added this to the 2019 goals milestone Feb 6, 2019
@BenTheElder
Member

We definitely want this. I think we're going to put it in the networking config and start having an automated default.

cc @aojea @neolit123

Strawman (a rough docker CLI sketch follows this list):

  • if the specified network exists, we use it without creating
  • if it doesn't exist, we create it, and label it as belonging to the cluster
  • on cluster delete, we list networks labeled with the cluster, and delete only those (so not just whatever the containers use, only if we labeled it)
  • we name this field / functionality somehow such that it is clear that this feature is docker-specific, leaving room for podman etc. in the immediate future; xref #154 (support multiple container engines on the host).
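
A rough bash sketch of that strawman, assuming a hypothetical label key (not necessarily what kind ultimately shipped with):

# use the network as-is if it already exists, otherwise create and label it
$ docker network inspect kind >/dev/null 2>&1 || \
    docker network create --label io.x-k8s.kind.cluster=mycluster kind

# on cluster delete, remove only the networks we labeled
$ docker network rm $(docker network ls -q --filter label=io.x-k8s.kind.cluster=mycluster)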

@BenTheElder BenTheElder modified the milestones: 2019 goals, 0.4 May 3, 2019
@neolit123
Member

Strawman:

SGTM

Do we need to have the config field?
When a cluster is created we could auto-manage a network with the same name, or one prefixed similarly, e.g.:
kind-network-kind
kind-network-mycluster

@BenTheElder
Member

BenTheElder commented May 3, 2019 via email

@neolit123
Member

ok, makes sense.

@aojea
Contributor

aojea commented May 13, 2019

It seems that docker has an option to populate the /etc/hosts file; that could be useful for getting rid of the loopback address in resolv.conf while keeping node name resolution.

Managing /etc/hosts
Your container will have lines in /etc/hosts which define the hostname of the container itself as well as localhost and a few other common things. The --add-host flag can be used to add additional lines to /etc/hosts.

$ docker run -it --add-host db-static:86.75.30.9 ubuntu cat /etc/hosts
172.17.0.22     09d03f76bf2c
fe00::0         ip6-localnet
ff00::0         ip6-mcastprefix
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters
127.0.0.1       localhost
::1             localhost ip6-localhost ip6-loopback
86.75.30.9      db-static

@BenTheElder
Member

The only problem with --add-host is that we don't know the other nodes' IPs when we call docker run; it's a bit chicken-and-egg :^)

@BenTheElder
Member

This turned out to have a few more issues than we expected, due to non-default docker networks having different behavior. This may slip to 0.5 as we're nearing the 0.4 release, but it's definitely something we want.

@BenTheElder BenTheElder modified the milestones: v0.4.0, v0.5.0 Jun 25, 2019
@BenTheElder BenTheElder added the kind/design Categorizes issue or PR as related to design. label Aug 15, 2019
@BenTheElder BenTheElder changed the title Allow configurable docker network for kind clusters Allow configurable docker network for kind cluster nodes Aug 16, 2019
@BenTheElder BenTheElder modified the milestones: v0.5.0, 1.0 Aug 16, 2019
@jayunit100
Contributor

jayunit100 commented Oct 19, 2019

Hi! Can someone disambiguate the use cases here between this issue and #278?

@aojea
Contributor

aojea commented Oct 19, 2019

@jayunit100 there are two different networking layers here. One is the CNI plugin used by the Kubernetes cluster: kind installs its own CNI by default, but you can disable it and install your preferred CNI plugin once kind finishes creating the cluster.
The other networking layer is the one that docker provides; that's where kind spawns the nodes. Currently kind only supports docker, with its networking limitations. Using a non-default bridge in docker has consequences that break kind in various ways.

@jayunit100
Contributor

jayunit100 commented Oct 19, 2019

So docker0 is only being used for node IP addresses in the use case for this issue? Thanks for clarifying! I was confused :). Curious what the use case is for not using docker0 at that level... after all, kind as an abstraction for testing k8s is sufficient as long as the k8s-specific stuff isn't impacted by Docker's implementation as a hypervisor for virtual nodes, right?

@jayunit100
Contributor

jayunit100 commented Oct 19, 2019

Mostly get it now... maybe change the title of this issue to "use non-docker0 interface for kubelet IPs" (although imprecise, I think it gets the point across) so that it's clear what we mean by cluster :):)... thanks again! The CNI feature for kind is definitely awesome; I want to make sure people know that it works as is :).
PS: for context, I'm looking at using kind instead of my vagrant recipes for some aspects of some calico tests.

@aojea
Contributor

aojea commented Oct 19, 2019

There are calico folks using kind for testing, as you can see in this slack conversation: https://kubernetes.slack.com/archives/CEKK1KTN2/p1570036710217000. Maybe you can bring this up in our slack channel.

The main problem with using a custom bridge in docker is that it modifies the DNS behavior, using an embedded DNS server: https://docs.docker.com/v17.09/engine/userguide/networking/configure-dns/
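
For illustration, this is the typical difference (output abbreviated and host-dependent): containers on the default bridge inherit the host's resolv.conf, while containers on a user-defined network get docker's embedded DNS server at 127.0.0.11:

$ docker network create testnet
$ docker run --rm --network testnet ubuntu cat /etc/resolv.conf
nameserver 127.0.0.11
options ndots:0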

Bronek referenced this issue in mks-m/kind Oct 25, 2019
Squashed commit of the following:

commit 00521c4
Author: keymone <[email protected]>
Date:   Mon Oct 21 11:43:25 2019 +0100

    Allow specifying network name to use in docker provisioner
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 17, 2020
@BenTheElder BenTheElder removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 28, 2020
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Jan 28, 2020
@zephinzer

This would be nice. I'm currently facing an issue where I can't resolve an internal image registry that's behind my org's VPN. What work is left on this? Maybe I can help!

@BenTheElder
Member

kind now uses a specific network ("kind") at HEAD, as part of some unrelated work.

As it currently stands kind will not delete any networks, so you can just pre-create the kind network with your desired settings.

We need to revisit how that works a bit WRT IPv6 in a follow-up PR before moving forward.

@BenTheElder
Member

#1538 will make it possible to do this.

You shouldn't actually need this in nearly all cases though; kind is switching to ensuring and using a "kind" network with all of the features of a user-defined network.

If you pre-create this network, kind will use it as you configured it; kind does not delete networks.
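
A minimal sketch of that pre-creation workflow (the subnet values here are illustrative, not required):

$ docker network create --driver bridge \
    --subnet 172.100.0.0/16 --gateway 172.100.0.1 kind
$ kind create cluster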

@BenTheElder
Member

Most of the problem initially was just that user-defined networks are a breaking change in docker vs. the default bridge: they have different DNS in ways that don't trivially work with kind.

We've fixed that and always use one now.

The remaining issue is that completely arbitrary networks can be ... very strange.
For now we're provisioning our own under a fixed name unless it already exists.

This network is a standard bridge, with an IPv6 subnet picked out of ULA.
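
You can check what kind provisioned with docker network inspect; the exact subnets vary per host, but the IPv6 subnet is picked from the ULA range (fc00::/7), e.g.:

$ docker network inspect kind | jq '.[0].IPAM.Config'
[
  { "Subnet": "172.18.0.0/16", "Gateway": "172.18.0.1" },
  { "Subnet": "fc00:f853:ccd:e793::/64" }
]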

joshuaspence added a commit to joshuaspence/homelab that referenced this issue Aug 3, 2020
Instead of using two completely separate subnets (`172.x.0.0/16` is used by Kind, and I had arbitrarily chosen `192.168.2.0/24` for MetalLB), use a custom Docker network for `kind`. I needed to create the Docker network myself rather than letting `kind` do it, because otherwise the subnet IP range will not necessarily be fixed (see https://github.com/kubernetes-sigs/kind/blob/add83858a0addea5899a88003a598399a8a36747/pkg/cluster/internal/providers/docker/network.go#L94).

To allow `kind` to use the custom Docker network, I am relying on `KIND_EXPERIMENTAL_DOCKER_NETWORK`. See kubernetes-sigs/kind#273 and kubernetes-sigs/kind#1538.

For documentation on `docker network create`, see https://docs.docker.com/engine/reference/commandline/network_create/.

| Subnet           | Usage                |
|------------------|----------------------|
| `172.100.0.0/24` | Kubernetes nodes     |
| `172.100.1.0/24` | MetalLB address pool |
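
A hedged sketch of that setup ("homelab" is a hypothetical network name; --ip-range confines node IPs to the first /24 so the MetalLB pool in the second /24 stays free):

$ docker network create --driver bridge \
    --subnet 172.100.0.0/16 --ip-range 172.100.0.0/24 homelab
$ KIND_EXPERIMENTAL_DOCKER_NETWORK=homelab kind create cluster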
@redbrick9

redbrick9 commented Nov 30, 2022

Hi @BenTheElder, I created a bridge similar to kind's, just with different IPAM settings, and I set the env variable KIND_EXPERIMENTAL_DOCKER_NETWORK to that bridge. When issuing "kind create cluster ..." I saw the following stdout on the screen:

WARNING: Overriding docker network due to KIND_EXPERIMENTAL_DOCKER_NETWORK
WARNING: Here be dragons! This is not supported currently.

After the kind cluster was created, it still used the "kind" network. I tried to identify why but didn't get any clues. Do you know why? Thanks!

@prabhakhar

prabhakhar commented Jan 11, 2023

@redbrick9 It works as specified.

export KIND_EXPERIMENTAL_DOCKER_NETWORK=wildlings
kind create cluster --config wildlings.yaml

A new network was created in docker:

🚀 ➜ docker network inspect wildlings | jq .[0].IPAM
{
  "Driver": "default",
  "Options": {},
  "Config": [
    {
      "Subnet": "172.19.0.0/16",
      "Gateway": "172.19.0.1"
    },
    {
      "Subnet": "fc00:c796:3cb7:e852::/64"
    }
  ]
}

@boeboe

boeboe commented May 29, 2023

Any chance we'll get this functionality as a kind startup flag (--net) in the future?

@BenTheElder
Member

This is pretty bug-prone, and you can pre-create the kind network with your (unsupported, potentially broken) settings instead.

@boeboe

boeboe commented May 30, 2023

@BenTheElder

I kindly (pun intended) disagree. I am using the --network and --subnet flags in minikube on a daily basis:

minikube start --help | grep net
    --network='':
        network to run minikube with. Now it is used by docker/podman and KVM drivers. If left empty, minikube will create a new network.
    --subnet='':
        Subnet to be used on kic cluster. If left empty, minikube will choose subnet address, beginning from 192.168.49.0. (docker and podman driver only)

This allows me to run my minikube-based kubernetes clusters (plural) in any docker networks (plural) that I pre-configured, and even lets me create a bridged docker network through minikube itself.

I am not asking kind to start managing bridged docker networks, as the --subnet flag does for minikube; being able to attach your kind cluster to a configurable (and assumed pre-existing and pre-configured) docker network is basic functionality that does not extend kind beyond its core responsibilities.

The main use case for me to use minikube with configurable (and subnet-separated, to avoid MetalLB conflicts) docker networks is to simulate kubernetes multi-cluster demo, developer, and CI/CD environments. It would be awesome if I could also use kind for this purpose. Moving this experimental flag to a first-class, yet optional, command-line argument would not impact stability and would increase usability and adoption reach.
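
For reference, a sketch of that multi-cluster separation as it can be done today with the experimental variable (names and subnets are illustrative):

$ docker network create --subnet 172.101.0.0/16 cluster-a-net
$ docker network create --subnet 172.102.0.0/16 cluster-b-net
$ KIND_EXPERIMENTAL_DOCKER_NETWORK=cluster-a-net kind create cluster --name a
$ KIND_EXPERIMENTAL_DOCKER_NETWORK=cluster-b-net kind create cluster --name b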

@BenTheElder
Member

--subnet

Subnets come from docker IPAM settings, which are already user-configurable, OR you can pre-create the kind network (or use KIND_EXPERIMENTAL_DOCKER_NETWORK).

but being able to attach your kind cluster to a configurable (and assumed pre-existing and pre-configured) docker network is basic functionality that does not extend kind beyond its core responsibilities.

https://kind.sigs.k8s.io/docs/contributing/project-scope/
https://kind.sigs.k8s.io/docs/design/principles/#target-cri-functionality

Anyhow, you can connect nodes to additional networks with:
for node in $(kind get nodes); do docker network connect network-name "$node"; done

To change the default network in an experimental, unsupported way you can use KIND_EXPERIMENTAL_DOCKER_NETWORK.

The main use cases for me to use minikube with configurable (and subnet separated, to avoid metallb conflicts) docker networks is to simulate kubernetes multi-cluster demo, developer and CI/CD environments. It would be awesome that I can also use kind for this purpose.

There are demos of this sort of thing in the kubernetes project using kind with the existing functionality: https://github.com/kubernetes-sigs/mcs-api/blob/master/scripts/up.sh

Moving this experimental flag to a first class, yet optional, command line argument does not impact stability

This is not true. See for example #2917

@aojea
Contributor

aojea commented May 31, 2023

or people can create their own plugins https://github.com/aojea/kind-networking-plugins

@boeboe

boeboe commented May 31, 2023

Regarding #2917
... I don't see how this is relevant.

The only reason people connect to a second network is that they were forced to by the arbitrary choice of hard-coding a bridged docker network named kind in the first place. The only place I've seen multi-network use cases is for multi-interface things in the 5G core spec (and CNIs like multus), which are all Service Provider use cases.

Anyway... the experimental flag works like a charm and covers my use case. It's a bit strange that you refuse to make this a first-class command-line flag... minikube and k3d both support it out of the box... without a plugin system.

Hard-coding choices like the name and choice of a docker network is bad software design, but I'll leave it there.

FWIW... here's my attempt to create a single abstraction layer for my multi-cluster needs, with support for minikube/k3s/kind, where kind is the only one going "experimental":
https://github.com/boeboe/k8s-local/blob/main/k8s-local.sh#L202

@BenTheElder
Member

Regarding #2917
... I don't see how this is relevant.

This is an example of the challenging bugs that crop up due to users with custom networking that we're not supporting.

We simply can't prioritize that, which is why the existing feature is clearly named "EXPERIMENTAL" and will stay that way for now.

The only reason people connect to a second network, is because they were forced to by the arbitrary choice of hard coding a bridged docker network kind in the first place. The only place I've seen multi-network use cases is for multi-interface things in 5G core spec (and CNIs like multus), which are all Service Provider use cases.

Frankly, this approach is not helpful and I'm disinclined to spend further energy here.

The design and implementation are not "arbitrary" just because you have not looked into the history and context behind them.
Every single change is carefully considered and implemented with reason. This is rude and willfully ignorant; all commits and discussions are public.

KIND used the default docker bridge for the first year, until we ran into serious limitations while exploring proposed fixes for clusters surviving host reboots, which was NOT originally intended functionality that we even tested, because KIND was created to test Kubernetes, NOT to test applications.

But there was high user demand anyhow, and minikube hadn't adopted the kind image yet and k3d didn't exist, so we spent a lot of effort adapting to the demands for long-lived application development clusters. In the process we settled on a substitute for the standard docker bridge network that closely mimics it with a minimum of changes; since we have to configure it somewhat, it lives under the predictable "kind" name, so test containers can run alongside it, and it otherwise behaves very closely to how things worked before this change.

Anyway... the experimental flag works like a charm and covers my use case. It's a bit strange you refuse to make this a first class command line flag... minikube and k3d both support it out if the box... without a plugin system.

minikube is a sister project in the same organization; it is explicitly not a goal to attempt to create 100% overlap between them.

KIND is a lightweight tool focused on developing bleeding edge Kubernetes with a secondary focus on usage for other functionality, which you can find more about in our contributing guide / docs:
https://kind.sigs.k8s.io/docs/contributing/project-scope/

It is important to our existing users and use cases that the tool remain small and well maintained and keep up with the latest changes in the container ecosystem, Linux, and Kubernetes, which is where most of our energy goes, e.g. #3223.

Hard coding choices like the name and choice of a docker network is bad software design, but I'll leave it there.

Again, you haven't bothered to look at how we settled on the current approach and you're being rude.

FWIW... hereby my attempt to create a single abstraction layer for my multi cluster needs, having support for minikube/k3s/kind, where kind is the only one going "experimental"
https://github.com/boeboe/k8s-local/blob/main/k8s-local.sh#L202

Again:

  • It really is not such a burden to use an environment variable instead of a flag, and to accept that this feature is considered experimental precisely because there isn't a fleshed-out host-multi-network design, due to very low demand and nobody contributing a detailed proposal, etc.
  • KIND is only expected to be compatible with minikube, k3s, microk8s, etc. in the sense that it provides standard, conformant Kubernetes. Docker host networks are irrelevant as long as node-to-node networking meets expectations, which is our focus in maintaining the docker network integration.

@BenTheElder
Member

This issue is closed.

If anyone would like to propose a new feature with a considered design proposal: https://kind.sigs.k8s.io/docs/contributing/getting-started/

To be considered for addition it will also first need concrete use cases that cannot be handled with existing functionality.

AFAIK there aren't really any. For example, joshuaspence/homelab@72b9038 references this, but that can be accomplished entirely on the standard network instead (I will leave a comment); multi-cluster testing is also referenced, but that works fine on a single bridge network.

A few seed questions for anyone that does choose to explore this:

  1. What does the network lifecycle look like? Is it coupled to the cluster? If so, how will we avoid breaking existing users and tools? If not, how will we clean up all these networks (as opposed to today, where the user has opted into an experimental feature and will have to work around this limitation).
  2. How flexible will configuring these be if we don't just leave it up to the power user to do externally, as today? How do we reconcile the wildly different networking in podman, and in the future nerdctl and other tools? Will we only support CNI networks and their limitations?

@kubernetes-sigs kubernetes-sigs locked as resolved and limited conversation to collaborators Jun 1, 2023