ig nodeLabels not passed to kubernetes nodes in Hetzner #16159

Open · lukasredev opened this issue Dec 7, 2023 · 15 comments · May be fixed by #16739
Labels: kind/bug, lifecycle/stale

Comments

@lukasredev

/kind bug

1. What kops version are you running? The command kops version will display this information.

v1.28.1

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

v1.27.8

3. What cloud provider are you using?
Hetzner

4. What commands did you run? What is the simplest way to reproduce this issue?
Create the cluster

kops create cluster --name=my-cluster.lukasre.k8s.local \
  --ssh-public-key=path-to-pub --cloud=hetzner --zones=fsn1 \
  --image=ubuntu-20.04 --networking=calico --network-cidr=10.10.0.0/16

Add a new instance group with different node labels

kops create ig nodes-immich-fsn1 --subnet fsn1

Edit the instance group with the following config:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-12-06T20:46:51Z"
  generation: 3
  labels:
    kops.k8s.io/cluster: my-cluster.lukasre.k8s.local
  name: nodes-immich-fsn1
spec:
  image: ubuntu-22.04
  kubelet:
    anonymousAuth: false
    nodeLabels:
      lukasre.ch/instancetype: immich
      node-role.kubernetes.io/node: ""
  machineType: cx21
  manager: CloudGroup
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-immich-fsn1
    lukasre.ch/instancetype: immich
  role: Node
  subnets:
  - fsn1

Update the cluster (including forcing a rolling update) with

kops update cluster --yes
kops rolling-update cluster --yes --force

5. What happened after the commands executed?
Commands are successful, but node labels are not added.

The YAML representation of the newly created node is the following (metadata only):

apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 10.10.0.7
    csi.volume.kubernetes.io/nodeid: '{"csi.hetzner.cloud":"40260390"}'
    node.alpha.kubernetes.io/ttl: "0"
    projectcalico.org/IPv4Address: 10.10.0.7/32
    projectcalico.org/IPv4IPIPTunnelAddr: x.x.x.x
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2023-12-07T08:52:42Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: cx21
    beta.kubernetes.io/os: linux
    csi.hetzner.cloud/location: fsn1
    failure-domain.beta.kubernetes.io/region: fsn1
    failure-domain.beta.kubernetes.io/zone: fsn1-dc14
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: nodes-immich-fsn1-67573cda4d994baa
    kubernetes.io/os: linux
    node-role.kubernetes.io/node: ""
    node.kubernetes.io/instance-type: cx21
    topology.kubernetes.io/region: fsn1
    topology.kubernetes.io/zone: fsn1-dc14
  name: nodes-immich-fsn1-67573cda4d994baa
  resourceVersion: "21804205"
  uid: f730e308-5bb9-49f3-b530-91f7a74b698c

6. What did you expect to happen?
I expected the node labels specified in the instance group,

kops.k8s.io/instancegroup: nodes-immich-fsn1
lukasre.ch/instancetype: immich

to be added to the nodes, but they are not.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: "2023-09-19T18:44:55Z"
  generation: 2
  name: my-cluster.lukasre.k8s.local
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: hetzner
  configBase: <configBase>
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: control-plane-fsn1
      name: etcd-1
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: control-plane-fsn1
      name: etcd-1
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.27.8
  networkCIDR: 10.10.0.0/16
  networking:
    calico: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - name: fsn1
    type: Public
    zone: fsn1
  topology:
    dns:
      type: None

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-09-19T18:44:55Z"
  labels:
    kops.k8s.io/cluster: my-cluster.lukasre.k8s.local
  name: control-plane-fsn1
spec:
  image: ubuntu-20.04
  machineType: cx21
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - fsn1

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-09-19T18:44:55Z"
  generation: 2
  labels:
    kops.k8s.io/cluster: my-cluster.lukasre.k8s.local
  name: nodes-fsn1
spec:
  image: ubuntu-20.04
  machineType: cx21
  maxSize: 2
  minSize: 2
  role: Node
  subnets:
  - fsn1

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2023-12-06T20:46:51Z"
  generation: 3
  labels:
    kops.k8s.io/cluster: my-cluster.lukasre.k8s.local
  name: nodes-immich-fsn1
spec:
  image: ubuntu-22.04
  kubelet:
    anonymousAuth: false
    nodeLabels:
      lukasre.ch/instancetype: immich
      node-role.kubernetes.io/node: ""
  machineType: cx21
  manager: CloudGroup
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-immich-fsn1
    lukasre.ch/instancetype: immich
  role: Node
  subnets:
  - fsn1

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?
I looked at some existing issues and found #15090, which seems to be a similar issue: if you compare how labels are generated for OpenStack here and for Hetzner here, it appears that the labels are not passed to the nodeIdentity.Info object.
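
For illustration (a minimal, self-contained sketch, not the actual kops code): the per-cloud identifier returns an info object for each joining node, and kops-controller applies whatever labels that object carries. The type and function names below (nodeInfo, identifyNode) are hypothetical stand-ins; the point is only that if the Hetzner identifier never copies the instance group's labels into that object, kops-controller has nothing to apply.

package main

import "fmt"

// nodeInfo is a hypothetical stand-in for the object an identifier hands
// back to kops-controller (the real type lives in kops' nodeidentity package).
type nodeInfo struct {
	InstanceID string
	Labels     map[string]string
}

// identifyNode sketches the suspected gap: if only InstanceID is filled in
// and the instance group labels are never copied, the controller receives an
// empty label map and the node keeps only its default labels.
func identifyNode(instanceID string, igLabels map[string]string) *nodeInfo {
	info := &nodeInfo{
		InstanceID: instanceID,
		Labels:     map[string]string{},
	}
	// What the OpenStack identifier effectively does, and what appears to be
	// missing for Hetzner: propagate the instance group's labels.
	for k, v := range igLabels {
		info.Labels[k] = v
	}
	return info
}

func main() {
	igLabels := map[string]string{
		"kops.k8s.io/instancegroup": "nodes-immich-fsn1",
		"lukasre.ch/instancetype":   "immich",
	}
	info := identifyNode("40260390", igLabels)
	fmt.Printf("labels kops-controller would apply: %v\n", info.Labels)
}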

@k8s-ci-robot added the kind/bug label on Dec 7, 2023
@lukasredev (Author)

I would be happy to help with a fix, but would require some guidance :)

@lukasredev (Author)

Anyone? :)

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Apr 29, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on May 29, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

/close not-planned


@k8s-ci-robot closed this as not planned on Jun 28, 2024
@MTRNord commented Jul 7, 2024

This seems to be still relevant :(

@rifelpet (Member) commented Jul 7, 2024

/reopen

Can you post logs from the kops-controller pods in kube-system? That is the component responsible for applying labels from instance groups to nodes.

For anyone looking into this bug, this is the controller that handles label updates, initialized here.
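
As a rough illustration of that flow (a sketch under assumptions, not the real kops-controller code): once the identifier returns labels for a node, the controller merges them onto the Node object and writes it back through the API server. The helper below shows only the merge step, using Kubernetes API types (k8s.io/api, k8s.io/apimachinery); applyInstanceGroupLabels is a hypothetical name.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// applyInstanceGroupLabels merges the labels resolved for an instance group
// onto a Node, reporting whether anything changed. In the real controller the
// modified Node would then be updated or patched via the Kubernetes API.
func applyInstanceGroupLabels(node *corev1.Node, igLabels map[string]string) bool {
	if node.Labels == nil {
		node.Labels = map[string]string{}
	}
	changed := false
	for k, v := range igLabels {
		if node.Labels[k] != v {
			node.Labels[k] = v
			changed = true
		}
	}
	return changed
}

func main() {
	node := &corev1.Node{
		ObjectMeta: metav1.ObjectMeta{
			Name:   "nodes-immich-fsn1-67573cda4d994baa",
			Labels: map[string]string{"kubernetes.io/os": "linux"},
		},
	}
	igLabels := map[string]string{
		"kops.k8s.io/instancegroup": "nodes-immich-fsn1",
		"lukasre.ch/instancetype":   "immich",
	}
	if applyInstanceGroupLabels(node, igLabels) {
		fmt.Printf("node %s would be patched with labels %v\n", node.Name, node.Labels)
	}
}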

@k8s-ci-robot reopened this on Jul 7, 2024
@k8s-ci-robot (Contributor)

@rifelpet: Reopened this issue.

In response to this:

/reopen


@MTRNord commented Jul 8, 2024

/reopen

Can you post logs from the kops-controller pods in kube-system? That is the component responsible for applying labels from instance groups to nodes.

For anyone looking into this bug, this is the controller that handles label updates, initialized here.

The only logs covering the creation of the node in question are these:

❯ kubectl logs -n kube-system kops-controller-w9jkr
I0707 19:26:35.841870       1 main.go:241] "msg"="starting manager" "logger"="setup"
I0707 19:26:35.842276       1 server.go:185] "msg"="Starting metrics server" "logger"="controller-runtime.metrics"
I0707 19:26:35.844427       1 server.go:139] kops-controller listening on :3988
I0707 19:26:35.844717       1 server.go:224] "msg"="Serving metrics server" "bindAddress"=":0" "logger"="controller-runtime.metrics" "secure"=false
I0707 19:26:35.844977       1 leaderelection.go:250] attempting to acquire leader lease kube-system/kops-controller-leader...
I0707 19:34:47.495953       1 server.go:220] performed successful callback challenge with 10.10.0.11:3987; identified as minecraft-58d2077a7bd90d8
I0707 19:34:47.495985       1 node_config.go:29] getting node config for &{APIVersion:bootstrap.kops.k8s.io/v1alpha1 Certs:map[] KeypairIDs:map[] IncludeNodeConfig:true Challenge:0x40007680f0}
I0707 19:34:47.497375       1 s3context.go:94] Found S3_ENDPOINT="https://s3.nl-ams.scw.cloud", using as non-AWS S3 backend
I0707 19:34:47.651300       1 server.go:259] bootstrap 10.10.0.2:28728 minecraft-58d2077a7bd90d8 success
I0707 19:41:16.781699       1 server.go:220] performed successful callback challenge with 10.10.0.7:3987; identified as nodes-hel1-7e8746841cf8f905
I0707 19:41:16.792090       1 server.go:259] bootstrap 10.10.0.2:64068 nodes-hel1-7e8746841cf8f905 success
I0707 19:49:05.298120       1 server.go:220] performed successful callback challenge with 10.10.0.5:3987; identified as nodes-hel1-d91dc6bfd5aab64
I0707 19:49:05.298167       1 node_config.go:29] getting node config for &{APIVersion:bootstrap.kops.k8s.io/v1alpha1 Certs:map[] KeypairIDs:map[] IncludeNodeConfig:true Challenge:0x400088a5f0}
I0707 19:49:05.440257       1 server.go:259] bootstrap 10.10.0.2:2440 nodes-hel1-d91dc6bfd5aab64 success
I0707 20:05:50.277408       1 server.go:220] performed successful callback challenge with 10.10.0.11:3987; identified as minecraft-6c3b8cb63d629438
I0707 20:05:50.288817       1 server.go:259] bootstrap 10.10.0.2:20488 minecraft-6c3b8cb63d629438 success

with this InstanceGroup config:

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-07-07T18:44:57Z"
  generation: 5
  labels:
    kops.k8s.io/cluster: midnightthoughts.k8s.local
  name: minecraft
spec:
  image: ubuntu-22.04
  kubelet:
    anonymousAuth: false
    nodeLabels:
      node-role.kubernetes.io/node: minecraft
  machineType: cax31
  manager: CloudGroup
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: minecraft
    role: minecraft
  role: Node
  subnets:
  - hel1
  taints:
  - app=minecraft:NoSchedule

I am using Hetzner for the VMs and Scaleway for the S3-compatible state store.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot closed this as not planned on Aug 7, 2024
@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

/close not-planned


@rifelpet linked a pull request (#16739) on Aug 7, 2024 that will close this issue
@rifelpet (Member) commented Aug 7, 2024

/reopen
/remove-lifecycle rotten

Potential fix in #16739

@k8s-ci-robot (Contributor)

@rifelpet: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle rotten

Potential fix in #16739


@k8s-ci-robot reopened this on Aug 7, 2024
@k8s-ci-robot removed the lifecycle/rotten label on Aug 7, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Nov 5, 2024