Deploying on the Kwok node, unable to collect the kube_pod_status_scheduled_time metric #1192

Open
1 of 5 tasks
conghuhu opened this issue Aug 6, 2024 · 11 comments
Labels
kind/support Categorizes issue or PR as a support question. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@conghuhu

conghuhu commented Aug 6, 2024

How to use it?

  • kwok
  • kwokctl --runtime=docker (default runtime)
  • kwokctl --runtime=binary
  • kwokctl --runtime=nerdctl
  • kwokctl --runtime=kind

What happened?

For pods deployed on the Kwok node, the kube_pod_status_scheduled_time metric from kube-state-metrics is not collected.

What did you expect to happen?

The query count(kube_pod_status_scheduled_time{namespace="default"}) in Prometheus should return a result.

How can we reproduce it (as minimally and precisely as possible)?

  1. Use the following shell script to install kube-prometheus-stack:
#!/bin/bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  -n monitoring --create-namespace \
  --set nodeExporter.enabled=false,alertmanager.enabled=false \
  --set prometheusOperator.admissionWebhooks.patch.image.registry="docker.io" \
  --set prometheusOperator.admissionWebhooks.patch.image.repository=15841721425/kube-webhook-certgen \
  --set kube-state-metrics.image.registry="docker.io" \
  --set kube-state-metrics.image.repository=bitnami/kube-state-metrics \
  --set kube-state-metrics.image.tag=2.10.1
  2. Use the following shell script to install kwok and create the fake nodes:
#!/bin/bash

if [ $# -eq 0 ]; then
  echo "Error: Please provide the number of nodes to create."
  echo "Usage: $0 <number_of_nodes>"
  exit 1
fi

KWOK_REPO=kubernetes-sigs/kwok
KWOK_LATEST_RELEASE=$(curl "https://api.github.com/repos/${KWOK_REPO}/releases/latest" | jq -r '.tag_name')
kubectl apply -f "https://github.com/${KWOK_REPO}/releases/download/${KWOK_LATEST_RELEASE}/kwok.yaml"
kubectl apply -f "https://github.com/${KWOK_REPO}/releases/download/${KWOK_LATEST_RELEASE}/stage-fast.yaml"

for (( i=0;i<$1; i++))
do
  kubectl apply -f - <<EOF
  apiVersion: v1
  kind: Node
  metadata:
    annotations:
      node.alpha.kubernetes.io/ttl: "0"
      kwok.x-k8s.io/node: fake
    labels:
      beta.kubernetes.io/arch: amd64
      beta.kubernetes.io/os: linux
      kubernetes.io/arch: amd64
      kubernetes.io/hostname: kwok-node-$i
      kubernetes.io/os: linux
      kubernetes.io/role: agent
      node-role.kubernetes.io/agent: ""
      type: kwok
    name: kwok-node-$i
  spec:
    taints: # Avoid scheduling actual running pods to fake Node
    - effect: NoSchedule
      key: kwok.x-k8s.io/node
      value: fake
  status:
    allocatable:
      cpu: 32
      memory: 256Gi
      pods: 110
    capacity:
      cpu: 32
      memory: 256Gi
      pods: 110
    nodeInfo:
      architecture: amd64
      bootID: ""
      containerRuntimeVersion: ""
      kernelVersion: ""
      kubeProxyVersion: fake
      kubeletVersion: fake
      machineID: ""
      operatingSystem: linux
      osImage: ""
      systemUUID: ""
    phase: Running
EOF
done
  3. Deploy an nginx Deployment and Service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
      tolerations:
        - key: "kwok.x-k8s.io/node"
          operator: "Exists"
          effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
  4. Query count(kube_pod_status_scheduled_time{namespace="default"}) in Prometheus.
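
For anyone reproducing this without the Prometheus UI, the query above can also be sent to the Prometheus HTTP API. This is only a minimal sketch: it assumes Prometheus has been port-forwarded to localhost:9090, the Service name in the comment is a guess based on the Helm release name used earlier, and prom_query and PROM_URL are hypothetical names that do not appear in the original report.

```shell
#!/bin/bash
# Sketch: run a PromQL instant query against the Prometheus HTTP API.
# Assumption: Prometheus is reachable on localhost:9090, for example via
#   kubectl -n monitoring port-forward svc/kube-prometheus-stack-prometheus 9090:9090
# (the Service name is assumed from the Helm release name, not confirmed).

PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query <promql>: print the raw JSON response of an instant query.
prom_query() {
  # -G sends the data as URL query parameters; --data-urlencode escapes
  # the braces and quotes that PromQL label matchers contain.
  curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (needs the port-forward above to be running):
#   prom_query 'count(kube_pod_status_scheduled_time{namespace="default"})'
```

An empty "result" array in the JSON response would correspond to the missing data described in this issue.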

Anything else we need to know?

No response

Kwok version

$ kwok --version
0.6.0

$ kwokctl --version
# paste output here

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Darwin:
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
@conghuhu conghuhu added the kind/bug Categorizes issue or PR as related to a bug. label Aug 6, 2024
@wzshiming
Member

wzshiming commented Aug 6, 2024

https://kwok.sigs.k8s.io/docs/user/metrics-configuration/

These are configurable, and you must configure the metrics as you see fit.

/remove-kind bug
/kind support

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Aug 6, 2024
@conghuhu
Author

conghuhu commented Aug 6, 2024

> https://kwok.sigs.k8s.io/docs/user/metrics-configuration/
>
> These are configurable, and you must configure the metrics as you see fit.
>
> /remove-kind bug /kind support

Okay, I will try.

@conghuhu
Author

@wzshiming May I ask whether, according to that document, this metric is simulated or real? I would like to expose the real metric.

@wzshiming
Member

There are only simulated metrics. What exactly do you mean by real metrics?

@conghuhu
Author

> There are only simulated metrics. What exactly do you mean by real metrics?

I want to track how long it takes for pods on the Kwok node to reach the scheduled state. I am currently using the kube_pod_status_scheduled_time metric provided by kube-state-metrics, but no data for this metric is collected for pods on the Kwok node.
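
As far as I understand, kube-state-metrics derives kube_pod_status_scheduled_time from the lastTransitionTime of the pod's PodScheduled condition when that condition is True. Under that assumption, the same latency can be read straight from the API server, which also shows whether pods on the Kwok node carry the condition at all. This is a rough sketch that assumes GNU date and a working kubectl context; seconds_between and scheduling_latency are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: per-pod scheduling latency taken directly from pod objects.
# Assumptions: GNU date (for `date -d`) and kubectl pointing at the cluster.

# seconds_between <start> <end>: whole seconds between two RFC 3339 timestamps.
seconds_between() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( end - start ))
}

# scheduling_latency [namespace]: print "<pod> <seconds>" for scheduled pods.
scheduling_latency() {
  local ns="${1:-default}"
  # Emit name|creationTimestamp|PodScheduled status|PodScheduled transition time.
  kubectl get pods -n "$ns" -o jsonpath='{range .items[*]}{.metadata.name}{"|"}{.metadata.creationTimestamp}{"|"}{.status.conditions[?(@.type=="PodScheduled")].status}{"|"}{.status.conditions[?(@.type=="PodScheduled")].lastTransitionTime}{"\n"}{end}' |
    while IFS='|' read -r name created status scheduled; do
      # Skip pods whose PodScheduled condition is missing or not True.
      [ "$status" = "True" ] || continue
      echo "$name $(seconds_between "$created" "$scheduled")"
    done
}

# Usage:
#   scheduling_latency default
```

Kubernetes timestamps have one-second resolution, so very fast scheduling will show up as 0. A pod on the Kwok node that is missing from this output has no PodScheduled=True condition, which would be worth checking before looking at the metrics pipeline.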

@wzshiming
Member

You also need to use kube-state-metrics to provide the metrics you need.

@conghuhu
Author

kube-state-metrics is already being scraped. When I deploy nginx on a non-kwok node, the metric is collected; when I deploy it on a kwok node, it is not.
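
One way to narrow this down is to read the kube-state-metrics endpoint directly, which separates "kube-state-metrics does not emit the series" from "Prometheus does not scrape it". This is a sketch under assumptions: kube-state-metrics is port-forwarded to localhost:8080, the Service name in the comment is guessed from the Helm release name, and metric_samples, ksm_pods_with_metric, and KSM_URL are hypothetical names.

```shell
#!/bin/bash
# Sketch: list the pods for which kube-state-metrics exposes a given metric.
# Assumption: kube-state-metrics is reachable on localhost:8080, for example via
#   kubectl -n monitoring port-forward svc/kube-prometheus-stack-kube-state-metrics 8080:8080
# (Service name and port are assumed, not confirmed by the report).

KSM_URL="${KSM_URL:-http://localhost:8080}"

# metric_samples <name>: filter text-format metrics on stdin down to the
# sample lines of exactly that metric.
metric_samples() {
  # Sample lines start with the metric name followed by "{" or a space, so
  # "# HELP"/"# TYPE" lines and longer or shorter metric names do not match.
  grep -E "^$1(\{| )"
}

# ksm_pods_with_metric <name>: print the distinct pod="..." labels seen.
ksm_pods_with_metric() {
  curl -sS "${KSM_URL}/metrics" | metric_samples "$1" | grep -o 'pod="[^"]*"' | sort -u
}

# Usage (needs the port-forward above to be running):
#   ksm_pods_with_metric kube_pod_status_scheduled_time
```

If the nginx pods on the non-kwok node are listed and the ones on the kwok node are not, the series is already missing at kube-state-metrics rather than being lost in Prometheus.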

@wzshiming
Member

Ahh, it seems that kwok needs to be adapted to kube-state-metrics

@conghuhu
Author

Excuse me, how long will it take to adapt?

@wzshiming
Member

This requires kube-state-metrics to adapt to kwok: it would need to support collecting from multiple nodes simulated by kwok.

https://github.com/kubernetes/kube-state-metrics/blob/f8aa7d9bb9d8e29876e19f4859391a54a7e61d63/pkg/app/server.go#L228-L233

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 10, 2024