test issue #110

Open
neo-liang-sap opened this issue Mar 20, 2022 · 51 comments
Assignees
Labels
notification/pager-duty: No notification needed
priority/blocker: Needs to be resolved now, because it breaks the service
status/in-progress: Issue is in progress/work
topology/shoot: Affects Shoot clusters

Comments

@neo-liang-sap
Contributor

Which cluster is affected?

Cluster Details Dashboard Link: https://dashboard.garden.dev.k8s.ondemand.com/namespace/garden-sretestneo/shoots/sre-g-test/

What happened?

What you expected to happen?

When did it happen or started to happen?

Absolute:
Relative:

How would we reproduce it (concisely and precisely)?

Anything else we need to know?

Help us categorise this issue for faster resolution:

/area audit-logging auto-scaling backup certification control-plane cost delivery disaster-recovery documentation high-availability logging metering monitoring networking os performance quality security storage usability user-management
/component gardener dashboard documentation
/kind bug regression post-mortem
/os garden-linux suse-chost
/platform alicloud aws azure gcp converged-cloud

/priority critical

@neo-liang-sap neo-liang-sap added the topology/shoot Affects Shoot clusters label Mar 20, 2022
@gardener-robot

@neo-liang-sap No more than 5 labels permitted, but 21 labels were given.

@gardener-robot gardener-robot added labels Mar 20, 2022:
component/dashboard: Gardener Dashboard
component/documentation: Gardener Documentation
component/gardener: Gardener
kind/bug: Bug
kind/post-mortem: Bug that requires deeper analysis after immediate issues were resolved (usually after downtime)
kind/regression: Bug that hit us already in the past and that is reappearing/requires a proper solution
os/garden-linux: Related to Garden Linux OS
os/suse-chost: Related to SUSE Container Host OS
platform/alicloud: Alicloud platform/infrastructure
platform/aws: Amazon web services platform/infrastructure
platform/azure: Microsoft Azure platform/infrastructure
platform/converged-cloud: Converged Cloud (CC) platform/infrastructure
platform/gcp: Google cloud platform/infrastructure
priority/critical: Needs to be resolved soon, because it impacts users negatively
@gardener-robot

@neo-liang-sap

Shoot: sretestneo/sre-g-test v1.21.9
            created at 2021-04-14 01:04 by Neo Liang (BTP Core FP TS SRE (CHN))
            on aws in eu-west-1 with purpose development
            at 3 nodes and 28 pods and 32 API server requests/second (max of last 24h each)
Seed: garden/aws

🟢 Last Operation
description: Shoot cluster has been successfully reconciled.
lastUpdateTime: '2022-03-19T15:35:34Z'
progress: 100
state: Succeeded
type: Reconcile
🟢 Shoot Conditions

          🟢 APIServerAvailable (HealthzRequestSucceeded)
          🟢 ControlPlaneHealthy (ControlPlaneRunning)
          🟢 EveryNodeReady (EveryNodeReady)
          🟢 SystemComponentsHealthy (SystemComponentsRunning)

🟢 Seed Conditions

          🟢 AuditlogServiceAvailability (AuditlogInstanceAttached)
          🟢 GardenletReady (GardenletReady)
          🟢 ExtensionsReady (AllExtensionsReady)
          🟢 Bootstrapped (BootstrappingSucceeded)
          🟢 BackupBucketsReady (BackupBucketsAvailable)

🟠 Control Plane Pods Not Healthy
Name | Status | Age
csi-driver-controller-5cbc85bd4d (10.243.132.205) | Running, 🟢 6/6 Ready | 1 week, 3 days, 🟠 4 Restarts (Error)
shoot-dns-service-5c4bb8b5f8 (10.243.135.40) | Running, 🟢 1/1 Ready | 2 weeks, 15 hours, 🟠 4 Restarts
🟢 Worker Groups
Name | OS | Machine | Zones
worker-i3q39 (3:4 -0 +1) | gardenlinux v576.3.0 | m5.large, 80Gi gp2 | eu-west-1b
🟢 Worker Nodes All Healthy
🟢 Daemon Sets All Healthy
Name Desired Current Ready Up-To-Date Available Node Selector
apiserver-proxy 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 n/a
calico-node 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 os:linux
csi-driver-node 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 n/a
kube-proxy-worker-i3q39-v1.21.9 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 kubernetes-version:1.21.9, pool:worker-i3q39
node-exporter 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 n/a
node-problem-detector 🟢 3 🟢 3 🟢 3 🟢 3 🟢 3 n/a
🟢 System Components Pods All Healthy
🔴 Throttled Pods
Component | Container | Throttling (99th percentile)
gardener-resource-manager (in seed cluster) | gardener-resource-manager (gardener-resource-manager-5bf69cff6b-4smqw) | 🔴 85.96 %
gardener-resource-manager (in seed cluster) | gardener-resource-manager (gardener-resource-manager-5bf69cff6b-nmxkx) | 🟠 70.89 %
blackbox-exporter (in shoot cluster) | blackbox-exporter (blackbox-exporter-757c5df5d4-cd7lh) | 🟠 66.67 %
prometheus (in seed cluster) | blackbox-exporter (prometheus-0) | 🟠 50.00 %
prometheus (in seed cluster) | prometheus-config-reloader (prometheus-0) | 🟠 50.00 %
csi-driver-node (in shoot cluster) | csi-node-driver-registrar (csi-driver-node-7qjx5) | 🟠 50.00 %
🟠 No pod disruption budgets (PDB) defined

          Consequence: Workload is unprotected from voluntary disruptions such as node rolling updates
          Recommendation: Protect your workload with proper PDBs (in combination with node anti-affinity)
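
For readers following the PDB recommendation above, here is a minimal sketch of what such a budget could look like, written as a small Python helper that emits a `policy/v1` PodDisruptionBudget manifest as JSON. The helper name `build_pdb`, the PDB name `my-app-pdb`, the namespace `default`, and the label `app: my-app` are all hypothetical placeholders and do not come from this cluster; `minAvailable: 1` is likewise only an illustrative choice.

```python
import json


def build_pdb(name, namespace, match_labels, min_available=1):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict.

    All arguments are illustrative; pick values that match your own workload.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Keep at least this many matching pods running during
            # voluntary disruptions such as node rolling updates.
            "minAvailable": min_available,
            # Selects the pods this budget protects (hypothetical labels).
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    pdb = build_pdb("my-app-pdb", "default", {"app": "my-app"})
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pdb, indent=2))
```

The printed JSON can be saved to a file or piped to `kubectl apply -f -`. A budget like this only helps if the workload runs more replicas than `minAvailable`; otherwise node drains may be blocked.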

@gardener-robot gardener-robot added the status/new Issue is new and unprocessed label Mar 20, 2022
@gardener-robot

@neo-liang-sap You opened a priority issue on the weekend. Please read https://wiki.wdf.sap.corp/wiki/display/Kubernetes/Gardener+Support and https://wiki.wdf.sap.corp/wiki/display/Kubernetes/DevOps+24x7 carefully for more information on how Gardener is operated on a weekend.

@gardener-robot

@gardener-robot You have mentioned internal references in the public. Please check.

@gardener-robot gardener-robot added the notification/carbon-copy No notification needed label Mar 20, 2022
@gardener-robot

@neo-liang-sap ℹ️ Please take note of this issue.

@neo-liang-sap neo-liang-sap removed labels Mar 20, 2022:
component/dashboard: Gardener Dashboard
component/gardener: Gardener
kind/bug: Bug
kind/regression: Bug that hit us already in the past and that is reappearing/requires a proper solution
kind/post-mortem: Bug that requires deeper analysis after immediate issues were resolved (usually after downtime)
platform/alicloud: Alicloud platform/infrastructure
platform/aws: Amazon web services platform/infrastructure
@gardener-robot

@neo-liang-sap, @etiennnr This issue has not been touched since 64 days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap, @etiennnr [in-progress] This issue has not been touched since 68 days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap, @etiennnr This issue has not been touched since 70 days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap, @etiennnr This issue has not been touched since 59 work days. Please add a follow up comment and/or change the status/ label.

@neo-liang-sap
Contributor Author

/handover

@gardener-robot

@neo-liang-sap Command "/handover" failed with "AssertionError(['neo-liang-sap'])".

Additional Information
Redacted in public. Check backend logs.

@neo-liang-sap
Contributor Author

/handover @neo-liang-sap

@gardener-robot

@neo-liang-sap neo-liang-sap handover this issue to you, please take care.

@neo-liang-sap
Contributor Author

/handover

@gardener-robot

@neo-liang-sap Command "/handover" failed with "AssertionError(['neo-liang-sap'])".

Additional Information
Redacted in public. Check backend logs.

@neo-liang-sap
Contributor Author

/handover

@gardener-robot

@neo-liang-sap Command "/handover" failed with "AssertionError((['neo-liang-sap'],))".

Additional Information
Redacted in public. Check backend logs.

@neo-liang-sap
Contributor Author

/handover @neo-liang-sap

@gardener-robot

@neo-liang-sap neo-liang-sap handover this issue to you, please take care.

@neo-liang-sap
Contributor Author

/handover

@gardener-robot

@neo-liang-sap neo-liang-sap handover this ticket to you, please take care.

@neo-liang-sap
Contributor Author

/handover @neo-liang-sap

@gardener-robot

@neo-liang-sap neo-liang-sap handover this issue to you, please take care.

@neo-liang-sap
Contributor Author

/handover

@gardener-robot

@neo-liang-sap neo-liang-sap handover this ticket to you, please take care.

@gardener-robot

@neo-liang-sap This issue has not been touched since 10 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap This issue has not been touched since 21 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap This issue has not been touched since 24 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot

@neo-liang-sap This issue has not been touched since 28 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot-dev

@neo-liang-sap This issue has not been touched since 92 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot-dev

@neo-liang-sap This issue has not been touched since 97 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot-dev

@neo-liang-sap This issue has not been touched since 174 work days. Please add a follow up comment and/or change the status/ label.

@gardener-robot-dev

@neo-liang-sap This issue has not been touched since 293 work days. Please add a follow up comment and/or change the status/ label.

5 participants