etcd-io Infra and CI Migration #6102

Open
8 of 14 tasks
upodroid opened this issue Nov 19, 2023 · 12 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@upodroid
Member

upodroid commented Nov 19, 2023

etcd is now a subproject of Kubernetes and etcd maintainers are looking to adopt the CI system and Infra management approach we use for Kubernetes.

GitHub

  1. approved area/github-management cncf-cla: yes lgtm sig/etcd size/L
    jmhbnz

Infra

  1. approved area/infra area/infra/gcp area/prow area/terraform cncf-cla: yes lgtm sig/k8s-infra sig/testing size/M size/S tide/merge-method-squash
    ameukam dims
  2. approved area/access area/groups cncf-cla: yes lgtm sig/k8s-infra size/M
    MadhavJivrajani cblecker
    nikhita upodroid

Testing

  1. approved area/config area/jobs area/testgrid cncf-cla: yes lgtm sig/testing size/L
    ahrtr serathius
  2. approved area/config area/jobs cncf-cla: yes lgtm sig/etcd sig/testing size/M
    ahrtr serathius
  3. kind/feature sig/testing
  4. approved area/images area/release-eng cncf-cla: yes lgtm sig/release sig/testing size/L
    dims jbpratt

If I missed something, feel free to comment on the issue and I'll update the tracker.

/cc @jmhbnz @serathius @wenjiaswe @mrbobbytables @ahrtr @ameukam @BenTheElder

/sig etcd
/sig testing
/priority important-soon
/kind feature

@upodroid upodroid added the sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. label Nov 19, 2023
@k8s-ci-robot k8s-ci-robot added sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/testing Categorizes an issue or PR as relevant to SIG Testing. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/feature Categorizes issue or PR as related to a new feature. labels Nov 19, 2023
@upodroid upodroid self-assigned this Nov 19, 2023
@serathius
Contributor

serathius commented Nov 22, 2023

Have we stopped running presubmits? I no longer see them on GitHub PRs, and https://testgrid.k8s.io/sig-etcd-presubmits seems empty. There is nothing in https://prow.k8s.io/?repo=etcd-io%2Fetcd either.

@upodroid
Member Author

There is a thread in #testing-ops on Slack to investigate this issue
https://kubernetes.slack.com/archives/C7J9RP96G/p1700688510160169

@serathius
Contributor

Interesting flake in unit test:

=== FAIL: storage/schema TestMigrate/Upgrading_3.6_to_v3.7_is_not_supported (0.01s)
    logger.go:130: 2023-11-23T15:59:05.436Z	WARN	failed to preallocate an initial WAL file	{"path": "/tmp/TestMigrateUpgrading_3.6_to_v3.7_is_not_supported2935238209/002/etcd_wal_test4014423837/wal.tmp/0000000000000000-0000000000000000.wal", "segment-bytes": 64000000, "error": "no space left on device"}
    schema_test.go:207: Failed to create WAL: no space left on device
    --- FAIL: TestMigrate/Upgrading_3.6_to_v3.7_is_not_supported (0.01s)

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/etcd-io_etcd/17008/pull-etcd-unit-test/1727717754938593280

@upodroid
Member Author

Does that unit test expect the pod to have an ephemeral volume of a specific size?

@serathius
Contributor

I think the problem might stem from the etcd WAL tests. I don't think the unit tests mock storage; they just write to t.TempDir() (which should be under /tmp/ by default). WAL creation pre-allocates 64MB, so if a couple of such tests run without cleanup, we could be allocating a couple hundred megabytes.
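To make the arithmetic above concrete, here is a minimal Go sketch. It assumes the 64000000-byte segment size shown in the failing log line; the helper name `peakWALBytes` and the test counts are hypothetical and are not taken from the etcd code base.

```go
package main

import (
	"fmt"
	"os"
)

// segmentBytes is the WAL preallocation size reported in the failing log
// line ("segment-bytes": 64000000).
const segmentBytes int64 = 64_000_000

// peakWALBytes estimates how much disk is held when n WAL-creating tests
// have each preallocated one segment and none has been cleaned up yet.
// n is a hypothetical count, not a measured value.
func peakWALBytes(n int) int64 {
	return int64(n) * segmentBytes
}

func main() {
	// t.TempDir() creates its directories under os.TempDir(), which is
	// $TMPDIR if set and /tmp otherwise on Unix systems.
	fmt.Println("temp root:", os.TempDir())
	for _, n := range []int{1, 4, 8} {
		fmt.Printf("%d tests -> %d MB\n", n, peakWALBytes(n)/1_000_000)
	}
}
```

With four such tests holding a segment at once, the estimate is already 256 MB, which may be enough to exhaust a small ephemeral volume.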

@wenjiaswe

cc @siyuanfoundation

@BenTheElder
Member

We could mount an emptyDir (disk- or memory-backed) at /tmp if the etcd tests are writing to it heavily.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 10, 2024
@upodroid
Member Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 13, 2024
@serathius
Contributor

Note for robustness tests: one piece of GitHub Actions functionality was not migrated, namely the uploading of test report artifacts.

@cblecker
Member

etcd DNS was migrated in #6600.

Today I migrated the etcd Netlify site to the Kubernetes account and asked the CNCF to close the etcd Netlify account in CNCFSD-2245.

@ArkaSaha30
Member

Hello 👋
I am willing to take up some of the workflow issues for migration to ProwJobs.


8 participants