Downstream KojiBuild state not handled correctly #2503

Closed
1 of 2 tasks
xsuchy opened this issue Aug 17, 2024 · 9 comments · Fixed by #2559
Labels
area/fedora Related to Fedora ecosystem complexity/single-task Regular task, should be done within days. gain/high This brings a lot of value to (not strictly a lot of) users. impact/low This issue impacts only a few users. kind/bug Something isn't working.

Comments

@xsuchy
Contributor

xsuchy commented Aug 17, 2024

What happened? What is the problem?

I discovered that a Bodhi update is sometimes not filed by Packit, despite Packit being correctly configured.

I did an upstream release of fedora-upgrade and submitted it to Fedora using tito release fedora-git, but I forgot to add --no-build. This uploaded the src.rpm to all branches in dist-git and triggered Koji builds. The builds were submitted by both Packit and Tito, as you can see in https://koji.fedoraproject.org/koji/packageinfo?packageID=15136
Packit was first for F40 and Tito everywhere else. Whoever comes second is not allowed to proceed, as you can see in the F39 build https://koji.fedoraproject.org/koji/taskinfo?taskID=122080964 submitted by Packit. Although it failed, Packit reports in the dashboard that the build is still pending: https://dashboard.packit.dev/results/koji-builds/6850

And what is important is that the F39 Bodhi update was not filed, even though the condition was met: there is a successful build https://koji.fedoraproject.org/koji/buildinfo?buildID=2531586

What did you expect to happen?

I expected a Bodhi update for F39 to be created.

Example URL(s)

No response

Steps to reproduce

1. fedpkg import foo.src.rpm
2. fedpkg build (be quicker than Packit)
3. Bodhi update is not created.

What is the impacted category (job)?

Fedora release automation

Workaround

  • There is an existing workaround that can be used until this issue is fixed.

Participation

  • I am willing to submit a pull request for this issue. (Packit team is happy to help!)
@xsuchy xsuchy added the kind/bug Something isn't working. label Aug 17, 2024
@nforro
Member

nforro commented Aug 19, 2024

And what is important is that the F39 Bodhi update was not filed, even though the condition was met: there is a successful build https://koji.fedoraproject.org/koji/buildinfo?buildID=2531586

There are however no allowed builders configured: https://src.fedoraproject.org/rpms/fedora-upgrade/blob/2e7791f16bbea3aedf43c81dfb8db3f417a43f02/f/packit.yaml#_18
So the only account whose build can trigger the update is packit.

@mfocko mfocko added area/fedora Related to Fedora ecosystem complexity/single-task Regular task, should be done within days. labels Aug 19, 2024
@mfocko mfocko added impact/low This issue impacts only a few users. gain/high This brings a lot of value to (not strictly a lot of) users. labels Aug 19, 2024
@lbarcziova lbarcziova changed the title Race condition with KojiBuild Downstream KojiBuild state not handled correctly Aug 19, 2024
@lbarcziova
Member

lbarcziova commented Aug 19, 2024

During the stand-up meeting, we agreed that this is expected behaviour. However, the build failure should be correctly handled/reflected in the DB (and shown on the dashboard). Besides that, we may think about a way to notify users about the failures (e.g. via mail), which is somewhat related to #2404.

@lbarcziova
Member

From an initial investigation, it looks like this happens for all builds duplicated by us, i.e. where the build triggered by us fails because the build is already in progress/completed (triggered by someone else). Koji doesn't seem to emit messages on the message bus for those (which we rely on for updating the status).

@nforro
Member

nforro commented Oct 2, 2024

This is actually quite similar to packit/packit#2427. If there is a way with the Koji API to determine whether a non-scratch build for an NVR is in progress, we could skip/fail before submitting the build.

@lbarcziova
Member

Yes, sounds good, my only concern especially for this issue would be the race conditions.

@nforro
Member

nforro commented Oct 2, 2024

if there is a way with Koji API to determine whether a non-scratch build for a NVR is in progress

We have KojiHelper.get_build_info() that accepts a NVR.

my only concern especially for this issue would be the race conditions

Hm, true, I'm not sure if there even is a way to avoid them.

@lbarcziova
Member

For this particular issue, I was thinking we could have a "babysit" task to just handle the forever-pending builds (which might be overkill), but for packit/packit#2427 there is probably no way.

I would start here by adding the check of KojiHelper.get_build_info() and see how often we will still hit a race condition.
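
For illustration only, the "babysit" idea mentioned above could look roughly like the sketch below. Everything in it is hypothetical: `BuildRecord`, `babysit_pending_builds`, the injected `get_task_state` lookup, the one-hour threshold, and the mapping of task states to stored statuses are assumptions, not Packit code. The state names are assumed to follow Koji's task state names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List

# Assumed mapping from terminal Koji task states to the status kept in the DB.
TERMINAL_STATES = {"CLOSED": "success", "FAILED": "failure", "CANCELED": "failure"}


@dataclass
class BuildRecord:
    """Hypothetical stand-in for a stored Koji build row."""

    task_id: int
    status: str
    submitted_at: datetime


def babysit_pending_builds(
    records: Iterable[BuildRecord],
    get_task_state: Callable[[int], str],  # hypothetical lookup, e.g. backed by the Koji API
    now: datetime,
    max_age: timedelta = timedelta(hours=1),
) -> List[BuildRecord]:
    """Refresh builds that have been pending longer than max_age; return the updated ones."""
    updated = []
    for record in records:
        # Leave finished builds and recently submitted ones alone.
        if record.status != "pending" or now - record.submitted_at < max_age:
            continue
        new_status = TERMINAL_STATES.get(get_task_state(record.task_id))
        if new_status is not None:  # task is still running otherwise
            record.status = new_status
            updated.append(record)
    return updated
```

Such a task would only paper over missing messages after the fact, which is why the pre-submission check is the cheaper first step.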

@nforro
Member

nforro commented Oct 2, 2024

Do we want to fix both issues at the same time? The only difference should be checking if KojiBuildState(KojiHelper().get_build_info(nvr)["state"]) returns KojiBuildState.building or KojiBuildState.complete.
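
A minimal, self-contained sketch of that check follows. The local `KojiBuildState` enum is an assumption that mirrors Koji's numeric build states, `get_build_info` is injected as a plain callable standing in for `KojiHelper().get_build_info`, and `should_skip_build` and the `foo-...` NVRs are hypothetical names used only for illustration.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class KojiBuildState(Enum):
    # Assumed to mirror Koji's numeric build states.
    building = 0
    complete = 1
    deleted = 2
    failed = 3
    canceled = 4


def should_skip_build(
    nvr: str,
    get_build_info: Callable[[str], Optional[dict]],  # stand-in for KojiHelper().get_build_info
    skip_states: Tuple[KojiBuildState, ...] = (
        KojiBuildState.building,
        KojiBuildState.complete,
    ),
) -> bool:
    """Return True if a build for `nvr` already exists in one of `skip_states`."""
    info = get_build_info(nvr)
    if not info:
        # No build known for this NVR, so it is safe to submit one.
        return False
    return KojiBuildState(info["state"]) in skip_states


# Hypothetical data standing in for what Koji would return.
fake_koji = {
    "foo-1.0-1.fc39": {"state": 0},  # build in progress
    "foo-1.0-1.fc40": {"state": 1},  # build completed
    "foo-1.0-1.fc41": {"state": 3},  # build failed, a retry is fine
}

print(should_skip_build("foo-1.0-1.fc39", fake_koji.get))
print(should_skip_build("foo-1.0-1.fc41", fake_koji.get))
```

Passing a narrower `skip_states` tuple lets the same helper cover either of the two cases separately. The check does not remove the race window between the lookup and the submission; it only narrows it.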

@lbarcziova
Member

lbarcziova commented Oct 3, 2024

We agreed on implementing the solution from packit/packit#2427 here, which should solve both issues.

lbarcziova added a commit to lbarcziova/packit that referenced this issue Oct 3, 2024
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Oct 4, 2024
lbarcziova added a commit to lbarcziova/packit that referenced this issue Oct 4, 2024
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Oct 4, 2024
lbarcziova added a commit to packit/packit that referenced this issue Oct 4, 2024
Related to packit/packit-service#2503


RELEASE NOTES BEGIN

N/A

RELEASE NOTES END
lbarcziova added a commit to lbarcziova/packit-service that referenced this issue Oct 4, 2024
softwarefactory-project-zuul bot added a commit that referenced this issue Oct 4, 2024
Skip running Koji builds if they are triggered already

Fixes #2503
Requires packit/packit#2435

TODO:

  • test it

RELEASE NOTES BEGIN
Before triggering non-scratch Koji builds, we now check whether a build for the same NVR is already in progress or completed.
RELEASE NOTES END

Reviewed-by: Nikola Forró
Reviewed-by: Laura Barcziová
Projects
Status: done
4 participants