libimage: fix manifest race for AnnotateInstance() and RemoveInstance() by Luap99 · Pull Request #2504 · containers/common

Luap99 · 2025-08-07T17:02:26Z

Neither AnnotateInstance() nor RemoveInstance() uses the lock, so we were still open to many races.

The reason this should fix the podman manifest add race is that podman manifest add also always calls AnnotateInstance().

Fixes: https://issues.redhat.com/browse/RHEL-21291

Also includes two small commits to reduce the overhead of reload a bit.

Summary by Sourcery

Prevent races in manifest instance annotation and removal by acquiring a per-image lock and reloading the manifest list, and refactor reload logic into a shared helper.

Bug Fixes:

AnnotateInstance and RemoveInstance now acquire a per-image lock and reload before modifying the manifest list to avoid concurrent races.

Enhancements:

Extract reloadID helper to consolidate and optimize manifest list reload logic.
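
To illustrate why both the lock and the reload matter, here is a minimal, self-contained sketch. It is not the libimage code: `store`, `handle`, `openHandle`, `addStale`, and `addLocked` are hypothetical names, and a plain `sync.Mutex` stands in for the per-image lock. The first half shows a stale in-memory copy overwriting another writer's update; the second half shows the lock, reload, mutate, save order keeping both updates.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for persistent storage shared by all handles.
type store struct {
	mu        sync.Mutex // stand-in for the per-image lock
	instances []string   // persisted manifest list contents
}

// handle is a hypothetical stand-in for an in-memory manifest list
// that keeps a cached copy of what was persisted when it was opened.
type handle struct {
	st     *store
	cached []string
}

func openHandle(st *store) *handle {
	h := &handle{st: st}
	h.reload()
	return h
}

// reload replaces the cached copy with what is currently persisted.
func (h *handle) reload() {
	h.cached = append([]string(nil), h.st.instances...)
}

// save persists the cached copy, overwriting whatever was stored.
func (h *handle) save() {
	h.st.instances = append([]string(nil), h.cached...)
}

// addStale mutates the cached copy and saves it without locking or reloading.
func (h *handle) addStale(inst string) {
	h.cached = append(h.cached, inst)
	h.save()
}

// addLocked takes the lock, reloads, mutates, and then saves.
func (h *handle) addLocked(inst string) {
	h.st.mu.Lock()
	defer h.st.mu.Unlock()
	h.reload()
	h.cached = append(h.cached, inst)
	h.save()
}

func main() {
	racy := &store{}
	a, b := openHandle(racy), openHandle(racy) // both cache the same empty list
	a.addStale("amd64")
	b.addStale("arm64") // b's stale cache overwrites a's update
	fmt.Println("without lock+reload:", racy.instances)

	safe := &store{}
	c, d := openHandle(safe), openHandle(safe)
	c.addLocked("amd64")
	d.addLocked("arm64") // d reloads first, so both entries survive
	fmt.Println("with lock+reload:", safe.instances)
}
```

The sketch runs the two writers one after the other so the lost update is deterministic; with real concurrent callers the same overwrite would only happen some of the time, which is what makes such races hard to spot.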

sourcery-ai · 2025-08-07T17:02:32Z

Reviewer's Guide

Added manifest-specific locks in AnnotateInstance and RemoveInstance to avoid races and refactored reload logic by extracting reloadID to streamline saveAndReload and reload methods.

Sequence diagram for manifest list mutation with locking

sequenceDiagram
participant Caller
participant ManifestList
participant Locker
participant Storage
Caller->>ManifestList: AnnotateInstance/RemoveInstance(d, options)
ManifestList->>Locker: LockerForImage(store, ID)
Locker->>ManifestList: lock()
ManifestList->>Storage: reload()
Storage-->>ManifestList: latest data
ManifestList->>ManifestList: mutate manifest list
Locker->>ManifestList: unlock()
ManifestList-->>Caller: result

Class diagram for ManifestList reload logic refactor and locking

classDiagram
class ManifestList {
  +saveAndReload()
  +reload()
  +reloadID(listID string)
  +AnnotateInstance(d, options)
  +RemoveInstance(d)
  image
  list
}
class Locker {
  +Lock()
  +Unlock()
}
ManifestList --> Locker : uses
ManifestList --> "1" image : has
ManifestList --> "1" list : has

Class diagram for Locker usage in AnnotateInstance and RemoveInstance

classDiagram
class ManifestList {
  +AnnotateInstance(d, options)
  +RemoveInstance(d)
}
class Locker {
  +Lock()
  +Unlock()
}
ManifestList --> Locker : acquires lock

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Embed manifest-specific locking and reload in AnnotateInstance to prevent concurrent modifications | Acquire locker for image; reload manifest list inside lock before setting annotations; unlock after modifications | `libimage/manifest_list.go` |
| Embed manifest-specific locking and reload in RemoveInstance to prevent concurrent modifications | Acquire locker for image; reload manifest list inside lock before removal; unlock after modifications | `libimage/manifest_list.go` |
| Refactor reload logic by extracting reloadID | Extract reloadID method; replace inline reload code in saveAndReload with reloadID; replace inline reload code in reload with reloadID | `libimage/manifest_list.go` |
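
The reloadID refactor described above can be pictured with a small sketch. The diff itself is not shown on this page, so everything here is an assumption made for illustration: `backend`, `manifestList`, and `save` are hypothetical, and only the names `reload`, `saveAndReload`, and `reloadID(listID string)` come from the reviewer's guide. The point is just that both entry points end in the same "refresh from storage by ID" step, so that step can live in one helper.

```go
package main

import "fmt"

// backend is a hypothetical stand-in for image storage keyed by list ID.
type backend map[string][]string

// manifestList is a hypothetical, simplified stand-in for the real type.
type manifestList struct {
	be        backend
	id        string
	instances []string
}

// reloadID refreshes the in-memory state from the entry stored under listID.
// Both reload and saveAndReload delegate to it.
func (m *manifestList) reloadID(listID string) error {
	stored, ok := m.be[listID]
	if !ok {
		return fmt.Errorf("manifest list %q not found", listID)
	}
	m.id = listID
	m.instances = append([]string(nil), stored...)
	return nil
}

// reload refreshes the state for the list's current ID.
func (m *manifestList) reload() error {
	return m.reloadID(m.id)
}

// save persists the in-memory state and returns the ID it was stored under.
func (m *manifestList) save() string {
	m.be[m.id] = append([]string(nil), m.instances...)
	return m.id
}

// saveAndReload persists the state, then refreshes it using the returned ID.
func (m *manifestList) saveAndReload() error {
	return m.reloadID(m.save())
}

func main() {
	be := backend{"list1": {"amd64"}}
	m := &manifestList{be: be, id: "list1"}
	if err := m.reload(); err != nil {
		fmt.Println("reload failed:", err)
		return
	}
	m.instances = append(m.instances, "arm64")
	if err := m.saveAndReload(); err != nil {
		fmt.Println("saveAndReload failed:", err)
		return
	}
	fmt.Println(m.instances, be["list1"])
}
```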

mheon · 2025-08-07T20:48:48Z

I will test tomorrow morning to verify this fixes the race on my end

mheon · 2025-08-11T18:59:31Z

Tested locally. Without the patch, I consistently see 2-3 failures when creating 100 manifests in a tight loop. With the patch, I see 0 failures.
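
For readers who want to see this kind of check in miniature, the following is a toy, in-process analog. It is not the test that was run above, which exercised podman itself: `store`, `add`, and `lostUpdates` are hypothetical, goroutines stand in for concurrent callers, and a mutex stands in for the per-image lock. It counts how many of `n` concurrent additions go missing with and without a lock held across the whole read-modify-write.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical shared list. mu only guards single reads and
// writes; lock is the stand-in for the per-image lock.
type store struct {
	mu   sync.Mutex
	list []string
	lock sync.Mutex
}

func (s *store) load() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.list...)
}

func (s *store) save(l []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.list = append([]string(nil), l...)
}

// add performs a read-modify-write, optionally under the per-image lock.
func add(s *store, inst string, locked bool) {
	if locked {
		s.lock.Lock()
		defer s.lock.Unlock()
	}
	cur := s.load()
	cur = append(cur, inst)
	s.save(cur)
}

// lostUpdates runs n concurrent adds and reports how many entries are missing.
func lostUpdates(n int, locked bool) int {
	s := &store{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			add(s, fmt.Sprintf("inst-%d", i), locked)
		}(i)
	}
	wg.Wait()
	return n - len(s.load())
}

func main() {
	// The unlocked count varies from run to run and may be zero.
	fmt.Println("lost without lock:", lostUpdates(100, false))
	fmt.Println("lost with lock:", lostUpdates(100, true))
}
```

With the lock held, every writer sees the previous writer's result, so no entry can go missing; without it, the number of missing entries depends on scheduling.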

mheon · 2025-08-11T19:00:05Z

LGTM

sourcery-ai

Hey @Luap99 - I've reviewed your changes - here's some feedback:

Refactor the repeated lock acquisition and reload boilerplate in AnnotateInstance and RemoveInstance into a shared helper to reduce duplication and improve maintainability.
Verify that using lookupManifestList in reloadID fully replaces the previous m.image.reload + lookup sequence to guarantee the same data freshness, since dropping the explicit reload step could introduce stale-state issues.

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `libimage/manifest_list.go:661` </location>
<code_context>
 		return nil
 	}

+	locker, err := manifests.LockerForImage(m.image.runtime.store, m.ID())
+	if err != nil {
+		return err
</code_context>

<issue_to_address>
Consider refactoring the repeated lock, reload, and unlock logic into a helper method to reduce boilerplate and improve maintainability.

Here’s one way to collapse all of the repeated “lock, reload, defer‐unlock” boilerplate into a small helper.  You still get the same behavior (locking + initial reload) and leave both `saveAndReload` and `reload` untouched.

```go
// add this near the top of manifest_list.go

// withLockAndReload grabs the image‐specific mutex, reloads the
// in‐memory state, then runs `action`.  Unlock happens via defer.
func (m *ManifestList) withLockAndReload(action func() error) error {
    locker, err := manifests.LockerForImage(m.image.runtime.store, m.ID())
    if err != nil {
        return err
    }
    locker.Lock()
    defer locker.Unlock()

    if err := m.reload(); err != nil {
        return err
    }
    return action()
}
```

Then reduce your two methods to:

```go
func (m *ManifestList) AnnotateInstance(d digest.Digest, options *ManifestListAnnotateOptions) error {
    if options == nil {
        return nil
    }
    return m.withLockAndReload(func() error {
        if len(options.OS) > 0 {
            if err := m.list.SetOS(d, options.OS); err != nil {
                return err
            }
        }
        if len(options.OSVersion) > 0 {
            if err := m.list.SetOSVersion(d, options.OSVersion); err != nil {
                return err
            }
        }
        return m.saveAndReload()
    })
}

func (m *ManifestList) RemoveInstance(d digest.Digest) error {
    return m.withLockAndReload(func() error {
        if err := m.list.Remove(d); err != nil {
            return err
        }
        return m.saveAndReload()
    })
}
```

This removes the duplicated lock/unlock+reload in each method while preserving all original functionality.
</issue_to_address>

sourcery-ai · 2025-08-11T19:04:15Z

 		return nil
 	}

+	locker, err := manifests.LockerForImage(m.image.runtime.store, m.ID())


issue (complexity): Consider refactoring the repeated lock, reload, and unlock logic into a helper method to reduce boilerplate and improve maintainability.

openshift-ci · 2025-08-11T19:04:21Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99, sourcery-ai[bot]

Needs approval from an approver in each of these files:

~~OWNERS~~ [Luap99]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Luap99 · 2025-08-13T14:21:11Z

@containers/podman-maintainers PTAL

baude · 2025-08-14T14:35:45Z

/lgtm

Luap99 added 3 commits August 7, 2025 18:38

libimage: add missing manifest lock

4fffcf3

Neither AnnotateInstance() nor RemoveInstance() uses the lock, so we were still open to many races.

Fixes: https://issues.redhat.com/browse/RHEL-21291

Signed-off-by: Paul Holzinger <pholzing@redhat.com>

libimage: dedup manifest reload() and saveAndReload()

7da5c7a

Share code as they do the same thing.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>

libimage: remove extra image reload in manifest reload

18a0916

This does nothing; it is just overhead. The lookupManifestList() call returns a new image, which we then use, and it already calls image.reload() internally.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
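
The reasoning in this last commit can be sketched with a toy model that counts storage reads. All names here (`storage`, `image`, `lookup`, `refreshWithExtraReload`, `refresh`) are hypothetical; the only assumption taken from the commit message is that the lookup step returns a new object and reloads it internally, which makes a preceding reload on the old object redundant.

```go
package main

import "fmt"

// storage is a hypothetical backing store that counts how often it is read.
type storage struct {
	reads int
	data  map[string]string
}

// image is a hypothetical in-memory object backed by storage.
type image struct {
	st   *storage
	id   string
	meta string
}

// reload refreshes this image object from storage.
func (i *image) reload() {
	i.st.reads++
	i.meta = i.st.data[i.id]
}

// lookup returns a fresh image object and reloads it internally.
func lookup(st *storage, id string) *image {
	img := &image{st: st, id: id}
	img.reload()
	return img
}

// refreshWithExtraReload reloads the old object and then replaces it anyway.
func refreshWithExtraReload(old *image) *image {
	old.reload() // wasted: the returned image comes from lookup
	return lookup(old.st, old.id)
}

// refresh relies on the reload that lookup already performs.
func refresh(old *image) *image {
	return lookup(old.st, old.id)
}

func main() {
	st := &storage{data: map[string]string{"img1": "v1"}}
	img := lookup(st, "img1")

	st.reads = 0
	a := refreshWithExtraReload(img)
	fmt.Println("reads with extra reload:", st.reads, "meta:", a.meta)

	st.reads = 0
	b := refresh(img)
	fmt.Println("reads without extra reload:", st.reads, "meta:", b.meta)
}
```

Both variants return an object with the same contents; the second simply reads storage once instead of twice.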

openshift-ci Bot added the do-not-merge/work-in-progress label Aug 7, 2025

openshift-ci Bot added the approved label Aug 7, 2025

Luap99 marked this pull request as ready for review August 11, 2025 19:03

openshift-ci Bot removed the do-not-merge/work-in-progress label Aug 11, 2025

sourcery-ai Bot approved these changes Aug 11, 2025

openshift-ci Bot assigned baude Aug 14, 2025

openshift-ci Bot added the lgtm label Aug 14, 2025

openshift-merge-bot Bot merged commit 1c68196 into containers:main Aug 14, 2025
14 checks passed

Luap99 deleted the manifest-race branch August 14, 2025 14:54
