Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: (DutyScheduler) duties reset race condition #1741

Merged
merged 6 commits into from
Sep 17, 2024

Conversation

moshe-blox
Copy link
Contributor

@moshe-blox moshe-blox commented Sep 16, 2024

This PR fixes a race condition discovered by @olegshmuelov which happens right after an indices changes and re-orgs, when AttesterHandler and SyncCommitteeHandler reset their respective duty maps prior to re-fetching, during which any reads by CommitteeHandler would end up with zero duties.

The fix changes how we populate the map, opting for fetch -> set rather than reset -> fetch -> add.

This bug was first fixed in #1725 using a different approach. Since that PR contains other other improvements & fixes, then this PR doesn't close it, and we should combine both after verifying this approach works.

d.m[period][validatorIndex] = dutyDescriptor[eth2apiv1.SyncCommitteeDuty]{
duty: duty,
inCommittee: inCommittee,
func (d *SyncCommitteeDuties) Set(period uint64, duties []DutyDescriptor[eth2apiv1.SyncCommitteeDuty]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you removed the mutex lock here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@y0sher good catch, fixed
i also pushed a bunch of refactors

Comment on lines 82 to 87
d.m[epoch] = make(map[phase0.Slot]map[phase0.ValidatorIndex]StoreDuty[D])
for _, duty := range duties {
if _, ok := d.m[epoch][duty.Slot]; !ok {
d.m[epoch][duty.Slot] = make(map[phase0.ValidatorIndex]StoreDuty[D])
}
d.m[epoch][duty.Slot][duty.ValidatorIndex] = duty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can reduce lock time and reduce redundant lookups

func (d *Duties[D]) Set(epoch phase0.Epoch, duties []StoreDuty[D]) {
	newDuties := make(map[phase0.Slot]map[phase0.ValidatorIndex]StoreDuty[D])

	for _, duty := range duties {
		if _, slotExists := newDuties[duty.Slot]; !slotExists {
			newDuties[duty.Slot] = make(map[phase0.ValidatorIndex]StoreDuty[D])
		}
		newDuties[duty.Slot][duty.ValidatorIndex] = duty
	}

	d.mu.Lock()
	defer d.mu.Unlock()

	d.m[epoch] = newDuties
}

Copy link
Contributor Author

@moshe-blox moshe-blox Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olegshmuelov nice optimization, applied both 👌

Comment on lines 63 to 71
func (d *SyncCommitteeDuties) Set(period uint64, duties []StoreSyncCommitteeDuty) {
d.mu.Lock()
defer d.mu.Unlock()

if _, ok := d.m[period]; !ok {
d.m[period] = make(map[phase0.ValidatorIndex]dutyDescriptor[eth2apiv1.SyncCommitteeDuty])
}

d.m[period][validatorIndex] = dutyDescriptor[eth2apiv1.SyncCommitteeDuty]{
duty: duty,
inCommittee: inCommittee,
d.m[period] = make(map[phase0.ValidatorIndex]StoreSyncCommitteeDuty)
for _, duty := range duties {
d.m[period][duty.ValidatorIndex] = duty
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

func (d *SyncCommitteeDuties) Set(period uint64, duties []StoreSyncCommitteeDuty) {
	newDuties := make(map[phase0.ValidatorIndex]StoreSyncCommitteeDuty, len(duties))

	for _, duty := range duties {
		newDuties[duty.ValidatorIndex] = duty
	}

	d.mu.Lock()
	defer d.mu.Unlock()

	d.m[period] = newDuties
}

@moshe-blox moshe-blox merged commit ce465c8 into stage Sep 17, 2024
6 checks passed
@moshe-blox moshe-blox deleted the fix/duty-scheduler-reset branch September 17, 2024 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants