Only mark timers to update #3
base: main
Conversation
Force-pushed from 5e1292c to 8092b80
Please add some more detail to the second part:
I don't fully understand this.
let timer_mask = self
    .timers
    .timers_to_update_mask
    .load(std::sync::atomic::Ordering::Relaxed);
for timer_name in TimerName::VALUES {
    if (timer_mask & (1 << (timer_name as u16))) != 0 {
        self.timer_tick(timer_name);
    }
}
// Reset all marked bits
self.timers
    .timers_to_update_mask
    .store(0, std::sync::atomic::Ordering::Relaxed);
I think this will miss mark_timer_to_update calls that happen during the for loop. I think you want to call let timer_mask = self.timers.timers_to_update_mask.swap(0, Relaxed) at the beginning instead.
If I'm correct, please also add a test that could catch this issue.
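A rough, self-contained sketch of this suggestion (the Timers struct and the 16-timer loop below are simplified stand-ins, not the actual NepTUN types): swap(0, Relaxed) reads and clears the mask in a single atomic step, so any bit set while the loop is still running survives until the next tick.

use std::sync::atomic::{AtomicU16, Ordering};

// Simplified stand-in for the real Timers struct.
struct Timers {
    timers_to_update_mask: AtomicU16,
}

fn tick_marked_timers(timers: &Timers) {
    // Read and clear atomically; marks arriving after this point survive
    // until the next call instead of being wiped by a later store(0).
    let timer_mask = timers.timers_to_update_mask.swap(0, Ordering::Relaxed);
    for timer_index in 0..16u16 {
        if timer_mask & (1 << timer_index) != 0 {
            // The real code would call self.timer_tick(timer_name) here.
            println!("ticking timer {timer_index}");
        }
    }
}

fn main() {
    let timers = Timers { timers_to_update_mask: AtomicU16::new(0b101) };
    tick_marked_timers(&timers); // ticks timers 0 and 2
}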
Changed and added test
I don't understand how this test verifies the fix for the above issue. The issue happens only when the for loop (now in tick_marked_timers) runs in parallel with mark_timer_to_update. Also, reverting the swap change in the implementation doesn't break the test - please add a test that fails with the old version of the code but passes with the new one.
Yes, but because it is a bit difficult to run them in parallel in a way that reliably reproduces the issue, I decided to mark the timer after reading the bit mask. That timer should be updated in the next cycle of update_timers, but in the previous version that update was being missed.
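The idea can be demonstrated in a self-contained form roughly like the following (an illustrative sketch, not the actual test added in the PR): a mark that arrives after the mask has been drained with swap is kept for the next cycle, whereas the old load-then-store(0) sequence wipes it.

use std::sync::atomic::{AtomicU16, Ordering};

#[test]
fn mark_set_after_draining_the_mask_is_not_lost() {
    // New behaviour: swap reads and clears atomically.
    let mask = AtomicU16::new(0b0001); // timer 0 already marked
    let drained = mask.swap(0, Ordering::Relaxed);
    assert_eq!(drained, 0b0001);

    // A packet marks timer 1 while the tick loop is still running.
    mask.fetch_or(0b0010, Ordering::Relaxed);

    // The late mark is still there for the next tick cycle.
    assert_eq!(mask.load(Ordering::Relaxed), 0b0010);

    // Old behaviour for comparison: load, then store(0) afterwards.
    let mask = AtomicU16::new(0b0001);
    let _drained = mask.load(Ordering::Relaxed);
    mask.fetch_or(0b0010, Ordering::Relaxed); // late mark
    mask.store(0, Ordering::Relaxed);         // wipes the late mark too
    assert_eq!(mask.load(Ordering::Relaxed), 0); // the update is lost
}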
+1.0
my_tun.update_timers(&mut [0]);

// Only those timers marked should be updated
assert!(!my_tun.timers[TimerName::TimeLastDataPacketSent].is_zero());
NIT: use assert_eq!(timers[x], timers[TimeCurrent])
Force-pushed from 4f1d22c to 4ba9f5e
Performance test from base
Performance of current branch
I've done a ton of testing yesterday on my setup (two physical machines connected directly via cable). Unsurprisingly, the most consistent results come from measuring only one-way UDP traffic. On multiple 6-minute runs I got an average of 684 Mbps with this branch vs 662 Mbps with the main branch. That looks like a 3% improvement, which is nice.
@jjanowsk I think this amount is good enough, because I would expect only a marginal gain, if any, from this change. Either way, this change is needed for future improvements.
Force-pushed from 4326a1f to e64f7d8
@@ -82,6 +96,7 @@ pub struct Timers {
    persistent_keepalive: usize,
    /// Should this timer call reset rr function (if not a shared rr instance)
    pub(super) should_reset_rr: bool,
    timers_to_update_mask: AtomicU16,
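For context, the marking side of such a bitmask is typically a single non-blocking fetch_or; the mark_timer_to_update below is an assumed sketch based on the discussion, not code copied from the PR.

use std::sync::atomic::{AtomicU16, Ordering};

struct Timers {
    timers_to_update_mask: AtomicU16,
}

impl Timers {
    // Hot path: request that one timer be ticked on the next update_timers.
    // fetch_or never blocks, which is the point of using AtomicU16 here.
    fn mark_timer_to_update(&self, timer_index: u16) {
        self.timers_to_update_mask
            .fetch_or(1 << timer_index, Ordering::Relaxed);
    }
}

fn main() {
    let timers = Timers { timers_to_update_mask: AtomicU16::new(0) };
    timers.mark_timer_to_update(4);
    assert_eq!(timers.timers_to_update_mask.load(Ordering::Relaxed), 1 << 4);
}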
do we need AtomicU16 here? can we use instead the parking_lot::Mutex<u16> or even parking_lot::Mutex<[bool; 8]>?
So this is the main area that should be bringing improvement; with a Mutex we would just be switching from one mutex to another.
Now we fundamentally move away from any waiting on locks.
This change is adding an AtomicU16, not removing any mutex.
Oh right, it doesn't take self as &self but as &mut self.
We don't even need a Mutex or an Atomic then, really; a plain u16 would be enough.
Force-pushed from e64f7d8 to d4e74e9
More updates from the measurements on the CI. When run in the CI on 15-minute runs of one-way UDP traffic, the gains are not significant: there is less than 0.5% difference between main and this branch, with sometimes main being faster and sometimes this branch. In total I've run 4 pipelines like this.
In that case, maybe we can close this PR?
Timers are updated on every single packet call, but their accuracy is limited to 250 ms, which is when the current time is updated. So just mark timers to update, and update them only once in update_timers.
This PR is part 1 of several PRs to revamp the NepTUN noise logic. The next PR would remove the lock from peer.tunnel and introduce interior locks on tunnel members instead. In the hot path we would only be reading the session, so there is no need to lock the entire tunnel for that. Other members of the tunnel are either lockable or can be made atomic. Locking the timers internally is not possible, as it could lead to a deadlock: update_timers also tries to lock() the timers.
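As a generic illustration of that deadlock concern (hypothetical types, not NepTUN code): with a non-reentrant std::sync::Mutex, a method that already holds an interior timers lock and then calls update_timers, which tries to lock the same timers again, blocks forever.

use std::sync::Mutex;

struct Tunnel {
    timers: Mutex<u16>, // hypothetical interior lock around timer state
}

impl Tunnel {
    fn update_timers(&self) {
        // Second lock attempt; fine on its own, fatal if the caller
        // already holds the lock.
        let _timers = self.timers.lock().unwrap();
        // ... tick timers ...
    }

    #[allow(dead_code)]
    fn handle_packet(&self) {
        let _timers = self.timers.lock().unwrap(); // first lock is held here
        // std::sync::Mutex is not reentrant, so this call never returns:
        self.update_timers();
    }
}

fn main() {
    let tunnel = Tunnel { timers: Mutex::new(0) };
    tunnel.update_timers(); // safe: no lock is held yet
    // tunnel.handle_packet(); // would deadlock, which is why the timers
    //                         // cannot simply get their own interior lock
}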