-
Notifications
You must be signed in to change notification settings - Fork 387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zookeeper sequence node overflow causes perpetual re-election #730
Comments
…rflow Once sequence node is incremented beyond 2147483647, the sequence overflows into the negative space. When getting predecessors, cast the node sequence from String to int before comparing. Closes python-zk#730
…rflow Once sequence node is incremented beyond 2147483647, the sequence overflows into the negative space. When getting predecessors, cast the node sequence from String to int before comparing. Closes python-zk#730
Once sequence node is incremented beyond 2147483647, the sequence overflows into the negative space. When getting predecessors, cast the node sequence from String to int before comparing. Closes python-zk#730
Once sequence node is incremented beyond 2147483647, the sequence overflows into the negative space. When getting predecessors, cast the node sequence from String to int before comparing. Closes python-zk#730
I have created a pull request but it is currently marked with failing checks due to:
Please have a look at the PR anyway :) Thanks! |
Hi Angie,
Thank you for reporting the issue. You are however incorrect in your
approach to the resolution. Serial Number Arithmetics (
https://en.wikipedia.org/wiki/Serial_number_arithmetic) should be used to
manage sequence numbers.
For instance, with your PR, once the sequence numbers are negative, the
new contenders would always "win" since they would have a smaller integer
number than the last positive sequence.
I have/had a draft solution for the issue I had reported here
#606.
Would you be interested in testing it out?
Cheers,
…On Mon, Oct 9, 2023 at 11:02 AM Angie Xiang ***@***.***> wrote:
I have created a pull request but it is currently marked with failing
checks due to:
- failing coverage: 96.65% (-0.09%) compared to 92bd0c2
<92bd0c2>
- strange because the change is not adding any new lines, just modifying
two existing ones
- an in progress Test Python 3.10, ZK 3.4.14
<https://github.com/python-zk/kazoo/actions/runs/6457423239/job/17528922532?pr=731#logs>,
which looks stuck at the time of writing this, as the test with tox has not
been producing any output since 1h 20min
Please have a look at the PR anyway :) Thanks!
—
Reply to this email directly, view it on GitHub
<#730 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAIFTHXGJQO2EOPKWGPTTALX6QGZZAVCNFSM6AAAAAA5YPTXY6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONJTGE4DCNJTGA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
--
Charles-Henri de Boysson
|
Hi @ceache , thank you for the quick reply! I have a question regarding:
Please correct me if I am wrong, but this should only happen exactly once when the wrap around happens. Our use case in which we have observed the issue is electing a new leader instance. We hit the overflow case and observed that when >1 instance was running, they were stuck in a perpetual loop of "new leader is forcibly elected -> old leader is killed -> new instance comes online and registers as new leader". The PR I opened should work as a bug fix for our case, because we would then have exactly one forced re-election when wrapping around from 2147483647 to -2147483648. But after that, with every increment the new negative sequence number only gets larger, and the smallest sequence number still remains at -2147483648. In a concrete example Am I missing a different use case in which my change could become an issue? |
Hi,
You keep saying "leader election" but you mean lock owner ("leader
election" has special meaning in Zookeeper).
So, in your example, let's say we have 3 clients contending for a lock, A,
B and C.
A gets there first, the sequence number is 2147483646, and he obtains the
lock (enters critical section).
B, in turn, tries to acquire the lock, its sequence number is 2147483647,
it can't since A has a lower sequence number so it waits.
C also tries, its sequence number is now, -2147483648, it thinks it has the
lowest number so it "obtains" the lock.
We now have A and C in the critical section we were trying to protect.
Your patch improves the situation because it considers them as integers so
it at least has the "concept" of negative number increase right.
>> min(2147483646, 2147483647, -2147483648, -2147483647)
-2147483648
Without it every new contender is the "lowest" once the sequence numbers
wrap into negative.
>> min("2147483646", "2147483647", "-2147483648", "-2147483647")
'-2147483647'
But the issue, even if it is a "once in a wrap" issue, remains with two
contenders entering the critical section.
I hope this helps explain what I was trying to say.
Cheers,
…On Mon, Oct 9, 2023 at 11:45 AM Angie Xiang ***@***.***> wrote:
Hi @ceache <https://github.com/ceache> ,
thank you for the quick reply! I have a question regarding:
once the sequence numbers are negative, the
new contenders would always "win" since they would have a smaller integer
number than the last positive sequence
Please correct me if I am wrong, but this should only happen exactly once
when the wrap around happens. Our use case in which we have observed the
issue is electing a new leader instance. We hit the overflow case and
observed that when >1 instance was running, they were stuck in a perpetual
loop of "new leader is forcibly elected -> old leader is killed -> new
instance comes online and registers as new leader".
The PR I opened should work as a bug fix for our case, because we would
then have exactly one forced re-election when wrapping around from
2147483647 to -2147483648. But after that, with every increment the new
negative sequence number only gets larger, and the smallest sequence number
still remains at -2147483648. In a concrete example min(2147483646,
2147483647, -2147483648, -2147483647) = -2147483648.
Am I missing a different use case in which my change could become an issue?
—
Reply to this email directly, view it on GitHub
<#730 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAIFTHTSQAMFI5NHQY4Z6O3X6QL2BAVCNFSM6AAAAAA5YPTXY6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONJTGI2TIMRTGY>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Charles-Henri de Boysson
|
Hi, yes, sorry for the confusing terminology, I should be saying "lock owner" 👍 I kept saying "leader election" because we use it to elect a leader in our system. If I am not mistaken, what you are saying is that my change does not produce the correct result in this critical section. With the example you gave above: However my approach is from a user perspective, because my goal was to stabilise the whole system's state once this overflow is reached. Without the proposed change, the lock owner is perpetually changed as with every single new instance, the lock is handed over, since we are comparing Strings. This caused our system to become unusable a few days ago until we deleted and recreated the znode, thus resetting the sequence. The change would prevent what we encountered and prevent a major incident, even though the result of who gets to acquire the lock is technically incorrect while in this critical section. I agree that the serial number arithmetics which you have linked would probably be the better and more correct solution. From our side, this is not a critical fix anymore as it took us 6 years to reach the point of overflow, so I expect that we will have another few years again before it becomes a problem. Nevertheless it could be a good intermediate fix, depending on how long / difficult it is to implement the sequence number arithmetics, as it does improve the situation, does not worsen it in any way and may prevent a major incident in someone else's system. Best regards, PS: There will be a delay in response until tomorrow as it is already past EOB in Central Europe, sorry about that! If there is a new reply, I will get back to you tomorrow 👋 |
Hi, I have taken another look at this issue and have a few more questions.
Just to confirm: Your issue with my change casting the sequence numbers to an integer is that the result as to who gets to acquire the lock is wrong: The sequence number that is still at I have read the Sequence Number Arithmetic article that you linked, but if I am not mistaken, it doesn't describe a full solution either; it ends with:
And then does not propose a solution for this edge case. So in any case, even with sequence number arithmetics, we still have an edge case that produces strange results. When looking at it from a different angle:I have a proposal based on this comment taken from
With the integer cast in my PR, this statement is mathematically correct when looking at the number in the lock contender name. From a human sense standpoint, I would expect 2147483647 to be followed by 2147483648, but this is not the case for integers constrained by 32 bit. Taking the string number and then casting to int still fulfils the "contract" stated in this comment. What do you think? Best regards, |
Hello, we have encountered this issue again. If my understanding is correct, the proposed pull request is not being accepted as it is behaving incorrectly during the overflow period, though it would be stable. The outlined solution in https://en.wikipedia.org/wiki/Serial_number_arithmetic mentions a remaining edge case regarding equidistant numbers, as well as a difference in the implementation depending on the machine's integer size. We're unsure how zookeeper behaves on different machines (i.e. 32-bit vs 64-bit). Is there already a fix in the works on your end? You mentioned having a draft change in relation to #606, but it isn't linked there. Best regards, |
Expected Behavior
When the sequence zookeeper node counter overflows (as documented in: https://zookeeper.apache.org/doc/r3.6.0/zookeeperProgrammers.html#Sequence+Nodes+--+Unique+Naming), it would be expected that there is one re-election from 2147483647 to -2147483648. Afterwards, incrementing in the negative space should be stable again and not causing unexpected re-elections.
Actual Behavior
When overflowing and reaching the negative space starting from -2147483648, we observed perpetual leader re-elections. With each new added sequence number, the new instance would register itself as the new leader.
When digging into the kazoo code, it seems to be a typing issue, as the sequence is matched using a regex but not cast to an integer, meaning checking for the smallest sequence number is a String comparison, i.e.
"-2147483648" > "-2147483647"
isTrue
.Snippet to Reproduce the Problem
N/A
Logs with logging in DEBUG mode
N/A
Specifications
The text was updated successfully, but these errors were encountered: