Hash collision on 0 when hashing vclocks for tictac AAE #32

Closed · Fixed by OpenRiak/kv_index_tictactree#5
martinsumner opened this issue Mar 18, 2025 · 1 comment
Label: bug (Something isn't working)

martinsumner (Contributor) commented Mar 18, 2025

In production systems running TictacAAE, especially those using full-sync, there may be situations where a full-sync repeatedly fails the comparison of cached trees for the same segments, but then shows no difference between the Keys & Clocks fetched for those segments (i.e. the exchange returns {clock_compare, 0}).

This is resolved through repair: either by rebuilding the tree, or by triggering repair mode via full-sync:

-spec maybe_repair_trees(
    repair_id()|undefined, boolean()) -> {boolean(), repair_id()|undefined}.
maybe_repair_trees(LastRepairID, true) ->
    %% If only a subset of partitions are to be queried on the vnode, then
    %% don't repair. Wait until all partitions are to be repaired so that
    %% there is not a partial repair
    {false, LastRepairID};
maybe_repair_trees(LastRepairID, _Filtered) ->
    case application:get_env(riak_kv, aae_fetchclocks_repair, false) of
        {true, LastRepairID} ->
            %% This RepairID has already been handled - don't repeat it
            {false, LastRepairID};
        {true, UpdRepairID} ->
            %% A new RepairID - trigger a repair
            {true, UpdRepairID};
        _ ->
            {false, LastRepairID}
    end.

But why are the segment hashes wrong in this case? There are two issues, although one is almost certainly a non-issue.

1 - Updating hashtrees assigns meaning to a VC which hashes to 0

https://github.com/OpenRiak/kv_index_tictactree/blob/fb1f56d1bef3d207a1a9b2ef181f6f49ee968773/src/aae_controller.erl#L1332-L1335
https://github.com/OpenRiak/kv_index_tictactree/blob/fb1f56d1bef3d207a1a9b2ef181f6f49ee968773/src/aae_controller.erl#L425-L451

As erlang:phash2/1 by default hashes into a 2 ^ 27 space (returning values in the range 0 to 2 ^ 27 - 1), there is the potential for relatively frequent collisions where a Previous or Current vector-clock Hash is genuinely 0, rather than 0 being used to convey special meaning. There may have been some confusion here with erlang:phash/2, which hashes into a range beginning at 1, so assigning a special meaning to 0 was safe there.
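For illustration, 0 is an ordinary output of erlang:phash2/1; the following standalone sketch (a hypothetical module, not project code) scans a keyspace for a term whose default-range hash is exactly 0:

-module(phash2_zero).
-export([find_zero/1]).

%% Scan the integers 1..N for a term whose default-range phash2 is 0.
%% With a 2 ^ 27 (134,217,728) hash space, a hit is expected roughly
%% once per 134M terms - rare per key, but not rare across a large store.
-spec find_zero(pos_integer()) -> {found, term()} | not_found.
find_zero(N) ->
    find_zero(1, N).

find_zero(I, N) when I > N ->
    not_found;
find_zero(I, N) ->
    case erlang:phash2(I) of
        0 -> {found, I};
        _ -> find_zero(I + 1, N)
    end.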

So if an update is made to a Key where the VC previously hashed to 0, the segment will be incorrect, as the update will not remove the hash of the previous Key & Clock.
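A simplified sketch of how that plays out (assumed names and segment maths, not the actual kv_index_tictactree code): segments accumulate the XOR of per-key hashes, and the per-key hash mixes in the Key, so it is almost never 0 even when the clock-hash itself is 0.

-module(zero_sentinel).
-export([apply_update/4]).

%% Assumed, simplified segment maths - illustrative only.
key_hash(Key, ClockHash) ->
    erlang:phash2({Key, ClockHash}).

%% A previous clock-hash of 0 is read as "no previous object", so nothing
%% is XOR-ed out. If the previous clock genuinely hashed to 0, the stale
%% key_hash(Key, 0) contribution is left behind in the segment.
apply_update(SegHash, Key, 0, CurrClockHash) ->
    SegHash bxor key_hash(Key, CurrClockHash);
%% Normal path: XOR out the previous contribution, XOR in the current one.
apply_update(SegHash, Key, PrevClockHash, CurrClockHash) ->
    SegHash bxor key_hash(Key, PrevClockHash) bxor key_hash(Key, CurrClockHash).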

All vnodes which receive the same series of updates will be incorrect in the same way, until one of them rebuilds. When an AAE tree is rebuilt it does not need to calculate the delta hash, so the rebuilt vnode will now be correct for this segment and will differ from its peers. This will then trigger a series of repairs that will propagate the fix.

2 - In one case the VC is unique-sorted before hashing, not sorted (although the nature of VCs means that usort is equivalent to sort, as each actor appears at most once in a clock)

riak_kv/src/riak_kv_vnode.erl, lines 3751 to 3759 at b7b690f:

UpdClock =
    case UpdObj of
        use_binary ->
            {VC, _Sz, _Sc, _LMDs, _SibBin} =
                riak_object:summary_from_binary(UpdObjBin),
            %% Note: usort/1 here, rather than sort/1 as elsewhere (issue 2)
            lists:usort(VC);
        _ ->
            get_clock(UpdObj)
    end,
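
Since each actor appears at most once in a vector clock, there are no duplicates for usort to drop, so the two forms agree. For example (assumed vclock shape, illustrative only):

%% Assumed vclock shape [{ActorId, {Counter, Timestamp}}], each actor once.
VC = [{<<"b">>, {2, 100}}, {<<"a">>, {5, 90}}],
true = (lists:sort(VC) =:= lists:usort(VC)).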

martinsumner self-assigned this Mar 18, 2025
martinsumner added the bug label Mar 18, 2025