Commit
Add merge() method to ApproxMostFrequentStreamSummary as an optimization over mergeSerialized() (#12236)

Summary:

# Context
This diff adds a merge() method that behaves equivalently to mergeSerialized(). It is added because it avoids the extra serialization + deserialization round trip that calling mergeSerialized() requires.

# Rationale
The serialize function simply calls values()[i] and priorities()[i], and `memcpy`s the integers. When we deserialize, we likewise take the values as-is and re-insert them into the target data structure. So rather than serializing and deserializing to perform the exact same operation, we can just copy directly.

There are a couple of considerations:
1. **StringView** - Because we do not serialize, velox::StringView still points at whatever the `other` ApproxMostFrequentStreamSummary points to, which means *the string must be kept alive for both structures, even if the `other` one disappears*. This ... feels okay, but requires consideration from the user of this class.
2. **StringView equality** - The previous behavior seems correct (insert looks up the index in the priority queue and uses operator== on StringView, which should compare contents). So technically, if all `StringView`s point to equal strings, this won't be a problem.

Differential Revision: D68995835
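
The two considerations above can be illustrated with a small, self-contained sketch. It is not Velox code: `ToySummary`, `insert`, `merge`, `countOf`, and `size` are hypothetical names, `std::string_view` stands in for velox::StringView, and the Space-Saving style eviction is an assumption made only so the toy has a fixed capacity. The point is that a direct merge copies the views rather than the bytes, and that lookup compares string contents rather than pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Toy stand-in for a stream summary. NOT Velox's
// ApproxMostFrequentStreamSummary; all names here are hypothetical.
class ToySummary {
 public:
  explicit ToySummary(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Adds `count` to `value`. Lookup uses operator== on string_view,
  // which compares contents, so equal strings in different buffers match.
  void insert(std::string_view value, std::int64_t count) {
    for (std::size_t i = 0; i < values_.size(); ++i) {
      if (values_[i] == value) {
        counts_[i] += count;
        return;
      }
    }
    if (values_.size() < capacity_) {
      values_.push_back(value);
      counts_.push_back(count);
      return;
    }
    // Full: replace the smallest entry and inherit its count
    // (Space-Saving style; an assumption of this sketch).
    std::size_t min = 0;
    for (std::size_t i = 1; i < counts_.size(); ++i) {
      if (counts_[i] < counts_[min]) {
        min = i;
      }
    }
    values_[min] = value;
    counts_[min] += count;
  }

  // Direct merge: re-inserts the other summary's entries without going
  // through a byte buffer. The copied views still point at memory that
  // this summary does not own, so the caller must keep it alive.
  void merge(const ToySummary& other) {
    for (std::size_t i = 0; i < other.values_.size(); ++i) {
      insert(other.values_[i], other.counts_[i]);
    }
  }

  // Returns the tracked count for `value`, or 0 if it is not tracked.
  std::int64_t countOf(std::string_view value) const {
    for (std::size_t i = 0; i < values_.size(); ++i) {
      if (values_[i] == value) {
        return counts_[i];
      }
    }
    return 0;
  }

  std::size_t size() const { return values_.size(); }

 private:
  std::size_t capacity_;
  std::vector<std::string_view> values_;  // Non-owning views.
  std::vector<std::int64_t> counts_;
};
```

In this sketch, merging a summary that tracks "apple" from one buffer into a summary that tracks "apple" from another buffer yields a single combined entry, because equality is by content. Any string that reaches the target only through merge() is still backed by the source's storage, so that storage has to outlive the target.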