-
Notifications
You must be signed in to change notification settings - Fork 6.9k
[Core] Refactor reference_counter out of memory store and plasma store #57590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
} | ||
|
||
Status CoreWorkerPlasmaStoreProvider::Get( | ||
const absl::flat_hash_set<ObjectID> &object_ids, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic below lifts the owner address resolution logic into the caller of Get.
src/ray/core_worker/core_worker.cc
Outdated
} | ||
} | ||
|
||
absl::flat_hash_map<ObjectID, rpc::Address> CoreWorker::GetObjectIdToOwnerAddressMap( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function represents the owner address resolution logic lifted out of plasma store.
for (auto &get_request : get_requests) { | ||
get_request->Set(object_id, object_entry); | ||
// If ref counting is enabled, override the removal behaviour. | ||
if (get_request->ShouldRemoveObjects() && ref_counter_ == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove after get flag is getting removed since it stems from the dark ages when reference counting wasn't a thing.
const ray::RayObject &object, const ObjectID &object_id)> object_allocator) | ||
: io_context_(io_context), | ||
ref_counter_(counter), | ||
reference_counting_(reference_counting), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All reference counter null checks for whether reference counting is enabled is lifted into the caller.
void CoreWorkerMemoryStore::Put(const RayObject &object, const ObjectID &object_id) { | ||
void CoreWorkerMemoryStore::Put(const RayObject &object, | ||
const ObjectID &object_id, | ||
const bool has_reference) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory store only uses reference counter for checking if it has reference to an object or if reference counting is enabled. Both is lifted into caller.
explicit DefaultCoreWorkerMemoryStoreWithThread( | ||
std::unique_ptr<InstrumentedIOContextWithThread> io_context) | ||
: CoreWorkerMemoryStore(io_context->GetIoService()), | ||
: CoreWorkerMemoryStore(io_context->GetIoService(), /*reference_counting=*/false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
: CoreWorkerMemoryStore(io_context->GetIoService(), /*reference_counting=*/false), | |
: CoreWorkerMemoryStore(io_context->GetIoService(), /*reference_counting=*/true), |
There's no reason for the default core worker memory store to be created with reference counting disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't looked at this too hard yet, but at first glance it looks like having address in that vector will make wait/get perf on large lists even worse than it already is bc the impact of the vector set copying madness will be even worse
Do we need address to be in there??
Signed-off-by: davik <[email protected]>
…inimize_ref_count_dep
Nope, definitely doesn't need to be in the same data structure. I've moved it out for performance. |
explicit DefaultCoreWorkerMemoryStoreWithThread( | ||
std::unique_ptr<InstrumentedIOContextWithThread> io_context) | ||
: CoreWorkerMemoryStore(io_context->GetIoService()), | ||
: CoreWorkerMemoryStore(io_context->GetIoService(), /*reference_counting=*/false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Mock Initialization Mismatch Causes Test Inconsistency
The DefaultCoreWorkerMemoryStoreWithThread
mock is initialized with reference_counting=false
. PR discussions suggested this should be true
, as there's no reason to disable it. This inconsistency can cause the mock to behave differently from production code, potentially masking reference counting bugs in tests.
Why are these changes needed?
As discovered in the PR to better define the interface for reference counter, plasma store provider and memory store both share thin dependencies on reference counter that can be refactored out. This will reduce entanglement in our code base and improve maintainability.
The main logic changes are located in
Related issue number
Checks
git commit -s
) in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.