Effort-less way to run C++ clients #1873
Hi @qiranq99, thanks for reaching out!
Yes, we do have a set of CMake options to control which components to enable: https://github.com/v6d-io/v6d/blob/main/CMakeLists.txt#L56-L65 If you only need the C++ client to access metadata and blobs, you could just enable the client and disable all other components.
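For example, a client-only build could look roughly like this. The option names below are assumptions drawn from typical `BUILD_VINEYARD_*` conventions; verify them against the linked `CMakeLists.txt` in your checkout before relying on them:

```shell
# Sketch only: configure a client-only build by switching off the server,
# Python bindings, and tests (option names are assumptions -- check
# CMakeLists.txt lines 56-65 in your checkout).
git clone https://github.com/v6d-io/v6d.git
cd v6d && mkdir build && cd build
cmake .. \
  -DBUILD_VINEYARD_SERVER=OFF \
  -DBUILD_VINEYARD_PYTHON_BINDINGS=OFF \
  -DBUILD_VINEYARD_TESTS=OFF
make -j"$(nproc)"
```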
Because it uses shared memory and avoids any potential data copies, the cost should be a very small constant and won't scale with the size of your data.
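To illustrate why shared-memory access has near-constant cost, here is a generic Python `multiprocessing.shared_memory` sketch (not vineyard's actual implementation): the consumer attaches to the segment by name, and reads go straight to the shared pages without copying the payload.

```python
# Generic illustration (not vineyard's code): sharing a large array via
# shared memory -- attaching is a fixed-cost mapping, independent of size.
import numpy as np
from multiprocessing import shared_memory

# Producer: place a large array into a shared segment.
data = np.arange(1_000_000, dtype=np.int64)
shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
src = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
src[:] = data

# Consumer: attach by name -- no bytes are copied, only a mapping is set up.
view = shared_memory.SharedMemory(name=shm.name)
dst = np.ndarray(data.shape, dtype=data.dtype, buffer=view.buf)
checked = int(dst[500])  # reads the producer's data directly

view.close()
shm.close()
shm.unlink()
```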
May I know more about the test case (perhaps a code snippet that I can use to reproduce the performance gap)? We will investigate whether there is any regression.
There should be no performance difference whether Python is used or not. Let us know if you have encountered such problems.
v6d is mainly optimized for sharing big data objects (e.g., tensors, tables, dataframes) between processes.
Hi @sighingnow,

```python
import vineyard
import numpy as np

client = vineyard.connect()
data = np.arange(100000)
oid = client.put(data)

# %%timeit
retrieved_data = client.get(oid)
```

Basically, we took the above example as a benchmark, and the measured latency is several hundred microseconds, while some low-latency-oriented object stores can deliver several tens of nanoseconds. Though v6d is not aiming for low latency, a <50us latency for retrieving data from the server is what we would expect.
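For reproducibility, a minimal, vineyard-agnostic timing harness like the one below may be more reliable than `%%timeit` for reporting per-call latency. The `lambda` shown is a stand-in workload; with a running vineyardd you would pass something like `lambda: client.get(oid)` instead (hypothetical usage, not verified here):

```python
# Minimal latency harness: reports median and p99 per-call latency in
# microseconds for any zero-argument callable.
import time
import statistics

def measure_us(fn, warmup=100, iters=1000):
    for _ in range(warmup):  # warm up caches and allocator state
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e6)
    samples.sort()
    return {
        "median_us": statistics.median(samples),
        "p99_us": samples[int(0.99 * iters) - 1],
    }

# Stand-in workload; replace with the vineyard call under test.
stats = measure_us(lambda: sum(range(100)))
print(stats)
```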
BTW, I successfully compiled the C++ client in isolation thanks to your hint. However, the C++ API reference seems to be generated directly from the mkdoc tooling, and there is no clear guidance on how to use the C++ library.
Will investigate. In my queue now.
More user-friendly tutorials for the C++ APIs are on our roadmap. For now, you may refer to our unit tests as usage examples: https://github.com/v6d-io/v6d/tree/main/test. Sorry for the inconvenience.
Benchmark update: on getting a …
Machine: Intel Xeon 4316 @ 2.30GHz
/cc @sighingnow, this issue/PR has had no activity for a long time, could you folks help to review its status?
Hi,
I found it quite troublesome to get a C++ client by compiling the entire project and trying to understand the raw C++ APIs in the docs, so I'm wondering whether there is an easier way to access the C++ client, e.g., via partial compilation.
BTW, have you ever profiled the performance gap between the Python client and the C++ client with respect to the client-server IPC overhead (within one machine) when retrieving data via

```python
data = client.get(oid)
```

According to our experiments, each retrieval takes several hundred microseconds (on an AMD EPYC 7763), which is unacceptable in some latency-sensitive workloads. This leads to two questions: … Will v6d be optimized for low-latency scenarios in the future?