-
Notifications
You must be signed in to change notification settings - Fork 287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
parallel operations for some binary caching providers #1392
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as off-topic.
This comment was marked as off-topic.
In general, you're parallelizing the download here, but we might expect most of the time this to not be CPU limited and thus not improve as a result of parallelization. Do you have a real world use case where this did end up being meaningfully faster? |
For clarity - my experience here is with an AWS S3 cache specifically Taking a project I have with 10 small dependencies (download size is minimal)
The lib uses vcpkg in manifest mode via CMake toolchain integration. Build machine is Windows 11 with i7-12700 (8P + 4E) vcpkg/vcpkg --version
vcpkg package management program version 2024-03-14-7d353e869753e5609a1f1a057df3db8fd356e49d
# all measurements are from the cmake configure output
cmake --preset=<preset name> AWS S3 cache missAWS S3 cache hitSummarySo approx a 5x improvement for both query and download and that should be fairly proportional to the total number of dependencies (if this was done correctly and all deps are relatively equivalent). Note that typically I only use the cache for CI (GHA) and the time per package can be 1-3 seconds. If it was 5s to check the AWS cache I would probably consider using that locally as well, even better if microsoft/vcpkg#38684 was a thing |
I agree a real world 5x improvement is worth opening a can of worms for; OK |
this is pushing to the different caches(?) in parallel, not by package like the others
Thoughts on where to go with this? |
|
} | ||
else | ||
{ | ||
out_sink.println_warning(res.error()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed prints not being synchronised in previous assessment.
- Make individual print calls thread safe #1290
- Introduce DiagnosticContext, an 'error document' like type. #1323
This particular print could be extracted relatively easily and is the only print inside the parallel execution.
microsoft/vcpkg#38404
This adds parallel operations using
parallel_for_each
/parallel_transform
toObjectStorageProvider
. There is currently no limitation placed on the parallelisation past that imposed by the threadpool.This impacts the following caching providers
GcsStorageTool
AwsStorageTool
CosStorageTool
Questions