[Question] Explain how sync between buckets work; slow overall speed #917

Open · ToshY opened this issue Sep 5, 2023 · 8 comments

Problem

I have a sourceBucket with 30,000 images (50-100 KB each), roughly 1.4 GB of storage in total, and I want to "sync" it to a destinationBucket.

b2 sync --threads <10|100|500> --delete --replaceNewer b2://sourceBucket b2://destinationBucket

As this doesn't concern that many files, nor a particularly large total size, I'm stumped that it takes 1.5 to 2 hours to complete. The output of the command above shows between 70-100 kB/s (deteriorating over time), which seems low, even though I run this on a production server with 10 Gbps up/down.

Question

As it seems that neither the network speed nor the number of threads has much impact on the overall performance, could you provide me answers to the following questions:

  1. How does sync between buckets work?
    • I cannot find anything in the documentation on how this works.
    • Does my environment (e.g. network speed, system hardware, or other requirements) have any influence on syncs between buckets?
  2. Why does it take so long for a relatively small number of files, of such a small size, to be synced to another bucket?
  3. What can be improved on my side to sync faster?
    • I've already tried tweaking the --threads argument, but it only changes the speed by about +/- 10 kB/s.
@ppolewicz (Collaborator)

Hey,

Replication is a cool new feature of b2 that sounds like it might be perfect for this case. Your objects are very small and you are processing a ton of them, which might (I guess) result in the server throttling your operations, so threads wait until the throttling eases. Perhaps you can try that? It should run much faster than sync.
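
Setting it up from the CLI should look something along these lines (a sketch - I'm assuming the replication-setup flags of a recent CLI version, so check b2 replication-setup --help for the exact syntax of yours):

# sketch: create a replication rule from sourceBucket to destinationBucket
# (the rule name is arbitrary; --include-existing-files also replicates what's already stored)
b2 replication-setup --name prod-to-dev --include-existing-files sourceBucket destinationBucket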

With 30k files in 2h it'd be about 4 files per second; assuming 75 KB per file, that's like 312 KB/s, but you are reporting 70-100 KB/s. I'm not sure what's up with that. Which cluster are you using? If there is a performance issue with the CLI, I'd like to try to replicate it. Is it the same 30k files and you only change a few of those, or is it a different 30k files every time?
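
Back-of-envelope for those numbers, assuming ~75 KB per file:

# files per second over a 2-hour run, and the implied throughput in KB/s
python3 -c "print(30000 / (2 * 3600), 30000 / (2 * 3600) * 75)"  # ~4.17 files/s, ~312 KB/s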


ToshY commented Sep 5, 2023

Hey @ppolewicz 👋

Replication is a cool new feature of b2 that sounds like it might be perfect for this case

Well, I've tried replication before by setting it up from the UI, but I found it very unintuitive. It gives almost no sense of how long it actually takes before it's done replicating, and in the tutorial it is said that for existing files "it can take from a few minutes to a few hours". As my experience was also that it takes hours to replicate, I then changed to using the CLI with the b2 sync command instead, which at least gives me some sense of how long it will take.

Replication also doesn't really fit my use case, because in the example the sourceBucket is actually a production bucket, and the destinationBucket is a development bucket. So I don't need to fully replicate the entire production bucket to the development bucket, as I don't want/need those replicated files in my development bucket.

b2 sync gives me more freedom, because if I decide I want to work on a feature, I can run the command above to sync it to my development bucket, wait roughly 2 hours, and then actually start developing. It's a pain that I have to wait that long, but at least I know how long it's been running, at what speed it's processing files, which files it's currently syncing, and I can roughly estimate how long it will take before it's complete.

With 30k files in 2h it'd be about 4 files per second; assuming 75 KB per file, that's like 312 KB/s, but you are reporting 70-100 KB/s. I'm not sure what's up with that. Which cluster are you using? If there is a performance issue with the CLI, I'd like to try to replicate it. Is it the same 30k files and you only change a few of those, or is it a different 30k files every time?

Every 1 to 2 months I run the sync command above, and in the last 6 months the production bucket accumulated 6k additional images (24k before), so you can say roughly 1k images are added to the production bucket each month.

Which cluster are you using?

If you refer to the endpoint/region, it is s3.eu-central-003.backblazeb2.com for both source and destination bucket.


I performed a sync earlier today (with the same command as above), which again took roughly 1.5-2 hours. So now that the buckets' contents are basically identical, it shouldn't take much time to sync again, right? But as I'm currently running it again, it shows similar speeds in the range of +/- 90-100 kB/s in the console, and is doing +/- 5-6 files per second.


After diving a bit deeper into the documentation, I eventually found --compareVersions:

[screenshot of the --compareVersions section of the b2 sync documentation: it selects how source and destination files are compared, with possible values none, modTime and size]

So what I did next was try --compareVersions none and --compareVersions size, both now only taking a couple of seconds (!), which is my desired behaviour (as I already synced the files earlier today, I think that's why it's fast: it no longer compares the modified time).

Now looking back, maybe I had unwittingly assumed that the sync would make a complete copy of the file and its properties (like modified time). But I guess it makes sense that the "modified time" for the new file in the destination bucket is newer than the one from the source.
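
To sanity-check that assumption, the timestamps on both sides can be compared with something like this (a sketch - the exact --long output format may differ per CLI version):

# list the first few files with their upload timestamps on each side
b2 ls --long sourceBucket | head -n 5
b2 ls --long destinationBucket | head -n 5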

TLDR

b2 sync --threads <10|100|500> --delete --replaceNewer --compareVersions <none|size> b2://sourceBucket b2://destinationBucket

@ppolewicz final question 🏁

I've now completely wiped my development bucket clean and started a new sync. It currently only performs copy operations for all files, and does this at +/- 75 kB/s. Is this low copying speed related to the server throttling you mentioned earlier? And if so, is this out of my control, or are there ways to speed things up?

@ppolewicz (Collaborator)

In order to determine if the server is throttling, you'll have to enable logs (passing --verbose is a simple way to do it).
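
For example, something along these lines (a sketch - this assumes the log goes to stderr, and the exact log wording may differ between versions):

# run with verbose logging and keep the output for inspection
b2 sync --verbose --threads 10 b2://sourceBucket b2://destinationBucket 2> sync-debug.log
# then look for signs of the server pushing back
grep -iE "retry|backoff|too many|503" sync-debug.log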

On any storage system based on erasure coding and HDDs, the performance of reading small files is not going to be great. If you were syncing a few bigger files, the speed would go way up.

There is a performance bottleneck somewhere: either the server is throttling you, or Python threading is not doing a very good job with all those copy operations. 6/s is way below what I'd expect to see though, so my bet would be on the throttling.

I'm not familiar enough with the throttling subsystem Backblaze B2 eu-central is currently running on, but from the client perspective you should be able to observe the retries and threads backing off. If you confirm it's not retries and throttling, then I'll take a look at reproducing and analyzing its performance - B2 and associated tools are supposed to handle 10 TB objects and buckets with a billion objects, so not being able to deal with 30k files in a timely manner could be a bug.

What are you running this on? Windows, Linux? How did you install, from pip or binary?


ToshY commented Sep 6, 2023

Hey @ppolewicz

I've tried adding --verbose but I can't say I see any keywords related to "throttling", "retry" or "back off".

Here's a gist with a portion of the logging (only ran it for a couple of seconds and truncated it to 2300 lines + redacted some information). Maybe you can spot things that are out of the ordinary.


I've been running it on the following systems:

  • Ubuntu 22.04.2 LTS (WSL2); binary v3.0.8
  • Ubuntu 22.04.3 LTS (production server); binary v3.0.9

@ppolewicz (Collaborator)

The log only shows scanning and 18 transfers - the server wouldn't throttle you that early. You'd have to run it longer and then show a tail of the log (2k lines would be ok).

Since you are running Ubuntu, it would be easy to pip install --user b2 for some user (or in a venv) to check whether the issue is maybe caused by the binary builder.
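
Something along these lines (assuming python3 with the venv module is available):

# install the CLI from PyPI into an isolated environment
python3 -m venv ~/b2-venv
source ~/b2-venv/bin/activate
pip install b2
b2 version   # confirm you're now running the pip-installed build
b2 sync --threads 10 --delete --replaceNewer b2://sourceBucket b2://destinationBucket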


ToshY commented Sep 7, 2023

@ppolewicz But if it already sticks to the 75-100 kB/s range from the very first copies, and doesn't improve over time, then surely it is not related to throttling?

I will try your suggestions later today.


ToshY commented Sep 7, 2023

@ppolewicz Installed it with pip and ran the same initial command: roughly the same performance, +/- 100 kB/s. So no performance gain there.

Ran it for roughly 30 minutes and then pasted it (2,135 lines) into the gist.

@ppolewicz (Collaborator)

We'll be setting up an environment to test the performance of large files this week, and after that happens we'll circle back to this one to test the performance of small files too.

Thank you for the detailed bug report.
