Document corner cases of Pull Order. #746
Comments
I've tested also:
I've also checked via |
UPDATE: if I generate test files like this
This seems to happen only when the files are small? I started testing this because I was experiencing random behavior while syncing a SQLite file together with its sqlite-wal file.
UPDATE 2:
|
The sorting order controls how the files are queued. Files are started in the queued order, but there is concurrency and the amount of work required is individual to the file, so they may not finish in the order they started. Especially a bunch of small files will start at about the same time, take approximately no time to complete, and finish in whatever order. |
So what's the point of deceiving the user? If "File Pull Order" is a best-effort thing, or depends on probability, race conditions, or file size, it should be documented accordingly. |
There is no deception, files are processed in alphabetic order. But we don't wait for one file to complete before we start processing the next -- if we have buffer space to issue more network requests we do so, for example, and we also try to keep disks busy with concurrent I/O. So for example by default we will issue (network) requests for up to 64 MiB of data before pausing to wait for replies. That's a lot of 1 KiB files. Granted, we'll send the requests in the order from the queue, but that's not necessarily the order they'll be answered in, and hence not the order things will necessarily finish in. |
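To make the point above concrete, here is a minimal sketch of "started in queue order, finished in whatever order". It is a toy model, not Syncthing code: the `pull` and `simulate` names and the artificial delays are assumptions for illustration only.

```python
import asyncio


async def pull(name, delay, started, finished):
    # Record when the (simulated) request is issued.
    started.append(name)
    # Stand-in for network and disk work; each file takes a different time.
    await asyncio.sleep(delay)
    # Record when the file is done.
    finished.append(name)


async def simulate(names, delays):
    started, finished = [], []
    # Issue all pulls in alphabetic queue order without waiting
    # for the previous one to complete.
    tasks = [
        asyncio.create_task(pull(name, delay, started, finished))
        for name, delay in zip(sorted(names), delays)
    ]
    await asyncio.gather(*tasks)
    return started, finished
```

Calling `asyncio.run(simulate(["test_aaa", "test_a", "test_aa"], [0.15, 0.05, 0.10]))` starts the files alphabetically, but the one with the shortest delay finishes first.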
I'm not saying the current behavior is wrong. I understand that CPU, network, and disk usage have to be optimized for performance reasons, but the name of the setting IS deceiving: "File Pull Order" means "order of arrival from the point of view of the pulling agent". The manual states "Pull files ordered by X" when it really should say "Order files by X on the sender before pulling. Warning: received files might end up in a different order due to resource availability and concurrency." After reading the manual and relying on an option with such a name, I spent time putting together a PoC solution based on reading Syncthing events and the order of pulled files. I bet other users might run into this issue too. I suggest documenting the sorting options appropriately. |
You're really just misunderstanding. Assume you have 10 ropes to pull on hanging from a tree, each with a coconut attached to it. If you pull them in a certain order, that's not necessarily the order the nuts will hit the ground around you, because the ropes may have different lengths. Syncthing "pulling" in a certain order means asking other devices to send something. It has no control over the order of arrival unless it would wait for each file to finish before requesting the next, which would lead to terrible performance. The manual doesn't say anything else. Feel free to submit a docs PR if you have an idea how it could be communicated more clearly. |
Just add a warning that the final order of arrival is unknown, as it's not controlled by the software anyway and relies on a best-effort strategy. The Internet is full of this stuff; just inform the users accordingly, as you're exposing an option that expresses a preference, not a fact. An ordered queue on the receiving side (or just on the event publisher?) would turn it into something actually usable; otherwise it's just something you can't rely on. I'll give you a practical example: synchronization of a SQLite database with WAL journaling, where two Syncthing users have to take turns being the DB writer, walkie-talkie style. I have to collect the events Syncthing generates and find a valid pattern over time to trigger my logic. Not rocket science, but the point is that any pull order offered in the Syncthing options really means random. |
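One way to avoid depending on arrival order in a case like this is to trigger only once every file in a known set has finished. The sketch below assumes events shaped like Syncthing's `ItemFinished` events (`type`, and `data` with `item` and `error`); the `CompletionGate` class and the file names are hypothetical, and it says nothing about whether the database itself is in a consistent state.

```python
class CompletionGate:
    """Reports True once every required item has finished without error."""

    def __init__(self, required):
        self.required = set(required)
        self.done = set()

    def observe(self, event):
        # Ignore anything that is not an ItemFinished event.
        if event.get("type") != "ItemFinished":
            return False
        data = event.get("data", {})
        # Count only required items that completed without an error.
        if data.get("error") is None and data.get("item") in self.required:
            self.done.add(data["item"])
        return self.done == self.required


# Hypothetical file names; the order of the events does not matter.
gate = CompletionGate(["db.sqlite", "db.sqlite-wal"])
first = gate.observe(
    {"type": "ItemFinished", "data": {"item": "db.sqlite-wal", "error": None}}
)
second = gate.observe(
    {"type": "ItemFinished", "data": {"item": "db.sqlite", "error": None}}
)
```

Here `first` is `False` and `second` is `True`, regardless of which of the two files finishes first.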
It is an ordered queue on the receiving side; another problem is that there is no buffering. Namely, we start downloading files as soon as we are informed about them, and there is no guarantee about the order in which we will be informed. So zzz might be the first file we are informed about, which makes it alphabetically first for the few seconds until aaa arrives, and so on. Also, once we start downloading a file, we don't cancel it, even if items that should come first according to the ordering have since been added to the queue. Even if there was some sort of buffering, I think we would still not handle a completely empty folder syncing 1 million files, because I think (I didn't check the code) our queue for the purposes of ordering is not infinitely sized. So yes, personally, I do agree that the docs should be updated to explain the edge cases, but the issue for that should probably live in the docs repo. |
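The "no buffering" effect described above can be illustrated with a small toy model. This is not Syncthing's queue implementation; `start_order` and its parameters are assumptions made for the sketch. Each step announces some file names, and the puller then starts the alphabetically first files it currently knows about.

```python
import heapq


def start_order(announcements, slots_per_step=1):
    """Return the order in which files are started.

    announcements: list of batches; each batch holds the names the
    receiver learns about in one step.
    """
    queue = []  # min-heap, so the alphabetically first known name pops first
    started = []
    for batch in announcements:
        for name in batch:
            heapq.heappush(queue, name)
        # Start work immediately on whatever is known right now.
        for _ in range(min(slots_per_step, len(queue))):
            started.append(heapq.heappop(queue))
    # No more announcements: drain the rest in sorted order.
    while queue:
        started.append(heapq.heappop(queue))
    return started
```

With `start_order([["zzz"], ["aaa"], ["mmm"]])` the file zzz is started first because it is the only one known at that moment, whereas `start_order([["zzz", "aaa", "mmm"]])` starts the files in alphabetic order because all of them are known up front.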
I agree, this should be moved to docs repo. |
I'm using Syncthing v1.19.1, Linux (64-bit Intel/AMD), running in a Docker container (image syncthing/syncthing) on Ubuntu 20.04.
The synced folder is a mounted folder: `./sync:/var/syncthing`
I've wrapped up a Python script to grab all `/rest/events`, and it works, as I'm receiving all `event["id"]` values sequentially.
By filtering on `event["type"] == ItemFinished` and checking the contents of `event["data"]["item"]`, I've realized that the order of the synced files is somewhat random, contrary to the file pull order I've configured on both folders (both set as send & receive), which is "Alphabetic".
I'm testing this by attaching my event listener, pausing the receiving folder, creating 5 files in the sender folder with `echo foo > test_a`, `echo foo > test_aa`, `echo foo > test_aaa`, `echo foo > test_aaaa`, `echo foo > test_aaaaa`, and then unpausing the receiver.
Here is what I get in the console:
I'm not a Go speaker, but if I'm not wrong, this is the relevant line (it appears to be a no-op):
https://github.com/syncthing/syncthing/blob/518d5174e630185540c5b99ee8a03c6231dc3c72/lib/model/folder_sendrecv.go#L436
Should I expect `ItemFinished` events to comply with the "File Pull Order"?
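For anyone who wants to reproduce the observation, a listener along the lines described above might look like the following sketch. It is not the reporter's script. It assumes the documented `/rest/events` endpoint with its `since`, `events`, and `timeout` query parameters and the `X-API-Key` header; the base URL and API key are placeholders you must supply.

```python
import json
import urllib.request


def poll_events(base_url, api_key, since=0, timeout=60):
    # Long-poll for ItemFinished events newer than the given event ID.
    url = f"{base_url}/rest/events?since={since}&events=ItemFinished&timeout={timeout}"
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request, timeout=timeout + 10) as response:
        return json.load(response)


def finished_items(events):
    # Keep only the item names of ItemFinished events, in arrival order.
    return [e["data"]["item"] for e in events if e.get("type") == "ItemFinished"]


def is_alphabetic(items):
    # True if the items finished in alphabetic order.
    return items == sorted(items)
```

A loop would call `poll_events` repeatedly, passing the highest `id` seen so far as `since`, and feed the results to `finished_items`. As the comments above explain, `is_alphabetic` returning `False` for small files is expected behavior rather than a bug.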