osync hangs and the service has to be restarted manually. #234

Open
gaionaus opened this issue Jan 18, 2022 · 3 comments

Comments

@gaionaus

My network looks like this:
Win 10 pcs <-> | samba , Ubuntu server 20.04 (initiator) | <---- 50 Mbit dsl ----> | Ubuntu server 20.04 (replica) ,samba | <-> Win 10 pcs |

The two Ubuntu servers run only osync and Samba. osync syncs two folders between them.
Initiator: a nightly cron job also runs fsync (not osync) to back up the initiator folder to another local disk.
Replica: on Sunday nights a cron job also runs fsync (not osync) to back up the replica folder to another local disk.
These are the only tasks the two servers run.

osync hangs and I have to restart the service manually, so I cannot leave it unattended.

To Reproduce
Unfortunately this happens randomly, anywhere from once to ten times per day, so I don't know how to help you reproduce it.

Expected behavior
Kill the processes that are still running, and then continue monitoring the folder for changes.

Deviated behavior
It kills the processes that are still running but then hangs.

Logs
I run osync as a service. It works fine, but at random times it becomes unresponsive. This is what the log says when that happens:
.
.
.
TIME: 2999 - Current tasks still running with pids [3402351].
TIME: 3001 - (WARN):Max soft execution time exceeded for task [Sync] with pids [3402351].
TIME: 3004 - Sent mail using sendmail command without attachment.
TIME: 3059 - Current tasks still running with pids [3402351].
TIME: 3119 - Current tasks still running with pids [3402351].
TIME: 3179 - Current tasks still running with pids [3402351].
TIME: 3239 - Current tasks still running with pids [3402351].
TIME: 3299 - Current tasks still running with pids [3402351].
TIME: 3359 - Current tasks still running with pids [3402351].
TIME: 3419 - Current tasks still running with pids [3402351].
TIME: 3479 - Current tasks still running with pids [3402351].
TIME: 3539 - Current tasks still running with pids [3402351].
TIME: 3599 - Current tasks still running with pids [3402351].
TIME: 3601 - (ERROR):Max hard execution time exceeded for task [Sync] with pids [3402351]. Stopping task execution.
TIME: 3601 - (CRITICAL):Cannot create replica file list in [/var/fs/].
TIME: 3601 - (WARN):Command was [/usr/bin/rsync --rsync-path="(o_O) rsync" -rltD -8 --modify-window=2 --omit-dir-times --no-whole-file -p -o -g --executability --exclude ".osync_workdir" -e "/usr/bin/ssh -i /home/gaionaus/.ssh/id_rsa -p 22" --list-only [email protected]:"/var/fs/" 2> "/tmp/osync.treeList.target.error.3400238.20220118T052539.886827061" | (grep -E "^-|^d|^l" || :) | (awk '{$1=$2=$3=$4="" ;print substr($0,5)}' || :) | (awk 'BEGIN { FS=" -> " } ; { print $1 }' || :) | (grep -v "^.$" || :) | sort > "/tmp/osync.treeList.target.3400238.20220118T052539.886827061"].
TIME: 3601 - (WARN):Command output
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [Receiver=3.1.3]
TIME: 3601 - Task with pid [3402351] stopped successfully.
TIME: 3604 - Sent mail using sendmail command without attachment.
TIME: 3605 - (ERROR):osync finished with errors.
TIME: 3608 - Sent mail using sendmail command without attachment.
Tue Jan 18 06:25:47 UTC 2022 - (ERROR):osync child exited with error.
Tue Jan 18 06:25:47 UTC 2022 - #### Monitoring now.
Tue Jan 18 06:35:47 UTC 2022 - #### 600 timeout reached, running sync.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Tue Jan 18 06:35:47 UTC 2022 - osync 1.2 script begin.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)

Environment (please complete the following information):
Osync Version:
PROGRAM_VERSION=1.2
PROGRAM_BUILD=2017032101
IS_STABLE=yes

  • OS: ubuntu 20.04
  • Bitness: x64
  • Shell : bash

Additional context
It stays on the last line forever: "TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)"
And I have to restart the service manually.
One part of the log above looks strange: "/usr/bin/rsync --rsync-path="(o_O) rsync" "
It looks like a placeholder that never got replaced?

Some settings from the osync conf file:
RSYNC_OPTIONAL_ARGS="--modify-window=2 --omit-dir-times"
SOFT_MAX_EXEC_TIME=3000
HARD_MAX_EXEC_TIME=3600
KEEP_LOGGING=60
MIN_WAIT=120
MAX_WAIT=600
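
For readers mapping these settings onto the log above: the warning at TIME 3001 follows SOFT_MAX_EXEC_TIME=3000, the forced stop at TIME 3601 follows HARD_MAX_EXEC_TIME=3600, and the "still running" lines arrive every KEEP_LOGGING=60 seconds. The sketch below only illustrates that soft/hard timeout pattern in bash. It is not osync's actual code, and the function name `watch_task` and its messages are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch, NOT taken from osync.
# Usage: watch_task PID SOFT_SECONDS HARD_SECONDS LOG_INTERVAL_SECONDS
# Returns 0 if the task ended on its own, 1 if it was stopped at the hard limit.
watch_task() {
    local pid=$1 soft=$2 hard=$3 keep=$4
    local start=$SECONDS elapsed=0 warned=0 last_log=0

    # kill -0 sends no signal; it only tests whether the pid still exists.
    while kill -0 "$pid" 2>/dev/null; do
        elapsed=$((SECONDS - start))

        if [ "$elapsed" -ge "$hard" ]; then
            echo "TIME: $elapsed - (ERROR): hard limit exceeded, stopping pid [$pid]."
            kill -TERM "$pid" 2>/dev/null || true
            wait "$pid" 2>/dev/null || true
            return 1
        fi

        if [ "$elapsed" -ge "$soft" ] && [ "$warned" -eq 0 ]; then
            echo "TIME: $elapsed - (WARN): soft limit exceeded for pid [$pid]."
            warned=1
        fi

        if [ $((elapsed - last_log)) -ge "$keep" ]; then
            echo "TIME: $elapsed - Current tasks still running with pids [$pid]."
            last_log=$elapsed
        fi

        sleep 1
    done

    wait "$pid" 2>/dev/null || true
    return 0
}

# Example (commented out): watch a background task with the limits from this post.
# some_long_task & watch_task "$!" 3000 3600 60
```

Note that a pattern like this only covers the task it is given. The hang reported here happens after the watchdog has already stopped the task and a new run has been launched.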

@deajan
Owner

deajan commented Apr 9, 2022

That really sounds like a network problem.
Usually, a mounted drive on a lost network connection leads to an rsync zombie process.
The (o_O) part of the log is just a replacement for a security variable that should never be logged.

Do you have any supervision software on your systems?
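
One way to check for the kind of stuck process described above is to look at process states in `ps` output. This is a minimal sketch under some assumptions: `stuck_procs` is a hypothetical helper name, and it treats state `D` (uninterruptible sleep, typical for I/O on an unreachable mount) and state `Z` (zombie) as "stuck". It is a diagnostic aid, not part of osync.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ps -eo pid,stat,etime,args` output on stdin and
# prints only rsync/ssh processes whose state starts with D or Z.
stuck_procs() {
    # NR > 1 skips the ps header line; $2 is the STAT column.
    awk 'NR > 1 && $2 ~ /^[DZ]/ && $0 ~ /rsync|ssh/ { print }'
}

# Example (commented out), to be run on the initiator and on the replica:
# ps -eo pid,stat,etime,args | stuck_procs
```

If this prints an rsync or ssh process with a long elapsed time while the osync log shows no progress, that would be consistent with the network explanation. An empty result does not rule it out.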

@gaionaus
Author

gaionaus commented Apr 9, 2022

Hi, and thanks for the reply.
Yes, it is a network problem.
osync itself runs just fine.
The only strange behavior I have observed concerns file deletion: it deleted files that it should not have deleted. The clocks of both servers are synced. It is probably caused by the bad network connection on the target server's side. Could the file list creation have been incomplete because of the bad connection?
Anyway, I disabled the deletion of files, and it has been syncing fine for over 3 months now.

@deajan
Owner

deajan commented Apr 10, 2022

A bad network connection shouldn't be an issue for tasks like deletion, since osync simply wouldn't continue running, just as your logs show.
Anyway, I've never lost a single file with osync over the years, so I have no idea what your culprit could be.
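
The "wouldn't continue running" behavior can be illustrated with a small bash sketch. In a pipeline like the one in the log (a listing command piped into `sort`), the pipeline's own exit status is that of the last command, so the listing command's status has to be read from `PIPESTATUS`. The function below is a hypothetical illustration of that guard, assuming bash. It is not osync's actual implementation, and `build_file_list` is an invented name.

```shell
#!/usr/bin/env bash
# Hypothetical guard: run a listing command, sort its output into a file,
# and refuse to keep a partial list if the listing command itself failed.
# Usage: build_file_list OUTPUT_FILE COMMAND [ARGS...]
build_file_list() {
    local out=$1 rc
    shift

    "$@" | sort > "$out"
    # PIPESTATUS[0] holds the exit code of the listing command, not of sort.
    rc=${PIPESTATUS[0]}

    if [ "$rc" -ne 0 ]; then
        echo "listing failed with code $rc; not using partial file list" >&2
        rm -f "$out"
        return 1
    fi
    return 0
}

# Example (commented out): only go on to later steps if the listing succeeded.
# build_file_list /tmp/replica.list rsync -rltD --list-only user@host:/path/ || exit 1
```

With a guard of this shape, an interrupted listing (such as the rsync exit code 20 in the log) stops the run instead of feeding a truncated list into later steps.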
