[rqd] Add frame recovery logic for docker mode #1614

Open
wants to merge 10 commits into
base: master


DiegoTavares
Collaborator

Whenever rqd restarts, it loses track of all the frames it launched that haven't finished. This change adds a new configurable option to back up frame states to a file, which is then used to recover the frame cache state and try to re-bind to the running frames.

This first version only works in docker mode.
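
As an illustration of the backup-and-recover idea described above, here is a minimal sketch. It is not the code from this PR: the `FrameRecord` fields, the function names, and the JSON file format are all assumptions made for the example, and the re-binding to running docker containers is left out.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class FrameRecord:
    """Hypothetical minimal frame state; the real rqd frame object differs."""
    frame_id: str
    pid: int
    container_id: str


def backup_frames(frames, path):
    """Write all frame records to `path` atomically (temp file, then rename)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump([asdict(frame) for frame in frames], handle)
    # os.replace swaps the file in one step, so a crash never leaves a half-written backup.
    os.replace(tmp_path, path)


def recover_frames(path):
    """Rebuild the frame cache from the backup file; empty if no backup exists."""
    if not os.path.exists(path):
        return {}
    with open(path) as handle:
        records = [FrameRecord(**item) for item in json.load(handle)]
    return {record.frame_id: record for record in records}
```

On startup, a daemon following this pattern would call `recover_frames` once and then check which of the recovered frames still have a live container before re-attaching to them.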

@DiegoTavares force-pushed the recover_on_restart_rqd branch 3 times, most recently from 71ae0f9 to 5fa31c7 (December 11, 2024 19:26)
@DiegoTavares force-pushed the recover_on_restart_rqd branch 2 times, most recently from bda928c to b40527b (December 11, 2024 23:59)
@DiegoTavares force-pushed the recover_on_restart_rqd branch from b40527b to 3b24dd1 (December 12, 2024 00:28)
@DiegoTavares force-pushed the recover_on_restart_rqd branch 2 times, most recently from 18bf372 to 4d87bf6 (December 12, 2024 23:04)
lint
@DiegoTavares force-pushed the recover_on_restart_rqd branch from 4d87bf6 to bdbc360 (December 12, 2024 23:35)
@DiegoTavares marked this pull request as ready for review (December 12, 2024 23:36)
@@ -921,7 +990,7 @@ def runLinux(self):
finally:
rqd.rqutil.permissionsLow()

- frameInfo.pid = frameInfo.forkedCommand.pid
+ frameInfo.pid = runFrame.pid = frameInfo.forkedCommand.pid
Collaborator

@ramonfigueiredo ramonfigueiredo Dec 14, 2024

Please double-check this chained (double) assignment and the other similar chained assignments in the code:

frameInfo.pid = runFrame.pid = frameInfo.forkedCommand.pid
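
For reference when reviewing this, a small sketch of how Python evaluates a chained assignment. The `Holder` class and the value `4242` are placeholders invented for this example; they stand in for `frameInfo`, `runFrame`, and `frameInfo.forkedCommand.pid`.

```python
class Holder:
    """Stand-in object with a pid attribute (hypothetical, for illustration only)."""
    pid = None


frame_info = Holder()
run_frame = Holder()
forked_pid = 4242  # placeholder for frameInfo.forkedCommand.pid

# The right-hand side is evaluated once, then bound to each target from left to right.
frame_info.pid = run_frame.pid = forked_pid
```

Both attributes end up bound to the same value. Since a pid is an immutable `int`, the two cannot later diverge through shared mutation; that concern would only apply if the assigned value were a mutable object such as a list or dict.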

if time_till_next > (2 * rqd.rqconstants.RQD_MIN_PING_INTERVAL_SEC):
self.rqCore.onIntervalThread.cancel()
self.rqCore.onInterval(rqd.rqconstants.RQD_MIN_PING_INTERVAL_SEC)
# Atatch to the job and follow the logs
Collaborator

Fix the typo in the comment ("Atatch" should be "Attach"):

Attach to the job and follow the logs

@@ -967,5 +1065,7 @@ def test_runDarwin(self, getTempDirMock, permsUser, timeMock, popenMock):
)



Collaborator

Remove the 2 extra blank lines.

2 participants