👋
I was wondering if there is any value in continuing to enqueue recurring jobs if a queue is paused.
One of our use cases for recurrent jobs is that we call an external API every 5 minutes to take a snapshot of their data and store it in our database. When this third-party API is down for maintenance, we usually pause the queue, so we won't call them for no reason. And when they're back up, we resume the queue, and a bunch of previously enqueued (from recurring.yml) jobs are executed. That usually results in us being throttled by the third-party API.
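To put a rough number on the pile-up, here is a tiny sketch. The 2-hour maintenance window is a made-up figure for illustration, and `jobs_piled_up` is a hypothetical helper, not anything from Solid Queue:

```ruby
# Hypothetical numbers: a 5-minute schedule and a 2-hour maintenance pause.
INTERVAL_SECONDS = 5 * 60

# Number of scheduled runs that fall inside a pause of the given length.
def jobs_piled_up(pause_seconds, interval_seconds = INTERVAL_SECONDS)
  pause_seconds / interval_seconds
end

backlog = jobs_piled_up(2 * 60 * 60)
puts "#{backlog} snapshot jobs become runnable at once on resume"
```

With these assumed numbers, 24 jobs are waiting when the queue resumes, even though only the latest snapshot matters.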
Concurrency control doesn't seem to be helpful because it will just make API calls sequential, but they will still be executed in a short period of time, while we only care about the data once every 5 minutes, as it doesn't change often.
So for now we have monkey-patched the RecurringExecution and RecurringTask models:
--- a/app/models/solid_queue/recurring_execution.rb
+++ b/app/models/solid_queue/recurring_execution.rb
@@ -3,6 +3,7 @@
 module SolidQueue
   class RecurringExecution < Execution
     class AlreadyRecorded < StandardError; end
+    class QueuePaused < StandardError; end
     scope :clearable, -> { where.missing(:job) }
@@ -25,6 +26,8 @@ module SolidQueue
       def record(task_key, run_at, &block)
         transaction do
           block.call.tap do |active_job|
+            raise QueuePaused if active_job.queue.paused?
+
             if active_job && active_job.successfully_enqueued?
               create_or_insert!(job_id: active_job.provider_job_id, task_key: task_key, run_at: run_at)
             end
diff --git a/app/models/solid_queue/recurring_task.rb b/app/models/solid_queue/recurring_task.rb
index 5363f0a..4919217 100644
--- a/app/models/solid_queue/recurring_task.rb
+++ b/app/models/solid_queue/recurring_task.rb
@@ -84,7 +84,7 @@ module SolidQueue
         active_job.tap do |enqueued_job|
           payload[:active_job_id] = enqueued_job.job_id
         end
-      rescue RecurringExecution::AlreadyRecorded
+      rescue RecurringExecution::AlreadyRecorded, RecurringExecution::QueuePaused
         payload[:skipped] = true
         false
       rescue Job::EnqueueError => error
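For anyone who wants to see the intended behaviour without reading the diff, here is a minimal plain-Ruby sketch of the same control flow: raise while the queue is paused, rescue, and report the run as skipped. `FakeQueue` and `enqueue_recurring` are hypothetical stand-ins written for this example. They are not Solid Queue's classes, and the run times are made-up second offsets.

```ruby
# Plain-Ruby stand-ins; none of these classes are Solid Queue's real API.
class QueuePaused < StandardError; end

# Hypothetical in-memory queue that can be paused and resumed.
class FakeQueue
  attr_reader :jobs

  def initialize
    @jobs = []
    @paused = false
  end

  def pause
    @paused = true
  end

  def resume
    @paused = false
  end

  def paused?
    @paused
  end
end

# Mirrors the patched flow: raise while paused, rescue, and report a skip.
def enqueue_recurring(queue, run_at)
  raise QueuePaused if queue.paused?

  queue.jobs << run_at
  { skipped: false, run_at: run_at }
rescue QueuePaused
  { skipped: true, run_at: run_at }
end

queue = FakeQueue.new
enqueue_recurring(queue, 0)   # enqueued
queue.pause
enqueue_recurring(queue, 300) # skipped while paused
enqueue_recurring(queue, 600) # skipped while paused
queue.resume
enqueue_recurring(queue, 900) # enqueued again
p queue.jobs
```

Only the runs scheduled outside the pause end up in the queue, so nothing bursts on resume.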
This seems to work fine, but I don't like the idea of keeping the monkey-patch around. So I was wondering whether this could be upstreamed, possibly as a configurable option. I'm happy to open a PR.
I can think of a few more use cases, such as long-polling an external API, sending reports, updating a cache, and pre-aggregating data. But IMO, none of them would benefit from enqueuing jobs while the queue is paused.
What do you all think? And what are your use cases for recurring jobs?
Thanks!
P.S. Arguably, this is related to #176 and can be solved by having a way to discard duplicate jobs (#523?). Then, if a queue is paused, only one job will be enqueued.
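To illustrate the duplicate-discarding idea from the P.S., here is a rough plain-Ruby sketch. `DedupingQueue` is a hypothetical stand-in, and I'm not claiming this is how #523 would implement it. It only shows why at most one job per task would be waiting after a pause.

```ruby
# Plain-Ruby stand-in for a queue that discards duplicate pending jobs.
# The class and its methods are hypothetical, not Solid Queue's API.
class DedupingQueue
  attr_reader :pending

  def initialize
    @pending = {} # task_key => run_at of the single pending job
  end

  # Enqueue only when no job for this task key is already waiting.
  # Returns true if the job was added, false if it was discarded.
  def enqueue(task_key, run_at)
    return false if @pending.key?(task_key)

    @pending[task_key] = run_at
    true
  end
end

queue = DedupingQueue.new
# While the queue is paused nothing is drained, so the first job stays pending
# and every later run for the same task is discarded.
results = [0, 300, 600].map { |run_at| queue.enqueue("snapshot", run_at) }
p results
p queue.pending.size
```

One difference from skipping outright: the job that survives is the first one enqueued during the pause, so it still runs once on resume.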