RetryBatchCommand causes overlapping of failed jobs when run concurrently with the same Batch ID #1
Title: RetryBatchCommand causes overlapping of failed jobs when run concurrently with the same Batch ID
Description:
The job batching system allows the same failed jobs to be added to the jobs table and executed more than once when several RetryBatchCommand processes run in parallel for the same batch ID. The number of duplicate executions depends on the number of parallel processes. As a result, the number of pending jobs for the batch can become negative.
Steps to reproduce:
Create a batch with, for example, 1000 jobs.
Make the jobs fail (throw an exception).
Run the batch and verify that the value of failed_jobs in the job_batches table is 1000.
Make the jobs succeed (remove the thrown exception).
Run several instances of the php artisan queue:retry-batch {id} command simultaneously or within a short time of each other.
Check the job_batches table, and you will see that pending_jobs has a negative value. This indicates that some jobs have been executed multiple times by different processes.
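The negative counter from the last step can be illustrated with a simplified in-memory model. This is only a sketch, not Laravel code: the function name simulateConcurrentRetries is hypothetical. It assumes the worst case, where every retry process reads the full list of failed job IDs before any other process has removed them, and that each successful run decrements pending_jobs by one.

```php
<?php
// Simplified, in-memory model of the race (not Laravel code).
// Assumptions: failed jobs are still counted as pending, each successful
// run decrements pending_jobs by one, and every retry process re-queues
// every failed job it saw.

function simulateConcurrentRetries(int $failedJobs, int $retryProcesses): int
{
    // All jobs failed, so all of them are still pending.
    $pendingJobs = $failedJobs;

    // Each process queues its own full copy of the failed jobs.
    $queuedRuns = $failedJobs * $retryProcesses;

    // The jobs now succeed; each run decrements the counter once.
    for ($i = 0; $i < $queuedRuns; $i++) {
        $pendingJobs--;
    }

    return $pendingJobs;
}

echo simulateConcurrentRetries(1000, 3), PHP_EOL; // prints -2000
```

With a single retry process the same model ends at 0, which is the expected value.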
Laravel version: 10.13.5
Proposed Solution:
Add a without-job-overlapping option to RetryCommand, which uses a Cache lock to block the execution of a failed job if that job is already being executed.
Make RetryBatchCommand run RetryCommand with this option.
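The idea behind the proposed option can be sketched in plain PHP. This is not the code from this pull request: InMemoryLockStore, retryFailedJobs and the retry-job: key prefix are hypothetical names, and the in-memory store only stands in for an atomic lock shared between processes (in Laravel, something like Cache::lock($name, $seconds)->get()).

```php
<?php
// Hypothetical stand-in for an atomic lock store shared by all processes.
final class InMemoryLockStore
{
    /** @var array<string, bool> */
    private array $held = [];

    // Returns true only for the first caller that claims the key.
    public function acquire(string $key): bool
    {
        if (isset($this->held[$key])) {
            return false;
        }
        $this->held[$key] = true;

        return true;
    }
}

// Re-queues each failed job only if this caller wins the lock for it.
function retryFailedJobs(array $failedJobIds, InMemoryLockStore $locks, array &$queue): void
{
    foreach ($failedJobIds as $id) {
        if (! $locks->acquire("retry-job:{$id}")) {
            continue; // Another retry already claimed this job.
        }
        $queue[] = $id;
    }
}

$locks = new InMemoryLockStore();
$queue = [];
$failedJobIds = ['a', 'b', 'c'];

// Two retry runs over the same batch; the second finds every job locked.
retryFailedJobs($failedJobIds, $locks, $queue);
retryFailedJobs($failedJobIds, $locks, $queue);

echo count($queue), PHP_EOL; // prints 3
```

A real lock would also need an expiry, so that a crashed process does not block later retries of the same job indefinitely.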