You've got a problem. You need to fix it.
But you also need a way to quickly and easily rerun all of your outstanding jobs because that can take hours. Microsoft provide
[msdb].[dbo].[sp_start_job] 'job name'to help out but that's only good for one job.
I've written the following code to rerun any jobs that have failed or been cancelled which fit the given name using the standard like syntax, waiting for a specified delay between each.
Syntax is
usp_ReRunJobs 'job name', 'delay'where 'job name' is literally the textual name of the job ie if all of your full backups are called 'Full Backup Of xxxx' then 'Full Backup %', alternatively, just '%' to do everything, whilst 'delay' is in the format 'hh:mm:ss' ie '00:00:30' means a 30 second delay between each job.
You might have even tried the retries feature of scheduled jobs - that works wonders except where a job is genuinely broken and the retry impacts on the next job and so on, so usp_ReRunJobs could be scheduled to run daily after the usual job complete as a backstop.
Here's the code :
IF OBJECT_ID ( 'usp_ReRunJobs', 'P' ) IS NOT NULLDROP PROCEDURE usp_ReRunJobs
gocreate procedure usp_ReRunJobs (@Name varchar(255) = 'Full Backup%',@Delay varchar(8) = '00:00:60')asbegin
set nocount on
IF ISDATE('2000-01-01 ' + @Delay + '.000') = 0BEGINSELECT 'Invalid time ' + @Delay + ',hh:mm:ss, submitted.';END else begin
create table #xp_results(job_id uniqueidentifier not null,last_run_date int not null,last_run_time int not null,next_run_date int not null,next_run_time int not null,next_run_schedule_id int not null,requested_to_run int not null, -- boolrequest_source int not null,request_source_id sysname collate database_default null,running int not null, -- boolcurrent_step int not null,current_retry_attempt int not null,job_state int not null )declare @JName varchar(255)declare cJobs cursor fast_forward forselect j.namefrom msdb..sysjobs jjoin msdb..sysjobhistory j2 on j.job_id = j2.job_idwhere j.name like @Nameand j.enabled = 1and j2.instance_id = (SELECT max(h.instance_id) FROM [msdb].[dbo].[sysjobhistory] h where j2.job_id = h.job_id )and not exists (select x.job_id from #xp_results x where x.job_id = j.job_id and x.running = 1 )and j2.run_status not in ('1','4')order by j.name
insert #xp_results exec master.dbo.xp_sqlagent_enum_jobs @is_sysadmin = 1, @job_owner= '' open cJobsfetch next from cJobs into @JNamewhile @@fetch_status = 0beginselect 'Started '+@JNameexec msdb..sp_start_job @Job_Name = @JNamewaitfor delay @Delayfetch next from cJobs into @JNameendclose cJobsdeallocate cJobs
drop table #xp_results
endendgo
I'm not claiming it's the prettiest code or the most efficient but it gets the job done. The only thing I haven't managed to crack is getting the job start message appearing during the execution of the code so if you want to watch your jobs you'll have to resort to activity monitor, I have tried raiserror nowait but that didn't work either but that is a downside I can live with given on the second day of having this code in place it saved me about an hour of rerunning tasks by hand.