Recovery options

Use recovery options to define what happens when a job ends abnormally, so that you can resolve the issue and keep job runs going with minimal downtime.

Jobs that run on workstations can end abnormally for various reasons, such as a technical glitch or the failure of an associated service. When you create or update a job, you can specify recovery options that determine whether processing stops or proceeds after a failure. This gives you a way to manage tasks or processes that are left incomplete. You can rerun the same job, or run another job to resolve the issue with the parent job.

Syntax and command line options

You can edit the job definition file as follows to add recovery options:
recovery 
[stop | 
continue |
rerun [same_workstation][repeatevery hhmm][for number attempts]]
[after [[folder/]workstation#][folder/]jobname]
Stop
You can add this option to stop the job stream from proceeding further if the job ends abnormally.
Continue
You can add this option to continue with the successor job even if the predecessor job fails.
Rerun
Add this option to rerun the same job or another job to resolve the issue if the job fails. You can also customize the rerun option with the following parameters to run specific scenarios.
same_workstation
If you add this option, the job reruns on the same workstation where it ran previously.
repeatevery hhmm
You can specify the time interval between consecutive reruns in the hhmm format (hours and minutes).
for number attempts
You can specify the number of times you want to rerun the job (the number of attempts). If you do not specify any value, the job reruns once.
after [[folder/]workstation#][folder/]jobname
You can specify the recovery job. The recovery job can be any job definition in the database. The recovery job can also be a job that is used to fix an issue with the successor job. If you want to use a job definition that is stored on a remote workstation, specify the workstation with or without the folder name.
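The following recovery clauses illustrate how the rerun parameters can be combined, following the syntax shown earlier. This is a sketch rather than output from a real environment: the workstation name WS1 and the job name FIXJOB are hypothetical, and the lines that begin with # are annotations for the reader (assumed here to be treated as comments).

```text
# Rerun the job once (no number of attempts is specified)
recovery rerun

# Rerun on the same workstation, 3 attempts, 10 minutes (0010) between reruns
recovery rerun same_workstation repeatevery 0010 for 3 attempts

# Use the hypothetical recovery job FIXJOB, defined on the hypothetical workstation WS1
recovery rerun after WS1#FIXJOB
```

In the hhmm format, 0010 means 0 hours and 10 minutes, and 0130 means 1 hour and 30 minutes.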
Different scenarios with recovery options
If a job fails, you can use the recovery options to handle different scenarios. The following criteria and table explain how the jobs run when recovery options are added to the job definition.
  • The CRIC job stream contains two jobs, JOB A and JOB B. JOB B is internally dependent on JOB A and does not start if JOB A fails. The recovery job for JOB A is recoveryjob.
| Recovery job | Stop | Continue | Rerun |
|---|---|---|---|
| No | If JOB A ends in ABEND status, then JOB B is not started at all. You can use the confirm command to change the status of JOB A to SUCC, to initiate JOB B. | After JOB A reaches any one of its final states (SUCC, FAIL, ABEND, or SUPPR), JOB B starts. Important: If there is a conditional dependency (see Dependency keywords) between JOB A and JOB B, then it must be satisfied to start JOB B. | If JOB A ends in ABEND status, it triggers a rerun. If it succeeds, then JOB B starts. If the rerun fails, then JOB A is run again until it reaches a successful state. The process continues until the value specified in the number of attempts is reached. If no value is specified for number of attempts, then the job is rerun only once. You can use the confirm command to change the status of JOB A to SUCC, to initiate JOB B. |
| Yes | If JOB A ends in ABEND status, then the recoveryjob is initiated. If the recoveryjob is successful, then JOB B runs. If the recoveryjob fails, the process is stopped. | If JOB A ends in ABEND status, the recoveryjob is run and JOB B starts, unless the recoveryjob ends in FAIL status, in which case the process stops and JOB B does not run. | If JOB A ends abnormally, it triggers the recoveryjob. If the recoveryjob ends successfully, JOB A runs again. Note: The recovery job does not show in the Orchestration Monitor, as only the last instance of a job is displayed. Check in Run instances for information about the recovery job. If JOB A fails after a successful run of recoveryjob, the process is repeated and continues until the value specified in the number of attempts is reached. If no value is specified for number of attempts, the job reruns only once. If the recoveryjob fails, then JOB A is not initiated and the process is stopped. |
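As a sketch, the recovery clause in the definition of JOB A for each of the six scenarios could look as follows. The value 3 for the number of attempts is illustrative only, the workstation qualifier before recoveryjob is omitted for brevity (this assumes the recovery job can be resolved without it), and the lines that begin with # are annotations for the reader.

```text
# Recovery job: No
recovery stop
recovery continue
recovery rerun for 3 attempts

# Recovery job: Yes
recovery stop after recoveryjob
recovery continue after recoveryjob
recovery rerun for 3 attempts after recoveryjob
```

Only one recovery clause applies to a given job definition, so each line above is an alternative for JOB A, not a sequence.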