Recovery options

Recovery options define what happens when a job ends abnormally, so that you can resolve the problem and keep the job stream running with minimal downtime.

Jobs that run on workstations can stop abnormally for various reasons, such as a technical glitch or the failure of an associated service. When you create or update a job, you can specify recovery options that determine whether processing stops or proceeds after such a failure. You can also rerun the same job, or run another job that resolves the problem with the parent job, so that incomplete tasks or processes are not left unresolved.

Syntax and command line options

You can edit the job definition file as follows to add recovery options:

recovery 
[stop | 
continue |
rerun [same_workstation][repeatevery hhmm][for number attempts]]
[after [[folder/]workstation#][folder/]jobname]

Stop: Add this option to stop the job stream from proceeding if the job ends abnormally.

Continue: Add this option to continue with the successor job if the predecessor fails.

Rerun

Add this option to rerun the same job, or to run another job that resolves the issue, if the job fails. You can customize the rerun option with the following parameters.

same_workstation: If you add this option, the job is rerun on the same workstation where it ran previously.

repeatevery hhmm: Specifies the time interval between reruns, in hhmm format.

for number attempts: Specifies the number of times to rerun the job. If you do not specify a value, the job is rerun once.

after [[folder/]workstation#][folder/]jobname: Specifies the recovery job. The recovery job can be any job definition in the database. The recovery job can also be a job that is used to fix the issue of the successor job. If you want to use a job definition that is stored on a remote workstation, specify the workstation with or without the folder name.
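
To illustrate how the syntax and parameters above combine, the following Python sketch assembles a recovery clause from its parts. The helper name `recovery_clause`, the workstation name `WS1`, and the validation rules (a four-digit hhmm value with minutes from 00 to 59, and a positive number of attempts) are assumptions made for this example and are not part of the product.

```python
import re

ACTIONS = ("stop", "continue", "rerun")


def recovery_clause(action, same_workstation=False, repeat_every=None,
                    attempts=None, after=None):
    """Return a recovery clause string that follows the syntax shown above."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}")
    rerun_only = (same_workstation or repeat_every is not None
                  or attempts is not None)
    if rerun_only and action != "rerun":
        raise ValueError("same_workstation, repeatevery and attempts "
                         "apply to rerun only")

    parts = ["recovery", action]
    if same_workstation:
        parts.append("same_workstation")
    if repeat_every is not None:
        # Assumption: hhmm is four digits, with minutes in the range 00-59.
        if not re.fullmatch(r"\d{2}[0-5]\d", repeat_every):
            raise ValueError("repeat_every must be in hhmm format")
        parts += ["repeatevery", repeat_every]
    if attempts is not None:
        if attempts < 1:
            raise ValueError("attempts must be a positive integer")
        parts += ["for", str(attempts), "attempts"]
    if after is not None:
        # The recovery job, written as [[folder/]workstation#][folder/]jobname.
        parts += ["after", after]
    return " ".join(parts)


# Hypothetical values: rerun every 10 minutes, up to 3 times, on the same
# workstation, running WS1#RECOVERYJOB as the recovery job.
print(recovery_clause("rerun", same_workstation=True, repeat_every="0010",
                      attempts=3, after="WS1#RECOVERYJOB"))
# prints: recovery rerun same_workstation repeatevery 0010 for 3 attempts after WS1#RECOVERYJOB
```

The sketch only builds the text of the clause. It does not check that the named recovery job exists in the database.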

Different scenarios with recovery options

If a job fails, you can use the recovery options to handle different scenarios. The following scenario and table explain how jobs run when recovery options are added to the job definition.

The CRIC job stream contains two jobs, `JOB A` and `JOB B`. `JOB B` is internally dependent on `JOB A` and does not start if `JOB A` fails. The recovery job for `JOB A` is `recoveryjob`.


| Recovery job | stop | continue | rerun |
| --- | --- | --- | --- |
| No | If `JOB A` ends in `ABEND` status, `JOB B` is not started. You can use the confirm command to change the status of `JOB A` to `SUCC` and start `JOB B`. | `JOB B` starts after `JOB A` reaches any one of its final states (`SUCC`, `FAIL`, `ABEND`, or `SUPPR`). Important: If there is a conditional dependency (see Dependency keywords) between `JOB A` and `JOB B`, it must be satisfied for `JOB B` to start. | If `JOB A` ends in `ABEND` status, it is rerun. If the rerun succeeds, `JOB B` starts. If the rerun fails, `JOB A` is rerun again, until it succeeds or the value specified for number of attempts is reached. If no value is specified for number of attempts, the job is rerun only once. You can use the confirm command to change the status of `JOB A` to `SUCC` and start `JOB B`. |
| Yes | If `JOB A` ends in `ABEND` status, the `recoveryjob` is started. If the `recoveryjob` succeeds, `JOB B` runs. If the `recoveryjob` fails, the process stops. | If `JOB A` ends in `ABEND` status, the `recoveryjob` runs and then `JOB B` starts, unless the `recoveryjob` ends in `FAIL` status, in which case the process stops and `JOB B` does not run. | If `JOB A` ends abnormally, the `recoveryjob` is started. If the `recoveryjob` ends successfully, `JOB A` runs again. If `JOB A` fails after a successful run of the `recoveryjob`, the process is repeated until the value specified for number of attempts is reached. If no value is specified for number of attempts, the job is rerun only once. If the `recoveryjob` fails, `JOB A` is not rerun and the process stops. Note: The recovery job is not shown in the Orchestration Monitor, because only the last instance of a job is displayed. Check Run instances for information about the recovery job. |
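
The scenarios in the table can be summarized as a small decision function. The following Python sketch is one reading of the table, not a description of how the scheduler is implemented. The function name `job_b_starts` and its parameters are hypothetical. The sketch ignores conditional dependencies and manual use of the confirm command.

```python
def job_b_starts(option, job_a_runs, recovery_runs=None, attempts=1):
    """Return True if JOB B starts, following one reading of the table above.

    job_a_runs: outcomes of successive JOB A runs (True = SUCC, False = ABEND).
    recovery_runs: outcomes of successive recoveryjob runs, or None when no
        recovery job is defined.
    attempts: the number of reruns allowed (1 when no value is specified).
    Each list must hold one outcome for every run that actually takes place.
    """
    if option not in ("stop", "continue", "rerun"):
        raise ValueError("option must be stop, continue, or rerun")

    a_runs = iter(job_a_runs)
    if next(a_runs):
        return True  # JOB A succeeded, so no recovery action is needed
    rec_runs = iter(recovery_runs) if recovery_runs is not None else None

    if option in ("stop", "continue"):
        if rec_runs is None:
            # Without a recovery job, only continue lets JOB B start.
            return option == "continue"
        # With a recovery job, JOB B starts only if the recovery job succeeds.
        return next(rec_runs)

    # option == "rerun"
    for _ in range(attempts):
        if rec_runs is not None and not next(rec_runs):
            return False  # a failed recovery job stops the process
        if next(a_runs):
            return True   # a rerun of JOB A succeeded
    return False          # the number of attempts was reached


print(job_b_starts("stop", [False]))                                  # False
print(job_b_starts("continue", [False]))                              # True
print(job_b_starts("rerun", [False, False, True], attempts=3))        # True
print(job_b_starts("rerun", [False, True], recovery_runs=[False]))    # False
```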