Recovering from errors on the checkpoint data set

About this task

If there is a write error on the checkpoint data set, perform the following procedure:
  1. Stop HCL Workload Automation for Z.
  2. Rename the checkpoint data set to a temporary name.
  3. Allocate a new checkpoint data set.
  4. Copy the old (renamed) checkpoint data set into the new data set. You can do this with ISPF COPY or with IDCAMS REPRO.
  5. Start HCL Workload Automation for Z again.
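
As an illustration of step 4 of the write-error procedure, the copy could be run as a batch IDCAMS REPRO job. This is a minimal sketch, not a supplied sample: the job card values and the data set names TWSZ.CKPT.OLD and TWSZ.CKPT are hypothetical placeholders, and the new data set is assumed to be already allocated as described in step 3.

```jcl
//COPYCKPT JOB (ACCT),'COPY CKPT',CLASS=A,MSGCLASS=X
//* Hypothetical job card and data set names: replace with your own.
//* OLDCKPT = renamed (temporary-name) checkpoint data set (step 2)
//* NEWCKPT = newly allocated checkpoint data set (step 3)
//COPY     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//OLDCKPT  DD DSN=TWSZ.CKPT.OLD,DISP=SHR
//NEWCKPT  DD DSN=TWSZ.CKPT,DISP=OLD
//SYSIN    DD *
  REPRO INFILE(OLDCKPT) OUTFILE(NEWCKPT)
/*
```

Check the IDCAMS output in SYSPRINT for a zero condition code before restarting the scheduler.
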
If there is a read error on the checkpoint data set, perform the following procedure:
  • If a good new-current-plan data set does not exist:
    1. Stop HCL Workload Automation for Z.
    2. Delete the checkpoint data set and reallocate it.
    3. Re-create the current plan using the refresh procedure (for details, see Re-creating the current plan from the long-term plan).
  • If a good new-current-plan data set exists:
    1. Stop HCL Workload Automation for Z.
    2. Check which job-tracking log is the current one. This can be done by reviewing the messages in the message log, or by browsing the JT log and checking the time stamp in position 13 in the first record of the data set. The data set with the latest time stamp in the first record is current.
    3. Copy the data from the active job-tracking log into the job-tracking log referenced by the EQQJT01 ddname.
    4. Determine which JS file was active. If EQQJS1DS defines the current data set, continue with the next step. Otherwise, either copy the EQQJS2DS data set to EQQJS1DS or switch the two ddnames in the JCL procedure.
    5. Delete and reallocate the HCL Workload Automation for Z checkpoint data set.
    6. Change the JTOPTS statement to specify JOBSUBMIT(NO) and CURRPLAN(NEW), and start the scheduler.
    7. Enter the Modify Current Plan dialog to set the correct status for all operations in the current plan.
    8. When all operations have the correct status, enter the SERVICE FUNCTIONS panel and enable job submission again. Restore the JTOPTS statement if you changed it.
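
For reference, the temporary JTOPTS change described in step 6 might look like the following fragment. This is only a sketch: it shows just the two keywords named in the procedure, and any other JTOPTS keywords already coded at your installation are assumed to stay unchanged.

```
JTOPTS JOBSUBMIT(NO) CURRPLAN(NEW)
```

Keep a copy of the original statement so that you can restore it in step 8.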