ALERTS

Purpose

The ALERTS statement specifies the conditions under which HCL Workload Automation for Z generates an alert. You can specify this statement for a tracker, controller, or standby controller. When an alert condition occurs, the alert action that is taken depends on the parameter under which you specified the condition, as described in Parameters.

ALERTS is defined in the member of the EQQPARM library as specified by the PARM parameter on the JCL EXEC statement.

Format


1  ALERTS?  GENALERT (
2.2.1+ ,
2.2.1 DURATION
2.2.1 ERROROPER
2.2.1 LATEOPER
2.2.1 OPCERROR
2.2.1 QLIMEXCEED
1 )?  INCIDENT (
2.2.1+ ,
2.2.1 error condition:member name
1 )?  MAIL (
2.2.1+ ,
2.2.1 error condition:member name
1 )?  MLOG (
2.1! OPCERROR
2.2.1+ ,
2.2.1 DURATION
2.2.1 ERROROPER
2.2.1 LATEOPER
2.2.1 QLIMEXCEED
2.2.1 RESCONT
1 )

1?  MONOPER (
2.1! YES
2.1 NO
1 )?  RECEIVERID (
2.1! NETVALRT
2.1 Alert Receiver ID
1 )?  WTO (
2.2.1+ ,
2.2.1 DURATION
2.2.1 ERROROPER
2.2.1 LATEOPER
2.2.1 OPCERROR
2.2.1 QLIMEXCEED
2.2.1 RESCONT
1 )

Parameters

GENALERT(alert condition,...,alert condition)
Defines the conditions under which HCL Workload Automation for Z sends a generic alert to NetView®.
INCIDENT(alert condition:member_name,...,alert condition:member_name)
This parameter is significant provided that you specified the EQQINCID DD statement in the Z controller JCL procedure. It defines the conditions under which the Z controller notifies the incident specified in the member_name of the EQQINCID data set.

For details about the EQQINCID data set, see Incident data set (EQQINCID).

You can specify up to 8 characters for the member_name.

The supported alert conditions are DURATION, ERROROPER, HIGHRISK, LATEOPER, OPCERROR, POTENTRISK, SPECRES, and WLMOPER.

MAIL(alert condition:member_name,...,alert condition:member_name)
Defines the conditions under which the Z controller sends the email specified in the member_name of the EQQEMAIL data set (for details, see Email data set (EQQEMAIL)). This parameter is significant provided that you specified the EQQEMAIL and EQQSMTP DD statements in the Z controller JCL procedure.

You can specify up to 8 characters for the member_name.

The supported alert conditions are DURATION, ERROROPER, HIGHRISK, LATEOPER, OPCERROR, POTENTRISK, SPECRES, and WLMOPER.
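
The rules for INCIDENT and MAIL values can be pictured as a small check. The following Python sketch is illustrative only and is not part of the product: the function name parse_alert_pairs and the member names INCERR and MAILLATE are hypothetical. It assumes a value is a comma-separated list of condition:member_name pairs, accepts only the supported alert conditions listed above, and enforces the 8-character limit on member_name.

```python
from typing import List, Tuple

# Alert conditions that the text lists as supported for INCIDENT and MAIL.
SUPPORTED_CONDITIONS = {
    "DURATION", "ERROROPER", "HIGHRISK", "LATEOPER",
    "OPCERROR", "POTENTRISK", "SPECRES", "WLMOPER",
}


def parse_alert_pairs(value: str) -> List[Tuple[str, str]]:
    """Split 'COND:MEMBER,COND:MEMBER' into validated (condition, member) pairs.

    Hypothetical helper: it mirrors the documented rules only.
    """
    pairs = []
    for item in value.split(","):
        condition, separator, member = item.strip().partition(":")
        if not separator or not member:
            raise ValueError(f"expected condition:member_name, got {item!r}")
        if condition not in SUPPORTED_CONDITIONS:
            raise ValueError(f"unsupported alert condition: {condition}")
        if len(member) > 8:
            raise ValueError(f"member name longer than 8 characters: {member}")
        pairs.append((condition, member))
    return pairs
```

For example, parse_alert_pairs("ERROROPER:INCERR,LATEOPER:MAILLATE") yields two pairs, while a condition such as RESCONT or a 9-character member name is rejected.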

MLOG(alert condition,...,alert condition|OPCERROR)
Defines the conditions under which HCL Workload Automation for Z writes a message to the message log.
MONOPER(YES|NO)
Defines whether the alert conditions specified in the MONALERT parameter apply only to the jobs that have the EXTERNAL MONITOR option set to YES (MONOPER(YES), the default) or to all jobs (MONOPER(NO)). The alert conditions affected are ERROROPER, LATEOPER, DURATION, and WLMOPER.

It is used with IBM Tivoli Monitoring.

RECEIVERID(Alert Receiver ID|NETVALRT)
Defines the NetView® alert receiver that generic alerts are sent to. Specify this keyword if the alert receiver in the NetView® address space that handles HCL Workload Automation for Z alert automation does not have the NetView® default ID, NETVALRT.
WTO(alert condition,...,alert condition)
Defines the conditions under which HCL Workload Automation for Z generates a write-to-operator (WTO) message.

Alert conditions

You can specify one or more of the following alert conditions for each alert type. Note that only OPCERROR and QLIMEXCEED are applicable to a tracker; other alert conditions are ignored if you specify them.
DURATION
The alert action is taken when an operation in the current plan is active for an unexpectedly long time. This means that an operation that has started (extended status S) must be active longer than its estimated duration multiplied by one of the following values and then divided by 100:
  • The alert action limit that you set for ALEACTION (for details, see JTOPTS).

    -OR-

  • The duration feedback limit that you set for LIMFDBK (for details, see JTOPTS). This value is used if ALEACTION is not set.

For example, if an operation has an estimated duration of 10 minutes and the limit for the alert action is 200, the alert action is taken if the operation is active for longer than 20 minutes. The alert action is also taken if the operation has been started but the associated job or started task has not yet started to run after 10 minutes (no A2/B2 event has been received), that is, the operation has been in status/extended status SU and SQ for a total of more than 10 minutes. The alert action is taken only for operations that are in started status.

For MLOG and WTO alert actions, message EQQE028I is issued for an operation at a general workstation and EQQE038I for an operation at a computer or printer workstation for long running operations. Message EQQE039I is issued for computer operations that have been submitted but have not started.

Notes:
  1. The value used to select the operations for which a long duration alert must be issued is set with the ALEACTION keyword in the JTOPTS statement. If ALEACTION is not specified, the value set for LIMFDBK is used instead. In this case, the value for the feedback limit that you can optionally enter in the application description is ignored.
  2. If the alert action limit is 0 or the alert action limit is not specified and the feedback limit is 0, the alert action is taken as soon as the operation is active longer than its estimated duration.
  3. If the estimated duration of an operation is 99 hours, 59 minutes and 01 seconds, no duration alert is sent for this operation.
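
The threshold arithmetic described for DURATION can be written out as a short calculation. The following Python sketch is an illustration under stated assumptions, not the product's implementation: the function name duration_alert_threshold and its parameter names are hypothetical, durations are expressed in seconds, and an unset ALEACTION is represented by None.

```python
from typing import Optional

# Note 3: an estimated duration of 99 hours, 59 minutes, 01 seconds
# means that no duration alert is sent.
NO_ALERT_DURATION_SECONDS = 99 * 3600 + 59 * 60 + 1


def duration_alert_threshold(estimated_seconds: int,
                             limfdbk: int,
                             aleaction: Optional[int] = None) -> Optional[float]:
    """Return the active time in seconds after which the alert action is taken.

    Returns None when no duration alert applies (note 3).
    """
    if estimated_seconds == NO_ALERT_DURATION_SECONDS:
        return None
    # ALEACTION takes precedence; LIMFDBK is used only if ALEACTION is not set.
    limit = aleaction if aleaction is not None else limfdbk
    if limit == 0:
        # Note 2: a limit of 0 means the estimated duration itself is the threshold.
        return float(estimated_seconds)
    return estimated_seconds * limit / 100
```

With an estimated duration of 600 seconds (10 minutes) and an alert action limit of 200, the function returns 1200.0 seconds, which matches the 20 minutes in the example above.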
ERROROPER
The alert action is taken when an operation in the current plan is set to ended-in-error status. For MLOG and WTO alert actions, message EQQE026I is issued for an operation at a general workstation, and EQQE036I for an operation at a computer or printer workstation.
HIGHRISK
The alert action is taken when the risk level of a critical operation in the current plan has become High. Message EQQCP21I is issued for a critical operation at a general workstation.

The alert condition HIGHRISK is valid only for the INCIDENT and MAIL parameters.

LATEOPER
The alert action is taken when an operation in the current plan becomes late. An operation is considered late if it reaches its latest start time and does not have the status started, complete, or deleted. For MLOG and WTO alert actions, message EQQE027I is issued for an operation at a general workstation and EQQE037I for an operation at a computer or printer workstation.
Note: Use LATEOPER only when deadlines are accurate because it can affect the performance of HCL Workload Automation for Z.
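
The lateness rule can be stated as a single predicate. The following Python sketch is illustrative only: the function name is_late is hypothetical, and the status values are plain descriptive labels rather than the product's status codes.

```python
from datetime import datetime

# Statuses that, per the text, prevent an operation from being considered late.
NOT_LATE_STATUSES = {"started", "complete", "deleted"}


def is_late(status: str, latest_start: datetime, now: datetime) -> bool:
    """Return True if the operation has reached its latest start time
    without being started, complete, or deleted."""
    return now >= latest_start and status not in NOT_LATE_STATUSES
```

Because the check depends entirely on the latest start time, inaccurate deadlines produce inaccurate alerts.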
OPCERROR
The alert action is taken when an HCL Workload Automation for Z subtask or HCL Workload Automation for Z subsystem ends unexpectedly. For MLOG and WTO alert actions, messages EQQZ045W and EQQN019E are issued. If the GENALERT action is specified and the EQQN019E alert condition occurs, the subtask-failed alert is sent to NetView®.
Note: OPCERROR is always in effect for MLOG.
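
Two rules in this section interact: OPCERROR is always in effect for MLOG, and on a tracker only OPCERROR and QLIMEXCEED apply. The following Python sketch combines them for illustration. The function name effective_conditions and its parameters are hypothetical and do not correspond to any product interface.

```python
from typing import Iterable, Set

# Only these alert conditions are applicable to a tracker; others are ignored.
TRACKER_CONDITIONS = {"OPCERROR", "QLIMEXCEED"}


def effective_conditions(action: str,
                         specified: Iterable[str],
                         is_tracker: bool = False) -> Set[str]:
    """Return the alert conditions that are in effect for one alert action."""
    conditions = set(specified)
    if action == "MLOG":
        # OPCERROR is always in effect for MLOG, even if it is not specified.
        conditions.add("OPCERROR")
    if is_tracker:
        # Conditions other than OPCERROR and QLIMEXCEED are ignored on a tracker.
        conditions &= TRACKER_CONDITIONS
    return conditions
```

For MLOG(ERROROPER,LATEOPER,DURATION) on a controller, the result also contains OPCERROR. For WTO(DURATION,QLIMEXCEED) on a tracker, only QLIMEXCEED remains.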
POTENTRISK
The alert action is taken when the risk level of a critical operation in the current plan has become Potential. Message EQQCP20I is issued for a critical operation at a general workstation.

The alert condition POTENTRISK is valid only for the INCIDENT and MAIL parameters.

QLIMEXCEED
The alert action is taken each time an HCL Workload Automation for Z subtask queue exceeds a threshold value. Except for the event-writer queue, the threshold values are multiples of 10 between 10% and 90%, and then 95% and 99%. HCL Workload Automation for Z checks the size of a queue when an event is added to it. Except for the event-writer queue, HCL Workload Automation for Z subtask queues can contain up to 32,000 elements.

The size of the event-writer queue is determined by the ECSA you allocate. The queue is checked each time the event writer is about to read events; the alert action is taken if the queue exceeds 50%. If the event-writer queue becomes full, a message is issued indicating how many events have been lost.

The value in the alert shows the actual percentage used, which will be more than the threshold value. For MLOG and WTO alert actions, message EQQZ106W is issued.
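
The threshold ladder for subtask queues other than the event-writer queue can be enumerated directly. The following Python sketch is illustrative only: the function name highest_threshold_exceeded is hypothetical, the 32,000-element capacity is the maximum quoted above, and "exceeds" is interpreted as strictly greater than the threshold. It does not model when the product repeats an alert.

```python
from typing import Optional, Tuple

# Multiples of 10 between 10% and 90%, followed by 95% and 99%.
THRESHOLDS = list(range(10, 100, 10)) + [95, 99]
MAX_ELEMENTS = 32000


def highest_threshold_exceeded(queue_length: int,
                               capacity: int = MAX_ELEMENTS
                               ) -> Tuple[Optional[int], float]:
    """Return (highest threshold exceeded or None, actual percentage used)."""
    used = 100 * queue_length / capacity
    exceeded = [t for t in THRESHOLDS if used > t]
    return (exceeded[-1] if exceeded else None), used
```

A queue holding 3,300 of 32,000 elements is 10.3125% full, so the 10% threshold is the highest one exceeded, and the reported percentage is higher than the threshold, as the text describes.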

RESCONT
You can specify RESCONT (resource contention) only for MLOG and WTO alert types. The alert action is taken when an operation has been waiting on a resource queue for the time specified on the CONTENTIONTIME keyword of RESOPTS. Message EQQQ515W is issued.
SPECRES
The alert action is taken when the time that an operation in the current plan is waiting to allocate a given resource exceeds the time specified by the RESOPTS CONTENTIONTIME parameter. This alert takes effect when it is defined in the MONALERT parameter.
WLMOPER
The alert action is taken when an operation in the current plan is promoted by WLM. The alert is sent only if specified in the MONALERT parameter.

Example

 ALERTS MLOG(ERROROPER,LATEOPER,DURATION)   1
        WTO(DURATION)                       2
        GENALERT(ERROROPER)                 3
        MONALERT(DURATION,OPCERROR,WLMOPER) 4
        MONOPER(YES)                        5

In this example of an ALERTS statement:
1
HCL Workload Automation for Z writes a message to the message log for operations that are set to ended-in-error status, are late, or are active for an unexpectedly long time. Although it is not specified, the OPCERROR condition also applies for the MLOG alert action.
2
A write-to-operator message is generated for operations that are active for an unexpectedly long time.
3
HCL Workload Automation for Z sends a generic alert to NetView® for operations that are set to ended-in-error status. By default, generic alerts are sent to the alert receiver NETVALRT.
4
HCL Workload Automation for Z sends a generic alert to IBM® Tivoli® Monitoring for operations that have long durations, for HCL Workload Automation for Z subtasks or subsystems that end unexpectedly, and for operations promoted by WLM.
5
HCL Workload Automation for Z sends a generic alert to IBM® Tivoli® Monitoring only for monitored operations (operations with EXTERNAL MONITOR = YES) satisfying the conditions specified in the MONALERT parameter and not for all jobs.