Automated implementation
Given two MVS™ systems, A and B, which do not share DASD, fault entries that are created on system A can be copied automatically to system B for viewing or reanalysis by using the Z Abend Investigator ISPF interface.
The entries are copied by using a Notification user exit on system A, which submits a TSO batch job to transmit fault entries as PDS members to a dedicated user ID on system B. On system B, a continually running batch TSO job receives fault entries into a staging data set. It then calls the HFZUTIL batch utility to import the fault entry into a local history file.
See Customizing Z Abend Investigator by using user exits for information about user exits in general. See Notification user exit for information about the Notification user exit specifically. See Managing history files (HFZUTIL utility) for information about the HFZUTIL batch utility.
NFYEXIT: sample Notification user exit
nodeid = 'MVSB' /* <--- verify/change */ ❶
userid = 'HFZROBOT' /* <--- verify/change */ ❷
jobcard = '//NOTIFY JOB MSGCLASS=Z' /* <--- verify/change */ ❸
/*********************************************************************/ ❹
/* Optionally, add checks here for selective transmission of fault */
/* entries that match certain criteria. */
/* For example: */
/* If ENV.USER_ID ¬= "FRED" then exit 0 */
/* If ENV.USER_HFZHIST ¬= "MY.HISTFILE" then exit 0 */
/*********************************************************************/
"MAKEBUF"
queue jobcard
queue '//**************************************************************'
queue '//* Export fault entry'
queue '//**************************************************************'
queue "//EXPORT EXEC PGM=HFZUTIL"
queue "//DD1 DD DISP=(,PASS),"
queue "// SPACE=(CYL,(10,100,5),RLSE),"
queue "// DCB=(DSORG=PO,RECFM=VB,LRECL=10000)"
queue "//SYSPRINT DD SYSOUT=*"
queue "//SYSIN DD *"
queue " EXPORT("ENV.HFZHIST"("ENV.FAULT_ID"),DD1)"
queue "/*"
queue '//**************************************************************'
queue '//* Terse the export data set'
queue '//**************************************************************'
queue "//TERSE EXEC PGM=AMATERSE,PARM='PACK'"
queue "//SYSPRINT DD SYSOUT=*"
queue "//SYSUT1 DD DISP=SHR,DSN=*.EXPORT.DD1"
queue "//SYSUT2 DD DISP=(,PASS),"
queue "// SPACE=(CYL,(10,100),RLSE)"
queue '//**************************************************************'
queue '//* Perform TSO XMIT of the exported and tersed fault entry'
queue '//**************************************************************'
queue "//XMIT EXEC PGM=IKJEFT01"
queue "//DD1 DD DISP=SHR,DSN=*.TERSE.SYSUT2"
queue "//SYSTSPRT DD SYSOUT=*"
queue "//SYSTSIN DD *"
call q_rec " XMIT" nodeid"."userid "DDNAME(DD1) -"
call q_rec " NONOTIFY"
queue '/*'
/* 'Submit' the stacked TSO batch job */
n = queued()
"HFZALLOC DD(DD1) SYSOUT PGM(INTRDR)"
if rc = 0 then do /* allocation worked so generate output */
address mvs "EXECIO" n "DISKW DD1 (FINIS"
"HFZFREE DD(DD1)"
say 'Fault entry' ENV.FAULT_ID 'sent to' nodeid'.'userid
end
else do
"HFZWTO Allocation of INTRDR failed"
say 'Fault entry' ENV.FAULT_ID 'job submission failure'
end
exit 0
/* Pad record with blanks to 80 bytes. */
q_rec: procedure
parse arg rec
if (length(rec) < 80) then rec = rec||copies(' ',80-length(rec))
queue rec
return 0
- ❶ 'nodeid' specifies the target system to which the fault entry is sent.
- ❷ 'userid' specifies the user ID for which fault entries are received on the target system. Use this user ID solely to receive fault entries.
- ❸ Ensure that the job card adheres to local standards.
- ❹ You can add checks here to see whether a fault is eligible to be sent to another system. The example shows how the user ID or history file name can be used, but any fields in the ENV or NFY data areas can be checked.
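As a minimal sketch of such a check, the following fragment transmits only fault entries that were created under one of a short list of user IDs. The variable name send_users and the user IDs in it are hypothetical; ENV.USER_ID is the field used in the commented example in the exec. The fragment is intended for the point marked ❹ in NFYEXIT, before the job is stacked.

```rexx
/* Hypothetical list of user IDs whose faults are transmitted */
send_users = 'FRED PROD01 PROD02'

/* wordpos returns 0 when ENV.USER_ID is not a word in the list */
if wordpos(strip(ENV.USER_ID), send_users) = 0 then
  exit 0                      /* not eligible: skip transmission */
```

Keeping the list in one variable means that adding or removing a user ID does not require another If statement.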
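To enable the NFYEXIT exec, specify the following options, where exec.lib is the data set that contains the exec:
DataSets(HFZEXEC(exec.lib))
Exits(NOTIFY(REXX(NFYEXIT)))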
HFZROBOT: sample REXX exec to receive fault entries
- Receive files for the HFZROBOT user into a staging data set from where they are imported into a local history file by using the HFZUTIL batch utility.
- Create the HFZUTIL IMPORT user exit, HFZROBEX (see Sample REXX HFZUTIL user exit).
histfile = 'B.HIST' /* <--- verify/change */ ❺
temphist = 'B.TEMP' /* <--- verify/change */ ❻
temprecv = 'B.TEMPRECV' /* <--- verify/change (assumed name: receives tersed sequential input) */
seconds = '60' /* <--- verify/change */ ❼
use_exit = 'Y' /* <--- Y|N. verify/change */ ❽
address tso
x = prompt('on')
x = outtrap('var.',10,'noconcat')
do forever
/* Obtain information about transmitted data on the JES output queue */
if queued() = 0 then queue 'end'
'receive'
input = 'N'
/* Examine the output from the 'dummy' receive command.
The following variables are initialized:
dsn - the 'sending' history file name
fromid - the user ID performing the TSO XMIT
node - the JES node from which the fault entry was sent
faultid - the fault ID (member name) */
do i = 1 to var.0
parse var var.i msgno t1 t2 t3 t4 t5 t6
if msgno = 'INMR901I' then do
dsn = t2
fromid = t4
node = t6
end
else if msgno = 'INMR902I' then do
faultid = t2
input = 'Y'
leave
end
end
/* Perform actual receive to the staging history file followed by an
HFZUTIL batch utility import if there is data available */
if input = 'Y' then do
if faultid <> "" then do
/* Receiving a PDS/E. */
say 'Receiving' dsn'('faultid') from' node'.'fromid
queue "DSN('"temphist"')"
queue 'END'
'RECEIVE'
end
else do
/* Receiving a sequential data set - assume AMATERSE PACKed. */
say 'Receiving' dsn 'from' node'.'fromid
queue "DSN('"temprecv"')"
queue 'END'
'RECEIVE'
/* Perform AMATERSE UNPACK. */
"ALLOC DD(SYSPRINT) DUMMY"
"ALLOC DD(SYSUT1) DA('"temprecv"') SHR"
"ALLOC DD(SYSUT2) DSN('"temphist"')",
"NEW CATALOG UNIT(SYSALLDA) RECFM(V B) LRECL(10000)",
"CYLINDERS SPACE(10,100) DIR(5)"
address tso "CALL *(AMATERSE) 'UNPACK'"
say 'UNPACK rc =' RC
/* Get fault ID (member name). */
"LISTDS '"temphist"' MEMBERS"
/* Sample output: */
/* FRED.$$TEMP$$.HIST */
/* --RECFM-LRECL-BLKSIZE-DSORG */
/* VB 10000 27998 PO */
/* --VOLUMES-- */
/* E$US21 */
/* --MEMBERS-- */
/* F01103 */
mbr_start = 0
do i = 1 to var.0
/*say "var."i"='"var.i"'"*/
if mbr_start = 0 then do
if strip(var.i) = "--MEMBERS--" then do
mbr_start = i + 1
leave
end
end
end
if mbr_start = var.0 then do
/* One, and only one, member. */
faultid = strip(var.mbr_start)
end
else do
say 'ERROR: More than one member found in data set' temphist,
'- terminating'
exit 12
end
'FREE DD(SYSUT2)'
'FREE DD(SYSUT1)'
'FREE DD(SYSPRINT)'
"DELETE '"temprecv"'"
end
/* The target history file in the 'histfile' variable could be */
/* determined here based on any of the initialized variables */
/* dsn, fromid, node or faultid. This sample EXEC uses a single */
/* history file only. */ ❾
/* Perform HFZUTIL IMPORT. */
'ALLOC DD(SYSIN) NEW REU UNIT(VIO) RECFM(F B) LRECL(80)'
"ALLOC DD(SYSPRINT) SYSOUT"
if use_exit = 'Y' then
parms.1 = "EXITS(IMPORT(REXX(HFZROBEX)))"
else
parms.1 = "* Using HFZOPTLM for dump data set names"
parms.2 = "IMPORT("histfile","
parms.3 = " "temphist"("faultid"),PACKAGE)"
parms.0 = 3
"EXECIO * DISKW SYSIN (STEM parms. FINIS"
address tso "CALL *(HFZUTIL)"
say 'IMPORT rc =' RC
end
else do
/* Sleep for the interval in 'seconds' before attempting to receive again */
address tso "call *(hfzsleep) '"seconds"'"
end
end
- ❺ This item is the name of the target history file in which the received fault entries are placed. To select the history file based on where the fault originally occurred, see ❾.
- ❻ This item is a staging data set that is used for the TSO RECEIVE command and from which fault entries are imported into the target history file.
Important: Do not use a preallocated data set. Let the exec allocate and delete this staging data set for each fault received, as shown in the sample provided.
For the HFZROBOT exec and HFZUTIL IMPORT processing to work, the staging data set must never be used as a regular history file and must never contain more than a single member. If it is used as a regular history file (for example, if it is displayed using the Fault Entry List display, or used as the target of an HFZUTIL FILES or LISTHF control statement), then a $$INDEX member will likely be created, which causes the processing to fail. Also, the data set might become HFZS subsystem managed, which subsequently results in serialization issues.
Ensuring that the staging data set exists only for the duration of the receive and IMPORT processing eliminates the possibility of these issues.
- ❼ The HFZROBOT exec enters a WAIT state to conserve resources between checks for fault entries to be received. Specify the interval between checks, in seconds, here. All fault entries on the JES output queue for the chosen user ID are received before the HFZROBOT exec enters the WAIT.
- ❽ 'use_exit' specifies whether the HFZROBEX user exit is used:
- If you set the RFRDSN, XDUMPDSN, and SDUMPDSN options to valid data set name patterns in the HFZOPTLM configuration-options load module, there is no need to use the HFZROBEX user exit. (See Customize Z Abend Investigator by using an HFZOPTLM configuration-options module.) In this case, set "use_exit" to 'N'.
- If you use the HFZROBEX user exit, any dump data set names provided by the exit override the equivalent option setting in HFZOPTLM.
- ❾ The sample exec uses only a single target history file for all received fault entries. It is possible to assign a target history file based on one of these items:
- The original history file name (in variable 'dsn').
- The sending user ID (in variable 'fromid').
- The node ID from where it was sent (in variable 'node').
- The fault ID itself (in variable 'faultid').
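The selection described for ❾ can be sketched as follows. The fragment first parses a sample line in the assumed format of the INMR901I message, in the same way as the exec does, and then maps the sending node to a history file. The node names MVSA and MVSC and the data set names B.HIST.MVSA and B.HIST.MVSC are hypothetical. In HFZROBOT itself, only the select group would be needed, at the point marked ❾, because 'node' is already set by then.

```rexx
/* Sample line in the assumed format of TSO RECEIVE message INMR901I */
line = 'INMR901I Dataset A.HIST from FRED on MVSA'
parse var line msgno t1 t2 t3 t4 t5 t6
dsn    = t2                   /* sending history file name */
fromid = t4                   /* sending user ID           */
node   = t6                   /* sending JES node          */

/* Hypothetical mapping from sending node to local history file */
select
  when node = 'MVSA' then histfile = 'B.HIST.MVSA'
  when node = 'MVSC' then histfile = 'B.HIST.MVSC'
  otherwise               histfile = 'B.HIST'
end
say 'Importing into' histfile /* displays: Importing into B.HIST.MVSA */
```

The otherwise clause keeps a default history file, so fault entries from an unexpected node are still imported.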
The following sample job, HFZSTSOB, runs the HFZROBOT exec in batch TSO:
//HFZSTSOB JOB <job card parameters>
// SET EXECDSN=exec.lib <--- verify/change
//TSOBATCH EXEC PGM=IKJEFT01
//SYSEXEC DD DISP=SHR,DSN=&EXECDSN.
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSIN DD DUMMY
//SYSTSIN DD *
HFZROBOT
/*
//HFZEXEC DD DISP=SHR,DSN=*.TSOBATCH.SYSEXEC
//HFZTRACE DD SYSOUT=*
Because the HFZROBOT exec never exits, the HFZSTSOB job runs indefinitely. However, the exec causes the job to enter a WAIT state between attempts to receive incoming data, which avoids using unnecessary resources. To end the job, use the MVS™ CANCEL command during a period of inactivity. Alternatively, the exec could be made to recognize a special file that, if sent to the selected user ID, triggers the exec to terminate.
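One way to sketch the special-file approach is to treat a transmission whose data set name ends in a reserved qualifier as a stop request. The qualifier STOPROBO and the variable names stop_llq and llq are hypothetical, and the sample value that is assigned to 'dsn' stands in for the value that the exec takes from the INMR901I message.

```rexx
/* Hypothetical reserved low-level qualifier that requests shutdown */
stop_llq = 'STOPROBO'

dsn = 'A.HIST.STOPROBO'       /* sample value; set from INMR901I in the exec */

/* Extract the last qualifier of the data set name */
llq = substr(dsn, lastpos('.', dsn) + 1)
if llq = stop_llq then do
  say 'Stop request received - HFZROBOT terminating'
  exit 0
end
```

In practice, the stop file would also have to be removed from the JES output queue before the exec exits. Otherwise, the next run of HFZROBOT would find the same file and terminate immediately.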
A started task could be defined instead to execute this JCL in order to prevent tying up a JES initiator.