Condensing usage data with the ZCAT utility

The ZCAT utility concatenates and condenses Usage Monitor data sets and generates a file that is then processed by the Usage Import program. When you condense the data produced by the Usage Monitor program, you can save storage space and improve the performance of the Usage Import program.

The Usage Monitor started task produces at least one usage data set per day. You can design a workflow that runs the ZCAT utility on these data sets weekly, fortnightly, or monthly before the Usage Import program processes them. A weekly run is often useful, but the best interval depends on the amount of data that is produced and processed at your site.

The Usage Monitor program records which job, account ID, and user ID use each module of a particular library on a given date. This information is written to multiple files, which are produced daily. The ZCAT utility condenses the files in the following manner:

Usage data across multiple files is condensed to a monthly granularity, as are the records stored in the Repository database.
Redundant records, and records that are not stored in the database, are omitted.
Optionally, condensation can apply to user IDs, job names, or account ID details.
The ZCAT output file is compressed and ready to be transmitted for Usage Import processing.
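
The condensation to monthly granularity can be pictured with a minimal Python sketch. It is not the ZCAT implementation: the record fields (day, module, job, user, account) and the function name condense_monthly are assumptions made for illustration, and the real Usage Monitor record layout is not shown in this topic.

```python
from collections import Counter
from datetime import date

def condense_monthly(records):
    """Collapse daily usage records into one count per month and key.

    Each record is a hypothetical tuple (day, module, job, user, account).
    """
    totals = Counter()
    for day, module, job, user, account in records:
        # Dropping the day of month gives monthly granularity.
        key = (day.strftime("%Y-%m"), module, job, user, account)
        totals[key] += 1  # repeated records fold into a single entry
    return dict(totals)

# Three daily records, two of them identical apart from the day.
daily = [
    (date(2024, 3, 1), "MODA", "PAYROLL1", "ALICE", "ACCT01"),
    (date(2024, 3, 2), "MODA", "PAYROLL1", "ALICE", "ACCT01"),
    (date(2024, 3, 2), "MODA", "PAYROLL1", "BOB", "ACCT01"),
]
monthly = condense_monthly(daily)
```

In this sketch the three daily records become two monthly entries, because the two ALICE records differ only in the day.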

The following diagram shows the syntax of program parameters to run the ZCAT utility.

Figure 1. ZCAT utility syntax

Catalog search parameters

UMDSN and UMMASK are mutually exclusive. One must be specified if the ZCAT0001 DD is not allocated.

UMDSN(hlq): hlq is the Usage Monitor data set high-level qualifier. When the UMDSN parameter is specified, ZCAT concatenates all data sets that have names of the form hlq.Dyyyyddd.Thhmmsst, where yyyyddd and hhmmsst are the timestamp patterns of data sets produced by the Usage Monitor. The hlq can contain the wildcard characters percent and asterisk. The percent character masks a single character, and the asterisk character masks all characters. For example, UMDSN(hlq.**) searches for all data set names that match hlq.**.D%%%%%%%.T%%%%%%%.
UMMASK(dsnmask): dsnmask is the full data set name mask used as the search criterion. It can be used to search for files whose names differ from the pattern produced by the Usage Monitor. This parameter is useful if the files produced by the Usage Monitor have been renamed but still need processing. Specifying UMMASK(hlq.D%%%%%%%.T%%%%%%%) is equivalent to specifying UMDSN(hlq).

Note: An easy way to remember the difference between UMDSN and UMMASK is to remember that UMDSN can accept the data set name prefix value specified in the Usage Monitor DSN setting, whereas UMMASK requires a mask which will match the entire data set name.
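
The relationship between UMDSN and UMMASK can be illustrated with a short Python sketch. It follows the wildcard description in this topic (percent for one character, asterisk for all characters) and is a simplification: a real catalog search may treat single and double asterisks differently across qualifiers. The function names and the sample data set names are hypothetical.

```python
import re

# Dyyyyddd.Thhmmsst expressed as a mask: seven '%' after each letter.
USAGE_MONITOR_SUFFIX = ".D" + "%" * 7 + ".T" + "%" * 7

def mask_to_regex(mask):
    """Translate a data set name mask into a compiled regular expression."""
    parts = []
    for token in re.findall(r"\*+|%|[^*%]+", mask):
        if token.startswith("*"):
            parts.append(".*")             # any run of characters
        elif token == "%":
            parts.append(".")              # exactly one character
        else:
            parts.append(re.escape(token))  # literal text
    return re.compile("".join(parts))

def umdsn_to_mask(hlq):
    """Expand a UMDSN prefix into the equivalent full UMMASK value."""
    return hlq + USAGE_MONITOR_SUFFIX

def select(names, mask):
    """Return the names that the mask matches in full."""
    pattern = mask_to_regex(mask)
    return [name for name in names if pattern.fullmatch(name)]

catalog = [
    "PROD.UM.D2024061.T0930001",
    "PROD.UM.D2024061.S0930001",
    "PROD.UM.ARCHIVE.D2024062.T1015002",
]
by_prefix = select(catalog, umdsn_to_mask("PROD.UM"))
by_mask = select(catalog, "PROD.UM.D" + "%" * 7 + ".T" + "%" * 7)
by_wildcard = select(catalog, umdsn_to_mask("PROD.**"))
```

Here by_prefix and by_mask select the same single data set, while the wildcard prefix also picks up the data set under the extra ARCHIVE qualifier. The data set with the S qualifier is never selected.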

Input data sets found by searching the catalog may be zipped or unzipped. If zipped, then records before the first Usage Monitor header record will be discarded. If unzipped, the data set will not be processed unless the first record is a Usage Monitor header record.
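
The header rule above can be sketched in Python as follows. The header record layout is not described in this topic, so the sketch takes a caller-supplied is_header predicate, and the sample uses a made-up HDR prefix. Returning None for zipped input that contains no header at all is also an assumption.

```python
def usable_records(records, zipped, is_header):
    """Return the records that would be processed, or None to skip the input."""
    if not zipped:
        # Unzipped input must begin with a header record.
        if not records or not is_header(records[0]):
            return None
        return list(records)
    # Zipped input: discard everything before the first header record.
    for index, record in enumerate(records):
        if is_header(record):
            return list(records[index:])
    return None  # assumption: no header means nothing to process

def looks_like_header(record):
    return record.startswith("HDR")  # hypothetical marker, illustration only

sample = ["leading data", "HDR 2024061", "usage record"]
from_zip = usable_records(sample, zipped=True, is_header=looks_like_header)
from_plain = usable_records(sample, zipped=False, is_header=looks_like_header)
```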

Data set disposition parameters

One or more optional parameters can follow the mandatory parameters.

DELETE: Delete the input data sets after the output data set is successfully generated. NODELETE is the default.
NORENAME: Do not rename input data sets from hlq.D*.T* to hlq.D*.S* after the output data set is successfully generated. By default, ZCAT renames these input data sets so that a later ZCAT run does not reprocess them, which prevents double counting of usage data. Use NORENAME only if you rename the data sets yourself before further ZCAT processing. This parameter is set automatically when UMMASK is used.

RENAME must not be explicitly specified with DELETE.

Data sets allocated to the ZCAT0001 DD are not included in RENAME and DELETE processing.
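
As an illustration of the default rename and of the RENAME and DELETE restriction, consider the following Python sketch. It only manipulates names and does not touch any data set. The function names are hypothetical, and the options are modeled as a set of keywords rather than as the real parameter string.

```python
import re

# hlq.Dyyyyddd.Thhmmsst, with the T qualifier captured separately.
_UNPROCESSED = re.compile(r"^(?P<prefix>.+\.D\d{7}\.)T(?P<time>\d{7})$")

def processed_name(dsn):
    """Return the hlq.D*.S* name for an hlq.D*.T* data set, else None."""
    match = _UNPROCESSED.match(dsn)
    if match is None:
        return None
    return f"{match['prefix']}S{match['time']}"

def check_disposition(options):
    """Reject RENAME combined with DELETE, as stated in the text."""
    if "RENAME" in options and "DELETE" in options:
        raise ValueError("RENAME must not be specified with DELETE")

renamed = processed_name("PROD.UM.D2024061.T0930001")
already_done = processed_name("PROD.UM.D2024061.S0930001")
check_disposition({"NORENAME", "DELETE"})  # accepted in this sketch
```

Because a renamed data set no longer matches the T pattern, a second pass over the same prefix skips it, which is how the rename avoids double counting.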

Optional condensation parameters

Using the ZCAT utility options to condense data further improves performance and saves storage space. These options ignore data differences that are not important at your site and that do not appear in your regular reporting. You can still point the Usage Monitor File Detail Report to the saved archive of the consolidated detail file (ZCATDETL), which the ZCAT utility produces, or to the Usage Monitor output files.

JNM

JNM is used to condense data based on job names.

JNM=N - Condense different job names to the generic names -STC-, -JOB-, -TSO-, or -SYS-.

JNM=Y - Preserve collected job names.

The shipped version of the HZASZCAT sample job specifies Y.

UID

UID is used to condense data based on user IDs.

UID=N - Replace collected user identifiers with blanks.

UID=Y - Preserve collected user identifiers.

The shipped version of the HZASZCAT sample job specifies Y.

JAC

JAC is used to condense data based on job account codes.

JAC=N - Replace collected job account codes with blanks.

JAC=Y - Preserve collected job account codes.

The shipped version of the HZASZCAT sample job specifies Y.
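
The effect of JNM, UID, and JAC on a single record can be sketched as follows. This is an illustration under stated assumptions, not the ZCAT logic: the record is a hypothetical dictionary, and the job_type field used to choose a generic job name is invented for the example, because this topic does not say how ZCAT selects between -STC-, -JOB-, -TSO-, and -SYS-.

```python
GENERIC_JOB_NAMES = {"STC": "-STC-", "JOB": "-JOB-", "TSO": "-TSO-", "SYS": "-SYS-"}

def apply_options(record, jnm="Y", uid="Y", jac="Y"):
    """Return a copy of the record condensed according to JNM, UID and JAC.

    The record is a hypothetical dict with the keys job, job_type, user
    and account; job_type is an assumed field.
    """
    condensed = dict(record)
    if jnm == "N":
        condensed["job"] = GENERIC_JOB_NAMES[record["job_type"]]
    if uid == "N":
        condensed["user"] = ""     # stands in for a blank-filled field
    if jac == "N":
        condensed["account"] = ""  # stands in for a blank-filled field
    return condensed

record = {"job": "PAYROLL1", "job_type": "JOB", "user": "ALICE", "account": "ACCT01"}
unchanged = apply_options(record)
condensed = apply_options(record, jnm="N", uid="N", jac="N")
```

With every option set to N, records that differ only in job name, user ID, or account code become identical and can be merged, which is where the additional savings come from.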

Note: The ZCATDETL file can be used to collect all valid importable input records into a single data set for archiving purposes, with the exception that duplicate user records are suppressed, and all user records are discarded if UID=N is specified.

Optional control parameters

DSDTL

DSDTL is used to control data set statistics reporting.

DSDTL=Y - Report data set condensation statistics to SYSPRINT.

DSDTL=N - Suppress the reporting of data set condensation statistics.

VFY

VFY is used to control whether the ZCATOUT file is to be verified after creation.

VFY=Y - After the ZCATOUT file is complete, it is unzipped and read to verify that its contents are readable and that the expected number of records is present. This is the default.

VFY=N - Bypass ZCATOUT verification processing.

PACK

PACK is used to specify the zip compaction level used when writing zipped data.

PACK=n - where n is a decimal digit in the 0 to 9 range.

PACK=0 - Use the shrink zip algorithm. Values 1 to 9 select the compaction level of the deflate zip algorithm. Higher compaction levels achieve greater data compression but consume disproportionately more CPU time.

PACK=1 - Request the fastest level of the deflate method. This is the default.
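
The trade-off between deflate levels, and the kind of check that VFY=Y performs, can be tried out with Python's standard zlib module. This is only an analogy: zlib has no shrink method and its level 0 stores data uncompressed, so the sketch models levels 1 to 9 only. The record content and the function names are made up.

```python
import zlib

def pack(records, level=1):
    """Deflate newline-joined records at the given level (1 is fastest)."""
    if not 1 <= level <= 9:
        raise ValueError("this sketch only models deflate levels 1 to 9")
    return zlib.compress("\n".join(records).encode("utf-8"), level)

def verify(blob, expected_count):
    """Unpack the data and confirm the expected number of records."""
    text = zlib.decompress(blob).decode("utf-8")
    return len(text.split("\n")) == expected_count

records = [f"MODULE{i % 10:02d} PAYROLL1 ALICE" for i in range(2000)]
raw_size = len("\n".join(records))
fast = pack(records, level=1)
dense = pack(records, level=9)
print(raw_size, len(fast), len(dense))
```

Comparing the printed sizes, and timing the two pack calls on data of realistic volume, gives a feel for whether a higher level is worth the extra CPU time.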

DD statements

SYSPRINT: Specifies the report file that ZCAT requires, which is usually allocated to SYSOUT. By default, RECFM=VBA and LRECL=137 are used, though these can be overridden within some limits.

ZCATOUT: Specifies the name of the ZCAT output data set. This data set can then be used as input to the Usage Import program, which imports the usage details into the database. If the ZCATOUT DD statement is omitted, ZCAT writes by default to a data set named hlq.Dyyyyddd.Uhhmmsst, where hlq is the high-level qualifier implied by the input data sets, U replaces the T of the input name, and yyyyddd and hhmmsst are the date and time stamps of the first processed input data set. If the data set is dynamically allocated, SPACE=(TRK,(768,255),RLSE) is used.
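
The default output name can be derived from the first input name as in the following Python sketch. The function names are hypothetical. Reading yyyyddd as a year followed by a day of the year is an assumption based on the usual convention for such qualifiers; this topic does not spell it out.

```python
from datetime import datetime

def default_output_name(first_input_dsn):
    """Replace the leading T of the last qualifier with U."""
    head, _, last = first_input_dsn.rpartition(".")
    if not (len(last) == 8 and last.startswith("T") and last[1:].isdigit()):
        raise ValueError(f"not of the form hlq.Dyyyyddd.Thhmmsst: {first_input_dsn}")
    return f"{head}.U{last[1:]}"

def creation_date(dsn):
    """Decode the Dyyyyddd qualifier, assuming year plus day of year."""
    qualifier = dsn.split(".")[-2]
    return datetime.strptime(qualifier[1:], "%Y%j").date()

first_input = "PROD.UM.D2024061.T0930001"
output_name = default_output_name(first_input)
created = creation_date(first_input)
```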

ZCATDETL: If the ZCATDETL DD is allocated, the uncondensed data is written to this data set. This retains detailed job name, user ID, and job account information for subsequent analysis or reference. Diagnostic records and records that fail validity testing are not written. Duplicate user records are suppressed. If UID=N is specified, all user records are discarded.

The ZCATDETL and ZCATOUT data sets are compressed data sets written by the ZCAT utility. SMS compression is not supported for these data sets.

ZCAT0001: If the ZCAT0001 DD is allocated, it specifies one or more usage data zip archives to be processed by ZCAT. ZCAT0001 is processed after any data sets located by searching the catalog, and allows administrators to manually process specific data sets which may have fallen outside the usual processing regime, or may not fit any convenient data set name mask. Like dynamically allocated input data sets, data set(s) allocated to ZCAT0001 may be zip archives or may contain unzipped data. However, ZCAT0001 is treated by ZCAT as a single file, and a concatenation containing both zipped and unzipped data is not allowed.

In this example, all data sets having names of are processed due to the UMDSN parameter. The condensed output is written to where the SYSUID system symbol is the user ID of the person submitting the job. This file is then transmitted for Usage Import processing. All valid records are written to the ZCATDETL DD card, , which is then archived for reference purposes.