Binary data processing with the HZAZIP utility
The HZAZIP utility processes binary data in a way that preserves record boundaries, whereas other platforms typically treat binary data as a byte stream without structure.
- To preserve record boundaries during compression, the data is prepared as follows, depending on the input record format:
- For fixed-length records, no additional data preparation is done.
- For variable-length records, the record descriptor word (RDW) is retained as part of the data.
- When the record format is undefined, each block is prefixed by an RDW where the first two bytes contain the length of the block including the RDW, and the third and fourth bytes contain zeros.
- Compresses the data and writes it to the archive.
Each compressed member is marked as a binary file and the internal attribute value of the central file header is set to 0.
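The RDW layout described above for undefined-format blocks can be illustrated with a minimal Python sketch. It is not the utility's own code. The function name `prefix_block_with_rdw` is hypothetical, and the big-endian byte order of the length field is an assumption that the text above does not state.

```python
import struct

RDW_LEN = 4  # an RDW occupies four bytes


def prefix_block_with_rdw(block: bytes) -> bytes:
    """Prefix one block with an RDW, as described for undefined-format input.

    Bytes 1-2: length of the block including the RDW.
    Bytes 3-4: zeros.
    """
    total = len(block) + RDW_LEN
    if total > 0xFFFF:
        # The length has to fit in the two-byte field.
        raise ValueError("block too long for a two-byte length field")
    # Assumption: the length is stored big-endian (">H").
    return struct.pack(">HH", total, 0) + block


framed = prefix_block_with_rdw(b"\x01\x02\x03")
print(framed.hex())  # 00070000010203
```

A 3-byte block therefore becomes 7 bytes in the data that is compressed: the 4-byte RDW followed by the block.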
- Data set organisation
- Record format
- Block size
- Logical record length
- To establish the length of each record during decompression, the following is done depending on the record format of the original input data set:
- For fixed-length records, the original record length is used.
- For other record formats, 4 bytes are decompressed from the archive and examined to determine whether they form a valid RDW. If they do, the length in the RDW is used. If they do not, the data is treated as a byte stream in which record boundaries do not need to be preserved.
- Data is decompressed and written as a record of the determined length. Maximum-length records are written when the data is assessed to be a byte stream.
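Two of the cases above reduce to cutting the data into equal-length pieces: fixed-length records use the original record length, and byte stream data is written as maximum-length records. The following Python sketch shows that step only. The name `split_into_records` is hypothetical, and leaving the final record short is an assumption, because the text does not say how the utility handles a trailing partial record.

```python
from typing import List


def split_into_records(data: bytes, record_length: int) -> List[bytes]:
    """Cut decompressed data into records of record_length bytes.

    For fixed-length output, record_length is the original record length.
    For byte stream output, it is the maximum record length of the target.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    # Assumption: a short final piece is returned as-is, without padding.
    return [
        data[i:i + record_length]
        for i in range(0, len(data), record_length)
    ]


print(split_into_records(b"abcdefgh", 3))  # [b'abc', b'def', b'gh']
```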
During decompression of binary data, the embedded RDWs are checked for validity. If an RDW does not indicate a length greater than 4, or does not end with two bytes of zeros, the HZAZIP utility switches to byte stream mode. In byte stream mode, the utility treats the data as a stream of bytes without an inherent record structure.
If the RDW that fails the validity test is the first four bytes of the file, the resulting decompression is broadly compatible with the decompression that most other platforms perform, and the utility issues an informational message.
If the RDW that fails the validity test is not at the start of the file, the utility issues a warning message and sets the final condition code to a value greater than zero, but continues processing so that the output data is available for any necessary data recovery activity.
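The validity test and the switch to byte stream mode can be sketched in Python as follows. This is an illustration of the rules stated above, not the utility's implementation. The function names and status strings are hypothetical, and the sketch assumes a big-endian length field, that each returned record excludes its RDW, and that a final record shorter than its RDW claims is kept as it is.

```python
import struct
from typing import List, Tuple

RDW_LEN = 4


def rdw_is_valid(rdw: bytes) -> bool:
    """Valid means: length greater than 4 and last two bytes zero."""
    if len(rdw) != RDW_LEN:
        return False
    length, trailer = struct.unpack(">HH", rdw)  # assumption: big-endian
    return length > RDW_LEN and trailer == 0


def split_records(data: bytes) -> Tuple[List[bytes], str, bytes]:
    """Return (records, status, remainder).

    status is one of these hypothetical labels:
      "records"            every RDW was valid
      "stream-from-start"  the first RDW was invalid (informational case)
      "stream-mid-file"    a later RDW was invalid (warning case)
    remainder holds the bytes to be handled in byte stream mode.
    """
    records: List[bytes] = []
    pos = 0
    while pos < len(data):
        rdw = data[pos:pos + RDW_LEN]
        if not rdw_is_valid(rdw):
            status = "stream-from-start" if pos == 0 else "stream-mid-file"
            return records, status, data[pos:]
        length = struct.unpack(">H", rdw[:2])[0]
        # Assumption: the record payload is returned without its RDW.
        records.append(data[pos + RDW_LEN:pos + length])
        pos += length
    return records, "records", b""


good = struct.pack(">HH", 7, 0) + b"abc"
print(split_records(good + b"\x00\x04\x00\x00zz"))
# ([b'abc'], 'stream-mid-file', b'\x00\x04\x00\x00zz')
```

An archive created on another platform usually fails the test on its first four bytes, which corresponds to the "stream-from-start" case in this sketch.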