Browsing zipped data

Components such as the Inquisitor programs, the Usage Monitor, and the ZCAT utility deal with sequential data sets containing zip archives. Processing the data with a zip algorithm allows the same information to be conveyed in a fraction of the original size, but it does mean that browsing the file to quickly confirm the nature of its content becomes a more involved process.

One advantage of a file browser over an editor is that the browser can quickly present the data from the start of a file for inspection without reading in the whole data set, whereas editors typically read the complete data contents before proceeding. A data set containing zipped data could be unzipped in full and the result browsed in the usual way, but this requires space for all of the unzipped data, which tends to defeat the purpose of compressing it.

Browsing zipped data using HZABRZIP

The HZABRZIP program can be used to browse the unzipped contents of a data set without requiring the unzipping of the entire data set. A REXX EXEC called HZAIBRWZ is shipped in the SHZCSAMP library to provide a convenient interface to invoke the HZABRZIP program.

To make the facility available for use, customize the HZAIBRWZ EXEC and place it in a suitable library under a member name of your choice. (When installing it into your local library, you can call it HZAIBRWZ, or you can give it a simpler name such as BRZIP.) The customization consists of supplying the data set name of the program library that contains the HZABRZIP program.

HZABRZIP invokes the BRIF service of ISPF, so it requires an ISPF environment for execution. The HZAIBRWZ EXEC expects a data set name as its operand, so it is suitable for general use under ISPF, including as a line command in a data set list created by ISPF option 3.4.

HZABRZIP unzips only enough data to provide the records selected by the user for browsing. For example, if there are ten million records but the user scrolls down only far enough to view the first 100 records in the browse session, then only 100 records need to be unzipped. Unzipped records are staged in a data space so that scroll-up requests can be satisfied from the data space without resuming the suspended unzip process. The unzip process is resumed only when previously unread records need to be accessed.
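
The on-demand behaviour described above can be illustrated with a minimal Python sketch. It is not the HZABRZIP implementation: it assumes a plain zlib stream rather than a zip archive in a data set, newline-delimited records, and an ordinary list standing in for the data space. The names LazyRecordReader, record and compressed_bytes_read are hypothetical and exist only for this illustration.

```python
import io
import zlib


class LazyRecordReader:
    """Decompress a stream only as far as the requested record (illustrative only)."""

    def __init__(self, compressed_stream, chunk_size=4096):
        self._src = compressed_stream
        self._chunk_size = chunk_size
        self._inflater = zlib.decompressobj()
        self._pending = b""      # decompressed bytes not yet split into records
        self._staged = []        # records already unzipped (stand-in for the data space)
        self._exhausted = False
        self.compressed_bytes_read = 0

    def _unzip_more(self):
        """Resume the suspended unzip process for one more input chunk."""
        chunk = self._src.read(self._chunk_size)
        if chunk:
            self.compressed_bytes_read += len(chunk)
            self._pending += self._inflater.decompress(chunk)
        else:
            self._pending += self._inflater.flush()
            self._exhausted = True
        # Everything before the last newline is a complete record.
        *complete, self._pending = self._pending.split(b"\n")
        self._staged.extend(complete)
        if self._exhausted and self._pending:
            self._staged.append(self._pending)  # final record without a newline
            self._pending = b""

    def record(self, index):
        """Return record `index`, unzipping further only if it is not staged yet."""
        while index >= len(self._staged) and not self._exhausted:
            self._unzip_more()
        return self._staged[index] if index < len(self._staged) else None


if __name__ == "__main__":
    raw = b"".join(b"record %07d\n" % i for i in range(100000))
    packed = zlib.compress(raw)
    reader = LazyRecordReader(io.BytesIO(packed), chunk_size=256)
    reader.record(99)    # scroll down to the first hundred records
    reader.record(5)     # scroll up: served from the staged records
    print(reader.compressed_bytes_read, "of", len(packed), "compressed bytes read")
```

In this sketch a scroll-up request never touches the compressed input, and the reader keeps its position in the stream so that a later request for unread records continues from where decompression stopped.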

HZABRZIP has several limitations and behavioural characteristics:

  • All zipped data is assumed to be ASCII text, and is translated to EBCDIC before display.
  • The maximum record length that can be displayed without wrapping onto a new line is 1024 bytes.
  • The name passed to BRIF to display as the file name is the name of the first or only file in the zip archive.
  • When the end of a file is reached and the archive contains another file, HZABRZIP inserts a record containing the following message before showing data from the next file:
    { HZABRZIP reached end of file  -  Start of file newname }

    where newname is the name of the next file being unzipped from the same archive.

  • If HZABRZIP recognizes records as having come from the Inquisitor or the Usage Monitor then it will insert records into the browse data to provide column headings for data items within the recognized records. Such inserted lines will be repeated whenever the record type is different from the previous record, and will have the following form:
    { details   This line was inserted for display by HZABRZIP }

    where details describes items present in the subsequent record(s).

  • HZABRZIP cannot present data that would require it to read more records than can be stored in the data space, either because local limits caused a data space extend request to fail, or because of the 2 gigabyte size limit of data spaces.
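
Three of the display conventions listed above (ASCII to EBCDIC translation, wrapping at 1024 bytes, and the end-of-file message between archive members) can be sketched in Python as follows. This is an illustration under stated assumptions, not HZABRZIP's code: the EBCDIC code page is not stated in this section, so cp037 is assumed, and the names wrap and browse_lines are hypothetical. The column-heading records for Inquisitor and Usage Monitor data are not modelled because their layout is not described here.

```python
import io
import zipfile

MAX_DISPLAY_LENGTH = 1024   # longest record shown without wrapping
EBCDIC_CODEC = "cp037"      # assumption: the actual code page is not stated


def wrap(record, width=MAX_DISPLAY_LENGTH):
    """Split one record into display lines of at most `width` bytes."""
    return [record[i:i + width] for i in range(0, len(record), width)] or [b""]


def browse_lines(archive_bytes):
    """Yield EBCDIC display lines for every file in a zip archive, in order."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for position, name in enumerate(archive.namelist()):
            if position > 0:
                # Marker record inserted between one file and the next.
                marker = "{ HZABRZIP reached end of file  -  Start of file %s }" % name
                yield marker.encode(EBCDIC_CODEC, errors="replace")
            with archive.open(name) as member:
                for raw in member:
                    # All data is treated as ASCII text, then translated to EBCDIC.
                    text = raw.rstrip(b"\r\n").decode("ascii", errors="replace")
                    yield from wrap(text.encode(EBCDIC_CODEC, errors="replace"))


if __name__ == "__main__":
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("first.txt", "abc\n" + "x" * 1500 + "\n")
        archive.writestr("second.txt", "def\n")
    for line in browse_lines(buffer.getvalue()):
        print(len(line), line[:60].decode(EBCDIC_CODEC))
```

Because cp037 is a single-byte code page, wrapping after translation gives the same line boundaries as wrapping before it; a multi-byte code page would need the wrap applied to the translated bytes, as done here.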