Recommended workflow

About this task

This section provides a recommended workflow for populating a high-volume dimension with a data set that maintains data integrity while limiting database growth.

Procedure

  1. If possible, validate the data before creating the dimension.
    1. For some dimensions, the data is already recorded in the request.
    2. For example, the TLT_URL value is automatically inserted by the Discover Reference session agent, which is included and enabled in the default pipeline configuration. URL normalization is enabled by default, too. See "Discover Reference Session Agent" in the Unica Discover Configuration Manual.
    3. For other high-volume dimensions that extract from request or response data, you may want to verify that the data is being appropriately captured in a session through replay before you create the dimension. For example, you can search for specific event values or indexed request/response data. See "Searching Session Data" in the Unica Discover Manuals.
    4. If the values do not appear to be recorded properly:
      1. Determine whether they are being inserted by Discover or by your web application:
        1. If the data is being inserted by Discover, verify that the appropriate component is inserting the data. Data may be inserted by the DNCA, the Canister, or an event that is defined in the Event Manager.
        2. If the data is inserted by your web application, verify the data with your web development team.
  2. Create the dimension.
    1. Set Values to Record to Whitelist Only.
    2. You may want to adjust the Max Values Per Hour as needed.
      • Processed values include whitelisted values, which also count against this limit. Blacklisted values do not count.
        Note: For testing purposes, you may want to add this dimension to a report group that is associated with an event that occurs in each session. Later, through the Discover Report Builder, you can create a simple report with the event + dimension combination to review the captured values.
    3. Enable logging of values for the dimension. Dimension logging captures observed values so that you can download them and create your whitelist. These values are captured in logs that are stored in the database and are automatically cleared after a period of days. See "Manage Events - Dimensions Tab".
  3. Let the log fill with a sufficient volume of values to be a meaningful cross-section of activity. For a high-volume dimension, a single hour of logging may be enough to produce a representative data set.
    Note: A downloaded log file contains at most the top 250,000 values, ranked by occurrence over the period in which they were collected in the logs.
  4. Edit the log values to be your first pass at the whitelist.
    1. Download the logged values to your local desktop.
    2. Load the values into Microsoft™ Excel. Sort them by number of occurrences.
    3. Decide how many of the top values to include in your whitelist, and copy those values to a separate XLS sheet.
      Note: A whitelist can contain up to 5,000 values.
      • Retain the file that you upload for recordkeeping.
  5. Load the values into your whitelist through the Dimension editor.
  6. Monitor the captured values.
    1. After you load the dimension values into the whitelist, all subsequent observed values are checked against the whitelist.
    2. If the Max Values Per Hour limit is exceeded, an instance of the [Limit] value is recorded for the dimension.
    3. If an observed value does not appear in the whitelist and the Max Values Per Hour limit has not been exceeded, an instance of the [Others] value is recorded for the dimension.
    4. Through the Discover Report Builder, create a report:
      1. Add an event that occurs each session.
      2. Add the dimension, which should be available if you added it to a report group associated with the event.
      3. Each hour, you can track the count of occurrences of the [Others] and [Limit] values.
  7. Periodically, you should download a new set of log values and compare it to the set that you saved.
    1. Look for logged values that have a number of occurrences greater than 1 and that do not appear in the whitelist. These values should be added.
    2. Look for values in the whitelist that do not appear in the set of logged values. These values should be removed.
    3. In Microsoft Excel, the VLOOKUP function can be used to check the contents of one worksheet against another. For more information, see the documentation available inside Microsoft Excel.
      Note: If there are significant changes to your web application, your dimension whitelists are likely to need rebuilding. Contact your web application development team for details on the changes.
  8. When the values appear to stabilize, you can turn off logging of values.
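
As an alternative to doing the sorting and comparison by hand in a spreadsheet, the whitelist steps in this procedure (ranking logged values by occurrence, keeping the top values, and later comparing a new log against the whitelist) can be sketched in Python. This is a minimal illustration, not a Discover feature. The file layout is an assumption: the sketch expects a two-column CSV with a header row of "value,count", which may not match the actual downloaded log file. The function names are hypothetical.

```python
import csv
import io

WHITELIST_LIMIT = 5000  # maximum whitelist size stated in the procedure


def read_logged_values(csv_text):
    """Parse logged values into a {value: occurrences} dict.

    Assumes a hypothetical layout with a "value,count" header row;
    adjust the column names to match the file you actually download.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["value"]: int(row["count"]) for row in reader}


def first_pass_whitelist(logged, top_n=WHITELIST_LIMIT):
    """Return the top_n values by occurrence, most frequent first."""
    if top_n > WHITELIST_LIMIT:
        raise ValueError("a whitelist can contain up to 5,000 values")
    # Sort by descending count; ties are broken alphabetically.
    ranked = sorted(logged.items(), key=lambda item: (-item[1], item[0]))
    return [value for value, _ in ranked[:top_n]]


def compare_to_whitelist(logged, whitelist):
    """Suggest additions and removals based on a newer set of logged values."""
    current = set(whitelist)
    # Logged more than once but not whitelisted: candidates to add.
    to_add = sorted(v for v, n in logged.items() if n > 1 and v not in current)
    # Whitelisted but absent from the new log: candidates to remove.
    to_remove = sorted(v for v in current if v not in logged)
    return to_add, to_remove


# Example with made-up values.
first_log = "value,count\n/home,120\n/cart,45\n/search,80\n/promo,1\n"
whitelist = first_pass_whitelist(read_logged_values(first_log), top_n=3)
print(whitelist)  # ['/home', '/search', '/cart']

later_log = "value,count\n/home,150\n/search,60\n/checkout,30\n/typo,1\n"
to_add, to_remove = compare_to_whitelist(read_logged_values(later_log), whitelist)
print(to_add)     # ['/checkout']
print(to_remove)  # ['/cart']
```

Treat the two lists as suggestions to review rather than changes to apply automatically. A value that is missing from one log may simply be seasonal, and the whitelist itself is still loaded through the Dimension editor.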