Dividing contacts into sample groups

To create target and control groups, use the Sample process. It offers three sampling methods: Random, which creates statistically valid control groups or test sets; Every other X, which assigns records to the sample groups in rotation; and Sequential portions, which allocates consecutive blocks of records to successive samples.
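The three methods differ only in how records are assigned to groups. The following Python sketch illustrates that difference on a small list of IDs. It is not how Unica Campaign implements sampling: the function names, the shuffle-then-cut approach for the random case, and the explicit `sizes` list are all assumptions made for illustration.

```python
import random

def sequential_portions(sorted_ids, sizes):
    """First sizes[0] records go to sample 1, the next sizes[1] to sample 2, ..."""
    groups, start = [], 0
    for size in sizes:
        groups.append(sorted_ids[start:start + size])
        start += size
    return groups

def every_other_x(sorted_ids, n_samples):
    """Deal records in rotation: 1st to sample 1, 2nd to sample 2, and so on."""
    return [sorted_ids[i::n_samples] for i in range(n_samples)]

def random_sample(ids, sizes, seed):
    """Shuffle with a seeded generator, then cut into portions (assumed approach)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return sequential_portions(shuffled, sizes)

ids = list(range(1, 11))  # hypothetical record IDs, already sorted
print(every_other_x(ids, 3))                # [[1, 4, 7, 10], [2, 5, 8], [3, 6, 9]]
print(sequential_portions(ids, [3, 3, 4]))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
print([len(g) for g in random_sample(ids, [3, 3, 4], seed=42)])  # [3, 3, 4]
```

Note that the two ordered methods depend entirely on the sort order of the input, which is why the procedure requires an Ordered by field for them.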

Procedure

  1. Open a flowchart for editing.
  2. Drag the Sample process from the palette to your flowchart.
  3. Connect at least one configured process (such as a Select process) as input to the Sample process box.
  4. Double-click the Sample process in the flowchart.
  5. Use the Input list on the Sample tab to select the cells that you want to sample. The list includes all output cells from any process connected to the Sample process. If multiple cells are providing input, you can optionally select the Multiple cells option. If more than one source cell is selected, the same sampling is performed on each source cell.
    Note: All selected cells must be defined at the same audience level, such as Household or Customer.
  6. Use the # of Samples/Output cells field to specify how many samples to create for each input cell. By default, three samples are created for each input cell, with default names Sample1, Sample2 and Sample3.
  7. To change the default sample names, double-click a sample in the Output name column, then type a new name. You can use any combination of letters, numbers, and spaces. Do not use periods (.) or slashes (/ or \).
    Important: If you change the name of a sample, you must update all subsequent processes that use this sample as an input cell. Changing a sample name might unconfigure subsequent connected processes. In general, you should edit the names of samples before connecting subsequent processes.
  8. Use one of the following methods to define the sample size:
    • To divide records up by percentages: Select Specify size by %, then double-click the Size field to indicate the percentage of records to use for each sample. Use the Max size field if you want to limit the size of the sample. The default is Unlimited. Repeat for each sample listed in the Output name column, or use the All remaining check box to assign all remaining records to that sample. You can select All remaining for only one output cell.
    • To specify the number of records for each sample size: Select Specify size by # records, then double-click the Max size field to specify the maximum number of records to allocate to the first sample group. Specify the Max size for the next sample that is listed or use the All remaining check box to assign all remaining records to that sample. You can select All remaining for only one output cell.

      Optional: Click Sample size calculator and use the calculator to determine the optimal sample size. (See About the sample size calculator.) Then copy the value from the Min. sample size field in the calculator, click Done to close the calculator, and paste the value into the Max size field for Specify size by # records.

  9. Ensure that each sample in the Output name list has a Size defined or has All remaining checked.
  10. In the Sampling method section, specify how to build the samples:
    • Random sample: Use this option to create statistically valid control groups or test sets. This option randomly assigns records to sample groups using a random number generator based on the specified seed. Seeds are explained later in these steps.
    • Every other X: This option puts the first record into the first sample, the second record into the second sample, and so on, up to the number of samples specified. The cycle then repeats until all records are allocated to a sample group. To use this option, you must specify the Ordered by options to determine how records are sorted into groups. The Ordered by options are explained later in these steps.
    • Sequential portions: This option allocates the first N records into the first sample, the next set of records into the second sample, and so on. This option is useful for creating groups such as the top decile (or some other size) of records sorted by a particular field (for example, cumulative purchases or model scores). To use this option, you must specify the Ordered by options to determine how records are sorted into groups. The Ordered by options are explained later in these steps.
  11. If you selected Random sample, in most cases you can accept the default seed. The Seed represents the starting point that Unica Campaign uses to select IDs randomly.

    To generate a new seed value, click Pick or enter a value in the Seed field. Examples of when you might need to use a new seed value are:

    • You have exactly the same number of records in the same sequence as in a previous run. If you use the same seed value, the records are assigned to the same samples each time.
    • The random sample produces undesired results (for example, all males are being allocated to one group and all females to another).
    Note: The same random set of records will be used for each subsequent run of the Sample process (unless the input to the process changes). This is important if you intend to use the results for modeling purposes, because different modeling algorithms must be compared across the same set of records to determine each model's effectiveness. If you do not intend to use the results for modeling, you can make the Sample process select a different random set of records each time it runs. To do this, use a Random Seed of zero (0). A value of 0 ensures that a different random set of records will be selected each time the process runs.
  12. If you selected Every other X or Sequential portions, you must specify a sort order to determine how records will be allocated to sample groups:
    1. Select an Ordered by field from the drop-down list or use a derived field by clicking Derived fields.
    2. Select Ascending to sort numeric fields in increasing order (low to high) and sort alphabetic fields in alphabetical order. If you choose Descending, the sort order is reversed.
  13. Use the General tab as follows:
    1. Process name: Assign a descriptive name. The process name is used as the box label on the flowchart. It is also used in dialogs and reports to identify the process.
    2. Output cell names: By default, output cell names consist of the process name followed by the sample name and a digit. These names are used in dialogs and reports. You can double-click an output cell name to change it by typing in the field. Or, click the Copy button to open a text box that shows all of the existing Output cell names. Copy them manually, then click OK. Then click the Paste button to paste them into a text box, where you can edit them. Then click OK to copy the edited Output cell names into the fields. You can use the Reset cell names button if you want to revert to the default output cell names.
    3. Cell codes: The cell code has a standard format that is determined by your system administrator and is unique when generated. Do not change the cell code unless you understand the implications of doing so. By default, the name of a cell created in a process matches the process name. When you save changes to an output cell name, if Auto generate cell codes is selected, the cell code is regenerated. If you do not want the cell code to change, uncheck Auto generate cell codes. See Cell names and codes.
    4. Note: Use the Note field to explain the purpose or result of the process. The contents of this field appear when you rest your cursor over the process box in a flowchart.
  14. Click OK.
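
The percentage-based sizing in step 8 can be checked by hand before you run the process. The following Python sketch computes the expected record count for each sample, with one sample taking All remaining. The function name `sample_sizes`, the rounding down of fractional counts, and the omission of Max size are assumptions for illustration. The product's actual rounding behavior is not described here.

```python
def sample_sizes(total_records, specs):
    """specs: list of (name, percent); percent None means 'All remaining'."""
    sizes = {}
    remaining_name = None
    for name, percent in specs:
        if percent is None:
            if remaining_name is not None:
                # Mirrors the rule that only one output cell can take All remaining.
                raise ValueError("All remaining can be selected for only one sample")
            remaining_name = name
        else:
            # Assumption: fractional record counts are rounded down.
            sizes[name] = total_records * percent // 100
    allocated = sum(sizes.values())
    if allocated > total_records:
        raise ValueError("Percentages add up to more than 100")
    if remaining_name is not None:
        sizes[remaining_name] = total_records - allocated
    return sizes

# Hypothetical input cell of 1000 records split into the three default samples.
print(sample_sizes(1000, [("Sample1", 10), ("Sample2", 30), ("Sample3", None)]))
# {'Sample1': 100, 'Sample2': 300, 'Sample3': 600}
```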

Results

The process is configured and enabled in the flowchart. You can test run the process to verify that it returns the results you expect.
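
When you test run a Random sample, the seed behavior described in step 11 determines whether two runs return the same records. The following Python sketch mimics that behavior with Python's standard `random` module. It is an analogy only: the functions `make_generator` and `draw_sample` are hypothetical, and Unica Campaign's own random number generator is not shown here.

```python
import random

def make_generator(seed):
    # Assumption for illustration: a seed of 0 means "do not fix the starting
    # point", so the generator is seeded from system entropy instead.
    return random.Random() if seed == 0 else random.Random(seed)

def draw_sample(ids, size, seed):
    """Pick `size` IDs at random; a fixed nonzero seed repeats the same pick."""
    rng = make_generator(seed)
    return sorted(rng.sample(ids, size))

ids = list(range(1000))  # hypothetical input cell of 1000 record IDs

# Same nonzero seed and same input: the same records are selected on every run.
first_run = draw_sample(ids, 50, seed=12345)
second_run = draw_sample(ids, 50, seed=12345)
print(first_run == second_run)  # True

# Seed 0: each run draws a fresh random set, so results generally differ.
print(len(draw_sample(ids, 50, seed=0)))  # 50
```

A fixed seed is the choice to make when several models must be compared on identical records. A seed of 0 suits cases where a fresh sample on each run is preferable.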