Setting CustomerSampleSize

Properly configuring CustomerSampleSize for best Unica Optimize session run time while preserving optimality takes some consideration.

CustomerSampleSize and "chunks"

Unica Optimize works by dividing the proposed contacts into random subsamples of customers called "chunks." When using a single thread, Unica Optimize processes one chunk at a time. All proposed contacts and contact history that belong to a single customer are processed in the chunk to which that customer belongs. A customer can belong to only one chunk. Each chunk is built from a random set of customers. The accuracy of the optimization algorithm depends on these chunks of customers being statistically similar to each other, and a larger chunk size makes this requirement more likely to be met. Cross-customer capacity constraints are evenly distributed across the chunks. For example, suppose your Unica Optimize session contains a constraint that allows a maximum of 1,000 of offer A. If the session is run with 10 chunks, each chunk has a capacity rule that allows a maximum of 100 of offer A.
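The chunking behavior described above can be illustrated with a minimal sketch. This is not Unica Optimize's actual implementation; the function name build_chunks, the fixed seed, and the round-robin split are assumptions made only for illustration.

```python
import math
import random

def build_chunks(customer_ids, max_chunk_size, seed=0):
    """Randomly partition customers into chunks of at most max_chunk_size.

    Hypothetical helper: each customer lands in exactly one chunk.
    """
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)  # random assignment of customers
    num_chunks = math.ceil(len(ids) / max_chunk_size)
    # Deal the shuffled customers round-robin so chunk sizes differ by at most one.
    return [ids[i::num_chunks] for i in range(num_chunks)]

chunks = build_chunks(range(100_000), max_chunk_size=10_000)
print(len(chunks))          # 10
print(1000 // len(chunks))  # 100: per-chunk share of a maximum of 1,000 of offer A
```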

You use the algorithm tuning variable CustomerSampleSize to set the maximum chunk size. The larger the chunk, the more accurate the results. However, the session run time and memory usage also increase. Do not use chunk sizes greater than 10,000 without careful planning. Many systems do not have enough memory to process more than 10,000 customers at a time, and the Unica Optimize session run then fails with an out-of-memory error. In many cases, a larger chunk size does not significantly increase the optimality of the solution, but still takes more time and memory to run. Optimality is measured as the sum of the scores of the surviving transactions in the optimized contacts table (OCT). You might need to tune CustomerSampleSize based on your specific optimization problem and performance needs.

In a simple optimization scenario where there are no cross-customer capacity rules defined, there is no added benefit from using larger chunk sizes.

During chunk building, Unica Optimize also creates a set of rules, For Each Customer and Custom Capacity, for each chunk. A For Each Customer (FEC) rule uses its Min/Max constraints as they are. For a Custom Capacity (CC) rule, however, the Min/Max constraints are divided by the number of chunks and the result is assigned to each chunk. This gives each chunk an approximately equal share of the capacity for a given CC rule.

CustomerSampleSize and cross-customer capacity rules

To understand the cases where cross-customer capacity rules are used, you must understand how those rules are applied to multiple chunks. Consider the case where there is a single Min/Max # Offers Capacity rule with minimum set to 20 and maximum set to 1,000 for channel email. If there are 100,000 customers and a maximum chunk size of 10,000, each chunk is processed using a modified rule where the maximum is 100. Unica Optimize calculates the modified rule maximum value by dividing the rule maximum value (1,000) by the number of chunks (10).
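The arithmetic in this example can be written out as a short sketch. The function name per_chunk_capacity is hypothetical, and the sketch assumes that the number of chunks is the customer count divided by the maximum chunk size, rounded up.

```python
import math

def per_chunk_capacity(rule_min, rule_max, total_customers, max_chunk_size):
    """Return (number of chunks, per-chunk minimum, per-chunk maximum).

    Assumption: the number of chunks is total_customers / max_chunk_size, rounded up.
    """
    num_chunks = math.ceil(total_customers / max_chunk_size)
    return num_chunks, rule_min / num_chunks, rule_max / num_chunks

# 100,000 customers, chunk size 10,000, Min/Max # Offers Capacity rule of 20 to 1,000
print(per_chunk_capacity(20, 1000, 100_000, 10_000))  # (10, 2.0, 100.0)
```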

A smaller maximum chunk size causes more chunks to be created, which makes it more likely that a rule depends on some element (such as the email channel) that is less numerous than the number of chunks. If the chunk size is reduced to 100, there are 1,000 chunks. Now the minimum for the rule is less than the number of chunks, which makes the modified rule minimum 0.02 (20 divided by 1,000). In this case, 2% of the chunks use a rule with a minimum of 1, and the other 98% of the chunks use a minimum of 0. As long as each chunk is statistically similar with regard to the email channel, Unica Optimize processes the rule as expected. A problem occurs when fewer customers are offered emails than there are chunks. If only 500 customers are offered emails, each chunk has only a 50% chance of containing a customer offered an email. The odds that a particular chunk has both a customer offered an email and a minimum of 1 are then only 1%. Instead of meeting the specified minimum of 20, Unica Optimize returns only 10 on average (1% of 1,000 chunks).
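The estimate in this paragraph can be reproduced with a few lines of arithmetic. The sketch below follows the text's simplification that each customer offered an email falls into a different chunk, so 500 such customers cover 50% of 1,000 chunks; the function name expected_minimum_met is hypothetical.

```python
from fractions import Fraction

def expected_minimum_met(rule_min, num_chunks, customers_with_element):
    """Return (chance a chunk can satisfy a minimum of 1, expected total met)."""
    # Share of chunks that are assigned a minimum of 1 rather than 0.
    share_with_min = Fraction(min(rule_min, num_chunks), num_chunks)
    # Simplification from the text: each such customer lands in a different chunk.
    share_with_element = Fraction(min(customers_with_element, num_chunks), num_chunks)
    p_both = share_with_min * share_with_element
    return p_both, p_both * num_chunks

p_both, expected = expected_minimum_met(20, 1000, 500)
print(float(p_both), float(expected))  # 0.01 10.0
```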

The number of chunks depends on the chunk size and the total number of customers. Because the maximum chunk size is 10,000, the number of customers with a significant element (an item that is used in a rule) must be no less than the total number of customers divided by 10,000 to achieve optimal results. It might seem that increasing the number of proposed contacts to maintain statistical similarity would lower performance, and it is true that more proposed contacts add to processor usage. However, this cost can be more than offset if it allows a smaller chunk size to be used, because smaller chunks can be processed more quickly.
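As a rough planning aid, the rule of thumb above can be expressed as a check. This is a sketch only: the function names are hypothetical, the number of chunks is assumed to be the customer count divided by the chunk size and rounded up, and smallest_safe_chunk_size merely inverts the stated rule rather than reflecting any documented Unica Optimize calculation.

```python
def chunk_count(total_customers, chunk_size):
    """Assumed number of chunks: customers divided by chunk size, rounded up."""
    return -(-total_customers // chunk_size)  # integer ceiling division

def has_enough_customers(total_customers, customers_with_element, chunk_size=10_000):
    """True if customers with the significant element are no fewer than the chunks."""
    return customers_with_element >= chunk_count(total_customers, chunk_size)

def smallest_safe_chunk_size(total_customers, customers_with_element):
    """Smallest chunk size for which the chunk count does not exceed that customer count."""
    if customers_with_element <= 0:
        raise ValueError("at least one customer must carry the significant element")
    return -(-total_customers // customers_with_element)

print(has_enough_customers(100_000, 500))                  # True: 10 chunks
print(has_enough_customers(100_000, 500, chunk_size=100))  # False: 1,000 chunks
print(smallest_safe_chunk_size(100_000, 500))              # 200
```

A result from smallest_safe_chunk_size still needs to be weighed against the accuracy and memory considerations discussed earlier in this section.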