Example STD Calculation for Anomaly Detection

Discover performs Anomaly Detection calculations on data points that are extracted over a rolling window.

The calculations are based on the following factors:

  • Type of anomaly.
  • Configured required dataset for valid rolling window. If the data set includes fewer than four data points, the standard deviation is not computed and is reported as --.
  • Calculation mode.

This example shows the differences between Daily and Hourly anomalies. It assumes these configuration options:

  • Type of anomaly: Daily and Hourly
  • Configured required dataset for valid rolling window: The following are the default minimum and maximum values:
    • Anomaly Detections - Minimum data points for calculations - 4
    • Anomaly Detections - Maximum data points for calculations - 16
  • Calculation mode: Same Days

With the preceding configuration options, a valid Daily or Hourly anomaly calculation requires a minimum of four weeks of data.

Suppose that you are calculating an anomaly of counts for Event A. The counts are summed in the following manner for Daily or Hourly anomalies in this example:

Day                 Count
Focus date          sum0
One week ago        sum1
Two weeks ago       sum2
Three weeks ago     sum3
Four weeks ago      sum4

For a Daily anomaly, the counts are summed over 24-hour periods in the rolling window, while Hourly anomalies use hourly counts during the rolling window.
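
The difference between the two anomaly types can be sketched in Python. This is an illustration only, not Discover's implementation. The function names and the data layout (a dictionary that maps each date to a list of 24 hourly counts) are hypothetical. The sketch also assumes that an Hourly anomaly compares the same hour of the day across the same days.

```python
from __future__ import annotations

from datetime import date, timedelta


def same_day_dates(focus: date, weeks: int = 4) -> list[date]:
    """Return the focus date followed by the same weekday in each prior week."""
    return [focus - timedelta(weeks=k) for k in range(weeks + 1)]


def daily_sums(hourly_counts: dict[date, list[int]], focus: date,
               weeks: int = 4) -> list[int]:
    """sum0..sumN for a Daily anomaly: each value totals one 24-hour period."""
    return [sum(hourly_counts[d]) for d in same_day_dates(focus, weeks)]


def hourly_sums(hourly_counts: dict[date, list[int]], focus: date, hour: int,
                weeks: int = 4) -> list[int]:
    """sum0..sumN for an Hourly anomaly (assumed: same hour on each same day)."""
    return [hourly_counts[d][hour] for d in same_day_dates(focus, weeks)]
```

In both cases the first element of the returned list corresponds to sum0 (the focus date) and the remaining elements to sum1 through sum4.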

The metric StDev_SameDays is computed by using the following formula:


StDev_SameDays = std of (sum1, sum2, sum3, sum4)

The average of those four sums is computed this way:


avg_SameDays = avg of (sum1, sum2, sum3, sum4)

The count anomaly is then computed with this formula:


countDev_AvgSameDays = (sum0 - avg_SameDays)/StDev_SameDays
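
The three formulas can be combined into a short Python sketch. It is a minimal illustration under stated assumptions, not Discover's implementation. The function name count_deviation is hypothetical, and the history is assumed to be ordered from most recent to oldest. The source does not say whether std is a sample or a population standard deviation, so the sketch assumes the sample form (statistics.stdev). Returning None when the standard deviation is zero is also an assumption.

```python
import statistics

MIN_POINTS = 4   # default "Minimum data points for calculations"
MAX_POINTS = 16  # default "Maximum data points for calculations"


def count_deviation(sum0, history, min_points=MIN_POINTS, max_points=MAX_POINTS):
    """Return countDev_AvgSameDays, or None when it cannot be computed.

    sum0    -- count for the focus date
    history -- sum1, sum2, ... (assumed order: most recent first)
    """
    window = list(history)[:max_points]
    if len(window) < min_points:
        return None  # too few data points; the product reports "--"
    avg_same_days = statistics.mean(window)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stdev_same_days = statistics.stdev(window)
    if stdev_same_days == 0:
        return None  # assumption: deviation is undefined for a flat history
    return (sum0 - avg_same_days) / stdev_same_days
```

For example, with sum0 = 120 and history sums of 100, 110, 90, and 100, the average is 100, the sample standard deviation is about 8.16, and the count anomaly is about 2.45 standard deviations above the average.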