Example STD Calculation for Anomaly Detection
Discover bases its Anomaly Detection calculations on data points extracted over a rolling window.
The calculations depend on the following factors:
- Type of anomaly.
- Configured required dataset for valid rolling window. If the data set includes fewer than four data points, the standard deviation is not computed and is reported as --.
- Calculation mode.
This example shows the differences between daily and hourly anomalies. These configuration options are assumed:
- Type of anomaly: Daily and Hourly
- Configured required dataset for valid rolling window: The following are the default minimum and maximum values:
  - Anomaly Detections - Minimum data points for calculations: 4
  - Anomaly Detections - Maximum data points for calculations: 16
- Calculation mode: Same Days
To calculate a valid Daily or Hourly anomaly with the preceding configuration options, a minimum of four weeks of data is required.
Suppose that you are calculating an anomaly of counts for Event A. The counts are summed in the following manner for Daily or Hourly anomalies in this example:
Day               Count
Focus date        sum0
One week ago      sum1
Two weeks ago     sum2
Three weeks ago   sum3
Four weeks ago    sum4
For a Daily anomaly, the counts are summed over 24-hour periods in the rolling window, while Hourly anomalies use hourly counts during the rolling window.
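The summing step can be sketched in Python as follows. This is a minimal illustration, not Discover's implementation: the function name `same_days_sums`, the `(timestamp, count)` event format, and the `granularity` parameter are hypothetical. The sketch also assumes that, for an Hourly anomaly, each prior period is the same hour on the same weekday in an earlier week, which the text does not state explicitly.

```python
from datetime import datetime, timedelta


def same_days_sums(events, focus_start, granularity="daily", weeks_back=4):
    """Return [sum0, sum1, ..., sumN] for the focus period and prior weeks.

    events      -- iterable of (timestamp, count) pairs (hypothetical format)
    focus_start -- start of the focus day (daily) or focus hour (hourly)
    granularity -- "daily" sums a 24-hour period, "hourly" a 1-hour period
    """
    if granularity == "daily":
        width = timedelta(days=1)
    elif granularity == "hourly":
        width = timedelta(hours=1)
    else:
        raise ValueError("granularity must be 'daily' or 'hourly'")

    events = list(events)
    sums = []
    for k in range(weeks_back + 1):
        # k = 0 is the focus period; k = 1..weeks_back are the prior weeks.
        start = focus_start - timedelta(weeks=k)
        end = start + width
        sums.append(sum(count for ts, count in events if start <= ts < end))
    return sums
```

Calling the function with `granularity="daily"` and a midnight `focus_start` yields the five daily sums in the table above. Passing `granularity="hourly"` and the start of an hour yields the corresponding hourly sums.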
The metric StDev_SameDays is computed by using the following formula:
StDev_SameDays = std of (sum1, sum2, sum3, sum4)
The average of those four sums is computed this way:
avg_SameDays = avg of (sum1, sum2, sum3, sum4)
The count anomaly is then computed with this formula:
countDev_AvgSameDays = (sum0 - avg_SameDays)/StDev_SameDays
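The three formulas can be combined into a short Python sketch. Several points are assumptions rather than documented behavior: the function name `count_dev_avg_same_days` is hypothetical, the sample standard deviation (`statistics.stdev`) is used because the text does not say whether the sample or population form applies, and a standard deviation of zero is treated like the "too few data points" case, returning `None` to stand for the `--` that is reported.

```python
import statistics

MIN_POINTS = 4  # default minimum data points for calculations, per the text


def count_dev_avg_same_days(sum0, prior_sums, min_points=MIN_POINTS):
    """Return (sum0 - avg_SameDays) / StDev_SameDays, or None if not computable.

    sum0       -- count for the focus date (or focus hour)
    prior_sums -- [sum1, sum2, sum3, sum4, ...] from the rolling window
    """
    if len(prior_sums) < min_points:
        # Fewer than the minimum data points: reported as "--".
        return None

    avg_same_days = statistics.mean(prior_sums)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stdev_same_days = statistics.stdev(prior_sums)

    if stdev_same_days == 0:
        # Assumption: identical prior sums make the ratio undefined.
        return None

    return (sum0 - avg_same_days) / stdev_same_days
```

For example, with prior sums of 94, 102, 102, and 102, the average is 100 and the sample standard deviation is 4. A focus-date sum of 110 then gives a deviation of (110 - 100) / 4 = 2.5 under these assumptions.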