startree-ets-ratio-percentile

Description

Detect an anomaly if the metric is outside the prediction boundaries of a model combining a linear regression and an ETS forecasting algorithm. The regression model learns the effect of events. The ETS model learns the level, trend and seasonality in the timeseries. The metric is constructed as a ratio of 2 metrics. Use this template when the aggregation function takes 2 operands: PERCENTILETDIGEST, DISTINCTCOUNTHLL, etc.

Flowchart

[Image: flowchart of the template]

Parameters

DATA

name | description | default value

aggregationColumn

The numerator column of the ratio metric.

 

aggregationFunction

The numerator aggregation function of the ratio metric.

 

aggregationColumn2

The denominator column of the ratio metric.

aggregationFunction2

The denominator aggregation function of the ratio metric.

dataSource

The Pinot datasource to use.

dataset

The dataset to query.

monitoringGranularity

The period of aggregation of the timeseries. In ISO-8601 format. Example: PT1H.

ratioMultiplier

Metric ratio multiplier. For instance, set it to 100 to return a percentage rather than a value between 0 and 1.

1

timezone

Timezone used to group by time. In TZ identifier format.

For instance, UTC or US/Pacific.

UTC

timeColumn

Time column used to group by time. If set to AUTO (the default value), the Pinot primary time column is used.

AUTO

timeColumnFormat

Required if timeColumn is not AUTO. Learn more.

 

completenessDelay

The time for your data to be considered complete and ready for anomaly detection. In ISO-8601 format. Example: PT2H. Learn more.

P0D

queryFilters

Filters to apply when fetching data. Prefix with AND. Example: AND country='US'

 

queryLimit

Maximum number of timeseries points to fetch.

100000001

aggregationParameter

The second argument of the aggregationFunction. Example: 95 for PERCENTILETDIGEST.
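As a rough illustration of how the ratio metric relates to the parameters above, the sketch below divides an already-aggregated numerator by an already-aggregated denominator and applies ratioMultiplier. The function name and the handling of a zero denominator are assumptions made for this example, not documented ThirdEye behavior.

```python
def ratio_metric(numerator, denominator, ratio_multiplier=1.0):
    """Combine two aggregated values into one ratio metric.

    numerator:   result of aggregationFunction(aggregationColumn, aggregationParameter)
    denominator: result of aggregationFunction2(aggregationColumn2)
    Returns None when the denominator is zero (an assumption of this sketch).
    """
    if denominator == 0:
        return None
    return ratio_multiplier * numerator / denominator


# With ratioMultiplier = 100 the ratio is expressed as a percentage.
print(ratio_metric(30.0, 120.0, ratio_multiplier=100))
```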

DETECTION

MAIN

name | description | default value

lookback

Historical time period to use to train the model. In ISO-8601 format. Example: P21D.

sensitivity

The model will detect fewer anomalies with lower sensitivity and more with higher sensitivity.

metricMinimumValue

If set, the predicted value of the detector and the lower/upper bounds cannot be smaller than the given value. For instance, set it to 0 if your metric cannot have a negative value.

metricMaximumValue

If set, the predicted value of the detector and the lower/upper bounds cannot be bigger than the given value. For instance, set it to 100 if your metric cannot be bigger than 100.

 

pattern

Whether to detect an anomaly if it’s a drop, a spike, or either.

 

UP_OR_DOWN

seasonalityPeriod

Biggest seasonality period to learn. In ISO-8601 format. Example: P7D.

alpha

ETS level smoothing factor. In [0,1]. -1 means auto-optimized by BOBYQA.

-1

beta

ETS trend smoothing factor. In [0,1]. -1 means auto-optimized by BOBYQA.

-1

gamma

ETS seasonal smoothing factor. In [0,1]. -1 means auto-optimized by BOBYQA.

-1

phi

ETS trend damping factor. In [0,1]. -1 means auto-optimized by BOBYQA. Only used if trendMode is set to DAMPED.

-1

yeoJohnsonLambda

Experimental. Yeo Johnson transformation lambda. Can be used if the timeseries has heteroskedasticity.

robustInitialization

Experimental. Whether the model should be robust to anomalies in the historical data at the initialization phase. Requires at least 3 seasonal periods of lookback.

true

followDst

Only applies to timezones that have DST changes. If true, the seasonal component of the ETS model follows the DST changes of the timezone. If false, the seasonal component follows the physical time.

true

robustFitting

Experimental. Whether the model should be robust to anomalies in the historical data at the fitting phase. Requires at least 3 seasonal periods of lookback.

true

robustIntervalsLambda

Experimental. Controls how robust the intervals are to anomalies in the historical data. Between 0 and 1. If 0, intervals are not robust. The closer to 1, the more the confidence intervals depend on recent observations.

0.1

intervalsMethod

Method to compute intervals. One of CONFIDENCE, PERCENTAGE, ABSOLUTE.

CONFIDENCE

errorMode

ETS error mode as defined here

ADDITIVE

seasonalMode

ETS seasonal mode as defined here

ADDITIVE

trendMode

ETS trend mode as defined here

NONE

regressors

For advanced users. Additional list of features to add to the regression model. These additional features may help the model learn the effect of events. Event features are created automatically. Learn more.

[]
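To make the roles of alpha, beta, gamma and phi more concrete, here is a minimal sketch of a textbook additive Holt-Winters recursion with a damped trend, followed by the clamping described for metricMinimumValue and metricMaximumValue. It is not ThirdEye's implementation: the naive first-season initialization, the function names, and the absence of BOBYQA optimization, robust fitting and regressors are all simplifications assumed for this example.

```python
def ets_one_step_forecasts(y, m, alpha, beta, gamma, phi=1.0):
    """One-step-ahead forecasts for y[m:] with additive error, trend and season.

    m is the seasonal period in number of points. phi=1.0 means no damping.
    """
    # Naive initialization from the first season (an assumption of this sketch).
    level = sum(y[:m]) / m
    trend = 0.0
    season = [v - level for v in y[:m]]
    forecasts = []
    for t in range(m, len(y)):
        s_prev = season[t - m]
        base = level + phi * trend            # damped level + trend
        forecasts.append(base + s_prev)       # forecast made before seeing y[t]
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * base
        trend = beta * (new_level - level) + (1 - beta) * phi * trend
        season.append(gamma * (y[t] - base) + (1 - gamma) * s_prev)
        level = new_level
    return forecasts


def clamp(value, minimum=None, maximum=None):
    """Keep a prediction or bound inside [minimum, maximum] when they are set."""
    if minimum is not None:
        value = max(value, minimum)
    if maximum is not None:
        value = min(value, maximum)
    return value


series = [1.0, 2.0, 3.0] * 4
predicted = ets_one_step_forecasts(series, m=3, alpha=0.5, beta=0.5, gamma=0.5)
bounded = [clamp(p, minimum=0, maximum=100) for p in predicted]
```

On a perfectly periodic series such as the one above, the forecasts reproduce the observed values, which is a quick way to sanity-check the recursion.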

Events

name | description | default value

eventSqlFilter

SQL filter to apply when fetching events. Learn more.

 

eventLookaround

When fetching events, additional margin to apply on startTime and endTime to look around the timeframe. In ISO-8601 format. Example: P1D.

P1D

eventTypes

Types of events to fetch. Example: ["HOLIDAY", "DEPLOYMENT"]. [] or null means no filtering. The default value ["__NO_EVENTS"] means don’t fetch events.

["__NO_EVENTS"]

FILTER

Time of week

name | description | default value

daysOfWeek

Used to ignore anomalies that happen at specific time periods. A list of days. Anomalies happening on these days are ignored if timeOfWeekIgnore is true. Example: ["MONDAY", "SUNDAY"].

[]

hoursOfDay

Used to ignore anomalies that happen at specific time periods. A list of hours. Anomalies happening on these hours are ignored. Example: [0,1,2,23]

[]

dayHoursOfWeek

Used to ignore anomalies that happen at specific time periods. A mapping of {DAY: [hours]}. Anomalies happening on these timeframes are ignored if timeOfWeekIgnore is true. Example: {"FRIDAY": [22, 23], "SATURDAY": [0, 1, 2]}

 

{}
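The three time-of-week parameters can be read as one combined check on the anomaly timestamp. The sketch below shows that reading; it assumes the timestamp is already expressed in the alert timezone, and the function and constant names are hypothetical.

```python
from datetime import datetime

DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]


def is_ignored(ts, days_of_week=(), hours_of_day=(), day_hours_of_week=None):
    """Return True if an anomaly starting at ts falls in an ignored time period."""
    day = DAY_NAMES[ts.weekday()]  # datetime.weekday(): Monday is 0
    if day in days_of_week:
        return True
    if ts.hour in hours_of_day:
        return True
    return ts.hour in (day_hours_of_week or {}).get(day, [])


# Friday 2024-01-05 at 22:00 is covered by the dayHoursOfWeek example.
print(is_ignored(datetime(2024, 1, 5, 22),
                 day_hours_of_week={"FRIDAY": [22, 23], "SATURDAY": [0, 1, 2]}))
```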

 

Threshold

name | description | default value

thresholdFilterMin

Used to ignore anomalies that don’t meet the thresholdFilter min and max. Example: set thresholdFilterMin = 10 to ignore anomalies when the metric is smaller than 10. Can help ignore anomalies happening in low data regimes. Filter threshold minimum. If -1, no minimum threshold is applied.

-1

thresholdFilterMax

Used to ignore anomalies that don’t meet the thresholdFilter min and max. Example: set thresholdFilterMin = 10 to ignore anomalies when the metric is smaller than 10. Can help ignore anomalies happening in low data regimes. Filter threshold maximum. If -1, no maximum threshold is applied.

-1
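A minimal sketch of the threshold filter described above, assuming it is applied to the current metric value and that -1 disables each bound. The function name is hypothetical.

```python
def passes_threshold_filter(value, threshold_min=-1, threshold_max=-1):
    """Return True if an anomaly with this metric value is kept by the filter."""
    if threshold_min != -1 and value < threshold_min:
        return False  # metric too small: anomaly ignored
    if threshold_max != -1 and value > threshold_max:
        return False  # metric too large: anomaly ignored
    return True


# thresholdFilterMin = 10 ignores anomalies when the metric is smaller than 10.
print(passes_threshold_filter(7, threshold_min=10))
```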

Guardrail metric

name | description | default value

guardrailMetricMin

Used to ignore anomalies that don’t meet the guardrail threshold. Minimum threshold of the guardrail metric. If -1, no minimum threshold is applied.

-1

guardrailMetricMax

 

Used to ignore anomalies that don’t meet the guardrail threshold. Maximum threshold of guardrailMetric. If -1, no maximum threshold is applied.

-1

guardrailMetric

Used to ignore anomalies that don’t meet the guardrail threshold. Metric to use as a threshold guardrail. Example: use COUNT(*) and set guardrailMetricMin = 100 to ignore anomalies detected when there are fewer than 100 observations in the period.

COUNT(*)

Simple baseline

name | description | default value

offsetBaselineFilterPattern

Used to ignore anomalies that are not detected as anomalies by a simple model. Whether to detect an anomaly if it’s a drop, a spike, or either.

UP_OR_DOWN

offsetBaselineFilterSensitivity

 

Used to ignore anomalies that are not detected as anomalies by a simple model. Detection sensitivity. For instance with offsetBaselineFilterIntervalsMethod=PERCENTAGE, set 50 for a 50% percentage change threshold. With offsetBaselineFilterIntervalsMethod=ABSOLUTE, set 200 for a 200 absolute difference threshold between the metric and the baseline.

-1

offsetBaselineFilterIntervalsMethod

Used to ignore anomalies that are not detected as anomalies by a simple model. Method to compute intervals. PERCENTAGE or ABSOLUTE.

ABSOLUTE

offsetBaselineFilterModelOffsets

Used to ignore anomalies that are not detected as anomalies by a simple model. A list of offsets in ISO-8601 format to use as baseline. Example: ["P7D", "P14D"] compares the current value to the aggregation of the values of the 2 previous weeks.

["P7D"]

offsetBaselineFilterModelAggregation

Used to ignore anomalies that are not detected as anomalies by a simple model. The aggregation function to use to combine historical values. One of MEDIAN, AVERAGE, MIN, MAX, or any PCTXXX percentile, for example PCT05 (5th percentile), PCT95, PCT999 (99.9th percentile).

MEDIAN
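The simple baseline filter can be pictured as follows: take the metric values observed at each configured offset, aggregate them, and compare the current value against that baseline. The sketch below covers only the MEDIAN aggregation and ignores offsetBaselineFilterPattern; the function name and the zero-baseline handling are assumptions of this example.

```python
from statistics import median


def baseline_flags_anomaly(current, historical, sensitivity, method="ABSOLUTE"):
    """Return True if the simple baseline model also sees an anomaly.

    historical: metric values at each offset, e.g. one week ago and two weeks ago.
    """
    baseline = median(historical)
    diff = abs(current - baseline)
    if method == "PERCENTAGE":
        if baseline == 0:
            return diff > 0  # assumption: any change from a zero baseline counts
        return 100.0 * diff / abs(baseline) > sensitivity
    return diff > sensitivity  # ABSOLUTE


# Baseline is median(100, 110) = 105, so 160 is about a 52% change.
print(baseline_flags_anomaly(160, [100, 110], sensitivity=50, method="PERCENTAGE"))
print(baseline_flags_anomaly(160, [100, 110], sensitivity=200, method="ABSOLUTE"))
```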

Special events

name | description | default value

eventFilterSqlFilter

Used to ignore anomalies that happen during events. SQL filter to apply on the events. Learn more.

 

eventFilterLookaround

Used to ignore anomalies that happen during events. Offset to apply on startTime and endTime to look around the timeframe. In ISO-8601 format. Example: P1D.

P2D

eventFilterTypes

Used to ignore anomalies that happen during events. List of event types to fetch. Example: ["HOLIDAY", "DEPLOYMENT"]. [] fetches all events. Use ["__NO_EVENTS"] to disable.

["__NO_EVENTS"]

eventFilterBeforeEventMargin

Used to ignore anomalies that happen during events. A period in ISO-8601 format, before the start of the event, that is also impacted by the event. Example: if eventFilterBeforeEventMargin is P1D and the event happens on [Dec 24 0:00, Dec 25 0:00[, the label will be applied to anomalies happening on [Dec 23 0:00, Dec 25 0:00[.

P0D

eventFilterAfterEventMargin

Used to ignore anomalies that happen during events. Same as eventFilterBeforeEventMargin at the end of the event.

P0D
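The effect of the two margins can be sketched as widening the event window before checking whether an anomaly overlaps it. The half-open interval convention follows the [start, end[ notation used in the example above; the function name and the overlap rule are assumptions of this sketch.

```python
from datetime import datetime, timedelta


def overlaps_event(anomaly_start, anomaly_end, event_start, event_end,
                   before_margin=timedelta(0), after_margin=timedelta(0)):
    """Return True if [anomaly_start, anomaly_end[ overlaps the widened event window."""
    window_start = event_start - before_margin
    window_end = event_end + after_margin
    return anomaly_start < window_end and anomaly_end > window_start


# Event on [Dec 24 0:00, Dec 25 0:00[ with a P1D before-margin.
event = (datetime(2024, 12, 24), datetime(2024, 12, 25))
print(overlaps_event(datetime(2024, 12, 23, 12), datetime(2024, 12, 23, 13),
                     *event, before_margin=timedelta(days=1)))
```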

Impact

name | description | default value

impactThreshold

Used to ignore anomalies that don’t meet the impact threshold. Impact filter threshold.

-1

POSTPROCESS

Data mutability

name | description | default value

mutabilityPeriod

Use if your data is mutable. ThirdEye will maintain the detection results up to date on the mutable period. For instance, if your last 10 days of data is mutable, set P10D. At each cron detection job, the detection results for the last 10 days will be updated.

P0D

reNotifyPercentageThreshold

For detection replay when data is mutable. If the percentage difference between an existing anomaly and a new anomaly on the same time frame is above this threshold, re-notify. Combined with reNotifyAbsoluteThreshold. Both thresholds must pass to re-notify. If zero, always re-notify. If null or negative, never re-notify.

-1

reNotifyAbsoluteThreshold

For detection replay when data is mutable. If the absolute difference between an existing anomaly and a new anomaly on the same time frame is above this threshold, re-notify. Combined with reNotifyPercentageThreshold. Both thresholds must pass to re-notify. If zero, always re-notify. If null or negative, never re-notify.

-1

Anomaly merger

name | description | default value

mergeMaxGap

Maximum gap between 2 anomalies for anomalies to be merged. In ISO-8601 format. Example: PT2H. The default behavior is to merge consecutive anomalies only. To disable anomaly merging entirely, set this value to P0D.

 

mergeMaxDuration

Maximum duration of an anomaly merger. At merge time, if an anomaly merger would get bigger than this limit, the anomalies are not merged. In ISO-8601 format. Example: P7D.
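The interplay of mergeMaxGap and mergeMaxDuration can be sketched as a single pass over anomalies sorted by start time. This is an illustration of the two rules as described here, not ThirdEye's merger; the function name and the (start, end) tuple representation are assumptions.

```python
from datetime import datetime, timedelta


def merge_anomalies(anomalies, max_gap, max_duration):
    """Merge (start, end) anomalies whose gap is at most max_gap.

    A merge is skipped when the merged anomaly would exceed max_duration.
    A zero max_gap disables merging, as described for P0D.
    """
    ordered = sorted(anomalies)
    if max_gap <= timedelta(0):
        return ordered
    merged = []
    for start, end in ordered:
        if merged:
            prev_start, prev_end = merged[-1]
            new_end = max(end, prev_end)
            gap_ok = start - prev_end <= max_gap
            duration_ok = new_end - prev_start <= max_duration
            if gap_ok and duration_ok:
                merged[-1] = (prev_start, new_end)
                continue
        merged.append((start, end))
    return merged


day = datetime(2024, 1, 1)
hours = lambda h: day + timedelta(hours=h)
found = [(hours(0), hours(1)), (hours(1), hours(2)), (hours(5), hours(6))]
print(merge_anomalies(found, max_gap=timedelta(hours=2), max_duration=timedelta(days=7)))
```

With a 2-hour gap the first two anomalies are merged and the third, 3 hours later, stays separate.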

 

RCA

name | description | default value

rcaAggregationFunction

The aggregation function to use for RCA. If the detection metric name is known to ThirdEye, this parameter is optional.

 

rcaIncludedDimensions

List of the dimensions (columns in the dataset) to use in RCA drill-downs. If not set or empty, all dimensions of the table are used. Learn more.

[]

rcaExcludedDimensions

 

List of dimensions (columns in the dataset) to ignore in RCA drill-downs. If not set or empty, all dimensions of the table are used. rcaExcludedDimensions and rcaIncludedDimensions cannot be used at the same time.

[]

rcaEventTypes

A list of event types to filter on for RCA. Only events that match these types will be shown in the RCA related events tab. Learn more.

[]

rcaEventSqlFilter

A SQL filter for RCA events. Only events that match the filter will be shown in the RCA related events tab. Learn more.
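To show how the parameters on this page fit together, the sketch below builds an alert payload as a Python dictionary and serializes it to JSON. The outer layout (name, template, templateProperties), the datasource, dataset and column names, and every value are placeholders assumed for this example; check them against your own ThirdEye setup before use.

```python
import json

alert = {
    "name": "p95-latency-per-user",  # hypothetical alert name
    "template": {"name": "startree-ets-ratio-percentile"},
    "templateProperties": {
        # DATA
        "dataSource": "pinot",                    # hypothetical datasource name
        "dataset": "requests",                    # hypothetical dataset
        "aggregationFunction": "PERCENTILETDIGEST",
        "aggregationColumn": "latency_ms",        # hypothetical numerator column
        "aggregationParameter": "95",
        "aggregationFunction2": "DISTINCTCOUNTHLL",
        "aggregationColumn2": "user_id",          # hypothetical denominator column
        "monitoringGranularity": "PT1H",
        "timezone": "UTC",
        "queryFilters": "AND country='US'",
        # DETECTION
        "lookback": "P21D",
        "seasonalityPeriod": "P7D",
        "sensitivity": "1",                       # placeholder value
        "metricMinimumValue": "0",
        "pattern": "UP_OR_DOWN",
        # POSTPROCESS
        "mergeMaxGap": "PT2H",
        "mergeMaxDuration": "P7D",
    },
}

payload = json.dumps(alert, indent=2)
print(payload)
```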