Overview

Thresholds are used to record instances of notable events occurring within your network and, optionally, to trigger automated alerts advising you of such events. Statseeker allows you to configure thresholds against:

  • Any and every metric collected by Statseeker
  • Both raw and calculated values (averages, standard deviations, 95th percentile, rates of change, etc.)
  • Interpretive values such as anomaly strength
  • Predicted values as well as recorded values, using trendlines and forecast data based on the data history of your specific devices and environment

Threshold events are recorded when all conditions specified in the threshold configuration are met, and each time those conditions are met.

Note: by default, threshold events are stored for 400 days before being purged from Statseeker. This value can be modified to suit your requirements (setting to “0” will store these events indefinitely) from the Administration Tool. Select:

  • Administration Tool > Network Discovery – Advanced Options > Advanced Options > History > Keep Threshold Event History For

Example:
A threshold has been configured to record an event when the average Tx Utilization over a 10-minute interval is greater than 95%.

  • An event is not recorded at the moment interface utilization passes 95%
  • An event is recorded when the average interface utilization for the last 10 minutes exceeds 95%
  • If the average interface utilization remains above 95% for a further 10 minutes, then another event will be recorded
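
The behavior in this example can be sketched as follows. This is a minimal illustration only, not Statseeker's implementation: the function name `breach_events` and the sample readings are hypothetical, and it assumes one reading per minute with the check running once per interval.

```python
from statistics import mean


def breach_events(samples, interval, threshold):
    """Return the minute marks at which an event would be recorded.

    samples   -- one utilization reading (percent) per minute
    interval  -- minutes between checks; also the averaging window here
    threshold -- an event is recorded when the window average exceeds this
    """
    events = []
    for end in range(interval, len(samples) + 1, interval):
        window = samples[end - interval:end]
        if mean(window) > threshold:
            events.append(end)
    return events


# 30 minutes of hypothetical Tx Utilization readings, one per minute.
readings = (
    [50] * 9 + [99]   # minute 10 spikes past 95%, but the average stays low
    + [97] * 10       # minutes 11-20 average 97%
    + [98] * 10       # minutes 21-30 average 98%
)

print(breach_events(readings, interval=10, threshold=95))  # [20, 30]
```

The single spike at minute 10 records nothing, while the two sustained 10-minute periods each record their own event.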

Thresholds can be set so that an event is recorded while the metric is in breach of threshold (as described above), or when it transitions from one ‘state’ to another. The states that Statseeker utilizes are:

  • High: the monitored value is above the set threshold level
  • Low: the monitored value is below the set threshold level
  • Unknown: the monitored value cannot currently be determined because the device/interface is unreachable, typically because it is offline or something upstream of it is offline

Note: if you have ‘On Transition’ thresholds set for the interfaces on a device (or group of devices), and the device becomes unreachable, then a transition event will be recorded for every interface on the device. You can use a Bundling Policy to group all resulting alerts into a single alert message and prevent any alert bombardment.

You can also use an Upstream Device Configuration to suppress alerts for unreachable downstream devices\interfaces, when an upstream device becomes unreachable.
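
As a rough sketch of the state model described above, the following shows how transition events could be derived from a series of checks. The function names are hypothetical, `None` stands in for an unreachable device/interface, and treating a value exactly equal to the threshold as Low is an assumption made for this illustration.

```python
def classify(value, threshold):
    """Map a reading to a state; None stands for an unreachable entity."""
    if value is None:
        return "Unknown"
    # Assumption: a value equal to the threshold is treated as Low.
    return "High" if value > threshold else "Low"


def transition_events(readings, threshold):
    """Return (check_number, old_state, new_state) for each state change."""
    events = []
    previous = None
    for check, value in enumerate(readings, start=1):
        state = classify(value, threshold)
        if previous is not None and state != previous:
            events.append((check, previous, state))
        previous = state
    return events


# Hypothetical readings: the entity is unreachable at checks 4 and 5.
readings = [40, 90, 92, None, None, 45]
for event in transition_events(readings, threshold=85):
    print(event)
# (2, 'Low', 'High')
# (4, 'High', 'Unknown')
# (6, 'Unknown', 'Low')
```

Note that check 3 (still High) and check 5 (still Unknown) produce nothing, because the state did not change.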



Threshold Levels

It is important to tailor thresholds to your network environment. It is strongly advised that you analyze your network to identify the extent of usual activity, and use this information when setting thresholds, to prevent threshold events being recorded against activity which is typical for your network. When setting a threshold:

  • Be selective when assigning the threshold: consider which metrics to threshold and on which devices/groups
  • Review the history of that metric on those devices, and select a threshold that is outside of the observed typical behavior
  • Be sure to set the Time Filter > Data Range parameter for the threshold to take into account the ‘spiky’ behavior often encountered with interface utilization and CPU and memory load. A very small data range can be responsible for generating an excessive number of threshold events.
    • Note: in the absence of an explicitly set Time Filter > Data Range, the Interval acts as both the interval (how often to assess the threshold), and the Data Range (what data to evaluate when assessing the threshold)
  • Remember to account for network changes, such as maintenance windows, with respect to alerts being generated from threshold events
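
To illustrate why a very small data range can generate an excessive number of events on ‘spiky’ metrics, the sketch below compares two data ranges against the same hypothetical CPU load series. The function `count_breaches` and the data are invented for this illustration, and the averaging logic is a simplification rather than Statseeker's actual calculation.

```python
from statistics import mean


def count_breaches(samples, interval, data_range, threshold):
    """Count checks whose trailing-window average exceeds the threshold.

    A check runs every `interval` samples and averages the last
    `data_range` samples (fewer at the start of the series).
    """
    breaches = 0
    for end in range(interval, len(samples) + 1, interval):
        window = samples[max(0, end - data_range):end]
        if mean(window) > threshold:
            breaches += 1
    return breaches


# Hypothetical CPU load, one reading per minute for an hour: a 30% load
# with a five-minute burst to 100% every fifteen minutes.
cpu = ([30] * 10 + [100] * 5) * 4

print(count_breaches(cpu, interval=5, data_range=5, threshold=80))   # 4
print(count_breaches(cpu, interval=5, data_range=30, threshold=80))  # 0
```

With a 5-minute data range every burst breaches the threshold, while a 30-minute data range smooths the bursts out and records nothing.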



Configuring a Threshold

To configure a new threshold:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Click Add

This will display the Threshold Configuration screen.

Field Required Description
Name The name of the threshold. This will be referenced by alerts and reports.
State The state of the threshold:

  • Enabled – the threshold configuration is generating events when the threshold requirements are met
  • Disabled – the threshold configuration generates no events, and no alerts can be triggered by breaches of this threshold
Attribute The metric that the threshold will be set against
Format Data manipulation (Average, Total, Standard Deviation, 95th Percentile, etc.) applied to the metric

Format Description
Average Average of the metric over the reporting period
Total Total of the metric over the reporting period
Minimum Minimum of the metric over the reporting period
Maximum Maximum of the metric over the reporting period
Median Median of the metric over the reporting period
Standard Deviation The value of a single standard deviation for the metric
95th Percentile 95th Percentile (95% of observed values lie below this point) of the metric over the reporting period
Percentile Custom percentile value of the metric over the reporting period [0-100]
0% = minimum; 50% = median; 100% = maximum
Count Number of non-null data points
Anomaly Metric An integer [-100 to +100] indicating how anomalous the data in the reporting period is compared to the data history of the metric on that device/interface, and whether that data presents a higher (positive anomaly metric), or lower (negative anomaly metric) value than typically seen

Note: values between -85 and +85 are not considered statistically anomalous.
Anomaly Strength A positive integer [0 – 100] indicating how anomalous the data in the reporting period is compared to the data history of the metric on that device/interface

Note: values under 85 are not considered statistically anomalous.
Trend A range of options for displaying a trend related value for the metric

  • Daily Change: the average daily change in the metric over the reporting period
  • Change: numerical change observed over reporting period
  • Percent Change: percent change observed over reporting period
  • Strength: how closely the trendline fits the data. A high strength means that the data points lie close to the trendline.
  • Prediction: specify a future date/time to see a predicted value for the metric based on the trendline

The Disable bounds checking in trend calculations option can be unchecked to limit trendline values to the upper/lower boundary of historically observed data.

Baseline Extract a baseline value for the metric. The baseline percentile is set to 50 in all instances.

Baseline Formats:

  • Average: the average of the 50th percentile values throughout the baseline history range
  • Comparison: difference between the baseline 50th percentile and the observed value
Forecast A range of options for displaying forecast data for the metric

  • Value: the forecast value for the metric at the specified date-time
  • Upper: the upper boundary for expected forecast data between now and the specified date-time
  • Lower: the lower boundary for expected forecast data between now and the specified date-time
  • Daily Change: average daily change in metric between now and the specified date-time
Baseline History Timefilter specifying the scope of historical data used when calculating Baseline, Anomaly or Forecast values.
Defaults to previous 6 months (range = now – 180d to now).
Interval (mins) How often the threshold is checked.

Note: if a Time Filter is not set, then the Interval is used in its place. For example:

  • Format: average
  • Interval: 5mins
  • Time Filter: not set

Every 5 minutes, the attribute average is calculated for the previous 5 minutes and compared to the threshold value. Compare this to the following where a time filter is set.

  • Format: average
  • Interval: 5mins
  • Time Filter: Range = now -10m to now

Every 5 minutes, the attribute average is calculated for the previous 10 minutes and compared to the threshold value.

Condition Threshold is in breach when the metric is Above or Below the threshold Value
Trigger
  • While in Breach: the Attribute is checked at each Interval, and an event record is created when the attribute enters/leaves the breached state. If an alert is associated with the threshold, the alert triggers when the attribute enters the breached state, and additional alerts are generated at each interval that the attribute remains in a breached state.
  • Only on Transition: the Attribute is checked at each Interval, and an event record is created when the attribute enters/leaves the breached state. If an alert is associated with the threshold, the alert triggers when the attribute enters the breached state; additional alerts are not generated if the attribute remains breached across multiple intervals.
Device Aggregation Format Enable to combine data from a single device. Trigger alerts based on:

  • CPU load metrics averaged across all cores on multi-CPU systems
  • Interface utilization averaged across all interfaces on each thresholded device
  • Total traffic or the 95th percentile across all interfaces on a device
Value The value that the monitored Attribute/Format combination is compared to for triggering an event
Time Filter
The time filter settings allow you to set the data range used when calculating the Attribute/Format combination. The result is compared against the threshold Value to determine whether an event should be triggered.
Data Range Basic time filter options used to populate the time filter Query field.

Note: the Last option offers simple and flexible time range assignment.
Time Zone The time-zone used when collecting data within the specified Data Range. If not specified, the server time zone will be used.
Query Info The time filter query string that will be used for the threshold. The content of this field is automatically constructed based on the Data Range and Time Zone fields but can also be manually edited.

The Advanced button opens the Advanced Time Filter Editor.

Attribute Filters
The Attribute Filters allow you to limit the devices/interfaces that the threshold will apply to, based on an attribute of those devices/interfaces. Multiple filters may be applied by clicking the associated (+) button, and are combined in a logical AND fashion, i.e. all must be satisfied for the event to trigger.
Attribute The attribute to be used to filter devices for thresholding
Regex The RegEx string used to match against attributes for filtering
Entity Filters
The Entity Filters allow you to limit the devices/interfaces that the threshold will apply to, based on a selection of entities (groups, devices, or interfaces). Multiple filters may be applied by clicking the associated (+) button, and are combined in a logical AND fashion, i.e. all must be satisfied for the event to trigger. Multiple group/device/interface selections within a single filter are combined in a logical OR fashion.
Type The type of entity. This selection populates the Entities list.
Entities The Exclude and Include groups for entity selection
  • Enter a Name for the threshold
  • Select the Attribute (metric) for the threshold using the drop-down lists provided
  • Set all other fields in the Threshold section
Note: refer to the table above for details on any specific field and refer to the examples below for a demonstration of various typical configurations.
  • Optionally, set a Time Filter. If a time filter is not set, then the interval will be used instead.
  • Optionally, set one or more Attribute Filters to refine the device\interface selection that the threshold applies to
  • Optionally, set one or more Entity Filters to refine the device\interface selection that the threshold applies to
  • Click Save when done

Once the threshold configuration has been saved, and if the threshold State = Enabled, activity satisfying the threshold requirements will generate threshold events. These generated events are logged and can be used to trigger alert functionality.

Note: existing thresholds can be cloned to speed up the process of threshold creation. This is particularly useful when creating a number of very similar thresholds that vary in a single attribute such as the devices being targeted. For more information, see Cloning a Threshold.
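
The relationship noted for the Percentile format in the table above (0% = minimum; 50% = median; 100% = maximum) can be checked with a short sketch. The `percentile` function below uses linear interpolation between closest ranks, which is an assumption made for this illustration; the exact method Statseeker uses is not stated here.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between closest ranks."""
    if not values:
        raise ValueError("no data points")
    if not 0 <= pct <= 100:
        raise ValueError("pct must be within [0, 100]")
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical data points for one reporting period.
data = [12, 47, 33, 90, 58]

print(percentile(data, 0))    # 12.0 (the minimum)
print(percentile(data, 50))   # 47.0 (the median)
print(percentile(data, 100))  # 90.0 (the maximum)
```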


Entity Filter Combination

Multiple Entity filters may be applied by clicking the associated (+) button, and are combined in a logical AND fashion, i.e. all must be satisfied for the event to trigger. Multiple group/device/interface selections within a single filter are combined in a logical OR fashion.
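
The AND/OR combination described above can be expressed as a small sketch. The function `entity_matches` and the group names `Core` and `Edge` are hypothetical, and the sketch only models Include selections on groups; it does not model Exclude lists.

```python
def entity_matches(entity_groups, filters):
    """Return True when an entity satisfies every filter (AND).

    entity_groups -- set of group names the entity belongs to
    filters       -- list of sets; each set holds one filter's selections,
                     any one of which may match (OR)
    """
    return all(entity_groups & selections for selections in filters)


filters = [
    {"AU_Routers", "AU_Switches"},  # filter 1: either group satisfies it
    {"Core"},                       # filter 2: must also be in Core
]

print(entity_matches({"AU_Routers", "Core"}, filters))   # True
print(entity_matches({"AU_Switches", "Edge"}, filters))  # False
```

The second entity satisfies the first filter but not the second, so the threshold would not apply to it.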




Example: Server File System Usage

The following is an example of a server file system threshold configuration.

  • Every 15 minutes, check the average used percentage, for the previous 15 minutes, of the /DataStore partition, on servers in the AU_Servers group
  • Trigger an event when this value transitions to above 85% (i.e. do not trigger subsequent events if it remains above 85%)




Example: Predictive File System Usage

The following is an example of a predictive threshold configuration. Predictive thresholds use your data history to predict the future shape of your data.

  • Once a day, use the last 3 months of data history for the file system to predict the used percentage, at a point 60 days into the future, of the /DataStore partition on servers in the Servers group
  • Trigger an event if this value is predicted to be above 85%
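
The idea behind a trendline prediction can be sketched with an ordinary least-squares line fitted to the data history and extended 60 days forward. This is an assumption-laden illustration: the function `linear_fit` and the sample history are hypothetical, and Statseeker's own trend and forecast calculations may differ.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical history: used percentage sampled once a day for 90 days,
# growing by 0.25 percentage points per day from 55%.
days = list(range(90))
used_pct = [55 + 0.25 * day for day in days]

slope, intercept = linear_fit(days, used_pct)
predicted = slope * (days[-1] + 60) + intercept  # 60 days past the last sample

print(round(predicted, 2))  # 92.25
print(predicted > 85)       # True, so an event would be triggered
```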




Example: Outbound Traffic Rolling Average

Every minute, calculate the average Tx Utilization for all interfaces in the AU_Routers and AU_Switches groups, and trigger an alert if this average is greater than 95%.



Example: Monthly Bandwidth Usage

Every 15 minutes, check the primary gateway router to see if incoming traffic is greater than 80% of the 3TB monthly limit for the site. Reset the traffic count on the 22nd of each month.
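
The arithmetic behind this example can be worked through as follows. The sketch assumes decimal terabytes (1 TB = 10^12 bytes) and uses a hypothetical helper, `period_start`, to find the most recent reset date; neither detail is taken from the Statseeker configuration itself.

```python
from datetime import date

MONTHLY_LIMIT_TB = 3
ALERT_PERCENT = 80
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes

# Threshold value in bytes: 80% of the 3TB monthly limit.
threshold_bytes = MONTHLY_LIMIT_TB * BYTES_PER_TB * ALERT_PERCENT // 100
print(threshold_bytes)  # 2400000000000


def period_start(today, reset_day=22):
    """Most recent reset date on or before `today`."""
    if today.day >= reset_day:
        return today.replace(day=reset_day)
    if today.month == 1:
        return date(today.year - 1, 12, reset_day)
    return date(today.year, today.month - 1, reset_day)


print(period_start(date(2024, 3, 5)))   # 2024-02-22
print(period_start(date(2024, 3, 25)))  # 2024-03-22
```

Traffic would be totaled from the period start to now and compared against the 2.4TB value at each 15-minute check.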




Monitoring Threshold Configurations

It is recommended that you monitor a newly configured threshold prior to configuring any alerts to be generated from the threshold. Too low a threshold can result in alert flooding, while too high a threshold can result in anomalous activity going unreported.

To review a threshold’s performance for the previous week:

  • Specify a time filter of Last Week
  • Select Thresholds > Threshold Summary from the Report List

The summary report will detail each of the threshold configurations found on your server.

Selecting one of the threshold configurations will open the Threshold Report, filtered by the selected threshold and the time filter specified in the Console. The total number of generated events is an indicator of how appropriate the threshold level and interval are for your network activity.



Managing Threshold Data

By default, Statseeker stores threshold event records for 400 days. This value can be altered as needed and threshold records can be kept indefinitely if required (set storage time to 0). To update the default value:

  • Select Administration Tool > Network Discovery – Advanced Options > Advanced Options
  • Under the History section, locate Keep Threshold Event History For and update as needed
  • Click Save
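
The retention rule described above, including the special meaning of 0, can be summarized in a short sketch. The function `is_purgeable` is hypothetical and only illustrates the rule; it is not how Statseeker performs the purge.

```python
from datetime import datetime, timedelta


def is_purgeable(event_time, now, keep_days=400):
    """True if an event is older than the retention period; 0 keeps all."""
    if keep_days == 0:
        return False
    return now - event_time > timedelta(days=keep_days)


now = datetime(2024, 6, 1)
print(is_purgeable(now - timedelta(days=401), now))               # True
print(is_purgeable(now - timedelta(days=399), now))               # False
print(is_purgeable(now - timedelta(days=5000), now, keep_days=0)) # False
```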




Editing a Threshold

To edit an existing threshold configuration:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Select the configuration to be edited. The Search field above the threshold list accepts both standard strings and case-insensitive RegEx.

The configuration panel will display the selected threshold configuration.

  • Edit the configuration as required and click Save

Saving the configuration will restart the Statseeker thresholding engine, which will then use the latest configurations.

Note: editing some fields affects how the threshold is referenced by Statseeker. When you edit one of these fields, you will be warned that saving the changes will delete the existing event history for that threshold. To keep the event history for the old threshold, we recommend duplicating the threshold instead and updating all associated alert configurations to point to the new threshold.
These fields are:

  • Name
  • Attribute
  • Format
  • Device Aggregation Format
  • Value



Enabling/Disabling Thresholds

To enable/disable thresholds:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Click the threshold/s to select (the Search field accepts case-insensitive RegEx) and click Enable/Disable
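
As an illustration of case-insensitive RegEx matching against threshold names, the sketch below uses Python's `re` module. The threshold names and the pattern are hypothetical, and the RegEx dialect accepted by the Search field may differ from Python's in some details.

```python
import re

# Hypothetical threshold names.
names = ["Core Router Tx Util", "core-switch cpu", "Edge FS Usage"]

# Names starting with "core" and ending in "util" or "cpu", in any letter case.
pattern = re.compile(r"^core.*(util|cpu)$", re.IGNORECASE)

matching = [name for name in names if pattern.search(name)]
print(matching)  # ['Core Router Tx Util', 'core-switch cpu']
```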



To enable/disable a threshold from within the threshold configuration:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Click the threshold to select
  • Set Status to On/Off as needed
  • Click Save



Cloning a Threshold

To clone an existing threshold configuration:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Select the configuration to be cloned. The Search field above the threshold list accepts both standard strings and case-insensitive RegEx.
  • Click Clone

A new threshold configuration will be created but not saved. Amend the configuration as needed, and be sure to save the threshold before leaving the configuration screen.



Deleting Thresholds

To delete an existing threshold configuration:

  • Select Administration Tool > Alerting / Event Management > Threshold Config
  • Select the configuration/s to be deleted and click Delete
  • Confirm the action when prompted

This action will restart the Statseeker thresholding engine, utilizing the remaining configurations.



Using Thresholds to Trigger Alerts

Alerts can be triggered from recorded threshold events. To configure a threshold event alert:

Note: the instructions below presume a working knowledge of Statseeker alert configuration. For more details, see Configuring Alerts.
  • Select Administration Tool > Alerting / Event Management > Alerting
  • Click Add

This will display the New Alert configuration screen.

  • Select an alert Template to suit your requirements
  • Name the alert
  • Set the alert status to On
  • Set Event Type to the required threshold event

  • Set Entity and Time Filters if you want alerts triggered by a subset of the threshold event records
  • Set the Time Filter Mode as required for the alert
  • Configure the alert recipients
  • Click Save Alert

