New Relic's Anomaly Detection helps DevOps teams identify and resolve issues before they impact users.
It uses machine learning algorithms to analyze historical data and detect anomalies in real time.
This allows teams to proactively address issues, reducing mean time to detect (MTTD) and mean time to resolve (MTTR).
By leveraging Anomaly Detection, DevOps teams can focus on what matters most: delivering high-quality software and experiences.
Configuring Anomaly Detection
You can create anomaly sensitivity thresholds from an alert condition.
Set the anomaly direction to choose whether the condition monitors incidents that happen above or below the predicted value. This allows you to focus on specific types of anomalies.
Use the slider bar to adjust the Critical sensitivity threshold, represented in the preview chart by the light gray area around the signal. The tighter the band around the signal, the more sensitive it is and the more incidents it will generate.
You can create a Warning threshold (the darker gray area around the anomaly) in addition to the Critical threshold. This helps to catch potential issues before they become critical.
Here are some key settings to keep in mind when creating anomaly thresholds:
- Direction: Above or Below
- Critical Threshold: Adjust with the slider bar
- Warning Threshold: Create a separate threshold
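The three settings above can be sketched as a small classifier. This is a minimal illustration, not New Relic's implementation: the `AnomalyThresholds` class, the `classify` function, and the choice to measure band width in standard deviations from a predicted value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnomalyThresholds:
    """Hypothetical container for the settings listed above."""
    direction: str                          # "above" or "below"
    critical_width: float                   # band half-width, in standard deviations (assumed unit)
    warning_width: Optional[float] = None   # optional, narrower band


def classify(value: float, predicted: float, std_dev: float,
             thresholds: AnomalyThresholds) -> str:
    """Return "critical", "warning", or "ok" for a single data point."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    if thresholds.direction not in ("above", "below"):
        raise ValueError("direction must be 'above' or 'below'")

    # Signed distance from the prediction, in standard deviations.
    deviation = (value - predicted) / std_dev
    if thresholds.direction == "below":
        deviation = -deviation  # only drops below the prediction count

    if deviation > thresholds.critical_width:
        return "critical"
    if thresholds.warning_width is not None and deviation > thresholds.warning_width:
        return "warning"
    return "ok"
```

A smaller `critical_width` corresponds to a tighter band around the signal, so more points fall outside it and more incidents are generated.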
With these settings in place, you can get a high-level view of your ingest patterns with the Lookout view. This view provides a quick snapshot of your data and helps you identify potential issues.
Understanding Anomaly Detection
In CloudZero, Anomaly Detection is automatically enabled across your account and all views that use Real Cost data.
In New Relic, you can use the Lookout view to get a high-level view of your ingest patterns and create NRQL queries to search for ingest anomalies.
To get more detailed information, change the facet field in your NRQL query to an attribute such as consumingAccountName, which breaks ingest data down by account name.
CloudZero's Anomaly Detection checks globally across the Cloud Provider Dimensions of Accounts, Service, and Usage Family.
When you create a custom view in CloudZero, Anomaly Detection is enabled for that subset of data, allowing you to focus on specific areas of your system.
Thresholds and Rules
The direction, Critical, and Warning settings described under Configuring Anomaly Detection are all set from the alert condition. The anomaly direction chooses whether incidents above or below the predicted value are monitored, and the Critical sensitivity threshold sets the width of the light gray band around the signal.
The optional Warning threshold adds a darker gray band that alerts you to potential issues before they become critical.
For multi-signal conditions, you can set thresholds that apply to all signals being monitored by the condition. Each signal is individually monitored and evaluated, but the settings apply consistently to all signals.
Here are the key rules governing the calculation of the predicted value:
- Age of data: The prediction is calculated using between 1 and 4 weeks of data, depending on data availability and prediction type.
- Consistency of data: Data that remains in a consistent range or trends slowly and steadily will have tighter sensitivity thresholds.
- Regular fluctuations: The prediction algorithm looks for cyclical fluctuations and attempts to adjust to them.
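To make these rules concrete, here is a minimal sketch of a prediction that uses 1 to 4 weeks of data and favors recent weeks. It is not New Relic's algorithm: the `predict` function, the linear weights, and the idea of comparing values taken at the same time of week (to follow weekly cycles) are assumptions chosen for illustration.

```python
from typing import Sequence


def predict(same_time_prior_weeks: Sequence[float]) -> float:
    """Predict the next value from values observed at the same time of week.

    `same_time_prior_weeks` holds 1 to 4 values, oldest first. Sampling the
    same time of week lets the prediction follow a weekly cycle.
    """
    n = len(same_time_prior_weeks)
    if not 1 <= n <= 4:
        raise ValueError("expected between 1 and 4 weeks of data")

    # Assumed weighting: the oldest week gets weight 1, the newest gets n.
    weights = range(1, n + 1)
    weighted_sum = sum(w * v for w, v in zip(weights, same_time_prior_weeks))
    return weighted_sum / sum(weights)
```

With four weeks of `[10, 10, 10, 40]`, the most recent value pulls the prediction up to 22.0, whereas a plain average would give 17.5.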
Configure Sensitivity Thresholds
To fine-tune anomaly sensitivity thresholds, first set the anomaly direction so that the condition monitors incidents above or below the predicted value.
Then adjust the Critical sensitivity threshold with the slider bar. The preview chart shows it as the light gray area around the signal: the tighter the band, the more sensitive the condition and the more incidents it generates.
You can also create a Warning threshold, shown as the darker gray area, to set a secondary alert level for less severe anomalies.
When working with multi-signal conditions, the threshold settings apply to all signals being monitored by the condition. However, you can only show a maximum of 500 signals on the preview chart.
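The multi-signal behavior can be pictured with a short sketch: one shared setting, applied to each signal independently, with a cap on how many signals are previewed. The function names and the fixed `band_width` are hypothetical; only the 500-signal preview limit comes from the text above.

```python
from typing import Dict, Iterable, List, Tuple

PREVIEW_LIMIT = 500  # maximum number of signals shown on the preview chart


def evaluate_signals(signals: Dict[str, Tuple[float, float]],
                     band_width: float) -> Dict[str, bool]:
    """Map each signal name to True if it is outside the shared band.

    `signals` maps a name to (observed, predicted). Every signal is checked
    on its own, but all of them use the same `band_width`.
    """
    return {
        name: abs(observed - predicted) > band_width
        for name, (observed, predicted) in signals.items()
    }


def preview_subset(names: Iterable[str]) -> List[str]:
    """Return at most PREVIEW_LIMIT signal names for charting."""
    return sorted(names)[:PREVIEW_LIMIT]
```

All signals are still evaluated even when only a subset is drawn; the limit applies to the preview, not to monitoring.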
The predicted value is calculated according to the rules listed under Thresholds and Rules: age of data, consistency of data, and regular fluctuations.
Note that the algorithm takes into account ongoing data fluctuations over a long time period, although greater weight is given to more recent data.
Disabling
In CloudZero, you can disable Anomaly Detection for a specific View by editing the View.
While editing, toggle the View Anomalies switch off to disable Anomaly Detection for that View.
Smarter, Easier Alerting
Smarter, easier alerting is now a reality with New Relic anomaly detection. Business services and infrastructure are always changing, and manual threshold setting can be intimidating for many engineers.
You can easily use dynamic thresholds by creating baseline alert conditions that cover all of your services and infrastructure. These adjust to accommodate the expected fluidity and volatility of your business.
No team should be sitting on the sidelines missing the benefits of incident response. New Relic has expanded dynamic baseline alerting so that one alert configuration can apply dynamic thresholding across up to five thousand related time series of a particular service or entity.
This makes it much easier for your teams to add alert coverage to all of your entities. Just add the “FACET” clause to the NRQL queries you are using, then specify the metadata attributes that differentiate the signals that you want to monitor.
You can then adjust a slider in the user interface to set and tune the sensitivity.
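Conceptually, a FACET clause splits one query result into a separate time series per combination of attribute values. The sketch below imitates that split in plain Python. It is an illustration only: `split_by_facet`, the record layout, and the `"value"` key are hypothetical, and the 5,000 cap simply mirrors the limit mentioned above.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

MAX_SERIES = 5000  # limit on related time series per alert configuration


def split_by_facet(points: Sequence[dict],
                   facet_attrs: Sequence[str]) -> Dict[Tuple, List[float]]:
    """Group data points into one series per distinct facet value.

    Each point is a dict containing the facet attributes and a "value" key.
    """
    series: Dict[Tuple, List[float]] = defaultdict(list)
    for point in points:
        key = tuple(point[attr] for attr in facet_attrs)
        series[key].append(point["value"])
    if len(series) > MAX_SERIES:
        raise ValueError("facet produces more than %d series" % MAX_SERIES)
    return dict(series)
```

Faceting by a single attribute such as a host name yields one signal per host; adding a second attribute yields one signal per distinct pair.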
Analyzing and Viewing Anomalies
Analyzing and viewing anomalies is a crucial part of identifying and resolving cost issues. In CloudZero, you can view anomalies under Insights, as well as in the Homepage, Explorer, and Notifications.
The Anomaly Details page provides additional details beyond what's available in Insights, including the anomaly start time and whether the spend is ongoing or not. It also includes a 90-day daily cost graph with the anomaly start time highlighted.
You can access the Anomaly Details page by clicking on an anomaly in the Homepage or Notifications. The page includes a View Name, Principal Dimension, and Element to help distinguish anomalies.
The Explorer view provides the most granular details about an anomaly and can be accessed by selecting "Anomalies & Events". When you select an anomaly, it expands with more details, and the other anomaly and event bars are grayed out.
Changing the time granularity to "Hourly" in the Explorer view provides a precise view of when the anomaly was detected.
Anomaly alerts are also visible in the Notifications tab, and selecting an anomaly will take you to the detail page in Insights.
Here's a quick reference to the different views of anomalies:
- Homepage: Provides an overview of the total number of anomalies in the past 30 days and the total cost of detected anomalies in the last 30 days.
- Insights: Allows you to view anomalies and access the Anomaly Details page.
- Explorer: Provides the most granular details about the anomaly, and can be accessed by selecting "Anomalies & Events".
- Notifications: Displays anomaly alerts and allows you to select an anomaly to view its details.
In New Relic, the Lookout view gives you a high-level view of your ingest patterns, and NRQL queries let you search for ingest anomalies. For example, a query faceted by usageMetric over the last 24 hours reports ingest broken down by usage metric.
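The original query is not reproduced here, so the following is only a sketch of what such a query might look like, built as a string in Python. The `NrConsumption` event type, the `GigabytesIngested` attribute, and the `productLine` filter are assumptions based on New Relic's consumption data, and `ingest_query` is a hypothetical helper; check the names against your own account before relying on them.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def ingest_query(facet: str = "usageMetric", since: str = "24 hours ago") -> str:
    """Build an NRQL string that sums ingested gigabytes per facet value.

    Event and attribute names are assumed, not taken from the article.
    """
    if not _IDENTIFIER.fullmatch(facet):
        raise ValueError("facet must be a plain attribute name")
    return (
        "FROM NrConsumption SELECT sum(GigabytesIngested) "
        "WHERE productLine = 'DataPlatform' "
        f"FACET {facet} SINCE {since} TIMESERIES"
    )


if __name__ == "__main__":
    print(ingest_query())                              # by usage metric
    print(ingest_query(facet="consumingAccountName"))  # by account name
```

Passing `facet="consumingAccountName"` gives the per-account breakdown mentioned earlier in the article.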
API and Data
New Relic's anomaly detection collects data from your application through API calls, which can be triggered by various events such as user interactions or background tasks, and analyzes that data to identify unusual patterns.
The API calls are designed to be lightweight and non-intrusive, so they don't impact your application's performance.
New Relic's data collection is based on a push model, where data is sent to New Relic's servers in real-time.
This allows for near-instant detection of anomalies, which can be critical in preventing downtime and ensuring customer satisfaction.
New Relic's anomaly detection can identify anomalies in a wide range of data types, including metrics, events, and logs.
The data is analyzed using advanced algorithms and machine learning techniques to identify patterns and anomalies.
The algorithms are designed to learn from the data and adapt to changing patterns over time.
This allows New Relic's anomaly detection to become more accurate and effective over time.
New Relic's data is stored securely and is subject to strict access controls to ensure confidentiality and integrity.
This ensures that sensitive data is protected and only accessible to authorized personnel.
New Relic's data is also aggregated and anonymized to prevent individual user data from being identifiable.
Frequently Asked Questions
What is the most popular anomaly detection method?
One widely used anomaly detection method is K-Nearest Neighbors (KNN), a non-parametric approach that identifies anomalies by analyzing their proximity to other data points. Its simplicity makes it a common choice in many applications.
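As a minimal sketch of the KNN idea, the function below scores each point by its mean distance to its k nearest neighbors, so isolated points receive higher scores. The name `knn_scores` and the brute-force distance computation are choices made for this example; it is a generic illustration and not how New Relic or CloudZero detect anomalies.

```python
import math
from typing import List, Sequence


def knn_scores(points: Sequence[Sequence[float]], k: int) -> List[float]:
    """Return one anomaly score per point: mean distance to its k nearest neighbors."""
    if not 1 <= k < len(points):
        raise ValueError("k must be at least 1 and less than the number of points")

    scores = []
    for i, p in enumerate(points):
        # Euclidean distance from p to every other point, nearest first.
        distances = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(sum(distances[:k]) / k)
    return scores


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
    scores = knn_scores(data, k=2)
    # The point far from the cluster has the largest score.
    print(max(range(len(data)), key=lambda i: scores[i]))
```

In practice a cutoff on the score (for example, a percentile) decides which points are reported as anomalies; choosing that cutoff and k is a tuning decision.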