
New Relic service levels are designed to help you monitor and manage your application's performance in real time, so you can identify and resolve issues before they impact your users.
By setting service levels, you can define specific thresholds for your application's performance, such as response time or error rate. This allows you to proactively address potential issues before they become major problems.
For example, if your application's response time exceeds 2 seconds, you can set a service level to trigger an alert and notify your team. This helps ensure that issues are addressed promptly, minimizing downtime and improving overall user experience.
New Relic's service levels also enable you to prioritize issues based on their severity and impact on your users. This helps you focus on the most critical issues first, ensuring that your application remains reliable and performant.
Service Level Indicators
Service Level Indicators are the key to measuring the performance and reliability of a service. They provide insights into how well a service is doing, and help teams identify areas for improvement.
A Service Level Indicator (SLI) is a measurable value that indicates the performance of a service. For example, latency, error rate, and throughput are all common SLIs used to measure the performance of a service.
SLIs can be categorized into several types, including latency, error rate, throughput, availability, and response time. Each type of SLI provides a different perspective on the performance of a service.
SLIs are used to measure the performance of a service and identify areas for improvement. By tracking SLIs, teams can ensure that their services are meeting the required standards and make data-driven decisions to improve performance.
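As a minimal sketch of what "a measurable value" means in practice, an SLI is often expressed as the percentage of valid events that were good. The function name and the event counts below are hypothetical and only illustrate the arithmetic; they are not part of any New Relic API.

```python
def sli_percent(good_events: int, valid_events: int) -> float:
    """Return the SLI as the percentage of valid events that were good."""
    if valid_events <= 0:
        raise ValueError("valid_events must be positive")
    if not 0 <= good_events <= valid_events:
        raise ValueError("good_events must be between 0 and valid_events")
    return 100.0 * good_events / valid_events


# Hypothetical counts for one evaluation window.
availability = sli_percent(good_events=9_950, valid_events=10_000)
print(f"Availability SLI: {availability:.2f}%")
```

Comparing this percentage with an SLO target (for example 99.5%) tells you whether the service met its objective for that window.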
Setting Up SLIs
Setting up SLIs can be a daunting task, but New Relic One's service level management functionality can help identify SLIs to start measuring and establish a baseline for SLOs.
New Relic One scans historical data from a service to determine the best initial setup for SLIs. This can save you time and effort compared to starting from scratch.
To automatically set up SLIs in New Relic One, follow these steps:
- Log in to New Relic One and select APM from the navigation menu at the top.
- Select the service entity where you'd like to establish SLIs.
- On the left-hand menu, scroll down to the Service Levels option and select it.
New Relic One offers a one-click setup to establish a baseline for SLIs and SLOs. Simply select the "Add baseline service level objectives" button and let New Relic One work its magic!
Some common SLI metrics include latency, error rate, throughput, availability, and response time. These metrics can be used to determine the availability of a system and set up SLOs.
By setting up SLIs and SLOs, you can establish a baseline for performance and reliability, making it easier to identify and resolve issues in the future.
SLI Configuration
New Relic One's service level management functionality can help identify SLIs to start measuring and allow you to establish a baseline for SLOs.
You can automatically set up SLIs and SLOs with New Relic One by following a few simple steps: log in to New Relic One, select APM, choose the service entity, and then select the Service Levels option.
New Relic One's tool identifies common SLIs for a given service, such as availability and latency, and scans historical data to determine the best initial setup. You can also manually create SLIs with NRQL queries.
Here are the steps to track all programmatic SLIs in New Relic One:
- Gather metrics and events through instrumentation, and run New Relic Synthetic API tests to ensure APIs behave as expected.
- Create alert conditions that will create a violation if your SLIs exceed their objectives.
- Create NRQL queries and New Relic One dashboards that reveal when your services miss their indicators.
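The NRQL queries mentioned in the last step can be sketched as a pair: one that counts all valid requests and one that counts the good ones. The helper below only builds query strings; the function name, the app name, and the 0.5 second threshold are assumptions for illustration, and the naive string interpolation does not escape quotes in the app name.

```python
def build_sli_queries(app_name: str, max_duration_s: float,
                      since: str = "7 days ago") -> dict[str, str]:
    """Build NRQL strings that count valid and good Transaction events."""
    base = f"SELECT count(*) FROM Transaction WHERE appName = '{app_name}'"
    return {
        # Every transaction of the service counts as a valid event.
        "valid": f"{base} SINCE {since}",
        # Good events are the valid ones served under the duration threshold.
        "good": f"{base} AND duration < {max_duration_s} SINCE {since}",
    }


# Hypothetical service name and threshold.
queries = build_sli_queries("checkout-service", 0.5)
for name, nrql in queries.items():
    print(f"{name}: {nrql}")
```

Dividing the result of the good query by the result of the valid query gives the latency SLI for the chosen time range.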
New Relic One supports the use of wildcards in SLI queries, making it easier to configure SLIs.
SLI and SLO Configuration
New Relic One's service level management functionality can help you identify SLIs to start measuring and establish a baseline for SLOs. It identifies common SLIs for a given service, often availability and latency, and scans historical data to determine the best initial setup.
To automatically set up SLIs and SLOs, log in to New Relic One, select APM from the navigation menu, and then select the service entity where you'd like to establish SLIs. From there, scroll down to the Service Levels option and select it.
You can also manually create SLIs with NRQL queries. If you're looking for a one-click setup, follow these steps: log in to New Relic One, select APM, select the service entity, scroll down to the Service Levels option, and select the "Add baseline service level objectives" button.
New Relic One allows you to track all programmatic SLIs, which requires identifying existing instrumentation and deploying instrumentation in your systems and services where it doesn't already exist. This may involve custom instrumentation to track business logic not already captured by New Relic's automatic instrumentation.
To track programmatic SLIs, gather metrics and events through instrumentation, run New Relic Synthetic API tests, create alert conditions, and create NRQL queries and dashboards that reveal when your services miss their indicators.
New Relic One provides various ways to establish SLIs, including output performance SLIs, input performance SLIs, and capability SLIs. You can learn more about each of these in the provided documentation.
With the latest update, the serviceLevelCreate and serviceLevelUpdate mutations, the newrelic_service_level Terraform resource, and the UI setup flow all require exactly one SLO target and period for each SLI. Otherwise, they return an error response.
Here are the changes to the SLI and SLO configuration:
- API: The serviceLevelCreate and serviceLevelUpdate mutations require one (and only one!) SLO target and period for each SLI.
- Terraform: The newrelic_service_level Terraform resource requires one (and only one!) SLO target and period for each SLI.
- UI: The setup flow requires one (and only one!) SLO target and period for each SLI.
This simplifies the setup and ensures compliance is shown for 1, 7, and 28 days, regardless of the period set for your SLO.
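The "one and only one" rule can be checked locally before you submit a configuration. The sketch below uses a hypothetical dictionary shape (name, objectives, target, period_days) that is not the actual payload of the mutations or the Terraform resource; it only illustrates the constraint.

```python
def validate_sli(sli: dict) -> None:
    """Raise ValueError unless the SLI has exactly one complete objective."""
    objectives = sli.get("objectives", [])
    if len(objectives) != 1:
        raise ValueError(
            f"SLI {sli.get('name')!r} needs exactly one objective, "
            f"got {len(objectives)}"
        )
    objective = objectives[0]
    for field in ("target", "period_days"):
        if field not in objective:
            raise ValueError(f"objective is missing {field!r}")


# Hypothetical SLI definition with a single 28-day objective.
latency_sli = {
    "name": "latency",
    "objectives": [{"target": 99.5, "period_days": 28}],
}
validate_sli(latency_sli)  # passes silently
print("latency SLI is valid")
```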
Metric Timeslice Support
Metric timeslice support broadens what you can measure with an SLI. With this feature, you can now use timeslice data in your SLI queries.
APM metrics are reported as timeslice data, making it easier to get insights into your service's performance. This means you can analyze your data in smaller chunks, giving you more detailed information about your service's behavior.
Timeslice data allows you to see how your service performs over time, making it easier to identify trends and patterns. This can be particularly useful for services that experience high variability in traffic or usage.
By using timeslice data in your SLI queries, you can gain a deeper understanding of your service's performance and make data-driven decisions to improve it.
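To make the idea of pre-aggregated time buckets concrete, here is a simplified sketch that treats each timeslice as a call count plus a total duration. The Timeslice class and its field names are assumptions for illustration and do not mirror New Relic's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Timeslice:
    """One pre-aggregated bucket: how many calls, and their summed duration."""
    call_count: int
    total_seconds: float


def average_response_time(slices: list[Timeslice]) -> float:
    """Average duration across buckets, weighted by each bucket's call count."""
    calls = sum(s.call_count for s in slices)
    if calls == 0:
        return 0.0
    return sum(s.total_seconds for s in slices) / calls


# Two hypothetical buckets: a quiet interval and a busy one.
slices = [Timeslice(call_count=100, total_seconds=20.0),
          Timeslice(call_count=300, total_seconds=90.0)]
avg = average_response_time(slices)
print(f"Average response time: {avg:.3f}s")
```

Weighting by call count matters here: averaging the per-bucket averages would give the quiet interval the same influence as the busy one.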
Support for Wildcards
New Relic service levels now support the use of wildcards in SLI queries.
This means you can use asterisks (*) to represent multiple characters in your queries, making it easier to filter and analyze your data.
With wildcards, you can create more flexible and dynamic SLI queries that adapt to changing conditions.
Wildcard support gives you more options for customizing your SLI configuration.
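The way an asterisk stands in for multiple characters can be illustrated with Python's standard fnmatch module. This is only an analogy for the matching behavior; it is not New Relic's query engine, and the metric names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical metric names reported by a service.
metric_names = [
    "WebTransaction/Controller/users",
    "WebTransaction/Controller/orders",
    "OtherTransaction/Job/cleanup",
]

# The asterisk matches any run of characters after the fixed prefix.
pattern = "WebTransaction/*"
matched = [name for name in metric_names if fnmatchcase(name, pattern)]
print(matched)
```

One pattern covers every web transaction, so the filter does not need to change when a new controller is added.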
Performance Benchmarks
Creating performance benchmarks is a crucial step in establishing service levels with New Relic. It helps you determine the availability and reliability of your application.
To set up performance benchmarks, you can use Service Level Indicators (SLIs) that measure the key performance metrics of your application. These SLIs can be based on Transaction events, which are the most common type for request-driven services.
Service success is a recommended SLI that measures the ratio of successful responses to all requests. This is essentially the complement of the error rate, and you can refine it by excluding expected errors.
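A minimal sketch of a success SLI follows, under one interpretation of "removing expected errors": a request counts as good if it has no error or if its error is in a list of expected ones. The record shape and the error names are hypothetical.

```python
def success_sli(transactions: list[dict], expected_errors: set[str]) -> float:
    """Percentage of requests that succeeded or failed only with an expected error."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    good = 0
    for txn in transactions:
        error = txn.get("error")
        # No error, or an error we have agreed not to count against the service.
        if error is None or error in expected_errors:
            good += 1
    return 100.0 * good / len(transactions)


# Hypothetical sample: one unexpected timeout, one expected "not found".
sample = [
    {"error": None},
    {"error": None},
    {"error": "TimeoutError"},
    {"error": "NotFound"},
    {"error": None},
]
success = success_sli(sample, expected_errors={"NotFound"})
print(f"Success SLI: {success:.1f}%")
```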
A latency SLI measures the proportion of valid requests that were served faster than a certain threshold. To determine this threshold, check how your service has been performing in the past weeks and use that result as a realistic and achievable baseline.
To select an appropriate value for the duration condition, consider the 95th percentile duration of the responses for the last 7 or 15 days. You can find this value using the query builder.
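The percentile-based choice of a threshold can be sketched offline as well. The example below uses the nearest-rank method on a hypothetical list of response durations in seconds; the query builder may compute percentiles differently, so treat the result as an approximation.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value covering pct% of the samples."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical durations (seconds) standing in for the last 7 days of traffic.
durations = [0.12, 0.20, 0.25, 0.31, 0.18, 0.22, 0.95, 0.27, 0.15, 0.30,
             0.24, 0.21, 0.19, 0.28, 0.33, 0.26, 0.17, 0.23, 0.29, 1.40]

threshold = percentile(durations, 95)
# Share of requests served at or under the chosen threshold.
within = 100.0 * sum(d <= threshold for d in durations) / len(durations)
print(f"Threshold: {threshold}s, requests within threshold: {within:.1f}%")
```

Using a threshold derived from past behavior keeps the initial latency objective realistic, since the service already meets it for most requests.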
The success and latency SLIs described above are common choices for APM services instrumented with the New Relic agent.
By creating these performance benchmarks, you can establish a baseline for your application's performance and reliability. This will help you determine service boundaries and standardize reliability across teams.
Sources
- https://newrelic.com/blog/how-to-relic/programmatic-service-level-indicator
- https://docs.opslevel.com/docs/new-relic-integration
- https://docs.newrelic.com/docs/tutorial-service-level-mgmt/intro-to-slm/
- https://docs.newrelic.com/docs/tutorial-improve-app-performance/create-benchmarks/
- https://docs.newrelic.com/docs/release-notes/service-levels-release-notes/