To ensure the smooth operation of your OpenShift environment, implementing effective monitoring and management best practices is crucial. Monitoring your cluster's performance and resources is essential to identify potential issues before they become major problems.
Regularly checking the OpenShift dashboard can help you stay on top of your cluster's health. This includes monitoring metrics such as CPU and memory usage, as well as the number of pods and nodes.
Understanding the OpenShift monitoring tools, such as Prometheus and Grafana, is vital for effective cluster management. By leveraging these tools, you can gain valuable insights into your cluster's performance and make data-driven decisions.
By following these best practices, you can ensure your OpenShift environment runs efficiently and effectively, reducing downtime and improving overall performance.
vRealize Operations Monitoring
To monitor your OpenShift clusters with vRealize Operations, you'll need to download the latest container monitoring management pack from the VMware Solution Exchange website. This pack is called the vRealize Operations Management Pack for Container Monitoring.
You'll need to log in to your vRealize Operations Manager with administrator privileges to install the pack. From the menu, select Administration, and in the left pane, select Solutions > Repository. Click Add/Upgrade, browse to the folder containing the downloaded PAK file, and select it.
The upload might take several minutes, so be patient. Once the upload is complete, read and accept the EULA and click Next. When the installation is finished, click Finish.
To link a Kubernetes cluster to your environment for monitoring, you'll need to install the cAdvisor daemon. For OCP, you can use the cAdvisor YAML definition on HostPort. You'll also need to create credentials so your vRealize Operations connection can authenticate to the cluster.
Here are the steps to configure your Kubernetes Adapter in vRealize Operations:
1. From the main menu of vRealize Operations Manager, click Administration, and then in the left pane, click Solutions.
2. From the Solutions list, select VMware vRealize Operations Management Pack for Container Monitoring.
3. Click the Configure icon to edit an object.
4. Enter the display name of the adapter.
5. Enter the HTTP URL of the Kubernetes master node in the Master URL text box.
6. Select DaemonSet as the cAdvisor Service.
7. Enter the port number of cAdvisor (Default is 31194).
8. Enter the Credential details of the Master URL.
Note that if your OCP cluster is running on a vCenter Server instance that is monitored by vRealize Operations, you can view a link from the Kubernetes node to the vSphere virtual machine. To enable the link, enter the IP address of the vCenter Server instance.
New Relic Integration
The New Relic Kubernetes integration is available through the Red Hat Container Catalog and can be installed on an OpenShift cluster in just a few steps.
To start, you'll need to have an OpenShift cluster running, and then you can install the New Relic Kubernetes integration by following the steps outlined in the documentation.
The New Relic Infrastructure agent is installed as a Kubernetes DaemonSet, which ensures that the integration is automatically running on each node in your OpenShift cluster.
Here are the steps to create the DaemonSet:
- Run the command `oc create -f newrelic-infrastructure-k8s-latest.yaml`
- Check that the DaemonSet is running with the command `oc get daemonsets`
- Verify that kube-state-metrics is running with the command `oc get pods --all-namespaces | grep kube-state-metrics`
Once you've installed and verified the integration, you can start monitoring your OpenShift cluster with over 775 integrations available for free.
By linking your New Relic APM data with your OpenShift data, you can correlate the performance of your applications with your infrastructure and gain deep insights into your application performance.
Examining Kubernetes
Examining Kubernetes is a crucial step in OpenShift monitoring. To do this, you'll first need to ensure that kube-state-metrics is running.
You can then navigate to New Relic Infrastructure and click on Kubernetes in the menu bar. This opens the New Relic Kubernetes cluster explorer, which displays all the pods, namespaces, deployments, and nodes in your OpenShift environment.
The cluster explorer allows you to filter by namespace, deployment, node, and cluster, making it easy to locate and drill down to the pods you care about most.
To track resource requests and resource limits, you can use data gathered in New Relic to accurately plan your capacity and understand which teams are consuming the most resources in your cluster.
Some of the key metrics you can track include CPU and memory usage. For example, you can view the top 5 nodes by CPU details, including CPU limit, CPU request, and allocatable CPU processor count.
You can also view node details, including name, hostname, internal IP, OS, and type, as well as pod details such as name, pod usage, used pod count, allocatable pod count, and pod utilization.
The New Relic Kubernetes cluster explorer is a powerful tool for examining your Kubernetes cluster. With its ability to filter and track key metrics, you can gain a deeper understanding of your cluster's performance and make data-driven decisions to optimize its usage.
Pods
Examining Kubernetes is a complex task, but understanding the basics of pods is a great place to start.
A pod is the basic execution unit in a Kubernetes environment. You can think of it as a container, but with more capabilities.
Pods are identified by a unique name and are associated with the project (namespace) in which they were created. Each pod resides on a specific node in the cluster, and that node's name is easily accessible.
Pods carry several attributes, including pod name, project name, node name, and the application the pod belongs to. They also expose a pod type, pod IP, pod status, pod start time, and pod creation time, all of which provide valuable information about the pod's status and performance.
Understanding these attributes is crucial for effective pod management and troubleshooting.
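The attributes above map onto fields of the Pod object itself. Here's a minimal sketch of where each one lives, with illustrative names and values, roughly as you'd see them from `oc get pod -o yaml`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                          # pod name
  namespace: example-project                 # project name
  labels:
    app: example-app                         # pod application
  creationTimestamp: "2024-01-01T00:00:00Z"  # pod created time
spec:
  nodeName: worker-1                         # node the pod resides on
status:
  phase: Running                             # pod status
  podIP: 10.128.0.12                         # pod IP
  startTime: "2024-01-01T00:00:05Z"          # pod start time
```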
Containers
Moving down a level, let's take a closer look at containers.
Containers are a fundamental component of Kubernetes, and understanding how they work is crucial for effective management.
You can view the top 5 containers by restart count, which can help identify potential issues.
This information is particularly useful when troubleshooting container-related problems.
Each container exposes details that provide valuable insights into its behavior and performance.
Container restart count is an important metric to monitor, as it can indicate issues with the container or its dependencies.
Replica
In Kubernetes, a Replica is a crucial component that ensures your application is always available. It specifies the number of pods that need to be running at any given time.
A Replica is created with a specific name within a project, and the project name is displayed alongside it.
The Desired Replica is a key parameter that determines the number of pods that need to be running. This value is crucial in ensuring that your application is always available to your users.
The Running Replica displays the actual number of pods that are currently running. This value may not always match the Desired Replica, especially during scaling or deployment.
Available Replica is another important parameter that specifies the number of application replicas that are available to your users. This value is essential in ensuring that your users have access to the application.
You can perform various actions on a Replica, such as viewing its details; the exact actions available depend on the specific Replica you're working with.
To summarize, the key parameters of a Replica are:
- Desired Replica: the number of pods that should be running
- Running Replica: the number of pods currently running
- Available Replica: the number of replicas available to your users
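These parameters correspond to fields on the workload object and its status. A sketch with illustrative names, roughly as reported by `oc get deployment -o yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
  namespace: example-project
spec:
  replicas: 3              # Desired Replica: how many pods should run
status:
  replicas: 3              # Running Replica: pods currently created
  availableReplicas: 3     # Available Replica: pods ready to serve users
```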
Node Transitions
Node Transitions are a crucial aspect of Kubernetes, and understanding how to move monitoring components to different nodes is essential for optimizing cluster performance.
To move monitoring components, the first step is to edit the cluster-monitoring-config ConfigMap.
The next step is to specify the nodeSelector constraint for the component under data/config.yaml. This involves adding the nodeSelector key-value pairs that specify the destination node.
You can only run a component on a node that has each of the specified key-value pairs as labels. The node can have additional labels as well.
For example, to move components to the node that is labeled foo: bar, you add that label under the component's nodeSelector.
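A sketch of such a configuration, assuming you want to move the main Prometheus instance (prometheusK8s is one of several component keys the ConfigMap accepts):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Move the main Prometheus instance to nodes labeled foo: bar
    prometheusK8s:
      nodeSelector:
        foo: bar
```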
Once you've saved the file, the components affected by the new configuration are moved to new nodes automatically.
Understanding OpenShift
Understanding OpenShift is crucial for effective monitoring.
OpenShift is a container application platform that allows developers to deploy, manage, and scale applications in a cloud-native environment.
It's built on top of Kubernetes, which provides a robust and scalable way to manage containers.
OpenShift offers a range of features that make it an attractive choice for organizations, including automated deployment, rolling updates, and self-healing.
These features help ensure high availability and reliability for applications running on OpenShift.
Overview
OpenShift is a powerful and flexible open source container application platform, orchestrated and managed by Kubernetes. OpenShift environments are complex to set up, monitor, and maintain.
To stay ahead of the operational challenges of running OpenShift containers, round-the-clock monitoring is necessary: applications can be complex and difficult to manage, which makes monitoring a crucial part of their operation.
The monitoring stack in OpenShift is based on the Prometheus open source project and its wider ecosystem. This includes default platform monitoring components, which provide monitoring for core OpenShift Container Platform components including Kubernetes services.
The default monitoring stack also enables remote health monitoring for clusters. After optionally enabling monitoring for user-defined projects, additional monitoring components are installed in the openshift-user-workload-monitoring project, providing monitoring for user-defined projects.
OpenShift monitoring capabilities support OpenShift v4.17 and below.
Understanding the Stack
The OpenShift Container Platform monitoring stack is built on the Prometheus open source project and its ecosystem.
The monitoring stack includes default platform monitoring components, which are installed in the openshift-monitoring project by default during an OpenShift Container Platform installation. These components provide monitoring for core OpenShift Container Platform components, including Kubernetes services, and enable remote health monitoring for clusters.
The stack also includes components for monitoring user-defined projects, which are installed in the openshift-user-workload-monitoring project once that monitoring is enabled.
The monitoring stack includes components such as the Cluster Monitoring Operator, Prometheus Operator, Prometheus, Alertmanager, kube-state-metrics, node-exporter, and Thanos Querier.
In addition to the components of the stack itself, the monitoring stack monitors various other OpenShift Container Platform components, including CoreDNS, Elasticsearch, and HAProxy.
User-Defined Projects
OpenShift Container Platform 4.10 includes an optional enhancement to the monitoring stack that enables you to monitor services and pods in user-defined projects.
This feature includes the following components: Prometheus Operator, Prometheus, and Thanos Ruler. These components are deployed after monitoring is enabled for user-defined projects.
You can monitor metrics provided through service endpoints in user-defined projects and pods running in user-defined projects.
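Enabling this is done by setting enableUserWorkload in the cluster-monitoring-config ConfigMap; a minimal sketch:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Deploys Prometheus Operator, Prometheus, and Thanos Ruler
    # in the openshift-user-workload-monitoring project
    enableUserWorkload: true
```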
Assigning Tolerations
You can assign tolerations to any of the monitoring stack components to enable moving them to tainted nodes. This is done by editing the cluster-monitoring-config ConfigMap.
To start, you'll need to edit the cluster-monitoring-config ConfigMap using the command `oc -n openshift-monitoring edit configmap cluster-monitoring-config`.
The tolerations for a component are specified under data/config.yaml in the ConfigMap. For example, to make the alertmanagerMain component ignore a taint that prevents the scheduler from placing pods on a node tainted with key1=value1:NoSchedule, you would use the following toleration specification:

- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"

You can substitute the component name and toleration specification as needed.
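In context, the full ConfigMap would look something like this, using the alertmanagerMain component with the key1=value1:NoSchedule taint as the example:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Allow Alertmanager pods onto nodes tainted key1=value1:NoSchedule
    alertmanagerMain:
      tolerations:
      - key: "key1"
        operator: "Equal"
        value: "value1"
        effect: "NoSchedule"
```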
Once you've specified the tolerations, save the file to apply the changes. The new component placement configuration is applied automatically.
Frequently Asked Questions
What is OC monitoring?
OpenShift Container Platform (OC) monitoring is a preconfigured monitoring stack that tracks core platform components and user-defined projects. It's a self-updating tool that helps you stay on top of your cluster's performance and health.
What is OpenShift telemetry?
OpenShift telemetry collects anonymous data on cluster size, component health, and usage to help improve the platform. This data is used to enhance the overall performance and reliability of OpenShift Container Platform.
Sources
- https://www.manageengine.com/products/applications_manager/help/openshift-monitoring-tools.html
- https://blogs.vmware.com/management/2020/08/vrops-cloud-monitoring-openshift-container-platform.html
- https://newrelic.com/blog/how-to-relic/red-hat-openshift-monitor
- https://docs.openshift.com/container-platform/4.10/monitoring/monitoring-overview.html
- https://docs.redhat.com/en/documentation/openshift_container_platform/4.2/html/monitoring/cluster-monitoring