When running applications in production, a fast feedback loop is a key factor. Gathering and combining all sorts of metrics is an essential part of that feedback loop.
Application metrics provide insight into what is happening inside our Quarkus applications, which expose them using the MicroProfile Metrics specification.
Those metrics (e.g. the request count on a specific URL) are collected within the application and can then be processed with tools like Prometheus for further analysis and visualization.
Prometheus is a monitoring system and time series database which integrates well with all sorts of applications and platforms.
The basic principle behind Prometheus is to collect metrics using a polling mechanism. There are a lot of different so-called exporters from which metrics can be collected.
In our case, the metrics will be collected from a specific path provided by the application (/metrics).
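If you want to see what such an endpoint returns, you can query it directly. The following is only a sketch: it assumes the data-producer Deployment from the previous labs is running in your namespace and serves HTTP on port 8080.
# In one terminal: forward a local port to a data-producer pod (deployment name and port are assumptions)
oc port-forward deployment/data-producer 8080:8080
# In a second terminal: fetch the metrics in Prometheus text format
curl -s http://localhost:8080/metrics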
On our lab cluster, a Prometheus / Grafana stack is already deployed (in the namespace pitc-infra-monitoring). Using the service discovery capability of the Prometheus-Kubernetes integration, the running Prometheus server will be able to locate our application almost out of the box.
In an early stage of the Prometheus-Kubernetes integration, the configuration was done with annotations: Prometheus read specifically configured annotations from Kubernetes resources. The information from those annotations helped the Prometheus server find the endpoints to collect metrics from.
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/scheme: http
    prometheus.io/port: "8080"
The current OpenShift-Prometheus integration works differently and is far more flexible. It is based on the ServiceMonitor CustomResource.
oc explain ServiceMonitor
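If the explain output is missing or you are unsure whether the cluster knows this resource type at all, you can check for the CustomResourceDefinition shipped by the Prometheus Operator (the CRD name below is the upstream default):
# The CRD backing the ServiceMonitor resource
oc get crd servicemonitors.monitoring.coreos.com
# Alternatively, look it up in the list of available API resources
oc api-resources | grep -i servicemonitor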
We first check that the project is ready for the lab.
Ensure that the LAB_USER environment variable is set.
echo $LAB_USER
If the result is empty, set the LAB_USER environment variable.
export LAB_USER=<username>
Change to your main Project.
oc project $LAB_USER
Don’t forget to deploy/update your resources with git instead of the oc command for this lab.
Let’s now create our first ServiceMonitor.
Create the following ServiceMonitor resource as local file <workspace>/servicemonitor.yaml.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: amm-techlab
  name: amm-techlab-monitor
spec:
  endpoints:
  - interval: 30s
    port: http
    scheme: http
    path: /metrics
  selector:
    matchLabels:
      application: amm-techlab
Let ArgoCD create the ServiceMonitor by adding the file to git and pushing it.
git add servicemonitor.yaml && git commit -m "Add ServiceMonitor Manifest" && git push
In a hurry and don’t want to wait for ArgoCD to sync? Apply the file manually.
oc apply -f servicemonitor.yaml
Expected result: servicemonitor.monitoring.coreos.com/amm-techlab-monitor created
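Independent of whether ArgoCD or oc apply created the resource, you can verify that the ServiceMonitor now exists in your project:
# The name matches the metadata.name from the manifest above
oc get servicemonitor amm-techlab-monitor -n $LAB_USER -o yaml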
The monitoring-edit role allows managing ServiceMonitor resources in a namespace. It can be granted like this:
oc policy add-role-to-user monitoring-edit <user> -n <username>
Tell your trainer if you get a permission error while creating the ServiceMonitor.
Prometheus is integrated into the OpenShift Console under the menu item Monitoring. But as part of this lab, we want to use Grafana to interact with Prometheus. Open Grafana (https://grafana.techlab.openshift.ch/) and switch to the Explore tab, then execute the following query to check whether your target is configured or not:
Replace <username> with your current namespace.
prometheus_sd_discovered_targets{config="serviceMonitor/<username>/amm-techlab-monitor/0"}
Expected result at the bottom of the graph: two targets (consumer and producer) similar to:
prometheus_sd_discovered_targets{cluster="console.techlab.openshift.ch", config="serviceMonitor/<username>/amm-techlab-monitor/0", container="kube-rbac-proxy", endpoint="metrics", instance="10.128.2.18:9091", job="prometheus-user-workload", name="scrape", namespace="openshift-user-workload-monitoring", pod="prometheus-user-workload-1", prometheus="openshift-monitoring/k8s", service="prometheus-user-workload"}
prometheus_sd_discovered_targets{cluster="console.techlab.openshift.ch", config="serviceMonitor/<username>/amm-techlab-monitor/0", container="kube-rbac-proxy", endpoint="metrics", instance="10.131.0.33:9091", job="prometheus-user-workload", name="scrape", namespace="openshift-user-workload-monitoring", pod="prometheus-user-workload-0", prometheus="openshift-monitoring/k8s", service="prometheus-user-workload"}
The Prometheus Operator “scans” namespaces for ServiceMonitor CustomResources. It then updates the ServiceDiscovery configuration accordingly.
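Note that on OpenShift, ServiceMonitors in user namespaces are only picked up if monitoring for user-defined projects is enabled; the prometheus-user-workload pods in the query result above show that this is the case on the lab cluster. If you are curious, the following sketch shows where to look (it requires read access to the monitoring namespaces, which regular lab users may not have):
# Cluster monitoring configuration: should contain enableUserWorkload: true
oc -n openshift-monitoring get configmap cluster-monitoring-config -o yaml
# The prometheus-user-workload pods seen in the query result above live here
oc -n openshift-user-workload-monitoring get pods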
In our case, the selector part of the ServiceMonitor defines which services will be auto-discovered.
# servicemonitor.yaml
...
  selector:
    matchLabels:
      application: amm-techlab
...
And the corresponding Service
apiVersion: v1
kind: Service
metadata:
  name: data-producer
  labels:
    application: amm-techlab
...
This means Prometheus scrapes all Endpoints where the application: amm-techlab label is set.
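To see which Services in your namespace carry that label, and will therefore be picked up, you can filter by label:
# Services selected by the ServiceMonitor
oc get svc -l application=amm-techlab -n $LAB_USER
# The Endpoints (pod IPs and ports) behind those services
oc get endpoints -l application=amm-techlab -n $LAB_USER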
The spec section in the ServiceMonitor resource now allows us to further configure the targets Prometheus will scrape. In our case Prometheus will scrape the port named http (this must match the port name in the Service resource), on the path /metrics, using the http scheme.
This now means: since all three Services data-producer, data-consumer and data-transformer have the matching label application: amm-techlab, a port with the name http is configured, and the matching pods provide metrics on http://[Pod]/metrics, Prometheus will scrape data from these pods.
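If one of the services is not scraped, the usual suspect is the port name: it must be called http in the Service for the ServiceMonitor above to match it. A quick way to inspect the port names of all selected services (a sketch using custom columns):
# One of the listed port names per service must be "http"
oc get svc -l application=amm-techlab -n $LAB_USER \
  -o custom-columns=NAME:.metadata.name,PORT_NAMES:.spec.ports[*].name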
Since the metrics are now collected from all three services, let’s execute a query and visualize the data: for example, the total number of produced, consumed and transformed messages.
Enter the produced messages query, replacing <username> with your current namespace.
sum(application_ch_puzzle_quarkustechlab_reactiveproducer_boundary_ReactiveDataProducer_producedMessages_total{namespace="<username>"})
Then click Add Query and enter the transformed messages query.
sum(application_ch_puzzle_quarkustechlab_reactivetransformer_boundary_ReactiveDataTransformer_messagesTransformed_total{namespace="<username>"})
Add another query with Add Query and enter the consumed messages query.
sum(application_ch_puzzle_quarkustechlab_reactiveconsumer_boundary_ReactiveDataConsumer_consumedMessages_total{namespace="<username>"})
Finally click Run Query to execute the queries.
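If you prefer the command line over Grafana, the same PromQL can also be sent to the cluster’s Thanos querier route. This is only a sketch: it assumes your token is allowed to read metrics of your namespace and skips TLS verification with -k.
# Route of the Thanos querier that also serves user workload metrics
HOST=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
# Run the "produced messages" query with your OpenShift token
curl -skG "https://$HOST/api/v1/query" \
  -H "Authorization: Bearer $(oc whoami -t)" \
  --data-urlencode "query=sum(application_ch_puzzle_quarkustechlab_reactiveproducer_boundary_ReactiveDataProducer_producedMessages_total{namespace=\"$LAB_USER\"})"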
The needed resource files are available inside the folder manifests/05.0/5.1/ of the techlab GitHub repository.
If you weren’t successful, you can update your project with the solution by cloning the techlab repository: git clone https://github.com/puzzle/amm-techlab.git. You need to add the new file to your git repository; otherwise, ArgoCD will delete the resources again.
cd ~/amm-workspace
cp <path-to-the-amm-techlab-repo>/manifests/05.0/5.1/* .
git add servicemonitor.yaml && git commit -m "Add ServiceMonitor Manifest" && git push