Observability

Status Fields

Fleet reports most information via status fields on its custom resources. These fields are also used by the Rancher UI to display information about the state of the resources.

See the status fields reference for more information on status fields and conditions.

K8S Events

Fleet generates Kubernetes events that users can subscribe to. The following events are emitted:

  • Created - a new git cloning job was created
  • GotNewCommit - a git repository has a new commit
  • JobDeleted - a successful git cloning job is removed
  • FailedValidatingSecret - a git cloning job cannot be created, because a required secret is missing
  • FailedToApplyRestrictions - the GitRepo resource violates the GitRepoRestriction resource’s rules
  • FailedToCheckCommit - cannot get latest commit from the git server
  • FailedToGetGitJob - cannot retrieve information from the git cloning job
  • Failed - polling is disabled and the sync was triggered via webhook, but the latest commit cannot be retrieved from the git server
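
One way to follow these events from the command line is sketched below. It assumes that the events are recorded against GitRepo objects and can therefore be selected with the involvedObject.kind field selector. The function names watch_gitrepo_events and only_failures are hypothetical helpers, not part of Fleet.

```shell
# Sketch: stream events for GitRepo resources in all namespaces.
# Assumption: Fleet attaches the events listed above to GitRepo objects.
watch_gitrepo_events() {
  kubectl get events --all-namespaces --watch --no-headers \
    --field-selector involvedObject.kind=GitRepo \
    -o custom-columns=REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message
}

# Keep only failure events, i.e. lines whose first column (the reason)
# starts with "Failed".
only_failures() {
  awk '$1 ~ /^Failed/'
}
```

Running `watch_gitrepo_events | only_failures` would then print only the Failed* events as they arrive.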

Metrics

Fleet publishes Prometheus metrics. They can be retrieved from these services:

  • monitoring-fleet-controller.cattle-fleet-system.svc.cluster.local:8080/metrics
  • monitoring-gitjob.cattle-fleet-system.svc.cluster.local:8081/metrics

The collection of exported metrics includes all the information from controller-runtime, like the number of reconciled resources, the number of errors, and the time it took to reconcile.
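
As a rough illustration of reading these metrics by hand, the sketch below sums the controller-runtime counter controller_runtime_reconcile_total per controller from the Prometheus text output. The helper name reconcile_totals is hypothetical, and the sketch assumes each sample line ends with its value (no trailing timestamp).

```shell
# Reads Prometheus text-format metrics on stdin and prints the total number
# of reconciliations per controller, summed over all "result" label values.
reconcile_totals() {
  awk '
    index($0, "controller_runtime_reconcile_total{") == 1 {
      name = $0
      sub(/^.*controller="/, "", name)   # drop everything up to the label value
      sub(/".*$/, "", name)              # drop everything after the label value
      total[name] += $NF                 # last field is the sample value
    }
    END { for (c in total) printf "%s %d\n", c, total[c] }
  ' | sort
}
```

With `kubectl -n cattle-fleet-system port-forward svc/monitoring-fleet-controller 8080:8080` running in another terminal, `curl -s http://localhost:8080/metrics | reconcile_totals` would print one line per controller.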

When Fleet is used by Rancher and the rancher-monitoring chart is installed, Prometheus is automatically configured to scrape the Fleet metrics.

NOTE Depending on how many resources are handled by Fleet, metrics may cause performance issues. If you have a lot of resources, you may want to disable metrics. You can do this by setting metrics.enabled in the values.yaml file to false when installing Fleet.
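
For a standalone Helm installation, the same value can also be passed on the command line instead of editing values.yaml. The sketch below is only an illustration: the release name fleet and the chart reference fleet/fleet are assumptions that depend on how the chart repository was added, and the HELM variable is a hypothetical hook for substituting the binary (for example with echo to preview the command).

```shell
# Sketch: install or upgrade Fleet with metrics disabled.
# Assumptions: release name "fleet", chart reference "fleet/fleet".
disable_fleet_metrics() {
  "${HELM:-helm}" upgrade --install fleet fleet/fleet \
    --namespace cattle-fleet-system \
    --set metrics.enabled=false
}
```

Note that a plain `helm upgrade` resets values that are not passed again, so any other custom values would need to be supplied alongside this one.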

Grafana

When using Grafana and Prometheus, e.g. from https://github.com/prometheus-community/helm-charts, some setup is needed to access Fleet metrics.

  1. Create a ServiceMonitor resource to scrape Fleet metrics. Here is an example:

     ---
     apiVersion: monitoring.coreos.com/v1
     kind: ServiceMonitor
     metadata:
       # Create this in the same namespace as your application
       namespace: cattle-fleet-system
       name: fleet-controller-monitor
       labels:
         # This label makes the ServiceMonitor discoverable by the Prometheus Operator
         release: monitoring # <-- ADD THIS LABEL!
     spec:
       selector:
         matchLabels:
           # This label must exist on the service you want to scrape
           app: fleet-controller # Assumed label, verify this
       namespaceSelector:
         matchNames:
           # We are only looking for the service in its own namespace
           - cattle-fleet-system
       endpoints:
         - port: metrics
           path: /metrics
           interval: 30s
     ---
     apiVersion: monitoring.coreos.com/v1
     kind: ServiceMonitor
     metadata:
       # Create this in the same namespace as your application
       namespace: cattle-fleet-system
       name: fleet-gitjob-monitor
       labels:
         # This label makes the ServiceMonitor discoverable by the Prometheus Operator
         release: monitoring # <-- ADD THIS LABEL!
     spec:
       selector:
         matchLabels:
           # This label must exist on the service you want to scrape
           app: gitjob
       namespaceSelector:
         matchNames:
           # We are only looking for the service in its own namespace
           - cattle-fleet-system
       endpoints:
         - port: metrics
           path: /metrics
           interval: 30s

     Save the manifest as servicemonitor.yaml and create it in Fleet’s namespace, e.g. cattle-fleet-system:

     kubectl create -f servicemonitor.yaml -n cattle-fleet-system

  2. Build the Grafana dashboards and import them into Grafana. You can find the dashboards in the fleet-dashboard repository. Follow the README to build them.
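
Because a ServiceMonitor only matches services that carry the labels named in its selector, it can help to check the labels on the Fleet services before relying on the example in step 1. The sketch below filters the output of kubectl get svc --show-labels; the helper name services_with_label is hypothetical.

```shell
# Reads `kubectl get svc --show-labels --no-headers` output on stdin and
# prints the names of services that carry exactly the label given as $1
# (in key=value form). The labels are the last, comma-separated column.
services_with_label() {
  awk -v want="$1" '{
    n = split($NF, labels, ",")
    for (i = 1; i <= n; i++)
      if (labels[i] == want) { print $1; break }
  }'
}
```

For example, `kubectl get svc -n cattle-fleet-system --show-labels --no-headers | services_with_label app=gitjob` prints the services the second ServiceMonitor would select. An empty result suggests the selector label needs adjusting.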