Collecting metrics

This document describes how to deploy grafana-agent to your K8s cluster as a Deployment or a DaemonSet, scrape the kubelet and cAdvisor metrics exposed on each host, and push the scraped data to Nightingale via remote_write.

By the end of this document, you will have:

  1. Deployed grafana-agent to your K8s cluster;
  2. Configured grafana-agent to scrape kubelet and cAdvisor metrics.

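K8s is an open-source container orchestration system that automates container deployment, scaling, and related tasks. By default, K8s exposes several metrics endpoints for Nodes and the control plane, and these endpoints follow the Prometheus metrics format. We can deploy grafana-agent to collect each Node's cAdvisor and kubelet metrics and send them to Nightingale via remote_write.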

Prerequisites

  1. A Kubernetes cluster with RBAC (role-based access control) enabled;
  2. The kubectl command-line tool, installed and configured.
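
The steps in this document also pipe manifests through curl and envsubst, so a quick preflight check can save a failed run. The following is a minimal sketch, not part of grafana-agent or kubectl; the helper name `missing_tools` is made up for illustration.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# The steps in this document call these three commands.
missing=$(missing_tools kubectl curl envsubst)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Install before continuing:" $missing >&2
fi
```

The check only reports what is missing; it does not verify that kubectl is pointed at the intended cluster.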

Step 1: Create the ServiceAccount, ClusterRole, and ClusterRoleBinding

export NAMESPACE=default
MANIFEST_URL=https://raw.githubusercontent.com/flashcatcloud/fc-agent/fc-release/etc/k8s/agent-bare.yaml
curl -fsSL $MANIFEST_URL | envsubst | kubectl apply -f -
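
Only NAMESPACE is exported in these commands. envsubst runs as a child process and can substitute exported variables only, whereas MANIFEST_URL is expanded by the current shell before curl starts. The sketch below shows the difference using a child `sh`; the URL is a placeholder, and the assumption that the manifest refers to `${NAMESPACE}` is inferred from the pipeline rather than confirmed here.

```shell
# Start clean in case MANIFEST_URL was exported earlier in this session.
unset MANIFEST_URL

export NAMESPACE=default                          # exported: inherited by child processes
MANIFEST_URL=https://example.invalid/agent.yaml   # placeholder URL, shell-local only

# A child process sees the exported variable but not the shell-local one.
child_view=$(sh -c 'printf "%s|%s" "${NAMESPACE-unset}" "${MANIFEST_URL-unset}"')
echo "$child_view"
```

If a variable used by the manifest is set without `export`, envsubst replaces its reference with an empty string.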

Step 2: Create the ConfigMap that configures grafana-agent

export NAMESPACE=default
export CLUSTER_NAME=kubernetes
export FC_REMOTE_WRITE_URL=http://10.206.0.16:8480/insert/0/prometheus/api/v1/write
# Optional: an alternative endpoint and basic-auth credentials.
#export FC_REMOTE_WRITE_URL=https://n9e-server:19000/prometheus/v1/write
#export FC_REMOTE_WRITE_USERNAME=fc_laiwei
#export FC_REMOTE_WRITE_PASSWORD=fc_laiweisecret
cat <<EOF |
kind: ConfigMap
metadata:
  name: grafana-agent
apiVersion: v1
data:
  agent.yaml: |
    server:
      http_listen_port: 12345
    metrics:
      wal_directory: /tmp/grafana-agent-wal
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
        external_labels:
          cluster: ${CLUSTER_NAME}
      configs:
        - name: integrations
          remote_write:
            - url: ${FC_REMOTE_WRITE_URL}
              basic_auth:
                username: ${FC_REMOTE_WRITE_USERNAME}
                password: ${FC_REMOTE_WRITE_PASSWORD}
          scrape_configs:
            - job_name: integrations/kubernetes/cadvisor
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              kubernetes_sd_configs:
                - role: node
              metric_relabel_configs:
                - action: drop
                  regex: container_([a-z_]+);
                  source_labels:
                    - __name__
                    - image
                - action: drop
                  regex: container_(network_tcp_usage_total|network_udp_usage_total|tasks_state|cpu_load_average_10s)
                  source_labels:
                    - __name__
              relabel_configs:
                - replacement: kubernetes.default.svc:443
                  target_label: __address__
                - regex: (.+)
                  replacement: /api/v1/nodes/\$1/proxy/metrics/cadvisor
                  source_labels:
                    - __meta_kubernetes_node_name
                  target_label: __metrics_path__
              scheme: https
              tls_config:
                ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                insecure_skip_verify: false
                server_name: kubernetes
            - job_name: integrations/kubernetes/kubelet
              bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              kubernetes_sd_configs:
                - role: node
              relabel_configs:
                - replacement: kubernetes.default.svc:443
                  target_label: __address__
                - regex: (.+)
                  replacement: /api/v1/nodes/\$1/proxy/metrics
                  source_labels:
                    - __meta_kubernetes_node_name
                  target_label: __metrics_path__
              scheme: https
              tls_config:
                ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                insecure_skip_verify: false
                server_name: kubernetes
EOF
envsubst | kubectl apply -n "$NAMESPACE" -f -
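
Because the heredoc delimiter `EOF` is unquoted, the shell itself expands references such as `${CLUSTER_NAME}` before envsubst sees the text, and it turns `\$1` into a literal `$1` for the relabel rules. A minimal sketch of that behavior, using two lines taken from the config above:

```shell
CLUSTER_NAME=kubernetes

# Unquoted delimiter: ${...} is expanded by the shell, and \$ becomes a literal $.
rendered=$(cat <<EOF
cluster: ${CLUSTER_NAME}
replacement: /api/v1/nodes/\$1/proxy/metrics
EOF
)
printf '%s\n' "$rendered"
```

An unset variable such as FC_REMOTE_WRITE_USERNAME expands to an empty string in the same way, so the rendered basic_auth block carries empty values unless the commented-out exports are enabled.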

Step 3: Create the grafana-agent instances in K8s

DaemonSet

When each node should run exactly one grafana-agent instance, as when collecting node_exporter, kubelet, or cAdvisor metrics, running it as a DaemonSet is recommended.

export NAMESPACE=default
MANIFEST_URL=https://raw.githubusercontent.com/flashcatcloud/fc-agent/fc-release/etc/k8s/agent-daemonset.yaml
curl -fsSL $MANIFEST_URL | envsubst | kubectl apply -f -

Deployment

When multiple grafana-agent instances need to run, as when collecting from mysqld_exporter, running it as a Deployment is recommended.

export NAMESPACE=default
MANIFEST_URL=https://raw.githubusercontent.com/flashcatcloud/fc-agent/fc-release/etc/k8s/agent-deployment.yaml
curl -fsSL $MANIFEST_URL | envsubst | kubectl apply -f -
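
To confirm that the rollout finished, `kubectl rollout status` can block until the workload is ready. The sketch below assumes the manifests create a workload named grafana-agent (the name used by the restart commands later in this document) in the namespace you exported; the wrapper name `wait_for_agent` and the 120-second timeout are illustrative choices.

```shell
# Hypothetical wrapper: wait for the grafana-agent workload to become ready.
# $1 is the workload kind (daemonset or deployment), $2 the namespace.
wait_for_agent() {
  kind=$1
  namespace=${2:-default}
  kubectl rollout status "$kind/grafana-agent" -n "$namespace" --timeout=120s
}
```

For example, `wait_for_agent daemonset "$NAMESPACE"` returns non-zero if the DaemonSet is not ready within the timeout.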

How to recreate grafana-agent

DaemonSet

kubectl rollout restart daemonset/grafana-agent

Deployment

kubectl rollout restart deployment/grafana-agent

At this point, grafana-agent is deployed in K8s and collecting metrics. As a next step, grafana-agent can be configured further to build a complete Kubernetes metrics monitoring setup.