Collecting monitoring data with Grafana-agent

Acknowledgement: Grafana Agent is a lightweight telemetry collector based on Prometheus that only performs its scraping and remote_write functions. Agent can also collect metrics, logs, and traces for storage in Grafana Cloud and Grafana Enterprise, as well as OSS deployments of Loki (logs), and Tempo (traces), Prometheus (metrics), and Cortex (metrics). Grafana Agent also contains several integrations (embedded metrics exporters) like node-exporter, a MySQL exporter, and many more.

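If you operate a Kubernetes cluster and your applications run on Kubernetes, see "Using grafana-agent in K8s".

Installing and running grafana-agent on Windows

  1. Download the Windows installer from the Grafana GitHub releases page;
  2. Run the installer; it configures grafana-agent and registers it as a Windows service;
  3. For more detailed configuration, see the Windows Guide.

Running grafana-agent in Docker

If a Docker daemon is running on your host, running grafana-agent with Docker is the quickest option. Run the following commands in a terminal to start grafana-agent in a container.

1. Generate the grafana-agent configuration file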

    cat <<EOF > /tmp/grafana-agent-config.yaml
    server:
      log_level: info
      http_listen_port: 12345
    metrics:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
      configs:
        - name: flashtest
          host_filter: false
          scrape_configs:
            - job_name: local_scrape
              static_configs:
                - targets: ['127.0.0.1:12345']
                  labels:
                    cluster: 'mymac'
          remote_write:
            - url: https://n9e-server:19000/prometheus/v1/write
              basic_auth:
                username: <string>
                password: <string>
    EOF

2. Start the grafana-agent container

    docker run \
      -v /tmp/agent:/etc/agent/data \
      -v /tmp/grafana-agent-config.yaml:/etc/agent/agent.yaml \
      -p 12345:12345 \
      -d \
      grafana/agent:v0.23.0 \
      --config.file=/etc/agent/agent.yaml \
      --prometheus.wal-directory=/etc/agent/data

Alternatively, you can build the image locally from the Dockerfile and then run it:

    curl -sO https://raw.githubusercontent.com/grafana/agent/main/cmd/agent/Dockerfile
    # docker build requires a build context; the trailing "." uses the current directory
    docker build -t grafana/agent:v0.23.0 -f ./Dockerfile .

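A few points to note about the steps above:

  • Fill in remote_write and basic_auth according to your own environment;
  • -p maps port 12345 in the container to the host, and -d runs the container in the background;
  • -v /tmp/agent:/etc/agent/data mounts the host directory /tmp/agent at /etc/agent/data in the container, where grafana-agent persists its WAL (Write Ahead Log);
  • -v /tmp/grafana-agent-config.yaml:/etc/agent/agent.yaml places the grafana-agent configuration file at the expected location in the container, /etc/agent/agent.yaml.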
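Because the sample configuration ships with literal <string> placeholders for the basic_auth credentials, it can be useful to check that they were replaced before starting the container. The following is a minimal sketch, not part of grafana-agent; the function name has_placeholders is hypothetical.

```shell
#!/usr/bin/env bash
# has_placeholders FILE (hypothetical helper, not part of grafana-agent):
# succeeds (exit 0) if FILE still contains the literal "<string>" placeholder
# and prints the matching lines with their line numbers; fails otherwise.
has_placeholders() {
  grep -n -F '<string>' "$1"
}
```

For example, `has_placeholders /tmp/grafana-agent-config.yaml && echo "fill in basic_auth first"` prints a reminder while the placeholders are still present.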

3. Verify that grafana-agent is working

You can run curl http://localhost:12345/metrics directly to check that metrics are being produced as expected. Normal output looks like this:

    agent_build_info{branch="HEAD",goversion="go1.17.6",revision="36b8ca75",version="v0.23.0"} 1
    agent_inflight_requests{method="GET",route="metrics"} 1
    agent_metrics_active_configs 1
    agent_metrics_active_instances 1
    agent_tcp_connections{protocol="grpc"} 0
    agent_tcp_connections{protocol="http"} 2
    go_gc_duration_seconds_sum 0.0040902
    go_gc_duration_seconds_count 6
    go_goroutines 50
    log_messages_total{level="debug"} 44
    log_messages_total{level="error"} 0
    log_messages_total{level="info"} 13
    log_messages_total{level="warn"} 0
    loki_logql_querystats_duplicates_total 0
    loki_logql_querystats_ingester_sent_lines_total 0
    net_conntrack_dialer_conn_attempted_total{dialer_name="local_scrape"} 1
    net_conntrack_dialer_conn_attempted_total{dialer_name="remote_storage_write_client"} 1
    net_conntrack_dialer_conn_closed_total{dialer_name="local_scrape"} 0
    net_conntrack_dialer_conn_closed_total{dialer_name="remote_storage_write_client"} 0
    net_conntrack_dialer_conn_established_total{dialer_name="local_scrape"} 1
    net_conntrack_dialer_conn_established_total{dialer_name="remote_storage_write_client"} 1
    process_cpu_seconds_total 11.53
    process_max_fds 1.048576e+06
    process_open_fds 17
    process_resident_memory_bytes 9.4773248e+07
    process_start_time_seconds 1.64499076013e+09
    process_virtual_memory_bytes 1.356931072e+09
    process_virtual_memory_max_bytes 1.8446744073709552e+19
    prometheus_interner_num_strings 275
    prometheus_interner_string_interner_zero_reference_releases_total 0
    prometheus_sd_consulagent_rpc_duration_seconds_sum{call="services",endpoint="agent"} 0
    prometheus_sd_consulagent_rpc_duration_seconds_count{call="services",endpoint="agent"} 0
    prometheus_sd_consulagent_rpc_failures_total 0
    prometheus_sd_dns_lookup_failures_total 0
    prometheus_sd_dns_lookups_total 0
    prometheus_sd_file_read_errors_total 0
    prometheus_sd_file_scan_duration_seconds{quantile="0.5"} NaN
    ...
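When only one or two of these metrics matter, filtering the output is easier than reading all of it. The following is a minimal sketch of such a filter; the function name metric_value is hypothetical, and it assumes the plain Prometheus text format shown above, with the sample name (including any label set) written exactly as it appears in the output.

```shell
#!/usr/bin/env bash
# metric_value NAME (hypothetical helper): reads Prometheus text-format
# metrics on stdin and prints the value of the first sample whose line
# starts with exactly NAME followed by a space. NAME may include a label
# set, e.g. 'log_messages_total{level="error"}'.
# Exits non-zero if no such sample is found.
# Caveat: awk -v interprets backslash escapes, so names that contain
# backslashes are not supported by this sketch.
metric_value() {
  awk -v name="$1" '
    index($0, name " ") == 1 {
      # Everything after "NAME " is the value (optionally followed by a timestamp).
      rest = substr($0, length(name) + 2)
      split(rest, parts, " ")
      print parts[1]
      found = 1
      exit
    }
    END { exit (found ? 0 : 1) }
  '
}
```

For example, `curl -s http://localhost:12345/metrics | metric_value agent_metrics_active_configs` prints only the value of that sample.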

You can also query the API exposed by grafana-agent for the list of targets to confirm that it matches expectations:

    curl http://localhost:12345/agent/api/v1/targets | jq
    {
      "status": "success",
      "data": [
        {
          "instance": "7f383657f506f53a739e2df61be58891",
          "target_group": "local_scrape",
          "endpoint": "http://127.0.0.1:12345/metrics",
          "state": "up",
          "labels": {
            "cluster": "mymac",
            "instance": "127.0.0.1:12345",
            "job": "local_scrape"
          },
          "discovered_labels": {
            "__address__": "127.0.0.1:12345",
            "__metrics_path__": "/metrics",
            "__scheme__": "http",
            "__scrape_interval__": "15s",
            "__scrape_timeout__": "10s",
            "cluster": "mymac",
            "job": "local_scrape"
          },
          "last_scrape": "2022-02-16T07:18:55.6221085Z",
          "scrape_duration_ms": 6,
          "scrape_error": ""
        }
      ]
    }
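For scripted checks, the same response can be reduced to a single pass/fail result. The sketch below assumes jq is installed (the command above already pipes through it) and that the response has the shape shown above, with a data array whose entries carry a state field; the function name all_targets_up is hypothetical.

```shell
#!/usr/bin/env bash
# all_targets_up (hypothetical helper): reads the targets API response on
# stdin and succeeds only if there is at least one target and every target
# reports "state": "up". Requires jq.
all_targets_up() {
  jq -e '.data | (length > 0) and all(.[]; .state == "up")' >/dev/null
}
```

For example, `curl -s http://localhost:12345/agent/api/v1/targets | all_targets_up && echo "all targets up"`.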

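Installing and running grafana-agent on the host

If your host does not have Docker, or you prefer to run grafana-agent directly on the host, follow these steps:

1. Download the precompiled binary package

Download URL: https://github.com/grafana/agent/releases/download/${version}/agent-${platform}-${arch}.zip

  • version is currently v0.23.0
  • The platform/arch combinations available for download are: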
    • linux/amd64
    • linux/arm64
    • linux/armv7
    • linux/armv6
    • darwin/amd64
    • darwin/arm64
    • windows/amd64
    • linux/mipsle
    • freebsd/amd64

For example, if the operating system is Linux and the architecture is amd64, download the grafana-agent binary package as follows:

    # download the binary
    curl -SOL "https://github.com/grafana/agent/releases/download/v0.23.0/agent-linux-amd64.zip"
    # extract the binary (the archive is a zip file, so use unzip rather than gunzip)
    unzip ./agent-linux-amd64.zip
    # make sure it is executable
    chmod a+x "agent-linux-amd64"
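The URL pattern above can also be assembled by a small helper so that the same script works across hosts. This is a minimal sketch under stated assumptions: the function names are hypothetical, the mapping only covers common uname values (other values are passed through unchanged, and Windows shells are not handled), and the result should be checked against the platform/arch list above.

```shell
#!/usr/bin/env bash
# All three functions are hypothetical helpers for illustration.

# agent_release_url VERSION PLATFORM ARCH: print the download URL that
# follows the pattern described in this section.
agent_release_url() {
  local version="$1" platform="$2" arch="$3"
  printf 'https://github.com/grafana/agent/releases/download/%s/agent-%s-%s.zip\n' \
    "$version" "$platform" "$arch"
}

# normalize_platform NAME: lowercase the output of `uname -s`
# (Linux -> linux, Darwin -> darwin, FreeBSD -> freebsd).
normalize_platform() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# normalize_arch NAME: map common `uname -m` values to the release naming;
# unknown values are returned unchanged.
normalize_arch() {
  case "$1" in
    x86_64 | amd64) echo amd64 ;;
    aarch64 | arm64) echo arm64 ;;
    armv7l) echo armv7 ;;
    armv6l) echo armv6 ;;
    *) echo "$1" ;;
  esac
}
```

A possible use is `curl -SOL "$(agent_release_url v0.23.0 "$(normalize_platform "$(uname -s)")" "$(normalize_arch "$(uname -m)")")"`.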

2. Generate the grafana-agent configuration file

    cat <<EOF > ./agent-cfg.yaml
    server:
      log_level: info
      http_listen_port: 12345
    metrics:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
        remote_write:
          - url: https://n9e-server:19000/prometheus/v1/write
            basic_auth:
              username: <string>
              password: <string>
    integrations:
      agent:
        enabled: true
      node_exporter:
        enabled: true
        include_exporter_metrics: true
    EOF

3. Start grafana-agent

    nohup ./agent-linux-amd64 \
      -config.file ./agent-cfg.yaml \
      -metrics.wal-directory ./data \
      &> grafana-agent.log &

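4. Verify that grafana-agent is working

  • You can run curl http://localhost:12345/metrics directly to check that metrics are being produced as expected;
  • You can also query the API exposed by grafana-agent for the list of targets to confirm that it matches expectations, using curl http://localhost:12345/agent/api/v1/targets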
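Because the agent is started in the background, the first curl may run before the HTTP listener is ready. A small retry loop avoids a false failure. This is a minimal sketch; the function name wait_for and the attempt and delay values in the usage example are assumptions, not grafana-agent defaults.

```shell
#!/usr/bin/env bash
# wait_for ATTEMPTS DELAY COMMAND [ARGS...] (hypothetical helper):
# runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as COMMAND succeeds, or 1 if every attempt fails.
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `wait_for 10 2 curl -fsS http://localhost:12345/metrics` polls the metrics endpoint up to ten times, two seconds apart, before giving up.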

At this point grafana-agent is running and collecting its own metrics. Next, we describe how to use the exporters embedded in grafana-agent to collect host, process, MySQL, and other monitoring metrics.