Overview
Acknowledgement: Grafana Agent is a lightweight telemetry collector based on Prometheus that performs only Prometheus's scraping and remote_write functions. It can also collect metrics, logs, and traces for storage in Grafana Cloud and Grafana Enterprise, as well as in OSS deployments of Prometheus (metrics), Cortex (metrics), Loki (logs), and Tempo (traces). Grafana Agent also ships several integrations (embedded metrics exporters) such as node-exporter, a MySQL exporter, and many more.
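If you run and manage a Kubernetes cluster and your applications run on Kubernetes, see "Using grafana-agent in K8s".
Installing and running grafana-agent on Windows
- Download the Windows installer from the Grafana GitHub releases page.
- Run the installer. It configures grafana-agent and registers it as a Windows service.
- For more detailed configuration, see the Windows Guide.
Running grafana-agent in Docker
If the Docker service is running on your host, running grafana-agent in Docker is the quickest option. Run the following commands in a terminal to start grafana-agent in a container:
1. Generate the grafana-agent configuration file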
    cat <<EOF > /tmp/grafana-agent-config.yaml
    server:
      log_level: info
      http_listen_port: 12345
    metrics:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
      configs:
        - name: flashtest
          host_filter: false
          scrape_configs:
            - job_name: local_scrape
              static_configs:
                - targets: ['127.0.0.1:12345']
                  labels:
                    cluster: 'mymac'
          remote_write:
            - url: https://n9e-server:19000/prometheus/v1/write
              basic_auth:
                username: <string>
                password: <string>
    EOF
2. Start the grafana-agent container
    docker run \
      -v /tmp/agent:/etc/agent/data \
      -v /tmp/grafana-agent-config.yaml:/etc/agent/agent.yaml \
      -p 12345:12345 \
      -d \
      grafana/agent:v0.23.0 \
      --config.file=/etc/agent/agent.yaml \
      --prometheus.wal-directory=/etc/agent/data
Alternatively, you can build the image locally from the Dockerfile and then run it:
    curl -sO https://raw.githubusercontent.com/grafana/agent/main/cmd/agent/Dockerfile
    docker build -t grafana/agent:v0.23.0 -f ./Dockerfile .
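A few points to note about the steps above:
- Fill in remote_write and basic_auth according to your own environment.
- The -p option maps port 12345 in the container to the host, and -d runs the container in the background.
- The option -v /tmp/agent:/etc/agent/data mounts the host directory /tmp/agent at /etc/agent/data in the container, where grafana-agent persists its WAL (Write-Ahead Log).
- The option -v /tmp/grafana-agent-config.yaml:/etc/agent/agent.yaml places the grafana-agent configuration file at the expected location in the container, /etc/agent/agent.yaml.
3. Verify that grafana-agent is working
You can run curl http://localhost:12345/metrics directly to check that metrics are being produced as expected. Under normal conditions the output looks like this: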
    agent_build_info{branch="HEAD",goversion="go1.17.6",revision="36b8ca75",version="v0.23.0"} 1
    agent_inflight_requests{method="GET",route="metrics"} 1
    agent_metrics_active_configs 1
    agent_metrics_active_instances 1
    agent_tcp_connections{protocol="grpc"} 0
    agent_tcp_connections{protocol="http"} 2
    go_gc_duration_seconds_sum 0.0040902
    go_gc_duration_seconds_count 6
    go_goroutines 50
    log_messages_total{level="debug"} 44
    log_messages_total{level="error"} 0
    log_messages_total{level="info"} 13
    log_messages_total{level="warn"} 0
    loki_logql_querystats_duplicates_total 0
    loki_logql_querystats_ingester_sent_lines_total 0
    net_conntrack_dialer_conn_attempted_total{dialer_name="local_scrape"} 1
    net_conntrack_dialer_conn_attempted_total{dialer_name="remote_storage_write_client"} 1
    net_conntrack_dialer_conn_closed_total{dialer_name="local_scrape"} 0
    net_conntrack_dialer_conn_closed_total{dialer_name="remote_storage_write_client"} 0
    net_conntrack_dialer_conn_established_total{dialer_name="local_scrape"} 1
    net_conntrack_dialer_conn_established_total{dialer_name="remote_storage_write_client"} 1
    process_cpu_seconds_total 11.53
    process_max_fds 1.048576e+06
    process_open_fds 17
    process_resident_memory_bytes 9.4773248e+07
    process_start_time_seconds 1.64499076013e+09
    process_virtual_memory_bytes 1.356931072e+09
    process_virtual_memory_max_bytes 1.8446744073709552e+19
    prometheus_interner_num_strings 275
    prometheus_interner_string_interner_zero_reference_releases_total 0
    prometheus_sd_consulagent_rpc_duration_seconds_sum{call="services",endpoint="agent"} 0
    prometheus_sd_consulagent_rpc_duration_seconds_count{call="services",endpoint="agent"} 0
    prometheus_sd_consulagent_rpc_failures_total 0
    prometheus_sd_dns_lookup_failures_total 0
    prometheus_sd_dns_lookups_total 0
    prometheus_sd_file_read_errors_total 0
    prometheus_sd_file_scan_duration_seconds{quantile="0.5"} NaN
    ...
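When checking the /metrics output in a script, it can help to pull out a single value instead of reading the whole page. The following is a minimal sketch, not part of grafana-agent: `metric_value` is a hypothetical helper name, and it assumes samples carry no trailing timestamp, as in the output above.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics from stdin and prints the value of the
# first sample whose metric name (ignoring any {labels}) equals NAME.
# Assumption: each sample line ends with the value (no trailing timestamp).
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                    # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)       # strip the label part, if present
      if (metric == name) { print $NF; exit }
    }'
}
```

For example, piping the output of curl -s http://localhost:12345/metrics into metric_value agent_metrics_active_instances would print the number of active metrics instances, which is 1 in the output above.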
You can also query the API exposed by grafana-agent and inspect the targets list to confirm that it matches your expectations:
    curl http://localhost:12345/agent/api/v1/targets | jq
    {
      "status": "success",
      "data": [
        {
          "instance": "7f383657f506f53a739e2df61be58891",
          "target_group": "local_scrape",
          "endpoint": "http://127.0.0.1:12345/metrics",
          "state": "up",
          "labels": {
            "cluster": "mymac",
            "instance": "127.0.0.1:12345",
            "job": "local_scrape"
          },
          "discovered_labels": {
            "__address__": "127.0.0.1:12345",
            "__metrics_path__": "/metrics",
            "__scheme__": "http",
            "__scrape_interval__": "15s",
            "__scrape_timeout__": "10s",
            "cluster": "mymac",
            "job": "local_scrape"
          },
          "last_scrape": "2022-02-16T07:18:55.6221085Z",
          "scrape_duration_ms": 6,
          "scrape_error": ""
        }
      ]
    }
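To check the targets response from a script without reading the JSON by eye, one option is to count how many targets report a given state. The sketch below is an assumption-laden shortcut: `count_targets_in_state` is a hypothetical helper, and it does a plain text match on the "state" field rather than real JSON parsing, so a tool such as jq is more robust when available.

```shell
# count_targets_in_state STATE
# Reads the /agent/api/v1/targets JSON from stdin and prints how many targets
# report the given state (for example "up").
# Crude text match on "state": "<STATE>", not a JSON parser.
count_targets_in_state() {
  grep -o "\"state\": *\"$1\"" | wc -l | tr -d ' '
}
```

With the response shown above, piping the curl output into count_targets_in_state up should report one target.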
Installing and running grafana-agent on the host
If your host does not have Docker, or you prefer to run grafana-agent directly on the host, follow these steps:
1. Download the precompiled binary package
The download URL is: https://github.com/grafana/agent/releases/download/${version}/agent-${platform}-${arch}.zip
- The current version is v0.23.0.
- The available platform and arch combinations are:
  - linux/amd64
  - linux/arm64
  - linux/armv7
  - linux/armv6
  - darwin/amd64
  - darwin/arm64
  - windows/amd64
  - linux/mipsle
  - freebsd/amd64
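The URL pattern above can be assembled from its three parts in a script. This is a minimal sketch under stated assumptions: `agent_download_url` and `normalize_arch` are hypothetical helper names, and the mapping from `uname -m` output to release arch names is an assumption that covers only the common cases.

```shell
# agent_download_url VERSION PLATFORM ARCH
# Prints the release download URL following the pattern given above.
agent_download_url() {
  printf 'https://github.com/grafana/agent/releases/download/%s/agent-%s-%s.zip\n' \
    "$1" "$2" "$3"
}

# normalize_arch MACHINE
# Maps a `uname -m` style string to the arch name used in release file names.
# Assumed mapping; extend it for other platforms as needed.
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv7l)        echo armv7 ;;
    armv6l)        echo armv6 ;;
    *) echo "unsupported arch: $1" >&2; return 1 ;;
  esac
}
```

On a Linux host, agent_download_url v0.23.0 linux "$(normalize_arch "$(uname -m)")" would print the URL to pass to curl.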
For example, if the operating system is Linux and the architecture is amd64, download the grafana-agent binary package as follows:
    # download the binary
    curl -SOL "https://github.com/grafana/agent/releases/download/v0.23.0/agent-linux-amd64.zip"
    # extract the binary
    unzip "agent-linux-amd64.zip"
    # make sure it is executable
    chmod a+x "agent-linux-amd64"
2. Generate the grafana-agent configuration file
    cat <<EOF > ./agent-cfg.yaml
    server:
      log_level: info
      http_listen_port: 12345
    metrics:
      global:
        scrape_interval: 15s
        scrape_timeout: 10s
        remote_write:
          - url: https://n9e-server:19000/prometheus/v1/write
            basic_auth:
              username: <string>
              password: <string>
    integrations:
      agent:
        enabled: true
      node_exporter:
        enabled: true
        include_exporter_metrics: true
    EOF
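Because the remote_write URL and the basic_auth credentials have to be filled in per environment, it may be convenient to render the configuration from parameters rather than editing the <string> placeholders by hand. The following is a sketch only: `render_agent_config` is a hypothetical helper, it reproduces the configuration from this step, and it inserts the values verbatim, so values containing YAML special characters would need quoting.

```shell
# render_agent_config URL USERNAME PASSWORD
# Prints the configuration from this step with remote_write filled in.
# Values are inserted verbatim (no YAML escaping is attempted).
render_agent_config() {
  cat <<EOF
server:
  log_level: info
  http_listen_port: 12345
metrics:
  global:
    scrape_interval: 15s
    scrape_timeout: 10s
    remote_write:
      - url: $1
        basic_auth:
          username: $2
          password: $3
integrations:
  agent:
    enabled: true
  node_exporter:
    enabled: true
    include_exporter_metrics: true
EOF
}
```

A call such as render_agent_config "$N9E_URL" "$N9E_USER" "$N9E_PASS" > ./agent-cfg.yaml would then write the file, where the three variable names are placeholders chosen for this example.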
3. Start grafana-agent
    nohup ./agent-linux-amd64 \
      -config.file ./agent-cfg.yaml \
      -metrics.wal-directory ./data \
      &> grafana-agent.log &
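Since the agent is started in the background, a script that verifies it right away may run before the HTTP port is listening. One way to handle this is a small retry loop, sketched below. `wait_until` is a hypothetical generic helper, not a grafana-agent feature, and the retry count and delay are arbitrary.

```shell
# wait_until TRIES DELAY CMD...
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
wait_until() {
  tries="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For instance, wait_until 10 1 curl -sf -o /dev/null http://localhost:12345/metrics would poll the metrics endpoint once per second for up to ten attempts.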
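4. Verify that grafana-agent is working
- Run curl http://localhost:12345/metrics to check that metrics are being produced as expected.
- Alternatively, query the API exposed by grafana-agent and inspect the targets list to confirm that it matches your expectations, using curl http://localhost:12345/agent/api/v1/targets.
At this point grafana-agent is up and running and has started collecting its own metrics. Next, we describe how to use grafana-agent's embedded exporters to collect metrics for hosts, processes, MySQL, and more.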
