Redis Exporter

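grafana-agent ships with a built-in redis_exporter that collects runtime metrics from a Redis server.

Currently, each grafana-agent instance can be configured with only one Redis server address to collect from. To collect metrics from multiple Redis instances, run one grafana-agent instance per Redis server and use relabel_configs to distinguish the data coming from each Redis instance.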

Configure and enable redis_exporter

  redis_exporter:
    enabled: true
    redis_addr: "redis-2:6379"
    relabel_configs:
    - source_labels: [__address__]
      target_label: instance
      replacement: redis-2
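
To illustrate the multi-instance setup described earlier, a second grafana-agent process could carry a near-identical block that points at another server. This is only a sketch: the address redis-3:6379 and the label value redis-3 are hypothetical, and the block is assumed to live in that second agent's own configuration file, in the same position as the block above.

```yaml
# Configuration fragment for a second grafana-agent instance.
# The host name redis-3 is a hypothetical example.
redis_exporter:
  enabled: true
  redis_addr: "redis-3:6379"
  relabel_configs:
  # Overwrite the instance label so that series from this server
  # can be told apart from those collected from redis-2.
  - source_labels: [__address__]
    target_label: instance
    replacement: redis-3
```

With both agents running, queries and dashboards can separate the two servers by filtering on the instance label.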

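We strongly recommend running grafana-agent under a dedicated account and granting it only the minimum privileges it needs to access the Redis instance, to avoid the security risks of over-authorization. For more details, see the official documentation.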
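
On the agent side, a least-privilege setup might look like the sketch below, which authenticates as a dedicated Redis ACL user and reads the password from a file instead of embedding it in the configuration. The user name grafana_agent and the file path are hypothetical; the ACL user itself has to be created on the Redis server separately, and redis_user requires Redis 6.0 or newer.

```yaml
redis_exporter:
  enabled: true
  redis_addr: "redis-2:6379"
  # Hypothetical dedicated ACL user that is limited to monitoring commands.
  redis_user: "grafana_agent"
  # Hypothetical path. When set, this takes precedence over redis_password.
  redis_password_file: "/etc/grafana-agent/redis_password"
```

Keeping the password in a separate file lets you restrict its file permissions to the account that runs grafana-agent.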

Key metrics collected

  1. redis_active_defrag_running: When activedefrag is enabled, this indicates whether defragmentation is currently active, and the CPU percentage it intends to utilize.
  2. redis_allocator_active_bytes: Total bytes in the allocator active pages, this includes external-fragmentation.
  3. redis_allocator_allocated_bytes: Total bytes allocated from the allocator, including internal-fragmentation. Normally the same as used_memory.
  4. redis_allocator_frag_bytes: Delta between allocator_active and allocator_allocated. See note about mem_fragmentation_bytes.
  5. redis_allocator_frag_ratio: Ratio between allocator_active and allocator_allocated. This is the true (external) fragmentation metric (not mem_fragmentation_ratio).
  6. redis_allocator_resident_bytes: Total bytes resident (RSS) in the allocator, this includes pages that can be released to the OS (by MEMORY PURGE, or just waiting).
  7. redis_allocator_rss_bytes: Delta between allocator_resident and allocator_active.
  8. redis_allocator_rss_ratio: Ratio between allocator_resident and allocator_active. This usually indicates pages that the allocator can and probably will soon release back to the OS.
  9. redis_aof_current_rewrite_duration_sec: Duration of the on-going AOF rewrite operation if any.
  10. redis_aof_enabled: Flag indicating AOF logging is activated.
  11. redis_aof_last_bgrewrite_status: Status of the last AOF rewrite operation.
  12. redis_aof_last_cow_size_bytes: The size in bytes of copy-on-write memory during the last AOF rewrite operation.
  13. redis_aof_last_rewrite_duration_sec: Duration of the last AOF rewrite operation in seconds.
  14. redis_aof_last_write_status: Status of the last write operation to the AOF.
  15. redis_aof_rewrite_in_progress: Flag indicating an AOF rewrite operation is on-going.
  16. redis_aof_rewrite_scheduled: Flag indicating an AOF rewrite operation will be scheduled once the on-going RDB save is complete.
  17. redis_blocked_clients: Number of clients pending on a blocking call (BLPOP, BRPOP, BRPOPLPUSH, BLMOVE, BZPOPMIN, BZPOPMAX).
  18. redis_client_recent_max_input_buffer_bytes: Biggest input buffer among current client connections.
  19. redis_client_recent_max_output_buffer_bytes: Biggest output buffer among current client connections.
  20. redis_cluster_enabled: Indicates whether Redis cluster is enabled.
  21. redis_commands_duration_seconds_total: The total CPU time consumed by these commands.(Counter)
  22. redis_commands_processed_total: Total number of commands processed by the server.(Counter)
  23. redis_commands_total: The number of calls that reached command execution (not rejected).(Counter)
  24. redis_config_maxclients: The value of the maxclients configuration directive. This is the upper limit for the sum of connected_clients, connected_slaves and cluster_connections.
  25. redis_config_maxmemory: The value of the maxmemory configuration directive.
  26. redis_connected_clients: Number of client connections (excluding connections from replicas).
  27. redis_connected_slaves: Number of connected replicas.
  28. redis_connections_received_total: Total number of connections accepted by the server.(Counter)
  29. redis_cpu_sys_children_seconds_total: System CPU consumed by the background processes.(Counter)
  30. redis_cpu_sys_seconds_total: System CPU consumed by the Redis server, which is the sum of system CPU consumed by all threads of the server process (main thread and background threads).(Counter)
  31. redis_cpu_user_children_seconds_total: User CPU consumed by the background processes.(Counter)
  32. redis_cpu_user_seconds_total: User CPU consumed by the Redis server, which is the sum of user CPU consumed by all threads of the server process (main thread and background threads).(Counter)
  33. redis_db_keys: Total number of keys by DB.
  34. redis_db_keys_expiring: Total number of expiring keys by DB
  35. redis_defrag_hits: Number of value reallocations performed by the active defragmentation process.
  36. redis_defrag_misses: Number of aborted value reallocations started by the active defragmentation process.
  37. redis_defrag_key_hits: Number of keys that were actively defragmented.
  38. redis_defrag_key_misses: Number of keys that were skipped by the active defragmentation process.
  39. redis_evicted_keys_total: Number of evicted keys due to maxmemory limit.(Counter)
  40. redis_expired_keys_total: Total number of key expiration events.(Counter)
  41. redis_expired_stale_percentage: The percentage of keys probably expired.
  42. redis_expired_time_cap_reached_total: The count of times that active expiry cycles have stopped early.
  43. redis_exporter_last_scrape_connect_time_seconds: Time in seconds taken to connect to Redis during the last scrape.
  44. redis_exporter_last_scrape_duration_seconds: The last scrape duration.
  45. redis_exporter_last_scrape_error: The last scrape error status.
  46. redis_exporter_scrape_duration_seconds_count: Durations of scrapes by the exporter
  47. redis_exporter_scrape_duration_seconds_sum: Durations of scrapes by the exporter
  48. redis_exporter_scrapes_total: Current total redis scrapes.(Counter)
  49. redis_instance_info: Information about the Redis instance.
  50. redis_keyspace_hits_total: Hits total.(Counter)
  51. redis_keyspace_misses_total: Misses total.(Counter)
  52. redis_last_key_groups_scrape_duration_milliseconds: Duration of the last key group metrics scrape in milliseconds.
  53. redis_last_slow_execution_duration_seconds: The amount of time needed for last slow execution, in seconds.
  54. redis_latest_fork_seconds: The amount of time needed for last fork, in seconds.
  55. redis_lazyfree_pending_objects: The number of objects waiting to be freed (as a result of calling UNLINK, or FLUSHDB and FLUSHALL with the ASYNC option).
  56. redis_master_repl_offset: The server's current replication offset.
  57. redis_mem_clients_normal: Memory used by normal clients.(Gauge)
  58. redis_mem_clients_slaves: Memory used by replica clients - Starting Redis 7.0, replica buffers share memory with the replication backlog, so this field can show 0 when replicas don't trigger an increase of memory usage.
  59. redis_mem_fragmentation_bytes: Delta between used_memory_rss and used_memory. Note that when the total fragmentation bytes is low (few megabytes), a high ratio (e.g. 1.5 and above) is not an indication of an issue.
  60. redis_mem_fragmentation_ratio: Ratio between used_memory_rss and used_memory. Note that this includes not only fragmentation but also other process overheads (see the allocator_* metrics), as well as overheads like code, shared libraries, stack, etc.
  61. redis_mem_not_counted_for_eviction_bytes: Memory in bytes that is not counted toward key eviction.(Gauge)
  62. redis_memory_max_bytes: Max memory limit in bytes.
  63. redis_memory_used_bytes: Total number of bytes allocated by Redis using its allocator (either standard libc, jemalloc, or an alternative allocator such as tcmalloc)
  64. redis_memory_used_dataset_bytes: The size in bytes of the dataset (used_memory_overhead subtracted from used_memory)
  65. redis_memory_used_lua_bytes: Number of bytes used by the Lua engine.
  66. redis_memory_used_overhead_bytes: The sum in bytes of all overheads that the server allocated for managing its internal data structures.
  67. redis_memory_used_peak_bytes: Peak memory consumed by Redis (in bytes)
  68. redis_memory_used_rss_bytes: Number of bytes that Redis allocated as seen by the operating system (a.k.a resident set size). This is the number reported by tools such as top(1) and ps(1)
  69. redis_memory_used_scripts_bytes: Number of bytes used by cached Lua scripts
  70. redis_memory_used_startup_bytes: Initial amount of memory consumed by Redis at startup in bytes
  71. redis_migrate_cached_sockets_total: The number of sockets open for MIGRATE purposes
  72. redis_net_input_bytes_total: Total input bytes(Counter)
  73. redis_net_output_bytes_total: Total output bytes(Counter)
  74. redis_process_id: Process ID
  75. redis_pubsub_channels: Global number of pub/sub channels with client subscriptions
  76. redis_pubsub_patterns: Global number of pub/sub patterns with client subscriptions
  77. redis_rdb_bgsave_in_progress: Flag indicating a RDB save is on-going
  78. redis_rdb_changes_since_last_save: Number of changes since the last dump
  79. redis_rdb_current_bgsave_duration_sec: Duration of the on-going RDB save operation if any
  80. redis_rdb_last_bgsave_duration_sec: Duration of the last RDB save operation in seconds
  81. redis_rdb_last_bgsave_status: Status of the last RDB save operation
  82. redis_rdb_last_cow_size_bytes: The size in bytes of copy-on-write memory during the last RDB save operation
  83. redis_rdb_last_save_timestamp_seconds: Epoch-based timestamp of last successful RDB save
  84. redis_rejected_connections_total: Number of connections rejected because of maxclients limit(Counter)
  85. redis_repl_backlog_first_byte_offset: The master offset of the replication backlog buffer
  86. redis_repl_backlog_history_bytes: Size in bytes of the data in the replication backlog buffer
  87. redis_repl_backlog_is_active: Flag indicating replication backlog is active
  88. redis_replica_partial_resync_accepted: The number of accepted partial resync requests(Gauge)
  89. redis_replica_partial_resync_denied: The number of denied partial resync requests(Gauge)
  90. redis_replica_resyncs_full: The number of full resyncs with replicas
  91. redis_replication_backlog_bytes: Memory used by replication backlog
  92. redis_second_repl_offset: The offset up to which replication IDs are accepted.
  93. redis_slave_expires_tracked_keys: The number of keys tracked for expiry purposes (applicable only to writable replicas)(Gauge)
  94. redis_slowlog_last_id: Last id of slowlog
  95. redis_slowlog_length: Total number of entries in the slowlog
  96. redis_start_time_seconds: Start time of the Redis instance since unix epoch in seconds.
  97. redis_target_scrape_request_errors_total: Errors in requests to the exporter
  98. redis_up: Flag indicating redis instance is up
  99. redis_uptime_in_seconds: Number of seconds since Redis server start
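
As an illustration of how a few of these metrics might be used, the sketch below defines three alerting rules in the Prometheus rule-file format. It assumes the metrics are remote-written to a Prometheus-compatible backend that evaluates such rules. The group name, thresholds, durations, and severity labels are hypothetical and should be tuned for your environment.

```yaml
groups:
- name: redis-example-alerts   # hypothetical group name
  rules:
  # The exporter could not reach the Redis instance.
  - alert: RedisDown
    expr: redis_up == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Redis instance {{ $labels.instance }} is down"

  # Memory usage above 90% of maxmemory. The second condition skips
  # instances where maxmemory is unset (reported as 0).
  - alert: RedisMemoryNearLimit
    expr: (redis_memory_used_bytes / redis_memory_max_bytes > 0.9) and (redis_memory_max_bytes > 0)
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Redis instance {{ $labels.instance }} is close to its maxmemory limit"

  # Connections were rejected because the maxclients limit was reached.
  - alert: RedisRejectedConnections
    expr: increase(redis_rejected_connections_total[5m]) > 0
    labels:
      severity: warning
    annotations:
      summary: "Redis instance {{ $labels.instance }} is rejecting connections"
```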

Full configuration reference

  # Enables the redis_exporter integration, allowing the Agent to automatically
  # collect system metrics from the configured redis address
  [enabled: <boolean> | default = false]

  # Sets an explicit value for the instance label when the integration is
  # self-scraped. Overrides inferred values.
  #
  # The default value for this integration is inferred from the hostname
  # portion of redis_addr.
  [instance: <string>]

  # Automatically collect metrics from this integration. If disabled,
  # the redis_exporter integration will be run but not scraped and thus not
  # remote-written. Metrics for the integration will be exposed at
  # /integrations/redis_exporter/metrics and can be scraped by an external
  # process.
  [scrape_integration: <boolean> | default = <integrations_config.scrape_integrations>]

  # How often should the metrics be collected? Defaults to
  # prometheus.global.scrape_interval.
  [scrape_interval: <duration> | default = <global_config.scrape_interval>]

  # The timeout before considering the scrape a failure. Defaults to
  # prometheus.global.scrape_timeout.
  [scrape_timeout: <duration> | default = <global_config.scrape_timeout>]

  # Allows for relabeling labels on the target.
  relabel_configs:
    [- <relabel_config> ... ]

  # Relabel metrics coming from the integration, allowing you to drop series
  # from the integration that you don't care about.
  metric_relabel_configs:
    [ - <relabel_config> ... ]

  # How frequently to truncate the WAL for this integration.
  [wal_truncate_frequency: <duration> | default = "60m"]

  # Monitor the exporter itself and include those metrics in the results.
  [include_exporter_metrics: <bool> | default = false]

  # Exporter-specific configuration options

  # Address of the redis instance.
  redis_addr: <string>

  # User name to use for authentication (Redis ACL for Redis 6.0 and newer).
  [redis_user: <string>]

  # Password of the redis instance.
  [redis_password: <string>]

  # Path of a file containing a password. If this is defined, it takes precedence
  # over redis_password.
  [redis_password_file: <string>]

  # Namespace for the metrics.
  [namespace: <string> | default = "redis"]

  # What to use for the CONFIG command.
  [config_command: <string> | default = "CONFIG"]

  # Comma separated list of key-patterns to export value and length/size,
  # searched for with SCAN.
  [check_keys: <string>]

  # Comma separated list of Lua regexes for grouping keys. When unset, no key
  # groups will be made.
  [check_key_groups: <string>]

  # Check key or key groups batch size hint for the underlying SCAN. The name
  # is kept for backwards compatibility, but this applies to both key and key
  # groups batch size configuration.
  [check_key_groups_batch_size: <int> | default = 10000]

  # The maximum number of distinct key groups with the most memory utilization
  # to present as distinct metrics per database. The leftover key groups will be
  # aggregated in the 'overflow' bucket.
  [max_distinct_key_groups: <int> | default = 100]

  # Comma separated list of single keys to export value and length/size.
  [check_single_keys: <string>]

  # Comma separated list of stream-patterns to export info about streams,
  # groups and consumers, searched for with SCAN.
  [check_streams: <string>]

  # Comma separated list of single streams to export info about streams,
  # groups and consumers.
  [check_single_streams: <string>]

  # Comma separated list of individual keys to export counts for.
  [count_keys: <string>]

  # Path to Lua Redis script for collecting extra metrics.
  [script_path: <string>]

  # Timeout for connection to Redis instance (in Golang duration format).
  [connection_timeout: <time.Duration> | default = "15s"]

  # Name of the client key file (including full path) if the server requires
  # TLS client authentication.
  [tls_client_key_file: <string>]

  # Name of the client certificate file (including full path) if the server
  # requires TLS client authentication.
  [tls_client_cert_file: <string>]

  # Name of the CA certificate file (including full path) if the server
  # requires TLS client authentication.
  [tls_ca_cert_file: <string>]

  # Whether to set client name to redis_exporter.
  [set_client_name: <bool>]

  # Whether to scrape Tile38 specific metrics.
  [is_tile38: <bool>]

  # Whether to scrape Client List specific metrics.
  [export_client_list: <bool>]

  # Whether to include the client's port when exporting the client list. Note
  # that including this will increase the cardinality of all redis metrics.
  [export_client_port: <bool>]

  # Whether to export only Redis metrics, leaving out Go runtime metrics.
  [redis_metrics_only: <bool>]

  # Whether to ping the redis instance after connecting.
  [ping_on_connect: <bool>]

  # Whether to include system metrics like e.g. redis_total_system_memory_bytes.
  [incl_system_metrics: <bool>]

  # Whether to skip TLS verification.
  [skip_tls_verification: <bool>]
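
The options above can be combined as in the following sketch, which tunes the scrape timing, exports size information for a set of keys, and drops series that are not needed. The key pattern session:* and all of the values are hypothetical, and the exact pattern syntax accepted by check_keys should be checked against the exporter's own documentation.

```yaml
redis_exporter:
  enabled: true
  redis_addr: "redis-2:6379"
  # Set the instance label explicitly instead of relying on the
  # hostname inferred from redis_addr.
  instance: redis-2
  scrape_interval: 30s
  scrape_timeout: 10s
  # Fail fast if the Redis server is unreachable.
  connection_timeout: "5s"
  # Hypothetical key pattern, searched for with SCAN.
  check_keys: "session:*"
  metric_relabel_configs:
  # Drop the allocator-level series (redis_allocator_*) before they
  # are remote-written.
  - source_labels: [__name__]
    regex: "redis_allocator_.*"
    action: drop
```

Dropping unneeded series in metric_relabel_configs reduces storage cost without changing what the exporter collects from Redis.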