Grafana Plugin (grafana/v1-alpha)

The Grafana plugin is an optional plugin that scaffolds Grafana Dashboards so you can view the default metrics exported by projects that use controller-runtime.

When to use it?

How to use it?

Prerequisites

Basic Usage

The Grafana plugin is attached to the init and edit subcommands:

# Initialize a new project with the grafana plugin
kubebuilder init --plugins grafana.kubebuilder.io/v1-alpha

# Enable the grafana plugin on an existing project
kubebuilder edit --plugins grafana.kubebuilder.io/v1-alpha

The plugin creates a new directory and scaffolds the JSON files in it (for example, grafana/controller-runtime-metrics.json).
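
Before importing a generated file, it can help to see which panels and queries it contains. The following is a minimal sketch, assuming the file follows the usual Grafana dashboard JSON layout with a top-level panels list whose entries carry a title and a targets list with expr fields. The function name list_panel_queries and the trimmed sample_dashboard are hypothetical and only serve the illustration.

```python
def list_panel_queries(dashboard):
    """Return (panel title, query expression) pairs from a dashboard dict."""
    pairs = []
    for panel in dashboard.get("panels", []):
        title = panel.get("title", "<untitled>")
        # Panels without queries (for example text panels) have no targets.
        for target in panel.get("targets", []):
            expr = target.get("expr")
            if expr:
                pairs.append((title, expr))
    return pairs


# Hypothetical, heavily trimmed stand-in for a generated dashboard file.
sample_dashboard = {
    "title": "Example",
    "panels": [
        {
            "title": "WorkQueue Depth",
            "targets": [
                {"expr": 'workqueue_depth{job="$job", namespace="$namespace"}'}
            ],
        },
        {"title": "Text panel without queries"},
    ],
}

for title, expr in list_panel_queries(sample_dashboard):
    print(f"{title}: {expr}")
```

To run the sketch against a real file, load it with the standard json module (json.load on the opened file) and pass the resulting dict to list_panel_queries.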

Show case:

The following animation shows the plugin being enabled in a project:

[Animation: output]

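How to import the dashboards into Grafana

  1. Copy the contents of the JSON file.
  2. Open <your-grafana-url>/dashboard/import and follow the prompts to import a new dashboard.
  3. Paste the JSON into "Import via panel json" and click "Load".
  4. Select the Prometheus instance to use as the data source.
  5. Once the import succeeds, the dashboard is ready to use.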

Dashboard Descriptions

Controller Runtime Reconciliation Total & Errors

  • Metrics:
    • controller_runtime_reconcile_total
    • controller_runtime_reconcile_errors_total
  • Query:
    • sum(rate(controller_runtime_reconcile_total{job="$job"}[5m])) by (instance, pod)
    • sum(rate(controller_runtime_reconcile_errors_total{job="$job"}[5m])) by (instance, pod)
  • Description:
    • Per-second rate of total reconciliations, averaged over the last 5 minutes.
    • Per-second rate of reconciliation errors, averaged over the last 5 minutes.
  • Sample:
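
To make the rate(...[5m]) queries above more concrete, here is a simplified sketch of how a per-second rate can be derived from raw counter samples. It is an approximation for illustration only: it accounts for counter resets but, unlike the real Prometheus rate() function, does not extrapolate to the edges of the window. The function name simple_rate and the sample values are hypothetical.

```python
def simple_rate(samples):
    """Approximate a per-second rate from (timestamp_seconds, counter_value) samples.

    Simplified on purpose: counter resets are handled, but there is no
    extrapolation to the window boundaries as Prometheus's rate() performs.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset: the process restarted and counted up from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return increase / elapsed


# Hypothetical samples of controller_runtime_reconcile_total over 5 minutes.
window = [(0, 100), (60, 130), (120, 160), (180, 190), (240, 220), (300, 250)]
print(simple_rate(window))
```

In the hypothetical window the counter grows by 150 over 300 seconds, so the sketch reports a rate in reconciliations per second rather than a raw count.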

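Controller CPU & Memory Usage

  • Metrics:
    • process_cpu_seconds_total
    • process_resident_memory_bytes
  • Query:
    • rate(process_cpu_seconds_total{job="$job", namespace="$namespace", pod="$pod"}[5m]) * 100
    • process_resident_memory_bytes{job="$job", namespace="$namespace", pod="$pod"}
  • Description:
    • Per-second rate of CPU time consumed over the last 5 minutes, multiplied by 100 to express it as a percentage of one core.
    • Resident memory of the controller process, in bytes.
  • Sample: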

P50/90/99 Work Queue Wait Time (Seconds)

  • Metrics:
    • workqueue_queue_duration_seconds_bucket
  • Query:
    • histogram_quantile(0.50, sum(rate(workqueue_queue_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
  • Description:
    • How long an item stays in the work queue before it is picked up.
  • Sample:
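
The histogram_quantile query above estimates a percentile from cumulative bucket rates. The sketch below illustrates the underlying idea, linear interpolation inside the bucket that contains the requested rank. It is a simplified approximation rather than Prometheus's implementation: it assumes non-negative bucket bounds, requires a +Inf bucket, and only accepts 0 < q <= 1. The function name quantile_from_buckets and the bucket values are hypothetical.

```python
import math


def quantile_from_buckets(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) buckets.

    The bucket list must include the (math.inf, total) bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf or buckets[-1][1] <= 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == math.inf:
                # The rank falls in the open-ended bucket: report the
                # highest finite bound instead of interpolating.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            # Linear interpolation within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical cumulative bucket rates for a work queue duration histogram.
example = [(0.1, 20.0), (0.5, 60.0), (1.0, 90.0), (math.inf, 100.0)]
for q in (0.50, 0.90, 0.99):
    print(q, quantile_from_buckets(q, example))
```

Because the estimate is interpolated, its accuracy depends on how finely the buckets are spaced around the percentile of interest.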

P50/90/99 Work Queue Processing Time (Seconds)

  • Metrics:
    • workqueue_work_duration_seconds_bucket
  • Query:
    • histogram_quantile(0.50, sum(rate(workqueue_work_duration_seconds_bucket{job="$job", namespace="$namespace"}[5m])) by (instance, name, le))
  • Description:
    • How long it takes to process an item taken from the work queue.
  • Sample:

Add Rate in Work Queue

  • Metrics:
    • workqueue_adds_total
  • Query:
    • sum(rate(workqueue_adds_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
  • Description:
    • Per-second rate of items added to the work queue.
  • Sample:

Retries Rate in Work Queue

  • Metrics:
    • workqueue_retries_total
  • Query:
    • sum(rate(workqueue_retries_total{job="$job", namespace="$namespace"}[5m])) by (instance, name)
  • Description:
    • Per-second rate of retries handled by the work queue.
  • Sample:

Number of Workers in Use

  • Metrics:
    • controller_runtime_active_workers
  • Query:
    • controller_runtime_active_workers{job="$job", namespace="$namespace"}
  • Description:
    • The number of active controller workers.
  • Sample:

WorkQueue Depth

  • Metrics:
    • workqueue_depth
  • Query:
    • workqueue_depth{job="$job", namespace="$namespace"}
  • Description:
    • Current depth of the work queue.
  • Sample:

Unfinished Seconds

  • Metrics:
    • workqueue_unfinished_work_seconds
  • Query:
    • rate(workqueue_unfinished_work_seconds{job="$job", namespace="$namespace"}[5m])
  • Description:
    • Seconds of work that is in progress and has not yet been observed by work_duration.
  • Sample:

Visualize Custom Metrics

The Grafana plugin supports scaffolding manifests for custom metrics.

Generate Config Template

When the plugin is triggered for the first time, grafana/custom-metrics/config.yaml is generated.

---
customMetrics:
#  - metric: # Raw custom metric (required)
#    type:   # Metric type: counter/gauge/histogram (required)
#    expr:   # PromQL expression for the metric (optional)
#    unit:   # Unit of measurement, examples: s,none,bytes,percent,etc. (optional)

Add Custom Metrics to Config

You can enter multiple custom metrics in the file. For each element, you need to specify the metric and its type. The Grafana plugin can automatically generate expr for visualization. Alternatively, you can provide expr and the plugin will use the specified one directly.

---
customMetrics:
  - metric: memcached_operator_reconcile_total # Raw custom metric (required)
    type: counter # Metric type: counter/gauge/histogram (required)
    unit: none
  - metric: memcached_operator_reconcile_time_seconds_bucket
    type: histogram
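
Since metric and type are required for every entry, a quick pre-check before re-running the plugin can catch mistakes early. The sketch below is not the plugin's own validation; it is a hypothetical helper that operates on entries already parsed into Python dicts (for example by a YAML library of your choice), and the names validate_custom_metrics and ALLOWED_TYPES are assumptions made for this illustration.

```python
# Metric types listed in the config template comment.
ALLOWED_TYPES = {"counter", "gauge", "histogram"}


def validate_custom_metrics(entries):
    """Return a list of human-readable problems; empty if all entries look fine."""
    problems = []
    for index, entry in enumerate(entries):
        if not entry.get("metric"):
            problems.append(f"entry {index}: 'metric' is required")
        metric_type = entry.get("type")
        if metric_type not in ALLOWED_TYPES:
            problems.append(
                f"entry {index}: 'type' must be one of "
                f"{sorted(ALLOWED_TYPES)}, got {metric_type!r}"
            )
    return problems


# The entries below mirror the customMetrics list from the YAML above.
custom_metrics = [
    {"metric": "memcached_operator_reconcile_total", "type": "counter", "unit": "none"},
    {"metric": "memcached_operator_reconcile_time_seconds_bucket", "type": "histogram"},
]

for problem in validate_custom_metrics(custom_metrics):
    print(problem)
```

The optional expr and unit fields are deliberately left unchecked here, because the source only marks metric and type as required.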

Scaffold Manifest

Once config.yaml is configured, run kubebuilder edit --plugins grafana.kubebuilder.io/v1-alpha again. This time, the plugin generates grafana/custom-metrics/custom-metrics-dashboard.json, which can be imported into the Grafana UI.

Show case:

See an example of how to visualize your custom metrics:

[Image: output2]

Subcommands

The Grafana plugin implements the following subcommands:

  • edit ($ kubebuilder edit [OPTIONS])

  • init ($ kubebuilder init [OPTIONS])

Affected files

The following scaffolds will be created or updated by this plugin:

  • grafana/*.json

Further resources