2023-11-16    2023-12-28    487 words    1 min read

Installing process-exporter

Script location: click to download.

Download and install.

cd /usr/local/src
wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.10/process-exporter-0.7.10.linux-amd64.tar.gz
tar xf process-exporter-0.7.10.linux-amd64.tar.gz -C /usr/local/
ln -s /usr/local/process-exporter-0.7.10.linux-amd64/ /usr/local/process-exporter
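When upgrading later, it can help to derive the download URL from a version variable instead of editing it by hand. A minimal sketch, assuming the release assets keep the naming pattern used above; the helper name `pe_url` and its `arch` parameter are hypothetical:

```shell
# Hypothetical helper: build the release URL for a given version.
# $1: release version without the leading "v"
# $2: architecture suffix (assumed to default to amd64, as in the command above)
pe_url() {
  ver="$1"
  arch="${2:-amd64}"
  printf 'https://github.com/ncabatoff/process-exporter/releases/download/v%s/process-exporter-%s.linux-%s.tar.gz\n' \
    "$ver" "$ver" "$arch"
}

# Usage (not run here):
#   cd /usr/local/src && wget "$(pe_url 0.7.10)"
```

Because `/usr/local/process-exporter` is a symlink, an upgrade only needs the new tarball unpacked and the link repointed.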

Configure monitoring for all processes.

cat > /usr/local/process-exporter/process-all.yaml << EOF
process_names:
  - name: "{{.Comm}}"
    cmdline:
    - '.+'
EOF

To monitor only specific processes, use the following instead:

cat > /usr/local/process-exporter/process-all.yaml << EOF
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'nginx'
  - name: "{{.Matches}}"
    cmdline:
    - 'sshd'
EOF
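If the list of processes grows, the entries can be generated rather than typed. A rough sketch, assuming a literal group name per process is acceptable (the config above uses the `{{.Matches}}` template instead); the helper name `gen_process_config` is hypothetical:

```shell
# Hypothetical helper: emit a process-exporter config with one entry per
# argument. Each argument is used both as the literal group name and as the
# cmdline regex.
gen_process_config() {
  echo "process_names:"
  for p in "$@"; do
    printf '  - name: "%s"\n' "$p"
    printf '    cmdline:\n'
    printf "    - '%s'\n" "$p"
  done
}

# Usage (not run here):
#   gen_process_config nginx sshd > /usr/local/process-exporter/process-all.yaml
```

Since the cmdline entry is a regular expression, a short name such as `nginx` also matches any other command line that contains it.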

The service runs as the ops user, so create the user and grant it ownership first.

useradd ops
chown -R ops:ops /usr/local/process-exporter*

Starting the service

systemd unit file.

cat > /etc/systemd/system/process_exporter.service << \EOF
[Unit]
Description=Process_exporter daemon
After=network.target

[Service]
ExecStart=/usr/local/process-exporter/process-exporter -config.path /usr/local/process-exporter/process-all.yaml
User=ops
Group=ops
PrivateTmp=True

[Install]
WantedBy=multi-user.target
EOF

Start

systemctl daemon-reload
systemctl enable process_exporter.service
systemctl start process_exporter.service
systemctl status process_exporter.service
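Once the service is up, a quick way to confirm the config matched what you expected is to list the group names the exporter reports. A small sketch, assuming the exporter listens on its default port 9256 and exposes the `namedprocess_namegroup_num_procs` metric; the helper name `list_groups` is hypothetical:

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and print
# the distinct groupname label values found on namedprocess_namegroup_num_procs.
list_groups() {
  grep '^namedprocess_namegroup_num_procs{' \
    | sed -E 's/.*groupname="([^"]*)".*/\1/' \
    | sort -u
}

# Usage (not run here):
#   curl -s http://localhost:9256/metrics | list_groups
```

An empty result usually means no process matched any `cmdline` pattern in the config.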

Prometheus

Register with Consul

pip=$(ip a | grep eth0 | grep inet | awk '{print $2}' | cut -d "/" -f 1)
hn=$(hostname)
curl -s -X PUT \
  -d '{"id":"'"${hn}-process"'","name":"'"${hn}"'","address":"'"${pip}"'","port":9256,"tags":["hosts-process"],"checks":[{"http":"http://'"${pip}"':9256/metrics","interval":"15s"}]}' \
  http://10.66.21.15:8500/v1/agent/service/register
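The nested quoting in the registration command is easy to get wrong, so it may be clearer to build the JSON body in a function and inspect it before sending. A sketch under the same field layout as above; the helper name `consul_payload` and its argument order are hypothetical:

```shell
# Hypothetical helper: print the Consul service-registration JSON body.
# $1: hostname, $2: IP address, $3: exporter port (assumed default 9256)
consul_payload() {
  hn="$1"
  ip="$2"
  port="${3:-9256}"
  printf '{"id":"%s-process","name":"%s","address":"%s","port":%s,"tags":["hosts-process"],"checks":[{"http":"http://%s:%s/metrics","interval":"15s"}]}\n' \
    "$hn" "$hn" "$ip" "$port" "$ip" "$port"
}

# Usage (not run here):
#   consul_payload "$(hostname)" "$pip" \
#     | curl -s -X PUT -d @- http://10.66.21.15:8500/v1/agent/service/register
# To remove the entry again, Consul's agent API offers:
#   curl -s -X PUT http://10.66.21.15:8500/v1/agent/service/deregister/<service-id>
```

This sketch does no JSON escaping, so it assumes the hostname and IP contain no quotes or backslashes.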

To register hosts in bulk with Ansible, see the playbook.

Prometheus service discovery via Consul

- job_name: consul-process
  consul_sd_configs:
    - server: "10.66.2.152:8500"
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      # Keep only hosts from Consul whose tags match hosts-xxxxxx
      regex: .*hosts-process.*
      action: keep
    - source_labels:
        - __meta_consul_service
        - __meta_consul_service_address
      separator: "::"
      target_label: "sd_instance_name"
    - source_labels: [__meta_consul_dc]
      target_label: "dc"
    - source_labels: [__meta_consul_tags]
      target_label: "sd_tag"
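The `.*` on both sides of the `keep` regex matters because Prometheus anchors relabel regexes to the whole label value, and `__meta_consul_tags` holds all tags joined by commas (with a leading and trailing comma by default). The sketch below imitates that full-match behavior with `grep -x`; it is only an approximation, since Prometheus uses RE2 rather than grep's ERE, and the helper name `relabel_keep` is hypothetical:

```shell
# Hypothetical helper: succeed if the regex matches the WHOLE value, which is
# how a Prometheus "keep" relabel rule decides whether a target survives.
# $1: regex, $2: label value
relabel_keep() {
  regex="$1"
  value="$2"
  printf '%s\n' "$value" | grep -Eqx -- "$regex"
}

# Examples (each prints one line):
relabel_keep '.*hosts-process.*' ',hosts-process,prod,' && echo "kept"
relabel_keep 'hosts-process'     ',hosts-process,prod,' || echo "dropped: regex is anchored"
```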

Grafana

Grafana dashboard 8378: click to open the official page.

The CPU panels in this dashboard do not render correctly. Make the following changes:

Change the query of "Top processes by Total CPU cores used" to:

topk(5,(rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="user"}[$interval]) 
+
rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[$interval]))
or 
(irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="user"}[5m])
+
irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[5m])))

Change the query of "Top processes by System CPU cores used" to:

topk(5,
rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[$interval])
or 
(
irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[5m])))

The final result looks like this:

image-20231117113011494


image-20231028232834657