2023-11-16    2023-12-28    487 words  1 min read

Installing process-exporter

Script location: click to download.

Download and install.

shell
cd /usr/local/src
wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.10/process-exporter-0.7.10.linux-amd64.tar.gz
tar xf process-exporter-0.7.10.linux-amd64.tar.gz -C /usr/local/
ln -s /usr/local/process-exporter-0.7.10.linux-amd64/ /usr/local/process-exporter

Configure it to monitor all processes.

shell
cat > /usr/local/process-exporter/process-all.yaml << EOF
process_names:
  - name: "{{.Comm}}"
    cmdline:
    - '.+'
EOF

To monitor only specific processes, write the config as follows:

shell
cat > /usr/local/process-exporter/process-all.yaml << EOF
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'nginx'
  - name: "{{.Matches}}"
    cmdline:
    - 'sshd'
EOF

The service runs as the ops user, so create the user and give it ownership of the install directory first.

shell
useradd ops
chown -R ops: /usr/local/process-exporter*

Starting the service

systemd unit file.

shell
cat > /etc/systemd/system/process_exporter.service << \EOF
[Unit]
Description=Process_exporter daemon
After=network.target

[Service]
ExecStart=/usr/local/process-exporter/process-exporter -config.path /usr/local/process-exporter/process-all.yaml
User=ops
Group=ops
PrivateTmp=True

[Install]
WantedBy=multi-user.target
EOF

Start the service

shell
systemctl daemon-reload
systemctl enable process_exporter.service
systemctl start process_exporter.service
systemctl status process_exporter.service
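
Once the service is running, it can help to confirm that the exporter is reporting the process groups you expect. The following is a minimal sketch, not part of process-exporter: `list_groupnames` is a hypothetical helper that reads metrics text on stdin and prints the distinct `groupname` labels found in `namedprocess_namegroup_num_procs` samples. It assumes the exporter listens on port 9256, the port used in the Consul registration in this post.

```shell
# Hypothetical helper (an assumption for illustration, not shipped with
# process-exporter): print the distinct groupname labels found in
# namedprocess_namegroup_num_procs samples read from stdin.
list_groupnames() {
  grep '^namedprocess_namegroup_num_procs{' \
    | sed -n 's/.*groupname="\([^"]*\)".*/\1/p' \
    | sort -u
}
```

A possible invocation on the monitored host: `curl -s http://localhost:9256/metrics | list_groupnames`. An empty result would suggest that no process matched the `cmdline` patterns in process-all.yaml.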

Prometheus

Register with Consul

shell
pip=$(ip a|grep eth0|grep inet|awk '{print $2}'|cut -d "/" -f 1);hn=$(hostname);curl -s -X PUT -d '{"id":"'"${hn}-process"'","name":"'"${hn}"'","address":"'"${pip}"'","port":9256,"tags":["hosts-process"],"checks":[{"http":"http://'"${pip}"':9256/metrics","interval":"15s"}]}' http://10.66.21.15:8500/v1/agent/service/register
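
The one-liner above packs the whole JSON payload into nested quotes, which is easy to break when editing. As a more readable sketch of the same payload, the hypothetical helper `build_payload` below takes the hostname and IP address as arguments and prints the JSON. The port 9256, the `hosts-process` tag, and the 15s check interval are copied from the one-liner; the function name and argument order are assumptions made for illustration.

```shell
# Hypothetical helper: print the Consul service registration payload
# for one host. $1 = hostname, $2 = IP address.
build_payload() {
  hn=$1
  pip=$2
  cat <<EOF
{
  "id": "${hn}-process",
  "name": "${hn}",
  "address": "${pip}",
  "port": 9256,
  "tags": ["hosts-process"],
  "checks": [
    {"http": "http://${pip}:9256/metrics", "interval": "15s"}
  ]
}
EOF
}
```

With `pip` computed as in the one-liner, the payload could then be sent with `build_payload "$(hostname)" "$pip" | curl -s -X PUT --data-binary @- http://10.66.21.15:8500/v1/agent/service/register`. Printing the payload first makes it easy to inspect before it reaches Consul.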

To register hosts in bulk with Ansible, see the playbook.

Prometheus service discovery via Consul

yaml
- job_name: consul-process
  consul_sd_configs:
    - server: "10.66.2.152:8500"
  relabel_configs:
    - source_labels: [__meta_consul_tags]
      # keep only the hosts tagged hosts-xxxxxx in Consul
      regex: .*hosts-process.*
      action: keep
    - source_labels:
        - __meta_consul_service
        - __meta_consul_service_address
      separator: "::"
      target_label: "sd_instance_name"
    - source_labels: [__meta_consul_dc]
      target_label: "dc"
    - source_labels: [__meta_consul_tags]
      target_label: "sd_tag"
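
The second relabel rule has no explicit `action`, so Prometheus applies the default `replace`: it joins the values of the listed source labels with `separator` and writes the result to `target_label`. Because the Consul service name registered earlier is the hostname, `sd_instance_name` ends up as hostname and address joined by `::`. The sketch below only imitates that join in plain shell to show the resulting label value. `relabel_join` is a hypothetical helper, and `web01` and `10.0.0.5` are made-up sample values.

```shell
# Illustration only, not Prometheus code: join label values with a
# separator, the way relabeling concatenates source_labels.
# $1 = separator, remaining arguments = label values in order.
relabel_join() {
  sep=$1
  shift
  out=$1
  shift
  for v in "$@"; do
    out="${out}${sep}${v}"
  done
  printf '%s\n' "$out"
}

relabel_join "::" web01 10.0.0.5   # prints web01::10.0.0.5
```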

Grafana

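Grafana dashboard 8378 (click to open the official page).

The CPU panels on this dashboard do not render correctly. Make the following changes:

Change the query for "Top processes by Total CPU cores used" to: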

shell
topk(5,(rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="user"}[$interval])
+
rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[$interval]))
or
(irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="user"}[5m])
+
irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[5m])))

Change the query for "Top processes by System CPU cores used" to:

shell
topk(5,
  rate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[$interval])
  or
  (irate(namedprocess_namegroup_cpu_seconds_total{groupname=~"$processes",instance=~"$host",mode="system"}[5m]))
)

The final result looks like this:

image-20231117113011494


image-20231028232834657