Prometheus node auto-discovery


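I. After a standard Prometheus installation, the default static configuration is limiting. Prometheus also provides a number of service discovery mechanisms; this post records the file-based (path) discovery method.

In the same directory as prometheus.yml, create a folder named node to hold the configuration of the nodes that will be scraped later.

The yml configuration uses a wildcard to pick up the .yml files in the node directory (node/node*.yml in the full configuration).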

1. The conventions for node files are as follows:

1.1. File names start with node and end with .yml
For example: node_serviceFind.yml
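
The naming rule can be checked before a file is dropped into the node directory. This is a minimal sketch, assuming the pattern node*.yml from the configuration; is_node_target_file is a hypothetical helper name, and the shell case pattern only approximates how Prometheus expands the glob.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file name would be matched
# by the pattern node*.yml (the directory part is ignored).
is_node_target_file() {
    case "$(basename "$1")" in
        node*.yml) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example:
#   is_node_target_file node/node_serviceFind.yml && echo "will be picked up"
```

A file such as node/other.yml or node/node_serviceFind.yaml fails this check, which is a hint that the scrape job would not see it.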

1.2. Edit the node_serviceFind.yml file
Example:

- targets:
  - "192.168.56.234:9098"
  labels:
    appName: localhost
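
Prometheus watches the files matched by file_sd_configs and applies changes without a restart, so new nodes can be registered by writing one more file. The sketch below generates a target file in the same layout as the example above. write_node_target and its arguments are hypothetical names, and writing to a temporary file first is an assumption made here so that Prometheus never reads a half-written file.

```shell
#!/bin/sh
# Hypothetical helper: write_node_target DIR NAME HOST:PORT APPNAME
# Creates DIR/node_NAME.yml containing a single target with an appName label.
write_node_target() {
    dir=$1
    name=$2
    target=$3
    app=$4
    mkdir -p "$dir" || return 1
    # The temporary name starts with a dot, so it does not match node*.yml.
    tmp="$dir/.node_${name}.yml.tmp"
    cat >"$tmp" <<EOF || return 1
- targets:
  - "$target"
  labels:
    appName: $app
EOF
    # Rename into place once the content is complete.
    mv "$tmp" "$dir/node_${name}.yml"
}

# Example:
#   write_node_target node serviceFind "192.168.56.234:9098" localhost
```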

Complete prometheus.yml for reference:

# my global config
global:
  scrape_interval: 3s # Set the scrape interval to every 3 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations. The default job that scrapes Prometheus itself
# is commented out here.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # - job_name: "prometheus"
  #   # metrics_path defaults to '/metrics'
  #   # scheme defaults to 'http'.
  #   static_configs:
  #     - targets: ["localhost:9099"]
  - job_name: "nodeTest"
    file_sd_configs:
      - files: ['node/node*.yml']
        refresh_interval: 5m
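
Before starting Prometheus with this file, the syntax can be validated with promtool, which ships in the Prometheus release archive next to the prometheus binary. This is a minimal sketch, assuming promtool is either in the current directory or on PATH; check_config is a hypothetical wrapper name.

```shell
#!/bin/sh
# Hypothetical wrapper: check_config [FILE]
# Runs "promtool check config" on FILE (default: prometheus.yml).
# Returns 2 if promtool cannot be found, otherwise promtool's exit status.
check_config() {
    config=${1:-prometheus.yml}
    if [ -x ./promtool ]; then
        tool=./promtool
    elif command -v promtool >/dev/null 2>&1; then
        tool=promtool
    else
        echo "promtool not found; cannot check $config" >&2
        return 2
    fi
    "$tool" check config "$config"
}

# Example:
#   check_config prometheus.yml && echo "configuration looks valid"
```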

1.3. Edit the start script start.sh, specifying the listen port and the data retention time

`nohup ./prometheus --web.listen-address=":9099" --storage.tsdb.retention.time=30d &`
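
A variation on the start command is to record the process ID at launch, so that a stop script can target exactly that process instead of searching the process list. This is a sketch under stated assumptions: start_with_pidfile, the .pid file, and the .log file next to it are hypothetical names that are not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: start_with_pidfile PIDFILE COMMAND [ARGS...]
# Starts COMMAND in the background, detached from the terminal,
# and writes its PID to PIDFILE.
start_with_pidfile() {
    pidfile=$1
    shift
    # Append output to a log file named after the PID file.
    nohup "$@" >>"${pidfile%.pid}.log" 2>&1 &
    echo "$!" >"$pidfile"
}

# Example:
#   start_with_pidfile prometheus.pid ./prometheus \
#       --web.listen-address=":9099" --storage.tsdb.retention.time=30d
```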

1.4. Edit the stop script stop.sh

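# Collect the PIDs of every process whose command line contains "prometheus".
# Note: this also matches unrelated commands that merely mention prometheus.
pids=$(ps -ef | grep prometheus | grep -v grep | awk '{print $2}')
for pid in $pids
do
   echo "Kill process prometheus $pid..."
   # kill -9 (SIGKILL) ends the process immediately, without a graceful shutdown.
   kill -9 "$pid"
   rlt=$?
   if [ "$rlt" -eq 0 ]; then
      echo "Success!!"
   else
      echo "Failed!!!"
   fi
done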
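
The script above uses kill -9, which gives Prometheus no chance to shut down cleanly. Prometheus shuts down gracefully on SIGTERM, so a gentler alternative is to send SIGTERM first and fall back to SIGKILL only if the process is still alive after a timeout. This is a minimal sketch; stop_gracefully and the 30 second default are assumptions, not part of the original scripts.

```shell
#!/bin/sh
# Hypothetical helper: stop_gracefully PID [TIMEOUT_SECONDS]
# Sends SIGTERM, waits up to TIMEOUT_SECONDS for the process to exit,
# then sends SIGKILL as a last resort.
stop_gracefully() {
    pid=$1
    timeout=${2:-30}
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process has exited
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    echo "PID $pid did not exit in time, sending SIGKILL" >&2
    kill -KILL "$pid" 2>/dev/null
}

# Example, inside the loop of stop.sh:
#   stop_gracefully "$pid" 30
```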

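II. After the exporter (node_exporter) is uploaded to the target node, start and stop scripts are written for it as well, so they can be reused
Examples:
File name: start.sh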

nohup ./node_exporter --web.listen-address=":9098" &

File name: stop.sh

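# Collect the PIDs of every process whose command line contains "node_exporter".
pids=$(ps -ef | grep node_exporter | grep -v grep | awk '{print $2}')
for pid in $pids
do
   echo "Kill process node_exporter $pid..."
   # kill -9 (SIGKILL) ends the process immediately.
   kill -9 "$pid"
   rlt=$?
   if [ "$rlt" -eq 0 ]; then
      echo "Success!!"
   else
      echo "Failed!!!"
   fi
done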
