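Node auto-discovery in Prometheus
I. After a standard Prometheus installation, the default static configuration is fairly limiting, because every target has to be listed in prometheus.yml itself. Prometheus provides a number of service discovery mechanisms; this post records the file-based (path) one.
In the same directory as prometheus.yml, create a folder named node to hold the configuration of the nodes you want to scrape later.
The YAML configuration uses a wildcard that points at the node*.yml files in the node directory.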
1. The conventions for node files are as follows:
1.1. File names start with node and end with .yml
For example: node_serviceFind.yml
1.2. Edit the node_serviceFind.yml file
Example:
- targets:
    - "192.168.56.234:9098"
  labels:
    appName: localhost
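Target files that follow this convention can also be generated from a script. The following is a minimal sketch, not part of Prometheus: write_node_target is a hypothetical helper name, and its arguments (directory, short name, host:port, appName label) are assumptions made for illustration. It writes to a temporary file first and then renames it, so a reader never sees a half-written file.

```shell
# Hypothetical helper (not part of Prometheus): writes one target file
# named node_<name>.yml in the given directory.
# Usage: write_node_target <dir> <name> <host:port> <appName>
write_node_target() {
    dir=$1; name=$2; target=$3; app=$4
    mkdir -p "$dir" || return 1
    # The temp name does not match node*.yml, so the wildcard ignores it.
    tmp="$dir/.node_${name}.yml.tmp"
    cat > "$tmp" <<EOF
- targets:
    - "$target"
  labels:
    appName: $app
EOF
    # Rename into place once the content is complete.
    mv "$tmp" "$dir/node_${name}.yml"
}
```

For example, write_node_target node serviceFind "192.168.56.234:9098" localhost would produce node/node_serviceFind.yml with the content shown above.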
Complete prometheus.yml for reference:
# my global config
global:
  scrape_interval: 3s # Scrape targets every 3 seconds. The default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is not set explicitly.

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations. The default job that scrapes Prometheus itself is commented out.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # static_configs:
    #   - targets: ["localhost:9099"]
  - job_name: "nodeTest"
    file_sd_configs:
      - files: ['node/node*.yml']
        refresh_interval: 5m
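Before starting Prometheus it can help to see which files the node/node*.yml pattern actually matches. A rough sketch, assuming the node folder sits next to prometheus.yml as described earlier; list_sd_files is a hypothetical helper name, not a Prometheus command.

```shell
# Hypothetical helper: prints the target files matching node/node*.yml
# under the given config directory (default: current directory).
list_sd_files() {
    conf_dir=${1:-.}
    for f in "$conf_dir"/node/node*.yml; do
        # When nothing matches, the shell leaves the pattern unexpanded;
        # the -f test skips that case.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}
```

Running list_sd_files . from the Prometheus directory lists the files the nodeTest job would read. The promtool binary that ships alongside prometheus can then validate the whole configuration with ./promtool check config prometheus.yml.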
1.3. Edit the start script start.sh, specifying the listen port and the data retention time
`nohup ./prometheus --web.listen-address=":9099" --storage.tsdb.retention.time=30d &`
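Because the command above leaves no record of the process ID, anything that later wants to stop the process has to search the process table by name. One alternative, sketched here under the assumption that keeping a PID file and a log file next to the binary is acceptable, records the ID at start time. start_bg and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: start_bg <pidfile> <logfile> <command> [args...]
# Starts the command in the background under nohup, appends its output
# to the log file, and records its process ID in the PID file.
start_bg() {
    pidfile=$1; logfile=$2
    shift 2
    nohup "$@" >> "$logfile" 2>&1 &
    # $! is the PID of the background job that was just started.
    echo $! > "$pidfile"
}
```

With this helper, start.sh could call start_bg prometheus.pid prometheus.log ./prometheus --web.listen-address=":9099" --storage.tsdb.retention.time=30d, and the process ID can later be read back with cat prometheus.pid.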
1.4. Edit the stop script stop.sh
pids=$(ps -ef | grep prometheus | grep -v grep | awk '{print $2}')
for pid in $pids
do
    echo "Kill process prometheus $pid..."
    kill -9 "$pid"
    rlt=$?
    if [ $rlt -eq 0 ]; then
        echo "Success!!"
    else
        echo "Failed!!!"
    fi
done
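Note that kill -9 sends SIGKILL, which gives the process no chance to shut down cleanly. Prometheus shuts down gracefully on SIGTERM, so a gentler variant is to send SIGTERM first and fall back to SIGKILL only if the process is still there after a timeout. A minimal sketch; stop_gracefully and the 10-second default are assumptions made for illustration.

```shell
# Hypothetical helper: stop_gracefully <pid> [timeout_seconds]
# Sends SIGTERM, waits up to the timeout, then falls back to SIGKILL.
stop_gracefully() {
    pid=$1
    timeout=${2:-10}
    # Plain kill sends SIGTERM; it fails if there is no such process.
    kill "$pid" 2>/dev/null || return 1
    while [ "$timeout" -gt 0 ]; do
        # kill -0 sends no signal, it only checks that the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    # Still present after the timeout: force it.
    kill -9 "$pid" 2>/dev/null
    return 0
}
```

Inside the loop of stop.sh, the kill -9 line could then be replaced by stop_gracefully "$pid", keeping the same success and failure messages.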
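II. After uploading the exporter (node_exporter) to a node, I also wrote start and stop scripts for it so they are easy to reuse
Example:
File name: start.sh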
nohup ./node_exporter --web.listen-address=":9098" &
File name: stop.sh
pids=$(ps -ef | grep node_exporter | grep -v grep | awk '{print $2}')
for pid in $pids
do
    echo "Kill process node_exporter $pid..."
    kill -9 "$pid"
    rlt=$?
    if [ $rlt -eq 0 ]; then
        echo "Success!!"
    else
        echo "Failed!!!"
    fi
done