Software deployment
Role                        Software            Time sync configured
Central management server   Prometheus          Yes
Distributed probe node      blackbox_exporter   Yes
Part 1: Install Prometheus on the central management server
1. Download Prometheus, extract it, and rename the directory
wget https://github.com/prometheus/prometheus/releases/download/v2.15.2/prometheus-2.15.2.linux-amd64.tar.gz
tar xf prometheus-2.15.2.linux-amd64.tar.gz -C /root/
rm -f prometheus-2.15.2.linux-amd64.tar.gz
mv /root/prometheus-2.15.2.linux-amd64 /root/prometheus
2. Edit the Prometheus configuration file
cat /root/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s  # evaluate rules every 15 seconds (default: 1 minute)
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]
# Prometheus rule files; glob patterns are supported
rule_files:
  - "./rules/*.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']  # Prometheus scrapes itself
  # Add the following two blackbox jobs to run ICMP probes
  - job_name: 'DL_Baoding_BaiduCloud'
    scrape_interval: 6s
    metrics_path: /probe
    params:
      module: [icmp]
    #static_configs:
    file_sd_configs:
      - files:
          - /root/prometheus/targets/targets_DL_DC.yml
        refresh_interval: 1m
      - files:
          - /root/prometheus/targets/targets_HW_DC.yml
        refresh_interval: 1m
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 123.123.123.123:9115  # the blackbox exporter's real hostname:port
  - job_name: 'HW_BWG_LosAngeles'
    scrape_interval: 6s
    metrics_path: /probe
    params:
      module: [icmp]
    #static_configs:
    file_sd_configs:
      - files:
          - /root/prometheus/targets/targets_HW_DC.yml
        refresh_interval: 1m
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 234.234.234.234:9115  # the blackbox exporter's real hostname:port
  # TCP monitoring. The probe node may be the same one as above, but job_name must be unique.
  # The metric to query for probe latency is "probe_duration_seconds".
  # - job_name: 'DL_Guangzhou_Mobile_TCP'
  #   scrape_interval: 1s  # probe every second
  #   metrics_path: /probe
  #   params:
  #     module: [tcp_connect]
  #   file_sd_configs:
  #     - files:
  #         - /root/prometheus/targets/targets_MLS_LA_TCP.yml
  #       refresh_interval: 1m
  #   relabel_configs:
  #     - source_labels: [__address__]
  #       target_label: __param_target
  #     - source_labels: [__param_target]
  #       target_label: instance
  #     - target_label: __address__
  #       replacement: 123.123.123.123:9115  # the blackbox exporter's real hostname:port
  # HTTP monitoring. The probe node may be the same one as above, but job_name must be unique.
  # The metric to query for probe latency is "probe_duration_seconds".
  # - job_name: 'DL_Shaanxi_Unicom_HTTP'
  #   scrape_interval: 6s  # probe every 6 seconds
  #   metrics_path: /probe
  #   params:
  #     module: [http_2xx]
  #   file_sd_configs:
  #     - files:
  #         - /root/prometheus/targets/targets_HTTP.yml
  #       refresh_interval: 1m
  #   relabel_configs:
  #     - source_labels: [__address__]
  #       target_label: __param_target
  #     - source_labels: [__param_target]
  #       target_label: instance
  #     - target_label: __address__
  #       replacement: 123.123.123.123:9115
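The relabel_configs in each blackbox job turn every target into a request against the exporter's /probe endpoint: the target's address is copied into the target URL parameter and into the instance label, and the scrape address is replaced with the exporter's own address. A minimal sketch of the URL that results is below. The probe_url helper is hypothetical and exists only for illustration; the addresses are the placeholders used in this article.

```shell
# Build the URL Prometheus ends up scraping after relabeling.
# $1 = blackbox exporter host:port, $2 = module, $3 = probe target
# (probe_url is a hypothetical helper, not part of Prometheus or blackbox_exporter)
probe_url() {
  printf 'http://%s/probe?module=%s&target=%s\n' "$1" "$2" "$3"
}

# Example with the placeholder exporter address and one target from this article
probe_url 123.123.123.123:9115 icmp 11.96.11.126
```

Once the exporter is running, the same URL can be fetched by hand, for example with curl -s "$(probe_url 123.123.123.123:9115 icmp 11.96.11.126)" | grep probe_success, to see what Prometheus will receive. A value of 1 for probe_success means the probe succeeded.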
3. Write the targets files
# Mainland China data centers
cat /root/prometheus/targets/targets_DL_DC.yml
#######################################
######  Domestic data centers  ########
#######################################
- targets:
    - 11.96.11.126
  labels:
    datacenter: 'Jiangmen, Guangdong'
    ISP: 'Jiangmen Telecom'
    other: 'vendor equipment'
- targets:
    - 40.27.68.170
    - 13.24.16.227
    - 40.24.58.1
    - 40.24.58.99
  labels:
    datacenter: 'Shenzhen Yixin'
    ISP: 'Yixin BGP'
    other: 'Yixin new 67-vlanif1500-interconnect vlan3040'
===========================
# Overseas data centers
cat /root/prometheus/targets/targets_HW_DC.yml
#######################################
######  Overseas data centers  ########
#######################################
####### US West (Washington) data center #######
- targets:
    - 23.52.174.129
    - 23.52.173.1
  labels:
    datacenter: 'IKG-Washington'
####### Taiwan data center #######
- targets:
    - 15.23.61.1
    - 45.15.139.1
    - 154.22.62.1
  labels:
    datacenter: 'IKG-Taiwan'
    ISP: 'international bandwidth'
- targets:
    - 5.82.5.129
  labels:
    datacenter: 'IKG-Taiwan'
    ISP: 'optimized bandwidth'
    other: 'does not cross the firewall'
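Each entry in a targets file is one group: a list of addresses plus the labels attached to every address in that list. Prometheus re-reads file_sd files when they change, so adding a group does not require a restart. As a rough sketch, a small shell helper can print a new group in the same layout. The target_group name, the label values, and the 192.0.2.x addresses are all made up for this example.

```shell
# Print one file_sd target group as YAML.
# Usage: target_group <datacenter> <ISP> <ip> [<ip> ...]
# (hypothetical helper for illustration; not part of Prometheus)
target_group() {
  dc=$1
  isp=$2
  shift 2
  printf '%s\n' '- targets:'
  for ip in "$@"; do
    printf '    - %s\n' "$ip"
  done
  printf '  labels:\n'
  printf "    datacenter: '%s'\n" "$dc"
  printf "    ISP: '%s'\n" "$isp"
}

# Example: print a group with two documentation-range addresses
target_group 'example-dc' 'example-isp' 192.0.2.10 192.0.2.11
```

To add the group to an existing file, append the output, for example target_group ... >> /root/prometheus/targets/targets_DL_DC.yml.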
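4. Test-run Prometheus; once started, it listens on port 9090
# Check that the Prometheus configuration file is valid (promtool ships in the same tarball)
/root/prometheus/promtool check config /root/prometheus/prometheus.yml
/root/prometheus/prometheus --config.file=/root/prometheus/prometheus.yml
# Open this URL in a browser to confirm the server is up
http://xx.xxx.com:9090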
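prometheus.yml loads rule files from ./rules/*.yml, but this article does not show one. The following is a minimal sketch of such a file, not the author's actual rules: the alert name ProbeFailed, the 1m hold time, and the severity label are assumptions. It relies only on the probe_success metric that blackbox_exporter exposes for every probe.

```shell
# Assumption: run from /root/prometheus so that ./rules matches rule_files.
RULES_DIR="${RULES_DIR:-./rules}"
mkdir -p "$RULES_DIR"

# The quoted 'EOF' keeps the shell from expanding {{ $labels... }}.
cat > "$RULES_DIR/blackbox.yml" <<'EOF'
groups:
  - name: blackbox
    rules:
      - alert: ProbeFailed            # hypothetical alert name
        expr: probe_success == 0      # the probe did not succeed
        for: 1m                       # assumed hold time before firing
        labels:
          severity: critical
        annotations:
          summary: "Probe of {{ $labels.instance }} failed (job {{ $labels.job }})"
EOF

# Validate the rule file if promtool is where the tarball put it.
PROMTOOL="${PROMTOOL:-/root/prometheus/promtool}"
if [ -x "$PROMTOOL" ]; then
  "$PROMTOOL" check rules "$RULES_DIR/blackbox.yml"
fi
```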
5. Register Prometheus as a systemd service and enable it at boot
cat /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
After=network-online.target
[Service]
User=root
Group=root
ExecStart=/root/prometheus/prometheus --config.file=/root/prometheus/prometheus.yml --storage.tsdb.retention.time=90d --web.enable-admin-api --web.enable-lifecycle --web.external-url=http://xx.xxx.com
Restart=on-abort
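# --storage.tsdb.retention.time=90d sets how long Prometheus keeps its data
# --web.enable-admin-api enables the admin API, which is required to delete data through the API
# --web.enable-lifecycle enables config reloads via curl -X POST http://localhost:9090/-/reload
# --web.external-url=http://xx.xxx.com is used to build the .GeneratorURL value in pushed alerts, so alerts link straight back to Prometheus
# --storage.tsdb.path sets where Prometheus writes its database; the default is data/, relative to the working directory
# Note: this unit sets no WorkingDirectory, so systemd starts the process in / and the data ends up in /data. A write-ahead log (WAL) protects the data against crashes.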
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start prometheus.service
systemctl enable prometheus.service
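The --web.enable-lifecycle and --web.enable-admin-api flags in the unit file only open the endpoints; the calls themselves are plain HTTP POSTs. The sketch below wraps them in shell functions. The function names and the PROM_URL variable are hypothetical, and the job name passed to delete_job_series is assumed to be plain ASCII (other names would need URL-encoding).

```shell
# Base URL of the Prometheus server (assumed to be reachable locally).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Reload prometheus.yml without a restart (requires --web.enable-lifecycle).
reload_config() {
  curl -fsS -X POST "$PROM_URL/-/reload"
}

# Delete all series of one job (requires --web.enable-admin-api).
# $1 = job name; -g stops curl from interpreting [] and {} in the URL.
delete_job_series() {
  curl -fsS -g -X POST \
    "$PROM_URL/api/v1/admin/tsdb/delete_series?match[]={job=\"$1\"}"
}

# Remove the deleted data from disk (requires --web.enable-admin-api).
clean_tombstones() {
  curl -fsS -X POST "$PROM_URL/api/v1/admin/tsdb/clean_tombstones"
}

# Nothing is called here on purpose; run e.g. `reload_config` after editing prometheus.yml.
```

Deleting series only marks them as deleted; disk space is reclaimed at the next compaction or when clean_tombstones is called.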
Part 2: Deploy blackbox_exporter on the distributed probe nodes
1. Download, extract, and rename
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.16.0/blackbox_exporter-0.16.0.linux-amd64.tar.gz
tar xf blackbox_exporter-0.16.0.linux-amd64.tar.gz -C /root/
rm -f blackbox_exporter-0.16.0.linux-amd64.tar.gz
mv /root/blackbox_exporter-0.16.0.linux-amd64 /root/blackbox_exporter
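2. Register the service and enable it at boot. The bundled configuration file works as-is and does not need later changes either.
# Test run
/root/blackbox_exporter/blackbox_exporter --config.file=/root/blackbox_exporter/blackbox.yml
# Open this URL in a browser to confirm it works; the IP is the probe node's
http://175.10.134.13:9115/metrics
# Write the service file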
cat /etc/systemd/system/blackbox_exporter.service
[Unit]
Description=Blackbox Exporter Server
After=network-online.target
[Service]
User=root
Group=root
ExecStart=/root/blackbox_exporter/blackbox_exporter --config.file=/root/blackbox_exporter/blackbox.yml
Restart=on-abort
[Install]
WantedBy=multi-user.target
# Start the service and enable it at boot
systemctl daemon-reload
systemctl start blackbox_exporter
systemctl enable blackbox_exporter
3. Restrict access on the probe nodes so that only the central management server's IP can reach port 9115
iptables -A INPUT -s 13.1.7.9/32 -p tcp --dport 9115 -j ACCEPT
iptables -A INPUT -p tcp --dport 9115 -j DROP
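Because -A appends, running the two commands a second time adds duplicate rules, and the ACCEPT rule only works if it stays ahead of the DROP rule. One way to make the step repeatable is to check for each rule with -C before appending it. This is a sketch under that assumption; the allow_only helper is hypothetical, and the address and port are the ones used above.

```shell
# Allow one source to reach a TCP port and drop everyone else, without
# adding duplicate rules on repeated runs.
# $1 = allowed source CIDR, $2 = TCP port  (hypothetical helper)
allow_only() {
  # -C checks whether an identical rule already exists
  iptables -C INPUT -s "$1" -p tcp --dport "$2" -j ACCEPT 2>/dev/null ||
    iptables -A INPUT -s "$1" -p tcp --dport "$2" -j ACCEPT
  iptables -C INPUT -p tcp --dport "$2" -j DROP 2>/dev/null ||
    iptables -A INPUT -p tcp --dport "$2" -j DROP
}

# Run as root on each probe node:
#   allow_only 13.1.7.9/32 9115
#   iptables -L INPUT -n --line-numbers   # confirm ACCEPT is listed before DROP
```

Rules added this way are lost on reboot; how to persist them depends on the distribution (for example via iptables-save).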