Monitoring split-brain with Zabbix
- 1. HA for the nginx load balancers with haproxy + keepalived
- 2. Deploy nginx
- 3. Deploy haproxy
- 3.1 Deploy haproxy on master
- 3.2 Deploy haproxy on slave
- 4. Install keepalived
- 4.1 Configure keepalived
- 4.1.1 Configure the master keepalived
- 4.1.2 Configure the backup keepalived
- 4.1.3 Check which node holds the VIP
- 4.4 Tune kernel parameters so services can bind to the VIP
- 4.5 Have keepalived monitor the haproxy load balancer
- 4.6 Add the monitoring script to the keepalived configuration
- Simulate a master outage: slave keeps serving traffic
- 5. Monitoring split-brain
- 5.1 Add the host
- 5.2 Add a template
- 5.3 Add the monitoring item
- 5.4 Add a trigger
- 5.5 Send alert e-mails
1. HA for the nginx load balancers with haproxy + keepalived

Environment

| OS | Hostname | IP | Role |
|---|---|---|---|
| centos8 / redhat8 | master | 192.168.229.130 | haproxy + keepalived |
| centos8 / redhat8 | slave | 192.168.229.148 | haproxy + keepalived |
| centos8 / redhat8 | RS1 | 192.168.229.150 | nginx |
| centos8 / redhat8 | RS2 | 192.168.229.151 | nginx |

The virtual IP (VIP) for this setup is 192.168.229.250.
2. Deploy nginx
Deploy the web sites on RS1 and RS2. Both hosts already have yum repositories configured, and the firewall and SELinux are disabled.
```
[root@RS1 ~]# dnf -y install nginx
[root@RS1 ~]# echo 'RS1' > /usr/share/nginx/html/index.html
[root@RS1 ~]# systemctl start nginx
```
Access RS1
```
[root@RS2 ~]# dnf -y install nginx
[root@RS2 ~]# echo 'RS2' > /usr/share/nginx/html/index.html
[root@RS2 ~]# systemctl start nginx
```
Access RS2
3. Deploy haproxy
Deploy haproxy on both master and slave.
3.1 Deploy haproxy on master
```
# Disable the firewall and SELinux
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@master ~]# setenforce 0
[root@master ~]# sed -ri 's/^(SELINUX=).*/\1disabled/g' /etc/selinux/config

# Install build dependencies
# (if installing gcc fails, boost-devel with --allowerasing is required)
[root@master ~]# dnf -y install boost-devel --allowerasing make gcc pcre-devel bzip2-devel openssl-devel systemd-devel

# Download the source tarball
[root@master ~]# cd /usr/src/
[root@master src]# wget -O haproxy-2.6.0.tar.gz https://github.com/haproxy/haproxy/archive/refs/tags/v2.6.0.tar.gz

# Create a system user
[root@master ~]# useradd -r -M -s /sbin/nologin haproxy
[root@master ~]# id haproxy
uid=995(haproxy) gid=992(haproxy) groups=992(haproxy)

# Unpack and build
[root@master src]# tar xf haproxy-2.6.0.tar.gz
[root@master src]# ls
debug  haproxy-2.6.0  haproxy-2.6.0.tar.gz  kernels
[root@master src]# cd haproxy-2.6.0
[root@master haproxy-2.6.0]# make clean
[root@master haproxy-2.6.0]# make -j $(grep 'processor' /proc/cpuinfo | wc -l) TARGET=linux-glibc USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1 USE_SYSTEMD=1

# Install
[root@master haproxy-2.6.0]# make install PREFIX=/usr/local/haproxy

# Symlink the binary
[root@master ~]# ln -s /usr/local/haproxy/sbin/haproxy /usr/sbin/
[root@master ~]# ll /usr/sbin/haproxy -d
lrwxrwxrwx. 1 root root 31 Aug 15 17:42 /usr/sbin/haproxy -> /usr/local/haproxy/sbin/haproxy

# Tune kernel parameters on each load balancer
[root@master ~]# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
[root@master ~]# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
[root@master ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
[root@master ~]# cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same name in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1

# Provide the configuration file
[root@master ~]# mkdir /etc/haproxy
[root@master ~]# cat /etc/haproxy/haproxy.cfg
#-------------- global settings ----------------
global
    log 127.0.0.1 local0 info
    #log loghost local0 info
    maxconn 20480
    #chroot /usr/local/haproxy
    pidfile /var/run/haproxy.pid
    #maxconn 4000
    user haproxy
    group haproxy
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option dontlognull
    option httpclose
    option httplog
    #option forwardfor
    option redispatch
    balance roundrobin
    timeout connect 10s
    timeout client 10s
    timeout server 10s
    timeout check 10s
    maxconn 60000
    retries 3
#-------------- stats page ------------------
listen admin_stats
    bind 0.0.0.0:8189
    stats enable
    mode http
    log global
    stats uri /haproxy_stats
    stats realm Haproxy\ Statistics
    stats auth admin:admin
    #stats hide-version
    stats admin if TRUE
    stats refresh 30s
#--------------- web cluster -----------------------
listen webcluster
    bind 0.0.0.0:80
    mode http
    #option httpchk GET /index.html
    log global
    maxconn 3000
    balance roundrobin
    cookie SESSION_COOKIE insert indirect nocache
    server web01 192.168.229.150:80 check inter 2000 fall 5
    server web02 192.168.229.151:80 check inter 2000 fall 5
    #server web03 192.168.80.102:80 cookie web01 check inter 2000 fall 5

# Provide the systemd unit file
cat > /usr/lib/systemd/system/haproxy.service <<EOF
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network.target

[Service]
ExecStartPre=/usr/local/haproxy/sbin/haproxy -f /etc/haproxy/haproxy.cfg -c -q
ExecStart=/usr/local/haproxy/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid
ExecReload=/bin/kill -USR2 \$MAINPID

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd
[root@master ~]# systemctl daemon-reload

# Enable haproxy logging
[root@master ~]# vim /etc/rsyslog.conf
# Save boot messages also to boot.log
local7.*            /var/log/boot.log
local0.*            /var/log/haproxy.log
[root@master ~]# systemctl restart rsyslog

# Start haproxy and enable it at boot
[root@master ~]# systemctl enable --now haproxy
```
Verify by accessing master
3.2 Deploy haproxy on slave
```
# Disable the firewall and SELinux
[root@slave ~]# systemctl stop firewalld
[root@slave ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@slave ~]# setenforce 0
[root@slave ~]# sed -ri 's/^(SELINUX=).*/\1disabled/g' /etc/selinux/config

# Install build dependencies
# (if installing gcc fails, boost-devel with --allowerasing is required)
[root@slave ~]# dnf -y install boost-devel --allowerasing make gcc pcre-devel bzip2-devel openssl-devel systemd-devel

# Download the source tarball
[root@slave ~]# cd /usr/src/
[root@slave src]# wget -O haproxy-2.6.0.tar.gz https://github.com/haproxy/haproxy/archive/refs/tags/v2.6.0.tar.gz

# Create a system user
[root@slave ~]# useradd -r -M -s /sbin/nologin haproxy
[root@slave ~]# id haproxy
uid=995(haproxy) gid=992(haproxy) groups=992(haproxy)

# Unpack and build
[root@slave src]# tar xf haproxy-2.6.0.tar.gz
[root@slave src]# ls
debug  haproxy-2.6.0  haproxy-2.6.0.tar.gz  kernels
[root@slave src]# cd haproxy-2.6.0
[root@slave haproxy-2.6.0]# make clean
[root@slave haproxy-2.6.0]# make -j $(grep 'processor' /proc/cpuinfo | wc -l) TARGET=linux-glibc USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1 USE_SYSTEMD=1

# Install
[root@slave haproxy-2.6.0]# make install PREFIX=/usr/local/haproxy

# Symlink the binary
[root@slave ~]# ln -s /usr/local/haproxy/sbin/haproxy /usr/sbin/
[root@slave ~]# ll /usr/sbin/haproxy -d
lrwxrwxrwx. 1 root root 31 Aug 15 17:42 /usr/sbin/haproxy -> /usr/local/haproxy/sbin/haproxy

# Tune kernel parameters on each load balancer
[root@slave ~]# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
[root@slave ~]# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
[root@slave ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
[root@slave ~]# cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same name in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1

# Provide the configuration file
[root@slave ~]# mkdir /etc/haproxy
[root@slave ~]# cat > /etc/haproxy/haproxy.cfg <<EOF
#-------------- global settings ----------------
global
    log 127.0.0.1 local0 info
    #log loghost local0 info
    maxconn 20480
    #chroot /usr/local/haproxy
    pidfile /var/run/haproxy.pid
    #maxconn 4000
    user haproxy
    group haproxy
    daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option dontlognull
    option httpclose
    option httplog
    #option forwardfor
    option redispatch
    balance roundrobin
    timeout connect 10s
    timeout client 10s
    timeout server 10s
    timeout check 10s
    maxconn 60000
    retries 3
#-------------- stats page ------------------
listen admin_stats
    bind 0.0.0.0:8189
    stats enable
    mode http
    log global
    stats uri /haproxy_stats
    stats realm Haproxy\ Statistics
    stats auth admin:admin
    #stats hide-version
    stats admin if TRUE
    stats refresh 30s
#--------------- web cluster -----------------------
listen webcluster
    bind 0.0.0.0:80
    mode http
    #option httpchk GET /index.html
    log global
    maxconn 3000
    balance roundrobin
    cookie SESSION_COOKIE insert indirect nocache
    server web01 192.168.229.150:80 check inter 2000 fall 5
    server web02 192.168.229.151:80 check inter 2000 fall 5
    #server web01 192.168.80.102:80 cookie web01 check inter 2000 fall 5
EOF

# Provide the systemd unit file
cat > /usr/lib/systemd/system/haproxy.service <<EOF
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network.target

[Service]
ExecStartPre=/usr/local/haproxy/sbin/haproxy -f /etc/haproxy/haproxy.cfg -c -q
ExecStart=/usr/local/haproxy/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid
ExecReload=/bin/kill -USR2 \$MAINPID

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd
[root@slave ~]# systemctl daemon-reload

# Enable haproxy logging
[root@slave ~]# vim /etc/rsyslog.conf
# Save boot messages also to boot.log
local7.*            /var/log/boot.log
local0.*            /var/log/haproxy.log
[root@slave ~]# systemctl restart rsyslog

# Start haproxy and enable it at boot
[root@slave ~]# systemctl enable --now haproxy
```
Verify by accessing slave
4. Install keepalived
Install keepalived on the master
```
# Install keepalived
[root@master ~]# dnf -y install keepalived

# Check which files the package installed
[root@master ~]# rpm -ql keepalived
/etc/keepalived                             # configuration directory
/etc/keepalived/keepalived.conf             # main configuration file
/etc/sysconfig/keepalived
/usr/bin/genhash
/usr/lib/systemd/system/keepalived.service  # service unit file
/usr/libexec/keepalived
/usr/sbin/keepalived
.....(output omitted)
```
Install keepalived on the backup server the same way
```
# Install keepalived
[root@slave ~]# yum -y install keepalived
```
4.1 Configure keepalived
4.1.1 Configure the master keepalived
```
[root@master ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id lb01
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass agantkl
    }
    virtual_ipaddress {
        192.168.229.250
    }
}

virtual_server 192.168.229.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP

    real_server 192.168.229.130 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.229.148 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

# Start keepalived and enable it at boot
[root@master ~]# systemctl enable --now keepalived
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.
[root@master ~]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2022-08-31 19:12:53 CST; 9s ago
```
4.1.2 Configure the backup keepalived
```
[root@slave ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id lb02
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass agantkl
    }
    virtual_ipaddress {
        192.168.229.250
    }
}

virtual_server 192.168.229.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP

    real_server 192.168.229.150 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.229.151 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

[root@slave ~]# systemctl enable --now keepalived
Created symlink /etc/systemd/system/multi-user.target.wants/keepalived.service → /usr/lib/systemd/system/keepalived.service.
[root@slave ~]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-08-30 19:15:56 CST; 9s ago
```
4.1.3 Check which node holds the VIP
Check on MASTER
```
[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:c7:d2:c9 brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.130/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1705sec preferred_lft 1705sec
    inet 192.168.229.250/32 scope global ens160    // the VIP is present here
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fec7:d2c9/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Check on SLAVE
```
[root@slave ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:32:af:6d brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.148/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1676sec preferred_lft 1676sec
    inet6 fe80::20c:29ff:fe32:af6d/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
4.4 Tune kernel parameters so services can bind to the VIP
This step is optional; it is only needed when a service must listen on the VIP address itself.
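To illustrate why this matters, consider a hypothetical variant of the haproxy configuration (not the one used in this deployment) that binds the VIP directly instead of 0.0.0.0. The node that does not currently hold the VIP could only start haproxy if `ip_nonlocal_bind` is enabled:

```
listen webcluster
    bind 192.168.229.250:80   # binding a non-local address (the VIP currently
                              # lives on the other node) fails unless
                              # net.ipv4.ip_nonlocal_bind = 1 is set
```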
Adjust the kernel parameter on master
```
[root@master ~]# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
[root@master ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
[root@master ~]# cat /proc/sys/net/ipv4/ip_nonlocal_bind
1
```
Adjust the kernel parameter on slave
```
[root@slave ~]# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
[root@slave ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
[root@slave ~]# cat /proc/sys/net/ipv4/ip_nonlocal_bind
1
```
4.5 Have keepalived monitor the haproxy load balancer
keepalived monitors the state of the haproxy load balancer through scripts.
Write the scripts on master
```
[root@master ~]# mkdir /scripts
[root@master ~]# cd /scripts/
[root@master scripts]# vim check_n.sh
#!/bin/bash
haproxy_status=$(ps -ef | grep -Ev "grep|$0" | grep '\bhaproxy\b' | wc -l)
if [ $haproxy_status -lt 1 ];then
    systemctl stop keepalived
fi
[root@master scripts]# vim notify.sh
#!/bin/bash
case "$1" in
master)
    haproxy_status=$(ps -ef | grep -Ev "grep|$0" | grep '\bhaproxy\b' | wc -l)
    if [ $haproxy_status -lt 1 ];then
        systemctl start haproxy
    fi
    sendmail
    ;;
backup)
    haproxy_status=$(ps -ef | grep -Ev "grep|$0" | grep '\bhaproxy\b' | wc -l)
    if [ $haproxy_status -gt 0 ];then
        systemctl stop haproxy
    fi
    ;;
*)
    echo "Usage: $0 master|backup VIP"
    ;;
esac
[root@master scripts]# chmod +x check_n.sh notify.sh
[root@master scripts]# ll
total 8
-rwxr-xr-x. 1 root root 149 Sep  1 16:23 check_n.sh
-rwxr-xr-x. 1 root root 459 Sep  1 16:32 notify.sh
```
Write the script on slave
```
[root@slave ~]# mkdir /scripts
[root@slave ~]# cd /scripts/
[root@slave scripts]# vim notify.sh
#!/bin/bash
case "$1" in
master)
    haproxy_status=$(ps -ef | grep -Ev "grep|$0" | grep '\bhaproxy\b' | wc -l)
    if [ $haproxy_status -lt 1 ];then
        systemctl start haproxy
    fi
    sendmail
    ;;
backup)
    haproxy_status=$(ps -ef | grep -Ev "grep|$0" | grep '\bhaproxy\b' | wc -l)
    if [ $haproxy_status -gt 0 ];then
        systemctl stop haproxy
    fi
    ;;
*)
    echo "Usage: $0 master|backup VIP"
    ;;
esac
[root@slave scripts]# chmod +x notify.sh
[root@slave scripts]# ll
total 4
-rwxr-xr-x. 1 root root 459 Sep  1 16:34 notify.sh
```
4.6 Add the monitoring script to the keepalived configuration
Configure the master keepalived
```
[root@master ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id lb01
}

vrrp_script nginx_check {
    script "/scripts/check_n.sh"
    interval 1
    weight -20
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass agantkl
    }
    virtual_ipaddress {
        192.168.229.250
    }
    track_script {
        nginx_check
    }
    notify_master "/scripts/notify.sh master 192.168.229.250"
}

virtual_server 192.168.229.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP

    real_server 192.168.229.130 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.229.148 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

[root@master ~]# systemctl restart keepalived
```
Configure the backup keepalived
The backup does not need to check whether haproxy is healthy itself: it starts haproxy when promoted to MASTER and stops it when demoted to BACKUP.
```
[root@slave ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id lb02
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass agantkl
    }
    virtual_ipaddress {
        192.168.229.250
    }
    notify_master "/scripts/notify.sh master 192.168.229.250"
    notify_backup "/scripts/notify.sh backup 192.168.229.250"
}

virtual_server 192.168.229.250 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP

    real_server 192.168.229.130 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.229.148 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

[root@slave ~]# systemctl restart keepalived
```
Access the site through the VIP 192.168.229.250
Simulate a master outage: slave takes over and keeps serving traffic
```
# Stop haproxy
[root@master ~]# systemctl stop haproxy
[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:f6:e7:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.130/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 954sec preferred_lft 954sec
    inet6 fe80::20c:29ff:fef6:e7cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
# the VIP is gone from the master host
```
Check slave: has the VIP moved over?
```
[root@slave ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:57:8f:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.148/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 925sec preferred_lft 925sec
    inet 192.168.229.250/32 scope global ens160    // the VIP is here now
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe57:8f93/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Access
After the failed master is repaired and its services restarted, it takes the VIP back from slave (preemption).
```
[root@master ~]# systemctl start haproxy keepalived
[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:f6:e7:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.130/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1772sec preferred_lft 1772sec
    inet 192.168.229.250/32 scope global ens160    // the VIP is back
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fef6:e7cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
[root@master ~]# ss -antl
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       128     0.0.0.0:80           0.0.0.0:*
LISTEN  0       128     0.0.0.0:22           0.0.0.0:*
LISTEN  0       128     0.0.0.0:8189         0.0.0.0:*
LISTEN  0       128     [::]:22              [::]:*
```
Check slave
```
[root@slave ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:57:8f:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.148/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1727sec preferred_lft 1727sec
    inet6 fe80::20c:29ff:fe57:8f93/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Access
5. Monitoring split-brain
Building on the setup above, use Zabbix to monitor for split-brain.
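Split-brain means both keepalived nodes consider themselves MASTER and both hold the VIP at the same time, for example when the VRRP traffic between them is blocked. The detection idea can be sketched in shell; the two `*_out` variables below are illustrative stand-ins for `ip a show ens160` output captured from each node (in practice it would be collected over ssh or, as below, via a Zabbix agent):

```shell
#!/bin/bash
# Sketch: split-brain = more than one node holds the VIP simultaneously.
VIP="192.168.229.250"

# Count how many lines of one node's `ip a` output carry the VIP.
count_vip() {   # $1 = captured `ip a` output of one node
    echo "$1" | grep -c "inet ${VIP}/"
}

# Illustrative captured outputs (healthy case: only master holds the VIP).
master_out="inet 192.168.229.130/24 brd 192.168.229.255 scope global ens160
inet 192.168.229.250/32 scope global ens160"
slave_out="inet 192.168.229.148/24 brd 192.168.229.255 scope global ens160"

holders=$(( $(count_vip "$master_out") + $(count_vip "$slave_out") ))
if [ "$holders" -ge 2 ]; then
    echo "split-brain: both nodes hold ${VIP}"
elif [ "$holders" -eq 1 ]; then
    echo "ok: exactly one node holds ${VIP}"
else
    echo "warning: no node holds ${VIP}"
fi
```

The Zabbix setup that follows implements the same idea more simply: the backup host reports whether it sees the VIP, which should only ever happen during a legitimate failover.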
Start a host that already has the Zabbix server deployed
| Platform | IP | Service |
|---|---|---|
| centos8 | 192.168.229.129 | zabbix_server (LAMP) |
Deploying the Zabbix server itself is omitted here.
Install the Zabbix agent on the backup host; copy the tarball to the slave host with xftp.
```
[root@slave ~]# tar xf zabbix-5.0.25.tar.gz
[root@slave ~]# ls
anaconda-ks.cfg  zabbix-5.0.25  zabbix-5.0.25.tar.gz

# Create a zabbix system user
[root@slave ~]# useradd -r -M -s /sbin/nologin zabbix
[root@slave ~]# id zabbix
uid=994(zabbix) gid=991(zabbix) groups=991(zabbix)

# Install build dependencies
[root@slave ~]# dnf -y install gcc gcc-c++ make vim wget pcre-devel

# Enter the source directory and configure
[root@slave ~]# cd zabbix-5.0.25
[root@slave zabbix-5.0.25]# ./configure --enable-agent
.....(output omitted)
***********************************************************
*            Now run 'make install'                       *
*                                                         *
*            Thank you for using Zabbix!                  *
*              <http://www.zabbix.com>                    *
***********************************************************

# Install
[root@slave zabbix-5.0.25]# make install
.....(output omitted)

# The configuration lives under /usr/local/etc/
[root@slave ~]# cd /usr/local/etc/
[root@slave etc]# ls
zabbix_agentd.conf  zabbix_agentd.conf.d

# Edit zabbix_agentd.conf, find the following directives and change them
[root@slave etc]# vim zabbix_agentd.conf
Server=192.168.229.129        # must point to the server's IP
ServerActive=192.168.229.129  # must also point to the server's IP
Hostname=tkl                  # must be unique; use the local IP or any unique name

# Start the agent
[root@slave ~]# zabbix_agentd
[root@slave ~]# ss -antl
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       128     0.0.0.0:10050        0.0.0.0:*
LISTEN  0       128     0.0.0.0:22           0.0.0.0:*
LISTEN  0       128     [::]:22              [::]:*
```
Write the detection script on the slave host
If the script prints 1, the VIP is present on the backup, which means the master has a problem.
If it prints 0, the VIP is absent and the master is fine.
```
[root@slave scripts]# cat test.sh
#!/bin/bash
if [ `ip a show ens160 | grep 192.168.229.250 | wc -l` -ne 0 ]
then
    echo "1"
else
    echo "0"
fi
[root@slave scripts]# chmod +x test.sh
# Prints 1 when the VIP address is detected, 0 when it is not:
# 1 means there is a problem, 0 means everything is fine.
```
Edit the /usr/local/etc/zabbix_agentd.conf file
```
[root@slave ~]# vim /usr/local/etc/zabbix_agentd.conf
UnsafeUserParameters=1                        # enable custom monitoring items (1|0)
UserParameter=check_process,/scripts/test.sh  # register the custom monitoring script

# Restart the zabbix agent
[root@slave ~]# pkill zabbix_agentd
[root@slave ~]# zabbix_agentd
```
Check the backup's IP
```
[root@slave ~]# ip addr show ens160
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:57:8f:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.148/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1686sec preferred_lft 1686sec
    inet6 fe80::20c:29ff:fe57:8f93/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
A quick test from the server side
```
[root@zabbix ~]# zabbix_get -s 192.168.229.148 -k check_process
0
# 0 confirms the VIP is not present on the backup server
```
Now stop haproxy on the master so the VIP moves to the backup, and verify the check detects it
```
[root@master ~]# systemctl stop haproxy
[root@master ~]# ip addr show ens160
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:f6:e7:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.130/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1568sec preferred_lft 1568sec
    inet6 fe80::20c:29ff:fef6:e7cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Check the backup
```
[root@slave ~]# ip addr show ens160
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:57:8f:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.148/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1543sec preferred_lft 1543sec
    inet 192.168.229.250/32 scope global ens160    // the VIP is here
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe57:8f93/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Test on the Zabbix server
```
[root@zabbix ~]# zabbix_get -s 192.168.229.148 -k check_process
1
# 1 confirms the script works and the VIP is now on the backup
```
Zabbix monitoring setup
5.1 Add the host
5.2 Add a template
5.3 Add the monitoring item
5.4 Add a trigger
5.5 Send alert e-mails
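As a rough guide to the GUI steps above (the names are illustrative and the expression uses Zabbix 5.0 trigger syntax), the item and trigger correspond to:

```
Item:
    Name: check_process
    Type: Zabbix agent
    Key:  check_process                       # matches the UserParameter on the slave

Trigger:
    Name: split-brain detected
    Expression: {tkl:check_process.last()}=1  # "tkl" is the agent Hostname set earlier
    Severity: High
```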
An alert fires as soon as the configuration is finished, because after the earlier test the VIP was never switched back from the backup to the master.
Check the mail.
Simulate repairing the master so the VIP returns to it
```
[root@master ~]# systemctl start haproxy keepalived
[root@master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:f6:e7:cf brd ff:ff:ff:ff:ff:ff
    inet 192.168.229.130/24 brd 192.168.229.255 scope global dynamic noprefixroute ens160
       valid_lft 1014sec preferred_lft 1014sec
    inet 192.168.229.250/32 scope global ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fef6:e7cf/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```
Check the result in the Zabbix web UI.
Closing
That concludes these notes, collected by 粗心石头, on monitoring split-brain with Zabbix.