1. Introduction
LVS+Keepalived provides two things: running the LVS directors in master/backup mode avoids a single point of failure, and a failed web server node is removed automatically and added back to the cluster once it recovers.
Topology diagram: [figure omitted]

2. System environment
System platform: RHEL 6.4
Hardware platform: Dell R720 x 4
Hardware parameters: two CPUs (E5-2609); 32 GB memory; 6 x 600 GB disks; RAID 5; dual power supplies
LVS version: ipvsadm-1.25
Keepalived version: keepalived-1.2.7

3. IP address plan
VIP: 111.13.6.77
LVS-MASTER: 111.13.6.75
LVS-BACKUP: 111.13.6.76
WEB1: 111.13.6.73
WEB2: 111.13.6.74
Adjust these IP addresses to your actual environment.

4. Install and configure keepalived
1) Download the source package, then compile and install it on both LVS servers
# wget http://www.keepalived.org/software/keepalived-1.2.7.tar.gz
[root@LVS-MASTER ~]# tar -zxvf keepalived-1.2.7.tar.gz
[root@LVS-MASTER ~]# cd keepalived-1.2.7
[root@LVS-MASTER keepalived-1.2.7]# ./configure
(output omitted)
checking for gcc... no
checking for cc... no
checking for cc... no
checking for cl... no
configure: error: no acceptable C compiler found in $PATH
See `config.log' for more details.

If you see the message above, the gcc compiler is missing. Install it with:
[root@LVS-MASTER keepalived-1.2.7]# yum install gcc -y
[root@LVS-MASTER keepalived-1.2.7]# ./configure
(output omitted)
checking for sys/ioctl.h... yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking openssl/ssl.h usability... no
checking openssl/ssl.h presence... no
checking for openssl/ssl.h... no
configure: error:
!!! OpenSSL is not properly installed on your system. !!!
!!! Can not include OpenSSL headers files.!!!

If you see the message above, the openssl-devel package is missing. Install it with:
[root@LVS-MASTER keepalived-1.2.7]# yum install openssl-devel -y
[root@LVS-MASTER keepalived-1.2.7]# ./configure
(output omitted)
checking for openssl/md5.h... yes
checking openssl/err.h usability... yes
checking openssl/err.h presence... yes
checking for openssl/err.h... yes
checking for MD5_Init in -lcrypto... yes
checking for SSL_CTX_new in -lssl... yes
checking for poptGetContext in -lpopt... no
configure: error: Popt libraries is required

If you see the message above, the popt-devel package is missing. Install it with:
[root@LVS-MASTER keepalived-1.2.7]# yum install popt-devel -y
[root@LVS-MASTER keepalived-1.2.7]# ./configure
config.status: creating Makefile
config.status: creating genhash/Makefile
config.status: creating keepalived/core/Makefile
config.status: creating keepalived/include/config.h
config.status: creating keepalived.spec
config.status: creating keepalived/Makefile
config.status: creating lib/Makefile
config.status: creating keepalived/vrrp/Makefile

Keepalived configuration
------------------------
Keepalived version       : 1.2.7
Compiler                 : gcc
Compiler flags           : -g -O2
Extra Lib                : -lpopt -lssl -lcrypto
Use IPVS Framework       : No
IPVS sync daemon support : No
Use VRRP Framework       : Yes
Use LinkWatch            : No
Use Debug flags          : No

If you see the summary above, the configure checks passed. Note that this sample summary shows "Use IPVS Framework : No"; the virtual_server configuration used later relies on IPVS support, so check that this line reads Yes on your build (configure typically reports No when it cannot locate the kernel headers). Build and install with:
[root@LVS-MASTER keepalived-1.2.7]# yum install make -y
[root@LVS-MASTER keepalived-1.2.7]# make && make install

################ Set up keepalived as a startup service #################
[root@LVS-MASTER ~]# cp /usr/local/etc/rc.d/init.d/keepalived /etc/init.d/
[root@LVS-MASTER ~]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
[root@LVS-MASTER ~]# mkdir /etc/keepalived
[root@LVS-MASTER ~]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
[root@LVS-MASTER ~]# cp /usr/local/sbin/keepalived /usr/sbin/
[root@LVS-MASTER ~]# /etc/init.d/keepalived start
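Starting keepalived: [OK]
[root@LVS-MASTER ~]# chkconfig --add keepalived
[root@LVS-MASTER ~]# chkconfig keepalived on
[root@LVS-MASTER ~]# cd /etc/keepalived/
[root@LVS-MASTER keepalived]# vim keepalived.conf

2) The Keepalived master node configuration is as follows; the comments describe the items that need to be set:
##################### LVS-MASTER ####################
! Configuration File for keepalived
global_defs {
    notification_email {
        renlifeng@redflag-linux.com    # recipients of the email sent when a failover occurs, one per line
    }
    notification_email_from Alexandre.Cassen@firewall.loc    # sender address
    smtp_server 127.0.0.1              # SMTP server address
    smtp_connect_timeout 30            # SMTP connection timeout
    router_id LVS_DEVEL                # identifier of the machine running keepalived
}
vrrp_instance VI_1 {                   # VRRP instance; define one per monitored network segment
    state MASTER                       # initial role, MASTER or BACKUP; if nopreempt is set this value
                                       # has no effect and the roles are decided by priority
    interface eth0                     # network interface the instance is bound to
    virtual_router_id 51
    priority 100                       # priority; the node with the higher priority is elected master
    advert_int 1                       # advertisement (check) interval, default 1 second
    authentication {                   # authentication settings
        auth_type PASS                 # authentication type
        auth_pass 1111                 # authentication password
    }
    virtual_ipaddress {                # VIP
        111.13.6.77
    }
}
virtual_server 111.13.6.77 80 {
    delay_loop 6                       # health check interval
    lb_algo rr                         # LVS scheduling algorithm: rr|wrr|lc|wlc|lblc|sh|dh
    lb_kind DR                         # forwarding method: NAT|DR|TUN, default NAT
    persistence_timeout 50             # session persistence time
    protocol TCP                       # protocol in use
    real_server 111.13.6.73 80 {       # real server IP address and port
        weight 3                       # weight, default 1; 0 disables the server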
        TCP_CHECK {
            connect_timeout 10         # connection timeout
            nb_get_retry 3             # number of retries
            delay_before_retry 3       # delay between retries
            connect_port 80            # port used for the health check
        }
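    }
    real_server 111.13.6.74 80 {
        weight 3
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}
######################### END ########################

3) The Keepalived backup node configuration is as follows; it differs from the MASTER configuration only in state and priority:
########################### BACKUP #########################
! Configuration File for keepalived
global_defs {
    notification_email {
        renlifeng@redflag-linux.com    # recipients of the email sent when a failover occurs, one per line
    }
    notification_email_from Alexandre.Cassen@firewall.loc    # sender address
    smtp_server 127.0.0.1              # SMTP server address
    smtp_connect_timeout 30            # SMTP connection timeout
    router_id LVS_DEVEL                # identifier of the machine running keepalived
}
vrrp_instance VI_1 {                   # VRRP instance; define one per monitored network segment
    state BACKUP                       # initial role, MASTER or BACKUP; if nopreempt is set this value
                                       # has no effect and the roles are decided by priority
    interface eth0                     # network interface the instance is bound to
    virtual_router_id 51
    priority 99                        # priority; the node with the higher priority is elected master
    advert_int 1                       # advertisement (check) interval, default 1 second
    authentication {                   # authentication settings
        auth_type PASS                 # authentication type
        auth_pass 1111                 # authentication password
    }
    virtual_ipaddress {                # VIP
        111.13.6.77
    }
}
virtual_server 111.13.6.77 80 {
    delay_loop 6                       # health check interval
    lb_algo rr                         # LVS scheduling algorithm: rr|wrr|lc|wlc|lblc|sh|dh
    lb_kind DR                         # forwarding method: NAT|DR|TUN, default NAT
    persistence_timeout 50             # session persistence time (connections from the same IP go to
                                       # the same realserver for 50 seconds)
    protocol TCP                       # protocol in use
    real_server 111.13.6.73 80 {       # real server IP address and port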
        weight 3                       # weight, default 1; 0 disables the server
        TCP_CHECK {
            connect_timeout 10         # connection timeout
            nb_get_retry 3             # number of retries
            delay_before_retry 3       # delay between retries
            connect_port 80            # port used for the health check
        }
    }
    real_server 111.13.6.74 80 {
        weight 3
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 80
        }
    }
}
######################### END #########################

5. Install and configure LVS (DR)
1) Install ipvsadm
[root@LVS-MASTER ~]# yum install ipvsadm -y

2) Run the director.sh script on both LVS-MASTER and LVS-BACKUP. The script is as follows:
[root@LVS-MASTER ~]# cat director.sh
#!/bin/bash
VIP=111.13.6.77
RIP1=111.13.6.73
RIP2=111.13.6.74
# Open IP Forwarding
echo "1"> /proc/sys/net/ipv4/ip_forward
#ifconfig eth0 172.16.86.167 netmask 255.255.248.0 up
# Bind the VIP to eth0:0
ifconfig eth0:0 $VIP netmask 255.255.255.0 broadcast $VIP up
# Clear the existing IPVS table, then add the virtual service (round robin)
ipvsadm -C
ipvsadm -A -t $VIP:80 -s rr
# Add both real servers in direct routing mode (-g) with weight 1
ipvsadm -a -t $VIP:80 -r $RIP1 -g -w 1
ipvsadm -a -t $VIP:80 -r $RIP2 -g -w 1
service ipvsadm save
[root@LVS-MASTER ~]#

3) Run the realserver.sh script on both web servers. It binds the VIP 111.13.6.77 to lo:0 and suppresses ARP replies for it. The script is as follows:
[root@WEB1 ~]# cat realserver.sh
#!/bin/bash
#description: Config realserver
VIP=111.13.6.77
#/etc/rc.d/init.d/functions
case "$1" in
start)
    /sbin/ifconfig lo:0 $VIP netmask 255.255.255.255 broadcast $VIP
    /sbin/route add -host $VIP dev lo:0
    echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
    sysctl -p >/dev/null 2>&1
    echo "RealServer Start OK"
    ;;
stop)
    /sbin/ifconfig lo:0 down
    /sbin/route del $VIP >/dev/null 2>&1
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" >/proc/sys/net/ipv4/conf/all/arp_announce
    echo "RealServer Stopped"
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
esac
exit 0

Start it with:
[root@WEB1 ~]# sh realserver.sh start

6. Restart the keepalived service on both LVS servers and run the tests
1) Restart the keepalived service
[root@LVS-SERVER ~]# /etc/init.d/keepalived restart
Stopping keepalived: [OK]
Starting keepalived: [OK]

2) Access the page from a browser, then check the status on LVS-MASTER
[root@LVS-MASTER ~]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  111.13.6.77:80 rr persistent 50
  -> 111.13.6.73:80               Route   3      0          15
  -> 111.13.6.74:80               Route   3      0          15

3) Access the page from a browser, then check the status on LVS-BACKUP
[root@LVS-BACKUP ~]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  111.13.6.77:80 rr persistent 50
  -> 111.13.6.73:80               Route   3      0          0
  -> 111.13.6.74:80               Route   3      0          0

Comparing the two outputs shows that the MASTER is handling connections while the BACKUP is handling none.

4) Next, run the high availability test and the failover test
##################### High availability test #################
To simulate a failure, stop the keepalived service on LVS-MASTER and watch the log on LVS-BACKUP:
[root@LVS-BACKUP ~]# tail -0f /var/log/messages
Aug 16 18:47:46 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Transition to MASTER STATE
Aug 16 18:47:47 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Entering MASTER STATE
Aug 16 18:47:47 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) setting protocol VIPs.
Aug 16 18:47:47 LVS-BACKUP Keepalived_healthcheckers[2060]: Netlink reflector reports IP 172.16.86.164 added
Aug 16 18:47:47 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 111.13.6.77
Aug 16 18:47:52 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 111.13.6.77

The log shows that the backup detected the master's failure within about a second, switched to the MASTER role, took over the master's virtual IP, and bound it to eth0.

After keepalived is started again on LVS-MASTER, the log on LVS-BACKUP shows:
[root@LVS-BACKUP ~]# tail -0f /var/log/messages
Aug 16 18:57:34 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Received higher prio advert
Aug 16 18:57:34 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) Entering BACKUP STATE
Aug 16 18:57:34 LVS-BACKUP Keepalived_vrrp[2061]: VRRP_Instance(VI_1) removing protocol VIPs.
Aug 16 18:57:34 LVS-BACKUP Keepalived_healthcheckers[2060]: Netlink reflector reports IP 111.13.6.77 removed

The log shows that once the backup detected that the master was healthy again, it released the virtual IP and returned to the BACKUP role.

#################### Failover test ##################
The failover test checks whether the keepalived health checker notices a failed node promptly, removes it from rotation, and sends traffic to the remaining healthy nodes.
Stop the WEB2 node to simulate a failure, then check the logs on the master and the backup:
[root@LVS-MASTER ~]# tail -0f /var/log/messages
Aug 16 19:10:02 LVS-MASTER Keepalived_healthcheckers[2060]: TCP connection to [111.13.6.74]:80 failed !!!
Aug 16 19:10:02 LVS-MASTER Keepalived_healthcheckers[2060]: Removing service [111.13.6.74]:80 from VS [111.13.6.77]:80
Aug 16 19:10:02 LVS-MASTER Keepalived_healthcheckers[2060]: Remote SMTP server [127.0.0.1]:25 connected.
Aug 16 19:10:03 LVS-MASTER Keepalived_healthcheckers[2060]: SMTP alert successfully sent.
[root@LVS-BACKUP ~]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  111.13.6.77:80 rr persistent 50
  -> 111.13.6.73:80               Route   3      0          0

This shows that after the keepalived health checker detected the failure of 111.13.6.74, it removed WEB2 from the cluster. Requests to http://111.13.6.77 now reach only WEB1.

After the service on WEB2 is started again, the log shows:
[root@LVS-MASTER sbin]# tail -0f /var/log/messages
Aug 16 20:11:48 LVS-MASTER Keepalived_healthcheckers[3230]: TCP connection to [111.13.6.74]:80 success.
Aug 16 20:11:48 LVS-MASTER Keepalived_healthcheckers[3230]: Adding service [111.13.6.74]:80 to VS [111.13.6.77]:80
Aug 16 20:11:48 LVS-MASTER Keepalived_healthcheckers[3230]: Remote SMTP server [127.0.0.1]:25 connected.
Aug 16 20:11:49 LVS-MASTER Keepalived_healthcheckers[3230]: SMTP alert successfully sent.
[root@director1 ~]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  111.13.6.77:80 rr persistent 50
  -> 111.13.6.73:80               Route   3      0          0
  -> 111.13.6.74:80               Route   3      0          0

Once the keepalived health checker detected that 111.13.6.74 was healthy again, it added the node back to the cluster, and the WEB2 page can be reached again.

Note: if persistence_timeout is set in the keepalived configuration file, ipvsadm -L -n will show a client's connections going to the same realserver every time instead of being spread evenly. This is caused by the persistence_timeout parameter, as explained below.

Keepalived and long-lived TCP connections (persistence_timeout)
Keepalived takes care of load balancing and high availability for the back-end services, but there are still things to watch out for in practice. Many applications use long-lived TCP or HTTP connections, because establishing a connection is relatively expensive and the application needs to talk to the server frequently; keeping the connection open is the right choice in that case.

LVS settings: view the timeouts with ipvsadm --list --timeout. On my machine it returns:
Timeout (tcp tcpfin udp): 900 120 300
This means the TCP session timeout is 900 seconds.
Set the timeouts with:
# ipvsadm --set 7200 120 300
If the TCP value is too small, clients will receive errors such as "connection reset by peer".

Keepalived configuration: this is the persistence_timeout setting in virtual_server. Within the given time, requests from the same user (identified by source IP) are routed to the same realserver. Applications that use long-lived connections need this. The value should preferably match the timeout configured in LVS.
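
The recommendation to keep persistence_timeout in line with the LVS TCP timeout can be checked with a small script. The following is a minimal sketch, not part of the original setup: the function names (parse_tcp_timeout, parse_persistence_timeout, compare_timeouts) are hypothetical helpers, and it assumes that ipvsadm --list --timeout prints a line in the format shown above and that the configuration lives in /etc/keepalived/keepalived.conf.

```shell
#!/bin/bash
# Sketch: compare the LVS TCP session timeout with keepalived's persistence_timeout.
# All function names here are hypothetical helpers, not ipvsadm or keepalived features.

# Extract the TCP timeout (the first number) from a line such as:
#   Timeout (tcp tcpfin udp): 900 120 300
parse_tcp_timeout() {
    printf '%s\n' "$1" | awk -F': ' '/^Timeout/ { split($2, t, " "); print t[1] }'
}

# Extract the first persistence_timeout value from keepalived.conf-style text.
parse_persistence_timeout() {
    printf '%s\n' "$1" | awk '$1 == "persistence_timeout" { print $2; exit }'
}

# Report whether the two values agree.
compare_timeouts() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: lvs=$1 keepalived=$2"
    fi
}

# Query the live system only when ipvsadm and the config file are available.
if command -v ipvsadm >/dev/null 2>&1 && [ -r /etc/keepalived/keepalived.conf ]; then
    lvs=$(parse_tcp_timeout "$(ipvsadm --list --timeout)")
    ka=$(parse_persistence_timeout "$(cat /etc/keepalived/keepalived.conf)")
    compare_timeouts "$lvs" "$ka"
fi
```

With the values from this article (LVS TCP timeout 900, persistence_timeout 50) the script would report a mismatch; whether the two must be strictly equal depends on the application, so treat the result as a hint rather than a hard failure.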