Cluster Servers: Detailed Steps for Building a High-Availability Cluster, and How to Deploy One
The core goal of a high-availability (HA) cluster is to keep services continuously available through redundancy and failover mechanisms. Below are detailed build steps and deployment methods for different scenarios, covering web services, databases, and general-purpose applications:
I. Basic Environment Preparation
Server planning
- At least 2 physical or virtual machines (3 nodes recommended to avoid split-brain)
- Example IP allocation:
  - Node1: 192.168.1.10
  - Node2: 192.168.1.11
  - VIP (virtual IP): 192.168.1.100
System initialization
# Run on all nodes
systemctl stop firewalld && systemctl disable firewalld  # Disable the firewall (configure explicit rules instead in production)
setenforce 0 && sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config  # Disable SELinux
yum install -y vim net-tools ntp
ntpdate pool.ntp.org  # One-shot time sync (deploying chrony or an NTP service is recommended; see the sketch below)
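Since the text recommends chrony for ongoing time synchronization, here is a minimal sketch of deploying it (package and service names as shipped on CentOS/RHEL 7; the default pool servers are assumed to be reachable):
# Run on all nodes: continuous time sync with chrony instead of one-shot ntpdate
yum install -y chrony
systemctl enable chronyd && systemctl start chronyd
chronyc sources    # verify that upstream time sources are reachable and selected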
SSH mutual trust
# Generate a key pair on every node and distribute it to the other nodes
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2
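A quick sanity check, assuming the hostnames node1/node2 resolve (via DNS or /etc/hosts): passwordless login should now work in both directions.
# Run from node1 and node2 respectively; each command should print the remote hostname without a password prompt
ssh root@node2 hostname
ssh root@node1 hostname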
II. Web Server High Availability (Keepalived + Nginx/HAProxy)
1. Load balancer high availability
- Install Keepalived
yum install -y keepalived
- Configure Keepalived (master node)
# /etc/keepalived/keepalived.conf
global_defs {
    router_id LVS_MASTER
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.100/24 dev eth0
    }
}
- Backup node configuration (identical to the master except for the following)
state BACKUP
priority 90  # lower priority than the master
- Start the service
systemctl enable keepalived && systemctl start keepalived
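Optionally, Keepalived can also track the load balancer process itself, so the VIP fails over when HAProxy/Nginx dies rather than only when the whole node goes down. A minimal sketch, assuming HAProxy is the balancer (the script name, interval, and weight below are illustrative assumptions):
# Add to /etc/keepalived/keepalived.conf on both nodes
vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"   # non-zero exit code when haproxy is not running
    interval 2                             # check every 2 seconds
    weight -20                             # lower the priority so the backup takes over
}
# and reference it inside vrrp_instance VI_1 { ... }:
#     track_script {
#         chk_haproxy
#     }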
2. Load balancing configuration (HAProxy example)
# Install HAProxy
yum install -y haproxy
# Configure /etc/haproxy/haproxy.cfg
frontend http_front
    bind 192.168.1.100:80
    default_backend http_back

backend http_back
    balance roundrobin
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
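Two optional additions worth considering: because the frontend binds directly to the VIP, the standby node needs net.ipv4.ip_nonlocal_bind=1 (or bind *:80) to start HAProxy while it does not hold the VIP, and a stats listener makes backend health easy to check. A sketch (the stats port and URI are assumptions):
# Allow HAProxy to bind the VIP even when this node is in BACKUP state
echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf && sysctl -p

# Optional stats page, appended to /etc/haproxy/haproxy.cfg
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s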
III. Database High Availability (MySQL Galera Cluster)
1. Install Galera Cluster
# Run on all nodes
yum install -y mariadb-server galera
# Configure /etc/my.cnf.d/galera.cnf
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://node1,node2"
binlog_format=row
default_storage_engine=InnoDB
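In practice each node also needs a cluster name plus its own address and node name; a sketch of the per-node additions (values shown for Node1, adjust on each node; the cluster name is an arbitrary example):
# Per-node additions to /etc/my.cnf.d/galera.cnf (example values for Node1)
wsrep_cluster_name="my_galera_cluster"
wsrep_node_address="192.168.1.10"
wsrep_node_name="node1"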
2. Bootstrap the cluster
# Bootstrap on the first node
galera_new_cluster
# Join the remaining nodes
systemctl start mariadb
3. Verify cluster status
SHOW STATUS LIKE 'wsrep_cluster_size';
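The reported value should equal the number of nodes (2 here). A quick end-to-end replication check, as a sketch using throwaway database and table names:
-- On node1:
CREATE DATABASE IF NOT EXISTS ha_test;
CREATE TABLE ha_test.t1 (id INT PRIMARY KEY);
INSERT INTO ha_test.t1 VALUES (1);
-- On node2, the row should be visible immediately:
SELECT * FROM ha_test.t1;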
IV. General-Purpose Application High Availability (Pacemaker + Corosync)
1. Install cluster components
yum install -y pacemaker corosync pcs
systemctl enable pcsd && systemctl start pcsd
passwd hacluster  # Set the cluster authentication password (must be identical on all nodes)
2. Configure Corosync
pcs cluster auth node1 node2  # Authenticate the nodes
pcs cluster setup --name my_cluster node1 node2
pcs cluster start --all
pcs cluster enable --all
3. Configure resources
# Create the virtual IP resource
pcs resource create Virtual_IP ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s
# Create the Apache service resource
pcs resource create Web_Server systemd:httpd op monitor interval=60s
# Group the resources together
pcs resource group add Web_Group Virtual_IP Web_Server
# Failover policy (members of a resource group are already colocated and started in order;
# the explicit constraints below are shown for completeness)
pcs constraint colocation add Web_Server with Virtual_IP INFINITY
pcs constraint order Virtual_IP then Web_Server
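Optionally, a location constraint can express a preferred node for the group, and pcs status confirms where it is running; a sketch (the score of 100 is an arbitrary assumption):
# Prefer node1 while it is healthy, but still allow failover to node2
pcs constraint location Web_Group prefers node1=100
pcs status resources   # verify that Web_Group started on the expected node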
V. Storage High Availability (DRBD + Heartbeat)
1. Install DRBD
yum install -y drbd84-utils kmod-drbd84  # packages provided by the ELRepo repository
# Define the resource in /etc/drbd.d/r0.res
resource r0 {
    protocol C;
    disk { on-io-error detach; }
    on node1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.10:7788;
    }
    on node2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.11:7788;
    }
}
2. Initialize and enable DRBD
drbdadm create-md r0
systemctl enable drbd && systemctl start drbd
# Promote and initialize on the primary node
drbdadm primary r0 --force
mkfs.xfs /dev/drbd0
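After the initial sync, the device can be mounted on the primary and its state checked; a sketch (the mount point /data is an assumption):
# On the primary node
mkdir -p /data
mount /dev/drbd0 /data
drbdadm status r0       # on older 8.4 releases, check cat /proc/drbd instead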
VI. Verification and Testing
Failover testing
# Manually simulate a failure of the active node
pcs node standby node1       # Pacemaker cluster
systemctl stop keepalived    # Keepalived test
Split-brain protection
# Configure quorum policy in Pacemaker
pcs property set no-quorum-policy=freeze
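For a two-node cluster, Corosync additionally needs two-node quorum handling; pcs normally generates this during cluster setup, so the snippet below is only a reference sketch of the resulting configuration:
# /etc/corosync/corosync.conf (quorum section, normally generated by pcs)
quorum {
    provider: corosync_votequorum
    two_node: 1
}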
Network isolation handling
- Configure STONITH (Shoot The Other Node In The Head)
pcs stonith create fence_node1 fence_ipmilan ipaddr=192.168.1.10 login=admin passwd=password
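The fence device should point at the node's IPMI/BMC address and credentials; after creating it, verify it and, only in a maintenance window, test an actual fence operation. A sketch:
pcs stonith show            # list configured fence devices
pcs stonith fence node2     # test only: forcibly fences (power-cycles) node2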
VII. Monitoring and Maintenance
Cluster status checks
pcs status # Pacemaker
crm_mon -1      # one-shot status snapshot (run without -1 for continuous monitoring)
drbdadm status  # DRBD status
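These checks are easy to automate from cron; a minimal sketch of a periodic alert script (the mail recipient and the mailx dependency are assumptions):
#!/bin/bash
# check_cluster.sh - run from cron; mails the status when something looks unhealthy
STATUS=$(crm_mon -1 2>&1)
if echo "$STATUS" | grep -Eq 'OFFLINE|Stopped|Failed'; then
    echo "$STATUS" | mail -s "Cluster alert on $(hostname)" admin@example.com
fi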
Log management
- Corosync log: /var/log/cluster/corosync.log
- Pacemaker log: /var/log/pacemaker.log
Recommended tools
- Web management UI: Hawk (for Pacemaker)
- Monitoring and alerting: Prometheus + Grafana + Alertmanager
Notes
- Network latency: inter-node latency should stay below 1 ms; 10 GbE networking plus a dedicated heartbeat link is recommended
- Data consistency: for asynchronous replication, evaluate the RPO (recovery point objective)
- Version consistency: all nodes must run exactly the same software versions
- Stress testing: use sysbench or JMeter to simulate high-concurrency load (see the sketch after this list)
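As one example of such a test, a sysbench OLTP run against the VIP might look like the following sketch (sysbench 1.0 syntax; the test user, database, and table sizes are assumptions):
# Prepare test tables, then run a 5-minute read/write workload against the VIP
sysbench oltp_read_write --mysql-host=192.168.1.100 --mysql-user=sbtest --mysql-password=sbtest \
  --mysql-db=sbtest --tables=10 --table-size=100000 --threads=16 prepare
sysbench oltp_read_write --mysql-host=192.168.1.100 --mysql-user=sbtest --mysql-password=sbtest \
  --mysql-db=sbtest --tables=10 --table-size=100000 --threads=16 --time=300 run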
With the steps above you can choose the high-availability approach that fits your business needs. For real deployments, validate everything in a test environment first, and keep a complete rollback plan ready before touching production.