Using Redis Sentinel and How It Works

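The Sentinel process monitors the state of the Master in a Redis master/slave deployment. When the Master fails, Sentinel switches the Master role over to a Slave, which keeps the system highly available. The feature was introduced in Redis 2.6 and became stable in Redis 2.8, so Redis 2.8 or later is recommended for production.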

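Sentinel is a distributed system: several Sentinel processes can run in one architecture. These processes use gossip protocols to receive information about whether the Master is offline, and agreement protocols to decide whether to perform an automatic failover and which Slave to choose as the new Master.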

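Each Sentinel process periodically sends messages to the other Sentinels, the Master, and the Slaves to confirm that they are still alive. If a peer does not reply within a configurable time, the Sentinel provisionally treats it as offline. This is called "subjectively down" (Subjective Down, SDOWN for short). It is subjective because it is the view of a single Sentinel, which may or may not match the views of the others.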

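The counterpart of subjectively down is objectively down. When enough Sentinels in the group (the configured quorum) have each judged the Master to be SDOWN, and have exchanged that judgment through the SENTINEL is-master-down-by-addr command, they conclude that the Master is offline. This is "objectively down" (Objectively Down, ODOWN for short). It is objective because it no longer depends on the view of any single Sentinel.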
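The difference between SDOWN and ODOWN can be illustrated with a small simulation. This is only a sketch under stated assumptions: SentinelView, is_sdown and is_odown are hypothetical names, timestamps are plain integers in milliseconds, and the real exchange over SENTINEL is-master-down-by-addr is reduced to counting votes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentinelView:
    """What one Sentinel knows about the master (hypothetical structure)."""
    name: str
    last_reply_ms: int  # time of the last valid reply from the master


def is_sdown(view: SentinelView, now_ms: int, down_after_ms: int) -> bool:
    # Subjectively down: this Sentinel alone has had no valid reply
    # for longer than down-after-milliseconds.
    return now_ms - view.last_reply_ms > down_after_ms


def is_odown(views: List[SentinelView], now_ms: int,
             down_after_ms: int, quorum: int) -> bool:
    # Objectively down: at least `quorum` Sentinels report SDOWN.
    votes = sum(is_sdown(v, now_ms, down_after_ms) for v in views)
    return votes >= quorum


views = [
    SentinelView("s1", last_reply_ms=6_000),
    SentinelView("s2", last_reply_ms=6_500),
    SentinelView("s3", last_reply_ms=9_800),
]
now = 10_000
print([is_sdown(v, now, 3000) for v in views])  # [True, True, False]
print(is_odown(views, now, 3000, quorum=2))     # True
```

With a quorum of 2, the two Sentinels that have not heard from the master for more than 3000 ms are enough to mark it ODOWN, even though the third still sees it as alive.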

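Once the Master is objectively down, the Sentinels use a voting algorithm to pick one of the remaining Slave nodes, promote it to Master, update the related configuration automatically, and carry out the failover.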
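The text does not spell out how the Slave is chosen. As a simplified sketch, the ordering described in the Redis Sentinel documentation is: lower replica-priority first (0 means never promote), then the larger replication offset, then the lexicographically smaller run ID. The Replica class and pick_new_master function below are hypothetical, and real Sentinel also filters out replicas that have been disconnected from the master for too long, which is reduced here to a single reachable flag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    run_id: str
    priority: int      # replica-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's stream was received
    reachable: bool = True


def pick_new_master(replicas: List[Replica]) -> Optional[Replica]:
    # Drop replicas that cannot be promoted at all.
    candidates = [r for r in replicas if r.reachable and r.priority != 0]
    if not candidates:
        return None
    # Lower priority first, then larger offset, then smaller run ID.
    return min(candidates,
               key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("b2f1", priority=100, repl_offset=5_000),
    Replica("a9c3", priority=100, repl_offset=7_200),
    Replica("c77e", priority=0, repl_offset=9_999),
    Replica("d001", priority=50, repl_offset=9_000, reachable=False),
]
chosen = pick_new_master(replicas)
print(chosen.run_id)  # a9c3
```

Both configurations in this article set replica-priority 100, so under this ordering the replica with the most up-to-date data would be preferred.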

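Sentinel solves the problem of switching the master and slave roles automatically, much like MHA does for MySQL, but it does not remove the performance bottleneck of a single Master.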

A Redis Sentinel deployment should have at least 3 Sentinel nodes, preferably an odd number.
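The arithmetic behind "at least 3, preferably odd" can be checked quickly. The sketch below assumes that a failover needs agreement from a strict majority of all Sentinels; majority and tolerated_failures are hypothetical helper names.

```python
def majority(sentinels: int) -> int:
    # Smallest number of Sentinels that is more than half of the total.
    return sentinels // 2 + 1


def tolerated_failures(sentinels: int) -> int:
    # How many Sentinels can be lost while a majority is still reachable.
    return sentinels - majority(sentinels)


for n in range(1, 6):
    print(n, majority(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

One or two Sentinels tolerate no failure at all, and a fourth Sentinel tolerates no more failures than three do, which is why odd counts starting at 3 are the usual choice.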

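At initialization the client connects to the set of Sentinel nodes rather than to a specific Redis node. Sentinel is only a configuration provider, though, not a proxy: the client asks it for the address of the current Master and then connects to that node directly.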
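The discovery step can be sketched without any network code. The example assumes a hypothetical `ask` callable that stands in for sending SENTINEL get-master-addr-by-name to one Sentinel; the addresses 10.0.0.18 and 10.0.0.28 and the fake_ask stub are made up for illustration. A real client library would perform the network call itself.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]
# Hypothetical transport: asks one Sentinel for the master's address and
# returns (ip, port), None if the name is unknown, or raises OSError.
AskSentinel = Callable[[Address, str], Optional[Address]]


def discover_master(sentinels: Iterable[Address], name: str,
                    ask: AskSentinel) -> Address:
    for sentinel in sentinels:
        try:
            master = ask(sentinel, name)
        except OSError:
            continue  # this Sentinel is unreachable, try the next one
        if master is not None:
            return master  # the client now connects to this node directly
    raise ConnectionError(f"no Sentinel could resolve master {name!r}")


def fake_ask(sentinel: Address, name: str) -> Optional[Address]:
    # Stub for the example: the first Sentinel is down, the others answer.
    if sentinel == ("10.0.0.8", 26379):
        raise OSError("sentinel unreachable")
    return ("10.0.0.18", 6379) if name == "mymaster" else None


sentinels = [("10.0.0.8", 26379), ("10.0.0.18", 26379), ("10.0.0.28", 26379)]
print(discover_master(sentinels, "mymaster", fake_ask))  # ('10.0.0.18', 6379)
```

Because the Sentinels only hand out an address, all data traffic goes straight from the client to the Redis node, and the client has to repeat the lookup after a failover.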

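The Redis nodes in a Sentinel deployment are no different from ordinary Redis instances; read/write splitting has to be implemented by the client program.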

Before Redis 3.0, production environments generally used Sentinel mode. Redis 3.0 introduced Redis Cluster, which can support larger production deployments.

# Redis configuration on the master node
bind 0.0.0.0
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize no
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
masterauth 123456
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
requirepass 123456
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes

# Redis configuration on the slave (replica) nodes
bind 0.0.0.0
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize no
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 16
always-show-logo yes
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
# IP address and port of the master node
replicaof 10.0.0.8 6379
# Password used to connect to the master node
masterauth 123456
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
requirepass 123456
lazyfree-lazy-eviction no
lazyfree-lazy-expire no
lazyfree-lazy-server-del no
replica-lazy-flush no
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
aof-use-rdb-preamble yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
stream-node-max-bytes 4096
stream-node-max-entries 100
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
dynamic-hz yes
aof-rewrite-incremental-fsync yes
rdb-save-incremental-fsync yes

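[19:10:03 root@localhost ~]#grep -Ev '^#|^$' /etc/redis-sentinel.conf #Edit the Sentinel configuration file; it is identical on every Sentinel. The trailing # notes below are annotations, not part of the file
port 26379
daemonize no
pidfile /var/run/redis-sentinel.pid
dir /tmp
sentinel monitor mymaster 10.0.0.8 6379 2 #IP and port of the master of the mymaster group; the trailing 2 is the quorum: once at least two Sentinels consider the master down, a failover is performed. The quorum should be more than half of the total number of Sentinels, and odd numbers of Redis nodes and Sentinels are recommended, otherwise split-brain is more likely
sentinel auth-pass mymaster 123456 #Password of the master in the mymaster group
sentinel down-after-milliseconds mymaster 3000 #Time after which any node in the mymaster group is judged subjectively down, in milliseconds; 3000 is the suggested value
sentinel parallel-syncs mymaster 1 #Number of slaves that sync from the new master at the same time after a failover; a smaller value reduces the load on the new master, but the overall sync takes longer
sentinel failover-timeout mymaster 180000 #Timeout for repointing all slaves to the new master, in milliseconds
sentinel deny-scripts-reconfig yes #Forbid changing the scripts at runtime
logfile /var/log/redis/sentinel.log #Sentinel log file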

[20:13:11 root@localhost ~]#systemctl enable --now redis-sentinel.service #Start Sentinel on every node
