Eliminating Service Single Points of Failure with Redis and Zookeeper in Practice

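A common situation: a single service is responsible for some core function and, by design, can only be deployed as one instance; multiple instances must not do the work at the same time. This raises two requirements. First, the service must keep working correctly even if an operator accidentally starts additional service nodes. Second, a service running on a single node inevitably stops when its host fails, and in that case we would much prefer a more graceful approach that does not interrupt the service. The two scenarios below illustrate how.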

Implementing a Distributed Lock with Redis

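The business already depends on a Redis cluster to run, so we consider using Redis to implement a one-master, multiple-standby mode for the service: when the master node fails, one of the standby nodes (effectively chosen at random, whichever acquires the lock first) becomes the master and continues to provide service.
The implementation is as follows: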
import java.util.UUID;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.connection.RedisClusterConnection;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class MasterHeartbeat {
	private static final Log log = LogFactory.getLog(MasterHeartbeat.class);
	private static final String LOCK_KEY = "lock.synchronization";
	public static final AtomicBoolean isMaster = new AtomicBoolean(false);
	private static final ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
	// Ensures the heartbeat task is scheduled only once
	private final AtomicBoolean heartbeatStarted = new AtomicBoolean(false);
	@Autowired
	RedisTemplate<String, String> redisTemplate;
	// Node identifier
	private String id = UUID.randomUUID().toString();
	// Lock time-to-live [seconds]
	private long maxSurvivalTime = 30;
	// Heartbeat period [milliseconds]
	private long heartbeatPeriod = 3000;

	public void tryLock() {
		log.info(this.toString());
		RedisClusterConnection conn = null;
		try {
			conn = redisTemplate.getConnectionFactory().getClusterConnection();
			acquire(conn);
		} catch (Exception e) {
			// Do not recurse here; the heartbeat retries on its next tick
			log.error("try lock failed with an error", e);
			isMaster.set(false);
		} finally {
			if (conn != null) {
				conn.close();
			}
		}
		// Schedule the heartbeat exactly once, however often tryLock() is called
		if (heartbeatStarted.compareAndSet(false, true)) {
			service.scheduleAtFixedRate(new HeartbeatTask(), heartbeatPeriod, heartbeatPeriod, TimeUnit.MILLISECONDS);
		}
	}

	// Tries to take the lock on an open connection and updates the role flag
	private void acquire(RedisClusterConnection conn) {
		boolean isLock = Boolean.TRUE.equals(conn.setNX(LOCK_KEY.getBytes(), id.getBytes()));
		if (isLock) {
			// Note: SETNX followed by EXPIRE is not atomic. If the node dies
			// between the two calls, the key is left without an expiry.
			conn.expire(LOCK_KEY.getBytes(), maxSurvivalTime);
			log.info("try lock succeeded, this node is master");
		} else {
			log.info("try lock failed");
		}
		isMaster.set(isLock);
	}

	class HeartbeatTask implements Runnable {
		@Override
		public void run() {
			RedisClusterConnection conn = null;
			try {
				conn = redisTemplate.getConnectionFactory().getClusterConnection();
				byte[] value = conn.get(LOCK_KEY.getBytes());

				if (value == null) {
					// The lock has expired: the master is gone, compete for it
					log.info("master lost");
					acquire(conn);
				} else if (id.equals(new String(value))) {
					// This node holds the lock: extend the lease
					conn.expire(LOCK_KEY.getBytes(), maxSurvivalTime);
					isMaster.set(true);
					log.debug("lease extended, master id is: " + id);
				} else {
					// Another node holds the lock: make sure this node is standby
					isMaster.set(false);
					log.debug("master id is: " + new String(value) + ", my id is: " + id);
				}
			} catch (Exception e) {
				// The lease cannot be renewed while Redis is unreachable, so step down
				log.error("heartbeat failed", e);
				isMaster.set(false);
			} finally {
				if (conn != null) {
					conn.close();
				}
			}
		}
	}

	@Override
	public String toString() {
		return "MasterHeartbeat [id=" + id + ", maxSurvivalTime=" + maxSurvivalTime + ", heartbeatPeriod="
				+ heartbeatPeriod + "]";
	}
}


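Notes:
  • This code uses the Spring Redis module; substitute whatever fits your environment.
  • On startup, the service connects to the Redis cluster and writes the lock. This is supported by the setNX(byte[] key, byte[] value) method, with value set to the unique identifier of the current node. To prevent a node from holding the lock indefinitely and causing a deadlock, an expiry time must be set on the lock once it has been acquired. The lock is then renewed by a heartbeat: here, ScheduledExecutorService periodically schedules HeartbeatTask to do so.
  • The current role of the service is indicated by the atomic variable AtomicBoolean isMaster. Business classes need to check the current role before executing and choose their flow accordingly.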
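
The lease behaviour described in these notes (acquire only if absent, always with an expiry, renew only while still the owner) can be sketched without a Redis server. The following is a minimal illustration under stated assumptions, not the code used above: InMemoryLeaseStore, setIfAbsent, renewIfOwner and heartbeat are hypothetical names, and the store is an in-memory stand-in for Redis. Against a real Redis, the atomic acquire step corresponds to the SET command with the NX and EX options, and an owner-checked renewal is usually done in a server-side script.

```java
import java.util.HashMap;
import java.util.Map;

public class LeaseLockSketch {

    /** In-memory stand-in for the Redis key space (hypothetical, NOT a Redis client). */
    static class InMemoryLeaseStore {
        private static final class Entry {
            final String owner;
            long expiresAtMillis;

            Entry(String owner, long expiresAtMillis) {
                this.owner = owner;
                this.expiresAtMillis = expiresAtMillis;
            }
        }

        private final Map<String, Entry> entries = new HashMap<>();

        /** Writes owner and expiry in one step, only if the key is absent or already expired. */
        synchronized boolean setIfAbsent(String key, String owner, long ttlMillis, long nowMillis) {
            Entry current = entries.get(key);
            if (current != null && current.expiresAtMillis > nowMillis) {
                return false; // the lease is still held by someone
            }
            entries.put(key, new Entry(owner, nowMillis + ttlMillis));
            return true;
        }

        /** Extends the lease only if the caller is still its unexpired owner. */
        synchronized boolean renewIfOwner(String key, String owner, long ttlMillis, long nowMillis) {
            Entry current = entries.get(key);
            if (current == null || current.expiresAtMillis <= nowMillis || !current.owner.equals(owner)) {
                return false;
            }
            current.expiresAtMillis = nowMillis + ttlMillis;
            return true;
        }
    }

    /** One heartbeat tick: renew if master, otherwise try to take over. Returns true if the node is master afterwards. */
    static boolean heartbeat(InMemoryLeaseStore store, String key, String nodeId, long ttlMillis, long nowMillis) {
        return store.renewIfOwner(key, nodeId, ttlMillis, nowMillis)
                || store.setIfAbsent(key, nodeId, ttlMillis, nowMillis);
    }

    public static void main(String[] args) {
        InMemoryLeaseStore store = new InMemoryLeaseStore();
        String key = "lock.synchronization";
        long ttl = 30_000;

        // t = 0: node A takes the lease, node B stays standby
        System.out.println("A is master at t=0: " + heartbeat(store, key, "A", ttl, 0));
        System.out.println("B is master at t=0: " + heartbeat(store, key, "B", ttl, 0));

        // t = 3000: A renews, so the lease now runs until t = 33000; then A crashes
        heartbeat(store, key, "A", ttl, 3_000);

        // B keeps polling: it fails while the lease is valid and takes over once it has expired
        System.out.println("B is master at t=20000: " + heartbeat(store, key, "B", ttl, 20_000));
        System.out.println("B is master at t=33000: " + heartbeat(store, key, "B", ttl, 33_000));
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic. Business code would then check the boolean role flag before each run of the protected job, in the same way the notes describe for isMaster.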

Implementing a Distributed Lock with Zookeeper

Zookeeper is a distributed coordination service, so it is naturally well suited to implementing distributed locks. See the following code:
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.atomic.AtomicBoolean;

import javax.annotation.PostConstruct;

import org.I0Itec.zkclient.IZkDataListener;
import org.I0Itec.zkclient.ZkClient;
import org.I0Itec.zkclient.exception.ZkException;
import org.I0Itec.zkclient.exception.ZkNodeExistsException;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.context.embedded.EmbeddedServletContainerInitializedEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.stereotype.Service;

import com.ehl.tvc.util.ZookeeperUtil;

@Service
public class MasterElect implements ApplicationListener<EmbeddedServletContainerInitializedEvent>{
	private static Log log = LogFactory.getLog(MasterElect.class);
	public static final String LOCK_ROOT = "/tvc-lock";
	public static final String LOCK_PATH = "/tvc-creonserver-master";
	/**
	 * Global switch.
	 * 
	 * Data synchronization is allowed only while the current node holds the master lock.
	 */
	public static final AtomicBoolean SWITCH = new AtomicBoolean(false);

	@Autowired
	private ZookeeperUtil zookeeperUtil;

	@PostConstruct
	public void init() {
		ZkClient zkClient = zookeeperUtil.getZkClient();
		if (!zkClient.exists(LOCK_ROOT)) {
			zkClient.createPersistent(LOCK_ROOT);
		}
		zkClient.subscribeDataChanges(LOCK_ROOT + LOCK_PATH, new IZkDataListener() {
			@Override
			public void handleDataDeleted(String dataPath) throws Exception {
				// The lock node was deleted: try to register as master
				electMaster();
			}

			@Override
			public void handleDataChange(String dataPath, Object data) throws Exception {

			}
		});
	}

	public void electMaster() {
		// Loop instead of recursing, so that a failed retry can never fall
		// through to the "success" branch and wrongly mark this node as master.
		while (true) {
			try {
				zookeeperUtil.getZkClient().createEphemeral(LOCK_ROOT + LOCK_PATH, hostAddress + ":" + serverPort);
				SWITCH.set(true);
				log.info("try to get lock succeeded, this node is master");
				return;
			} catch (ZkNodeExistsException e) {
				// Another node already holds the lock: stay standby
				SWITCH.set(false);
				log.info("try to get lock failed");
				return;
			} catch (Exception e) {
				// Any other error: wait, then try to get the lock again
				log.error("error while trying to get lock, retrying", e);
				try {
					Thread.sleep(3 * 1000);
				} catch (InterruptedException e1) {
					Thread.currentThread().interrupt();
					return;
				}
			}
		}
	}

	
	private int serverPort=0;
	private String hostAddress="";
	@Override
	public void onApplicationEvent(EmbeddedServletContainerInitializedEvent event) {
		serverPort=event.getEmbeddedServletContainer().getPort();
		try {
			hostAddress=InetAddress.getLocalHost().getHostAddress();
		} catch (UnknownHostException e) {
			log.error("failed to resolve local host address", e);
		}
		electMaster();
	}
}


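Notes:
  • The lock is acquired by creating an ephemeral node in Zookeeper. The service node that successfully creates the designated znode becomes the master; service nodes that fail to create it register a change listener on that znode (in the code above, every node subscribes during initialization).
  • When the master fails, its Zookeeper session times out and the ephemeral node is destroyed. The standby nodes then compete for the lock, and the one that acquires it becomes the master.
  • To make it easy to see in Zookeeper which node is currently the master, the value of the ephemeral node is set to the node's IP and port when it is created.
  • ZookeeperUtil in the code is a wrapper that initializes the Zookeeper-related paths; replace it with anything that can provide a ZkClient object.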
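
The election flow in these notes can be traced with a small simulation that needs no Zookeeper server. This is only a sketch under stated assumptions: FakeEphemeralNode and Candidate are hypothetical in-memory stand-ins, not ZkClient classes, and watchers are notified in registration order, whereas real standby nodes race and any of them may win.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class EphemeralElectionSketch {

    /** In-memory stand-in for one ephemeral znode (hypothetical, NOT a Zookeeper client). */
    static class FakeEphemeralNode {
        private String owner; // session that created the node, or null if the node is absent
        private final List<Candidate> watchers = new ArrayList<>();

        /** Creates the node for a session; returns false if it already exists. */
        synchronized boolean create(String sessionId) {
            if (owner != null) {
                return false; // plays the role of ZkNodeExistsException
            }
            owner = sessionId;
            return true;
        }

        synchronized void watch(Candidate candidate) {
            watchers.add(candidate);
        }

        /** Simulates a session timeout: the owner's node disappears and the remaining watchers are told. */
        void expireSession(String sessionId) {
            List<Candidate> toNotify;
            synchronized (this) {
                if (!sessionId.equals(owner)) {
                    return; // only the owning session keeps the node alive
                }
                owner = null;
                watchers.removeIf(c -> c.sessionId.equals(sessionId)); // the failed node no longer listens
                toNotify = new ArrayList<>(watchers);
            }
            for (Candidate candidate : toNotify) {
                candidate.onNodeDeleted();
            }
        }

        synchronized String owner() {
            return owner;
        }
    }

    /** A service node taking part in the election. */
    static class Candidate {
        final String sessionId;
        final AtomicBoolean isMaster = new AtomicBoolean(false);
        private final FakeEphemeralNode node;

        Candidate(String sessionId, FakeEphemeralNode node) {
            this.sessionId = sessionId;
            this.node = node;
            node.watch(this); // subscribe up front, before the first election attempt
        }

        /** Master if and only if this candidate managed to create the node. */
        void elect() {
            isMaster.set(node.create(sessionId));
        }

        void onNodeDeleted() {
            elect();
        }
    }

    public static void main(String[] args) {
        FakeEphemeralNode node = new FakeEphemeralNode();
        Candidate a = new Candidate("10.0.0.1:8080", node);
        Candidate b = new Candidate("10.0.0.2:8080", node);
        Candidate c = new Candidate("10.0.0.3:8080", node);

        a.elect();
        b.elect();
        c.elect();
        System.out.println("master after startup: " + node.owner());

        // The master's host fails, its session times out, and the standbys compete
        node.expireSession(a.sessionId);
        System.out.println("master after failover: " + node.owner());
    }
}
```

Using the IP and port as the session identifier mirrors the node value chosen in the notes, so whoever inspects the node can tell which instance is currently the master.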

Other Notes

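Whenever possible, prefer Zookeeper for distributed coordination. Redis was used above because the business strongly depends on a Redis cluster and there was no desire to introduce Zookeeper for the service; in that situation, Redis is worth a try.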

Reposted from blog.csdn.net/u010820702/article/details/78702031