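Preface
I previously set up HDFS high availability (see the earlier post). Building on that cluster, this post sets up YARN high availability.
The helper scripts used below are covered in the linked "common scripts" post.
Modify the configuration and distribute it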
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the addresses of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop103</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop104</value>
</property>
<!-- Specify the address of the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store the ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Site specific YARN configuration properties -->
<!-- Enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Retain aggregated logs for 7 days -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
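Before the file is copied anywhere, it can help to read the HA-related values back out of it and catch typos early. The following is only a rough sketch: `get_prop` and `show_ha_props` are hypothetical helper names, not part of Hadoop, and the extraction assumes each `<value>` sits on the same line as its `<name>` or on the line right after it (it is not a real XML parser).

```shell
# Print the <value> of the named property from a Hadoop-style XML file.
# $1 = path to the XML file, $2 = property name.
# Assumption: <value> is on the <name> line or on the line directly after it.
get_prop() {
  grep -F -A1 "<name>$2</name>" "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
    | head -n 1
}

# List the HA-related properties used in this post, read from the file in $1.
show_ha_props() {
  for prop in \
    yarn.resourcemanager.ha.enabled \
    yarn.resourcemanager.ha.rm-ids \
    yarn.resourcemanager.hostname.rm1 \
    yarn.resourcemanager.hostname.rm2 \
    yarn.resourcemanager.zk-address
  do
    echo "$prop = $(get_prop "$1" "$prop")"
  done
}
```

Calling `show_ha_props yarn-site.xml` from the configuration directory prints one `name = value` line per property; an empty value suggests a missing or misspelled property name.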
Distribute the file to every host, then restart the cluster
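The distribution step itself is not shown in this post (the linked scripts post presumably covers it). As a stand-alone sketch, assuming passwordless ssh, `rsync` on every host, and the same Hadoop install path everywhere, it could look like the following. The host names come from the configuration above; the default path `/opt/module/hadoop` and the names `sync_cmd` and `distribute` are hypothetical placeholders.

```shell
# Hosts taken from the zk-address setting in yarn-site.xml.
HOSTS="hadoop102 hadoop103 hadoop104"
# Assumed location of the config file; the fallback path is a placeholder.
CONF_FILE="${HADOOP_HOME:-/opt/module/hadoop}/etc/hadoop/yarn-site.xml"

# Print the rsync command that would copy the file to one host.
sync_cmd() {
  printf 'rsync -av %s %s:%s\n' "$CONF_FILE" "$1" "$CONF_FILE"
}

# Copy the file to every host, or only print the commands when DRY_RUN=1.
distribute() {
  for target in $HOSTS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      sync_cmd "$target"
    else
      rsync -av "$CONF_FILE" "$target:$CONF_FILE"
    fi
  done
}
```

Running `DRY_RUN=1 distribute` first shows the three commands without touching any host; plain `distribute` then performs the copy.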
Stop the whole cluster
zhstop
Start the whole cluster
zhstart
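Note: the ResourceManagers cannot all be brought up by the cluster start script. Any RM that has not come up must be started manually on its own host.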
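The manual start is not spelled out above, so here is a hedged sketch of one way to do it. The Hadoop version is not stated in this post: on Hadoop 2.x the daemon is started with `yarn-daemon.sh start resourcemanager`, on Hadoop 3.x with `yarn --daemon start resourcemanager`. The function names are hypothetical, and the sketch assumes passwordless ssh and that `jps` and the Hadoop scripts are on the `PATH` of non-interactive ssh sessions.

```shell
# Hosts configured as rm1 and rm2 in yarn-site.xml.
RM_HOSTS="hadoop103 hadoop104"

# Print the ResourceManager start command for a Hadoop major version (2 or 3).
rm_start_cmd() {
  case "$1" in
    2) echo 'yarn-daemon.sh start resourcemanager' ;;
    3) echo 'yarn --daemon start resourcemanager' ;;
    *) echo "unsupported Hadoop major version: $1" >&2; return 1 ;;
  esac
}

# Start a ResourceManager on each RM host where jps does not list one yet.
# $1 = Hadoop major version (2 or 3).
start_missing_rms() {
  start_cmd=$(rm_start_cmd "$1") || return 1
  for rm_host in $RM_HOSTS; do
    if ssh "$rm_host" jps | grep -q ResourceManager; then
      echo "$rm_host: ResourceManager already running"
    else
      echo "$rm_host: starting ResourceManager"
      ssh "$rm_host" "$start_cmd"
    fi
  done
}
```

For example, `start_missing_rms 2` on a Hadoop 2.x cluster leaves a running RM alone and starts only the missing one.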
Verify with xcall jps
xcall jps
Check the HA state
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
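Each of the two commands prints either `active` or `standby`. In a working HA pair, exactly one RM should report `active` and the other `standby`. A small sketch that runs both queries and classifies the result is given below; `ha_status` and `check_rm_ha` are hypothetical helper names, and the labels they print are this sketch's own wording, not Hadoop output.

```shell
# Classify the pair of states reported for rm1 and rm2.
ha_status() {
  case "$1,$2" in
    active,standby|standby,active) echo "healthy" ;;
    active,active)                 echo "both-active" ;;
    standby,standby)               echo "no-active" ;;
    *)                             echo "unknown" ;;
  esac
}

# Query both ResourceManagers and print a one-line summary.
# An unreachable RM yields an empty state, which is reported as "unknown".
check_rm_ha() {
  state1=$(yarn rmadmin -getServiceState rm1 2>/dev/null)
  state2=$(yarn rmadmin -getServiceState rm2 2>/dev/null)
  echo "rm1=$state1 rm2=$state2 -> $(ha_status "$state1" "$state2")"
}
```

Stopping the active RM and running `check_rm_ha` again is a simple way to watch the standby take over.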
Summary
2020-03-05 00:34:26,697 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Invalid configuration! Can not find valid RM_HA_ID. None of yarn.resourcemanager.address.rm1 yarn.resourcemanager.address.rm2 are matching the local address OR yarn.resourcemanager.ha.id is not specified in HA Configuration
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Invalid configuration! Can not find valid RM_HA_ID. None of yarn.resourcemanager.address.rm1 yarn.resourcemanager.address.rm2 are matching the local address OR yarn.resourcemanager.ha.id is not specified in HA Configuration
at org.apache.hadoop.yarn.conf.HAUtil.throwBadConfigurationException(HAUtil.java:43)
at org.apache.hadoop.yarn.conf.HAUtil.verifyAndSetCurrentRMHAId(HAUtil.java:125)
at org.apache.hadoop.yarn.conf.HAUtil.verifyAndSetConfiguration(HAUtil.java:81)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:223)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1177)
2020-03-05 00:34:26,699 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
If startup fails, re-format the NameNode and ZKFC by following the initialization steps in the HDFS-HA setup post (see the link to that post).
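Separately from re-formatting, the log message above says that none of the configured `rm1`/`rm2` addresses match the local address. One cause worth ruling out is that the ResourceManager was started on a host other than hadoop103 or hadoop104. The sketch below makes that comparison by host name only, which is a simplification: Hadoop compares resolved addresses, so treat the result as a hint. The function names are hypothetical.

```shell
# Hosts configured as rm1 and rm2 in yarn-site.xml.
RM_HOSTS="hadoop103 hadoop104"

# Return 0 if the host name in $1 is one of the configured RM hosts.
is_rm_host() {
  for candidate in $RM_HOSTS; do
    [ "$candidate" = "$1" ] && return 0
  done
  return 1
}

# Report whether the current machine is expected to run a ResourceManager.
check_local_rm_host() {
  this_host=$(hostname)
  if is_rm_host "$this_host"; then
    echo "$this_host is configured as a ResourceManager host"
  else
    echo "$this_host is not in RM_HOSTS ($RM_HOSTS); an RM started here may fail with the RM_HA_ID error"
  fi
}
```

Running `check_local_rm_host` on the machine that produced the log shows quickly whether the RM was started in the wrong place.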
Key configuration properties explained
Property name | Property value |
---|---|
yarn.resourcemanager.recovery.enabled | true (enables automatic recovery) |