hadoop-config.sh
hadoop-config.sh sources hadoop-env.sh if the file exists in HADOOP_CONF_DIR:
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi
hadoop-daemon.sh
At the top of hadoop-daemon.sh, the script sources hadoop-config.sh, which in turn sources hadoop-env.sh (first time):
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
hadoop-daemon.sh
After sourcing hadoop-config.sh, hadoop-daemon.sh sources hadoop-env.sh again (second time), so every setting in it is applied twice and any value built by appending to a variable is duplicated:
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi
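To make the effect of double sourcing concrete, here is a minimal sketch that does not touch a real Hadoop installation. The temporary file stands in for hadoop-env.sh, and DEMO_OPTS is a hypothetical variable standing in for any setting that is built by appending to its previous value.

```shell
#!/bin/sh
# Stand-in for hadoop-env.sh (assumption: a temp file, not a real Hadoop config).
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
# Append to the previous value, as env files commonly do.
DEMO_OPTS="$DEMO_OPTS -Dexample.flag=true"
EOF

unset DEMO_OPTS
DEMO_OPTS=""
. "$env_file"   # first sourcing, as done through hadoop-config.sh
. "$env_file"   # second sourcing, as done directly by hadoop-daemon.sh

# The flag now appears twice in DEMO_OPTS.
echo "DEMO_OPTS=$DEMO_OPTS"
rm -f "$env_file"
```

Plain assignments such as FOO=bar are harmless when repeated; only settings that extend an existing value grow with each extra sourcing.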
hadoop-daemon.sh
hadoop-daemon.sh then executes the hdfs script to start the namenode:
case $command in
  namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
    if [ -z "$HADOOP_HDFS_HOME" ]; then
      hdfsScript="$HADOOP_PREFIX"/bin/hdfs
    else
      hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
    fi
hdfs
The hdfs script sources hdfs-config.sh, which sources hadoop-env.sh yet again (through hadoop-config.sh), so hadoop-env.sh is sourced three times in total:
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh
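The chain described so far can be reproduced with a few throwaway scripts. This is only a sketch of the call structure: env.sh, config.sh, daemon.sh, hdfs.sh and DEMO_DIR are hypothetical stand-ins for hadoop-env.sh, hadoop-config.sh / hdfs-config.sh, hadoop-daemon.sh, bin/hdfs and the configuration directory, and the real scripts do much more than this.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Stand-in for hadoop-env.sh: record every time it is sourced.
cat > "$dir/env.sh" <<'EOF'
echo "sourced" >> "$DEMO_DIR/count.log"
EOF

# Stand-in for hadoop-config.sh / hdfs-config.sh: sources the env file.
cat > "$dir/config.sh" <<'EOF'
. "$DEMO_DIR/env.sh"
EOF

# Stand-in for bin/hdfs: sources the config script.
cat > "$dir/hdfs.sh" <<'EOF'
. "$DEMO_DIR/config.sh"
EOF

# Stand-in for hadoop-daemon.sh: config script, env file, then the hdfs script.
cat > "$dir/daemon.sh" <<'EOF'
. "$DEMO_DIR/config.sh"
. "$DEMO_DIR/env.sh"
sh "$DEMO_DIR/hdfs.sh"
EOF

export DEMO_DIR="$dir"
sh "$dir/daemon.sh"

# Count the recorded sourcings (tr strips padding that some wc versions add).
source_count=$(wc -l < "$dir/count.log" | tr -d ' ')
echo "env.sh was sourced $source_count times"
rm -rf "$dir"
```

On a real installation, a similar temporary echo line at the top of hadoop-env.sh is one way to confirm how often the file is read during startup.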
Solution
Add the following code at the top of hadoop-env.sh. The first time the file is sourced, it exports the HADOOP_ENV_LOADED variable and continues with the rest of the file; on every later sourcing it returns immediately.
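# Skip the rest of this file if it has already been sourced.
if [ "${HADOOP_ENV_LOADED}" ]; then
  return
else
  # Exported so that child processes such as bin/hdfs inherit the guard.
  export HADOOP_ENV_LOADED="true"
fi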
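The guard can be tried out in isolation before editing a real hadoop-env.sh. In the sketch below the temporary file again stands in for hadoop-env.sh and DEMO_OPTS is a hypothetical appended setting; only HADOOP_ENV_LOADED comes from the solution above.

```shell
#!/bin/sh
# Stand-in for a guarded hadoop-env.sh (assumption: a temp file).
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
if [ "${HADOOP_ENV_LOADED}" ]; then
  return
else
  export HADOOP_ENV_LOADED="true"
fi
DEMO_OPTS="$DEMO_OPTS -Dexample.flag=true"
EOF

unset HADOOP_ENV_LOADED DEMO_OPTS
DEMO_OPTS=""
. "$env_file"   # runs the body and sets the guard
. "$env_file"   # returns immediately
. "$env_file"   # returns immediately

# The flag appears only once despite three sourcings.
echo "DEMO_OPTS=$DEMO_OPTS"

# A child process inherits the exported guard and therefore skips the body.
# DEMO_OPTS was not exported, so the child sees it as empty.
child_opts=$(sh -c '. "$1"; printf %s "$DEMO_OPTS"' sh "$env_file")
echo "child DEMO_OPTS=$child_opts"
rm -f "$env_file"
```

The last step points at a consequence of exporting the guard: a child process never runs the body of the env file, so it only sees settings that the parent exported. Settings in hadoop-env.sh that child scripts need should therefore be exported as well.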