Copyright notice: technical column; please credit the source when reposting! https://blog.csdn.net/wankunde/article/details/79194820
Hadoop audit configuration
Edit log4j.properties:
#
# hdfs audit logging (disabled by default; enabled via -Dhdfs.audit.logger)
#
# NullAppender is defined elsewhere in the stock Hadoop log4j.properties; add it if missing:
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}
#
# RM audit logging
#
rm.audit.logger=INFO,NullAppender
rm.audit.log.maxfilesize=256MB
rm.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=${rm.audit.logger}
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger=false
log4j.appender.RMAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RMAUDIT.File=${hadoop.log.dir}/resourcemanager-audit.log
log4j.appender.RMAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RMAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RMAUDIT.MaxFileSize=${rm.audit.log.maxfilesize}
log4j.appender.RMAUDIT.MaxBackupIndex=${rm.audit.log.maxbackupindex}
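Once an audit appender is active, each entry is a single line of key=value fields (allowed, ugi, ip, cmd, src, dst, perm). A quick way to see which operations dominate is to count the cmd field. This is a sketch against a fabricated sample file; the field layout shown is typical for HDFS audit logs but can vary slightly by version, and in practice you would point the grep at $HADOOP_LOG_DIR/hdfs-audit.log:

```shell
# illustrative sample of HDFS audit lines (made-up users and paths)
cat > hdfs-audit.sample <<'EOF'
2018-01-29 10:00:01,000 INFO FSNamesystem.audit: allowed=true ugi=alice (auth:SIMPLE) ip=/10.0.0.1 cmd=open src=/data/a dst=null perm=null
2018-01-29 10:00:02,000 INFO FSNamesystem.audit: allowed=true ugi=bob (auth:SIMPLE) ip=/10.0.0.2 cmd=open src=/data/b dst=null perm=null
2018-01-29 10:00:03,000 INFO FSNamesystem.audit: allowed=true ugi=alice (auth:SIMPLE) ip=/10.0.0.1 cmd=listStatus src=/data dst=null perm=null
EOF
# count audit entries per command, most frequent first
grep -oE 'cmd=[^[:space:]]+' hdfs-audit.sample | sort | uniq -c | sort -rn
```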
Edit hadoop-env.sh — the -D...audit.logger options below switch the audit loggers from the NullAppender to the file appenders defined above:
export HADOOP_NAMENODE_OPTS=" -Xms78g -Xmx78g -Xmn10g -XX:+UseG1GC -XX:G1HeapRegionSize=8m -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:$HADOOP_LOG_DIR/gc-namenode.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M -Dhdfs.audit.logger=INFO,RFAAUDIT -Dhadoop.security.logger=INFO,RFAS "
export HADOOP_DATANODE_OPTS=" -Xms4g -Xmx4g -XX:+UseParNewGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:$HADOOP_LOG_DIR/gc-datanode.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M -Dhadoop.security.logger=INFO,RFAS "
export YARN_RESOURCEMANAGER_OPTS=" -Xms20g -Xmx20g -XX:+UseG1GC -XX:G1HeapRegionSize=8m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/appcom/log/hadoop-yarn/resourcemanager.hprof -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:/appcom/log/hadoop-yarn/gc-resourcemanager.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M -Drm.audit.logger=INFO,RMAUDIT -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY"
export YARN_NODEMANAGER_OPTS=" $YARN_NODEMANAGER_OPTS -Xms6g -Xmx8g "
export HADOOP_JOB_HISTORYSERVER_OPTS=" -Xms20g -Xmx20g "
Hive audit configuration
Edit hive-log4j.properties:
# audit log
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
metastore.audit.logger=INFO,NullAppender
log4j.appender.HAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.HAUDIT.File=${hive.log.dir}/hive-audit.log
log4j.appender.HAUDIT.DatePattern=.yyyy-MM-dd
log4j.appender.HAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.HAUDIT.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
log4j.logger.org.apache.hadoop.hive.metastore.HiveMetaStore.audit=${metastore.audit.logger}
log4j.additivity.org.apache.hadoop.hive.metastore.HiveMetaStore.audit=false
Edit hive-env.sh so that the Hive metastore reads this variable at startup and switches the log4j output. Other Hive clients keep the NullAppender and therefore write no audit file.
HIVE_METASTORE_HADOOP_OPTS=" -Dmetastore.audit.logger=INFO,HAUDIT "
While we are at it, one more setting for hiveserver2:
if [ "$SERVICE" = "hiveserver2" ]; then
export HADOOP_HEAPSIZE=20480
export HADOOP_OPTS="$HADOOP_OPTS -XX:+UseG1GC -XX:G1HeapRegionSize=8m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/appcom/log/hive/hive-server2.hprof -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:/appcom/log/hive/gc-hive-server2.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=64M"
fi
Environment variables in log4j
Several variables are referenced in the log4j.properties file. The /usr/lib/hadoop/libexec/hadoop-config.sh script reads environment variables from the system and passes them in as java -D parameters, which override the defaults. So if the audit log ends up in a directory you did not want, set the HADOOP_LOG_DIR environment variable.
Excerpt from hadoop-config.sh:
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.file=$HADOOP_LOGFILE"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_PREFIX"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
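Seen in isolation, this is plain shell string accumulation: whatever HADOOP_LOG_DIR holds in the environment ends up as the -Dhadoop.log.dir system property. A standalone sketch (the directory below is an example value):

```shell
# sketch of the accumulation pattern used in hadoop-config.sh
HADOOP_LOG_DIR=/var/log/hadoop-demo        # example value, normally exported beforehand
HADOOP_OPTS=""
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
echo "$HADOOP_OPTS"
```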
For example, Hive's run logs default to hive.log.dir=${java.io.tmpdir}/${user.name}, e.g. /tmp/dsp. So we can set export HADOOP_LOG_DIR=${TMPDIR-/tmp}/${USER} in ${HIVE_CONF}/hive-env.sh.
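The `${TMPDIR-/tmp}` form is plain POSIX default-value expansion: it substitutes /tmp only when TMPDIR is unset (unlike `${TMPDIR:-/tmp}`, which also covers the empty-string case). A quick check:

```shell
# ${VAR-default} falls back only when VAR is unset
unset TMPDIR
USER=${USER:-dsp}                # example user, mirroring /tmp/dsp above
echo "${TMPDIR-/tmp}/${USER}"    # -> /tmp/<user>
```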
Enabling HDFS JMX
Configure the JMX startup parameters in hadoop-env.sh:
export HDFS_JMX_OPTS=" -Dcom.sun.management.jmxremote.authenticate=true \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.password.file=/etc/hadoop/conf/hdfsjmxremote.password \
-Dcom.sun.management.jmxremote.access.file=/etc/hadoop/conf/hdfsjmxremote.access \
-Dcom.sun.management.jmxremote.port"
export HADOOP_NAMENODE_OPTS="$HADOOP_NAMENODE_OPTS $HDFS_JMX_OPTS=9191 "
export HADOOP_DATANODE_OPTS="$HADOOP_DATANODE_OPTS $HDFS_JMX_OPTS=9192 "
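The reason `com.sun.management.jmxremote.port` is left without a value at the end of HDFS_JMX_OPTS is that each per-daemon export appends `=9191` / `=9192` directly to the variable's expansion, so one shared option string yields a distinct port per daemon. The expansion can be checked directly:

```shell
# the shared option string ends with the bare port property...
HDFS_JMX_OPTS=" -Dcom.sun.management.jmxremote.ssl=false \
 -Dcom.sun.management.jmxremote.port"
# ...and the per-daemon export completes it with a value
NN_OPTS="$HDFS_JMX_OPTS=9191 "
echo "$NN_OPTS"
```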
- com.sun.management.jmxremote.authenticate=true : enable authentication
- com.sun.management.jmxremote.ssl=false : do not use SSL
- com.sun.management.jmxremote.password.file=xx : lists the monitoring users and their passwords, separated by whitespace
- com.sun.management.jmxremote.access.file=yy : lists each monitoring user's permission, username and permission separated by whitespace. There are two permissions, readonly and readwrite: readonly exposes only basic monitoring information, while readwrite additionally allows inspecting running threads and sampling CPU and memory.
- Set the file permissions of both the password file and the access file to 600.
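A sketch of what the two files might contain. The usernames and passwords are made up, but the line formats — `user password` and `user readonly|readwrite` — are the standard JDK jmxremote file formats. The sketch writes to a temporary directory; in production the paths would be the /etc/hadoop/conf files referenced above:

```shell
CONF_DIR=$(mktemp -d)   # stand-in for /etc/hadoop/conf

# password file: one "user password" pair per line (example credentials)
cat > "$CONF_DIR/hdfsjmxremote.password" <<'EOF'
monitor  monitorpass
admin    adminpass
EOF

# access file: one "user permission" pair per line
cat > "$CONF_DIR/hdfsjmxremote.access" <<'EOF'
monitor  readonly
admin    readwrite
EOF

# the JVM refuses to start JMX if the password file is group/world readable
chmod 600 "$CONF_DIR/hdfsjmxremote.password" "$CONF_DIR/hdfsjmxremote.access"
```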