Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/liaynling/article/details/83016311
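Dr. Elephant is a performance monitoring and tuning tool for Hadoop, Hive, and Spark jobs. It was open-sourced by a team at LinkedIn in 2016.
I. Environment Setup
Overall environment: Dr. Elephant 2.0.13, Hadoop 2.6.5, Spark 2.2.3
1. Install JDK 8
2. Install Play Framework
1) Download and unpack Play Framework
Download Play; the link to the installation package is at the bottom of the download page.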
$wget https://downloads.typesafe.com/play/2.2.6/play-2.2.6.zip
$unzip play-2.2.6.zip
2) Configure the Play Framework environment variables
$vim ~/.bash_profile
export JAVA_HOME=/opt/java
export PLAY_HOME=/opt/play-2.2.6
export PATH=$PATH:$PLAY_HOME:$JAVA_HOME/bin
3) Test Play Framework
Create a new application/project:
$play new helloworld
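Before moving on to the build, it can help to confirm that the PATH entries above took effect in the current shell. A minimal sketch follows; require_cmd is a hypothetical helper name, not part of Play or Dr. Elephant.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command can be found on PATH.
require_cmd() {
  # $1: command name to look up
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1 -> $(command -v "$1")"
  else
    echo "missing: $1 (check PATH in ~/.bash_profile)" >&2
    return 1
  fi
}

# Usage, after running "source ~/.bash_profile":
#   require_cmd java && require_cmd play
```

If either command is reported missing, re-check JAVA_HOME and PLAY_HOME before starting the build.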
3. Build Dr. Elephant
1) Download dr-elephant-2.0.13 from github.com
$wget https://github.com/linkedin/dr-elephant/archive/v2.0.13.zip
$unzip v2.0.13.zip
2) Edit the configuration file compile.conf
$cd ~/dr-elephant-2.0.13
$vim compile.conf
hadoop_version=2.6.5
spark_version=2.2.3
play_opts="-Dsbt.repository.config=app-conf/resolver.conf"
3) Build the installation package
$cd ~/dr-elephant-2.0.13
# If the build below fails, retry it, or switch to another version, until it succeeds.
$./compile.sh ./compile.conf
$ls dist
dr-elephant-2.0.13.zip
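When the build finishes, a SUCCESS message is printed. The source directory now contains a new directory, dist, which holds the zip package dr-elephant-2.0.13.zip. Unzipping this package produces the dr-elephant-2.0.13 distribution used for deployment.
4. Deploy Dr. Elephant
1) Unpack the distribution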
$cd ~/dr-elephant-2.0.13/dist
$mv dr-elephant-2.0.13.zip /opt/
$cd /opt/;unzip dr-elephant-2.0.13.zip
2) Fix the bug in the SQL file
$cd /opt/dr-elephant-2.0.13
$vim conf/evolutions/default/1.sql
Replace lines 49-51, from
create index yarn_app_result_i4 on yarn_app_result (flow_exec_id);
create index yarn_app_result_i5 on yarn_app_result (job_def_id);
create index yarn_app_result_i6 on yarn_app_result (flow_def_id);
to
create index yarn_app_result_i4 on yarn_app_result (flow_exec_id(191));
create index yarn_app_result_i5 on yarn_app_result (job_def_id(191));
create index yarn_app_result_i6 on yarn_app_result (flow_def_id(191));
This resolves the following error:
[error] c.j.b.ConnectionHandle - Database access problem. Killing off this connection and all remaining connections in the connection pool. SQL State = HY000
The MySQL character set must be set to UTF8.
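The same edit can be scripted instead of being made by hand in vim. The sketch below is one possible way to do it, assuming the three index lines appear exactly as quoted above; add_index_prefix is a hypothetical helper name. It prints the rewritten file to standard output so the result can be reviewed before 1.sql is overwritten. As background (an assumption about why 191 was chosen, not something the steps above state): older InnoDB row formats limit an index key prefix to 767 bytes, and at up to 4 bytes per character (utf8mb4) that allows 767 / 4 = 191 characters.

```shell
#!/bin/sh
# Hypothetical helper: add a 191-character prefix to the three indexes
# yarn_app_result_i4..i6 and print the rewritten SQL to standard output.
add_index_prefix() {
  # $1: path to the SQL file (conf/evolutions/default/1.sql in this article)
  # Only lines that create indexes i4, i5, or i6 are touched; on those lines
  # "(column);" becomes "(column(191));". Already-prefixed lines do not match.
  sed -E '/create index yarn_app_result_i[456] on yarn_app_result/ s/\((flow_exec_id|job_def_id|flow_def_id)\);/(\1(191));/' "$1"
}

# Usage (review the output before replacing the original file):
#   add_index_prefix conf/evolutions/default/1.sql > /tmp/1.sql
#   diff conf/evolutions/default/1.sql /tmp/1.sql
```

Because the pattern requires ");" directly after the column name, running the helper twice leaves the file unchanged the second time.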
3) Edit the configuration files
Set the MySQL connection details: vim app-conf/elephant.conf
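For orientation, the MySQL section of elephant.conf looks roughly like the fragment below. The key names are given from memory of the stock file and are an assumption to verify against your own copy; every value is a placeholder, not a setting taken from this article.

```shell
# Assumed key names; check them against your app-conf/elephant.conf.
# All values are placeholders.
db_url=localhost        # MySQL host
db_name=drelephant      # database that Dr. Elephant writes to
db_user=root            # MySQL user
db_password=""          # password for db_user
```

The database named in db_name has to exist in MySQL before the process is started.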
Adjust the number of collection threads and the intervals: vim app-conf/GeneralConf.xml
<configuration>
<property>
<name>drelephant.analysis.thread.count</name>
<value>15</value>
<description>Number of threads to analyze the completed jobs</description>
</property>
<property>
<name>drelephant.analysis.fetch.interval</name>
<value>60000</value>
<description>Interval between fetches in milliseconds</description>
</property>
<property>
<name>drelephant.analysis.retry.interval</name>
<value>60000</value>
<description>Interval between retries in milliseconds</description>
</property>
<property>
<name>drelephant.application.search.match.partial</name>
<value>true</value>
<description>If this property is "false", search will only make exact matches</description>
</property>
</configuration>
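Change drelephant.analysis.thread.count: the default is 3, and raising it to 15 is recommended. With 3 threads, reading from the JobHistory Server is too slow; above 15, reads become fast enough to put heavy load on the JobHistory Server. Of the next two properties, drelephant.analysis.fetch.interval is the interval between fetches and drelephant.analysis.retry.interval is the interval between retries.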
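To double-check which values are actually in the file after editing, a property can be read back from the command line. This is a minimal sketch, assuming each <name> and <value> element sits on a single line as in the listing above; get_property is a hypothetical helper name, not a Dr. Elephant tool.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of one property from a
# Hadoop-style XML configuration file.
get_property() {
  # $1: property name, $2: path to the XML file
  awk -v want="$1" '
    /<name>/ {
      # Remember the most recent property name seen.
      name = $0
      sub(/.*<name>/, "", name); sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      # First value following the wanted name: print it and stop.
      value = $0
      sub(/.*<value>/, "", value); sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$2"
}

# Usage:
#   get_property drelephant.analysis.thread.count app-conf/GeneralConf.xml
```

The helper prints nothing when the property is absent, which makes a missing setting easy to spot.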
4) Start the process
$cd /opt/dr-elephant-2.0.13
$sh -x bin/start.sh app-conf/
5) Open the web UI in a browser
http://$ip:8080