I have been reading the Hadoop source code lately, and to really follow what is going on it is sometimes necessary to debug it. While setting up the debugging environment I ran into quite a few problems, all of which I eventually solved one by one. I am sharing them here for fellow readers.
NoClassDefFoundError
The first problem was baffling: a class could not be found at runtime, even though nothing was flagged red in the IDE. I spent a long time suspecting a broken environment.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/Path
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:195)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:123)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.Path
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 3 more
It turned out that pom.xml contained an extra "provided" scope line, copied straight from the Maven repository page. Because the provided scope only removes the dependency from the runtime classpath while keeping it available at compile time, the code still compiled without errors, so I had not noticed. Commenting the line out fixed it.
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
<!--<scope>provided</scope>-->
</dependency>
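A quick way to tell whether such an error is a classpath problem rather than a code problem is to probe for the class at runtime. This is a small hypothetical helper, not part of the article's project:

```java
// ClasspathProbe.java — checks whether a class is visible on the runtime classpath.
public class ClasspathProbe {

    // Returns true if the named class can be loaded at runtime.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // With <scope>provided</scope> active, this prints "false" at run time
        // even though the project compiles cleanly.
        System.out.println(isOnClasspath("org.apache.hadoop.fs.Path"));
    }
}
```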
Could not locate executable null\bin\winutils.exe in the Hadoop binaries
Then came the second problem:
18/05/17 13:54:44 ERROR util.Shell: Failed to locate
the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable
null\bin\winutils.exe in the Hadoop binaries.
Solution: set the hadoop.home.dir system property in the code, download winutils.exe, and place it in the bin directory under that Hadoop home:
System.setProperty("hadoop.home.dir", "D:\\aws\\hadoop-2.6.0");
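Since Hadoop looks for the binary at &lt;hadoop.home.dir&gt;\bin\winutils.exe, it can help to verify the file is actually in place before running. A small sketch (WinutilsCheck and findWinutils are hypothetical names; "D:\\aws\\hadoop-2.6.0" is the path used in this article):

```java
import java.io.File;

public class WinutilsCheck {

    // Returns the winutils.exe location under the given Hadoop home,
    // or null if it is missing (the cause of the ERROR above).
    public static File findWinutils(String hadoopHome) {
        File exe = new File(new File(hadoopHome, "bin"), "winutils.exe");
        return exe.isFile() ? exe : null;
    }

    public static void main(String[] args) {
        // Adjust the path to your local Hadoop unpack directory.
        System.out.println(findWinutils("D:\\aws\\hadoop-2.6.0"));
    }
}
```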
No FileSystem for scheme: hdfs
No matching file system implementation can be found:
java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
The code compiles without errors, but the required dependency is missing at runtime.
The fix is to add the hadoop-hdfs dependency to pom.xml:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.6.0</version>
</dependency>
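If the dependency is present and the error still appears, a common cause (when packaging a fat jar) is that the META-INF/services/org.apache.hadoop.fs.FileSystem entries of hadoop-common and hadoop-hdfs overwrite each other during the merge. In that case the implementation can be pinned explicitly in the Configuration instead of relying on the service-loader metadata — a sketch, assuming a Hadoop 2.x client:

```java
Configuration conf = new Configuration();
// Pin the implementations so FileSystem.getFileSystemClass() does not
// depend on the (possibly clobbered) service-loader metadata.
conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
conf.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
```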
访问权限问题
org.apache.hadoop.security.AccessControlException: Permission denied:
user=XXX, access=WRITE, inode="/user":root:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.
checkFsPermission(FSPermissionChecker.java:271)
The Windows 7 client's OS user name does not match the HDFS user.
The fix is to set a system property in the code:
System.setProperty("HADOOP_USER_NAME", "root");
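One caveat worth noting: the Hadoop 2.x client resolves the user from the HADOOP_USER_NAME environment variable, then from the system property, and caches the result on first login. So the property must be set before the first file-system call — a fragment illustrating the required order:

```java
// Must run before the first FileSystem.get()/UserGroupInformation call:
// the login user is cached on first use, so setting it later has no effect.
System.setProperty("HADOOP_USER_NAME", "root");
FileSystem fs = FileSystem.get(URI.create("hdfs:///"), new Configuration());
```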
java.net.UnknownHostException
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: 1.txt
The problem is the file URI. After "hdfs:" there must either be three slashes in a row, or "//ip:port" followed by a slash. Never write "hdfs://1.txt", because "1.txt" would then be parsed as a host name.
The correct forms are:
fs = FileSystem.get(URI.create("hdfs:///1.txt"), conf);
or
fs = FileSystem.get(URI.create("hdfs://10.205.84.14:9000/1.txt"), conf);
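The standard URI parser shows why the bad form fails: in "hdfs://1.txt", everything between the two slashes and the next slash is the authority (host), so "1.txt" becomes a host to resolve and the path is empty — hence the UnknownHostException. A small demonstration, independent of Hadoop:

```java
import java.net.URI;

public class UriForms {
    public static void main(String[] args) {
        URI bad = URI.create("hdfs://1.txt");
        System.out.println(bad.getHost());   // "1.txt" — taken as a host name
        System.out.println(bad.getPath());   // ""      — no path at all

        URI good = URI.create("hdfs:///1.txt");
        System.out.println(good.getHost());  // null    — default file system is used
        System.out.println(good.getPath());  // "/1.txt"
    }
}
```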
Summary
- Add a resource folder to the client project containing two files, "core-site.xml" and "log4j.properties", copied from the Hadoop cluster so they stay consistent with it
- Set up the pom file properly, mainly "hadoop-common" and "hadoop-hdfs"
- Set the needed properties in the client code (they can also be supplied by other means)
Sample client code
package test.test;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.io.IOException;
import java.net.URI;
/**
* Created by shen.xiangxiang on 2018/3/30.
*/
public class TestHdfs {
public static void main( String[] args )
{
System.setProperty("hadoop.home.dir", "D:\\aws\\hadoop-2.6.0");
System.setProperty("HADOOP_USER_NAME", "root");
Configuration conf = new Configuration();
FileSystem fs = null;
try {
fs = FileSystem.get(URI.create("hdfs:///1.txt"), conf);
Path path = new Path("3.txt");
FSDataOutputStream out = fs.create(path); // create the file
out.write("hello".getBytes("UTF-8"));
out.writeUTF("da jia hao,cai shi zhen de hao!");
out.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Sample pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>test.test</groupId>
<artifactId>test_debug_hadoop</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>testtest</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
<!--<scope>provided</scope>-->
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
</project>