Environment setup
I chose CentOS 7 (docker pull centos:centos7), which already ships with Python 2.7, so only JDK 8 needs to be installed.
Download the JDK from the Oracle website, taking care to pick the build that matches your architecture (ARM, x86, or x64); mine is x64.
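docker run -it --name hello_datax centos:centos7
Start the container.
mkdir modules
mkdir softwares
Create two directories (under /opt, as the paths below show): softwares for the archives and modules for the extracted folders.
docker cp E:\MyWork\MyDevelopmentTools\Java\jdk-8u271-linux-x64.tar.gz hello_datax:/opt/softwares
Copy the archive into the container (this command runs on the host).
tar -zxvf /opt/softwares/jdk-8u271-linux-x64.tar.gz -C /opt/modules/
Extract it to the target location.
- Add the environment variables
vi /root/.bashrc
Do not put them in /etc/profile, because with Docker they stop taking effect every time the container is restarted.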
# JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.8.0_271
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/sbin
java -version
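Once the new variables are loaded (for example with source /root/.bashrc), this prints the Java version, which confirms that JDK 8 is installed.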
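The same check can be scripted. The following is a minimal Python sketch, not part of the original setup: the helper name java_bin_on_path is hypothetical, and it only inspects the environment variables (it does not run java itself). It assumes the Linux conventions used inside the container, that is, "/" as the path separator and ":" between PATH entries.

```python
import os
import posixpath


def java_bin_on_path(env):
    """Return True if $JAVA_HOME/bin is one of the PATH entries in env.

    env is any dict-like mapping of environment variables, for example
    os.environ. Hypothetical helper, written for illustration only.
    """
    java_home = env.get("JAVA_HOME")
    if not java_home:
        # JAVA_HOME is unset or empty, so the exports were not loaded.
        return False
    java_bin = posixpath.join(java_home, "bin")
    # PATH entries are separated by ":" on Linux.
    return java_bin in env.get("PATH", "").split(":")
```

Calling java_bin_on_path(os.environ) in a fresh shell inside the container should return True if the exports in /root/.bashrc were picked up.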
Download and install
- Download DataX
docker cp E:\MyWork\MyDevelopmentTools\datax.tar.gz hello_datax:/opt/softwares
Copy it into the container.
tar -zxvf /opt/softwares/datax.tar.gz -C /opt/modules/
Extract it to the target location.
cd /opt/modules/datax
Change into the datax directory.
python ./bin/datax.py ./job/job.json
Run the bundled test job.
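A job can also be launched from a script instead of being typed by hand. The sketch below is one possible way to do that in Python. The helper names datax_command and run_job are hypothetical, and it assumes that DataX was extracted to /opt/modules/datax as described above and that datax.py reports failure through a non-zero exit status (an assumption worth verifying on your version).

```python
import posixpath
import subprocess


def datax_command(datax_home, job_file):
    """Build the argument list for running one DataX job.

    Mirrors the manual command: python ./bin/datax.py <job_file>
    """
    return ["python", posixpath.join(datax_home, "bin", "datax.py"), job_file]


def run_job(datax_home, job_file):
    """Run the job from the DataX directory and return its exit status."""
    # cwd is set so that relative job paths such as ./job/job.json resolve
    # the same way as in the manual steps.
    return subprocess.call(datax_command(datax_home, job_file), cwd=datax_home)
```

Inside the container, run_job("/opt/modules/datax", "./job/job.json") is then equivalent to the manual command, and the returned status can be checked by the calling script.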
Fixing garbled Chinese characters on CentOS 7
Writing a test
vi ./job/stream2stream.json
As before, create a file under the job directory inside the datax directory and put the following content in it:
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "sliceRecordCount": 10,
                        "column": [
                            {
                                "type": "long",
                                "value": "10"
                            },
                            {
                                "type": "string",
                                "value": "hello,你好,世界-DataX"
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "encoding": "UTF-8",
                        "print": true
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": 5
            }
        }
    }
}
python ./bin/datax.py ./job/stream2stream.json
Run the job.
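Hand-editing nested JSON makes it easy to lose a brace or a comma, so the job file can also be generated. The following is a minimal Python sketch that builds the same stream-to-stream job as a dictionary and serializes it. The function names build_stream_job and to_json are hypothetical and are not part of DataX. The field names and values are copied from the job file above.

```python
# -*- coding: utf-8 -*-
import json


def build_stream_job(slice_record_count, channel):
    """Return the stream2stream job shown above as a Python dict."""
    reader = {
        "name": "streamreader",
        "parameter": {
            "sliceRecordCount": slice_record_count,
            "column": [
                {"type": "long", "value": "10"},
                {"type": "string", "value": u"hello,你好,世界-DataX"},
            ],
        },
    }
    writer = {
        "name": "streamwriter",
        "parameter": {"encoding": "UTF-8", "print": True},
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channel}},
        }
    }


def to_json(job):
    """Serialize the job; ensure_ascii=False keeps the Chinese text readable."""
    return json.dumps(job, ensure_ascii=False, indent=4)
```

Writing the result of to_json(build_stream_job(10, 5)) to ./job/stream2stream.json with UTF-8 encoding should give a file equivalent to the hand-written one.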