Spark can run on the Mesos or YARN cluster managers, or you can use Spark's standalone deploy mode, which is mainly used for local testing.
To install and deploy Spark you need a pre-built (compiled) release, which can be downloaded from the Spark website: http://spark.apache.org/downloads.html
Extract the archive (before deploying, prepare a Java environment and set JAVA_HOME).
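For example, a minimal setup on Linux might look like the sketch below; the Spark version, download mirror, and JAVA_HOME path are only illustrative, so substitute the release and JDK location you actually use:

```bash
# Download a pre-built release and unpack it (version and mirror are examples).
wget https://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz
tar -xzf spark-2.4.8-bin-hadoop2.7.tgz
cd spark-2.4.8-bin-hadoop2.7

# Spark needs a JDK; point JAVA_HOME at your installation (path is an example).
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$JAVA_HOME/bin:$PATH
```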
Run the following script to start the master:
`./sbin/start-master.sh`
After the master has started, you can access Spark's master node through its web UI; the default port is 8080:
http://localhost:8080
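To verify from the command line that the master's web UI is actually up, a simple probe like the one below can be used (it assumes the default port 8080 and that curl is installed; the page contents themselves vary between Spark versions):

```bash
# Succeeds only if the master web UI answers on the default port.
curl -sf http://localhost:8080 > /dev/null && echo "master web UI is up"
```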
Run the following script to start a worker:
`./sbin/start-slave.sh <master-spark-URL>`
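For example, if the worker runs on the same machine as the master and the master uses its default port, the command might be (the hostname and port are illustrative; use the spark://HOST:PORT URL printed in the master's log or shown on its web UI):

```bash
# Attach this worker to the master started above (URL is an example).
./sbin/start-slave.sh spark://localhost:7077
```

After the worker registers, it appears in the master's web UI, and the worker's own web UI is served on port 8081 by default.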
Both scripts accept the following arguments (an example combining several of them follows the table):
Argument | Meaning
---|---
-h HOST, --host HOST | Hostname to listen on
-i HOST, --ip HOST | Hostname to listen on (deprecated, use -h or --host)
-p PORT, --port PORT | Port for service to listen on (default: 7077 for master, random for worker)
--webui-port PORT | Port for web UI (default: 8080 for master, 8081 for worker)
-c CORES, --cores CORES | Total CPU cores to allow Spark applications to use on the machine (default: all available); only on worker
-m MEM, --memory MEM | Total amount of memory to allow Spark applications to use on the machine, in a format like 1000M or 2G (default: your machine's total RAM minus 1 GB); only on worker
-d DIR, --work-dir DIR | Directory to use for scratch space and job output logs (default: SPARK_HOME/work); only on worker
--properties-file FILE | Path to a custom Spark properties file to load (default: conf/spark-defaults.conf)
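As an illustration of these options, a worker could be started with explicit resource limits like this; the master URL, core count, memory size, work directory, and web UI port are examples only:

```bash
# Start a worker that offers 4 cores and 8 GB of RAM to Spark applications,
# writes scratch data to /tmp/spark-work, and serves its web UI on port 8082.
./sbin/start-slave.sh spark://localhost:7077 \
  --cores 4 \
  --memory 8G \
  --work-dir /tmp/spark-work \
  --webui-port 8082
```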