The following error is thrown when saving data to MongoDB with Spark:
java.lang.IllegalArgumentException: Missing collection name. Set via the 'spark.mongodb.output.uri' or 'spark.mongodb.output.collection' property
The message says that either spark.mongodb.output.uri or spark.mongodb.output.collection must be set.
Add the dependency (the _2.11 suffix must match your Spark build's Scala version):
<dependency>
    <groupId>org.mongodb.spark</groupId>
    <artifactId>mongo-spark-connector_2.11</artifactId>
    <version>2.2.3</version>
</dependency>
The MongoSpark connector takes its configuration from the SparkSession, so the conf settings must be passed in when the SparkSession is created:
SparkSession sparkSession = SparkSession.builder()
        .appName("LogAnalyse")
        .master("local")
        .config("spark.mongodb.output.uri", "mongodb://name:[email protected]:27017/database_name.table_name")
        .getOrCreate();

// Build the JavaSparkContext from the SparkSession's underlying context
JavaSparkContext jsc = new JavaSparkContext(sparkSession.sparkContext());

// Override the target collection and write concern for this write
Map<String, String> writeOverrides = new HashMap<String, String>();
writeOverrides.put("collection", "table_name");
writeOverrides.put("writeConcern.w", "majority");
WriteConfig defaultWriteConfig = WriteConfig.create(jsc).withOptions(writeOverrides);

// Save
MongoSpark.save(dataFrame.select("pd_id", "sqljb"), defaultWriteConfig);
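The first reference below covers the read side, which mirrors the write side: ReadConfig plays the role of WriteConfig, and spark.mongodb.input.uri replaces spark.mongodb.output.uri. A minimal sketch for reading the saved collection back, reusing the placeholder host, credentials, and names from above (it assumes spark.mongodb.input.uri has been set on the SparkSession the same way the output URI was; this is a configuration-dependent sketch, not runnable without a live Spark and MongoDB deployment):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import com.mongodb.spark.MongoSpark;
import com.mongodb.spark.config.ReadConfig;

// Per-read overrides, analogous to the writeOverrides above.
// "table_name" and "secondaryPreferred" are placeholder choices.
Map<String, String> readOverrides = new HashMap<String, String>();
readOverrides.put("collection", "table_name");
readOverrides.put("readPreference.name", "secondaryPreferred");
ReadConfig readConfig = ReadConfig.create(jsc).withOptions(readOverrides);

// Load the collection as an RDD of Documents, then convert to a DataFrame
Dataset<Row> df = MongoSpark.load(jsc, readConfig).toDF();
df.select("pd_id", "sqljb").show();
```

As on the write side, the ReadConfig overrides only what differs from the session-level configuration, so the same SparkSession can read from several collections.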
References:
https://docs.mongodb.com/spark-connector/v2.0/java/read-from-mongodb/
https://stackoverflow.com/questions/58153363/missing-database-name-set-via-the-spark-mongodb-output-uri-or-spark-mongodb
https://www.cnblogs.com/feiyumo/p/7346530.html