Source data (the contents of the input file ./wc):
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
hello spark
hello python
hello java
hello word
Python code:
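# coding: utf-8
from pyspark.conf import SparkConf
from pyspark.context import SparkContext


def show(one):
    # Print a single (word, count) pair.
    print(one)


if __name__ == '__main__':
    conf = SparkConf()
    conf.setAppName("test")
    conf.setMaster("local")
    sc = SparkContext(conf=conf)
    # Read the input file; each RDD element is one line of text.
    lines = sc.textFile("./wc")
    # Split every line on spaces and flatten the result into words.
    words = lines.flatMap(lambda line: line.split(" "))
    # Pair each word with an initial count of 1.
    pairwords = words.map(lambda word: (word, 1))
    # Sum the counts of identical words.
    result = pairwords.reduceByKey(lambda v1, v2: v1 + v2)
    # Print each pair; with master "local" the output appears on the console.
    result.foreach(show)
    sc.stop()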
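
To check what the job should print without starting Spark, the same flatMap, map, and reduceByKey steps can be sketched in plain Python. This is only an illustration: `LINES` and `word_count` are hypothetical names introduced here, the data is assumed to be exactly the 20 lines listed under "Source data", and a dictionary stands in for Spark's shuffle-based reduceByKey.

```python
from itertools import chain

# Assumption: the four distinct sample lines, repeated five times (20 lines).
LINES = ["hello spark", "hello python", "hello java", "hello word"] * 5


def word_count(lines):
    """Mimic flatMap -> map -> reduceByKey on an in-memory list of lines."""
    # flatMap: split every line and flatten into one sequence of words.
    words = chain.from_iterable(line.split(" ") for line in lines)
    # map: pair each word with 1.
    pairs = ((word, 1) for word in words)
    # reduceByKey: add up the values that share the same key.
    counts = {}
    for word, one in pairs:
        counts[word] = counts.get(word, 0) + one
    return counts


if __name__ == "__main__":
    # Sorted only to make the printed order deterministic.
    for pair in sorted(word_count(LINES).items()):
        print(pair)
```

For the sample data this gives 20 for "hello" and 5 each for "spark", "python", "java", and "word". The Spark job should report the same counts, although the order in which foreach prints the pairs is not guaranteed.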