我是靠谱客的博主 飘逸板栗,这篇文章主要介绍saprkStreaming NetworkWordCount案例,现在分享给大家,希望可以做个参考。

NetworkWordCount.scala源码

复制代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} import org.apache.spark.streaming.StreamingContext._ import org.apache.spark.storage.StorageLevel /** * Counts words in UTF8 encoded, 'n' delimited text received from the network every second. * * Usage: NetworkWordCount <hostname> <port> * <hostname> and <port> describe the TCP server that Spark Streaming would connect to receive data. * * To run this on your local machine, you need to first run a Netcat server * `$ nc -lk 9999` * and then run the example * `$ bin/run-example org.apache.spark.examples.streaming.NetworkWordCount localhost 9999` */ object NetworkWordCount { def main(args: Array[String]) { if (args.length < 2) { System.err.println("Usage: NetworkWordCount <hostname> <port>") System.exit(1) } StreamingExamples.setStreamingLogLevels() // Create the context with a 1 second batch size val sparkConf = new SparkConf().setAppName("NetworkWordCount") val ssc = new StreamingContext(sparkConf, Seconds(1)) // Create a socket stream on target ip:port and count the // words in input stream of n delimited text (eg. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER) val words = lines.flatMap(_.split(" ")) val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _) wordCounts.print() ssc.start() ssc.awaitTermination() } }

使用spark-submit来提交我们的spark应用程序运行的脚本(生产)

复制代码
1
2
3
4
./spark-submit --master local[2] --class org.apache.spark.examples.streaming.NetworkWordCount --name NetworkWordCount /home/hadoop/app/spark-2.2.0-bin-2.6.0-cdh5.7.0/examples/jars/spark-examples_2.11-2.2.0.jar localhost 9999

复制代码
1
2
3
4
5
6
7
8
9
10
11
12
> /home/hadoop/app/spark-2.2.0-bin-2.6.0-cdh5.7.0/examples/jars/spark-examples_2.11-2.2.0.jar localhost 9999 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 17/12/29 09:14:46 INFO StreamingExamples: Setting log level to [WARN] for streaming example. To override add a custom log4j.properties to the classpath. 17/12/29 09:14:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 17/12/29 09:14:47 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041. ------------------------------------------- Time: 1514510089000 ms ------------------------------------------- ------------------------------------------- Time: 1514510090000 ms ------------------------------------------- -------------------------------------------

复制代码
1
在本地终端执行如下命令

复制代码
1
nc -lk 9999





最后

以上就是飘逸板栗最近收集整理的关于saprkStreaming NetworkWordCount案例的全部内容,更多相关saprkStreaming内容请搜索靠谱客的其他文章。

本图文内容来源于网友提供,作为学习参考使用,或来自网络收集整理,版权属于原作者所有。
点赞(90)

评论列表共有 0 条评论

立即
投稿
返回
顶部