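- This walkthrough shows how to develop a Spark application in IntelliJ IDEA, using sbt as the build tool and Scala as the language.
- There are two ways to run it: 1. run Spark directly inside IntelliJ; 2. package the application and run it outside the IDE (see the Self-Contained Applications section of the Quick Start guide on the official Spark site).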
Create the SparkStudy project
1. Create a new sbt project
2. Open the SBT view
Development
Add the Spark dependency to build.sbt
lazy val commonSettings = Seq(
  version := "1.0",
  scalaVersion := "2.10.6"
)
lazy val root = (project in file("."))
  .settings(
    commonSettings,
    name := "SparkStudy",
    libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.1"
  )
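At this point, wait until all dependencies have been downloaded. If a download fails, trigger it again. Once the download finishes, the resolved dependencies appear under External Libraries in the project view.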
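The dependency line in build.sbt hard-codes the Scala binary suffix `_2.10` in the artifact name. As a small sketch, sbt's `%%` operator appends that suffix automatically based on `scalaVersion`, so the line below should resolve the same artifact, assuming `scalaVersion` stays on 2.10.x:

```scala
// build.sbt fragment: %% appends the Scala binary version (_2.10 here) to "spark-core"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
```

This form keeps the dependency and `scalaVersion` in step if the Scala version is changed later, provided a matching Spark artifact is published for that version.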
Write the WordCount object
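import org.apache.spark.{SparkConf, SparkContext}

/**
  * Created by zhouliang6 on 2017/6/20.
  */
object WordCount {
  def main(args: Array[String]): Unit = {
    // Backslashes in a Windows path must be escaped inside a Scala string literal
    val logFile = "D:\\work\\SparkStudy\\application.log"
    // A master set in code takes precedence over the --master flag of spark-submit
    val conf = new SparkConf().setAppName("WordCount").setMaster("local")
    val sc = new SparkContext(conf)
    // Read the file with at least 2 partitions and cache it, because it is scanned twice
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    print("---------------")
    // Release the SparkContext once the job is done
    sc.stop()
  }
}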
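Despite its name, the WordCount object counts lines that contain "a" and lines that contain "b". The filter-then-count logic can be checked without Spark by running the same predicate over an in-memory collection. The following is only a sketch: `LineCountSketch`, `countContaining`, and the sample lines are hypothetical and stand in for the RDD and for application.log.

```scala
object LineCountSketch {
  // Counts the lines that contain the given substring, mirroring
  // logData.filter(line => line.contains(...)).count() on an RDD.
  def countContaining(lines: Seq[String], needle: String): Long =
    lines.count(line => line.contains(needle)).toLong

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data standing in for application.log
    val sample = Seq("alpha", "bravo", "echo")
    val numAs = countContaining(sample, "a")
    val numBs = countContaining(sample, "b")
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
```

Because the predicate is the same one passed to `filter` in the Spark job, a local check like this can help separate logic errors from environment or configuration problems.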
1. Configure the application to run directly in IntelliJ
Click Run
Run result
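2. Package the application as a jar and run it outside the IDE
- 1. In the SBT view, go to SBT Tasks -> package and click it; the jar is built.
- 2. Alternatively, set up a Run Configuration for the package task and click Run; the jar is built.
Run from the command line
- Change to the directory that contains sparkstudy_2.10-1.0.jar and run the following command.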
spark-submit --class "WordCount" --master local[4] sparkstudy_2.10-1.0.jar
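spark-submit passes anything written after the jar path to the application's `main` method as arguments. A minimal sketch, assuming one wanted to supply the log path on the command line instead of hard-coding it; `WordCountArgs`, `DefaultLogFile`, and `logFileFrom` are hypothetical names that do not appear in the original code:

```scala
object WordCountArgs {
  // Fallback path, the same file the WordCount object reads
  val DefaultLogFile: String = "D:\\work\\SparkStudy\\application.log"

  // Uses the first command-line argument as the log path if one was given,
  // otherwise falls back to the default above.
  def logFileFrom(args: Array[String]): String =
    args.headOption.getOrElse(DefaultLogFile)
}
```

With such a helper, `val logFile = WordCountArgs.logFileFrom(args)` would replace the hard-coded path in `WordCount.main`, and the path would be appended after the jar name in the spark-submit command.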