Spark Broadcast Variables

The following article covers Spark broadcast variables in detail and is well worth reading:

https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-broadcast.html

Only the Example section of that article is excerpted here.

Let’s start with an introductory example to check out how to use broadcast variables and build your initial understanding.

You’re going to use a static mapping of interesting projects with their websites, i.e. Map[String, String] that the tasks, i.e. closures (anonymous functions) in transformations, use.

scala> val pws = Map("Apache Spark" -> "http://spark.apache.org/", "Scala" -> "http://www.scala-lang.org/")
pws: scala.collection.immutable.Map[String,String] = Map(Apache Spark -> http://spark.apache.org/, Scala -> http://www.scala-lang.org/)

scala> val websites = sc.parallelize(Seq("Apache Spark", "Scala")).map(pws).collect
...
websites: Array[String] = Array(http://spark.apache.org/, http://www.scala-lang.org/)

It works, but it is inefficient: the pws map is serialized and sent over the wire to executors together with the tasks, while it could have been there already. If more tasks need the pws map, you can improve their performance by minimizing the number of bytes sent over the network for task execution.
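To get a rough feel for the payload involved, the following sketch measures how many bytes the pws map occupies after plain Java serialization and multiplies that by a made-up number of shipments. It is a back-of-the-envelope model, not a measurement of Spark itself: the object name `ClosurePayloadSketch`, the helpers `serializedSize` and `totalBytes`, and the figure of 1000 shipments are all assumptions for illustration, and how often Spark really ships the data depends on the Spark version and the shape of the job.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical helper object, not part of Spark.
object ClosurePayloadSketch {

  // Size in bytes of a value after plain Java serialization.
  def serializedSize(value: AnyRef): Int = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()
    buffer.size()
  }

  // Total bytes on the wire if the same payload is shipped `shipments` times.
  def totalBytes(bytesPerCopy: Int, shipments: Int): Long =
    bytesPerCopy.toLong * shipments

  def main(args: Array[String]): Unit = {
    val pws = Map(
      "Apache Spark" -> "http://spark.apache.org/",
      "Scala" -> "http://www.scala-lang.org/")

    val bytesPerCopy = serializedSize(pws)
    val shipments = 1000 // assumed number of times the map travels with tasks

    println(s"one serialized copy: $bytesPerCopy bytes")
    println(s"shipped $shipments times: ${totalBytes(bytesPerCopy, shipments)} bytes")
    println(s"shipped once: ${totalBytes(bytesPerCopy, 1)} bytes")
  }
}
```

For a two-entry map the difference is negligible; the gap only matters once the lookup table grows to megabytes or is reused by many tasks.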

Enter broadcast variables.

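val pwsB = sc.broadcast(pws)
// Read pwsB.value inside the function body. Writing map(pwsB.value) would evaluate
// it on the driver and pass the plain Map as the function, bypassing the broadcast.
val websites = sc.parallelize(Seq("Apache Spark", "Scala")).map(name => pwsB.value(name)).collect
// websites: Array[String] = Array(http://spark.apache.org/, http://www.scala-lang.org/)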

Semantically, the two computations, with and without the broadcast value, are exactly the same, but the broadcast-based one wins performance-wise when more executors are spawned to execute many tasks that use the pws map.
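The REPL snippets rely on the `sc` that spark-shell provides. As a minimal standalone sketch of the same lookup, assuming spark-core is on the classpath and a local master is acceptable, the program below creates its own SparkContext. The object name `BroadcastLookupSketch`, the method `lookupWebsites`, the app name, and the `local[2]` master are assumptions for illustration, not part of the original article.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone version of the REPL example.
object BroadcastLookupSketch {

  def lookupWebsites(sc: SparkContext): Array[String] = {
    val pws = Map(
      "Apache Spark" -> "http://spark.apache.org/",
      "Scala" -> "http://www.scala-lang.org/")

    // Register the map as a read-only broadcast variable.
    val pwsB = sc.broadcast(pws)

    val names = sc.parallelize(Seq("Apache Spark", "Scala"))

    // pwsB.value is read inside the function body, so the closure captures
    // only the broadcast handle and the lookup runs inside each task.
    val websites = names.map(name => pwsB.value(name)).collect()

    // Release the broadcast data once no further job needs it.
    pwsB.destroy()
    websites
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two worker threads, purely for illustration.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("broadcast-lookup-sketch")
    val sc = new SparkContext(conf)
    try lookupWebsites(sc).foreach(println)
    finally sc.stop()
  }
}
```

Calling `destroy()` makes the broadcast variable unusable afterwards, so it belongs after the last action that reads it; `unpersist()` is the milder option when the variable may be needed again.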

Summary

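As the article shows, an ordinary variable defined in the driver can also be used in different tasks, but it reaches them as a copy shipped along with the tasks. To improve performance, define a broadcast variable instead: only one read-only copy is created on each machine, and it is shared by all tasks running on that machine.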
