Spark SQL: reading a local file / reading a file on HDFS
Reading a local file:
import org.apache.spark.sql.SparkSession

val spark =
  SparkSession.builder()
    .appName("SQL-JSON")
    .master("local[4]")
    .getOrCreate()
import spark.implicits._

// easy enough to query flat JSON
val people = spark.read.json("./data/people.json")
people.printSchema()
// register the DataFrame as a temporary view so it can be queried with SQL
people.createOrReplaceTempView("people")
// no WHERE clause, so this returns every row of the view
val young = spark.sql("SELECT * FROM people")
young.foreach(r => println(r))
people.select("name").show()
Reading a file on HDFS:
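Copy the two files from the HDFS configuration directory into the project (presumably core-site.xml and hdfs-site.xml, placed on the classpath, for example under src/main/resources).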
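import org.apache.spark.sql.SparkSession

object ReadJson {
  def main(args: Array[String]): Unit = {
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON")
        .master("local[4]")
        .getOrCreate()
    import spark.implicits._

    // easy enough to query flat JSON
    // the path has no scheme, so it is resolved against the default filesystem,
    // which is HDFS once the HDFS configuration is on the classpath
    val people = spark.read.json("/usr/data/people.json")
    people.printSchema()
    // register the DataFrame as a temporary view so it can be queried with SQL
    people.createOrReplaceTempView("people")
    // no WHERE clause, so this returns every row of the view
    val young = spark.sql("SELECT * FROM people")
    young.foreach(r => println(r))
    people.select("name").show()
  }
}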
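The program above relies on the copied configuration files to decide which filesystem a bare path such as /usr/data/people.json refers to. As an alternative sketch, the filesystem can be named in the path itself: an hdfs:// URI reads from HDFS, and a file:// URI reads from the local disk even when the default filesystem is HDFS. The NameNode address namenode-host:8020, the object name ReadJsonExplicitUri, and the helper hdfsUri are assumptions made for illustration and do not come from the original setup; replace the address with the one your cluster actually uses.

```scala
import org.apache.spark.sql.SparkSession

object ReadJsonExplicitUri {
  // Hypothetical NameNode address (an assumption); use your cluster's fs.defaultFS value.
  val nameNode = "hdfs://namenode-host:8020"

  // Hypothetical helper: prefixes an absolute path with the NameNode address.
  def hdfsUri(path: String): String = nameNode + path

  def main(args: Array[String]): Unit = {
    val spark =
      SparkSession.builder()
        .appName("SQL-JSON")
        .master("local[4]")
        .getOrCreate()

    // Fully qualified URI: read from HDFS without relying on the default filesystem.
    val people = spark.read.json(hdfsUri("/usr/data/people.json"))
    people.printSchema()
    people.select("name").show()

    // The file:// scheme forces the local filesystem, even if the default is HDFS.
    // val localPeople = spark.read.json("file:///absolute/path/to/people.json")

    spark.stop()
  }
}
```

Spelling out the scheme makes it obvious where each path points, which helps when the same job reads from both local disk and HDFS.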