1 Install Scala
Download Scala:
hadoop@master:~$ wget https://scala-lang.org/files/archive/scala-2.10.4.tgz
Install Scala:
hadoop@master:~$ tar zxvf scala-2.10.4.tgz -C bigdata/
hadoop@master:~$ cd bigdata/
hadoop@master:~/bigdata$ mv scala-2.10.4/ scala
Set environment variables:
hadoop@master:~/bigdata$ vi /home/hadoop/.bashrc
export SCALA_HOME=/home/hadoop/bigdata/scala
export PATH=$SCALA_HOME/bin:$PATH
Apply the environment variable changes:
hadoop@master:~$ source /home/hadoop/.bashrc
Verify:
hadoop@master:~/bigdata$ scala -version
Scala code runner version 2.10.4 -- Copyright 2002-2013, LAMP/EPFL
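The .bashrc edit above prepends $SCALA_HOME/bin to PATH, which is why the bare `scala` command resolves to this installation rather than any system-wide one. A minimal sketch of that precedence rule, using the install path assumed above:

```shell
# Prepending a directory to PATH puts it first in command lookup order.
SCALA_HOME=/home/hadoop/bigdata/scala
PATH=$SCALA_HOME/bin:$PATH
first=$(printf '%s' "$PATH" | cut -d: -f1)
echo "$first"   # /home/hadoop/bigdata/scala/bin
```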
2 Install Spark
Download Spark:
hadoop@master:~$ wget https://archive.apache.org/dist/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
Install Spark:
hadoop@master:~$ tar zxvf spark-1.5.1-bin-hadoop2.6.tgz -C bigdata/
hadoop@master:~$ cd bigdata/
hadoop@master:~/bigdata$ mv spark-1.5.1-bin-hadoop2.6/ spark
Set environment variables:
hadoop@master:~/bigdata$ vi /home/hadoop/.bashrc
export SPARK_HOME=/home/hadoop/bigdata/spark
export PATH=$SPARK_HOME/bin:$PATH
Apply the environment variable changes:
hadoop@master:~$ source /home/hadoop/.bashrc
Verify:
hadoop@master:~/bigdata$ env | grep SPARK
SPARK_HOME=/home/hadoop/bigdata/spark
Edit the configuration files:
spark-env.sh
hadoop@master:~/bigdata/spark/conf$ cp spark-env.sh.template spark-env.sh
hadoop@master:~/bigdata/spark/conf$ vi spark-env.sh
export SCALA_HOME=/home/hadoop/bigdata/scala                   # Scala installation from step 1
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_131                     # JDK used to run Spark
export HADOOP_HOME=/home/hadoop/bigdata/hadoop                 # existing Hadoop installation
export HADOOP_CONF_DIR=/home/hadoop/bigdata/hadoop/etc/hadoop  # lets Spark find the HDFS/YARN config
export SPARK_MASTER_IP=master                                  # host the standalone Master binds to
export SPARK_LOCAL_DIRS=/home/hadoop/bigdata/spark             # scratch space for shuffle/spill files
export SPARK_DRIVER_MEMORY=512M                                # default driver heap size
slaves
hadoop@master:~/bigdata/spark/conf$ cp slaves.template slaves
hadoop@master:~/bigdata/spark/conf$ vi slaves
slave01
slave02
3 Startup
hadoop@master:~$ cd /home/hadoop/bigdata/spark/sbin/
hadoop@master:~/bigdata/spark/sbin$ ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /home/hadoop/bigdata/spark/sbin/../logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-master.out
slave02: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/bigdata/spark/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-slave02.out
slave01: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/bigdata/spark/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-slave01.out
slave02: failed to launch org.apache.spark.deploy.worker.Worker:
slave01: failed to launch org.apache.spark.deploy.worker.Worker:
slave01: # An error report file with more information is saved as:
slave01: # /home/hadoop/bigdata/spark/hs_err_pid7817.log
slave02: # An error report file with more information is saved as:
slave02: # /home/hadoop/bigdata/spark/hs_err_pid8151.log
slave01: full log in /home/hadoop/bigdata/spark/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-slave01.out
slave02: full log in /home/hadoop/bigdata/spark/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-slave02.out
hadoop@master:~/bigdata/spark/sbin$ jps
3456 ResourceManager
3298 SecondaryNameNode
3090 NameNode
3924 Jps
3822 Master
hadoop@master:~/bigdata/spark/sbin$
Inspect the error log on a worker node:
hadoop@slave01:~$ cat /home/hadoop/bigdata/spark/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-slave01.out
Spark Command: /usr/lib/jvm/jdk1.8.0_131/bin/java -cp /home/hadoop/bigdata/spark/sbin/../conf/:/home/hadoop/bigdata/spark/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/home/hadoop/bigdata/spark/lib/datanucleus-core-3.2.10.jar:/home/hadoop/bigdata/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/bigdata/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/bigdata/hadoop/etc/hadoop/ -Xms1g -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://master:7077
========================================
Java HotSpot(TM) Server VM warning: INFO: os::commit_memory(0x58740000, 715915264, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 715915264 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/hadoop/bigdata/spark/hs_err_pid7817.log
hadoop@slave01:~$
The slave01 and slave02 virtual machines were not allocated enough memory, so the Worker processes on those two nodes failed to start. The error is only recorded here; no fix is demonstrated.
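For reference, one way to get the Workers up on low-memory VMs is to shrink their heaps in spark-env.sh on each node: the log above shows the Worker JVM launching with its default -Xms1g -Xmx1g, which the VM could not commit. This is a hedged sketch; the 256m values are illustrative assumptions, not part of the original setup:

```shell
# Sketch: cap the heap of the Master/Worker daemons themselves, and the
# total memory a Worker offers to executors. 256m is an assumed value.
export SPARK_DAEMON_MEMORY=256m
export SPARK_WORKER_MEMORY=256m
```

Alternatively, allocate more memory to the slave VMs and leave Spark's defaults untouched.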
4 Verification
hadoop@master:~$ cd /home/hadoop/bigdata/spark/bin/
hadoop@master:~/bigdata/spark/bin$ ./spark-shell
......
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.1
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.
......
scala>
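At the scala> prompt, a quick sanity check can be run; `sc` is the SparkContext that spark-shell creates automatically, and since spark-shell was started here without --master it runs in local mode, so this works even though the cluster Workers failed to start. A minimal sketch:

```scala
// `sc` is the SparkContext provided by spark-shell.
// Sum the integers 1..100 as a Spark job.
val nums = sc.parallelize(1 to 100)
nums.reduce(_ + _)   // res0: Int = 5050
```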
5 Summary
Installation packages, Baidu Netdisk link: https://pan.baidu.com/s/1Nxd82L800_JAWqTlZrDSOA extraction code: xwbu
Reference configuration code on GitHub: https://github.com/zhixingkad/bigdata