1. Download the installation packages
http://archive.cloudera.com/kafka/parcels/4.1.0/
http://archive.cloudera.com/spark2/csd/SPARK2_ON_YARN-2.4.0.cloudera2.jar
http://archive.cloudera.com/spark2/parcels/2.4.0.cloudera2/
2. Install CDK (Cloudera Distribution of Kafka)
- Install the httpd service
yum install -y httpd
service httpd start
- Move the kafka_parcel folder, which contains the CDK files, into the httpd web root
mv kafka_parcel /var/www/html/
Open ip/kafka_parcel in a browser and check that the web page is displayed.
- Configure the parcel in the CM (Cloudera Manager) web UI
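Before adding the repository in CM, the directory served by httpd can also be checked from the command line. The following is a minimal sketch, not part of the original notes: the host name `cm-host.example.com` and the helper `check_parcel_repo` are hypothetical, so substitute the real IP or host name of the httpd server.

```shell
# Hypothetical host name; replace with the IP or host name of the httpd server.
REPO_HOST="cm-host.example.com"
REPO_URL="http://${REPO_HOST}/kafka_parcel/"

# Print the HTTP status code for a URL; 200 means httpd is serving it.
check_parcel_repo() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Usage (run on a machine that can reach the httpd server):
#   check_parcel_repo "$REPO_URL"
#   check_parcel_repo "${REPO_URL}manifest.json"
```

A parcel repository normally contains a manifest.json file next to the parcel files, so checking that URL as well is a quick way to confirm that the folder was copied completely.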
3. Test CDK
/opt/cloudera/parcels/KAFKA/lib/kafka/bin/kafka-topics.sh
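The notes only give the path of the script, so here is a hedged sketch of a smoke test that uses it. The ZooKeeper address `zk-host.example.com:2181`, the topic name `cdk_smoke_test`, and the two helper functions are assumptions for illustration. If Kafka is configured with a ZooKeeper chroot, that path would need to be appended to the address.

```shell
KAFKA_BIN=/opt/cloudera/parcels/KAFKA/lib/kafka/bin
# Hypothetical ZooKeeper address and topic name; adjust to your cluster.
ZK_CONNECT="zk-host.example.com:2181"
TOPIC="cdk_smoke_test"

# Create a single-partition, single-replica test topic.
create_topic() {
    "$KAFKA_BIN/kafka-topics.sh" --create --zookeeper "$ZK_CONNECT" \
        --replication-factor 1 --partitions 1 --topic "$TOPIC"
}

# List all topics; the test topic should appear in the output.
list_topics() {
    "$KAFKA_BIN/kafka-topics.sh" --list --zookeeper "$ZK_CONNECT"
}

# Usage (on a host where the KAFKA parcel is activated):
#   create_topic && list_topics
```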
4. Install CDS (Cloudera Distribution of Spark 2)
- Create the CSD folder
mkdir -p /opt/cloudera/csd
- Put the downloaded jar into the CSD folder
mv SPARK2_ON_YARN-2.4.0.cloudera2.jar /opt/cloudera/csd/
chown cloudera-scm:cloudera-scm /opt/cloudera/csd/SPARK2_ON_YARN-2.4.0.cloudera2.jar
chmod 644 /opt/cloudera/csd/SPARK2_ON_YARN-2.4.0.cloudera2.jar
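- Restart the service (the Cloudera Manager server must be restarted to load a new CSD)
- Move the spark2_parcel folder, which contains the CDS files
- Configure the parcel in the CM (Cloudera Manager) web UI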
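The restart and move steps in this section are listed without commands. By analogy with the CDK section, they might look like the sketch below. It assumes that the service to restart is `cloudera-scm-server`, that the parcel folder sits in the current directory, and that it is published through the same httpd web root. The function names are hypothetical.

```shell
WEB_ROOT=/var/www/html

# Restart Cloudera Manager Server so that it loads the new CSD jar.
restart_cm_server() {
    service cloudera-scm-server restart
}

# Publish the Spark 2 parcel files through httpd, as was done for kafka_parcel.
# Assumes spark2_parcel is in the current directory.
publish_spark2_parcel() {
    mv spark2_parcel "$WEB_ROOT/"
}

# Usage (as root on the Cloudera Manager host):
#   restart_cm_server
#   publish_spark2_parcel
```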
5. Test CDS
spark2-submit \
--master yarn \
--num-executors 1 \
--executor-cores 1 \
--executor-memory 1G \
--class org.apache.spark.examples.SparkPi \
/opt/cloudera/parcels/SPARK2/lib/spark2/examples/jars/spark-examples_2.11-2.4.0.cloudera2.jar
6. Common problems
Exception in thread "main" org.apache.hadoop.security.AccessControlException:
Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
Solution: switch to the hdfs user, then run the job again
su - hdfs
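The error occurs because the submitting user (root) has no writable home directory under /user in HDFS. An alternative to running the job as hdfs, which is not the approach taken in these notes, is to create that home directory once. This is a sketch under the assumption that sudo is available and that hdfs is the HDFS superuser. The helper name `create_hdfs_home` is hypothetical.

```shell
# User that submits the job; root in the error message above.
JOB_USER="root"
HOME_DIR="/user/${JOB_USER}"

# Create the HDFS home directory as the hdfs superuser,
# then hand ownership to the job user.
create_hdfs_home() {
    sudo -u hdfs hdfs dfs -mkdir -p "$HOME_DIR"
    sudo -u hdfs hdfs dfs -chown "$JOB_USER" "$HOME_DIR"
}

# Usage (on a cluster node with the HDFS client installed):
#   create_hdfs_home
```

After this, the spark2-submit command can be run as root without switching users.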