I. Download and Compile CarbonData
1. Download the CarbonData source package:
https://github.com/apache/carbondata/archive/apache-carbondata-1.3.1-rc1.tar.gz
2. Extract and compile:
tar -zxvf apache-carbondata-1.3.1-rc1.tar.gz
cd carbondata-apache-carbondata-1.3.1-rc1    # directory name produced by the GitHub archive
export CARBONDATA_HOME=$(pwd)                # referenced by the jar path below
mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 -Dhadoop.version=2.7.3 clean package
After the build completes, the CarbonData assembly jar is generated under the source tree:
$CARBONDATA_HOME/assembly/target/scala-2.11/apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar
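A quick way to confirm the build artifact (a minimal check; the exact file name depends on the -P profile and -D versions passed to mvn):
ls -lh $CARBONDATA_HOME/assembly/target/scala-2.11/apache-carbondata-*.jar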
II. Replace Spark 2.1.1 with Spark 2.1.0 on HDP 2.6.2
HDP 2.6.2 ships Spark 2.1.1, while the jar above was built against Spark 2.1.0, so the bundled Spark is swapped out.
1. Download a Spark 2.1.0 tarball built against Hadoop 2.7.2 or later:
https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
2. Replace the HDP Spark installation (run from the HDP stack directory, /usr/hdp/2.6.2.0-205/):
mv spark2 spark2_bak
tar -zxvf spark-2.1.0-bin-hadoop2.7.tgz
mv spark-2.1.0-bin-hadoop2.7 spark2
mv spark2/conf spark2/conf_bak
cp -r spark2_bak/conf spark2
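To verify the swap before going further (a minimal check; assumes the HDP stack directory above):
/usr/hdp/2.6.2.0-205/spark2/bin/spark-submit --version    # should report version 2.1.0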
III. Place the CarbonData jar under the spark2 directory
Copy the assembly jar built in section I into a new carbonlib directory, then package that directory for YARN distribution (here $SPARK_HOME is /usr/hdp/2.6.2.0-205/spark2):
mkdir $SPARK_HOME/carbonlib
cp $CARBONDATA_HOME/assembly/target/scala-2.11/apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar $SPARK_HOME/carbonlib/
cd $SPARK_HOME
tar -zcvf carbondata.tar.gz carbonlib/
mv carbondata.tar.gz carbonlib/
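To confirm the archive actually contains the assembly jar (a minimal check; an empty listing means the cp step above was missed):
tar -tzf $SPARK_HOME/carbonlib/carbondata.tar.gz    # should list carbonlib/apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar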
IV. Configure CarbonData
1. spark-defaults.conf (append to $SPARK_HOME/conf/spark-defaults.conf):
spark.master yarn
spark.yarn.dist.files /usr/hdp/2.6.2.0-205/spark2/conf/carbon.properties
spark.yarn.dist.archives /usr/hdp/2.6.2.0-205/spark2/carbonlib/carbondata.tar.gz
spark.executor.extraJavaOptions -Dcarbon.properties.filepath=carbon.properties -XX:+OmitStackTraceInFastThrow -XX:+UseGCOverheadLimit
spark.executor.extraClassPath carbondata.tar.gz/carbonlib/*
spark.driver.extraClassPath /usr/hdp/2.6.2.0-205/spark2/carbonlib/*
spark.driver.extraJavaOptions -Dcarbon.properties.filepath=/usr/hdp/2.6.2.0-205/spark2/conf/carbon.properties -Dhdp.version=current
spark.yarn.executor.memoryOverhead 1024
spark.yarn.am.extraJavaOptions -Dhdp.version=current
Note: spark.yarn.dist.archives ships carbondata.tar.gz to every executor, where YARN unpacks it into a directory named after the archive; that is why spark.executor.extraClassPath points at carbondata.tar.gz/carbonlib/*.
2. carbon.properties (the settings above reference $SPARK_HOME/conf/carbon.properties, so the file must use that name):
carbon.storelocation=hdfs://vigor-dc-10:8020/Opt/CarbonStore
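One way to create the file in place (a minimal sketch; the store location uses this cluster's NameNode address from the line above):
cat > $SPARK_HOME/conf/carbon.properties <<'EOF'
carbon.storelocation=hdfs://vigor-dc-10:8020/Opt/CarbonStore
EOF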
3. Create the CarbonData store directory on HDFS:
hadoop fs -mkdir -p /Opt/CarbonStore
hadoop fs -chmod -R 777 /Opt/CarbonStore
V. Copy dependency jars into the spark2/jars directory
The following jars are needed there (see the sketch after this list):
jersey-client-1.9.jar
jersey-core-1.9.jar
apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar
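A sketch of the copy commands (the jersey 1.9 source paths are assumptions; on HDP they normally ship with the Hadoop client libraries, and find /usr/hdp -name 'jersey-*-1.9.jar' will locate them if these paths differ):
cp /usr/hdp/2.6.2.0-205/hadoop/lib/jersey-core-1.9.jar $SPARK_HOME/jars/
cp /usr/hdp/2.6.2.0-205/hadoop-yarn/lib/jersey-client-1.9.jar $SPARK_HOME/jars/
cp $SPARK_HOME/carbonlib/apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar $SPARK_HOME/jars/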
VI. Start the CarbonData Thrift Server
Note that spark-submit options must precede the application jar; anything after the jar is passed to the application as arguments (here, the CarbonData store path):
./bin/spark-submit \
  --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
  --num-executors 3 --driver-memory 2g --executor-memory 3g --executor-cores 8 \
  /usr/hdp/2.6.2.0-205/spark2/carbonlib/apache-carbondata-1.3.1-bin-spark2.1.0-hadoop2.7.3.jar \
  hdfs://192.168.2.10:8020/user/hive/warehouse/carbon.store
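Once it is up, the JDBC port should be listening (a quick check; 10000 is the default HiveServer2 port, matching the beeline URL in the next section):
netstat -tlnp | grep 10000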
VII. Connect with Beeline
./bin/beeline -u jdbc:hive2://192.168.2.10:10000
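A quick smoke test over the connection (the table name is hypothetical; STORED BY 'carbondata' is the CarbonData DDL syntax):
./bin/beeline -u jdbc:hive2://192.168.2.10:10000 \
  -e "CREATE TABLE IF NOT EXISTS smoke_test(id INT, name STRING) STORED BY 'carbondata'" \
  -e "SHOW TABLES"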