Kylin Deployment
The environment to prepare includes:
HDFS installed and started; in particular, start the job-historyserver service and open port 10020;
HBase installed and started;
Hive installed; make sure the hive script runs correctly;
Spark installed, with SPARK_HOME configured and the $SPARK_HOME/jars directory confirmed to exist (a quick verification sketch follows this list).
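A minimal verification sketch for the prerequisites above, assuming the services run on this host and jps, netstat and hive are on the PATH:
jps | grep -E 'NameNode|DataNode|HMaster|HRegionServer|JobHistoryServer'   # HDFS / HBase / history server processes
netstat -nlt | grep 10020                    # job-historyserver RPC port should be listening
hive -e 'show databases;'                    # the hive CLI should run without errors
echo $SPARK_HOME && ls -d $SPARK_HOME/jars   # SPARK_HOME set and the jars directory present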
Preparing the Hadoop environment
Start the job-historyserver service: how to enable the Hadoop/YARN jobhistory service.
1. Set the history server addresses in mapred-site.xml:
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
</property>
2. Enable YARN log aggregation for history logs: vim yarn-site.xml
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Retain aggregated logs for 2 days -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>172800</value>
</property>
<!-- Compression type used for the aggregated logs -->
<property>
    <name>yarn.nodemanager.log-aggregation.compression-type</name>
    <value>gz</value>
</property>
<!-- NodeManager local storage directory; the default is ${hadoop.tmp.dir}/nm-local-dir -->
<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>${hadoop.tmp.dir}/nm-local-dir</value>
</property>
<!-- Maximum number of completed applications the ResourceManager keeps -->
<property>
    <name>yarn.resourcemanager.max-completed-applications</name>
    <value>100</value>
</property>
// Heap size for the jobhistory server; best left at the default rather than set arbitrarily
// export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=2000
// Start the historyserver
mr-jobhistory-daemon.sh start historyserver
// Stop the historyserver
mr-jobhistory-daemon.sh stop historyserver
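A quick check (sketch) that the history server came up with the settings above; <appId> stands for any finished application:
jps | grep JobHistoryServer           # the process should be running
netstat -nlt | grep -E '10020|19888'  # RPC and web UI ports from mapred-site.xml
yarn logs -applicationId <appId>      # aggregated logs should be retrievable once jobs have run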
Configure Kylin parameters
// 1. Modify the default configuration
vim conf/kylin.properties
# Parameter that must be modified
kylin.env.hadoop-conf-dir=/home/bigdata/app/hadoop-release/etc/hadoop
# Parameter recommended to modify
kylin.env.hdfs-working-dir=/kylin
// 2. Edit the conf/setenv.sh environment variable file
export SPARK_HOME=/home/bigdata/app/spark-release
// Once the above is configured, Kylin can be started
kylin.sh start
// Stop the Kylin service
kylin.sh stop
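A sketch of how one might sanity-check the deployment after start-up; check-env.sh ships in the Kylin binary package's bin directory, 7070 is Kylin's default web port, and ADMIN / KYLIN is the default login (adjust if any of these have been changed):
bin/check-env.sh          # Kylin's own environment check script
netstat -nlt | grep 7070  # the Kylin web UI port should be listening
# then open http://<kylin-host>:7070/kylin and log in with ADMIN / KYLIN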
Errors during Kylin deployment
"spark not found, set SPARK_HOME": Spark 1.6 has no /jars directory with the dependencies (they sit under lib/ instead).
Solution:
cd $SPARK_HOME
mkdir jars
cp lib/*.jar jars/
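A quick sanity check (sketch) that the new directory is actually populated:
ls $SPARK_HOME/jars/*.jar | wc -l   # should report a non-zero jar count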
IllegalArgumentException: Failed to find metadata store by url: kylin_metadata@hbase
HBase was not running; start the HBase service and the error goes away (see the sketch below).
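A minimal sketch of bringing HBase back up and confirming it, assuming a standard HBase install with its bin directory on the PATH:
start-hbase.sh                          # start the HMaster / RegionServer processes
jps | grep -E 'HMaster|HRegionServer'   # confirm they are up
echo "list" | hbase shell               # the kylin_metadata table shows up once Kylin has started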
NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
controller.CubeController:398 : org/apache/spark/scheduler/SparkListenerInterface
java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.configureSparkJob(SparkBatchCubingJobBuilder2.java:142)
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.addLayerCubingSteps(SparkBatchCubingJobBuilder2.java:133)
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.build(SparkBatchCubingJobBuilder2.java:88)
at org.apache.kylin.engine.spark.SparkBatchCubingEngine2.createBatchCubingJob(SparkBatchCubingEngine2.java:44)
at org.apache.kylin.engine.EngineFactory.createBatchCubingJob(EngineFactory.java:60)
at org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)
at org.apache.kylin.rest.service.JobService.submitJob(JobService.java:201)
Caused by: java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.configureSparkJob(SparkBatchCubingJobBuilder2.java:142)
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.addLayerCubingSteps(SparkBatchCubingJobBuilder2.java:133)
at org.apache.kylin.engine.spark.SparkBatchCubingJobBuilder2.build(SparkBatchCubingJobBuilder2.java:88)
at org.apache.kylin.engine.spark.SparkBatchCubingEngine2.createBatchCubingJob(SparkBatchCubingEngine2.java:44)
at org.apache.kylin.engine.EngineFactory.createBatchCubingJob(EngineFactory.java:60)
at org.apache.kylin.rest.service.JobService.submitJobInternal(JobService.java:233)
at org.apache.kylin.rest.service.JobService.submitJob(JobService.java:201)
at org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:395)
How to resolve this?
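A hedged diagnostic sketch: given the Spark 1.6 jars workaround above, one plausible cause is that the jars Kylin points at simply do not contain this class; the loop below checks for it:
for j in $SPARK_HOME/jars/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -q 'org/apache/spark/scheduler/SparkListenerInterface' && echo "found in $j"
done
# no output suggests the Spark that SPARK_HOME points to is too old for Kylin's Spark engine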
InternalErrorException: Could not find Kafka dependency
This error is reported when building a Kafka streaming cube.
Online posts say the cause is that KAFKA_HOME is not set (a sketch of that fix follows the log below).
[FetcherRunner 1660626211-42] threadpool.DefaultFetcherRunner:85 : Job Fetcher: 0 should running, 0 actual running, 0 stopped, 0 ready, 5 already succeed, 0 error, 0 discarded, 0 others
2020-07-05 07:29:59,745 ERROR [http-bio-7070-exec-4] controller.BasicController:63 :
org.apache.kylin.rest.exception.InternalErrorException: Could not find Kafka dependency
at org.apache.kylin.rest.controller.CubeController.build2(CubeController.java:371)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
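A sketch of the fix suggested above, assuming a hypothetical Kafka install path; export KAFKA_HOME in the environment that launches Kylin (e.g. in conf/setenv.sh or the shell profile) and restart:
export KAFKA_HOME=/home/bigdata/app/kafka-release   # hypothetical path; point it at the real Kafka installation
kylin.sh stop
kylin.sh start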