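1. Environment setup
1.1 Install Git (omitted)
1.2 Install JDK 1.8 (update 151 or later) (omitted)
1.3 Install Maven 3.1.x or later
Installation steps are easy to find online. Here the Aliyun mirror is configured in Maven's settings.xml, as follows: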
<mirrors>
<mirror>
<id>alimaven</id>
<mirrorOf>central</mirrorOf>
<name>aliyun maven</name>
<url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
</mirror>
<!-- Central repository 1 -->
<mirror>
<id>repo1</id>
<mirrorOf>central</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://repo1.maven.org/maven2/</url>
</mirror>
<!-- Central repository 2 -->
<mirror>
<id>repo2</id>
<mirrorOf>central</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://repo2.maven.org/maven2/</url>
</mirror>
<!-- mirror
| Specifies a repository mirror site to use instead of a given repository. The repository that
| this mirror serves has an ID that matches the mirrorOf element of this mirror. IDs are used
| for inheritance and direct lookup purposes, and must be unique across the set of mirrors.
|
<mirror>
<id>mirrorId</id>
<mirrorOf>repositoryId</mirrorOf>
<name>Human Readable Name for this Mirror.</name>
<url>http://my.repository.com/repo/path</url>
</mirror>
-->
</mirrors>
2. Download and extract the source, then modify parts of it
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0.tgz
tar -zxf zeppelin-0.9.0.tgz
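# Modify the Spark module (the cd paths in this section are relative to the directory where the tarball was extracted)
cd zeppelin-0.9.0/spark
## Edit the pom file and comment out the modules that are not needed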
<modules>
<module>interpreter</module>
<module>spark-scala-parent</module>
<!-- <module>scala-2.10</module> -->
<module>scala-2.11</module>
<!-- <module>scala-2.12</module>-->
<module>spark-dependencies</module>
<module>spark-shims</module>
<!-- <module>spark1-shims</module> -->
<module>spark2-shims</module>
<!-- <module>spark3-shims</module> -->
</modules>
# Change the Flink version numbers and the download URL
cd zeppelin-0.9.0/flink
# Edit the pom file and change the Flink versions
<properties>
<flink1.10.version>1.10.3</flink1.10.version>
<flink1.11.version>1.11.3</flink1.11.version>
<flink1.12.version>1.12.2</flink1.12.version>
</properties>
# Change the download URL
cd zeppelin-0.9.0/flink/interpreter
# In the pom, change the value of flink.bin.download.url as shown below
<flink.bin.download.url>https://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-${flink.version}/flink-${flink.version}-bin-scala_${scala.binary.version}.tgz</flink.bin.download.url>
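Before starting a long build, it can help to see what that URL template resolves to for a concrete version. The sketch below only expands the template; `flink_url` is a hypothetical helper introduced for illustration, and the version numbers in the example call are taken from the properties edited above.

```shell
# Expand the flink.bin.download.url template for one Flink/Scala combination.
# flink_url is a hypothetical helper, not part of Zeppelin or Maven.
flink_url() {
  local flink_version="$1"
  local scala_binary_version="$2"
  printf 'https://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-%s/flink-%s-bin-scala_%s.tgz\n' \
    "$flink_version" "$flink_version" "$scala_binary_version"
}

# Example: Flink 1.12.2 built for Scala 2.11
flink_url 1.12.2 2.11
```

Passing the printed URL to `curl -fsI` is one way to check that the mirror actually hosts that version before Maven tries to download it.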
3. Build
cd zeppelin-0.9.0
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests
[ERROR] error reading /home/luke/.m2/repository/org/bouncycastle/bcprov-jdk15on/1.52/bcprov-jdk15on-1.52.jar; error in opening zip file
The jar was probably downloaded incompletely. Delete it and rebuild.
rm -f /home/luke/.m2/repository/org/bouncycastle/bcprov-jdk15on/1.52/bcprov-jdk15on-1.52.jar
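If one jar was truncated, others may be too. A rough way to look for them is to run a zip integrity test over every jar in the local repository. This is only a sketch: `find_corrupt_jars` is a hypothetical helper, and it assumes `unzip` is installed.

```shell
# List jar files under a directory that fail a zip integrity test.
# find_corrupt_jars is a hypothetical helper; it assumes `unzip` is installed.
find_corrupt_jars() {
  local repo_dir="$1"
  find "$repo_dir" -type f -name '*.jar' | while IFS= read -r jar; do
    # unzip -t tests the archive without extracting it; -qq keeps it quiet
    if ! unzip -tqq "$jar" >/dev/null 2>&1; then
      printf '%s\n' "$jar"
    fi
  done
}

# Example (commented out because a full scan can take a while):
# find_corrupt_jars "$HOME/.m2/repository"
```

Each path it prints can be deleted in the same way as the jar above, so that Maven downloads it again on the next run.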
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests -rf :r
[ERROR]
org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 290239990; received: 10206892
The Flink binary download failed, so retry, resuming from the zeppelin-flink module.
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests -rf :zeppelin-flink
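Because the interrupted download is a transient network failure, the rerun can also be wrapped in a small retry loop instead of being restarted by hand. The sketch below is generic shell, not a Maven feature; `retry` and the `RETRY_DELAY` variable are hypothetical names introduced here.

```shell
# Re-run a command until it succeeds or the attempt limit is reached.
# retry and RETRY_DELAY are hypothetical names, not part of Maven.
retry() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    echo "retrying (attempt $attempt of $max_attempts)" >&2
    # Wait between attempts; defaults to 5 seconds
    sleep "${RETRY_DELAY:-5}"
  done
}
```

For example, `retry 3 mvn clean package ... -rf :zeppelin-flink` (with the same options as above) would try the resumed build up to three times.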
fatal: unable to access 'https://github.com/sachinchoolur/ngclipboard.git/': Empty reply from server\n"
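This is caused by unreliable access to GitHub from mainland China. At the time, the following git setting worked around it (GitHub has since disabled the unauthenticated git:// protocol, so this rewrite no longer works against GitHub):
git config --global url."git://".insteadOf https://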
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests -rf :zeppelin-web
[INFO]
[INFO] Zeppelin: web Application .......................... SUCCESS [06:25 min]
[INFO] Zeppelin: Server ................................... FAILURE [03:23 min]
[INFO] Zeppelin: Plugins Parent ........................... SKIPPED
[INFO] Zeppelin: Plugin S3NotebookRepo .................... SKIPPED
[INFO] Zeppelin: Plugin GitHubNotebookRepo ................ SKIPPED
[INFO] Zeppelin: Plugin AzureNotebookRepo ................. SKIPPED
[INFO] Zeppelin: Plugin GCSNotebookRepo ................... SKIPPED
[INFO] Zeppelin: Plugin ZeppelinHubRepo ................... SKIPPED
[INFO] Zeppelin: Plugin FileSystemNotebookRepo ............ SKIPPED
[INFO] Zeppelin: Plugin MongoNotebookRepo ................. SKIPPED
[INFO] Zeppelin: Plugin OSSNotebookRepo ................... SKIPPED
[INFO] Zeppelin: Plugin Kubernetes StandardLauncher ....... SKIPPED
[INFO] Zeppelin: Plugin Flink Launcher .................... SKIPPED
[INFO] Zeppelin: Plugin Docker Launcher ................... SKIPPED
[INFO] Zeppelin: Plugin Cluster Launcher .................. SKIPPED
[INFO] Zeppelin: Plugin Yarn Launcher ..................... SKIPPED
[INFO] Zeppelin: Packaging distribution ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:49 min
[INFO] Finished at: 2021-04-08T16:19:57+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project zeppelin-server: Compilation failure: Compilation failure:
[ERROR] /home/luke/zeppelin/zeppelin-0.9.0/zeppelin-server/src/main/java/org/apache/zeppelin/realm/kerberos/KerberosUtil.java:[41,58] package org.apache.directory.server.kerberos.shared.keytab does not exist
[ERROR] /home/luke/zeppelin/zeppelin-0.9.0/zeppelin-server/src/main/java/org/apache/zeppelin/realm/kerberos/KerberosUtil.java:[42,58] package org.apache.directory.server.kerberos.shared.keytab does not exist
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <args> -rf :zeppelin-server
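After repeated testing, the failure turned out to be caused by the specified HDP Hadoop version. The only workaround found was to build the remaining modules one by one: zeppelin-server is built without the custom Hadoop version, using Zeppelin's default, while the other modules are still built against the HDP Hadoop version.
My guess is that the keytab-related code differs between Hadoop 3.1.1.3.1.5.0-152 and Zeppelin's default Hadoop 2.7.7, so zeppelin-server cannot resolve the package. This could probably be fixed by patching the source, but that approach is not pursued here.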
# Build the zeppelin-server module (each cd below is relative to the zeppelin-0.9.0 root)
cd zeppelin-server/
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Pvendor-repo -DskipTests
# Build the zeppelin-plugins module
cd zeppelin-plugins/
mvn clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests
# Build zeppelin-distribution
cd zeppelin-distribution
mvn clean package -Pbuild-distr -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6 -Dhadoop.version=3.1.1.3.1.5.0-152 -Pvendor-repo -DskipTests
[INFO] Building tar: /home/luke/zeppelin/zeppelin-0.9.0/zeppelin-distribution/target/zeppelin-0.9.0.tar.gz
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:49 min
[INFO] Finished at: 2021-04-08T17:14:10+08:00
[INFO] ------------------------------------------------------------------------
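The three per-module builds above differ only in whether the HDP Hadoop version and the build-distr profile are passed. A minimal sketch of that rule as a dry-run script follows; `module_build_args` and `HDP_HADOOP_VERSION` are hypothetical names, and the flag values are copied from the commands above. It only prints the commands and does not run Maven.

```shell
# Print the Maven arguments used for one module in the per-module builds above.
# module_build_args and HDP_HADOOP_VERSION are hypothetical names.
HDP_HADOOP_VERSION="3.1.1.3.1.5.0-152"

module_build_args() {
  local module="$1"
  local args="clean package -Dspark.version=2.3.2 -Pscala-2.11 -Dhbase.version=2.1.6"
  case "$module" in
    zeppelin-server)
      # Keep Zeppelin's default Hadoop version for this module
      ;;
    zeppelin-distribution)
      args="$args -Pbuild-distr -Dhadoop.version=$HDP_HADOOP_VERSION"
      ;;
    *)
      args="$args -Dhadoop.version=$HDP_HADOOP_VERSION"
      ;;
  esac
  printf '%s\n' "$args -Pvendor-repo -DskipTests"
}

# Dry run: print one command per module, to be run from the zeppelin-0.9.0 root
for module in zeppelin-server zeppelin-plugins zeppelin-distribution; do
  echo "(cd $module && mvn $(module_build_args "$module"))"
done
```

Printing the commands first makes it easy to compare them with the ones used above before running anything.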