sparkMagic : https://github.com/jupyter-incubator/sparkmagic
Download sparkMagic
Because this is an offline environment, download sparkmagic-0.12.5-py36h8c657a7_0.tar.bz2 from https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/
and install it with conda:
conda install *.bz2
- Check that ipywidgets is correctly installed (it should have been installed together with jupyter):
jupyter nbextension enable --py --sys-prefix widgetsnbextension
- Use pip show sparkmagic to see where the package is installed:
[root@node203 offlinePython3Pkg]# pip show sparkmagic
Name: sparkmagic
Version: 0.12.5
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD 3-clause
Location: /usr/anaconda3/lib/python3.6/site-packages
- Go to the install directory (the Location shown above) and install the bundled kernels:
jupyter-kernelspec install sparkmagic/kernels/sparkkernel
jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
jupyter-kernelspec install sparkmagic/kernels/pyspark3kernel
jupyter-kernelspec install sparkmagic/kernels/sparkrkernel
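The relative paths above only work from inside the directory reported as Location by pip show. A minimal Python sketch, assuming the four kernel directory names listed above (kernelspec_commands is a hypothetical helper name), builds the full commands from that location:

```python
import os

# Kernel directories shipped with this sparkmagic version, as listed above.
KERNELS = ["sparkkernel", "pysparkkernel", "pyspark3kernel", "sparkrkernel"]


def kernelspec_commands(site_packages):
    """Return one `jupyter-kernelspec install` command (argv list) per kernel."""
    return [
        ["jupyter-kernelspec", "install",
         os.path.join(site_packages, "sparkmagic", "kernels", name)]
        for name in KERNELS
    ]


if __name__ == "__main__":
    # Location copied from the `pip show sparkmagic` output above.
    for cmd in kernelspec_commands("/usr/anaconda3/lib/python3.6/site-packages"):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.check_call as-is, or the printed lines can be pasted into a shell.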
Configure ~/.sparkmagic/config.json
Enable the server extension
jupyter serverextension enable --py sparkmagic
This fails with:
[root@node203 jupyter]# jupyter serverextension enable --py sparkmagic
Traceback (most recent call last):
File "/usr/anaconda3/bin/jupyter-serverextension", line 11, in <module>
sys.exit(main())
File "/usr/anaconda3/lib/python3.6/site-packages/jupyter_core/application.py", line 266, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/usr/anaconda3/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 293, in start
super(ServerExtensionApp, self).start()
File "/usr/anaconda3/lib/python3.6/site-packages/jupyter_core/application.py", line 255, in start
self.subapp.start()
File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 210, in start
self.toggle_server_extension_python(arg)
File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 199, in toggle_server_extension_python
m, server_exts = _get_server_extension_metadata(package)
File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 327, in _get_server_extension_metadata
m = import_item(module)
File "/usr/anaconda3/lib/python3.6/site-packages/traitlets/utils/importstring.py", line 42, in import_item
return __import__(parts[0])
File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/__init__.py", line 3, in <module>
from sparkmagic.serverextension.handlers import load_jupyter_server_extension
File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/serverextension/handlers.py", line 9, in <module>
from sparkmagic.kernels.kernelmagics import KernelMagics
File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/kernels/__init__.py", line 1, in <module>
from sparkmagic.kernels.kernelmagics import *
File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/kernels/kernelmagics.py", line 12, in <module>
from hdijupyterutils.utils import generate_uuid
ModuleNotFoundError: No module named 'hdijupyterutils'
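A Python dependency is missing: download hdijupyterutils-0.12.5-py36hc0bb8fd_0.tar.bz2
An offline install is missing far too many packages, and the dependencies are hard to manage!
(Either install online through a proxy, or use a machine with network access to download the required packages first.)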
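To see in advance which packages an offline install will need, the dependency list can be read from the downloaded package itself. A minimal sketch, relying on the fact that a conda .tar.bz2 package carries its metadata in info/index.json with a depends list (conda_package_depends is a hypothetical helper name):

```python
import json
import tarfile


def conda_package_depends(path):
    """Return the `depends` list from info/index.json inside a conda .tar.bz2."""
    with tarfile.open(path, "r:bz2") as archive:
        # extractfile accepts a member name and returns a binary file object.
        member = archive.extractfile("info/index.json")
        index = json.load(member)
    return index.get("depends", [])

# Example (file name taken from the download step above):
# for dep in conda_package_depends("sparkmagic-0.12.5-py36h8c657a7_0.tar.bz2"):
#     print(dep)
```

This lists direct dependencies only; each of those packages has to be checked the same way, which is why the online install is much less work.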
- Installing sparkMagic on a machine with network access:
## Package Plan ##
environment location: /usr/anaconda3
added / updated specs:
- sparkmagic
The following packages will be downloaded:
package | build
---------------------------|-----------------
pykerberos-1.1.14 | py36_0 46 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
requests-kerberos-0.11.0 | py36_0 15 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
plotly-2.0.11 | py36_0 937 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
autovizwidget-0.12.1 | py36_0 21 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
ca-certificates-2017.08.26 | h1d4fec5_0 263 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
hdijupyterutils-0.12.1 | py36_0 13 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
sparkmagic-0.12.1 | py36_0 64 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
certifi-2018.1.18 | py36_0 144 KB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
openssl-1.0.2n | hb7f436b_0 3.4 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
krb5-1.13.2 | 0 3.5 MB https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
------------------------------------------------------------
Total: 8.5 MB
The following NEW packages will be INSTALLED:
autovizwidget: 0.12.1-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
hdijupyterutils: 0.12.1-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
krb5: 1.13.2-0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
plotly: 2.0.11-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
pykerberos: 1.1.14-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
requests-kerberos: 0.11.0-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
sparkmagic: 0.12.1-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
The following packages will be UPDATED:
ca-certificates: 2017.08.26-h1d4fec5_0 defaults --> 2017.08.26-h1d4fec5_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
certifi: 2018.1.18-py36_0 defaults --> 2018.1.18-py36_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
openssl: 1.0.2n-hb7f436b_0 defaults --> 1.0.2n-hb7f436b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
Proceed ([y]/n)? y
Downloading and Extracting Packages
pykerberos 1.1.14: ########## | 100%
requests-kerberos 0.11.0: ########## | 100%
plotly 2.0.11: ########## | 100%
autovizwidget 0.12.1: ########## | 100%
ca-certificates 2017.08.26: ########## | 100%
hdijupyterutils 0.12.1: ########## | 100%
sparkmagic 0.12.1: ########## | 100%
certifi 2018.1.18: ########## | 100%
openssl 1.0.2n: ########## | 100%
krb5 1.13.2: ########## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
- After the online install, the check succeeds:
[root@repo site-packages]# jupyter serverextension enable --py sparkmagic
Enabling: sparkmagic
- Writing config: /root/.jupyter
- Validating...
sparkmagic OK
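10. Install and configure Livy
- Configure ~/.sparkmagic/config.json
The example configuration file from the git repository is used;
Note: replace master in master:8998 with the hostname of the machine running Livy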
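The hostname replacement can be scripted so that every kernel section ends up pointing at the same Livy server. A minimal sketch, assuming the config has top-level sections whose names end in _credentials and that each holds a url field, as in the example file; set_livy_host, rewrite_config_file and the host name livy-host.example.com are hypothetical:

```python
import json
from urllib.parse import urlsplit, urlunsplit


def set_livy_host(config, livy_host):
    """Point the url of every *_credentials section at livy_host, keeping scheme and port."""
    for name, section in config.items():
        if name.endswith("_credentials") and isinstance(section, dict) and "url" in section:
            parts = urlsplit(section["url"])
            port = ":%d" % parts.port if parts.port else ""
            section["url"] = urlunsplit(parts._replace(netloc=livy_host + port))
    return config


def rewrite_config_file(path, livy_host):
    """Load the JSON config at path, update the Livy host, and write it back."""
    with open(path) as f:
        config = json.load(f)
    set_livy_host(config, livy_host)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

# Example:
# import os
# rewrite_config_file(os.path.expanduser("~/.sparkmagic/config.json"), "livy-host.example.com")
```

Sections without a url field, such as logging settings, are left untouched.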
- Configure jupyter
Generate the configuration file with:
jupyter notebook --generate-config
The configuration files are in: ~/.jupyter/
- Change the notebook startup directory in jupyter_notebook_config.json (the setting goes under the NotebookApp key):
"notebook_dir":"/usr/anaconda3/dubook"
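Jupyter's JSON config files group settings by class name, so notebook_dir sits inside a NotebookApp object rather than at the top level. A minimal sketch of merging the setting into an existing config without losing other keys (set_notebook_dir is a hypothetical helper name):

```python
import json


def set_notebook_dir(config, directory):
    """Set NotebookApp.notebook_dir in a config dict, keeping other settings."""
    config.setdefault("NotebookApp", {})["notebook_dir"] = directory
    return config


if __name__ == "__main__":
    # Prints the JSON that jupyter_notebook_config.json should contain.
    print(json.dumps(set_notebook_dir({}, "/usr/anaconda3/dubook"), indent=2))
```

If the file already holds a password entry under NotebookApp, load it with json.load first and pass the resulting dict in, so that entry is preserved.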
- Start the notebook server:
jupyter notebook --no-browser --allow-root --ip=node203.hmbank.com --port=8888 &
Change the initial password.
Start using it.
Problems:
- Importing pyspark fails with the following error:
import pyspark
"The code failed because of a fatal error:\n",
"\tError sending http request and maximum retry encountered..\n",
"\n",
"Some things to try:\n",
"a) Make sure Spark has enough available resources for Jupyter to create a Spark context.\n",
"b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.\n",
"c) Restart the kernel.\n"
Solution:
- The Livy address configured in ~/.sparkmagic/config.json was wrong; the log files in the logs directory under ~/.sparkmagic show this:
2018-05-28 18:56:52,884 ERROR ReliableHttpClient Request to 'http://node202.hmbank.com:8998/sessions' failed with 'HTTPConnectionPool(host='node202.hmbank.com', port=8998): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef9964f7f0>: Failed to establish a new connection: [Errno 111] Connection refused',))'
2018-05-28 18:56:52,888 INFO EventsHandler InstanceId: 4301a914-5087-4d77-a82b-19d6b2d7be7d,EventName: notebookSessionCreationEnd,Timestamp: 2018-05-28 10:56:52.888202,SessionGuid: 0131d7ad-d110-4559-9c76-549ba0916eae,LivyKind: pyspark3,SessionId: -1,Status: not_started,Success: False,ExceptionType: HttpClientException,ExceptionMessage: Error sending http request and maximum retry encountered.
2018-05-28 18:56:52,888 ERROR SparkMagics Error creating session: Error sending http request and maximum retry encountered.
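A Connection refused like the one in this log can be confirmed without starting a notebook by testing the configured host and port directly. A minimal sketch using only the standard library (livy_reachable is a hypothetical helper name; 8998 is the port used in the config above):

```python
import socket


def livy_reachable(host, port=8998, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False

# Example, using the host from the log above:
# print(livy_reachable("node202.hmbank.com"))
```

A True result only shows that something is listening on that port; it does not prove the listener is Livy, so the sparkmagic logs remain the main source of truth.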
- On the machine where Spark is deployed, starting python and running import pyspark fails:
>>> import pyspark
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pyspark'
Solution:
Add the relevant environment variables:
export SPARK_HOME=/usr/lib/apacheori/spark-2.3.0-bin-hadoop2.6
export PYSPARK_PYTHON=/usr/anaconda3/bin/python
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/pyspark.zip:$SPARK_HOME/python/lib/py4j-0.10.6-src.zip:$PYTHONPATH
py4j-0.10.6-src.zip and pyspark.zip are in python/lib under the Spark install directory.
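The py4j version in the file name changes between Spark releases, so a hard-coded PYTHONPATH breaks after an upgrade. A minimal sketch that derives the same entries from SPARK_HOME by matching the file name with a glob (pyspark_paths is a hypothetical helper name):

```python
import glob
import os


def pyspark_paths(spark_home):
    """Return the PYTHONPATH entries for the Spark install at spark_home."""
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    zips = sorted(glob.glob(os.path.join(lib_dir, "pyspark.zip")))
    # Match any py4j version instead of hard-coding 0.10.6.
    zips += sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    return [python_dir] + zips

# Example: make pyspark importable in the current interpreter.
# import sys
# sys.path[:0] = pyspark_paths(os.environ["SPARK_HOME"])
# import pyspark
```

Joining the returned list with os.pathsep gives a value suitable for the export PYTHONPATH line above.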