Original: https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop
TFX is an extension of TensorFlow: a toolkit for building your own ML pipelines.
Environment setup
macOS sometimes has problems forking threads when running Airflow, depending on the configuration. To avoid those problems, add the following line to the end of your ~/.bash_profile:
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
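A minimal sketch of adding that line from the shell without writing it twice if you run it again. The helper name `add_fork_safety_line` is hypothetical (it is not part of the TFX tutorial), and the sketch assumes the profile file is empty or already ends with a newline.

```shell
# Hypothetical helper: append the export line to a profile file only if it is missing.
add_fork_safety_line() {
  profile="$1"
  line='export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES'
  # Create the file if it does not exist yet.
  touch "$profile"
  # -x matches whole lines only, -F treats the pattern as a fixed string.
  if ! grep -qxF "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Usage (then open a new terminal or run: source ~/.bash_profile):
# add_fork_safety_line "$HOME/.bash_profile"
```

Calling the helper a second time leaves the file unchanged, so it is safe to re-run.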
Create a new Python environment
conda create -n tfx python=3.7 # note the Python version!
conda activate tfx
Install TFX
If the clone is very slow, see: speeding up GitHub
git clone https://github.com/tensorflow/tfx.git
cd ./tfx/tfx/examples/airflow_workshop/setup
./setup_demo.sh
Start the webserver
airflow webserver -p 8080
Start the scheduler
Do not skip this step. I found that DAGs do not run properly unless the scheduler is started.
airflow scheduler
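Both commands above stay in the foreground, so each one occupies a terminal. If you would rather keep everything in one terminal, the following is a rough sketch of running them in the background with plain shell tools. The helper name `start_bg` and the `.log`/`.pid` file names are assumptions made for this example, not something the TFX setup provides.

```shell
# Hypothetical helper: run a command in the background,
# sending its output to <name>.log and recording its PID in <name>.pid.
start_bg() {
  name="$1"
  shift
  nohup "$@" > "$name.log" 2>&1 &
  echo "$!" > "$name.pid"
}

# Usage, from a shell where the tfx conda environment is active:
# start_bg webserver airflow webserver -p 8080
# start_bg scheduler airflow scheduler
#
# Stop them later with:
# kill "$(cat webserver.pid)" "$(cat scheduler.pid)"
```

If a DAG does not run, checking scheduler.log first is a reasonable starting point, since the scheduler is the piece that triggers the runs.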
Start the notebook
# open a new terminal
# conda activate tfx
cd tfx/tfx/examples/airflow_workshop/notebooks
jupyter notebook
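After installation a directory ~/airflow appears. Inside it, airflow/dags holds the base scripts for the TFX pipeline. The following posts are all based on this demo.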
DONE!
Troubleshooting
If some packages fail to install during setup, the cause is usually the PyPI source:
For configuration, see: https://www.jianshu.com/p/cfabac1849c7
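As a rough sketch of what switching the source can look like: pip reads the `PIP_INDEX_URL` environment variable, so exporting it before re-running the setup script points pip at a different index for that shell session. The mirror URL below is only an example chosen for this sketch (it is not taken from the linked article), so replace it with the index you actually want.

```shell
# Assumed example mirror; substitute the index URL you actually want to use.
export PIP_INDEX_URL="https://pypi.tuna.tsinghua.edu.cn/simple"

# pip picks up PIP_INDEX_URL automatically, so the setup can then be re-run
# from tfx/tfx/examples/airflow_workshop/setup:
# ./setup_demo.sh
```

This only affects the current shell session. For a permanent setting, follow the configuration in the article linked above.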
sqlalchemy.exc.NoInspectionAvailable: No inspection system is available for object of type <class 'method'>
Fix
pip3 uninstall SQLAlchemy
pip3 install SQLAlchemy==1.3.15
https://github.com/apache/airflow/issues/8211
Reference
https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop