To Be Continued...
Brief Intro to Airflow
Airflow is a platform to programmatically author, schedule, and monitor workflows, similar to Oozie, which has been the better-known tool in the workflow space. Airflow is an incubating project and still very new, but its considerable advantages have kept it from going unnoticed.
Airflow Advantages:
- Airflow is written in Python, which makes it easy to maintain and to extend with custom development.
- Airflow has a nice UI for controlling, displaying, and monitoring workflows.
- Airflow has been running in the backend of the Electron Project, a big data log analyser application at Youzu, where it has proved to be stable and smooth.
Installation
Airflow needs a home directory. ~/airflow is the default,
but you can place it somewhere else if you prefer
(optional):
export AIRFLOW_HOME=~/airflow
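The fallback described above (use AIRFLOW_HOME when it is set, otherwise ~/airflow) can be sketched in Python as follows. The helper name airflow_home is hypothetical and is not part of Airflow; it only mirrors the rule stated here.

```python
import os
from pathlib import Path


def airflow_home() -> Path:
    """Return AIRFLOW_HOME if it is set, otherwise the default ~/airflow."""
    # Hypothetical helper for illustration; not an Airflow API.
    raw = os.environ.get("AIRFLOW_HOME", "~/airflow")
    # Expand a leading "~" to the current user's home directory.
    return Path(os.path.expanduser(raw))


print(airflow_home())
```

A script that needs to locate airflow.cfg or the dags directory could start from this path.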
Install from PyPI using pip:
pip install airflow
pip install airflow[mysql]
Initialize the database:
airflow initdb
Start the web server (the default port is 8080):
airflow webserver -p 8080
Install the Celery extra if you intend to use the Celery executor:
pip install airflow[celery]
Airflow Case
Change two lines in airflow.cfg:
executor = CeleryExecutor
Store the metadata in MySQL:
sql_alchemy_conn = mysql://username:password@ipaddress/dbname?charset=utf8
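Before pasting a connection string into airflow.cfg, it can help to check that it splits into the parts you expect. The sketch below uses only the Python standard library. The function name describe_conn is hypothetical, and the URL is the placeholder from above rather than a real server.

```python
from urllib.parse import parse_qs, urlsplit


def describe_conn(url):
    """Split a SQLAlchemy-style database URL into its named parts."""
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        # The path component is "/dbname"; drop the leading slash.
        "database": parts.path.lstrip("/"),
        # Keep the first value of each query option, e.g. charset.
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


# Placeholder values, matching the template shown above.
info = describe_conn("mysql://username:password@ipaddress/dbname?charset=utf8")
for key, value in info.items():
    print(key, "=", value)
```

If a field comes back as None or empty, the string is probably missing a separator such as ":" or "@".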
Start the Airflow webserver and the Airflow Celery worker:
airflow webserver
airflow worker
Write a DAG file in the dags_folder, whose location can be changed in the airflow.cfg settings file.
e.g. $AIRFLOW_HOME/dags/example.py  # example.py is the DAG file
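To see which directory Airflow will scan for DAG files, one option is to read the setting straight from airflow.cfg. This is a minimal sketch that assumes the key is named dags_folder and lives in the [core] section, and that falls back to $AIRFLOW_HOME/dags when it is absent. The function name and the sample paths are made up for illustration.

```python
import configparser
import os


def dags_folder(cfg_text, airflow_home):
    """Return dags_folder from airflow.cfg text, or <airflow_home>/dags."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Assumption: the key lives in the [core] section.
    default = os.path.join(airflow_home, "dags")
    return parser.get("core", "dags_folder", fallback=default)


# Sample configuration text with made-up paths.
sample_cfg = """
[core]
dags_folder = /data/airflow/dags
executor = CeleryExecutor
"""
print(dags_folder(sample_cfg, "/home/me/airflow"))
```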
Trigger the DAG so that Airflow generates its tasks:
airflow trigger_dag example  # example is the dag_id of the DAG defined in $AIRFLOW_HOME/dags/example.py
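If the trigger needs to be issued from a script rather than typed by hand, the same command can be wrapped with the standard subprocess module. This is only a sketch: the function names are hypothetical, and it assumes the airflow executable is on the PATH and accepts the trigger_dag subcommand used in this document.

```python
import subprocess


def trigger_command(dag_id):
    """Build the argument list for `airflow trigger_dag <dag_id>`."""
    if not dag_id:
        raise ValueError("dag_id must not be empty")
    # The argument is the dag_id declared inside the DAG file,
    # not the path of the file itself.
    return ["airflow", "trigger_dag", dag_id]


def trigger(dag_id):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(trigger_command(dag_id), check=True)
```

Calling trigger("example") would run the same command as the line above.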