Python & Airflow

1. MySQL connection error

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/engine/base.py", line 2147, in _wrap_pool_connect
    return fn()
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 387, in connect
    return _ConnectionFairy._checkout(self)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 766, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 516, in checkout
    rec = pool._do_get()
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 1229, in _do_get
    return self._create_connection()
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 333, in _create_connection
    return _ConnectionRecord(self)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 461, in __init__
    self.__connect(first_connect_check=True)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/pool.py", line 651, in __connect
    connection = pool._invoke_creator(self)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/engine/strategies.py", line 105, in connect
    return dialect.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.4/dist-packages/sqlalchemy/engine/default.py", line 393, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/__init__.py", line 90, in Connect
    return Connection(*args, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 706, in __init__
    self.connect()
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 932, in connect
    self._request_authentication()
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1152, in _request_authentication
    auth_packet = self._read_packet()
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 987, in _read_packet
    packet_header = self._read_bytes(4)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1033, in _read_bytes
    CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')

Some of the advice online is to check the value of max_allowed_packet and make it as large as is practical. My current value is:

mysql> show global variables like 'max_allowed_packet';
+--------------------+-----------+
| Variable_name      | Value     |
+--------------------+-----------+
| max_allowed_packet | 2635456   |
+--------------------+-----------+
1 row in set (0.00 sec)

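My value is fairly small, so I raised it. The new value has to exceed the current 2635456 bytes; 2*1024*1024 is only 2097152 and would lower it, so use something larger, for example 16 MB:

mysql> set global max_allowed_packet = 16*1024*1024;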
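Since max_allowed_packet is expressed in bytes, a quick arithmetic check before changing it avoids lowering it by accident. The sketch below is only an illustration: `plan_packet_size` is a hypothetical helper (not part of MySQL or pymysql), and the 16 MiB target is an assumed example value.

```python
CURRENT_VALUE = 2635456  # bytes, taken from the SHOW GLOBAL VARIABLES output above


def plan_packet_size(current_bytes, target_mib):
    """Return the target size in bytes and whether it raises the current value."""
    target_bytes = target_mib * 1024 * 1024
    return target_bytes, target_bytes > current_bytes


two_mib = plan_packet_size(CURRENT_VALUE, 2)
sixteen_mib = plan_packet_size(CURRENT_VALUE, 16)

print(two_mib)      # (2097152, False): 2 MiB is below the current value
print(sixteen_mib)  # (16777216, True): 16 MiB actually raises it
```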

Another approach is to raise the timeout values.

Check the current timeout values:
mysql> show global variables like '%timeout%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| connect_timeout            | 10    | 
| delayed_insert_timeout     | 300   | 
| innodb_lock_wait_timeout   | 100   | 
| innodb_rollback_on_timeout | OFF   | 
| interactive_timeout        | 28800 | 
| net_read_timeout           | 30    | 
| net_write_timeout          | 60    | 
| slave_net_timeout          | 3600  | 
| table_lock_wait_timeout    | 200   | 
| wait_timeout               | 28800 | 
+----------------------------+-------+
10 rows in set (0.00 sec)

Change the values:

mysql> set global net_read_timeout = 120; 
Query OK, 0 rows affected (0.03 sec)

mysql> set global net_write_timeout = 900;
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like '%timeout%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| connect_timeout            | 10    | 
| delayed_insert_timeout     | 300   | 
| innodb_lock_wait_timeout   | 100   | 
| innodb_rollback_on_timeout | OFF   | 
| interactive_timeout        | 28800 | 
| net_read_timeout           | 120   | 
| net_write_timeout          | 900   | 
| slave_net_timeout          | 3600  | 
| table_lock_wait_timeout    | 200   | 
| wait_timeout               | 28800 | 
+----------------------------+-------+
10 rows in set (0.00 sec)

I have not yet verified whether this helps.
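Independently of the server settings, the client can retry when the connection drops. The following is a minimal sketch, not something from the original setup: `run_with_retry` and `is_lost_connection` are hypothetical helpers, and the check assumes the exception carries the MySQL client error code as its first argument, as the raw pymysql error in the traceback above does (`(2013, 'Lost connection ...')`). An exception wrapped by SQLAlchemy would need to be unwrapped first.

```python
import time

# MySQL client error codes: 2006 = server has gone away, 2013 = lost connection during query
LOST_CONNECTION_CODES = {2006, 2013}


def is_lost_connection(exc):
    """True if the exception's first argument is a lost-connection error code."""
    return bool(exc.args) and exc.args[0] in LOST_CONNECTION_CODES


def run_with_retry(operation, retries=3, delay_seconds=1.0):
    """Call operation(); retry only when the connection was lost."""
    for attempt in range(1, retries + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise anything that is not a dropped connection, or the last failure.
            if not is_lost_connection(exc) or attempt == retries:
                raise
            time.sleep(delay_seconds * attempt)  # simple linear backoff
```

Here `operation` would be a function that opens a fresh connection and runs the query, so that each retry does not reuse the dead connection.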

2. Airflow deadlock

After I ran the backfill command, it ran for a long time and finally failed with:

Traceback (most recent call last):
 File "/anaconda3/bin/airflow", line 28, in <module>
   args.func(args)
 File "/anaconda3/lib/python3.5/site-packages/airflow/bin/cli.py", line 167, in backfill
   pool=args.pool)
 File "/anaconda3/lib/python3.5/site-packages/airflow/models.py", line 3330, in run
   job.run()
 File "/anaconda3/lib/python3.5/site-packages/airflow/jobs.py", line 200, in run
   self._execute()
 File "/anaconda3/lib/python3.5/site-packages/airflow/jobs.py", line 2021, in _execute
   raise AirflowException(err)
airflow.exceptions.AirflowException: ---------------------------------------------------
Here is output about tasks.

BackfillJob is deadlocked. These tasks have succeeded:
set()
These tasks have started:
{}
These tasks have failed:
set()
These tasks are skipped:
set()
These tasks are deadlocked:

Option 1
The suggested solution is:

To resolve this situation you can do one of the following:

1. Use airflow clear <dag_id>. This will resolve the deadlock and allow future runs of the DAG/task.
2. If the above does not solve the issue, you would need to use airflow resetdb. This would clear the Airflow database and hence resolve the issue.

In future:

Try to use execution_timeout=timedelta(minutes=2) to set a timeout so that you have explicit control over the operator.
Also, provide an on_failure_callback=handle_failure, which would cleanly exit the operator on failure.
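The two settings quoted above can be collected in a `default_args` dictionary. This is a rough sketch under assumptions: the body of `handle_failure` is made up for illustration (the quoted answer does not show it), and the snippet deliberately avoids importing Airflow so it runs on its own. In a real DAG file the dictionary would be passed as `default_args` when constructing the `DAG`.

```python
from datetime import timedelta


def handle_failure(context):
    """Failure callback; Airflow calls it with the task context dictionary."""
    task_instance = context.get("task_instance")
    message = "Task failed: {}".format(task_instance)
    print(message)  # a real callback might send an alert or clean up resources
    return message


default_args = {
    # Fail the task if a single run takes longer than two minutes.
    "execution_timeout": timedelta(minutes=2),
    # Called when a task instance fails.
    "on_failure_callback": handle_failure,
}
```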

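My impression is that while the backfill is running, you have to watch for conflicts between the scheduler's retries and the backfill, since both can try to execute the same tasks. First make sure only one of the two is running: either wait for the scheduler's retries to finish and then backfill, or stop the scheduler and backfill directly.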

Option 2

Try deleting the DAG's entries from the dag_run table and then restarting the scheduler.

What I did:
0. Stop the scheduler first.
1. Go to DAG Runs.
2. Find the relevant DAG and tick its rows.
3. Delete them.
4. Restart the scheduler.
The DAG then started running again.

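This time I was lucky: the tasks that had not finished finally ran successfully.
However, my tasks have to run in sequence, and after the first one succeeded nothing else moved. So I stopped the scheduler and started it again.
As expected, it kept the results of the previous run, skipped the task that had already executed, ran the rest in order, and finally succeeded.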
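The same deletion can be done with SQL instead of the web UI. The sketch below only demonstrates the idea against an in-memory SQLite table: the three-column `dag_run` table is a simplified stand-in for Airflow's real table (which has more columns), and `my_dag` is a hypothetical DAG id. Against a MySQL metadata database the placeholder would be `%s` rather than `?`, and the scheduler should be stopped first, as in step 0.

```python
import sqlite3


def delete_dag_runs(conn, dag_id):
    """Delete every dag_run row for one DAG and return how many were removed."""
    cursor = conn.execute("DELETE FROM dag_run WHERE dag_id = ?", (dag_id,))
    conn.commit()
    return cursor.rowcount


# Simplified stand-in for the Airflow metadata table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    [
        ("my_dag", "2018-05-17", "running"),
        ("my_dag", "2018-05-18", "running"),
        ("other_dag", "2018-05-18", "success"),
    ],
)

removed = delete_dag_runs(conn, "my_dag")
remaining = conn.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
print(removed, remaining)  # 2 1
```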

3. Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)

Running mysql directly fails with:

Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)

Run the following instead, which connects over TCP rather than through the Unix socket:

# mysql -uroot -h 127.0.0.1 -p 

See the article for details.
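The (2) in the error is the OS error code for "No such file or directory", meaning the socket file was not found at that path. A small sketch of the decision follows; `mysql_cli_args` is a hypothetical helper, and it only builds the argument list without running anything. If the server's socket lives at a different path, passing that path with `--socket` is another option.

```python
import os


def mysql_cli_args(user, socket_path="/tmp/mysql.sock"):
    """Build mysql client arguments, falling back to TCP if the socket file is missing."""
    if os.path.exists(socket_path):
        # The socket file is there, so the default local connection should work.
        return ["mysql", "-u", user, "-p"]
    # No socket file: name the host explicitly so the client uses TCP.
    return ["mysql", "-h", "127.0.0.1", "-u", user, "-p"]


print(mysql_cli_args("root"))
```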

4. Running Airflow commands in the background

airflow kerberos -D
airflow scheduler -D
airflow webserver -D
Here's the airflow webserver --help output (from version 1.8):

-D, --daemon Daemonize instead of running in the foreground

https://stackoverflow.com/questions/46476246/issues-running-airflow-scheduler-as-a-daemon-process/46479069#46479069

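5. The relationship between backfill and the scheduler

The scheduler looks back at past dates and automatically starts a backfill for any past runs that are not recorded in the DB. You can take advantage of this and trigger a backfill by deleting the records.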
https://stackoverflow.com/questions/39882204/airflow-backfill-clarification

When you change the scheduler toggle to "on" for a DAG, the scheduler will trigger a backfill of all dag run instances for which it has no status recorded, starting with the start_date you specify in your "default_args".

For example: If the start date was "2017-01-21" and you turned on the scheduling toggle at "2017-01-22T00:00:00" and your dag was configured to run hourly, then the scheduler will backfill 24 dag runs and then start running on the scheduled interval.
This is essentially what is happening in both of your questions. In #1, it is filling in the 3 missing runs from the 30 seconds during which you turned off the scheduler. In #2, it is filling in all of the DAG runs from start_date until "now".
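The count of 24 in the example can be reproduced with a little date arithmetic. This sketch assumes the usual rule that a run for a given execution date is created only once its whole interval has elapsed; `missing_execution_dates` is a hypothetical helper, not an Airflow function.

```python
from datetime import datetime, timedelta


def missing_execution_dates(start_date, now, interval):
    """List execution dates whose full interval has elapsed by `now`."""
    dates = []
    current = start_date
    while current + interval <= now:
        dates.append(current)
        current += interval
    return dates


runs = missing_execution_dates(
    start_date=datetime(2017, 1, 21),
    now=datetime(2017, 1, 22, 0, 0, 0),
    interval=timedelta(hours=1),
)
print(len(runs))  # 24
print(runs[0], runs[-1])  # 2017-01-21 00:00:00 2017-01-21 23:00:00
```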

There are 2 ways around this:

  1. Set the start_date to a date in the future so that it will only start scheduling dag runs once that date is reached. Note that if you change the start_date of a DAG, you must change the name of the DAG as well due to the way the start date is stored in airflow's DB.
  2. Manually run backfill from the command line with the "-m" flag which tells airflow not to actually run the DAG, rather just mark it as successful in the DB (https://airflow.incubator.apache.org/cli.html).
e.g. `airflow backfill MY_tutorial -m -s 2016-10-04 -e 2017-01-22T14:28:30`