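APScheduler is a lightweight job scheduling framework for Python. While using it inside a Docker container I ran into a problem: with a cron trigger configured as day="*/1", the job fired at 16:00 every day instead of the expected 00:00. Why?
1. Locating the problem
My first guess was a time zone issue, so I looked at the APScheduler source and found that the framework reads the local time zone. The code is as follows: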
# apscheduler/apscheduler/triggers/date.py
from tzlocal import get_localzone
#...
class DateTrigger(BaseTrigger):
    def __init__(self, run_date=None, timezone=None):
        timezone = astimezone(timezone) or get_localzone()
        if run_date is not None:
            self.run_date = convert_to_datetime(run_date, timezone, 'run_date')
        else:
            self.run_date = datetime.now(timezone)
As we can see, if no time zone is specified, the system time zone is used. So how does get_localzone obtain the time zone information? Back to the source code:
# tzlocal/unix.py
def _get_localzone(_root='/'):
    """Tries to find the local timezone configuration.
    This method prefers finding the timezone name and passing that to pytz,
    over passing in the localtime file, as in the later case the zoneinfo
    name is unknown.
    The parameter _root makes the function look for files like /etc/localtime
    beneath the _root directory. This is primarily used by the tests.
    In normal usage you call the function without parameters."""
    tzenv = _try_tz_from_env()
    if tzenv:
        return tzenv

    # Now look for distribution specific configuration files
    # that contain the timezone name.
    for configfile in ('etc/timezone', 'var/db/zoneinfo'):
        tzpath = os.path.join(_root, configfile)
        if os.path.exists(tzpath):
            with open(tzpath, 'rb') as tzfile:
                data = tzfile.read()
                # Issue #3 was that /etc/timezone was a zoneinfo file.
                # That's a misconfiguration, but we need to handle it gracefully:
                if data[:5] == 'TZif2':
                    continue
                etctz = data.strip().decode()
                # Get rid of host definitions and comments:
                if ' ' in etctz:
                    etctz, dummy = etctz.split(' ', 1)
                if '#' in etctz:
                    etctz, dummy = etctz.split('#', 1)
                return pytz.timezone(etctz.replace(' ', '_'))

    # CentOS has a ZONE setting in /etc/sysconfig/clock,
    # OpenSUSE has a TIMEZONE setting in /etc/sysconfig/clock and
    # Gentoo has a TIMEZONE setting in /etc/conf.d/clock
    # We look through these files for a timezone:
    zone_re = re.compile('\s*ZONE\s*=\s*\"')
    timezone_re = re.compile('\s*TIMEZONE\s*=\s*\"')
    end_re = re.compile('\"')
    for filename in ('etc/sysconfig/clock', 'etc/conf.d/clock'):
        tzpath = os.path.join(_root, filename)
        if not os.path.exists(tzpath):
            continue
        with open(tzpath, 'rt') as tzfile:
            data = tzfile.readlines()
        for line in data:
            # Look for the ZONE= setting.
            match = zone_re.match(line)
            if match is None:
                # No ZONE= setting. Look for the TIMEZONE= setting.
                match = timezone_re.match(line)
            if match is not None:
                # Some setting existed
                line = line[match.end():]
                etctz = line[:end_re.search(line).start()]
                # We found a timezone
                return pytz.timezone(etctz.replace(' ', '_'))

    # systemd distributions use symlinks that include the zone name,
    # see manpage of localtime(5) and timedatectl(1)
    tzpath = os.path.join(_root, 'etc/localtime')
    if os.path.exists(tzpath) and os.path.islink(tzpath):
        tzpath = os.path.realpath(tzpath)
        start = tzpath.find("/")+1
        while start is not 0:
            tzpath = tzpath[start:]
            try:
                return pytz.timezone(tzpath)
            except pytz.UnknownTimeZoneError:
                pass
            start = tzpath.find("/")+1

    # No explicit setting existed. Use localtime
    for filename in ('etc/localtime', 'usr/local/etc/localtime'):
        tzpath = os.path.join(_root, filename)
        if not os.path.exists(tzpath):
            continue
        with open(tzpath, 'rb') as tzfile:
            return pytz.tzfile.build_tzinfo('local', tzfile)

    raise pytz.UnknownTimeZoneError('Can not find any timezone configuration')

def get_localzone():
    """Get the computers configured local timezone, if any."""
    global _cache_tz
    if _cache_tz is None:
        _cache_tz = _get_localzone()
    return _cache_tz
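The lookup order in the tzlocal source above can be condensed into a small sketch. This is a simplified illustration, not tzlocal's actual code: pick_timezone_source and demo are hypothetical names, only the TZ variable, etc/timezone and etc/localtime are considered, and the function returns a description of where the setting came from rather than a pytz object.

```python
import os
import tempfile


def pick_timezone_source(root="/", environ=None):
    """Return (source, value) for the first time zone setting found.

    Hypothetical helper: mirrors only part of the tzlocal lookup order.
    """
    environ = os.environ if environ is None else environ
    # 1. The TZ environment variable wins if it is set.
    if environ.get("TZ"):
        return ("TZ", environ["TZ"])
    # 2. A zone name stored in etc/timezone comes next.
    name_file = os.path.join(root, "etc/timezone")
    if os.path.exists(name_file):
        with open(name_file, "rb") as f:
            data = f.read()
        # Skip the file if it is actually a binary zoneinfo file.
        if not data.startswith(b"TZif"):
            name = data.strip().decode()
            if name:
                return ("etc/timezone", name.split()[0])
    # 3. Only then is the binary etc/localtime file used.
    if os.path.exists(os.path.join(root, "etc/localtime")):
        return ("etc/localtime", None)
    raise LookupError("no time zone configuration found")


def demo():
    """Build a fake root that has both files and see which one is chosen."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "etc"))
        with open(os.path.join(root, "etc/localtime"), "wb") as f:
            f.write(b"TZif2")  # placeholder bytes, not a real zoneinfo file
        with open(os.path.join(root, "etc/timezone"), "w") as f:
            f.write("Asia/Shanghai\n")
        return pick_timezone_source(root, environ={})


print(demo())
```

When both files exist, the name in etc/timezone is chosen and etc/localtime is never consulted.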
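As the source shows, on Ubuntu tzlocal (and therefore APScheduler) first honors the TZ environment variable, then reads the zone name from /etc/timezone, and only falls back to /etc/localtime if no name is found. So let's look at how the system time zone is configured.
On the host: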
$ cat /etc/timezone
Etc/UTC
$ zdump /etc/localtime
/etc/localtime Wed Nov 21 09:02:10 2018 UTC
In the container:
# cat /etc/timezone
Asia/Shanghai
# zdump /etc/localtime
/etc/localtime Wed Nov 21 09:00:50 2018 UTC
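The container is inconsistent: /etc/timezone names Asia/Shanghai, while /etc/localtime still holds UTC. Because tzlocal reads /etc/timezone first, APScheduler schedules in Asia/Shanghai (UTC+8), so a job meant for 00:00 fires at 00:00 Shanghai time, which the container's UTC clock shows as 16:00 of the previous day.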
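The eight-hour shift can be checked with the standard library alone. In this sketch SHANGHAI is a fixed UTC+8 offset standing in for Asia/Shanghai (an assumption that holds because the zone currently has no daylight saving time), and to_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset used as a stand-in for Asia/Shanghai.
SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def to_utc(local_dt):
    """Convert an aware datetime to UTC."""
    return local_dt.astimezone(timezone.utc)


# Midnight as the cron trigger understands it (Asia/Shanghai) ...
midnight_local = datetime(2018, 11, 22, 0, 0, tzinfo=SHANGHAI)
# ... and the same instant on a clock that displays UTC.
fired_utc = to_utc(midnight_local)
print(fired_utc.isoformat())  # 2018-11-21T16:00:00+00:00
```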
2. Solutions
Option 1:
Set the time explicitly in APScheduler, as follows:
from apscheduler.schedulers.blocking import BlockingScheduler
import pytz

scheduler = BlockingScheduler()

def worker():
    print("hello scheduler")

# Alternative A: compensate by hand, since 08:00 Asia/Shanghai is 00:00 UTC.
scheduler.add_job(worker, 'cron', day="*/1", hour=8)
# Alternative B: pin the trigger to UTC explicitly.
scheduler.add_job(worker, 'cron', day="*/1", timezone=pytz.utc)
scheduler.start()
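Either of the two add_job calls above runs the job at 00:00 UTC every day; use only one of them.
Option 2:
Mount the host's time zone and local time files into the container at run time: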
docker run -v /etc/timezone:/etc/timezone -v /etc/localtime:/etc/localtime -it ubuntu bash
Or change the time zone directly in the Dockerfile by adding:
RUN echo "Etc/UTC" > /etc/timezone
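After applying either variant, it may be worth confirming inside the container that the two files now agree. The following is a minimal sketch: zone_files_agree is a hypothetical helper, and it assumes the zoneinfo database is installed under usr/share/zoneinfo (on slim images the tzdata package may be missing, in which case the function raises FileNotFoundError). It compares raw bytes, so two differently compiled but equivalent zone files would be reported as a mismatch.

```python
import os


def zone_files_agree(root="/"):
    """Return True if etc/localtime matches the zone named in etc/timezone."""
    # Zone name that tzlocal would pick up, e.g. "Etc/UTC".
    with open(os.path.join(root, "etc/timezone")) as f:
        name = f.read().strip()
    # Zoneinfo file for that name (assumed location of the tz database).
    with open(os.path.join(root, "usr/share/zoneinfo", name), "rb") as f:
        expected = f.read()
    # The file the C library and tools such as zdump actually use.
    with open(os.path.join(root, "etc/localtime"), "rb") as f:
        actual = f.read()
    return expected == actual
```

Calling zone_files_agree() with the default root checks the running container; passing another root makes the function easy to try against a scratch directory.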