Celery 源码学习（一）架构分析

1.Celery 是一个简单、灵活且可靠的，处理大量消息的分布式系统，并且提供维护这样一个系统的必需工具。

简单说就是分布式的任务队列

2.消息队列与任务队列区别

可以看我另一篇文章

https://www.jianshu.com/p/cde93d4d00c8

3.在我看来，消息队列和任务队列主要能解决以下场景的问题：

非实时数据离线计算业务逻辑拆分，降低耦合异步响应数据，提高用户体验其实在知乎，很多行为，比如你点一个赞，写一篇回答，发表一篇文章，启用匿名等等，背后都会有很多消息发送出去。不同的业务方收到后做不同的处理，比如算法会计算回答、文章的分值，做一些标签的判定，后台系统可能会记录回答，文章的 meta 信息等等。

4. 什么是 celery？

接触 python 的同学肯定不会陌生，即便没有用过，但应该也会听说过。celery 是 python 世界中最有名的开源消息队列框架。他之所以特别火爆，主要在于它实现了以下几点：性能高，吞吐量大配置灵活，简单易用文档齐全，配套完善其实除了以上的优点外，celery 的源码也有很大的价值，可以说是软件工程设计模式的典范。不过在这篇文章中我不会涉及源码的解读，更多的会介绍下 celery 架构设计中的一些哲学。

5. 高性能：多进程事件驱动的异步模型说到高性能

尤其是对于消息队列来说，很多人不以为然，认为高性能是没有意义的。其实这也不无道理。主要表现为以下几点：离线任务本身不要求实时性，一秒处理 100 和一秒处理 1000 个任务没有本质的区别。如果非要提高吞吐量，可以通过扩容等更加灵活的手段。而且资源的开销并不是很大。吞吐量过高，反而有可能是坏事。比如把上游业务打崩，把 mysql 打崩，把系统连接打满等等。虽然对于离线队列来说，性能不重要。

但是，这并不妨碍我们从学习的角度看待 celery 的架构原则。

celery 的高性能主要靠两个方面来保证，一个是多进程，一个是事件驱动。

下面我分别来讲一下他们的设计思想。珠玉在前很多人应该对 nginx 不陌生，提到 nginx，我们首先想到的，就是 nginx 是一个高性能的反向代理服务器。而 celery，可以说相当程度上借鉴了 nginx。

6.消费模型celery 的核心架构，

分成了调度器（master/main process）和工作进程（slaves/worker processes），也就是我们常说的主从。

celery 的消费模型很简单，调度器负责任务的获取，分发，工作进程（slaves/worker processes）的管理（创建，增加，关闭，重启，丢弃等等），其他辅助模块的维护等等。

工作进程负责消费从调度器传递过来的任务。具体流程：调度器首先预生成（prefork）工作进程，做为一个进程池（mutiprocessing-pool），之后通过事件驱动（select/poll/epoll）的方式，监听内核的事件（读、写、异常等等），如果监听到就执行对应的回调，源源不断的从中间人（broker）那里提取任务，并通过管道（pipe）作为进程间通讯的方式，运用一系列的路由策略（round-robin、weight 等等）交给工作进程。

工作进程消费（ack）任务，再通过管道向调度器进行状态同步（sync），进程间通讯等等行为。

当然，这只是一个很粗粒度的描述，其实 celery 内部还实现了很多有趣的功能，比如 prefetch，集群监控与管理，auto-scaler，容灾恢复等等，这些非核心功能的模块暂时还不会涉及，以后可以单独拆出来看他是怎么实现的。

7.高效的理由可以思考一下，为什么这种架构方式性能非常高。

首先，我们分析下调度器。调度器是一个事件驱动模型，什么事事件驱动，其实就是它消灭了阻塞。正常的单线程模型，一次只能拿一条消息，每一次都要走一条来和回的链路，并且需要一个 while True 的循环不断的去检测，这样无疑是非常低效且开销大的。而事件驱动则不这样，他可以同时发送多个检测的信号，然后就直接挂起，等待内核进行提示，有提示再去执行对应的回调。这样既优雅的化解了单线程每次都要检测的 while True，又通过多次请求并发降低了重复链路。
然后，我们看一下工作进程用多进程的优势。业内有经验的工程师，在配置容器的时候，经常会使用 n 核，n*m worker 数的配置。这是因为，多进程可以良好的发挥每个核的计算能力。

而且多进程良好的分摊了并发请求的处理压力，同时，多进程内部，还可以使用多线程、异步等方式

这样，可以在充分利用多核计算优势的基础上，再充分利用单个线程非阻塞模型的优势。好，关于 celery 的设计架构大概就讲到这里，之后会从源码的角度分析下上面的那一系列流程是怎么实现的

[celery执行器--CeleryExecutor]

由于celery任务执行器不能并行执行，因此开启多进程进行执行

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
"""Celery executor."""
import math
import os
import subprocess
import time
import traceback
from multiprocessing import Pool, cpu_count
from typing import Any, List, Optional, Tuple, Union

from celery import Celery, Task, states as celery_states
from celery.result import AsyncResult

from airflow.config_templates.default_celery import DEFAULT_CELERY_CONFIG
from airflow.configuration import conf
from airflow.exceptions import AirflowException
from airflow.executors.base_executor import BaseExecutor, CommandType
from airflow.models.taskinstance import SimpleTaskInstance, TaskInstanceKeyType, TaskInstanceStateType
from airflow.utils.log.logging_mixin import LoggingMixin
from airflow.utils.module_loading import import_string
from airflow.utils.timeout import timeout

# Make it constant for unit test.
CELERY_FETCH_ERR_MSG_HEADER = 'Error fetching Celery task state'

CELERY_SEND_ERR_MSG_HEADER = 'Error sending Celery task'

'''
To start the celery worker, run the command:
airflow celery worker
'''

if conf.has_option('celery', 'celery_config_options'):
    celery_configuration = import_string(
        conf.get('celery', 'celery_config_options')
    )
else:
    celery_configuration = DEFAULT_CELERY_CONFIG


#app实例
app = Celery(
    conf.get('celery', 'CELERY_APP_NAME'),
    config_source=celery_configuration)


@app.task
def execute_command(command_to_exec: CommandType) -> None:
    """
    需要执行的命令
    Executes command.

    subprocess.check_call(args, *, stdin = None, stdout = None, stderr = None, shell = False)
    与call方法类似，不同在于如果命令行执行成功，check_call返回返回码0，否则抛出subprocess.CalledProcessError异常。
    subprocess.CalledProcessError异常包括returncode、cmd、output等属性，其中returncode是子进程的退出码，cmd是子进程的执行命令，output为None。

    当子进程退出异常时，则报错

    """
    log = LoggingMixin().log
    log.info("Executing command in Celery: %s", command_to_exec)
    env = os.environ.copy()
    try:
        subprocess.check_call(command_to_exec, stderr=subprocess.STDOUT,
                              close_fds=True, env=env)
    except subprocess.CalledProcessError as e:
        log.exception('execute_command encountered a CalledProcessError')
        log.error(e.output)
        raise AirflowException('Celery command failed')


class ExceptionWithTraceback:
    """
    包装器
    Wrapper class used to propagate exceptions to parent processes from subprocesses.

    :param exception: The exception to wrap
    :type exception: Exception
    :param exception_traceback: The stacktrace to wrap
    :type exception_traceback: str
    """

    def __init__(self, exception: Exception, exception_traceback: str):
        self.exception = exception
        self.traceback = exception_traceback


def fetch_celery_task_state(celery_task: Tuple[TaskInstanceKeyType, AsyncResult]) \
        -> Union[TaskInstanceStateType, ExceptionWithTraceback]:
    """

    返回任务状态

    Fetch and return the state of the given celery task. The scope of this function is
    global so that it can be called by subprocesses in the pool.

    :param celery_task: a tuple of the Celery task key and the async Celery object used
        to fetch the task's state
    :type celery_task: tuple(str, celery.result.AsyncResult)
    :return: a tuple of the Celery task key and the Celery state of the task
    :rtype: tuple[str, str]
    """

    try:
        with timeout(seconds=2):
            # Accessing state property of celery task will make actual network request
            # to get the current state of the task.
            return celery_task[0], celery_task[1].state
    except Exception as e:  # pylint: disable=broad-except
        exception_traceback = "Celery Task ID: {}\n{}".format(celery_task[0],
                                                              traceback.format_exc())
        return ExceptionWithTraceback(e, exception_traceback)


# Task instance that is sent over Celery queues
# TaskInstanceKeyType, SimpleTaskInstance, Command, queue_name, CallableTask
TaskInstanceInCelery = Tuple[TaskInstanceKeyType, SimpleTaskInstance, CommandType, Optional[str], Task]


def send_task_to_executor(task_tuple: TaskInstanceInCelery) \
        -> Tuple[TaskInstanceKeyType, CommandType, Union[AsyncResult, ExceptionWithTraceback]]:
    """

    发送任务到celery执行器
    Sends task to executor."""
    key, _, command, queue, task_to_run = task_tuple
    try:
        with timeout(seconds=2):
            #异步执行
            result = task_to_run.apply_async(args=[command], queue=queue)
    except Exception as e:  # pylint: disable=broad-except
        exception_traceback = "Celery Task ID: {}\n{}".format(key, traceback.format_exc())
        result = ExceptionWithTraceback(e, exception_traceback)

    return key, command, result


class CeleryExecutor(BaseExecutor):
    """
    CeleryExecutor is recommended for production use of Airflow. It allows
    distributing the execution of task instances to multiple worker nodes.

    Celery is a simple, flexible and reliable distributed system to process
    vast amounts of messages, while providing operations with the tools
    required to maintain such a system.
    
    celery执行器只能同步执行（不能调用execute_async）,不能异步执行;由于开启了多进程， 因此加速了执行。
    由于celery本身是异步的，本质上来说，还是异步执行
    
    """

    def __init__(self):
        super().__init__()

        # Celery doesn't support querying the state of multiple tasks in parallel
        # (which can become a bottleneck on bigger clusters) so we use
        # a multiprocessing pool to speed this up.
        # How many worker processes are created for checking celery task state.
        self._sync_parallelism = conf.getint('celery', 'SYNC_PARALLELISM')
        if self._sync_parallelism == 0:
            self._sync_parallelism = max(1, cpu_count() - 1)

        self._sync_pool = None
        #正在运行的任务
        self.tasks = {}
        #最近的状态
        self.last_state = {}

    def start(self) -> None:
        self.log.debug(
            'Starting Celery Executor using %s processes for syncing',
            self._sync_parallelism
        )

    def _num_tasks_per_send_process(self, to_send_count: int) -> int:
        """
        每个进程多少个任务

        任务数量 / 并行度
        How many Celery tasks should each worker process send.

        :return: Number of tasks that should be sent per process
        :rtype: int
        """
        return max(1,
                   int(math.ceil(1.0 * to_send_count / self._sync_parallelism)))

    def _num_tasks_per_fetch_process(self) -> int:
        """
        How many Celery tasks should be sent to each worker process.
        一次发送多少任务

        :return: Number of tasks that should be used per process
        :rtype: int
        """
        return max(1, int(math.ceil(1.0 * len(self.tasks) / self._sync_parallelism)))

    def trigger_tasks(self, open_slots: int) -> None:
        """

        触发任务

        Overwrite trigger_tasks function from BaseExecutor

        :param open_slots: Number of open slots
        :return:
        """
        sorted_queue = self.order_queued_tasks_by_priority()

        task_tuples_to_send: List[TaskInstanceInCelery] = []

        for _ in range(min((open_slots, len(self.queued_tasks)))):
            key, (command, _, queue, simple_ti) = sorted_queue.pop(0)
            task_tuples_to_send.append((key, simple_ti, command, queue, execute_command))

        cached_celery_backend = None
        if task_tuples_to_send:
            tasks = [t[4] for t in task_tuples_to_send]

            # Celery state queries will stuck if we do not use one same backend for all tasks.
            cached_celery_backend = tasks[0].backend

        if task_tuples_to_send:
            # Use chunks instead of a work queue to reduce context switching
            # since tasks are roughly uniform in size
            chunksize = self._num_tasks_per_send_process(len(task_tuples_to_send))
            #进程数量
            num_processes = min(len(task_tuples_to_send), self._sync_parallelism)

            #进程池
            send_pool = Pool(processes=num_processes)
            key_and_async_results = send_pool.map(
                send_task_to_executor,
                task_tuples_to_send,
                chunksize=chunksize)

            send_pool.close()
            send_pool.join()
            self.log.debug('Sent all tasks.')

            for key, command, result in key_and_async_results:
                if isinstance(result, ExceptionWithTraceback):
                    self.log.error(
                        CELERY_SEND_ERR_MSG_HEADER + ":%s\n%s\n", result.exception, result.traceback
                    )
                elif result is not None:
                    # Only pops when enqueued successfully, otherwise keep it
                    # and expect scheduler loop to deal with it.
                    self.queued_tasks.pop(key)
                    result.backend = cached_celery_backend
                    self.running.add(key)
                    self.tasks[key] = result
                    self.last_state[key] = celery_states.PENDING

    def sync(self) -> None:
        """同步结果"""
        num_processes = min(len(self.tasks), self._sync_parallelism)
        if num_processes == 0:
            self.log.debug("No task to query celery, skipping sync")
            return

        self.log.debug("Inquiring about %s celery task(s) using %s processes",
                       len(self.tasks), num_processes)

        # Recreate the process pool each sync in case processes in the pool die
        self._sync_pool = Pool(processes=num_processes)

        # Use chunks instead of a work queue to reduce context switching since tasks are
        # roughly uniform in size
        chunksize = self._num_tasks_per_fetch_process()

        self.log.debug("Waiting for inquiries to complete...")
        #获取任务的结果
        task_keys_to_states = self._sync_pool.map(
            fetch_celery_task_state,
            self.tasks.items(),
            chunksize=chunksize)
        self._sync_pool.close()
        self._sync_pool.join()
        self.log.debug("Inquiries completed.")

        self.update_task_states(task_keys_to_states)

    def update_task_states(self,
                           task_keys_to_states: List[Union[TaskInstanceStateType,
                                                           ExceptionWithTraceback]]) -> None:
        """
         更新所有任务状态

        Updates states of the tasks."""
        for key_and_state in task_keys_to_states:
            if isinstance(key_and_state, ExceptionWithTraceback):
                self.log.error(
                    CELERY_FETCH_ERR_MSG_HEADER + ", ignoring it:%s\n%s\n",
                    repr(key_and_state.exception), key_and_state.traceback
                )
                continue
            key, state = key_and_state
            self.update_task_state(key, state)

    def update_task_state(self, key: TaskInstanceKeyType, state: str) -> None:
        """
        更新任务状态
        Updates state of a single task."""
        # noinspection PyBroadException
        try:
            #只有状态发生变化才处理
            if self.last_state[key] != state:
                if state == celery_states.SUCCESS:
                    self.success(key)
                    del self.tasks[key]
                    del self.last_state[key]
                elif state == celery_states.FAILURE:
                    self.fail(key)
                    del self.tasks[key]
                    del self.last_state[key]
                elif state == celery_states.REVOKED:
                    self.fail(key)
                    del self.tasks[key]
                    del self.last_state[key]
                else:
                    self.log.info("Unexpected state: %s", state)
                    self.last_state[key] = state
        except Exception:  # pylint: disable=broad-except
            self.log.exception("Error syncing the Celery executor, ignoring it.")

    def end(self, synchronous: bool = False) -> None:
        if synchronous:
            while any([task.state not in celery_states.READY_STATES for task in self.tasks.values()]):
                time.sleep(5)
        self.sync()

    def execute_async(self,
                      key: TaskInstanceKeyType,
                      command: CommandType,
                      queue: Optional[str] = None,
                      executor_config: Optional[Any] = None):
        """
        不允许异步运行
        Do not allow async execution for Celery executor."""
        raise AirflowException("No Async execution for Celery executor.")

    def terminate(self):
        pass