Starting with Interrupts

Introduction

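For a pure software engineer, the interrupt is a concept that is both familiar and unfamiliar. Familiar, because it is an important concept in university courses such as microcomputer principles; many electrical engineering courses cover it, usually with accompanying lab work (the 8259A interrupt controller is the one I remember using). Unfamiliar, because modern operating system kernels wrap interrupts in many layers of abstraction, so unless you are a driver engineer you rarely get to work with them directly. Even so, interrupts matter a great deal for understanding how software actually runs.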

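For example, a modern operating system is essentially driven by interrupts. The time-slice mechanism of Linux process scheduling is driven by the timer interrupt. Peripherals such as network cards normally use hardware interrupts to make the CPU send and receive packets. In older kernels, system calls also trapped from user mode into kernel mode through software interrupt 0x80.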

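This article tries to explain a few things from an architect's point of view:

What an interrupt is, and why interrupts exist
How interrupts typically work
A few questions related to interrupts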

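What Is an Interrupt

In the world of computers, the CPU is the absolute core. What the CPU executes is machine code made of binary 0s and 1s, which is too abstract for humans to understand. So assembly language came along, and then high-level languages built on top of it. In essence, they all use CPU instructions to operate on data to achieve our goals. Machines have no intelligence (even so-called artificial intelligence, at its current stage, only reflects human intelligence); they can only carry out computation along a preset flow. Early computer programs were in fact just punched paper tape that performed fixed tasks. That approach can only handle relatively simple, mechanical tasks; for complex tasks, the code complexity would exceed what humans can manage. So how can such complex tasks be accomplished?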

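In my view, a system that can carry out relatively complex tasks needs at least the following capabilities:

The ability to handle external input and output.
The ability of the program itself to handle exceptions.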

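In modern computer architecture, the first is implemented by hardware interrupts. Take a network card: after a packet arrives and has been copied into its buffer by DMA, the card raises a hardware interrupt to notify the CPU. The second is handled through so-called software interrupts (note: software interrupt, not softirq). The CPU provides the int instruction to raise them explicitly, and it raises exceptions on its own when an instruction faults. Well-known examples are int 3 (the debug breakpoint) and vector 14 (page fault).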

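Today's interrupt mechanisms are fairly complex. In terms of when they occur, interrupts are either synchronous or asynchronous. Hardware interrupts raised by external devices are generally asynchronous, since they can arrive at any point in the instruction stream. The interrupts used for exception handling are produced by the executing instruction itself and must be handled immediately, so they are synchronous.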

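For efficiency and performance, hardware interrupt handling is split into a top half and a bottom half. The top half runs in hard interrupt context and must finish in a very short time, so its code has to be short and fast; it usually does little more than set up the bottom half and leaves most of the heavy work to it. Depending on how they interact with process scheduling, bottom halves are further divided into softirq and tasklet, which run in softirq context, and threaded irq and workqueue, which run in process context. If the softirq routines still have pending work after 10 rounds of processing, the kernel thread ksoftirqd takes over. ksoftirqd runs in process context, although softirq handlers running inside ksoftirqd still must not sleep.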
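The hand-off to ksoftirqd after 10 rounds can be sketched in user space. This is a simplified model under my own assumptions, not kernel code: model_raise, model_register, run_softirqs, greedy_handler, and handler_runs are hypothetical names, and only the limit of 10 rounds comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOFTIRQS 10
#define MAX_RESTART 10          /* the "10 rounds" limit described above */

typedef void (*softirq_fn)(void);

/* Hypothetical model state: one pending bitmask and one handler table. */
static uint32_t pending;
static softirq_fn handlers[NR_SOFTIRQS];
int handler_runs;               /* number of handler invocations so far */

/* Mark softirq 'nr' as pending. */
void model_raise(unsigned int nr)
{
    assert(nr < NR_SOFTIRQS);
    pending |= 1u << nr;
}

/* Install a handler for softirq 'nr'. */
void model_register(unsigned int nr, softirq_fn fn)
{
    assert(nr < NR_SOFTIRQS);
    handlers[nr] = fn;
}

/* A greedy handler that raises itself again every time it runs. */
void greedy_handler(void)
{
    handler_runs++;
    model_raise(3);
}

/*
 * Run pending handlers for at most MAX_RESTART rounds.
 * Returns 1 if work is still pending afterwards, which is the point
 * where the kernel would hand the rest over to ksoftirqd.
 */
int run_softirqs(void)
{
    int rounds = MAX_RESTART;

    while (pending && rounds-- > 0) {
        uint32_t snapshot = pending;

        pending = 0;            /* handlers may set bits again */
        for (unsigned int nr = 0; nr < NR_SOFTIRQS; nr++)
            if ((snapshot & (1u << nr)) && handlers[nr])
                handlers[nr]();
    }
    return pending != 0;
}
```

Registering greedy_handler for softirq 3, raising it once, and calling run_softirqs() runs the handler ten times and then reports that work is still pending. A well-behaved handler that does not re-raise itself would finish in a single round.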

Interrupts from hardware peripherals can be masked by software, but there are also non-maskable interrupts such as the NMI.

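Software interrupts are dispatched through interrupt gates, trap gates, and task gates. On Intel CPUs they occupy interrupt vectors 0-31 and cannot be masked. They include vector 3 and vector 14 mentioned above and are used for debugging, exception handling, and similar tasks. They are defined in the kernel's traps.h.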

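How Interrupts Typically Work

How the Top Half Works

Linux currently supports up to 256 interrupt vectors. Vectors 0-31 are defined by Intel for faults, traps, aborts, and other exceptions. Vectors 32-127 are used by external devices to communicate with the CPU. Vector 128 (0x80) was used in older kernels to switch from user mode to kernel mode for system calls. The table below, taken from the internet, shows how Linux allocates interrupt vectors.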

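Interrupt vector	Usage
0-19	Non-maskable interrupts and exceptions
20-31	Reserved by Intel
32-127	External interrupts
128	System call
129-238	External interrupts
239	Local APIC timer interrupt
240-250	Reserved by Linux
251-255	Inter-processor interrupts (SMP)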

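This table is fairly old. Things have not changed much since, and it may not be entirely accurate, but it helps build understanding.

As you can see, the number of interrupt vectors is limited, while the system may have to support many peripherals. Modern devices, high-speed network cards in particular, may have multiple queues and use several interrupt numbers. Can multiple hardware interrupts then be mapped to the same interrupt vector? Yes. Linux defines a data structure named irq_desc to wrap each external interrupt; we call it the interrupt descriptor.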

struct irq_desc {
    struct irq_data     irq_data;
    unsigned int __percpu   *kstat_irqs;
    irq_flow_handler_t  handle_irq;
#ifdef CONFIG_IRQ_PREFLOW_FASTEOI
    irq_preflow_handler_t   preflow_handler;
#endif
    struct irqaction    *action;    /* IRQ action list */
    unsigned int        status_use_accessors;
    unsigned int        core_internal_state__do_not_mess_with_it;
    unsigned int        depth;      /* nested irq disables */
    unsigned int        wake_depth; /* nested wake enables */
    unsigned int        irq_count;  /* For detecting broken IRQs */
    unsigned long       last_unhandled; /* Aging timer for unhandled count */
    unsigned int        irqs_unhandled;
    atomic_t        threads_handled;
    int         threads_handled_last;
    raw_spinlock_t      lock;
    struct cpumask      *percpu_enabled;
#ifdef CONFIG_SMP
    const struct cpumask    *affinity_hint;
    struct irq_affinity_notify *affinity_notify;
#ifdef CONFIG_GENERIC_PENDING_IRQ
    cpumask_var_t       pending_mask;
#endif
#endif
    unsigned long       threads_oneshot;
    atomic_t        threads_active;
    wait_queue_head_t       wait_for_threads;
#ifdef CONFIG_PROC_FS
    struct proc_dir_entry   *dir;
#endif
    int         parent_irq;
    struct module       *owner;
    const char      *name;
} ____cacheline_internodealigned_in_smp;

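Here struct irq_data irq_data wraps the hardware-related data of the interrupt and the operations on it.

struct irqaction *action is a linked list. If the interrupt line is shared, the list has multiple entries; otherwise it has exactly one.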

struct irqaction {
    irq_handler_t       handler;
    void            *dev_id;
    void __percpu       *percpu_dev_id;
    struct irqaction    *next;
    irq_handler_t       thread_fn;
    struct task_struct  *thread;
    unsigned int        irq;
    unsigned int        flags;
    unsigned long       thread_flags;
    unsigned long       thread_mask;
    const char      *name;
    struct proc_dir_entry   *dir;
} ____cacheline_internodealigned_in_smp;

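Pay attention to the following members:

irq_handler_t handler is the top-half routine of the interrupt.
irq_handler_t thread_fn is the routine for the threaded bottom half. If it is not NULL, the task descriptor of the corresponding kernel thread is stored in the pointer *thread.
dev_id holds information about the corresponding peripheral.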
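To make the role of the action list and dev_id more concrete, here is a minimal user-space sketch of a shared interrupt line. It is a model under my own assumptions and not kernel code: struct action is a trimmed-down stand-in for struct irqaction, and struct fake_dev, fake_handler, and dispatch are hypothetical names. Each handler uses dev_id to find its own device and returns IRQ_NONE when the interrupt is not its own.

```c
#include <assert.h>
#include <stddef.h>

/* Return values modeled on the kernel's irqreturn_t. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 };

typedef enum irqreturn (*irq_handler_t)(int irq, void *dev_id);

/* Trimmed-down stand-in for struct irqaction. */
struct action {
    irq_handler_t handler;      /* top half */
    void *dev_id;               /* identifies the device on a shared line */
    struct action *next;        /* next action sharing this line */
};

/* Hypothetical device: 'asserted' says whether it raised the interrupt. */
struct fake_dev {
    int asserted;
    int serviced;
};

/* Top half: claim the interrupt only if our own device raised it. */
enum irqreturn fake_handler(int irq, void *dev_id)
{
    struct fake_dev *dev = dev_id;

    (void)irq;
    if (!dev->asserted)
        return IRQ_NONE;        /* not ours, let the next action try */
    dev->asserted = 0;
    dev->serviced++;
    return IRQ_HANDLED;
}

/* Call every action on the line and OR the results together. */
int dispatch(int irq, struct action *action)
{
    int retval = IRQ_NONE;

    for (; action != NULL; action = action->next) {
        assert(action->handler != NULL);
        retval |= action->handler(irq, action->dev_id);
    }
    return retval;
}
```

With two fake_dev instances chained on one line and only the second one asserted, dispatch() returns IRQ_HANDLED and only the second device is serviced. If no device claims the interrupt, the result stays IRQ_NONE, which is the case the kernel tracks to detect spurious interrupts.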

When a hardware interrupt actually occurs, handle_IRQ is called first:

/*
 * handle_IRQ handles all hardware IRQ's.  Decoded IRQs should
 * not come via this function.  Instead, they should provide their
 * own 'handler'.  Used by platform code implementing C-based 1st
 * level decoding.
 */
void handle_IRQ(unsigned int irq, struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);

    irq_enter();

    /*
     * Some hardware gives randomly wrong interrupts.  Rather
     * than crashing, do something sensible.
     */
    if (unlikely(irq >= nr_irqs)) {
        if (printk_ratelimit())
            printk(KERN_WARNING "Bad IRQ%u\n", irq);
        ack_bad_irq(irq);
    } else {
        generic_handle_irq(irq);
    }

    irq_exit();
    set_irq_regs(old_regs);
}

The actual processing happens in generic_handle_irq:

/**
 * generic_handle_irq - Invoke the handler for a particular irq
 * @irq:    The irq number to handle
 *
 */
int generic_handle_irq(unsigned int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);

    if (!desc)
        return -EINVAL;
    generic_handle_irq_desc(irq, desc);
    return 0;
}

/*
 * Architectures call this to let the generic IRQ layer
 * handle an interrupt. If the descriptor is attached to an
 * irqchip-style controller then we call the ->handle_irq() handler,
 * and it calls __do_IRQ() if it's attached to an irqtype-style controller.
 */
static inline void generic_handle_irq_desc(unsigned int irq, struct irq_desc *desc)
{
    desc->handle_irq(irq, desc);
}

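What is ultimately called is handle_irq, the generic top-half flow handler stored in the interrupt descriptor. It is set during initialization through irq_set_handler --> __irq_set_handler: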

void
__irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
          const char *name)
{
    unsigned long flags;
    struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, 0);

    if (!desc)
        return;

    if (!handle) {
        handle = handle_bad_irq;
    } else {
        if (WARN_ON(desc->irq_data.chip == &no_irq_chip))
            goto out;
    }

    /* Uninstall? */
    if (handle == handle_bad_irq) {
        if (desc->irq_data.chip != &no_irq_chip)
            mask_ack_irq(desc);
        irq_state_set_disabled(desc);
        desc->depth = 1;
    }
    desc->handle_irq = handle;
    desc->name = name;

    if (handle != handle_bad_irq && is_chained) {
        irq_settings_set_noprobe(desc);
        irq_settings_set_norequest(desc);
        irq_settings_set_nothread(desc);
        irq_startup(desc, true);
    }
out:
    irq_put_desc_busunlock(desc, flags);
}
EXPORT_SYMBOL_GPL(__irq_set_handler);

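There are generally two trigger types, level-triggered and edge-triggered, and level-triggered is the usual default. Because several hardware interrupt sources can be attached to the same interrupt vector, and the descriptor has only one handle_irq flow handler, we can infer that physical devices sharing a vector must use the same trigger type.

Processing then continues in the corresponding level or edge flow handler: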

void
handle_level_irq(unsigned int irq, struct irq_desc *desc)
{
    raw_spin_lock(&desc->lock);
    mask_ack_irq(desc);

    if (unlikely(irqd_irq_inprogress(&desc->irq_data)))
        if (!irq_check_poll(desc))
            goto out_unlock;

    desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
    kstat_incr_irqs_this_cpu(irq, desc);

    /*
     * If its disabled or no action available
     * keep it masked and get out of here
     */
    if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {
        desc->istate |= IRQS_PENDING;
        goto out_unlock;
    }

    handle_irq_event(desc);

    cond_unmask_irq(desc);

out_unlock:
    raw_spin_unlock(&desc->lock);
}
EXPORT_SYMBOL_GPL(handle_level_irq);

After the state checks and updates, handle_irq_event is called, and only then does the actual top-half handler run.

irqreturn_t handle_irq_event(struct irq_desc *desc)
{
    struct irqaction *action = desc->action;
    irqreturn_t ret;

    desc->istate &= ~IRQS_PENDING;
    irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
    raw_spin_unlock(&desc->lock);

    ret = handle_irq_event_percpu(desc, action);

    raw_spin_lock(&desc->lock);
    irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
    return ret;
}

The real work is done in handle_irq_event_percpu:

irqreturn_t
handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
{
    irqreturn_t retval = IRQ_NONE;
    unsigned int flags = 0, irq = desc->irq_data.irq;

    do {
        irqreturn_t res;

        trace_irq_handler_entry(irq, action);
        res = action->handler(irq, action->dev_id);
        trace_irq_handler_exit(irq, action, res);

        if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pF enabled interrupts\n",
                  irq, action->handler))
            local_irq_disable();

        switch (res) {
        case IRQ_WAKE_THREAD:
            /*
             * Catch drivers which return WAKE_THREAD but
             * did not set up a thread function
             */
            if (unlikely(!action->thread_fn)) {
                warn_no_thread(irq, action);
                break;
            }

            __irq_wake_thread(desc, action);

            /* Fall through to add to randomness */
        case IRQ_HANDLED:
            flags |= action->flags;
            break;

        default:
            break;
        }

        retval |= res;
        action = action->next;
    } while (action);

    add_interrupt_randomness(irq, flags);

    if (!noirqdebug)
        note_interrupt(irq, desc, retval);
    return retval;
}

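As you can see, this function walks the action list and runs each registered top-half handler. When an interrupt is registered, the driver can choose to make it threaded and supply a thread function. During interrupt setup, the corresponding kernel thread is created and put to sleep. After the top half finishes, if the interrupt was registered as threaded, the handler returns IRQ_WAKE_THREAD and the handling thread is woken to run the bottom half. A bottom half triggered this way runs in process context and can be preempted.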

        switch (res) {
        case IRQ_WAKE_THREAD:
            /*
             * Catch drivers which return WAKE_THREAD but
             * did not set up a thread function
             */
            if (unlikely(!action->thread_fn)) {
                warn_no_thread(irq, action);
                break;
            }

            __irq_wake_thread(desc, action);
            /* ... remainder of the switch as in the full listing ... */
        }

How the Bottom Half Works

The bottom half can work in the following ways:

Running in interrupt context:

Softirq
Tasklet

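A tasklet is a special kind of softirq. The difference is that a tasklet is not reentrant: on a multi-core system, a given tasklet runs on only one CPU at a time, while the same softirq can run on several CPUs concurrently.

Running in process context: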

threaded irq
workqueue

Here I will focus on softirq.

There are currently 10 kinds of softirq defined:

enum
{
    HI_SOFTIRQ=0,
    TIMER_SOFTIRQ,
    NET_TX_SOFTIRQ,
    NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ,
    BLOCK_IOPOLL_SOFTIRQ,
    TASKLET_SOFTIRQ,
    SCHED_SOFTIRQ,
    HRTIMER_SOFTIRQ,
    RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */

    NR_SOFTIRQS
};

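Note that tasklets take up two of them, HI_SOFTIRQ and TASKLET_SOFTIRQ. Among the other predefined softirqs, some are for network transmit and receive, some for timers and process scheduling, some for block devices, and the last one is for RCU.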

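If we list system processes with ps, we can see several ksoftirqd processes. The kernel creates one such kernel thread per CPU to handle the softirqs on that core. A softirq runs in softirq context and cannot be preempted by a process or by another softirq, but it can be interrupted by a hardware interrupt.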

So how is softirq execution triggered? The kernel provides the raise_softirq API to raise a softirq.

void raise_softirq(unsigned int nr)
{
    unsigned long flags;

    local_irq_save(flags);
    raise_softirq_irqoff(nr);
    local_irq_restore(flags);
}

#define or_softirq_pending(x)   this_cpu_or(irq_stat.__softirq_pending, (x))

void __raise_softirq_irqoff(unsigned int nr)
{
    trace_softirq_raise(nr);
    or_softirq_pending(1UL << nr);
}

/*
 * This function must run with irqs disabled!
 */
inline void raise_softirq_irqoff(unsigned int nr)
{
    __raise_softirq_irqoff(nr);

    /*
     * If we're in an interrupt or softirq, we're done
     * (this also catches softirq-disabled code). We will
     * actually run the softirq once we return from
     * the irq or softirq.
     *
     * Otherwise we wake up ksoftirqd to make sure we
     * schedule the softirq soon.
     */
    if (!in_interrupt())
        wakeup_softirqd();
}

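As the code above shows, the key work is done in or_softirq_pending, which sets the corresponding bit in the current CPU's __softirq_pending. If the caller is not already in interrupt context, ksoftirqd is then woken to run the softirq.

So when does this flow get invoked?

First, softirq processing is triggered when the top half finishes. Where does that happen? Look at the handle_IRQ source above: it calls irq_exit(), and the softirq trigger is hidden there.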
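The behaviour of the raise path can be modeled in a few lines of user-space C. This is only a sketch under my own assumptions: struct cpu_model, model_raise_softirq, and model_next_pending are hypothetical names, the in_interrupt flag stands in for the kernel's in_interrupt() check, and only four of the softirq numbers are modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Only the first four softirq numbers are modeled here. */
enum { HI_SOFTIRQ, TIMER_SOFTIRQ, NET_TX_SOFTIRQ, NET_RX_SOFTIRQ };
#define NR_MODEL_SOFTIRQS 4u

/* Hypothetical stand-ins for per-CPU kernel state. */
struct cpu_model {
    uint32_t softirq_pending;   /* models irq_stat.__softirq_pending */
    int in_interrupt;           /* non-zero while in hardirq/softirq context */
    int ksoftirqd_wakeups;      /* how often ksoftirqd would have been woken */
};

/* Mark the softirq pending; wake the thread only outside interrupt context. */
void model_raise_softirq(struct cpu_model *cpu, unsigned int nr)
{
    assert(nr < NR_MODEL_SOFTIRQS);
    cpu->softirq_pending |= 1u << nr;
    if (!cpu->in_interrupt)
        cpu->ksoftirqd_wakeups++;
}

/* Return the lowest pending softirq number and clear it, or -1 if none. */
int model_next_pending(struct cpu_model *cpu)
{
    for (unsigned int nr = 0; nr < NR_MODEL_SOFTIRQS; nr++) {
        if (cpu->softirq_pending & (1u << nr)) {
            cpu->softirq_pending &= ~(1u << nr);
            return (int)nr;
        }
    }
    return -1;
}
```

Raising NET_RX_SOFTIRQ and TIMER_SOFTIRQ while in_interrupt is set only sets bits 3 and 1 of the mask. Raising from outside interrupt context additionally counts a ksoftirqd wakeup. Pending softirqs are then picked up lowest number first, which is why HI_SOFTIRQ has the highest priority.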

/*
 * Exit an interrupt context. Process softirqs if needed and possible:
 */
void irq_exit(void)
{
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
    local_irq_disable();
#else
    WARN_ON_ONCE(!irqs_disabled());
#endif

    account_irq_exit_time(current);
    preempt_count_sub(HARDIRQ_OFFSET);
    if (!in_interrupt() && local_softirq_pending())
        invoke_softirq();

    tick_irq_exit();
    rcu_irq_exit();
    trace_hardirq_exit(); /* must be last! */
}

Notice the call to invoke_softirq() here:

static inline void invoke_softirq(void)
{
    if (!force_irqthreads) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
        /*
         * We can safely execute softirq on the current stack if
         * it is the irq stack, because it should be near empty
         * at this stage.
         */
        __do_softirq();
#else
        /*
         * Otherwise, irq_exit() is called on the task stack that can
         * be potentially deep already. So call softirq in its own stack
         * to prevent from any overrun.
         */
        do_softirq_own_stack();
#endif
    } else {
        wakeup_softirqd();
    }
}

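As the code shows, in the normal case __do_softirq() runs the pending softirqs of this core right away, either directly on the irq stack or through do_softirq_own_stack(). Only when force_irqthreads is set does invoke_softirq() call wakeup_softirqd() and leave the work to ksoftirqd.

Second, a softirq can also be raised directly by an interrupt handler in the top half. One example is SCHED_SOFTIRQ, used for periodic scheduling work. In the Completely Fair Scheduler (CFS), it is raised in trigger_load_balance: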

/*
 * Trigger the SCHED_SOFTIRQ if it is time to do periodic load balancing.
 */
void trigger_load_balance(struct rq *rq)
{
    /* Don't need to rebalance while attached to NULL domain */
    if (unlikely(on_null_domain(rq)))
        return;

    if (time_after_eq(jiffies, rq->next_balance))
        raise_softirq(SCHED_SOFTIRQ);
#ifdef CONFIG_NO_HZ_COMMON
    if (nohz_kick_needed(rq))
        nohz_balancer_kick();
#endif
}

/*
 * This function gets called by the timer code, with HZ frequency.
 * We call it with interrupts disabled.
 */
void scheduler_tick(void)
{
    int cpu = smp_processor_id();
    struct rq *rq = cpu_rq(cpu);
    struct task_struct *curr = rq->curr;

    sched_clock_tick();

    raw_spin_lock(&rq->lock);
    update_rq_clock(rq);
    curr->sched_class->task_tick(rq, curr, 0);
    update_cpu_load_active(rq);
    raw_spin_unlock(&rq->lock);

    perf_event_task_tick();

#ifdef CONFIG_SMP
    rq->idle_balance = idle_cpu(cpu);
    trigger_load_balance(rq);
#endif
    rq_last_tick_reset(rq);
}

/*
 * Called from the timer interrupt handler to charge one tick to the current
 * process.  user_tick is 1 if the tick is user time, 0 for system.
 */
void update_process_times(int user_tick)
{
    struct task_struct *p = current;
    int cpu = smp_processor_id();

    /* Note: this timer irq context must be accounted for as well. */
    account_process_tick(p, user_tick);
    run_local_timers();
    rcu_check_callbacks(cpu, user_tick);
#ifdef CONFIG_IRQ_WORK
    if (in_irq())
        irq_work_run();
#endif
    scheduler_tick();
    run_posix_cpu_timers(p);
}

static irqreturn_t
timer_interrupt (int irq, void *dev_id)
{
    unsigned long new_itm;

    if (cpu_is_offline(smp_processor_id())) {
        return IRQ_HANDLED;
    }

    platform_timer_interrupt(irq, dev_id);

    new_itm = local_cpu_data->itm_next;

    if (!time_after(ia64_get_itc(), new_itm))
        printk(KERN_ERR "Oops: timer tick before it's due (itc=%lx,itm=%lx)\n",
               ia64_get_itc(), new_itm);

    profile_tick(CPU_PROFILING);

    if (paravirt_do_steal_accounting(&new_itm))
        goto skip_process_time_accounting;

    while (1) {
        update_process_times(user_mode(get_irq_regs()));

        new_itm += local_cpu_data->itm_delta;

        if (smp_processor_id() == time_keeper_id)
            xtime_update(1);

        local_cpu_data->itm_next = new_itm;

        if (time_after(new_itm, ia64_get_itc()))
            break;

        /*
         * Allow IPIs to interrupt the timer loop.
         */
        local_irq_enable();
        local_irq_disable();
    }

skip_process_time_accounting:

    do {
        /*
         * If we're too close to the next clock tick for
         * comfort, we increase the safety margin by
         * intentionally dropping the next tick(s).  We do NOT
         * update itm.next because that would force us to call
         * xtime_update() which in turn would let our clock run
         * too fast (with the potentially devastating effect
         * of losing monotony of time).
         */
        while (!time_after(new_itm, ia64_get_itc() + local_cpu_data->itm_delta/2))
            new_itm += local_cpu_data->itm_delta;
        ia64_set_itm(new_itm);
        /* double check, in case we got hit by a (slow) PMI: */
    } while (time_after_eq(ia64_get_itc(), new_itm));
    return IRQ_HANDLED;
}

static struct irqaction timer_irqaction = {
    .handler =  timer_interrupt,
    .flags =    IRQF_IRQPOLL,
    .name =     "timer"
};

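The call chain is timer_interrupt --> update_process_times --> scheduler_tick --> trigger_load_balance. This chain is from the IA64 platform. I did not track down which interrupt the path starts from on x86, but the principle should be much the same.

Interrupt Context

As discussed above, hard interrupts and softirqs run in different contexts: there is hard interrupt context, there is softirq context, and some bottom-half routines run in process context.

The kernel keeps a per-CPU variable, preempt_count, that records the nesting depth of softirqs, hard interrupts, and kernel preemption disabling.

As shown in the figure below, it is a 32-bit integer, of which only 22 bits are currently used:

[Figure: Linux-Kernel.png, bit layout of preempt_count]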

Hard Interrupt Context

During hard interrupt handling, irq_enter is called to enter hard interrupt context:

/*
 * Enter an interrupt context.
 */
void irq_enter(void)
{
    rcu_irq_enter();
    if (is_idle_task(current) && !in_interrupt()) {
        /*
         * Prevent raise_softirq from needlessly waking up ksoftirqd
         * here, as softirq will be serviced on return from interrupt.
         */
        local_bh_disable();
        tick_irq_enter();
        _local_bh_enable();
    }

    __irq_enter();
}

/*
 * It is safe to do non-atomic ops on ->hardirq_context,
 * because NMI handlers may not preempt and the ops are
 * always balanced, so the interrupted value of ->hardirq_context
 * will always be restored.
 */
#define __irq_enter()                   \
    do {                        \
        account_irq_enter_time(current);    \
        preempt_count_add(HARDIRQ_OFFSET);  /* increment the hardirq count */ \
        trace_hardirq_enter();          \
    } while (0)

On exit, irq_exit is called to leave hard interrupt context:

/*
 * Exit an interrupt context. Process softirqs if needed and possible:
 */
void irq_exit(void)
{
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
    local_irq_disable();
#else
    WARN_ON_ONCE(!irqs_disabled());
#endif

    account_irq_exit_time(current);
    preempt_count_sub(HARDIRQ_OFFSET); /* decrement the hardirq count */
    if (!in_interrupt() && local_softirq_pending())
        invoke_softirq();

    tick_irq_exit();
    rcu_irq_exit();
    trace_hardirq_exit(); /* must be last! */
}

Softirq Context

Similarly, softirq context can be entered with local_bh_disable and left with local_bh_enable.

#ifdef CONFIG_TRACE_IRQFLAGS
extern void __local_bh_disable_ip(unsigned long ip, unsigned int cnt);
#else
static __always_inline void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
{
    preempt_count_add(cnt);
    barrier();
}
#endif

static inline void local_bh_disable(void)
{
    __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
}

extern void _local_bh_enable(void);
extern void __local_bh_enable_ip(unsigned long ip, unsigned int cnt);

static inline void local_bh_enable_ip(unsigned long ip)
{
    __local_bh_enable_ip(ip, SOFTIRQ_DISABLE_OFFSET);
}

static inline void local_bh_enable(void)
{
    __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
}

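The logic here also works by adding to or subtracting from the relevant bits of preempt_count. Softirq processing itself does not nest, so it only uses bit 8. Bits 9-15 are left for process code: in some situations a process needs to disable softirqs so that they cannot preempt it, and that is when bits 9-15 are used.

Determining the Current Context

We can determine the current state by examining the value of preempt_count on the current CPU: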
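The split between bit 8 and bits 9-15 can be tried out with a small user-space model. It is a sketch under my own assumptions: model_preempt_count and the model_* functions are hypothetical stand-ins, while the offsets follow the kernel's definitions, where SOFTIRQ_DISABLE_OFFSET is twice SOFTIRQ_OFFSET.

```c
#include <assert.h>

#define SOFTIRQ_SHIFT           8
#define SOFTIRQ_OFFSET          (1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET  (2 * SOFTIRQ_OFFSET)
#define SOFTIRQ_MASK            0x0000ff00UL

/* Hypothetical stand-in for the per-CPU preempt_count. */
unsigned long model_preempt_count;

/* Process code disabling bottom halves counts in bits 9-15. */
void model_local_bh_disable(void)
{
    model_preempt_count += SOFTIRQ_DISABLE_OFFSET;
}

void model_local_bh_enable(void)
{
    /* Must be balanced with an earlier model_local_bh_disable(). */
    assert((model_preempt_count & SOFTIRQ_MASK) >= SOFTIRQ_DISABLE_OFFSET);
    model_preempt_count -= SOFTIRQ_DISABLE_OFFSET;
}

/* Actual softirq processing toggles bit 8 only. */
void model_softirq_enter(void)
{
    model_preempt_count += SOFTIRQ_OFFSET;
}

void model_softirq_exit(void)
{
    model_preempt_count -= SOFTIRQ_OFFSET;
}

/* The softirq field of the counter. */
unsigned long model_softirq_count(void)
{
    return model_preempt_count & SOFTIRQ_MASK;
}
```

Two nested model_local_bh_disable() calls leave the softirq field at 0x400 with bit 8 clear, while model_softirq_enter() alone sets it to 0x100. That difference is what lets the kernel tell "bottom halves disabled by a process" apart from "a softirq is being serviced".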

 * PREEMPT_MASK:    0x000000ff
 * SOFTIRQ_MASK:    0x0000ff00
 * HARDIRQ_MASK:    0x000f0000
 *     NMI_MASK:    0x00100000
 * PREEMPT_ACTIVE:  0x00200000

This shows how the kernel divides the value into fields.

To check for hard interrupt context:

#define hardirq_count()  (preempt_count() & HARDIRQ_MASK)
#define in_irq()  (hardirq_count())

To check for softirq context:

#define softirq_count()  (preempt_count() & SOFTIRQ_MASK)
#define in_softirq()     (softirq_count())

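Because ordinary process code can also enter softirq context by disabling bottom halves, the kernel provides another interface to tell whether a softirq is actually being serviced: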

#define SOFTIRQ_OFFSET  (1UL << 8)
#define in_serving_softirq()  (softirq_count() & SOFTIRQ_OFFSET)

Hard interrupt and softirq contexts are together called interrupt context, which can be tested in one go:

#define irq_count()  (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK))            
#define in_interrupt()  (irq_count())

Being in interrupt context implies that scheduling and sleeping are not allowed.

Its counterpart is process context:

#define in_task()  (!(preempt_count() & (HARDIRQ_MASK | SOFTIRQ_OFFSET | NMI_MASK)))
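Putting the masks together, the following user-space sketch decodes a raw preempt_count value into a context. The mask values are the ones listed above; enum context, classify_context, and USED_BITS are hypothetical names of my own, and the order of the checks (NMI first, then hardirq, then softirq) is my assumption about which context is innermost.

```c
#include <assert.h>

#define SOFTIRQ_OFFSET  (1UL << 8)
#define SOFTIRQ_MASK    0x0000ff00UL
#define HARDIRQ_MASK    0x000f0000UL
#define NMI_MASK        0x00100000UL
#define USED_BITS       0x003fffffUL    /* the 22 bits listed above */

enum context {
    CTX_TASK,           /* plain process context */
    CTX_TASK_BH_OFF,    /* process context with bottom halves disabled */
    CTX_SOFTIRQ,        /* a softirq is being serviced */
    CTX_HARDIRQ,        /* a hard interrupt is being handled */
    CTX_NMI             /* a non-maskable interrupt is being handled */
};

/* Decode a raw preempt_count value, innermost context first. */
enum context classify_context(unsigned long count)
{
    assert((count & ~USED_BITS) == 0);

    if (count & NMI_MASK)
        return CTX_NMI;
    if (count & HARDIRQ_MASK)
        return CTX_HARDIRQ;
    if (count & SOFTIRQ_OFFSET)
        return CTX_SOFTIRQ;
    if (count & SOFTIRQ_MASK)
        return CTX_TASK_BH_OFF;
    return CTX_TASK;
}
```

The two task results correspond to in_task() being true. CTX_TASK_BH_OFF is the case where in_softirq() is non-zero but in_serving_softirq() is zero.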

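Summary

The concept of interrupts had always been fuzzy to me. I vaguely knew what they were, but I could not have explained them. Recently, while investigating a problem, I had the chance to dig into what interrupts really are. I then read many articles by experts online, digested them, and wrote this article. I have left out many details here, such as how hard interrupts are triggered, how drivers are invoked, and how interrupt controllers are cascaded. I think that material is what driver engineers need, and that is not my role, so I set it aside for now and concentrated on summarizing what is useful at the software level. This is also the first article in my Linux kernel series, and it is still rather rough. I will add more when I have time.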