java线程池(三):ThreadPoolExecutor源码分析

[toc]

在前面分析了Executors工厂方法类之后,我们来看看AbstractExecutorService的最主要的一种实现类,ThreadpoolExecutor。

1.类的结构及其成员变量

1.类的基本结构

ThreadPoolExecutor类是AbstractExecutorService的一个实现类。其类的主要结构如下所示:


ThreadPoolExecutor类的基本结构

我们可以看看这个类的注释:

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * <p><b>Extension example</b>. Most extensions of this class
 * override one or more of the protected hook methods. For example,
 * here is a subclass that adds a simple pause/resume feature:
 *
 *  <pre> {@code
 * class PausableThreadPoolExecutor extends ThreadPoolExecutor {
 *   private boolean isPaused;
 *   private ReentrantLock pauseLock = new ReentrantLock();
 *   private Condition unpaused = pauseLock.newCondition();
 *
 *   public PausableThreadPoolExecutor(...) { super(...); }
 *
 *   protected void beforeExecute(Thread t, Runnable r) {
 *     super.beforeExecute(t, r);
 *     pauseLock.lock();
 *     try {
 *       while (isPaused) unpaused.await();
 *     } catch (InterruptedException ie) {
 *       t.interrupt();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void pause() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = true;
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void resume() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = false;
 *       unpaused.signalAll();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 * }}</pre>
 *
 * @since 1.5
 * @author Doug Lea
 */
public class ThreadPoolExecutor extends AbstractExecutorService {

}

其大意为:
一个ExecutorService的实现类,它使用一个具有多个线程的池中的一个线程来执行提交的任务,这些线程通常使用工厂方法Executors进行配置。
线程池用于解决两个不同的问题,由于减少了每个任务的调用开销,他们通常在执行大量异步任务的时候可提供改进的性能,并且他们提供了一种绑定和管理资源(包括线程)的方法。该资源在执行集合的执行时消耗任务,每个ThreadPoolExecutor还维护了一些基本的统计信息,例如已完成的任务数量。
为了在广泛的上下文中有用,该类提供了许多可调整的参数和可扩展的钩子函数。但是,强烈建议程序员使用Executors工厂方法的newCachedThreadPool,该方法的线程池无边界,具有自动的线程回收。newFixedThreadPool(固定大小的线程池)和newSingleThreadExecutor(单个后台线程)。可以为最常见的使用场景预配置设置。否则,在手动配置和调整此类时,请使用以下指南:

核心和最大池大小:
ThreadPoolExecutor将根据corePoolSize和maximumPoolSize设置范围自动调整池的大小。请参见getPoolSize。当在方法execute中提交新任务,并且正在运行的线程少于corePoolSize线程时,即使其他工作线程处于空闲状态,也会创建一个新的线程来处理请求。如果运行的线程数大于corePoolSize,但是小于maximumPoolSize。则仅在队列已满的时候才创建线程。通过corePoolSize和maximumPoolSize设置为相同,可以创建为固定大小的线程池。通过将maximumPoolSize设置为本质上不受限制的Integer.MAX_VALUE,则可以允许线程池容纳任意数量的并发任务,通常,核心线程和最大的池大小仅在构造的时候设置。但也可以使用setCorePoolSize和setMaximumPoolSize动态更改。

按需构建:
默认情况下,甚至只有在有新任务到达的时候才开始启动核心线程,但是可以使用方法prestartCoreThread或者prestartAllCoreThreads动态的进行覆盖。如果使用非空队列构造线程池,则可能需要预启动线程。

创建新线程:
使用ThreadFactory创建新线程,如果没有指定,那么采用Executors的defaultThreadFactory默认方法。该方法创建的全部线程具有相同的NORM_PRIORITY优先级和非守护线程的状态。通过提供不同的ThreadFactiry可以更改线程的名称,线程组、优先级、守护线程状态等。如果通过向newThread返回null时要求ThreadFactory创建线程失败,执行程序将继续,但可能无法执行任何任务。线程应具有modifyThread的RuntimePermission。如果使用该线程池的工作线程或者其他线程不具有此权限,则服务可能会降级,配置更改可能不会即时生效。并且关闭线程池可能保持在可能终止但是没有完成的状态。

Keep-alive时间:
如果当前的池中的线程数超过corePoolSize,则多余的线程将在空闲的时间超过keepAliveTime时终止。请参考getKeepAliveTime,当不积极使用线程池时,这提供了减少资源消耗的办法,也可以使用方法setKeepAliveTime动态调整(long,TimeUnit)动态调整此参数。如果使用Long.MAX_VALUE和TimeUnit#NANOSECONDS,则有效的使用空闲线程不会再线程池关闭之前关闭。默认情况下,仅当corePoolSize线程数多时,保持活动策略才适用。但是方法allowCoreThreadTimeOut也可以用于将此超时策略应用于核心线程,只要keepAliveTime值不为0即可。

排队:
任何BlockingQueue均可用于传输和保留提交的任务,此队列的使用与线程池的大小互相影响:如果正在运行的线程少于corePoolSize线程,则执行程序总是添加新的线程来执行任务,而不是排队。如果正在运行corePoolSize或者更多的线程,则执行程序总是喜欢对请求进行排队,而不是添加新线程。如果无法将请求放入队列中,则创建一个新线程,除非该线程超过了maximumPoolSize,在这种情况下,该任务将被拒绝。

排队一般有三种策略:

  • 直接传递:SynchronousQueue是工作队列的一个默认的选择。他可以将任务移交给线程,而不必另外保留。在这里,如果没有立即可用的线程来运行任务,则试图将任务进行排队的尝试将失败,因此需要构造一个新的线程。在处理可能具有内部依赖性的请求集的时候,此策略避免了锁。直接切换通常需要无限制的maximumPoolSizes以避免拒绝提交的新任务。反过来,当平均而言,命令继续以比其处理速度更快的到达时,这可能会带来无限线程增长的可能性。
  • 无界队列:当所有corePoolSize都处于忙的时候,使用无届队列,如没有预定容量的LinkedBlockingQueue。将导致新任务在队列中等待。因此,仅创建corePoolSize线程。maximumPoolSize将没有任何作用。当每个任务完全独立于其他任务的时候,这可能是适当的。因此任务不会影响彼此的执行,例如,在网页服务器中。尽管这种排队方式对于消除短暂的突发请求很有用,但它承认当命令平均继续以比处理速度更快的速度到达时,工作队列会无限增长,这可能会造成OOM。
  • 有界队列:与有限的maximumPoolSizes一起使用时,有界队列如ArrayBlockingQueue,有助于防止资源耗尽,但是调优和控制将会非常困难。队列大小和最大的线程池大小可能会互相折衷。使用大队列和小的池可以最大程度的减少CPU的使用率,操作系统和上下文切换的开销,但是会人为的导致吞吐量下降。如果任务频繁阻塞,如I/O,则系统可能认为你安排的线程调度的时间超出了允许的范围。使用小队列通常需要更大的池的大小,这会使得CPU繁忙,但是可能会遇到无法接受的调度开销,这也会降低吞吐量。

拒绝任务:
当执行器关闭时,并且执行器对最大线程数和工作队列容量使用有限范围时,在方法execute提交的新任务将被拒绝。处于饱和。无论那种情况,execute将用RejectedExecutionHandler的RejectedExecutionHandler#rejectedExecution(Runnable,ThreadPoolExecutor)方法。提供了4个预订的处理策略:

  • 默认的拒绝策略ThreadPoolExecutor.AbortPolicy,处理程序在拒绝的时候会抛出RejectedExecutionException异常。
  • 在ThreadPoolExecutor.CallerRunsPolicy中,调用execute本身的线程运行任务。这提供了一种简单的反馈机制,将降低新任务的提交速度。
  • 在ThreadPoolExecutor.DiscardPolicy中,简单的删除了无法执行的任务。
  • 在ThreadPoolExecutor.DiscardOldestPolicy策略中,如果未关闭执行程序,则将丢弃工作队列开头的任务,然后重试执行,该操作可能再次失败,导致重复执行此操作。
    也可以定义和使用其他类型的RejectedExecutionHandler,但是这样做需要格外小心,尤其在设计策略仅在特定容量或者排队策略下才能工作时。

Hook方法:
线程池类提供了protected权限的可重写的beforeExecute和afterExecute方法。这些方法在每个线程的之前前后被调用。这些可以用来对执行环境进行操作,例如,重新初始化ThreadLocals收集统计信息或者添加统计条目,此外,一旦线程完成终止,方法terminated可以被覆盖以执行需要执行的任何特殊处理。
如果钩子函数或者回掉方法出现异常,内部工作的线程可能失败并突然终止。

队列维护:
方法getQueue允许访问工作队列,以进行监视和调试,强烈建议不要将此方法用于任何其他目的,当取消大量排队任务的时候,可以使用提供的两种方法,remove和purge来帮助回收内存。

Finalization:
在一个程序中如果不再对线程池进行引用,或者没有剩余的线程,线程池将自动shutdown。如果即使用户忘记调用shutdown的情况下,如果你想确保回收未引用的线程池,则必须使用0核心线程的下限来设置适当的keepAlive时间,以使未使用的线程最终消亡。通过allowCoreThreadTimeOut设置。
扩展示例:此类的大多数扩展都覆盖一个或者多个受保护的hook方法,例如:以下是一个子类,它添加了一个简单的暂停/继续功能:

class PausableThreadPoolExecutor extends ThreadPoolExecutor {
   private boolean isPaused;
    private ReentrantLock pauseLock = new ReentrantLock();
    private Condition unpaused = pauseLock.newCondition();
 
    public PausableThreadPoolExecutor(...) { super(...); }
 
    protected void beforeExecute(Thread t, Runnable r) {
      super.beforeExecute(t, r);
      pauseLock.lock();
      try {
        while (isPaused) unpaused.await();
      } catch (InterruptedException ie) {
        t.interrupt();
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void pause() {
      pauseLock.lock();
      try {
        isPaused = true;
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void resume() {
      pauseLock.lock();
      try {
        isPaused = false;
        unpaused.signalAll();
      } finally {
        pauseLock.unlock();
      }
    }
  }}

1.2 成员变量及常量

在类的内部,还有大段关于线程池状态的注释:

/**
 * The main pool control state, ctl, is an atomic integer packing
 * two conceptual fields
 *   workerCount, indicating the effective number of threads
 *   runState,    indicating whether running, shutting down etc
 *
 * In order to pack them into one int, we limit workerCount to
 * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
 * billion) otherwise representable. If this is ever an issue in
 * the future, the variable can be changed to be an AtomicLong,
 * and the shift/mask constants below adjusted. But until the need
 * arises, this code is a bit faster and simpler using an int.
 *
 * The workerCount is the number of workers that have been
 * permitted to start and not permitted to stop.  The value may be
 * transiently different from the actual number of live threads,
 * for example when a ThreadFactory fails to create a thread when
 * asked, and when exiting threads are still performing
 * bookkeeping before terminating. The user-visible pool size is
 * reported as the current size of the workers set.
 *
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 * Detecting the transition from SHUTDOWN to TIDYING is less
 * straightforward than you'd like because the queue may become
 * empty after non-empty and vice versa during SHUTDOWN state, but
 * we can only terminate if, after seeing that it is empty, we see
 * that workerCount is 0 (which sometimes entails a recheck -- see
 * below).
 */

大意为;
主线程池的控制状态,ctl,是一个打包两个概念的字段workerCount的原子整数。以及指示线程有效运行数量的runState。用来指示是否运行和关闭等状态。为了将他们打包到一个int上,我们将workerCount限制为(2 ^ 29 )-1约为5亿个线程。而不是(2 ^ 31)-1(20亿)个线程。如果将来有问题,可以使用AtomicLong进行替换。并调整shift / mask常数。但是在需要之前,使用int可以使代码更快。更简单。workCount是允许被启动不停止的worker数量,该值可能与活动线程的实际数量有暂时的不同,例如,ThreadFactory在被请求的时候未能创建线程,以及当退出线程任在终止之前执行记账操作等,用户可见的池大小报告为工作集合的当前大小。
runState提供了主生命周期控制,其值为:

  • RUNNING: 接收新任务并处理排队的任务。
  • SHUTDOWN:不接收新任务,但是处理排队的任务。
  • STOP:不接收新任务,不处理排队任务,并中断正在进行的任务。
  • TIDYING:所有任务已终止,workerCount为零,转换到状态TIDYING的线程将运行Terminated()钩子方法。
  • TERMINATED:terminated()方法已完成。

这些值之间的数字顺序很重要,可以进行有序的比较。runState随时间单调增加,但不必达到每个状态。可以按如下方式过渡:

  • RUNNING -> SHUTDOWN:关于shutdown()的调用,可能在finalize()中隐式地调用。
  • (RUNNING or SHUTDOWN) -> STOP:shutdownNow()相关调用。
  • STOP -> TIDYING 当pool为空的时候。
  • TIDYING -> TERMINATED: terminated()被调用的时候。
    在waittermination()中等待的线程将返回状态到达终止。
    检测从SHUTDOWN到TIDYING的转换并不像您想要的那样简单,因为在SHUTDOWN状态期间,队列在非空之后可能变为空,反之亦然,但是只有在看到它为空之后才能看到workerCount为0(有时需要重新检查-参见下文)。

1.2.1 ctl

ctl是ThreadPoolExecutor对线程状态runState和线程池workerCount的一个综合字段。将这两个属性打包放置在AtomicInteger上,ctl长度为32位,前面3位用于存放runState,后面29位存放workerCount。因此线程池最大的线程数量位2^29-1。其代码如下:

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }

/*
 * Bit field accessors that don't require unpacking ctl.
 * These depend on the bit layout and on workerCount being never negative.
 */

private static boolean runStateLessThan(int c, int s) {
    return c < s;
}

private static boolean runStateAtLeast(int c, int s) {
    return c >= s;
}

private static boolean isRunning(int c) {
    return c < SHUTDOWN;
}

/**
 * Attempts to CAS-increment the workerCount field of ctl.
 */
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}

/**
 * Attempts to CAS-decrement the workerCount field of ctl.
 */
private boolean compareAndDecrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect - 1);
}

/**
 * Decrements the workerCount field of ctl. This is called only on
 * abrupt termination of a thread (see processWorkerExit). Other
 * decrements are performed within getTask.
 */
private void decrementWorkerCount() {
    do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}

我们可以看看各状态的二进制情况:


状态计算

可以看到,通过位移操作,将最高的三位用于标识runState状态,而后面29位用于存放workerCount.
方法runStateOf和workerCountOf则可以将这两部分数据通过位运算的方式取出来:
如下图所示,假定需要计算的c为0110 0000 0000 0000 0000 1110 0110 0000,那么位移过程如下:


位移过程

这两个方法就可以很快的计算出结果。
而ctlOf方法,rs | wc,这就非常容易理解了。


rs|wc

上图就非常方便的将runState和workerCount进行了组合。
而根据第一张图可以看到,RUNNING状态为负数,是最小的,这些状态的全部ctl满足如下规则:

RUNNING < SHUTDOWN < STOP < TIDYING < TERMINATED

方法runStateLessThan和runStateAtLeast则是根据上述规则判断线程池当前所处的状态。
可以发现的是,比SHUTDOWN小的任何状态都是iRUNNING状态。这也是isRunning方法的原因。
此外还提供了基于CAS的增减WorkerCount的方法:
compareAndIncrementWorkerCount、compareAndDecrementWorkerCount和decrementWorkerCount。

1.2.2 workQueue

/**
 * The queue used for holding tasks and handing off to worker
 * threads.  We do not require that workQueue.poll() returning
 * null necessarily means that workQueue.isEmpty(), so rely
 * solely on isEmpty to see if the queue is empty (which we must
 * do for example when deciding whether to transition from
 * SHUTDOWN to TIDYING).  This accommodates special-purpose
 * queues such as DelayQueues for which poll() is allowed to
 * return null even if it may later return non-null when delays
 * expire.
 */
private final BlockingQueue<Runnable> workQueue;

workQueue是用于保留任务并移交给工作线程的队列,我们不要指望通过workQueue.poll返回为null来判断队列是否为空,我们仅仅通过workQueue.isEmpty方法来判断队列是否为空,(将SHUTDOWN状态过度到TIDYING状态的时候可以采用这样做。)因为一些特殊的队列如DelayQueues允许poll的时候返回空,即使在延迟后的档期可以非空。这个workQueue依赖于构造函数传入。

1.2.3 workers

/**
 * Set containing all worker threads in pool. Accessed only when
 * holding mainLock.
 */
private final HashSet<Worker> workers = new HashSet<Worker>();

线程池中所有工作线程的集合。访问workers的时候需要获得锁mainLock。这个workers是hashSet。

1.2.4 mainLock & termination

/**
 * Lock held on access to workers set and related bookkeeping.
 * While we could use a concurrent set of some sort, it turns out
 * to be generally preferable to use a lock. Among the reasons is
 * that this serializes interruptIdleWorkers, which avoids
 * unnecessary interrupt storms, especially during shutdown.
 * Otherwise exiting threads would concurrently interrupt those
 * that have not yet interrupted. It also simplifies some of the
 * associated statistics bookkeeping of largestPoolSize etc. We
 * also hold mainLock on shutdown and shutdownNow, for the sake of
 * ensuring workers set is stable while separately checking
 * permission to interrupt and actually interrupting.
 */
private final ReentrantLock mainLock = new ReentrantLock();
    /**
 * Wait condition to support awaitTermination
 */
private final Condition termination = mainLock.newCondition();

调用这个锁的时候需要锁定worker的位置,并进行相关的记录,尽管我们使用某种并发集,但是事实证明,使用锁的方式通常是可取的。原因之一是它可以对interruptIdleWorkers进行序列化。从而可以避免不必要的中断风暴。尤其是在关机期间,否则,退出线程将同时中断哪些尚未中断的线程,它也简化了一些相关的数据。如maximumPoolSize。我们还在shutdown和shutdownNow上保留了mainLock,以确保在单独检查中断和实际中断权限的时候,workerSet是稳定的。
termination是一个wait条件,以支持awaitTermination。

1.2.5 其他成员变量

/**
 * Tracks largest attained pool size. Accessed only under
 * mainLock.
 */
private int largestPoolSize;

/**
 * Counter for completed tasks. Updated only on termination of
 * worker threads. Accessed only under mainLock.
 */
private long completedTaskCount;

/*
 * All user control parameters are declared as volatiles so that
 * ongoing actions are based on freshest values, but without need
 * for locking, since no internal invariants depend on them
 * changing synchronously with respect to other actions.
 */

/**
 * Factory for new threads. All threads are created using this
 * factory (via method addWorker).  All callers must be prepared
 * for addWorker to fail, which may reflect a system or user's
 * policy limiting the number of threads.  Even though it is not
 * treated as an error, failure to create threads may result in
 * new tasks being rejected or existing ones remaining stuck in
 * the queue.
 *
 * We go further and preserve pool invariants even in the face of
 * errors such as OutOfMemoryError, that might be thrown while
 * trying to create threads.  Such errors are rather common due to
 * the need to allocate a native stack in Thread.start, and users
 * will want to perform clean pool shutdown to clean up.  There
 * will likely be enough memory available for the cleanup code to
 * complete without encountering yet another OutOfMemoryError.
 */
private volatile ThreadFactory threadFactory;

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

/**
 * Timeout in nanoseconds for idle threads waiting for work.
 * Threads use this timeout when there are more than corePoolSize
 * present or if allowCoreThreadTimeOut. Otherwise they wait
 * forever for new work.
 */
private volatile long keepAliveTime;

/**
 * If false (default), core threads stay alive even when idle.
 * If true, core threads use keepAliveTime to time out waiting
 * for work.
 */
private volatile boolean allowCoreThreadTimeOut;

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 */
private volatile int corePoolSize;

/**
 * Maximum pool size. Note that the actual maximum is internally
 * bounded by CAPACITY.
 */
private volatile int maximumPoolSize;

/**
 * The default rejected execution handler
 */
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();

/**
 * Permission required for callers of shutdown and shutdownNow.
 * We additionally require (see checkShutdownAccess) that callers
 * have permission to actually interrupt threads in the worker set
 * (as governed by Thread.interrupt, which relies on
 * ThreadGroup.checkAccess, which in turn relies on
 * SecurityManager.checkAccess). Shutdowns are attempted only if
 * these checks pass.
 *
 * All actual invocations of Thread.interrupt (see
 * interruptIdleWorkers and interruptWorkers) ignore
 * SecurityExceptions, meaning that the attempted interrupts
 * silently fail. In the case of shutdown, they should not fail
 * unless the SecurityManager has inconsistent policies, sometimes
 * allowing access to a thread and sometimes not. In such cases,
 * failure to actually interrupt threads may disable or delay full
 * termination. Other uses of interruptIdleWorkers are advisory,
 * and failure to actually interrupt will merely delay response to
 * configuration changes so is not handled exceptionally.
 */
private static final RuntimePermission shutdownPerm =
    new RuntimePermission("modifyThread");

/* The context to be used when executing the finalizer, or null. */
private final AccessControlContext acc;

类中还有部分其他的成员变量,整理如下:

变量名 类型 说明
largestPoolSize int 线程池的大小,需要获得mainLock
completedTaskCount long 已完成的任务的计数器,在工作线程终止的时候更新,也需要获得mainLock
threadFactory volatile ThreadFactory 创建线程的工厂方法,需要在构造函数中传入
handler volatile RejectedExecutionHandler 拒绝策略的调用方法,在线程池饱和或者关闭的时候如果有任务传入就调用
keepAliveTime volatile long 以纳秒为单位的worker的等待超时时间。当当前大于corePoolSize或者allowCoreThreadTimeOut的时候,线程调用此超时,否则会永远等待新的worker
allowCoreThreadTimeOut volatile boolean 如果为false,则核心线程即使在空闲的时候也保持活动,否则,核心线程将使用keepAliveTime来超时等待。默认值为false
corePoolSize volatile int 核心池的大小是维持生存的worker的最小数量,除非设置了allowCoreThreadTimeOut,这种情况下,最小值为0
maximumPoolSize volatile int 线程池大小的最大值,需要注意的是最大容量在内部是由CAPACITY决定的
defaultHandler static final RejectedExecutionHandler 默认的拒绝策略:new AbortPolicy();
shutdownPerm static final RuntimePermission 调用shutdown和shutdownNow所需要的权限,请参阅checkShutdownAccess要求调用者有权实际中断工作程序集中的线程,由Thread.interrupt控制,依赖于ThreadGroup.checkAccess。依次依赖于SecurityManager.checkAccess。仅在这些检查通过之后,才尝试执行shutdown。Thread.interrupt的所有实际调用,都会忽略SecurityExceptions。这意味着尝试中断会以静默的方式失败。除非SecurityManager的策略不一致。否则它们不应失败。在这种情况下,无法真正中断线程可能会禁用或延迟完全终止。建议使用interruptIdleWorkers的其他用途,而实际中断的失败只会延迟对配置更改的响应,因此不会进行特殊处理。

2.重要的内部类

在ThreadPoolExecutor中,内部类有两类,一类是执行线程的Worker,还有一类是拒绝策略。

2.1 Worker

Worker这个类,继承了AbstractQueueSynchronizer。


worker类

其代码如下:

/**
 * Class Worker mainly maintains interrupt control state for
 * threads running tasks, along with other minor bookkeeping.
 * This class opportunistically extends AbstractQueuedSynchronizer
 * to simplify acquiring and releasing a lock surrounding each
 * task execution.  This protects against interrupts that are
 * intended to wake up a worker thread waiting for a task from
 * instead interrupting a task being run.  We implement a simple
 * non-reentrant mutual exclusion lock rather than use
 * ReentrantLock because we do not want worker tasks to be able to
 * reacquire the lock when they invoke pool control methods like
 * setCorePoolSize.  Additionally, to suppress interrupts until
 * the thread actually starts running tasks, we initialize lock
 * state to a negative value, and clear it upon start (in
 * runWorker).
 */
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    //当前worker工作的线程,如果factory方法失败则为空
    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    //需要运行的初始化任务,可能是空
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    //线程任务计数器
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
     //从ThreadFactory创建线程给firstTask任务
    Worker(Runnable firstTask) {
       //在运行worker之前禁止中断
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        //调用ThreadFactory
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    //将main的run委托给外部的runWorker方法。
    public void run() {
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.
    //0标识未锁定,1标识锁定
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    //尝试获得锁
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }
    //尝试释放锁
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }
    //将所有线程中断
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}

其注释大意为:
worker类主要维护运行任务的线程的中断控制的状态,以及对各种情况进行记录,这个类继承了AbstractQueuedSynchronizer,以简化获取和释放围绕每个任务的锁,这样可以防止中断,这些中断旨在唤醒等待任务的工作线程,而不是中断正在运行的任务。我们实现了一个简单的不可重入锁,而不是使用ReentrantLock,因为我们不希望工作线程在调用诸如setCorePoolSize这样的池控制方法的时候能够重新获得锁。此外,为了在线程实际开始运行之前抑制中断。我们将锁的初始化状态设置为负值,并在启动的时候将其清除。
实际上,比较特殊的是Worker继承了AQS,并且实现了Runnable接口,使用firstTask来保存传入的任务,thread则是使用ThreadFactory来创建的线程。这个线程来处理任务。
在调用构造方法的时候,将任务传入,通过getThreadFactory().newThread(this);创建一个新线程。这个worker对象在启动的时候会调用其run方法,因为worker实现了Runnable接口,其本身也是一个线程。
为什么使用继承AQS而不是使用ReentrantLock呢,关键是在于tryAcquire这个方法是不允许重入的。而ReentrantLock则是允许重入。
lock方法一旦获取的独占锁,则标识当前线程正在执行任务。Worker继承了AQS,因此自身也是一把锁。在执行任务的过程中,不会释放锁。用来保证任务的正常执行。
因此,任务在运行的过程中,是不能被中断的。 如果Worker不是独占锁,也是空闲状态,则说明这个Worker没有处理任务,可以对其进行中断。线程池在执行shutdown和tryTerminate的时候会对空闲的线程进行中断。interruptIdleWorkers方法会使用tryLock方法来判断线程池中的线程是否是空闲状态。

2.2 拒绝策略类

在ThreadPoolExecutor中,支持的拒绝策略主要有4种,都是通过内部类提供的。分别是CallerRunsPolicy、AbortPolicy、DiscardPolicy、DiscardOldestPolicy。这些类都实现了RejectedExecutionHandler接口。


拒绝策略

2.2.1 CallerRunsPolicy

/**
 * A handler for rejected tasks that runs the rejected task
 * directly in the calling thread of the {@code execute} method,
 * unless the executor has been shut down, in which case the task
 * is discarded.
 */
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }

    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //判断线程池的状态,如果没有关闭,则用提交任务的线程来执行run方法
        if (!e.isShutdown()) {
            r.run();
        }
    }
}

这个拒绝策略采用执行exec的线程来运行任务,除非当前线程池处于关闭状态。这个拒绝策略在正常情况下不会导致任务失败,可以有效的降低生产任务的速度。用户根据场景自行选择。

2.2.2 AbortPolicy

/**
 * A handler for rejected tasks that throws a
 * {@code RejectedExecutionException}.
 */
public static class AbortPolicy implements RejectedExecutionHandler {
    /**
     * Creates an {@code AbortPolicy}.
     */
    public AbortPolicy() { }

    /**
     * Always throws RejectedExecutionException.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     * @throws RejectedExecutionException always
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
      //抛出异常
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}

这个拒绝策略会拒绝任务的执行,直接抛出异常。

2.2.3 DiscardPolicy

/**
 * A handler for rejected tasks that silently discards the
 * rejected task.
 */
public static class DiscardPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardPolicy}.
     */
    public DiscardPolicy() { }

    /**
     * Does nothing, which has the effect of discarding task r.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}

该拒绝策略会丢弃当前任务,并且什么都不做。

2.2.4 DiscardOldestPolicy

/**
 * A handler for rejected tasks that discards the oldest unhandled
 * request and then retries {@code execute}, unless the executor
 * is shut down, in which case the task is discarded.
 */
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardOldestPolicy} for the given executor.
     */
    public DiscardOldestPolicy() { }

    /**
     * Obtains and ignores the next task that the executor
     * would otherwise execute, if one is immediately available,
     * and then retries execution of task r, unless the executor
     * is shut down, in which case task r is instead discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //如果线程池未关闭,则将线程池任务队列中的旧任务poll丢弃,然后将当前任务调用execute添加到队列。
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}

这个拒绝策略将当队列中的旧任务丢弃,之后将当前任务添加到队列,但是这个拒绝策略并不能保证当前任务能执行,还是会有可能被后续的任务继续丢弃。

3. 构造函数

ThreadpoolExecutor提供了4种构造函数,分别对前面可变的7个成员变量进行赋值。

3.1 ThreadPoolExecutor(7参数)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
     //  corePoolSize必须大于等于0  maximumPoolSize必须大于0   且maximumPoolSize>=corePoolSize ,否则参数非法异常。                
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    //访问控制权限
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    //赋值
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

实际上对应系统的变量只有6个,这是因为,keepAliveTime和unit最终都转换为纳秒数unit.toNanos(keepAliveTime)。

变量 说明
corePoolSize 线程池常驻线程数,如果设置了allowCoreThreadTimeOut,则常驻线程在一定时间之后就会变成0,范围0<=corePoolSize<=maximumPoolSize
maximumPoolSize 线程池允许的最大线程数,其范围0<maximumPoolSize<=(2^29-1)
keepAliveTime 当线程数大于核心线程corePoolSize的时候,这些多余的线程在当前任务终止之后最长的驻留时间。
unit keepAliveTime的单位,最终以纳秒在系统中生效
workQueue 任务队列,依赖于构造函数传入
threadFactory 产生线程的工厂方法
handler 拒绝策略,如果任务达到任务队列的长度将会触发拒绝策略,拒绝策略有四种

3.2 ThreadPoolExecutor(默认threadFactory)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          RejectedExecutionHandler handler) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), handler);
}

实际上调用的还是3.1中的七参数的构造函数,只是使用了默认的Executors.defaultThreadFactory()做为线程产生的工厂方法。
这个方法实际上在Executors中。

public static ThreadFactory defaultThreadFactory() {
    return new DefaultThreadFactory();
}
DefaultThreadFactory() {
    SecurityManager s = System.getSecurityManager();
    group = (s != null) ? s.getThreadGroup() :
                          Thread.currentThread().getThreadGroup();
    namePrefix = "pool-" +
                  poolNumber.getAndIncrement() +
                 "-thread-";
}

3.3 ThreadPoolExecutor(默认defaultHandler)

这个方法将采用默认的拒绝策略:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         threadFactory, defaultHandler);
}

而默认的拒绝策略在前面变量和常量中已经做了说明:

private static final RejectedExecutionHandler defaultHandler =
        new AbortPolicy();

实际上采用的是丢弃当前任务的策略。

3.4 ThreadPoolExecutor(默认defaultThreadFactory和defaultHandler)

那么自然根据上述两个构造函数可以得到:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), defaultHandler);
}

将defaultThreadFactory和defaultHandler都采用默认值。因此这个构造方法是我们自己手动new ThreadPoolExecutor的时候会经常使用的。

4.基本原理

实际上在了解了前面的成员变量以及注释和构造函数之后,ThreadPoolExecutor的基本结构就非常清除了。个人觉得相比concurrentHashMap要简单很多。ThreadpoolExecutor的组成如下:

基本原理

由一个HashSet为数据结构的workersPool来存放所有的线程。然后将所有Runnable的任务都放在构造函数定义的workQueue中,这个workQueue可以是任意的阻塞队列。总之可以自行定义。那么,当线程池正常启动之后,这个数据结构就开始工作。当外部线程添加一个Runnable的task提交sumbit方法的时候。此时,submit方法中有三个选项:

  • 1.新线程执行:一是判断,当前pool中的线程数量如果小于corePoolSize那么会调用构造函数定义的线程创建的工厂方法来创建一个线程。将这个线程添加到workers中。用这个新线程执行任务,即便works中有线程空闲。


    初始启动新线程
  • 2.加入队列等待:当提交任务的时候,当线程池中workers的数量达到corePoolSize的时候,如果此时workQueue中不满,则将任务存入workQueue中。添加到队列的尾部。


    加入队列等待
  • 3.如果任务队列已满,而corePoolSize小于maxmumCoreSize,则用构造函数提供的构造方法创建一个新的线程,用这个新线程来执行提交的任务。


    构造新线程
  • 4.线程池workPool中的workers在执行完当前任务之后处于空闲的情况下,将从workQueue中取出队首的任务。


    自动取出
  • 5.如果线程池workPool达到maxmumPoolSize,同时workQueue中的任务也存满,达到最大值的时候,此时将触发拒绝策略。执行任务的线程根据上述的4种拒绝策略来执行是否将任务丢弃或者用当前线程执行任务。


    触发拒绝策略
  • 6.如果allowCoreThreadTimeOut没有设置,默认为false,则空闲时间超过keepAliveTime的线程将会自动销毁。以保持线程池中的线程数维持在corePoolSize。


    维持corePoolSize
  • 7.如果设置了allowCoreThreadTimeOut为true,则所有的线程在时间超过keepAliveTime之后,都会自动销毁。如果线程池空闲,则线程池不会占用任何资源。


    销毁空闲线程

以上7点是ThreadPoolExecutor的核心。也是面试过程中经常被问到的部分。
结合在第一部分中线程的状态,各状态之间的切换如下图:


状态切换

5.重要方法

在理解了线程池工作的基本原理之后,现在对线程池的一些常用方法进行分析。

5.1 execute

public void execute(Runnable command) {
    //如果任务为空,抛出异常
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
     //对于任务的处理,有三个步骤来处理,如果当前工作线程低于corepoolSize,则创建一个新的线程
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
    //创建新线程执行任务,成功则返回,否则再次获取线程池的状态和线程数
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    //如果线程池处于run状态,且工作队列workQueue可以添加
    if (isRunning(c) && workQueue.offer(command)) {
      //再次获取线程池状态和线程数
        int recheck = ctl.get();
        //判断运行状态,如果不处于运行状态且可以从阻塞队列中删除任务。则执行拒绝策略
        if (! isRunning(recheck) && remove(command))
           //执行拒绝策略
            reject(command);
        //如果线程池中的线程数量为0,则创建线程
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    //如果阻塞队列已满,无法添加,则执行拒绝策略
    else if (!addWorker(command, false))
        reject(command);
}

上述过程可以用流程图表示如下:


流程图

5.2 addWorker

private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //检查线程池状态,如果处于SHUTDOWN的时候firstTask为null和workQueue不为空。或者线程池状态大于SHUTDOWN,这些状态下都不允许创建线程,因此直接返回false。此处相当于double check。
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;
        //死循环 采用CAS的方式修改count
        for (;;) {
            //获取count 这个方法是采用位运算
            int wc = workerCountOf(c);
            //如果数量大于总体容量2^29-1
            if (wc >= CAPACITY ||
                //此处根据传入的core决定通过corePoolSize还是maxmumPoolSize判断
                wc >= (core ? corePoolSize : maximumPoolSize))
                //上述这些条件如果为真则返回false
                return false;
            //通过cas的方式修改count,如果修改成功,则跳出死循环
            if (compareAndIncrementWorkerCount(c))
                //注意此处break和continue的区别  break是跳出当前的retry而continue则是继续执行
                break retry;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }
    //初始化变量
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        //创建一个worker 其task即是firstTask
        w = new Worker(firstTask);
        //线程t指向worker的线程
        final Thread t = w.thread;
        //如果线程不为空
        if (t != null) {
           //获取mainLock
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                //获取线程池状态
                int rs = runStateOf(ctl.get());
                //如果rs为RUNNING状态或者处于SHUTDOWN的时候,firstTask为空
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    //线程是否处于活动状态  如果线程以及run了则说明该线程已经启动,则会出问题,因为这个worker是新new的,并没有运行线程的run
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    //将worker添加到workers
                    workers.add(w);
                    //计算workers的size
                    int s = workers.size();
                    //判断size是否大于lagestPoolSize,那么修改LargestPoolSize
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    //worker的添加状态为true
                    workerAdded = true;
                }
            //不要忘记解锁
            } finally {
                mainLock.unlock();
            }
            //如果线程加入pool成功,则启动线程,并修改线程的启动状态为true
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        //如果线程启动状态不为true,则判定启动失败,将worker添加到失败的worker中
        if (! workerStarted)
            addWorkerFailed(w);
    }
    //返回worker的启动状态做为线程的执行结果
    return workerStarted;
}

经过这个方法,我们可以看到,实际上mainLock是在向workers的pool中添加和删除队列的时候才会用到。在添加或者删除之前要获取这个锁。
另外,本文开始定义了标签语句 retry: ,需要注意的是,通过continue会结束当前的循环重新开始retry。而break则会跳出retry。结束循环。

5.3 runWorker

我们在前面静态内部类部分,对worker进行了分析。那么实际上,worker在被启动之后,是怎么执行的呢?

public void run() {
    runWorker(this);
}

实际上调用的就是这个runWorker方法。

final void runWorker(Worker w) {
    //定义线程为wt
    Thread wt = Thread.currentThread();
    //定义任务为task 
    Runnable task = w.firstTask;
    w.firstTask = null;
    //此处调用unlock,这不是mainLock,这是worker继承了AQS,实际上是AQS中的release方法,还记得所有的worker中被初始化的状态是-1吗?此处调用release方法,就将-1改为了1,这样后面的中断方法就能对处于运行状态的线程进行中断
    w.unlock(); // allow interrupts
    //
    boolean completedAbruptly = true;
    try {
        //如果task不为null,或者task执行getTask也不为空。需要注意的是此处的getTask,是从队列中去获取的。因此,此处的逻辑就是,如果这个worker有任务那么就执行,执行完之后就遍历队列,继续执行任务。
        while (task != null || (task = getTask()) != null) {
            //执行任务的过程中加锁,那么加锁之后就不允许中断了
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            //判断线程池运行状态,并结合线程的中断状态,之后对线程进行中断
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                //前置操作
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    //调用run方法开始执行
                    task.run();
                //异常处理
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    //后置操作 
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        //处理完之后,当线程退出的时候,触发processWorkerExit方法,从workers中移除当前的worker
        processWorkerExit(w, completedAbruptly);
    }
}

5.4 processWorkerExit

我们再看看上述方法提到的线程退出的时候执行的方法

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //更新completedTaskCount 每个worker会维护一个completedTasks,当worker退出的时候,更新到线程池中
        completedTaskCount += w.completedTasks;
        //从队列中移除worker
        workers.remove(w);
    } finally {
        //解锁
        mainLock.unlock();
    }
    //尝试线程池是否可以进入terminate
    tryTerminate();
    //获得锁状态和线程数
    int c = ctl.get();
    //判断锁状态
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        //添加worker
        addWorker(null, false);
    }
}

5.5 tryTerminate

这个方法是每次线程退出的时候就触发,尝试判断线程池是否可以Terminate

final void tryTerminate() {
    //死循环 
    for (;;) {
        //获得运行状态和线程数,之后进行条件判断,如果添加不满足则退出
        int c = ctl.get();
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        //如果条件满足,但是workercount不为0,则对所有空闲的线程执行中断方法
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        //获得锁
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            //采用cas进行更新
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated();
                } finally {
                    ctl.set(ctlOf(TERMINATED, 0));
                    //这个termination 在awaitTermination中会等待keepAlive的时间。此处将唤醒这些等待的线程
                    termination.signalAll();
                }
                return;
            }
        } finally {
           //解锁
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}

5.6 awaitTermination

public boolean awaitTermination(long timeout, TimeUnit unit)
    throws InterruptedException {
    //将超时时间转换为纳秒
    long nanos = unit.toNanos(timeout);
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //死循环
        for (;;) {
            //判断运行状态,如果已经处于TERMINATED状态则直接退出
            if (runStateAtLeast(ctl.get(), TERMINATED))
                return true;
            //计算的纳秒时间必须大于0
            if (nanos <= 0)
                return false;
            //通过lock的条件变量等待,将当前线程变为TIME_WAITING状态
            nanos = termination.awaitNanos(nanos);
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

同上述方法可以看出,mainLock是对worker操作的时候使用的,在添加和删除workers的时候需要获得锁。此外,当调用awaitTermination()方法的时候,对空闲的worker执行WAIT操作也采用lock的条件变量termination来执行。之后当线程进入TERMINATION状态的时候统一唤醒。

5.7 getTask

最后再分析一个关键方法。getTask,也就是前面的worker获取任务的方法。这个方法非常重要。

private Runnable getTask() {
    //是否需要使用超时时间 
    boolean timedOut = false; // Did the last poll() time out?
    //死循环
    for (;;) {
        //获得线程状态和count
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //状态判断 当rs大于SHUTDOWN或者STOP状态的时候判断workQueue是否为空
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            //缩减workerCount
            decrementWorkerCount();
            return null;
        }
        //获取wc
        int wc = workerCountOf(c);

        // Are workers subject to culling?
        //判断wc是否大于corePoolSize,决定timed的状态,也就是说,当wc小于corePoolSize的时候,就不考虑阻塞队列的超时
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
        //如果wc大于最大线程或者timed的状态不满足 则continue
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            //如果cas减少workerCount失败则退出
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
            //通过这个循环将wc减少
        } 

        try {
            //如果需要超时获取,则按超时的方式获取任务,这也是阻塞队列的作用,如果有超时时间则会一直阻塞
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            //r为从队列中拿到的任务
            if (r != null)
                return r;
            timedOut = true;
            //如果被中断 则设置timedOut为false
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

实际上通过这个方法的代码可以发现,之所以要使用阻塞队列,原因就在于,这个方法中如果需要通过使用keepAlive的时间,那么此方法就会用pull(timeout)方法来阻塞当前调用线程。这样就能将每个worker阻塞再getTask的过程中。
重点需要注意getTask被阻塞的条件:

 allowCoreThreadTimeOut || wc > corePoolSize;

开启核心线程超时或者当前线程数大于核心线程。

5.8 interruptIdleWorkers

如果5.7方法中有线程进入了TIME_WAITING状态。那么如果需要及时使用,那么就需要对这些阻塞的线程进行中断。

private void interruptIdleWorkers(boolean onlyOne) {
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //遍历workers
        for (Worker w : workers) {
            Thread t = w.thread;
            //获得AQS的锁,如果能获得AQS的锁且不处于中断状态则进行中断
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                   //调用线程的中断方法
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                   //worker的AQS解锁
                    w.unlock();
                }
            }
            //是否只执行1次的标识
            if (onlyOne)
                break;
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

调用中断的时候,需要回去两层锁,一层是mainLock,另外一层是线程的AQS锁,如果被占用则该线程不会被打断。这样一来就可以明白,再最开始刚创建的worker是不会被打断的,另外处于工作中的线程也不会被打算,只有wait状态的worker才会被打断。
而interruptIdleWorkers方法被调用的时机:

  • tryTerminate
  • shutdown
  • setCorePoolSize -> workerCountOf(ctl.get()) > corePoolSize
  • allowCoreThreadTimeOut
  • setMaximumPoolSize -> workerCountOf(ctl.get()) > maximumPoolSize
  • setKeepAliveTime

6.总结

本文对ThreadPoolExecutor线程池的源码进行了分析。相对于ConcurrentHashMap这个类的代码并不是特别复杂。实际上ThreadpoolExecutor由一个hashSet构成的workerPool和一个自定义的阻塞队列workQueue组成。基本结构如下:


基本构成

worker本身继承了AQS,然后还实现了Runnable接口。之后worker会不断的从任务队列workQueue中去获取任务。有两种获取任务的方法,getTask()。当开启了核心线程keepAlive或者当前线程数大于核心线程corepoolSize的时候,就会触发keepAliveTime的阻塞。被阻塞的线程就可以认为是空闲的线程,这些线程被阻塞队列阻塞住,这也是为什么线程池要用阻塞阻塞队列的原因。不需要被阻塞的线程则可以不断的执行run,源源不断的从工作队列workQueue中获取task执行。
由于worker实现AQS,其作用是在于当大部分线程都处于休眠的时候,线程池中活动的线程降低到队列核心线程数以下之后,此时就通过中断的方式唤醒哪些没有工作的线程。在关闭或者调整线程池的时候都适用。而AQS的目的就是在打断的时候,需要获得AQS锁,而新创建的worker和处于正常运行中的worker则会导致获取锁失败,不会被打断。
ReentrantLocak mainLock的作用在于,对workers的操作,增减或者打断都需要获得这个锁才能执行。而这个锁上的条件变量termination的唯一作用是在tryTermination中调用。
最后需要注意的是,线程池采用了大量的cas操作,最关键的是将线程的状态和workerSize合并到一个AotmicInteger对象上。通过位运算,最高三位标识状态。因此worker的大小也就是2^29-1。这个位运算的操作也是我们值得借鉴的地方。与HashMap中的位运算操作一样,都是让在看代码的时候觉得,代码还能这样写。特别提升设计能力的地方。

最关键的部分:

  • 5个状态及切换过程
  • 7个操作(基本原理一节)
  • 4个拒绝策略
  • 7个参数(构造函数)
  • 用阻塞队列的原因
  • 两种锁:AQS和mainLock
  • 添加任务、gettask、执行任务、以及中断过程
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 214,951评论 6 497
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 91,606评论 3 389
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 160,601评论 0 350
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 57,478评论 1 288
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,565评论 6 386
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,587评论 1 293
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,590评论 3 414
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,337评论 0 270
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,785评论 1 307
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,096评论 2 330
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,273评论 1 344
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,935评论 5 339
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,578评论 3 322
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,199评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,440评论 1 268
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 47,163评论 2 366
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,133评论 2 352