JUC (4): Thread Pools

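I. Introduction

Creating a thread is expensive: memory has to be allocated for it and it has to be registered with the scheduler. The default thread stack size in Java is commonly 1 MB (on 64-bit HotSpot, for example; it varies by platform and can be changed with -Xss). That does not look like much, but when many threads exist at the same time the total memory cost can no longer be ignored.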

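Leaving threads unmanaged can cause the following problems:
1. Threads are created and destroyed frequently, which adds overhead;
2. Thread creation is unbounded, which can exhaust system memory.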

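Solutions:
1. For frequent creation and destruction, reuse threads. Normally a thread is destroyed once its task finishes; instead, let it fetch another task after finishing the current one, which saves the cost of creating a new thread. The core idea of a thread pool is this kind of object reuse, in the spirit of the Flyweight pattern.
2. Cap the number of threads so that too many threads cannot exhaust memory. This introduces a new problem: what happens when the thread count has reached the configured maximum and another task arrives? Possible answers:
(1) the submitter runs the task itself;
(2) throw an exception;
(3) discard the task;
(4) store the task and run it once a thread becomes free.
...
Which approach to use is left to the developer. This is the second design pattern a thread pool relies on: the Strategy pattern.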
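As a rough illustration of the Strategy pattern here, the sketch below saturates a tiny pool and hands the overflow task to three of the handlers that ship with ThreadPoolExecutor: AbortPolicy (throw), CallerRunsPolicy (the submitter runs it) and DiscardPolicy (drop it). Option (4) is covered by the work queue rather than by a handler. The class name RejectionPolicyDemo and its helper methods are made up for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class RejectionPolicyDemo {

    // Saturates a pool (1 thread, 1 queue slot), submits one extra task,
    // and reports what the given handler did with that extra task.
    static String submitToSaturatedPool(RejectedExecutionHandler handler) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1), handler);
        CountDownLatch release = new CountDownLatch(1);
        Thread caller = Thread.currentThread();
        String[] outcome = {"dropped silently"};
        try {
            pool.execute(() -> awaitQuietly(release)); // occupies the only thread
            pool.execute(() -> { });                   // fills the only queue slot
            pool.execute(() -> {                       // the pool is saturated now
                if (Thread.currentThread() == caller) {
                    outcome[0] = "ran on the submitting thread";
                }
            });
        } catch (RejectedExecutionException e) {
            outcome[0] = "rejected with an exception";
        } finally {
            release.countDown(); // let the blocked task finish
            pool.shutdown();
        }
        return outcome[0];
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("AbortPolicy:      "
                + submitToSaturatedPool(new ThreadPoolExecutor.AbortPolicy()));
        System.out.println("CallerRunsPolicy: "
                + submitToSaturatedPool(new ThreadPoolExecutor.CallerRunsPolicy()));
        System.out.println("DiscardPolicy:    "
                + submitToSaturatedPool(new ThreadPoolExecutor.DiscardPolicy()));
    }
}
```

Swapping the handler changes the overflow behaviour without touching the submitting code, which is the point of treating rejection as a strategy.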

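II. How It Works

1. A thread pool consists of three main parts:
HashSet<Worker> workers: holds the worker threads, both core and non-core;
BlockingQueue<Runnable> workQueue: holds pending tasks in a blocking queue;
RejectedExecutionHandler handler: the rejection policy, invoked when the pool is saturated.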
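To relate these three parts to the public API, here is a minimal sketch that builds a ThreadPoolExecutor with an explicit queue and handler and then counts how many distinct threads ran the tasks. The sizes (2 core, 4 maximum, queue capacity 10) and the class name PoolComponentsDemo are arbitrary choices for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolComponentsDemo {

    // Runs taskCount trivial tasks and returns how many distinct threads executed them.
    static int distinctThreadsUsed(int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                     // corePoolSize
                4,                                     // maximumPoolSize
                30L, TimeUnit.SECONDS,                 // keep-alive for surplus threads
                new LinkedBlockingQueue<>(10),         // workQueue: pending tasks
                new ThreadPoolExecutor.AbortPolicy()); // handler: rejection policy
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // already submitted tasks still run
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        // Six tasks, but only the two core threads are ever created,
        // because the queue never fills up.
        System.out.println("distinct threads: " + distinctThreadsUsed(6));
    }
}
```

The worker set itself is private to ThreadPoolExecutor; from the outside it is only visible indirectly, for example through getPoolSize().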

[Figure: thread pool structure]

2. Thread pool execution flow

[Figure: thread pool execution flow]

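III. Other Topics

1. When are non-core threads created?

Looking at the execute method of ThreadPoolExecutor: when the current thread count has reached corePoolSize and the work queue cannot accept another task, the pool tries to create a non-core thread (still bounded by maximumPoolSize).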

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    // Reached when the pool already has corePoolSize workers, or when addWorker above failed
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    // The task could not be queued: try to create a new (non-core) thread here
    else if (!addWorker(command, false))
        reject(command);
}
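The three steps in execute can be observed from the outside with a deliberately tiny pool. The sketch below assumes 1 core thread, a maximum of 2 threads and a queue with a single slot, and submits four tasks that all block until released. The class name ExecuteFlowDemo is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteFlowDemo {

    // Submits four blocking tasks and records the pool state after each submission.
    static List<String> trace() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + ", queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task" + i + ": rejected"); // default handler is AbortPolicy
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // task1: threads=1, queued=0   (step 1: new core thread)
        // task2: threads=1, queued=1   (step 2: queued)
        // task3: threads=2, queued=1   (step 3: queue full, non-core thread)
        // task4: rejected              (step 3 failed: pool saturated)
    }
}
```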

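2. When are non-core threads destroyed?

The code below shows that:
(1) When execute decides a new thread is needed, it adds a Worker and starts its thread.
(2) Once started, the thread loops, taking tasks from workQueue and running them. Because workQueue is a blocking queue, the thread blocks when there are no tasks.
(3) getTask uses the timed flag to decide whether to wait with a timeout. If it does, the poll on the blocking queue returns null once the timeout expires, and getTask then checks again on the next loop iteration whether it really should return null.
(4) When getTask returns null, the loop in runWorker ends, the thread finishes its run method, and it is destroyed normally.

Therefore: when the thread count exceeds the core pool size and the work queue stays empty, an idle thread is destroyed automatically once it has waited for the configured keep-alive time.

// Add a worker and start it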
private boolean addWorker(Runnable firstTask, boolean core) {
    ...
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            ...
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}

// Worker implements Runnable
private final class Worker extends AbstractQueuedSynchronizer implements Runnable{
    public void run() {
        runWorker(this);
    }
}

// Executed by the worker's thread
final void runWorker(Worker w) {
    ...
    try {
        // Loop: run the first task, then keep fetching tasks from the queue and running them
        while (task != null || (task = getTask()) != null) {
            ...
        }
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}

// Fetch a task
private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?
    for (;;) {
        ...
        // Should this worker wait with a timeout, i.e. is it allowed to time out and exit?
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }
        try {
            // If timed, poll with keepAliveTime; after a timeout, go around the loop and check once more
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
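The effect of the timed poll in getTask can be checked with a small experiment. The sketch below assumes a 50 ms keep-alive, 1 core thread and a maximum of 2 threads; it forces a non-core thread into existence, lets the pool go idle, and polls getPoolSize() until the surplus thread is gone. KeepAliveDemo and sleepQuietly are names invented for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {

    // Returns {pool size while saturated, pool size after the idle timeout}.
    static int[] poolSizeBeforeAndAfterIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 50L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // runs on the core thread
            pool.execute(blocker); // waits in the queue
            pool.execute(blocker); // queue is full: a non-core thread is created
            int busy = pool.getPoolSize();
            release.countDown();   // all tasks finish; the pool becomes idle
            // Poll for at most about 5 s until the surplus thread has timed out.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                sleepQuietly(10);
            }
            return new int[] {busy, pool.getPoolSize()};
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBeforeAndAfterIdle();
        System.out.println("busy=" + sizes[0] + ", idle=" + sizes[1]);
    }
}
```

Note that the pool does not track which thread was created as "core": whichever idle worker times out first while the count is above corePoolSize is the one that exits.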

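3. Can core threads be destroyed?

As question 2 shows, if allowCoreThreadTimeOut is set to true, core threads are also destroyed after staying idle for the configured keep-alive time.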
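A small experiment makes the difference visible. The sketch below assumes a pool of 2 core threads with a 50 ms keep-alive and compares the idle pool size with and without allowCoreThreadTimeOut(true); the wait durations and the class name CoreTimeoutDemo are arbitrary choices for this example. Note that allowCoreThreadTimeOut(true) requires a keep-alive time greater than zero, otherwise it throws IllegalArgumentException.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreTimeoutDemo {

    // Starts both core threads, lets the pool go idle, and returns the remaining pool size.
    static int poolSizeAfterIdle(boolean allowCoreTimeout) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(allowCoreTimeout);
        try {
            pool.execute(() -> { }); // creates core thread 1
            pool.execute(() -> { }); // creates core thread 2
            // Wait for the pool to drain: up to 5 s when threads may time out,
            // and a fixed 300 ms (several keep-alive periods) when they may not.
            long waitMillis = allowCoreTimeout ? 5000 : 300;
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(waitMillis);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return pool.getPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("core timeout off: " + poolSizeAfterIdle(false));
        System.out.println("core timeout on:  " + poolSizeAfterIdle(true));
    }
}
```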

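4. The fork/join thread pool

Suppose we have a task that traverses a file tree. With the usual recursive approach, files are processed serially, so a deeply nested tree takes a long time to walk.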

[Figure: conventional (serial) traversal]

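Instead, the task can be split into several subtasks that run in parallel on multiple threads, making use of the available CPU cores to speed things up. The ForkJoin thread pool was designed for exactly this kind of workload.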

[Figure: multi-threaded traversal]
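As one possible way to express the file traversal with fork/join, the sketch below counts the regular files under a directory using a RecursiveTask: every subdirectory becomes a forked subtask, and the parent joins the partial counts. The class FileCountTask is invented for this example, it uses the common pool for simplicity, and it does not guard against symbolic-link cycles.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class FileCountTask extends RecursiveTask<Long> {
    private final File dir;

    FileCountTask(File dir) {
        this.dir = dir;
    }

    @Override
    protected Long compute() {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0L; // not a directory, or it could not be read
        }
        long count = 0;
        List<FileCountTask> subtasks = new ArrayList<>();
        for (File child : children) {
            if (child.isDirectory()) {
                FileCountTask subtask = new FileCountTask(child);
                subtask.fork();        // schedule the subdirectory asynchronously
                subtasks.add(subtask);
            } else {
                count++;               // count files in this directory directly
            }
        }
        for (FileCountTask subtask : subtasks) {
            count += subtask.join();   // wait for and add each partial result
        }
        return count;
    }

    static long countFiles(File root) {
        return ForkJoinPool.commonPool().invoke(new FileCountTask(root));
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        System.out.println(countFiles(root) + " files under " + root.getAbsolutePath());
    }
}
```

While a task waits in join, its worker thread can pick up other pending subtasks (work stealing), which is what lets a deep tree be processed with a small, fixed number of threads.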