Java Source Code Analysis: An Introduction to the ThreadPoolExecutor Thread Pool

This article is based on JDK 1.8.

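Java's thread pool is a framework commonly used in multithreaded scenarios. Because the thread pool source code is fairly long, we will cover it over several articles. This article introduces the basics of thread pools, based on the Javadoc comment of the ThreadPoolExecutor class in the source. The original comment is reproduced below.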

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * <p><b>Extension example</b>. Most extensions of this class
 * override one or more of the protected hook methods. For example,
 * here is a subclass that adds a simple pause/resume feature:
 *
 *  <pre> {@code
 * class PausableThreadPoolExecutor extends ThreadPoolExecutor {
 *   private boolean isPaused;
 *   private ReentrantLock pauseLock = new ReentrantLock();
 *   private Condition unpaused = pauseLock.newCondition();
 *
 *   public PausableThreadPoolExecutor(...) { super(...); }
 *
 *   protected void beforeExecute(Thread t, Runnable r) {
 *     super.beforeExecute(t, r);
 *     pauseLock.lock();
 *     try {
 *       while (isPaused) unpaused.await();
 *     } catch (InterruptedException ie) {
 *       t.interrupt();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void pause() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = true;
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void resume() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = false;
 *       unpaused.signalAll();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 * }}</pre>
 *
 * @since 1.5
 * @author Doug Lea
 */

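ThreadPoolExecutor is an ExecutorService that executes each submitted task using one of possibly several pooled threads. It is normally configured and created using the Executors factory methods.
Thread pools address two different problems. First, they usually provide better performance when executing large numbers of asynchronous tasks, because they reduce per-task invocation overhead. Second, they provide a means of bounding and managing the resources, including threads, consumed when executing a collection of tasks. Each ThreadPoolExecutor also maintains some basic statistics, such as the number of completed tasks.
To be useful across a wide range of contexts, this class provides many adjustable parameters and extensibility hooks. However, programmers are urged to use the more convenient Executors factory methods Executors.newCachedThreadPool (unbounded thread pool with automatic thread reclamation), Executors.newFixedThreadPool (fixed-size thread pool), and Executors.newSingleThreadExecutor (single background thread), which preconfigure settings for the most common usage scenarios. Otherwise, follow the guide below when manually configuring and tuning this class.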
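As a minimal sketch of the three factory methods named above, the following runs one trivial task on each kind of pool. The class name FactoryMethodsDemo and the helper runOnce are hypothetical names chosen for this illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FactoryMethodsDemo {
    // Hypothetical helper: runs one trivial task on the given pool,
    // shuts the pool down, and returns the task's result.
    static int runOnce(ExecutorService pool) {
        try {
            Callable<Integer> task = () -> 1 + 1;
            Future<Integer> future = pool.submit(task);
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Each factory method returns a preconfigured ExecutorService.
        System.out.println(runOnce(Executors.newCachedThreadPool()));
        System.out.println(runOnce(Executors.newFixedThreadPool(4)));
        System.out.println(runOnce(Executors.newSingleThreadExecutor()));
    }
}
```

All three pools are used through the same ExecutorService interface; only the sizing and queuing configuration behind them differs.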
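1. Core and maximum pool sizes
A ThreadPoolExecutor automatically adjusts the pool size according to the bounds set by corePoolSize and maximumPoolSize. When a new task is submitted through the execute method and fewer than corePoolSize threads are running, a new thread is created to handle the task, even if other worker threads are idle. If at least corePoolSize but fewer than maximumPoolSize threads are running, a new thread is created only if the queue is full. Setting corePoolSize and maximumPoolSize to the same value creates a fixed-size thread pool. Setting maximumPoolSize to an essentially unbounded value such as Integer.MAX_VALUE lets the pool accommodate an arbitrary number of concurrent tasks. Typically, corePoolSize and maximumPoolSize are set only at construction, but they can also be changed dynamically with setCorePoolSize and setMaximumPoolSize.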
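The sizing rule can be observed directly. The sketch below assumes a pool with corePoolSize 1, maximumPoolSize 2, and a queue of capacity 1 (values picked only for illustration), submits three tasks that block on a latch, and records getPoolSize after each submission. PoolSizingDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSizingDemo {
    // Submits three blocking tasks and records the pool size after each one.
    static int[] poolSizesAfterEachSubmit() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] sizes = new int[3];
        try {
            for (int i = 0; i < sizes.length; i++) {
                pool.execute(blocker);
                sizes[i] = pool.getPoolSize();
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Task 1 starts a core thread, task 2 is queued,
        // task 3 finds the queue full and starts a second thread.
        System.out.println(Arrays.toString(poolSizesAfterEachSubmit())); // [1, 1, 2]
    }
}
```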
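2. On-demand construction
By default, threads, even core threads, are created and started only when new tasks arrive, but this can be changed by calling prestartCoreThread or prestartAllCoreThreads. You probably want to prestart threads if you construct the pool with a non-empty queue.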
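A small sketch of prestarting, assuming a pool of three core threads (an arbitrary value). It compares the pool size before and after calling prestartAllCoreThreads, which returns the number of threads it started. PrestartDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrestartDemo {
    // Returns {size before prestart, threads started, size after prestart}.
    static int[] sizesBeforeAndAfterPrestart() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 3, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            int before = pool.getPoolSize();             // no task submitted, no thread yet
            int started = pool.prestartAllCoreThreads(); // start every core thread now
            int after = pool.getPoolSize();
            return new int[] {before, started, after};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizesBeforeAndAfterPrestart())); // [0, 3, 3]
    }
}
```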
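3. Creating new threads
New threads are created using a ThreadFactory. If none is specified, Executors.defaultThreadFactory is used, which creates threads that all belong to the same ThreadGroup, have the same NORM_PRIORITY priority, and are non-daemon. By supplying a different ThreadFactory, you can alter the thread's name, thread group, priority, daemon status, and so on. If a ThreadFactory fails to create a thread when asked, by returning null from newThread, the executor will continue but might not be able to execute any tasks. Threads should also possess the "modifyThread" RuntimePermission; without it, configuration changes may not take effect in a timely manner, and a shut-down pool may remain in a state in which termination is possible but not completed.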
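The following is one possible custom ThreadFactory, shown only as a sketch. NamedThreadFactory and its prefix parameter are hypothetical; the factory gives each thread a numbered name and marks it as a daemon thread.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Name threads prefix-1, prefix-2, ... so they are easy to spot in thread dumps.
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(true);                   // differs from the default factory (non-daemon)
        t.setPriority(Thread.NORM_PRIORITY); // same priority as the default factory
        return t;
    }

    public static void main(String[] args) {
        Thread t = new NamedThreadFactory("worker").newThread(() -> { });
        System.out.println(t.getName() + " daemon=" + t.isDaemon()); // worker-1 daemon=true
    }
}
```

Such a factory can be passed to the ThreadPoolExecutor constructors that accept a ThreadFactory argument.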
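4. Keep-alive times
If the pool currently has more than corePoolSize threads, excess threads are terminated once they have been idle for longer than keepAliveTime. This provides a means of reducing resource consumption when the pool is not being actively used. If the pool becomes more active later, new threads are created again. This parameter can be changed dynamically with setKeepAliveTime. Using a value of Long.MAX_VALUE TimeUnit.NANOSECONDS effectively prevents idle threads from ever terminating before shutdown. By default, the keep-alive policy applies only when there are more than corePoolSize threads, but allowCoreThreadTimeOut(boolean) can be used to apply the time-out policy to core threads as well, as long as keepAliveTime is non-zero.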
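To illustrate the time-out policy for core threads, the sketch below assumes a 50 ms keep-alive (an arbitrary value), enables allowCoreThreadTimeOut, prestarts two core threads, and then polls until the idle threads have been reclaimed. KeepAliveDemo is a hypothetical name.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    // Returns the pool size once the idle core threads have timed out
    // (or after a 5 second safety deadline, whichever comes first).
    static int poolSizeAfterIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // requires a non-zero keepAliveTime
        pool.prestartAllCoreThreads();     // two idle core threads exist now
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            while (pool.getPoolSize() > 0 && deadline - System.nanoTime() > 0) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int size = pool.getPoolSize();
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) {
        System.out.println(poolSizeAfterIdle());
    }
}
```

Without the allowCoreThreadTimeOut(true) call, the two core threads would stay alive until the pool is shut down.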
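5. Queuing
Any BlockingQueue may be used to transfer and hold submitted tasks. The use of this queue interacts with pool sizing:
If fewer than corePoolSize threads are running, the executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued (that is, the queue is full), a new thread is created unless this would exceed maximumPoolSize, in which case the task is rejected.
There are three general strategies for queuing:
1) Direct handoffs. A good default choice for a work queue is a SynchronousQueue, which hands tasks off to threads without otherwise holding them. Here, an attempt to queue a task fails if no thread is immediately available to run it, so a new thread is constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require an unbounded maximumPoolSize to avoid rejection of newly submitted tasks. This in turn admits the possibility of unbounded thread growth when tasks continue to arrive, on average, faster than they can be processed.
2) Unbounded queues. Using an unbounded queue (for example, a LinkedBlockingQueue without a predefined capacity) causes new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads are ever created, and the value of maximumPoolSize has no effect. This may be appropriate when each task is completely independent of the others, so tasks cannot affect each other's execution; for example, in a web page server. While this style of queuing can be useful for smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when tasks continue to arrive, on average, faster than they can be processed.
3) Bounded queues. A bounded queue (for example, an ArrayBlockingQueue) helps prevent resource exhaustion when used with a finite maximumPoolSize, but can be more difficult to tune and control. Queue sizes and maximum pool sizes may be traded off against each other: using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. If tasks frequently block (for example, if they are I/O bound), the system may be able to schedule time for more threads than you otherwise allow. Using small queues generally requires larger pool sizes, which keeps CPUs busier but may incur unacceptable scheduling overhead, which also decreases throughput.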
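The three queuing strategies can be compared with a small experiment. This is a sketch under assumed sizes: each helper builds one pool configuration, and poolSizeUnderLoad submits blocking tasks and reports how many threads the pool ended up creating. QueueStrategiesDemo and all method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class QueueStrategiesDemo {
    // Direct handoff: no task is ever held, so every busy moment needs a new thread.
    static ThreadPoolExecutor directHandoff() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<>());
    }

    // Unbounded queue: the pool never grows beyond corePoolSize.
    static ThreadPoolExecutor unboundedQueue(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    // Bounded queue: extra threads appear only once the queue is full.
    static ThreadPoolExecutor boundedQueue(int core, int max, int capacity) {
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(capacity));
    }

    // Submits `tasks` blocking tasks and returns the resulting pool size.
    static int poolSizeUnderLoad(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(blocker);
            }
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(poolSizeUnderLoad(directHandoff(), 5));       // 5
        System.out.println(poolSizeUnderLoad(unboundedQueue(2), 5));     // 2
        System.out.println(poolSizeUnderLoad(boundedQueue(1, 3, 2), 5)); // 3
    }
}
```

With five blocking tasks, the direct-handoff pool creates one thread per task, the unbounded-queue pool stays at its two core threads, and the bounded pool grows to three threads only after its two queue slots are taken.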
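6. Rejected tasks
New tasks submitted through execute are rejected when the executor has been shut down, and also when the executor uses finite bounds for both maximum threads and work queue capacity and is saturated. In either case, the execute method invokes the RejectedExecutionHandler.rejectedExecution(Runnable, ThreadPoolExecutor) method of its RejectedExecutionHandler. Four predefined handler policies are provided:
1) In the default ThreadPoolExecutor.AbortPolicy, the handler throws a runtime RejectedExecutionException upon rejection.
2) In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute runs the task itself. This provides a simple feedback control mechanism that slows down the rate at which new tasks are submitted.
3) In ThreadPoolExecutor.DiscardPolicy, a task that cannot be executed is simply dropped.
4) In ThreadPoolExecutor.DiscardOldestPolicy, if the executor is not shut down, the task at the head of the work queue is dropped and execution is then retried (which can fail again, causing this to be repeated).
It is also possible to define and use other kinds of RejectedExecutionHandler classes. Doing so requires some care, especially when a policy is designed to work only under particular capacity or queuing policies.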
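The sketch below saturates a pool (one thread plus one queue slot, values assumed for illustration) and then submits one more task under a given handler, reporting what happened to that extra task. RejectionDemo and submitToSaturatedPool are hypothetical names; DiscardOldestPolicy is left out because its retried task runs asynchronously.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class RejectionDemo {
    // Returns "rejected" if execute threw, "dropped" if the task never ran,
    // or the name of the thread that ran the extra task.
    static String submitToSaturatedPool(RejectedExecutionHandler handler) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1), handler);
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<String> outcome = new AtomicReference<>("dropped");
        try {
            pool.execute(blocker); // occupies the only thread
            pool.execute(blocker); // fills the only queue slot
            pool.execute(() -> outcome.set(Thread.currentThread().getName()));
        } catch (RejectedExecutionException e) {
            outcome.set("rejected");
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(submitToSaturatedPool(new ThreadPoolExecutor.AbortPolicy()));      // rejected
        System.out.println(submitToSaturatedPool(new ThreadPoolExecutor.CallerRunsPolicy())); // main
        System.out.println(submitToSaturatedPool(new ThreadPoolExecutor.DiscardPolicy()));    // dropped
    }
}
```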
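7. Hook methods
This class provides protected, overridable beforeExecute and afterExecute methods that are called before and after the execution of each task. They can be used to manipulate the execution environment, for example by reinitializing ThreadLocals, gathering statistics, or adding log entries. In addition, the terminated method can be overridden to perform any special processing that needs to be done once the executor has fully terminated. If hook or callback methods throw exceptions, internal worker threads may in turn fail and terminate abruptly.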
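Besides the pause/resume subclass in the Javadoc, a simpler use of the hooks is gathering statistics. The following sketch counts hook invocations; CountingThreadPoolExecutor and its fields are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class CountingThreadPoolExecutor extends ThreadPoolExecutor {
    private final AtomicInteger started = new AtomicInteger();
    private final AtomicInteger finished = new AtomicInteger();
    private final AtomicBoolean terminatedCalled = new AtomicBoolean();

    CountingThreadPoolExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        started.incrementAndGet();  // runs in the worker thread, before the task
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        finished.incrementAndGet(); // runs in the worker thread, after the task
    }

    @Override
    protected void terminated() {
        super.terminated();
        terminatedCalled.set(true); // runs once, when the pool has fully terminated
    }

    // Runs n no-op tasks and returns {started, finished, terminated ? 1 : 0}.
    static int[] runTasks(int n) {
        CountingThreadPoolExecutor pool = new CountingThreadPoolExecutor(2);
        for (int i = 0; i < n; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown(); // queued tasks still run after shutdown()
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {pool.started.get(), pool.finished.get(),
                pool.terminatedCalled.get() ? 1 : 0};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runTasks(3))); // [3, 3, 1]
    }
}
```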
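8. Queue maintenance
The getQueue method gives access to the work queue for monitoring and debugging purposes. Using it for any other purpose is strongly discouraged. Two supplied methods, remove(Runnable) and purge, are available to assist in storage reclamation when large numbers of queued tasks become cancelled.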
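A sketch of the three maintenance methods, assuming a single-threaded pool whose worker is kept busy so that later tasks stay queued. QueueMaintenanceDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class QueueMaintenanceDemo {
    // Returns the queue size initially, after remove(), and after cancel() + purge().
    static int[] queueSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Runnable noop = () -> { };
        try {
            pool.execute(blocker);               // occupies the single worker
            pool.execute(noop);                  // waits in the queue
            Future<?> future = pool.submit(noop); // waits in the queue, wrapped in a Future
            int initial = pool.getQueue().size(); // monitoring only

            pool.remove(noop);                   // removes the task queued via execute
            int afterRemove = pool.getQueue().size();

            future.cancel(false);                // cancelled, but still sitting in the queue
            pool.purge();                        // drops cancelled Future tasks
            int afterPurge = pool.getQueue().size();
            return new int[] {initial, afterRemove, afterPurge};
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(queueSizes())); // [2, 1, 0]
    }
}
```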
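9. Finalization
A pool that is no longer referenced in a program and has no remaining threads is shut down automatically. If you want to ensure that unreferenced pools are reclaimed even when users forget to call shutdown, you must arrange for unused threads to eventually die, by setting appropriate keep-alive times, using a lower bound of zero core threads, and/or calling allowCoreThreadTimeOut(boolean).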
