CountDownLatch Source Code Analysis and Usage Examples

If you are not familiar with AQS, this chapter may be hard to follow. I recommend reading my other article on AQS first: AQS Source Code Analysis

Class Javadoc

A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.

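A CountDownLatch is initialized with a given count. The await methods block until the current count reaches zero as a result of calls to countDown; at that point all waiting threads are released, and any later call to await returns immediately. This is a one-shot phenomenon: the count cannot be reset. If you need a version whose count can be reset, consider using a CyclicBarrier.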
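
To make the one-shot behaviour concrete, here is a small sketch. The class name OneShotLatchDemo and the helper observeCounts are my own names for illustration; only CountDownLatch, countDown, await and getCount come from the JDK.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;

class OneShotLatchDemo {
    // Returns the counts seen at three points, to show that the latch never resets.
    static long[] observeCounts() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(2);
        long initial = latch.getCount();     // 2
        latch.countDown();
        latch.countDown();
        long afterTwo = latch.getCount();    // 0
        latch.countDown();                   // already zero: this call changes nothing
        long afterExtra = latch.getCount();  // still 0
        latch.await();                       // count is zero, so this returns immediately
        return new long[] {initial, afterTwo, afterExtra};
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(Arrays.toString(observeCounts())); // prints [2, 0, 0]
    }
}
```

Once the count is zero there is no API to bring it back up, which is why a new latch (or a CyclicBarrier) is needed for the next round.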

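CountDownLatch is a versatile synchronization tool that can serve many purposes. A CountDownLatch initialized with a count of 1 works as a simple on/off latch, or gate: every thread that calls await waits at the gate until a thread calling countDown opens it. A CountDownLatch initialized to N can be used to make one thread wait until N threads have completed some action, or until some action has been completed N times.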

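A thread that calls countDown can continue immediately; it does not wait for the count to reach zero. A thread that calls await, however, blocks until the count has been reduced to zero, that is, until countDown has been called the required number of times. The latch therefore only keeps threads from proceeding past await until the counted work is done; it places no restriction on the threads doing the counting.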

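Usage Example 1

Here is a pair of classes in which a group of worker threads uses two CountDownLatch instances:

The first is a start signal that prevents any worker from proceeding until the driver is ready for them to proceed;
The second is a completion signal that allows the driver to wait until all workers have finished.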

class Driver { // ...
    void main() throws InterruptedException {
        CountDownLatch startSignal = new CountDownLatch(1);
        CountDownLatch doneSignal = new CountDownLatch(N);
        for (int i = 0; i < N; ++i) // create and start threads
            new Thread(new Worker(startSignal, doneSignal)).start();
        doSomethingElse();            // don't let run yet
        startSignal.countDown();      // let all threads proceed
        doSomethingElse();
        doneSignal.await();           // wait for all to finish
    }
}

class Worker implements Runnable {
    private final CountDownLatch startSignal;
    private final CountDownLatch doneSignal;

    Worker(CountDownLatch startSignal, CountDownLatch doneSignal) {
        this.startSignal = startSignal;
        this.doneSignal = doneSignal;
    }

    public void run() {
        try {
            startSignal.await();
            doWork();
            doneSignal.countDown();
        } catch (InterruptedException ex) {
        } // return;
    }

    void doWork() { ... }
}
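
The Driver/Worker pair above leaves N, doSomethingElse and doWork open, so it does not compile as is. Below is a runnable sketch of the same two-latch pattern. StartDoneDemo, runWorkers and the AtomicInteger standing in for the real work are placeholders I made up; I also moved countDown into a finally block so that the driver is not left waiting if a worker is interrupted, which is my own choice rather than what the Javadoc example does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class StartDoneDemo {
    // Hypothetical helper: starts n workers and returns how many finished their work.
    static int runWorkers(int n) throws InterruptedException {
        CountDownLatch startSignal = new CountDownLatch(1);
        CountDownLatch doneSignal = new CountDownLatch(n);
        AtomicInteger finished = new AtomicInteger();

        for (int i = 0; i < n; i++) {
            new Thread(() -> {
                try {
                    startSignal.await();         // wait at the gate
                    finished.incrementAndGet();  // stand-in for doWork()
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                } finally {
                    doneSignal.countDown();      // always report completion
                }
            }).start();
        }

        // The gate is still closed, so no worker can have done any work yet.
        if (finished.get() != 0) {
            throw new IllegalStateException("a worker ran before the start signal");
        }
        startSignal.countDown();                 // open the gate for all workers
        doneSignal.await();                      // wait until every worker has counted down
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(5)); // prints 5
    }
}
```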

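Usage Example 2

Another common pattern is to divide a problem into N parts, describe each part with a Runnable that executes its portion and counts down on the latch, and queue all the Runnables to an Executor. When all sub-parts are complete, the coordinating thread is able to pass through await. (When threads must repeatedly count down in this way, use a CyclicBarrier instead.)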

class Driver2 { // ...
    void main() throws InterruptedException {
        CountDownLatch doneSignal = new CountDownLatch(N);
        Executor e = ...
        for (int i = 0; i < N; ++i) // create and start threads
            e.execute(new WorkerRunnable(doneSignal, i));
        doneSignal.await();           // wait for all to finish
    }
}

class WorkerRunnable implements Runnable {
    private final CountDownLatch doneSignal;
    private final int i;

    WorkerRunnable(CountDownLatch doneSignal, int i) {
        this.doneSignal = doneSignal;
        this.i = i;
    }

    public void run() {
        // doWork declares no checked exception, so there is no
        // InterruptedException to catch here
        doWork(i);
        doneSignal.countDown();
    }

    // takes the part index, matching the doWork(i) call in run()
    void doWork(int i) { ... }
}
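
As a concrete instance of this "split into N parts" pattern, the sketch below sums an array in chunks on a thread pool. PartitionedSumDemo, sumInParts, the pool size of 4 and the chunking arithmetic are all assumptions of mine for illustration; it assumes parts is at least 1.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PartitionedSumDemo {
    // Hypothetical helper: sums data by splitting it into `parts` chunks (parts >= 1).
    static long sumInParts(int[] data, int parts) throws InterruptedException {
        CountDownLatch doneSignal = new CountDownLatch(parts);
        long[] partial = new long[parts];                 // one slot per part, no sharing
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            int chunk = (data.length + parts - 1) / parts; // ceiling division
            for (int p = 0; p < parts; p++) {
                final int part = p;
                pool.execute(() -> {
                    try {
                        int from = Math.min(part * chunk, data.length);
                        int to = Math.min(from + chunk, data.length);
                        long s = 0;
                        for (int i = from; i < to; i++) s += data[i];
                        partial[part] = s;
                    } finally {
                        doneSignal.countDown();            // this part is done
                    }
                });
            }
            doneSignal.await();                            // wait for all parts
        } finally {
            pool.shutdown();
        }
        long total = 0;
        for (long s : partial) total += s;                 // safe to read after await
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(sumInParts(data, 7)); // prints 5050
    }
}
```

Each task writes only its own slot in partial, and the coordinating thread reads the slots only after await has returned, so no extra locking is needed.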

Memory consistency effects: Until the count reaches zero, actions in a thread prior to calling countDown() happen-before actions following a successful return from a corresponding await() in another thread.

What does this mean? The sentence describes the memory visibility guarantee that CountDownLatch provides: everything a thread wrote before calling countDown is guaranteed to be visible to another thread once that thread's call to await has returned successfully.
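
A minimal sketch of what this guarantee buys you, using names of my own (VisibilityDemo, compute, box). The array slot is a plain, non-volatile variable; the only thing that makes the write visible to the reading thread is the countDown/await pair.

```java
import java.util.concurrent.CountDownLatch;

class VisibilityDemo {
    static int compute() throws InterruptedException {
        int[] box = new int[1];                  // plain array slot, not volatile
        CountDownLatch done = new CountDownLatch(1);
        new Thread(() -> {
            box[0] = 42;                         // written before countDown()
            done.countDown();
        }).start();
        done.await();                            // returns only after the count hit zero
        return box[0];                           // the happens-before edge guarantees 42
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(compute()); // prints 42
    }
}
```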

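Constructor

The constructor takes the initial count, which is the number of times countDown must be called before threads can pass through await. The constructor is simple enough that it needs little further explanation.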

public CountDownLatch(int count) {
    // validate the argument
    if (count < 0) throw new IllegalArgumentException("count < 0");
    // Sync is an inner class that extends AQS; it stores count in the AQS state
    this.sync = new Sync(count);
}

private static final class Sync extends AbstractQueuedSynchronizer {
    private static final long serialVersionUID = 4982264981922014374L;

    Sync(int count) {
        setState(count);
    }
    //......
}
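
The two edge cases of the constructor can be checked quickly. ConstructorDemo and its two helper methods are names I introduced for this sketch.

```java
import java.util.concurrent.CountDownLatch;

class ConstructorDemo {
    // A negative count is rejected by the argument check in the constructor.
    static boolean rejectsNegative() {
        try {
            new CountDownLatch(-1);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    // A count of zero is legal: the latch starts out open.
    static long zeroCountIsOpen() throws InterruptedException {
        CountDownLatch open = new CountDownLatch(0);
        open.await();               // state is already 0, so this does not block
        return open.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(rejectsNegative());  // prints true
        System.out.println(zeroCountIsOpen());  // prints 0
    }
}
```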

The await Method

As the Javadoc section explains, await acts as a switch or gate: the gate opens once count reaches zero.

// In CountDownLatch
// This method may throw InterruptedException
public void await() throws InterruptedException {
    // delegate to the AQS method
    sync.acquireSharedInterruptibly(1);
}

This method is defined in the AbstractQueuedSynchronizer class, referred to as AQS from here on.

// In AQS
public final void acquireSharedInterruptibly(int arg)
        throws InterruptedException {
    // check whether the thread has been interrupted (this also clears the flag)
    if (Thread.interrupted())
        throw new InterruptedException();
    // the two core methods appear here
    if (tryAcquireShared(arg) < 0)
        doAcquireSharedInterruptibly(arg);
}
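
The Thread.interrupted() check at the top has a visible effect for callers: a thread whose interrupt flag is already set gets an InterruptedException from await right away, and the flag is cleared. The sketch below shows this; InterruptedAwaitDemo and awaitThrowsWhenInterrupted are hypothetical names of mine.

```java
import java.util.concurrent.CountDownLatch;

class InterruptedAwaitDemo {
    // Returns true if await() threw InterruptedException and cleared the interrupt flag.
    static boolean awaitThrowsWhenInterrupted() {
        CountDownLatch latch = new CountDownLatch(1);   // count is 1, so await would block
        Thread.currentThread().interrupt();             // set the interrupt flag first
        try {
            latch.await();                              // the entry check fails fast
            return false;
        } catch (InterruptedException expected) {
            // Thread.interrupted() cleared the flag before the exception was thrown
            return !Thread.currentThread().isInterrupted();
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitThrowsWhenInterrupted()); // prints true
    }
}
```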

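First, tryAcquireShared reads the AQS state. If state is 0, the count has already been fully counted down, so the calling thread does not need to block and await returns immediately. This typically happens when the worker threads finished quickly, and the main thread simply carries on with its remaining work.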

// Implemented in CountDownLatch (in the Sync inner class)
// The int acquires parameter is unused
protected int tryAcquireShared(int acquires) {
    return (getState() == 0) ? 1 : -1;
}

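If the AQS state is not 0, the current thread has to be stopped until the worker threads finish. Let's first skim the source of doAcquireSharedInterruptibly to get the overall picture, then go through the details one by one.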

// In AQS
// From the code above, arg == 1
private void doAcquireSharedInterruptibly(int arg)
    throws InterruptedException {
    // For now, read this as "create a node marked SHARED and enqueue it"; analyzed below
    final Node node = addWaiter(Node.SHARED);
    try {
        // The usual for-loop-plus-CAS pattern gives thread safety without a lock.
        // The loop also resumes here after LockSupport.unpark wakes the thread.
        for (;;) {
            // predecessor of the current node
            final Node p = node.predecessor();
            // Only the node right behind head may try to acquire,
            // so waiters are woken in order from head to tail
            if (p == head) {
                // returns 1 if state == 0, otherwise -1
                int r = tryAcquireShared(arg);
                // state is 0, so this thread no longer needs to wait
                if (r >= 0) {
                    // make the current node the new head and propagate the wake-up
                    setHeadAndPropagate(node, r);
                    p.next = null; // help GC
                    return;
                }
            }
            // Park the thread via LockSupport.park; it is woken once state
            // reaches zero, or when the thread is interrupted
            if (shouldParkAfterFailedAcquire(p, node) &&
                parkAndCheckInterrupt())
                throw new InterruptedException();
        }
    } catch (Throwable t) {
        // on any exception, cancel the current node and rethrow
        cancelAcquire(node);
        throw t;
    }
}

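Detail 1: The addWaiter Method

The argument passed in is Node.SHARED. The Node inner class of AQS defines two mode markers, SHARED and EXCLUSIVE: one for shared mode and one for exclusive mode.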

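private Node addWaiter(Node mode) {
    // the constructor stores Node.SHARED in nextWaiter
    Node node = new Node(mode);
    // same pattern again: an endless for loop plus CAS for thread safety
    for (;;) {
        // read the current tail and remember it as the old tail
        Node oldTail = tail;
        // if the tail is null the queue has not been initialized yet,
        // so skip the if branch and initialize the sync queue in the else branch
        if (oldTail != null) {
            // tail is not null: append the node to the end of the queue
            // point this node's prev at oldTail to link the two nodes
            node.setPrevRelaxed(oldTail);
            // swing tail to the new node; if the CAS fails there was contention, so loop again
            if (compareAndSetTail(oldTail, node)) {
                // complete the link in the forward direction
                oldTail.next = node;
                return node;
            }
        } else {
            // initialize the queue; afterwards head and tail both point to the same new Node()
            initializeSyncQueue();
        }
    }
}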
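
To see the enqueue loop in isolation, here is a heavily simplified sketch of the same idea built on AtomicReference. It is not the AQS code: CasQueueSketch, its Node class, size and concurrentEnqueue are all invented for illustration, and the real implementation uses VarHandle operations on fields instead.

```java
import java.util.concurrent.atomic.AtomicReference;

class CasQueueSketch {
    static final class Node {
        volatile Node prev;
        volatile Node next;
    }

    private final AtomicReference<Node> head = new AtomicReference<>();
    private final AtomicReference<Node> tail = new AtomicReference<>();

    Node enqueue() {
        Node node = new Node();
        for (;;) {
            Node oldTail = tail.get();
            if (oldTail != null) {
                node.prev = oldTail;                     // link backwards first
                if (tail.compareAndSet(oldTail, node)) { // publish as the new tail
                    oldTail.next = node;                 // then link forwards
                    return node;
                }
                // CAS failed: another thread won the race, retry with the new tail
            } else {
                // lazily install a dummy node as both head and tail
                Node dummy = new Node();
                if (head.compareAndSet(null, dummy)) tail.set(dummy);
            }
        }
    }

    // Counts real nodes by walking prev links from the tail; the dummy has prev == null.
    int size() {
        int n = 0;
        for (Node p = tail.get(); p != null && p.prev != null; p = p.prev) n++;
        return n;
    }

    // Hypothetical stress helper: several threads enqueue at once, no node may be lost.
    static int concurrentEnqueue(int threads, int perThread) throws InterruptedException {
        CasQueueSketch q = new CasQueueSketch();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) q.enqueue();
            });
            ts[t].start();
        }
        for (Thread t : ts) t.join();
        return q.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(concurrentEnqueue(4, 1000)); // prints 4000
    }
}
```

Setting prev before the CAS is what makes backward traversal from the tail reliable: a node that is reachable as tail always has a valid prev, while next may lag behind for a moment.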

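Detail 2: The setHeadAndPropagate Method

This method is only reached when r in int r = tryAcquireShared(arg) is greater than or equal to 0. In CountDownLatch, tryAcquireShared returns only 1 or -1: 1 when the AQS state is 0, and -1 otherwise. So this method is entered only when the AQS state (the count) is zero.

At this point the count is already 0, so the thread in await will not park again: either it never parked at all, or it has just been woken up after the count reached zero. The method makes the current node the new head and passes the wake-up on to the next waiter.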

private void setHeadAndPropagate(Node node, int propagate) {
    Node h = head; // Record old head for check below
    setHead(node);
    /*
     * Try to signal next queued node if:
     *   Propagation was indicated by caller,
     *     or was recorded (as h.waitStatus either before
     *     or after setHead) by a previous operation
     *     (note: this uses sign-check of waitStatus because
     *      PROPAGATE status may transition to SIGNAL.)
     * and
     *   The next node is waiting in shared mode,
     *     or we don't know, because it appears null
     *
     * The conservatism in both of these checks may cause
     * unnecessary wake-ups, but only when there are multiple
     * racing acquires/releases, so most need signals now or soon
     * anyway.
     */
    // For CountDownLatch, propagate is always 1 here (the count is 0),
    // so this condition is always true and the waitStatus checks
    // never decide the outcome.
    if (propagate > 0 || h == null || h.waitStatus < 0 ||
        (h = head) == null || h.waitStatus < 0) {
        Node s = node.next;
        if (s == null || s.isShared())
            // wake up the successor
            doReleaseShared();
    }
}

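Detail 3: shouldParkAfterFailedAcquire

None of the methods above assigns waitStatus, so it still has its default value of 0. This step uses CAS to set the predecessor's waitStatus to SIGNAL and returns false; on the next loop iteration the method sees SIGNAL and returns true, and the thread parks.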

private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
    int ws = pred.waitStatus;
    if (ws == Node.SIGNAL)
        /*
         * This node has already set status asking a release
         * to signal it, so it can safely park.
         */
        return true;
    if (ws > 0) {
        /*
         * Predecessor was cancelled. Skip over predecessors and
         * indicate retry.
         */
        do {
            node.prev = pred = pred.prev;
        } while (pred.waitStatus > 0);
        pred.next = node;
    } else {
        /*
         * waitStatus must be 0 or PROPAGATE.  Indicate that we
         * need a signal, but don't park yet.  Caller will need to
         * retry to make sure it cannot acquire before parking.
         */
        pred.compareAndSetWaitStatus(ws, Node.SIGNAL);
    }
    return false;
}
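
The parking itself happens in parkAndCheckInterrupt, whose source is not shown in this article; as far as I know it is a thin wrapper around LockSupport.park followed by Thread.interrupted(). The sketch below shows the underlying park/unpark handshake with the same "re-check the condition after every wake-up" loop. ParkUnparkDemo, parkUntilOpened and the two AtomicBoolean flags are my own illustration, not AQS code.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class ParkUnparkDemo {
    // Returns true once the waiter thread has passed the "gate".
    static boolean parkUntilOpened() throws InterruptedException {
        AtomicBoolean open = new AtomicBoolean(false);
        AtomicBoolean passed = new AtomicBoolean(false);

        Thread waiter = new Thread(() -> {
            while (!open.get()) {        // re-check the condition after every wake-up
                LockSupport.park();      // may return spuriously, hence the loop
            }
            passed.set(true);
        });
        waiter.start();

        open.set(true);                  // change the condition first...
        LockSupport.unpark(waiter);      // ...then hand the waiter a permit
        waiter.join();
        return passed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(parkUntilOpened()); // prints true
    }
}
```

Because unpark leaves a permit behind, it does not matter whether it runs before or after the waiter reaches park: the waiter never sleeps through the wake-up.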

Detail 4: doReleaseShared

This method wakes up threads that are waiting in await.

// In AQS
// Release action for shared mode: signals the successor and ensures propagation
private void doReleaseShared() {
    /*
     * Ensure that a release propagates, even if there are other
     * in-progress acquires/releases.  This proceeds in the usual
     * way of trying to unparkSuccessor of head if it needs
     * signal. But if it does not, status is set to PROPAGATE to
     * ensure that upon release, propagation continues.
     * Additionally, we must loop in case a new node is added
     * while we are doing this. Also, unlike other uses of
     * unparkSuccessor, we need to know if CAS to reset status
     * fails, if so rechecking.
     */
    for (;;) {
        Node h = head;
        if (h != null && h != tail) {
            int ws = h.waitStatus;
            if (ws == Node.SIGNAL) {
                if (!h.compareAndSetWaitStatus(Node.SIGNAL, 0))
                    continue;            // loop to recheck cases
                unparkSuccessor(h);
            }
            else if (ws == 0 &&
                     !h.compareAndSetWaitStatus(0, Node.PROPAGATE))
                continue;                // loop on failed CAS
        }
        if (h == head)                   // loop if head changed
            break;
    }
}

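The countDown Method

If you read the Javadoc carefully, you already know that each call to countDown decrements the count passed to the constructor by one until it reaches zero. Knowing that is not enough, though; we also want to know how it is implemented underneath.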

// In CountDownLatch
public void countDown() {
    // a single statement: calls releaseShared, inherited by Sync from AQS
    sync.releaseShared(1);
}

// In AQS

// Releases in shared mode; typically used to implement shared locks.
// If tryReleaseShared returns true, the release is carried out by
// unblocking one or more threads, which can then acquire in shared mode.
public final boolean releaseShared(int arg) {
    if (tryReleaseShared(arg)) {
        doReleaseShared();
        return true;
    }
    return false;
}

releaseShared shows that there are just two core methods, tryReleaseShared and doReleaseShared. As the names suggest, one tries to release and the other performs the release.

// In CountDownLatch (in the Sync inner class)
// The int releases parameter is unused
protected boolean tryReleaseShared(int releases) {
    // Decrement count; signal when transition to zero
    for (;;) {
        // read the AQS state
        int c = getState();
        // already zero: nothing to release, return false
        if (c == 0)
            return false;
        int nextc = c - 1;
        // thread safety comes from the for loop plus CAS alone; no lock is taken
        if (compareAndSetState(c, nextc))
            // true only for the call that brings the count to 0
            return nextc == 0;
    }
}
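
Putting tryAcquireShared and tryReleaseShared together, a latch of your own takes very little code, because AQS supplies the queueing, parking and wake-up logic. The sketch below mirrors the structure discussed in this article but is not the JDK class; SimpleLatch and its demo method are names I made up.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class SimpleLatch {
    private static final class Sync extends AbstractQueuedSynchronizer {
        Sync(int count) { setState(count); }

        int count() { return getState(); }

        @Override
        protected int tryAcquireShared(int ignored) {
            return getState() == 0 ? 1 : -1;      // open only when the count is zero
        }

        @Override
        protected boolean tryReleaseShared(int ignored) {
            for (;;) {
                int c = getState();
                if (c == 0) return false;         // already open, nothing to do
                int next = c - 1;
                if (compareAndSetState(c, next)) return next == 0;
            }
        }
    }

    private final Sync sync;

    SimpleLatch(int count) {
        if (count < 0) throw new IllegalArgumentException("count < 0");
        sync = new Sync(count);
    }

    void await() throws InterruptedException { sync.acquireSharedInterruptibly(1); }

    void countDown() { sync.releaseShared(1); }

    int getCount() { return sync.count(); }

    // Hypothetical demo: three threads count down, the caller waits for all of them.
    static int demo() throws InterruptedException {
        SimpleLatch latch = new SimpleLatch(3);
        for (int i = 0; i < 3; i++) new Thread(latch::countDown).start();
        latch.await();
        return latch.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // prints 0
    }
}
```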

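doReleaseShared runs only when tryReleaseShared returns true, that is, when this call reduced the AQS state (the count) to zero. Its job is to release the parked threads, and it is the same method covered in Detail 4.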

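Summary

As for why both await and countDown end up in doReleaseShared, here is my understanding after reading the source.

On the await side, doReleaseShared is what propagates the wake-up. When a thread calls await, it calls tryAcquireShared; if the count is not 0, the thread is enqueued and parked by parkAndCheckInterrupt. Once the count reaches 0 and the thread is woken, it calls setHeadAndPropagate, which in turn calls doReleaseShared to wake the next waiter in the queue. Every woken thread passes the wake-up along in this way, so all waiting threads are eventually released.

On the countDown side, the decrement itself is done by tryReleaseShared with a CAS loop. Only the call that brings the count to 0 goes on to doReleaseShared, which wakes the first waiting thread and starts the propagation described above.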