[Original] Dissecting I/O Streams from the Source Code (Part 3): Buffered Streams -- please credit the source when reposting

1. BufferedInputStream

To get started with BufferedInputStream, let's first look at the official class-level comment in the source:

/**
 * A <code>BufferedInputStream</code> adds
 * functionality to another input stream-namely,
 * the ability to buffer the input and to
 * support the <code>mark</code> and <code>reset</code>
 * methods. When  the <code>BufferedInputStream</code>
 * is created, an internal buffer array is
 * created. As bytes  from the stream are read
 * or skipped, the internal buffer is refilled
 * as necessary  from the contained input stream,
 * many bytes at a time. The <code>mark</code>
 * operation  remembers a point in the input
 * stream and the <code>reset</code> operation
 * causes all the  bytes read since the most
 * recent <code>mark</code> operation to be
 * reread before new bytes are  taken from
 * the contained input stream.
 *
 * @author  Arthur van Hoff
 * @since   JDK1.0
 */

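In short, the comment says:

A BufferedInputStream wraps another input stream and adds two things: buffering of the input, and support for the mark and reset methods. An internal buffer array is created together with the BufferedInputStream. As bytes are read or skipped, the buffer is refilled from the wrapped stream as needed, many bytes at a time. mark remembers a position in the input, and reset causes all bytes read since the most recent mark to be read again before any new bytes are taken from the wrapped stream.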

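The key point in the comment above is the internal buffer. The main difference between BufferedInputStream and a plain InputStream is that BufferedInputStream allocates a byte-array buffer when it is created, and reads are normally served from that buffer. The buffered input stream has the following member fields: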

    // Default size of the buffer
    private static int DEFAULT_BUFFER_SIZE = 8192;

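    /**
     * The maximum array size to allocate.
     * Why subtract 8?
     * Some VMs reserve a few header words in an array, so trying to allocate
     * an array of the full maximum size may exceed the VM limit and
     * end in an OutOfMemoryError.
     */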
    private static int MAX_BUFFER_SIZE = Integer.MAX_VALUE - 8;

    /**
     * The internal buffer array in which the data is stored.
     * When necessary, it may be replaced by another array of a different size.
     */
    protected volatile byte buf[];

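    /**
     * Atomic updater that provides compareAndSet for buf.
     * This is necessary because close() can be asynchronous. A null buf
     * is the indicator that the stream has been closed.
     * Together with the volatile modifier on buf, this makes changes to buf visible
     * to other threads and lets buf be replaced or cleared atomically.
     */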
    private static final
        AtomicReferenceFieldUpdater<BufferedInputStream, byte[]> bufUpdater =
        AtomicReferenceFieldUpdater.newUpdater
        (BufferedInputStream.class,  byte[].class, "buf");

    /**
     * The number of valid bytes in the buffer
     * (the index one past the last valid byte).
     */
    protected int count;

    /**
     * The current position in the buffer, i.e. the index of the next byte to read.
     */
    protected int pos;

    /**
     * The value of the pos field at the time mark was last called
     * (-1 when there is no valid mark).
     */
    protected int markpos = -1;

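    /**
     * The maximum read-ahead allowed after a call to mark before
     * subsequent calls to reset fail.
     * It bounds the distance pos - markpos, not the value of markpos itself.
     */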
    protected int marklimit;
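
To make markpos and marklimit a little more concrete, here is a minimal sketch of mark and reset as seen from the caller's side. The class name MarkResetDemo and the sample data are made up for illustration; only the standard java.io classes are used.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {
    /** Reads two bytes, rewinds to the mark, and reads the same two bytes again. */
    static String readTwice(byte[] data) throws IOException {
        try (BufferedInputStream in =
                 new BufferedInputStream(new ByteArrayInputStream(data))) {
            StringBuilder sb = new StringBuilder();
            in.mark(16);                  // remembers pos in markpos, marklimit = 16
            sb.append((char) in.read());
            sb.append((char) in.read());
            in.reset();                   // moves pos back to markpos
            sb.append((char) in.read());
            sb.append((char) in.read());
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdef".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readTwice(data)); // prints "abab"
    }
}
```

Because only two bytes are read between mark and reset, the read-ahead stays well below the marklimit of 16 and the mark remains valid.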

Next, let's look at the constructors of the buffered input stream:

    public BufferedInputStream(InputStream in) {
        this(in, DEFAULT_BUFFER_SIZE);
    }

    public BufferedInputStream(InputStream in, int size) {
        super(in);
        if (size <= 0) {
            throw new IllegalArgumentException("Buffer size <= 0");
        }
        buf = new byte[size];
    }
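
The size check in the second constructor can be tried out directly. The following is a small sketch; BufferSizeDemo and its rejects helper are hypothetical names, not part of the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class BufferSizeDemo {
    /** Returns true if the constructor rejects the given buffer size. */
    static boolean rejects(int size) {
        InputStream source = new ByteArrayInputStream(new byte[0]);
        try {
            new BufferedInputStream(source, size);
            return false;                 // size accepted
        } catch (IllegalArgumentException e) {
            return true;                  // thrown for size <= 0
        }
    }

    public static void main(String[] args) {
        System.out.println("size 0 rejected:    " + rejects(0));
        System.out.println("size -1 rejected:   " + rejects(-1));
        System.out.println("size 4096 rejected: " + rejects(4096));
    }
}
```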

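At construction time the buffer size can either be left at the default or set explicitly. The constructor only allocates the buffer array, though; it does not read any data into it. To see how the buffer gets filled, we need to look at three methods in BufferedInputStream: read, read1 and fill: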

    public synchronized int read(byte b[], int off, int len) throws IOException
    {
        getBufIfOpen(); // Check for closed stream
        if ((off | len | (off + len) | (b.length - (off + len))) < 0) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return 0;
        }

        int n = 0;
        for (;;) {
            int nread = read1(b, off + n, len - n);
            if (nread <= 0)
                return (n == 0) ? nread : n;
            n += nread;
            if (n >= len)
                return n;
            // if not closed but no bytes available, return
            InputStream input = in;
            if (input != null && input.available() <= 0)
                return n;
        }
    }

    private int read1(byte[] b, int off, int len) throws IOException {
        int avail = count - pos;
        if (avail <= 0) {
            /* If the requested length is at least as large as the buffer, and
               if there is no mark/reset activity, do not bother to copy the
               bytes into the local buffer.  In this way buffered streams will
               cascade harmlessly. */
            if (len >= getBufIfOpen().length && markpos < 0) {
                return getInIfOpen().read(b, off, len);
            }
            fill();
            avail = count - pos;
            if (avail <= 0) return -1;
        }
        int cnt = (avail < len) ? avail : len;
        System.arraycopy(getBufIfOpen(), pos, b, off, cnt);
        pos += cnt;
        return cnt;
    }

    private void fill() throws IOException {
        byte[] buffer = getBufIfOpen();
        if (markpos < 0)
            pos = 0;            /* no mark: throw away the buffer */
        else if (pos >= buffer.length)  /* no room left in buffer */
            if (markpos > 0) {  /* can throw away early part of the buffer */
                int sz = pos - markpos;
                System.arraycopy(buffer, markpos, buffer, 0, sz);
                pos = sz;
                markpos = 0;
            } else if (buffer.length >= marklimit) {
                markpos = -1;   /* buffer got too big, invalidate mark */
                pos = 0;        /* drop buffer contents */
            } else if (buffer.length >= MAX_BUFFER_SIZE) {
                throw new OutOfMemoryError("Required array size too large");
            } else {            /* grow buffer */
                int nsz = (pos <= MAX_BUFFER_SIZE - pos) ?
                        pos * 2 : MAX_BUFFER_SIZE;
                if (nsz > marklimit)
                    nsz = marklimit;
                byte nbuf[] = new byte[nsz];
                System.arraycopy(buffer, 0, nbuf, 0, pos);
                if (!bufUpdater.compareAndSet(this, buffer, nbuf)) {
                    // Can't replace buf if there was an async close.
                    // Note: This would need to be changed if fill()
                    // is ever made accessible to multiple threads.
                    // But for now, the only way CAS can fail is via close.
                    // assert buf == null;
                    throw new IOException("Stream closed");
                }
                buffer = nbuf;
            }
        count = pos;
        int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
        if (n > 0)
            count = n + pos;
    }

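Here is what each of the three methods does:

read: validates the arguments and then calls read1 in a loop, copying bytes into the parameter b. It returns as soon as len bytes have been read, read1 reports end of stream, or the underlying stream has no more bytes available.

read1: copies bytes from the buffer into b. If the buffer has been fully consumed, it first calls fill to reload it. When the request is at least as large as the buffer and no mark is set, it skips the buffer and reads straight from the underlying stream.

fill: reads from the underlying stream into the free space of the buffer. With no mark set it simply starts again at position 0; with a mark set it first compacts or grows the buffer (up to marklimit) so that the bytes after the mark are kept.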
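
The grow branch of fill can be observed from outside with a small subclass. This is only a sketch: GrowDemo and its nested Peek class are hypothetical helpers that expose the length of the protected buf field.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class GrowDemo {
    /** Hypothetical subclass that exposes the length of the protected buf field. */
    static final class Peek extends BufferedInputStream {
        Peek(InputStream in, int size) { super(in, size); }
        int capacity() { return buf.length; }
    }

    /** Marks the stream, reads past the initial 4-byte buffer, and reports buf.length. */
    static int capacityAfterMarkedReads() throws IOException {
        try (Peek in = new Peek(new ByteArrayInputStream(new byte[10]), 4)) {
            in.mark(16);                  // marklimit (16) is larger than the buffer (4)
            for (int i = 0; i < 5; i++) {
                in.read();                // the fifth read needs a fill() on a full buffer
            }
            return in.capacity();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("buffer length after marked reads: "
                + capacityAfterMarkedReads());
    }
}
```

Since the mark sits at position 0 and the buffer is smaller than marklimit, fill cannot discard anything and has to allocate a larger array, so the reported length should be greater than the initial 4.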

From this we can draw the following flowchart:

[Figure: flowchart of the read, read1 and fill call sequence]
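
The effect of this call chain can also be measured by counting how many bulk reads reach the wrapped stream. The sketch below assumes a hypothetical CountingInputStream wrapper (not a JDK class) and reads 1000 bytes one at a time through a 100-byte buffer.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadCountDemo {
    /** Hypothetical helper: counts bulk read calls that reach the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        int bulkReads;
        CountingInputStream(InputStream in) { super(in); }
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            bulkReads++;
            return super.read(b, off, len);
        }
    }

    /** Reads totalBytes one byte at a time through a buffer of bufferSize bytes. */
    static int underlyingReads(int totalBytes, int bufferSize) throws IOException {
        CountingInputStream counter =
                new CountingInputStream(new ByteArrayInputStream(new byte[totalBytes]));
        try (BufferedInputStream in = new BufferedInputStream(counter, bufferSize)) {
            while (in.read() != -1) {
                // served from buf; fill() touches the wrapped stream only when buf is empty
            }
        }
        return counter.bulkReads;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bulk reads on the wrapped stream: "
                + underlyingReads(1000, 100));
    }
}
```

The caller issues 1001 single-byte reads, yet the wrapped stream should only see roughly one bulk read per 100 bytes, plus a final one that detects the end of the stream.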

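Finally, let's look at what the close method does:

close first sets buf to null with a compare-and-set, which marks the stream as closed and releases the buffer, and then closes the InputStream it holds.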

    public void close() throws IOException {
        byte[] buffer;
        while ( (buffer = buf) != null) {
            if (bufUpdater.compareAndSet(this, buffer, null)) {
                InputStream input = in;
                in = null;
                if (input != null)
                    input.close();
                return;
            }
            // Else retry in case a new buf was CASed in fill()
        }
    }
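
A short sketch of what this means for callers: once buf is null, further reads fail, while a second close finds buf already null and does nothing. CloseDemo is a made-up class name.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class CloseDemo {
    /** Returns true if reading after close() fails with an IOException. */
    static boolean readFailsAfterClose() throws IOException {
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        in.read();      // fills the buffer and consumes one byte
        in.close();     // CAS buf -> null, then closes the wrapped stream
        in.close();     // buf is already null, so the loop body never runs
        try {
            in.read();  // getBufIfOpen() sees buf == null
            return false;
        } catch (IOException e) {
            System.out.println("read after close failed: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFailsAfterClose());
    }
}
```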

2. BufferedInputStream summary:

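A buffered input stream uses a byte array as its cache: it first reads content from the underlying input stream into that array in bulk, and then hands bytes out of the array to the caller. The design relies on the fact that reading many bytes from an input stream in one call is more efficient than reading a few bytes in many calls. Buffering lowers the number of reads that reach the underlying stream, so long content can still be read quickly even when the caller asks for only a few bytes at a time. The following test demonstrates that premise by reading the same file with a plain FileInputStream, first 1000 bytes per call and then 1 byte per call: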

    public static void main(String args[]) throws Exception{
        {
            long startTime = System.currentTimeMillis();
            for (int i = 1; i <= 100; i++) {
                FileInputStream fis = new FileInputStream("E:/testFile/test.txt");
                // Read the whole file, 1000 bytes per call
                for (byte[] b = new byte[1000]; fis.read(b, 0, 1000) > 0; ) {
                }
                fis.close();
            }
            long endTime = System.currentTimeMillis();
            System.out.println("Time reading 1000 bytes per call: " + (endTime - startTime));
        }
        {
            long startTime = System.currentTimeMillis();
            for (int i = 1 ; i <= 100 ; i ++) {
                FileInputStream fis = new FileInputStream("E:/testFile/test.txt");
                // Read the whole file, 1 byte per call
                for (byte[] b = new byte[1000] ; fis.read(b, 0, 1) > 0 ; ) {}
                fis.close();
            }
            long endTime = System.currentTimeMillis();
            System.out.println("Time reading 1 byte per call: " + (endTime - startTime));
        }
    }

The result (in milliseconds) is:

Time reading 1000 bytes per call: 4
Time reading 1 byte per call: 50
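
The test above uses FileInputStream only. As a complementary sketch, the same single-byte read loop can be run once against a plain FileInputStream and once through a BufferedInputStream. The temporary file, its size, and the class name BufferedReadTiming are assumptions made for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedReadTiming {
    /** Reads the whole stream one byte at a time and returns the number of bytes read. */
    static long drain(InputStream in) throws IOException {
        long total = 0;
        while (in.read() != -1) {
            total++;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("buffered-demo", ".bin"); // throwaway test file
        try {
            Files.write(file, new byte[200_000]);

            long t0 = System.nanoTime();
            try (InputStream in = new FileInputStream(file.toFile())) {
                drain(in);                       // every read() goes to the file
            }
            long t1 = System.nanoTime();
            try (InputStream in =
                     new BufferedInputStream(new FileInputStream(file.toFile()))) {
                drain(in);                       // most read() calls hit the buffer
            }
            long t2 = System.nanoTime();

            System.out.println("unbuffered, 1 byte per call (ms): " + (t1 - t0) / 1_000_000);
            System.out.println("buffered, 1 byte per call (ms):   " + (t2 - t1) / 1_000_000);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Actual timings depend on the machine, the file system and JIT warm-up, so no figures are quoted here.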

3. BufferedOutputStream

Having worked through BufferedInputStream, BufferedOutputStream is easy to understand.

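First, look at the flushBuffer method in BufferedOutputStream. It passes all bytes currently held in the buffer to the underlying output stream and resets the count to zero. It is called from write and from flush, and therefore indirectly from close().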

    private void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count);
            count = 0;
        }
    }

After that, we only need to look at the write method:

    public synchronized void write(byte b[], int off, int len) throws IOException {
        if (len >= buf.length) {
            /* If the request length exceeds the size of the output buffer,
               flush the output buffer and then write the data directly.
               In this way buffered streams will cascade harmlessly. */
            flushBuffer();
            out.write(b, off, len);
            return;
        }
        if (len > buf.length - count) {
            flushBuffer();
        }
        System.arraycopy(b, off, buf, count, len);
        count += len;
    }

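The write method is the mirror image of read in BufferedInputStream: read first loads data into the buffer array and then hands it to the caller, whereas write first collects data in the buffer and only passes it on to the file or other destination when the next write would no longer fit into the remaining space. A write that is at least as large as the whole buffer skips the buffer and goes directly to the underlying stream.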
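
Both paths through write can be observed by counting the calls that reach the underlying stream. This is a sketch under stated assumptions: CountingOutputStream is a hypothetical helper, and the chunk and buffer sizes are chosen only to make the batching visible.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteCountDemo {
    /** Hypothetical helper: counts write calls that reach the underlying stream. */
    static final class CountingOutputStream extends OutputStream {
        int writes;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
    }

    /** Writes `chunks` arrays of `chunkSize` bytes through a buffer of `bufferSize` bytes. */
    static int underlyingWrites(int chunks, int chunkSize, int bufferSize)
            throws IOException {
        CountingOutputStream counter = new CountingOutputStream();
        try (BufferedOutputStream out = new BufferedOutputStream(counter, bufferSize)) {
            byte[] chunk = new byte[chunkSize];
            for (int i = 0; i < chunks; i++) {
                out.write(chunk, 0, chunkSize);
            }
        } // close() flushes whatever is still buffered
        return counter.writes;
    }

    public static void main(String[] args) throws IOException {
        // 100 small writes of 10 bytes are batched into 100-byte blocks
        System.out.println("small chunks: " + underlyingWrites(100, 10, 100));
        // writes as large as the buffer are passed straight through
        System.out.println("large chunks: " + underlyingWrites(5, 100, 100));
    }
}
```

With a 100-byte buffer, the 100 small writes should collapse into 10 calls on the underlying stream, while the 5 buffer-sized writes each go through individually.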

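BufferedOutputStream does not define a close() method of its own; it inherits close from its parent class FilterOutputStream. That method first calls flush, which writes any buffered content to the underlying stream, and only then closes the out object it holds. In other words, close flushes before it closes, so bytes still sitting in the buffer are not lost.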

    public synchronized void flush() throws IOException {
        flushBuffer();
        out.flush();
    }
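
The order of operations during close can be checked by recording the calls that arrive at the underlying stream. The recorder below is a hypothetical anonymous OutputStream written for this sketch.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class CloseOrderDemo {
    /** Writes three bytes, closes the stream, and returns the calls seen by the target. */
    static List<String> closeEvents() throws IOException {
        List<String> events = new ArrayList<>();
        OutputStream recorder = new OutputStream() {
            @Override public void write(int b) { events.add("write 1"); }
            @Override public void write(byte[] b, int off, int len) {
                events.add("write " + len);
            }
            @Override public void flush() { events.add("flush"); }
            @Override public void close() { events.add("close"); }
        };
        BufferedOutputStream out = new BufferedOutputStream(recorder);
        out.write(new byte[] {1, 2, 3}, 0, 3); // stays in the buffer for now
        out.close();                           // inherited from FilterOutputStream
        return events;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(closeEvents());
    }
}
```

The buffered bytes should show up as a write on the recorder before its close is recorded.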

4. BufferedOutputStream summary:

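A buffered input stream reads content from the stream in bulk and hands it to the caller in pieces; a buffered output stream does the reverse, collecting the pieces the caller writes in its buffer array and then writing them to the output stream in bulk.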
