Java BitSet

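BitSet is suited to storing sets of non-negative integers. Each value is represented by a single bit, so each integer costs only one bit of space, which is a huge advantage over containers such as Collection. Insertion and lookup are also very convenient, because the bit index equals the integer value. For example, to store 127, bit 127 of the BitSet is set to 1. To check whether 127 is in the BitSet, an AND with a mask is all that is needed.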
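A minimal sketch of that usage with `java.util.BitSet`. The class name `BitSetBasics` and the helper `of` are made up for illustration; `set`, `get`, and `toString` are the standard BitSet methods.

```java
import java.util.BitSet;

public class BitSetBasics {
    // Hypothetical helper: stores each non-negative int as one set bit.
    public static BitSet of(int... values) {
        BitSet bits = new BitSet();
        for (int v : values) {
            bits.set(v); // bit index == integer value
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = of(3, 127);
        System.out.println(bits.get(127)); // true: 127 was stored
        System.out.println(bits.get(126)); // false: 126 was never stored
        System.out.println(bits);          // {3, 127}
    }
}
```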

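As you might expect, though, the minimum number of bits a BitSet needs is determined by the largest integer stored in it. If the values range from 0 to ten million but there are only three of them (0, 100, and ten million), a lot of space is wasted: something that should take a few dozen bytes ends up using about 1.25 MB (ten million bits). So BitSet fits best when the range of values is as small as possible, that is, when the values are as tightly clustered as possible. If they all fall between ten million and ten million and two, for instance, 3 bits are enough. This does require a conversion: ten million is stored in the BitSet as 0 and is also looked up as 0, and so on for the other values.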
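The offset conversion can be sketched as a thin wrapper around `BitSet`. The class `OffsetBitSet` and its fields and methods are hypothetical names for illustration, not part of the JDK.

```java
import java.util.BitSet;

public class OffsetBitSet {
    private final int base;              // smallest storable value, mapped to bit 0
    private final BitSet bits = new BitSet();

    public OffsetBitSet(int base) {
        this.base = base;
    }

    // Maps a value to its bit index; long arithmetic avoids int overflow.
    private long indexOf(int value) {
        return (long) value - base;
    }

    public void add(int value) {
        long idx = indexOf(value);
        if (idx < 0 || idx > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        bits.set((int) idx);
    }

    public boolean contains(int value) {
        long idx = indexOf(value);
        return idx >= 0 && idx <= Integer.MAX_VALUE && bits.get((int) idx);
    }

    // Highest set bit index plus one, i.e. how many bits are logically in use.
    public int logicalLength() {
        return bits.length();
    }

    public static void main(String[] args) {
        OffsetBitSet set = new OffsetBitSet(10_000_000);
        set.add(10_000_000);
        set.add(10_000_002);
        System.out.println(set.contains(10_000_002)); // true
        System.out.println(set.contains(10_000_001)); // false
        System.out.println(set.logicalLength());      // 3
    }
}
```

Values below `base` are simply reported as absent by `contains` and rejected by `add`; whether rejecting is the right policy depends on the use case.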

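The current Java implementation is backed by a long array. A bit is set with the mask 1L << value. Because Java uses only the low six bits of the shift count when shifting a long, this behaves like a shift by value % 64, so the mask wraps around within a single word. Before shifting, the implementation first has to work out which long in the array holds the value (index value / 64), and then applies the mask to that long. Lookup works the same way: first find which long the value lives in, then AND that long with the mask.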
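The two steps (pick the word, then apply the mask) can be sketched on a plain `long[]`. This is an illustrative reduction, not the JDK class: `LongArrayBits` and its methods are hypothetical names, and the array is fixed-size, with none of the growth logic a real BitSet has.

```java
public class LongArrayBits {
    private static final int ADDRESS_BITS_PER_WORD = 6; // 2^6 = 64 bits per long

    // Which long in the array holds this bit.
    public static int wordIndex(int bitIndex) {
        return bitIndex >> ADDRESS_BITS_PER_WORD;
    }

    // Java uses only the low 6 bits of the shift count for a long,
    // so this equals 1L << (bitIndex % 64) for non-negative bitIndex.
    public static long mask(int bitIndex) {
        return 1L << bitIndex;
    }

    public static void set(long[] words, int bitIndex) {
        words[wordIndex(bitIndex)] |= mask(bitIndex);
    }

    public static boolean get(long[] words, int bitIndex) {
        return (words[wordIndex(bitIndex)] & mask(bitIndex)) != 0;
    }

    public static void main(String[] args) {
        long[] words = new long[2];                    // room for bits 0..127
        set(words, 127);
        System.out.println(wordIndex(127));            // 1
        System.out.println(mask(127) == (1L << 63));   // true
        System.out.println(get(words, 127));           // true
        System.out.println(get(words, 63));            // false: same mask, different word
    }
}
```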

The source code for storing and querying a value:

    /**
     * Sets the bit at the specified index to {@code true}.
     *
     * @param  bitIndex a bit index
     * @throws IndexOutOfBoundsException if the specified index is negative
     * @since  JDK1.0
     */
    public void set(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        int wordIndex = wordIndex(bitIndex);
        expandTo(wordIndex);

        words[wordIndex] |= (1L << bitIndex); // Restores invariants

        checkInvariants();
    }

    /**
     * Returns the value of the bit with the specified index. The value
     * is {@code true} if the bit with the index {@code bitIndex}
     * is currently set in this {@code BitSet}; otherwise, the result
     * is {@code false}.
     *
     * @param  bitIndex   the bit index
     * @return the value of the bit with the specified index
     * @throws IndexOutOfBoundsException if the specified index is negative
     */
    public boolean get(int bitIndex) {
        if (bitIndex < 0)
            throw new IndexOutOfBoundsException("bitIndex < 0: " + bitIndex);

        checkInvariants();

        int wordIndex = wordIndex(bitIndex);
        return (wordIndex < wordsInUse)
            && ((words[wordIndex] & (1L << bitIndex)) != 0);
    }
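Two behaviors visible in the source above can be checked from the outside: `get` on an index past the words in use returns false without growing the set, and a negative index throws `IndexOutOfBoundsException`. A small sketch, with `BitSetEdgeCases` and `throwsOnNegative` as made-up names:

```java
import java.util.BitSet;

public class BitSetEdgeCases {
    // Returns true if get(-1) throws IndexOutOfBoundsException.
    public static boolean throwsOnNegative(BitSet bits) {
        try {
            bits.get(-1);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(5);
        int sizeBefore = bits.size();                    // bits of space currently allocated

        System.out.println(bits.get(1_000_000));         // false: index is past the words in use
        System.out.println(bits.size() == sizeBefore);   // true: get() did not expand the array
        System.out.println(throwsOnNegative(bits));      // true
    }
}
```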
