Interface BitVector

All Superinterfaces:
BigList<Boolean>, BooleanBigList, BooleanCollection, BooleanIterable, Collection<Boolean>, Comparable<BigList<? extends Boolean>>, Iterable<Boolean>, RandomAccess, Size64
All Known Implementing Classes:
AbstractBitVector, AbstractBitVector.SubBitVector, BooleanListBitVector, LongArrayBitVector, LongBigArrayBitVector

public interface BitVector extends RandomAccess, BooleanBigList
A vector of bits, a.k.a. bit sequence, bit string, binary word, etc.

This interface define several operations on finite sequences of bits. Efficient implementations, such as LongArrayBitVector, use approximately one bit of memory for each bit in the vector, but this is not enforced.

Operation of a bit vector are partially of boolean nature (e.g., logical operations between vectors), partially of language-theoretical nature (e.g., concatenation), and partially of set-theoretical nature (e.g., asking which bits are set to one). To accomodate all these points of view, this interface extends BooleanBigList, but also provides an asLongSet() method that exposes a BitSet-like view and a asLongBigList(int) method that provides integer-like access to blocks of bits of given width.

Most, if not all, classical operations on bit vectors can be seen as standard operations on these two views: for instance, the number of bits set to one is just the number of elements of the set returned by asLongSet() (albeit a direct count() method is provided, too). The standard Collection.addAll(java.util.Collection) method can be used to concatenate bit vectors, and sublist views make it easy performing any kind of logical operation on subvectors.

The only caveat is that sometimes the standard interface naming clashes slightly with standard usage: for instance, Collection.clear() will not set to zero all bits (use fill(0) for that purpose), but rather will set the vector length to zero. Also, add(long, int) will not add logically a value at the specified index, but rather will insert a new bit with the specified value at the specified position.

To increase clarity, this interface provides methods length() and length(long) which have a semantics similar to Size64.size64() and BigList.size(long).

The AbstractBitVector class provides a fairly complete abstract implementation that provides all methods except for the most basic operations. Of course, the methods of AbstractBitVector are sometimes very inefficient, but implementations such as LongArrayBitVector have their own optimised implementations.

  • Method Details

    • set

      void set(long index)
      Sets a bit in this bit vector (optional operation).
      Parameters:
      index - the index of a bit.
    • clear

      void clear(long index)
      Clears a bit in this bit vector (optional operation).
      Parameters:
      index - the index of a bit.
    • flip

      void flip(long index)
      Flips a bit in this bit vector (optional operation).
      Parameters:
      index - the index of a bit.
    • fill

      void fill(long from, long to, boolean value)
      Fills a range of bits in this bit vector (optional operation).
      Parameters:
      from - the first index (inclusive).
      to - the last index (not inclusive).
      value - the value (true or false).
    • fill

      void fill(long from, long to, int value)
      Clears a range of bits in this bit vector (optional operation).
      Parameters:
      from - the first index (inclusive).
      to - the last index (not inclusive).
      value - the value (zero or nonzero).
    • fill

      void fill(boolean value)
      Sets all bits this bit vector to the given boolean value (optional operation).
      Parameters:
      value - the value (true or false).
    • fill

      void fill(int value)
      Sets all bits this bit vector to the given integer value (optional operation).
      Parameters:
      value - the value (zero or nonzero).
    • flip

      void flip(long from, long to)
      Flips a range of bits in this bit vector (optional operation).
      Parameters:
      from - the first index (inclusive).
      to - the last index (not inclusive).
    • flip

      void flip()
      Flips all bits in this bit vector (optional operation).
    • replace

      BitVector replace(BitVector bitVector)
      Replaces the content of this bit vector with another bit vector.
      Parameters:
      bitVector - a bit vector.
      Returns:
      this bit vector.
    • subVector

      BitVector subVector(long from, long to)
      Returns a subvector view specified by initial and final index.

      The object returned by this method is a bit vector representing a view of this bit vector restricted to the given indices. Changes to the subvector will be reflected in the main vector.

      Parameters:
      from - the first index (inclusive).
      to - the last index (not inclusive).
      Returns:
      a subvector view specified by initial and final index.
    • subVector

      BitVector subVector(long from)
      Returns a subvector view specified by initial index and running up to the end of this bit vector.
      Parameters:
      from - the first index (inclusive).
      Returns:
      a subvector view specified by initial index and running up to the end of this bit vector.
      See Also:
    • asLongSet

      LongSortedSet asLongSet()
      Returns a view of this bit vector as a sorted set of long integers.

      More formally, this bit vector is infinitely extended to the left with zeros (e.g., all bits beyond length(long) are considered zeroes). The resulting infinite string is interpreted as the characteristic function of a set of integers.

      Note that, in particular, the resulting string representation is exactly that of a BitSet.

      Returns:
      a view of this bit vector as a sorted set of long integers.
      API Notes:
      Implementations are allowed to prohibit modifications outside of the current size of the bit vector.
    • asLongBigList

      LongBigList asLongBigList(int width)
      Returns a view of this bit vector as a list of nonnegative integers of specified width.

      More formally, getLong(p) will return the nonnegative integer defined by the bits starting at p * width (bit 0, inclusive) and ending at (p + 1) * width (bit width − 1, exclusive).

      Parameters:
      width - a bit width.
      Returns:
      a view of this bit vector as a list of nonnegative integers of specified width.
    • getInt

      int getInt(long index)
      Returns the value of the specified bit as an integer.

      This method is a useful synonym for BooleanBigList.getBoolean(long).

      Parameters:
      index - the index of a bit.
      Returns:
      the value of the specified bit as an integer (0 or 1).
    • getLong

      long getLong(long from, long to)
      Returns the specified bit range as a long.

      Note that bit 0 of the returned long will be bit from of this bit vector.

      Implementations are invited to provide high-speed implementations for the case in which from is a multiple of Long.SIZE and to is from + Long.SIZE (or less, in case the vector length is exceeded). This behaviour make it possible to implement high-speed hashing, copies, etc.

      Parameters:
      from - the starting bit (inclusive).
      to - the ending bit (exclusive).
      Returns:
      the long value contained in the specified bits.
    • set

      void set(long index, int value)
      Sets the value of the specified bit as an integer (optional operation).

      This method is a useful synonym for BooleanBigList.set(long, boolean).

      Parameters:
      index - the index of a bit.
      value - the new value (any nonzero integer for setting the bit, zero for clearing the bit).
    • add

      void add(long index, int value)
      Adds a bit with specified integer value at the specified index (optional operation).

      This method is a useful synonym for BooleanBigList.add(long, boolean).

      Parameters:
      index - the index of a bit.
      value - the value that will be inserted at position index (any nonzero integer for a true bit, zero for a false bit).
    • add

      void add(int value)
      Adds a bit with specified value at the end of this bit vector.

      This method is a useful synonym for BooleanList.add(boolean).

      Parameters:
      value - the new value (any nonzero integer for a true bit, zero for a false bit).
    • append

      BitVector append(long value, int k)
      Appends the less significant bits of a long integer to this bit vector.
      Parameters:
      value - a value to be appended
      k - the number of less significant bits to be added to this bit vector.
      Returns:
      this bit vector.
    • append

      BitVector append(BitVector bitVector)
      Appends another bit vector to this bit vector.
      Parameters:
      bitVector - a bit vector to be appended.
      Returns:
      this bit vector.
    • length

      long length()
      Returns the number of bits in this bit vector.

      If the number of bits in this bit vector is smaller than or equal to Integer.MAX_VALUE, this method is semantically equivalent to List.size(). In any case, this method is semantically equivalent to Size64.size64(), but it is prefererred.

      Returns:
      the number of bits in this bit vector.
    • length

      BitVector length(long newLength)
      Sets the number of bits in this bit vector.

      It is expected that this method will try to allocate exactly the necessary space.

      If the argument fits an integer, this method has the same side effects of BooleanList.size(int). In any case, this method has the same side effects of BigList.size(long), but it is preferred, as it has the advantage of returning this bit vector, thus making it possible to chain methods.

      Parameters:
      newLength - the new length in bits for this bit vector.
      Returns:
      this bit vector.
    • count

      long count()
      Counts the number of bits set to true in this bit vector.
      Returns:
      the number of bits set to true in this bit vector.
    • and

      Performs a logical and between this bit vector and another one, leaving the result in this vector.
      Parameters:
      v - a bit vector.
      Returns:
      this bit vector.
    • or

      Performs a logical or between this bit vector and another one, leaving the result in this bit vector.
      Parameters:
      v - a bit vector.
      Returns:
      this bit vector.
    • xor

      Performs a logical xor between this bit vector and another one, leaving the result in this vector.
      Parameters:
      v - a bit vector.
      Returns:
      this bit vector.
    • firstOne

      long firstOne()
      Returns the position of the first bit set in this bit vector.
      Returns:
      the first bit set, or -1 for a vector of zeroes.
    • lastOne

      long lastOne()
      Returns the position of the last bit set in this bit vector.
      Returns:
      the last bit set, or -1 for a vector of zeroes.
    • nextOne

      long nextOne(long index)
      Returns the position of the first bit set at of after the given position.
      Parameters:
      index - a bit position.
      Returns:
      the position of the first bit set at or after position index, or -1 if no such bit exists.
    • previousOne

      long previousOne(long index)
      Returns the position of the first bit set strictly before the given position.
      Parameters:
      index - a bit position.
      Returns:
      the position of the first bit set strictly before position index, or -1 if no such bit exists.
    • firstZero

      long firstZero()
      Returns the position of the first bit unset in this bit vector.
      Returns:
      the first bit unset, or -1 for a vector of ones.
    • lastZero

      long lastZero()
      Returns the position of the last bit unset in this bit vector.
      Returns:
      the last bit unset, or -1 for a vector of ones.
    • nextZero

      long nextZero(long index)
      Returns the position of the first bit unset after the given position.
      Parameters:
      index - a bit position.
      Returns:
      the first bit unset after position index (inclusive), or -1 if no such bit exists.
    • previousZero

      long previousZero(long index)
      Returns the position of the first bit unset before or at the given position.
      Parameters:
      index - a bit position.
      Returns:
      the first bit unset before or at the given position, or -1 if no such bit exists.
    • longestCommonPrefixLength

      long longestCommonPrefixLength(BitVector v)
      Returns the length of the greatest common prefix between this and the specified bit vector.
      Parameters:
      v - a bit vector.
      Returns:
      the length of the greatest common prefix.
    • isPrefix

      boolean isPrefix(BitVector v)
      Returns true if this bit vector is a prefix of the specified bit vector.
      Parameters:
      v - a bit vector.
      Returns:
      true if this bit vector is a prefix of v.
    • isProperPrefix

      boolean isProperPrefix(BitVector v)
      Returns true if this bit vector is a proper prefix of the specified bit vector.
      Parameters:
      v - a bit vector.
      Returns:
      true if this bit vector is a proper prefix of v (i.e., it is a prefix but not equal).
    • equals

      boolean equals(BitVector v, long from, long to)
      Checks for equality with a segment of another vector.
      Parameters:
      v - a bit vector.
      from - the starting bit, inclusive.
      to - the ending bit, not inclusive.
      Returns:
      true if this bit vector and v are equal in the range of positions [from..to).
    • copy

      BitVector copy(long from, long to)
      Returns a copy of a part of this bit vector.
      Parameters:
      from - the starting bit, inclusive.
      to - the ending bit, not inclusive.
      Returns:
      a copy of the part of this bit vector going from bit from (inclusive) to bit to (not inclusive)
    • copy

      BitVector copy()
      Returns a copy of this bit vector.
      Returns:
      a copy of this bit vector.
    • bits

      long[] bits()
      Returns the bits in this bit vector as an array of longs, not to be modified.
      Returns:
      an array of longs whose first length() bits contain the bits of this bit vector. The array cannot be modified.
    • hashCode

      int hashCode()
      Returns a hash code for this bit vector.

      Hash codes for bit vectors are defined as follows:

       final long length = length();
       long fullLength = length & -Long.SIZE;
       long h = 0x9e3779b97f4a7c13L ^ length;
       for(long i = 0; i < fullLength; i += Long.SIZE) h ^= (h << 5) + getLong(i, i + Long.SIZE) + (h >>> 2);
       if (length != fullLength) h ^= (h << 5) + getLong(fullLength, length) + (h >>> 2);
       (int)((h >>> 32) ^ h);
       

      The last value is the hash code of the bit vector. This hashing is based on shift-add-xor hashing (M.V. Ramakrishna and Justin Zobel, “Performance in practice of string hashing functions”, Proc. of the Fifth International Conference on Database Systems for Advanced Applications, 1997, pages 215−223).

      The returned value is not a high-quality hash such as Jenkins's, but it can be computed very quickly; in any case, 32 bits are too few for a high-quality hash to be used in large-scale applications.

      Important: all bit vector implementations are required to return the value defined here. The simplest way to obtain this result is to subclass AbstractBitVector.

      Specified by:
      hashCode in interface Collection<Boolean>
      Overrides:
      hashCode in class Object
      Returns:
      a hash code for this bit vector.
    • fast

      BitVector fast()
      Returns a fast version of this bit vector.

      Different implementations of this interface might provide different level of efficiency. For instance, views on other data structures (e.g., strings) might implement getLong(long, long) efficiently on multiples of Long.SIZE, but might fail to provide a generic, truly efficient random access.

      This method returns a (possibly immutable) bit vector with the same content as that of this bit vector. However, the returned bit vector is guaranteed to provide fast random access.

      Returns:
      a fast version of this bit vector.
    • size

      @Deprecated default int size()
      Deprecated.
      Please use Size64.size64() instead.
      Specified by:
      size in interface BigList<Boolean>
      Specified by:
      size in interface Collection<Boolean>
      Specified by:
      size in interface Size64