Class FrontCodedStringBigList

All Implemented Interfaces:
BigList<MutableString>, ObjectBigList<MutableString>, ObjectCollection<MutableString>, ObjectIterable<MutableString>, Size64, Stack<MutableString>, Serializable, Comparable<BigList<? extends MutableString>>, Iterable<MutableString>, Collection<MutableString>, RandomAccess

public class FrontCodedStringBigList extends AbstractObjectBigList<MutableString> implements RandomAccess, Serializable
Compact storage of strings using front-coding compression (also known as compression by prefix omission).

This class is functionally identical to FrontCodedStringList, except that it is a big list: elements are indexed by long, so it can hold more elements than a standard list.
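
To make the idea of front coding concrete, here is a minimal, self-contained sketch of the technique. It is not this class's implementation: the names FrontCodingSketch, Entry, encode and get are hypothetical, and the in-memory layout (a list of prefix-length/suffix pairs) is an assumption chosen for readability. It stores one string in full every ratio positions and, for the others, only the length of the prefix shared with the previous string plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of front coding; all names here are hypothetical.
final class FrontCodingSketch {

    /** One encoded entry: length of the prefix shared with the previous string, and the rest. */
    static final class Entry {
        final int shared;
        final String suffix;

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }

        @Override
        public String toString() {
            return "(" + shared + ", \"" + suffix + "\")";
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    /** Encodes the words; every ratio-th word (positions 0, ratio, 2*ratio, ...) is kept whole. */
    static List<Entry> encode(List<String> words, int ratio) {
        List<Entry> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String w = words.get(i);
            if (i % ratio == 0) {
                out.add(new Entry(0, w));
            } else {
                int p = commonPrefix(words.get(i - 1), w);
                out.add(new Entry(p, w.substring(p)));
            }
        }
        return out;
    }

    /** Decodes the word at index by replaying entries from the start of its block. */
    static String get(List<Entry> coded, int ratio, int index) {
        int start = index - index % ratio;
        StringBuilder sb = new StringBuilder(coded.get(start).suffix);
        for (int i = start + 1; i <= index; i++) {
            Entry e = coded.get(i);
            sb.setLength(e.shared); // keep only the shared prefix
            sb.append(e.suffix);    // then append what differs
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = List.of("car", "card", "care", "careful", "cat");
        int ratio = 4;
        List<Entry> coded = encode(words, ratio);
        for (int i = 0; i < words.size(); i++) {
            System.out.println(coded.get(i) + " -> " + get(coded, ratio, i));
        }
    }
}
```

In this sketch, compression works well only when adjacent strings share long prefixes, as they do when the input is sorted. A larger ratio stores fewer whole strings but makes get replay more entries per access.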

  • Constructor Details

    • FrontCodedStringBigList

      public FrontCodedStringBigList(Iterator<? extends CharSequence> words, int ratio, boolean utf8)
      Creates a new front-coded string list containing the character sequences returned by the given iterator.
      Parameters:
      words - an iterator returning character sequences.
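      ratio - the desired ratio: one string out of every ratio is stored in full, and the others are stored as differences from their predecessor.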
      utf8 - if true, the strings will be stored as UTF-8 byte arrays.
    • FrontCodedStringBigList

      public FrontCodedStringBigList(Collection<? extends CharSequence> c, int ratio, boolean utf8)
      Creates a new front-coded string list containing the character sequences contained in the given collection.
      Parameters:
      c - a collection containing character sequences.
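      ratio - the desired ratio: one string out of every ratio is stored in full, and the others are stored as differences from their predecessor.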
      utf8 - if true, the strings will be stored as UTF-8 byte arrays.
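
The utf8 flag in the constructors above trades one storage form for another. The sketch below shows only the general size trade-off between 16-bit chars and UTF-8 bytes for a single string; it makes no claim about this class's internal layout, and the names Utf8StorageSketch, charStorageBytes and utf8StorageBytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch; all names here are hypothetical.
final class Utf8StorageSketch {

    /** Bytes needed if every char of s is stored as a 16-bit value. */
    static int charStorageBytes(String s) {
        return 2 * s.length();
    }

    /** Bytes needed if s is stored as UTF-8. */
    static int utf8StorageBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII text, Latin text with one accented letter, and three CJK characters.
        String[] samples = { "hello", "caff\u00e8", "\u65e5\u672c\u8a9e" };
        for (String s : samples) {
            System.out.println(s + ": chars=" + charStorageBytes(s) + " bytes, utf8=" + utf8StorageBytes(s) + " bytes");
        }
    }
}
```

Mostly-ASCII strings take about half the space as UTF-8, while text dominated by characters that need three UTF-8 bytes can take more space than 16-bit chars would.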
  • Method Details

    • utf8

      public boolean utf8()
      Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes.
      Returns:
      true if this front-coded string list is keeping its data as an array of UTF-8 encoded bytes.
    • ratio

      public int ratio()
      Returns the ratio of the underlying front-coded list.
      Returns:
      the ratio of the underlying front-coded list.
    • get

      public MutableString get(long index)
      Returns the element at the specified position in this front-coded string big list as a mutable string.
      Specified by:
      get in interface BigList<MutableString>
      Parameters:
      index - an index in the list.
      Returns:
      a MutableString that will contain the string at the specified position. The string may be freely modified.
    • get

      public void get(long index, MutableString s)
      Stores the element at the specified position in this front-coded string big list in the given mutable string.
      Parameters:
      index - an index in the list.
      s - a mutable string that will contain the string at the specified position.
    • countUTF8Chars

      protected static int countUTF8Chars(byte[] a)
    • byte2Char

      protected static char[] byte2Char(byte[] a, char[] s)
    • listIterator

      public ObjectBigListIterator<MutableString> listIterator(long k)
      Specified by:
      listIterator in interface BigList<MutableString>
      Specified by:
      listIterator in interface ObjectBigList<MutableString>
      Overrides:
      listIterator in class AbstractObjectBigList<MutableString>
    • size64

      public long size64()
      Specified by:
      size64 in interface Size64
    • dump

      public void dump(String basename) throws ConfigurationException, IOException
      Throws:
      ConfigurationException
      IOException
    • main

      public static void main(String[] arg) throws IOException, com.martiansoftware.jsap.JSAPException, NoSuchMethodException
      Throws:
      IOException
      com.martiansoftware.jsap.JSAPException
      NoSuchMethodException