Package it.unimi.dsi.big.util
Class FrontCodedStringBigList
java.lang.Object
java.util.AbstractCollection<K>
it.unimi.dsi.fastutil.objects.AbstractObjectCollection<K>
it.unimi.dsi.fastutil.objects.AbstractObjectBigList<MutableString>
it.unimi.dsi.big.util.FrontCodedStringBigList
- All Implemented Interfaces:
BigList<MutableString>, ObjectBigList<MutableString>, ObjectCollection<MutableString>, ObjectIterable<MutableString>, Size64, Stack<MutableString>, Serializable, Comparable<BigList<? extends MutableString>>, Iterable<MutableString>, Collection<MutableString>, RandomAccess
public class FrontCodedStringBigList
extends AbstractObjectBigList<MutableString>
implements RandomAccess, Serializable
Compact storage of strings using front-coding compression (also known as compression by prefix omission).
This class is functionally identical to FrontCodedStringList, except for the larger size allowed.
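To make the idea of prefix omission concrete, here is a minimal sketch in plain Java. It is not the library's implementation: the class name FrontCodingSketch and its methods are hypothetical, and it assumes that a ratio of k means one string in every k is stored in full, while each other string is stored as the length of the prefix it shares with its predecessor plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative front-coding sketch (hypothetical class, not part of the library). */
final class FrontCodingSketch {
    private final List<Integer> prefixLengths = new ArrayList<>();
    private final List<String> suffixes = new ArrayList<>();
    private final int ratio;

    FrontCodingSketch(List<String> words, int ratio) {
        if (ratio < 1) throw new IllegalArgumentException("ratio must be positive");
        this.ratio = ratio;
        String previous = "";
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            int common = 0;
            // Every ratio-th string is stored whole; the others omit the shared prefix.
            if (i % ratio != 0) {
                int limit = Math.min(previous.length(), word.length());
                while (common < limit && previous.charAt(common) == word.charAt(common)) common++;
            }
            prefixLengths.add(common);
            suffixes.add(word.substring(common));
            previous = word;
        }
    }

    /** Rebuilds the string at the given index, starting from the nearest whole string. */
    String get(int index) {
        int start = index - index % ratio;
        StringBuilder current = new StringBuilder(suffixes.get(start));
        for (int i = start + 1; i <= index; i++) {
            current.setLength(prefixLengths.get(i)); // keep only the shared prefix
            current.append(suffixes.get(i));         // then append the stored suffix
        }
        return current.toString();
    }

    /** Total number of characters actually stored. */
    int storedChars() {
        int total = 0;
        for (String s : suffixes) total += s.length();
        return total;
    }
}
```

In this sketch a larger ratio stores fewer whole strings, so it saves more space when neighbouring strings share long prefixes, at the cost of replaying up to ratio - 1 suffixes on each access.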
-
Nested Class Summary
Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList
AbstractObjectBigList.ObjectRandomAccessSubList<K extends Object>, AbstractObjectBigList.ObjectSubList<K extends Object>
-
Field Summary
Modifier and Type | Field | Description
protected final ByteArrayFrontCodedBigList | byteFrontCodedBigList | The underlying ByteArrayFrontCodedBigList, or null.
protected final CharArrayFrontCodedBigList | charFrontCodedBigList | The underlying CharArrayFrontCodedBigList, or null.
static final long | serialVersionUID |
protected final boolean | utf8 | Whether this front-coded list is UTF-8 encoded.
-
Constructor Summary
Constructor | Description
FrontCodedStringBigList(Collection<? extends CharSequence> c, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences contained in the given collection.
FrontCodedStringBigList(Iterator<? extends CharSequence> words, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences returned by the given iterator.
-
Method Summary
Modifier and Type | Method | Description
protected static char[] | byte2Char(byte[] a, char[] s) |
protected static int | countUTF8Chars(byte[] a) |
void | dump |
MutableString | get(long index) | Returns the element at the specified position in this front-coded string big list as a mutable string.
void | get(long index, MutableString s) | Returns the element at the specified position in this front-coded string big list by storing it in a mutable string.
ObjectBigListIterator<MutableString> | listIterator(long k) |
static void | main |
int | ratio() | Returns the ratio of the underlying front-coded list.
long | size64() |
boolean | utf8() | Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes.
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectBigList
add, add, addAll, addAll, addElements, addElements, clear, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, forEach, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, peek, pop, push, remove, removeElements, set, setElements, size, size, subList, top, toString
Methods inherited from class java.util.AbstractCollection
containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.util.Collection
containsAll, isEmpty, parallelStream, remove, removeAll, removeIf, retainAll, stream, toArray, toArray, toArray
Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectBigList
addAll, addAll, addAll, addAll, getElements, setElements, setElements, spliterator
-
Field Details
-
serialVersionUID
public static final long serialVersionUID
-
byteFrontCodedBigList
protected final ByteArrayFrontCodedBigList byteFrontCodedBigList
The underlying ByteArrayFrontCodedBigList, or null.
-
charFrontCodedBigList
protected final CharArrayFrontCodedBigList charFrontCodedBigList
The underlying CharArrayFrontCodedBigList, or null.
-
utf8
protected final boolean utf8
Whether this front-coded list is UTF-8 encoded.
-
-
Constructor Details
-
FrontCodedStringBigList
public FrontCodedStringBigList(Iterator<? extends CharSequence> words, int ratio, boolean utf8)
Creates a new front-coded string list containing the character sequences returned by the given iterator.
- Parameters:
words - an iterator returning character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
-
FrontCodedStringBigList
public FrontCodedStringBigList(Collection<? extends CharSequence> c, int ratio, boolean utf8)
Creates a new front-coded string list containing the character sequences contained in the given collection.
- Parameters:
c - a collection containing character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
-
-
Method Details
-
utf8
public boolean utf8()
Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes.
- Returns:
true if this front-coded string list is keeping its data as an array of UTF-8 encoded bytes.
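As a rough illustration of what the UTF-8 flag trades off, the sketch below compares the space taken by one string as UTF-8 bytes and as a char array. It uses only the Java standard library; the class Utf8SizeSketch and its methods are hypothetical, and the figures ignore any per-string overhead the front-coded lists add.

```java
import java.nio.charset.StandardCharsets;

/** Compares raw storage sizes of the two encodings (hypothetical helper, not part of the library). */
final class Utf8SizeSketch {
    /** Number of bytes needed to store the string as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes needed to store the string as an array of 16-bit chars. */
    static int charArrayBytes(String s) {
        return 2 * s.length();
    }
}
```

For mostly ASCII text the UTF-8 form takes about half the space of the char form, while text dominated by characters that need three UTF-8 bytes can come out larger, so the better setting depends on the data.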
-
ratio
public int ratio()
Returns the ratio of the underlying front-coded list.
- Returns:
the ratio of the underlying front-coded list.
-
get
public MutableString get(long index)
Returns the element at the specified position in this front-coded string big list as a mutable string.
- Specified by:
get in interface BigList<MutableString>
- Parameters:
index - an index in the list.
- Returns:
a MutableString that will contain the string at the specified position. The string may be freely modified.
-
get
public void get(long index, MutableString s)
Returns the element at the specified position in this front-coded string big list by storing it in a mutable string.
- Parameters:
index - an index in the list.
s - a mutable string that will contain the string at the specified position.
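The two-argument form lets a caller reuse one buffer across many lookups instead of receiving a fresh object each time. The sketch below shows that pattern with standard-library stand-ins only: StringBuilder plays the role of MutableString and a List<String> plays the role of the front-coded list, so the class ReusableBufferSketch and its methods are hypothetical and say nothing about the library's internals.

```java
import java.util.List;

/** Buffer-reuse pattern with stand-in types (hypothetical, not part of the library). */
final class ReusableBufferSketch {
    /** Stores the element at the given index into the caller-supplied buffer, replacing its content. */
    static void get(List<String> source, int index, StringBuilder s) {
        s.setLength(0);
        s.append(source.get(index));
    }

    /** Sums the lengths of all elements while allocating a single buffer. */
    static long totalLength(List<String> source) {
        StringBuilder buffer = new StringBuilder();
        long total = 0;
        for (int i = 0; i < source.size(); i++) {
            get(source, i, buffer); // the same buffer is overwritten on every iteration
            total += buffer.length();
        }
        return total;
    }
}
```

A caller that needs to keep a retrieved value beyond the next lookup must copy it out of the shared buffer first.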
-
countUTF8Chars
protected static int countUTF8Chars(byte[] a)
-
byte2Char
protected static char[] byte2Char(byte[] a, char[] s)
-
listIterator
public ObjectBigListIterator<MutableString> listIterator(long k)
- Specified by:
listIterator in interface BigList<MutableString>
- Specified by:
listIterator in interface ObjectBigList<MutableString>
- Overrides:
listIterator in class AbstractObjectBigList<MutableString>
-
size64
public long size64()
-
dump
- Throws:
ConfigurationException
IOException
-
main
-