Package it.unimi.dsi.io
Classes in this package fulfill needs that are not satisfied by the standard I/O classes available.
Reading text
We provide replacement classes such as FastBufferedReader and classes exposing the lines of a file as an Iterable. The general WordReader interface is used by MG4J to provide customizable word segmentation.
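To illustrate what "exposing the lines of a file as an Iterable" means in practice, here is a minimal sketch built only on the standard library. It is not the implementation used in this package: the class name LinesIterableSketch and its Supplier-based constructor are hypothetical, chosen so that every call to iterator() can reopen the source and start from the first line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical stand-in: exposes the lines of a text source as an Iterable. */
final class LinesIterableSketch implements Iterable<String> {
    // Assumption: the supplier opens a fresh reader for every iteration.
    private final Supplier<Reader> source;

    LinesIterableSketch(Supplier<Reader> source) {
        this.source = source;
    }

    @Override
    public Iterator<String> iterator() {
        final BufferedReader reader = new BufferedReader(source.get());
        return new Iterator<String>() {
            // The line that the following call to next() will return, or null at end of input.
            private String pending = advance();

            private String advance() {
                try {
                    String line = reader.readLine();
                    if (line == null) reader.close(); // release the source once exhausted
                    return line;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public String next() {
                if (pending == null) throw new NoSuchElementException();
                String result = pending;
                pending = advance();
                return result;
            }
        };
    }

    public static void main(String[] args) {
        Iterable<String> lines = new LinesIterableSketch(() -> new StringReader("first\nsecond\n"));
        for (String line : lines) System.out.println(line);
    }
}
```

A file-backed variant could pass a supplier that opens the file (for instance through java.nio.file.Files.newBufferedReader) and wraps the checked IOException. The sketch returns immutable String instances, so it does not reproduce any buffer-reuse behaviour the classes of this package may have.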
Bit-level I/O
The standard Java API lacks bit-level I/O classes: for this purpose, we provide InputBitStream and OutputBitStream, which can wrap any corresponding standard Java stream and make it work at the bit level; moreover, they provide support for several useful formats (such as unary, binary, minimal binary, γ, δ and Golomb encoding).
Bit input and output streams also offer efficient buffering and a way to reposition the bit stream in case the underlying byte stream is a file-based stream or a RepositionableStream.
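The idea of wrapping a byte stream so that it works at the bit level can be sketched with the standard library alone. The code below is an illustration, not the code of InputBitStream or OutputBitStream: the names BitStreamSketch, BitWriter and BitReader are hypothetical, bits are assumed to be packed most-significant-bit first, unary coding of n is assumed to be n zeros followed by a one, and γ coding of n is assumed to be the unary code of ⌊log2(n + 1)⌋ followed by the remaining low bits of n + 1. Buffering and repositioning are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical, minimal bit-level wrappers around standard byte streams. */
final class BitStreamSketch {
    static final class BitWriter {
        private final OutputStream out;
        private int current; // bits accumulated for the byte being built
        private int used;    // number of bits held in current (0..7)

        BitWriter(OutputStream out) { this.out = out; }

        void writeBit(int bit) throws IOException {
            current = (current << 1) | (bit & 1);
            if (++used == 8) { out.write(current); current = 0; used = 0; }
        }

        /** Assumed unary code of n >= 0: n zeros followed by a one. */
        void writeUnary(int n) throws IOException {
            for (int i = 0; i < n; i++) writeBit(0);
            writeBit(1);
        }

        /** Assumed gamma code of n >= 0: with m = n + 1 and k = floor(log2 m), unary(k), then the k low bits of m. */
        void writeGamma(int n) throws IOException {
            int m = n + 1;
            int k = 31 - Integer.numberOfLeadingZeros(m);
            writeUnary(k);
            for (int i = k - 1; i >= 0; i--) writeBit((m >>> i) & 1);
        }

        /** Pads the last byte with zeros and flushes the underlying stream. */
        void flush() throws IOException {
            while (used != 0) writeBit(0);
            out.flush();
        }
    }

    static final class BitReader {
        private final InputStream in;
        private int current; // byte being consumed
        private int left;    // unread bits remaining in current

        BitReader(InputStream in) { this.in = in; }

        int readBit() throws IOException {
            if (left == 0) {
                current = in.read();
                if (current < 0) throw new EOFException();
                left = 8;
            }
            return (current >>> --left) & 1;
        }

        int readUnary() throws IOException {
            int n = 0;
            while (readBit() == 0) n++;
            return n;
        }

        int readGamma() throws IOException {
            int k = readUnary();
            int m = 1; // implicit leading one of n + 1
            for (int i = 0; i < k; i++) m = (m << 1) | readBit();
            return m - 1;
        }
    }

    /** Returns the bytes obtained by writing the given values in unary code. */
    static byte[] unaryBytes(int... values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            BitWriter writer = new BitWriter(bytes);
            for (int v : values) writer.writeUnary(v);
            writer.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the given values in gamma code and reads them back. */
    static int[] roundTripGamma(int... values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            BitWriter writer = new BitWriter(bytes);
            for (int v : values) writer.writeGamma(v);
            writer.flush();
            BitReader reader = new BitReader(new ByteArrayInputStream(bytes.toByteArray()));
            int[] decoded = new int[values.length];
            for (int i = 0; i < decoded.length; i++) decoded[i] = reader.readGamma();
            return decoded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTripGamma(0, 1, 2, 7, 1000)));
    }
}
```

Because the writer pads the final byte with zeros, a reader has to know how many values to expect; the sketch handles this by decoding exactly as many values as were written.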
Conventions
All coding methods work on natural numbers. The encoding of zero is very natural for some techniques, and much less natural for others. To keep methods rationally organized, all methods are able to encode any natural number. If, for instance, you want to write positive numbers in unary encoding and you do not want to waste a bit, you have to decrement them first (i.e., instead of p you must encode p − 1).
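The cost of this convention can be checked with a short calculation. The sketch below assumes that the unary code of a natural number n consists of n zeros followed by a one, so it takes n + 1 bits; the class UnaryConventionSketch and its methods are hypothetical helpers written for this illustration and are not part of the package.

```java
/** Hypothetical helper showing why positive numbers should be decremented before unary encoding. */
final class UnaryConventionSketch {
    /** Number of bits of the assumed unary code of n >= 0 (n zeros plus a terminating one). */
    static int unaryLength(int n) {
        if (n < 0) throw new IllegalArgumentException("Only natural numbers can be encoded: " + n);
        return n + 1;
    }

    /** Renders the assumed unary code of n >= 0 as a string of '0' and '1' characters. */
    static String unary(int n) {
        StringBuilder code = new StringBuilder(unaryLength(n));
        for (int i = 0; i < n; i++) code.append('0');
        return code.append('1').toString();
    }

    public static void main(String[] args) {
        int p = 5; // a strictly positive value: zero never occurs
        System.out.println("p     -> " + unary(p) + " (" + unaryLength(p) + " bits)");
        System.out.println("p - 1 -> " + unary(p - 1) + " (" + unaryLength(p - 1) + " bits)");
    }
}
```

Since zero never occurs among strictly positive values, encoding p directly leaves the shortest codeword unused; encoding p − 1 and adding one back after decoding saves one bit per value.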
Class Description
A bridge between byte buffers and input streams.
A queue of bytes partially stored on disk.
A debugging wrapper for input bit streams.
A debugging wrapper for output bit streams.
A word reader that breaks words on a given set of characters.
A lightweight, unsynchronised buffered reader based on mutable strings.
A wrapper exhibiting the lines of a file as an Iterable of byte arrays.
An iterator over the lines of a FileLinesByteArrayIterable.
Deprecated. Please use FileLinesMutableStringIterable instead; the zipped option of this class can be simulated by passing a GZIPInputStream as decompressor.
Deprecated. Please use FileLinesMutableStringIterable.iterator(java.io.InputStream, java.nio.charset.Charset, Class); the zipped option of this class can be simulated by passing a GZIPInputStream as decompressor.
A wrapper exhibiting the lines of a file as an Iterable of mutable strings.
An iterator over the lines of a FileLinesMutableStringIterable.
Bit-level input stream.
An adapter that exposes a fast buffered reader as an iterator over the returned lines.
A trivial WordReader that considers each line of a document a single word.
A multiple input stream.
End-of-stream-only input stream.
Throw-it-away output stream.
End-of-stream-only reader.
OfflineIterable<T,U extends T> An iterable that offers elements that were previously stored offline using specialized serialization methods.
OfflineIterable.OfflineIterator<A,B extends A> An iterator returned by an OfflineIterable.
OfflineIterable.Serializer<A,B extends A> Determines a strategy to serialize and deserialize elements.
Bit-level output stream.
A marker interface for a closeable resource that implements safety measures to make resource tracking easier.
Exhibits a single InputStream as a number of streams divided into reset()-separated segments.
An interface providing methods to break the input from a reader into words.