Class FastBufferedReader

java.lang.Object
java.io.Reader
it.unimi.dsi.io.FastBufferedReader
All Implemented Interfaces:
WordReader, Closeable, Serializable, AutoCloseable, Readable
Direct Known Subclasses:
DelimitedWordReader

public class FastBufferedReader extends Reader implements WordReader
A lightweight, unsynchronised buffered reader based on mutable strings.

This class provides buffering for readers, but it does so with purposes and an internal logic that are radically different from the ones adopted in BufferedReader.

There is no support for marking. All methods are unsynchronised. All methods returning strings do so by writing in a given MutableString.

Note that instances of this class can wrap an array or a mutable string. In this case, instances of this class may be used as a lightweight, unsynchronised alternative to CharArrayReader providing additional services such as word and line breaking.
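
For comparison, the following minimal sketch shows the offset/length semantics of the JDK's CharArrayReader, the class this reader is described as an alternative to. It uses only java.io; the class name ArrayFragmentSketch and the helper readFragment are hypothetical names introduced for illustration, and the sketch does not exercise FastBufferedReader itself.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ArrayFragmentSketch {
    /** Reads every character exposed by a reader over array[offset .. offset + length). */
    static String readFragment(char[] array, int offset, int length) {
        StringBuilder sb = new StringBuilder();
        try (CharArrayReader reader = new CharArrayReader(array, offset, length)) {
            int c;
            // read() returns -1 once the wrapped fragment is exhausted.
            while ((c = reader.read()) != -1) sb.append((char) c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] text = "fast buffered reader".toCharArray();
        // Wraps the 8 characters starting at index 5.
        System.out.println(readFragment(text, 5, 8));
    }
}
```

The array-wrapping constructors documented below take the same (array, offset, length) triple.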

Like any WordReader, this class is serialisable. The only field kept is the current buffer size, which will be used to rebuild a fast buffered reader with the same buffer size. All other fields will be reset.
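
The pattern described above (one persistent size field, everything else transient and rebuilt) can be sketched with plain Java serialisation. This is an illustration of the general technique, not the library's code: the class TransientStateSketch, its readObject hook and the roundTrip helper are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientStateSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final int bufferSize;     // the only state written to the stream
    transient char[] buffer;  // skipped by serialisation
    transient int pos;        // skipped by serialisation, so it comes back as 0

    TransientStateSketch(int bufferSize) {
        this.bufferSize = bufferSize;
        this.buffer = new char[bufferSize];
    }

    /** Rebuilds the transient buffer from the persisted size. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        buffer = new char[bufferSize];
    }

    /** Serialises and deserialises an instance in memory. */
    static TransientStateSketch roundTrip(TransientStateSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientStateSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip the copy has the same bufferSize and a fresh buffer of that length, while pos is back to zero regardless of its value in the original.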

Reading words

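This class implements WordReader in the simplest way: words are defined as maximal subsequences of characters satisfying Character.isLetterOrDigit(char). To alter this behaviour, you have two choices:

  • provide at construction time a set of additional characters that will be considered word constituents, besides those accepted by Character.isLetterOrDigit(char);
  • override isWordConstituent(char); since copy() must return a word reader with exactly matching behaviour, a subclass doing so should also override copy().
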
The second approach is of course more flexible, but the first one is particularly useful from the command line as there is a constructor accepting the additional word constituents as a string.
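
The word definition above can be made concrete with a small stdlib-only sketch. It is not the library's implementation: WordBreakSketch, split and the string-based extraConstituents parameter are hypothetical names chosen for illustration. The sketch breaks a string into (word, non-word) pairs of maximal runs, optionally treating extra characters as word constituents.

```java
import java.util.ArrayList;
import java.util.List;

public class WordBreakSketch {
    /** Letters and digits are constituents, plus any character listed in extra. */
    static boolean isWordConstituent(char c, String extra) {
        return Character.isLetterOrDigit(c) || extra.indexOf(c) >= 0;
    }

    /**
     * Splits text into {word, nonWord} pairs of maximal runs.
     * The first word is empty when the text starts with a non-word character.
     */
    static List<String[]> split(String text, String extraConstituents) {
        List<String[]> pairs = new ArrayList<>();
        final int n = text.length();
        int i = 0;
        while (i < n) {
            int start = i;
            while (i < n && isWordConstituent(text.charAt(i), extraConstituents)) i++;
            String word = text.substring(start, i);
            start = i;
            while (i < n && !isWordConstituent(text.charAt(i), extraConstituents)) i++;
            String nonWord = text.substring(start, i);
            pairs.add(new String[] { word, nonWord });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] p : split("e-mail me, now!", "-")) {
            System.out.println("[" + p[0] + "] [" + p[1] + "]");
        }
    }
}
```

With no extra constituents, "e-mail me, now!" breaks into the words e, mail, me and now; passing "-" as an extra constituent yields e-mail, me and now.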

  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • DEFAULT_BUFFER_SIZE

      public static final int DEFAULT_BUFFER_SIZE
      The default size of the internal buffer in characters (16Ki).
    • bufferSize

      protected final int bufferSize
      The buffer size (must be equal to buffer.length).
    • wordConstituents

      protected final CharSet wordConstituents
      A set of additional characters that will be considered as word constituents, besides those accepted by Character.isLetterOrDigit(int).
    • buffer

      protected transient char[] buffer
      The internal buffer.
    • pos

      protected transient int pos
      The current position in the buffer.
    • avail

      protected transient int avail
      The number of buffer characters available starting from pos.
    • reader

      protected transient Reader reader
      The underlying reader.
  • Constructor Details

    • FastBufferedReader

      public FastBufferedReader(int bufferSize)
      Creates a new fast buffered reader with a given buffer size. The wrapped reader will have to be set later using setReader(Reader).
      Parameters:
      bufferSize - the size in characters of the internal buffer (must be nonzero).
    • FastBufferedReader

      public FastBufferedReader(int bufferSize, CharSet wordConstituents)
      Creates a new fast buffered reader with a given buffer size and set of additional word constituents. The wrapped reader will have to be set later using setReader(Reader).
      Parameters:
      bufferSize - the size in characters of the internal buffer (must be nonzero).
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader()
      Creates a new fast buffered reader with a buffer of DEFAULT_BUFFER_SIZE characters. The wrapped reader will have to be set later using setReader(Reader).
    • FastBufferedReader

      public FastBufferedReader(CharSet wordConstituents)
      Creates a new fast buffered reader with a buffer of DEFAULT_BUFFER_SIZE characters and given set of additional word constituents. The wrapped reader will have to be set later using setReader(Reader).
      Parameters:
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(String wordConstituents)
      Creates a new fast buffered reader with a buffer of DEFAULT_BUFFER_SIZE characters and a set of additional word constituents specified by a string.

      Warning: it is easy to mistake this constructor for one whose semantics is the same as FastBufferedReader(MutableString), that is, wrapping the argument string in a reader.

      Parameters:
      wordConstituents - a string of characters that will be considered word constituents.
      Throws:
      IllegalArgumentException - if wordConstituents contains duplicate characters.
    • FastBufferedReader

      public FastBufferedReader(String bufferSize, String wordConstituents)
      Creates a new fast buffered reader with a given buffer size and a set of additional word constituents, both specified by strings.
      Parameters:
      bufferSize - the size in characters of the internal buffer (must be nonzero).
      wordConstituents - a string of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(Reader r, int bufferSize)
      Creates a new fast buffered reader by wrapping a given reader with a given buffer size.
      Parameters:
      r - a reader to wrap.
      bufferSize - the size in characters of the internal buffer.
    • FastBufferedReader

      public FastBufferedReader(Reader r, int bufferSize, CharSet wordConstituents)
      Creates a new fast buffered reader by wrapping a given reader with a given buffer size and using a set of additional word constituents.
      Parameters:
      r - a reader to wrap.
      bufferSize - the size in characters of the internal buffer (must be nonzero).
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(Reader r)
      Creates a new fast buffered reader by wrapping a given reader with a buffer of DEFAULT_BUFFER_SIZE characters.
      Parameters:
      r - a reader to wrap.
    • FastBufferedReader

      public FastBufferedReader(Reader r, CharSet wordConstituents)
      Creates a new fast buffered reader by wrapping a given reader with a buffer of DEFAULT_BUFFER_SIZE characters and using a set of additional word constituents.
      Parameters:
      r - a reader to wrap.
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(char[] array, int offset, int length, CharSet wordConstituents)
      Creates a new fast buffered reader by wrapping a given fragment of a character array and using a set of additional word constituents.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      array - the array that will be wrapped by the reader.
      offset - the first character to be used.
      length - the number of characters to be used.
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(char[] array, int offset, int length)
      Creates a new fast buffered reader by wrapping a given fragment of a character array.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      array - the array that will be wrapped by the reader.
      offset - the first character to be used.
      length - the number of characters to be used.
    • FastBufferedReader

      public FastBufferedReader(char[] array, CharSet wordConstituents)
      Creates a new fast buffered reader by wrapping a given character array and using a set of additional word constituents.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      array - the array that will be wrapped by the reader.
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(char[] array)
      Creates a new fast buffered reader by wrapping a given character array.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      array - the array that will be wrapped by the reader.
    • FastBufferedReader

      public FastBufferedReader(MutableString s, CharSet wordConstituents)
      Creates a new fast buffered reader by wrapping a given mutable string and using a set of additional word constituents.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      s - the mutable string that will be wrapped by the reader.
      wordConstituents - a set of characters that will be considered word constituents.
    • FastBufferedReader

      public FastBufferedReader(MutableString s)
      Creates a new fast buffered reader by wrapping a given mutable string.

      The effect of setReader(Reader) on a buffer created with this constructor is undefined.

      Parameters:
      s - the mutable string that will be wrapped by the reader.
  • Method Details

    • copy

      public FastBufferedReader copy()
      Description copied from interface: WordReader
      Returns a copy of this word reader.

      This method must return a word reader with a behaviour that matches exactly that of this word reader.

      Specified by:
      copy in interface WordReader
      Returns:
      a copy of this word reader.
    • noMoreCharacters

      protected boolean noMoreCharacters() throws IOException
      Checks whether no more characters will be returned.
      Returns:
      true if there are no characters in the internal buffer and the underlying reader is exhausted.
      Throws:
      IOException
    • read

      public int read() throws IOException
      Overrides:
      read in class Reader
      Throws:
      IOException
    • read

      public int read(char[] b, int offset, int length) throws IOException
      Specified by:
      read in class Reader
      Throws:
      IOException
    • readLine

      public MutableString readLine(MutableString s) throws IOException
      Reads a line into the given mutable string.

      The next line of input (defined as in BufferedReader.readLine()) will be stored into s. Note that if s is not loose this method will be quite inefficient.

      Parameters:
      s - a mutable string that will be used to store the next line (which could be empty).
      Returns:
      s, or null if the end of file was found, in which case s is unchanged.
      Throws:
      IOException
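
      Since the line definition is borrowed from BufferedReader.readLine(), it can be checked against the JDK class itself. The sketch below uses only java.io; LineBreakSketch and lines are hypothetical names, and the code does not call FastBufferedReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class LineBreakSketch {
    /** Collects the lines of text as BufferedReader.readLine() defines them. */
    static List<String> lines(String text) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // A line ends at "\n", "\r", or "\r\n"; the terminator is not returned.
            while ((line = reader.readLine()) != null) result.add(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lines("one\ntwo\r\n\rfour"));
    }
}
```

      The input "one\ntwo\r\n\rfour" yields four lines, the third of which is empty; a trailing terminator does not produce an extra empty line.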
    • isWordConstituent

      protected boolean isWordConstituent(char c)
      Returns whether the given character is a word constituent.

      The behaviour of this FastBufferedReader as a WordReader can be radically changed by overriding this method.

      Parameters:
      c - a character.
      Returns:
      whether c should be considered a word constituent.
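
      How overriding a constituent predicate changes word breaking can be sketched without the library. Tokenizer below is a hypothetical stand-in for a word reader, not FastBufferedReader; only the shape of the protected isWordConstituent(char) hook mirrors the method documented here.

```java
public class ConstituentOverrideSketch {
    /** Hypothetical stand-in for a word reader with an overridable predicate. */
    static class Tokenizer {
        protected boolean isWordConstituent(char c) {
            return Character.isLetterOrDigit(c);
        }

        /** Returns the first maximal run of word constituents in text. */
        String firstWord(String text) {
            int i = 0;
            // Skip any leading non-word characters.
            while (i < text.length() && !isWordConstituent(text.charAt(i))) i++;
            int start = i;
            while (i < text.length() && isWordConstituent(text.charAt(i))) i++;
            return text.substring(start, i);
        }
    }

    /** Subclass that additionally treats the underscore as part of a word. */
    static class UnderscoreTokenizer extends Tokenizer {
        @Override
        protected boolean isWordConstituent(char c) {
            return c == '_' || super.isWordConstituent(c);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Tokenizer().firstWord(" snake_case!"));
        System.out.println(new UnderscoreTokenizer().firstWord(" snake_case!"));
    }
}
```

      The base tokenizer stops at the underscore and returns snake, whereas the subclass returns snake_case.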
    • next

      public boolean next(MutableString word, MutableString nonWord) throws IOException
      Description copied from interface: WordReader
      Extracts the next word and non-word.

      If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. It is acceptable that the first call to this method after creation or after a call to WordReader.setReader(Reader) returns an empty word. In other words, both word and nonWord are maximal.

      Specified by:
      next in interface WordReader
      Parameters:
      word - the next word returned by the underlying reader.
      nonWord - the nonword following the next word returned by the underlying reader.
      Returns:
      true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
      Throws:
      IOException
    • setReader

      public FastBufferedReader setReader(Reader reader)
      Description copied from interface: WordReader
      Resets the internal state of this word reader, which will start again reading from the given reader.
      Specified by:
      setReader in interface WordReader
      Parameters:
      reader - the new reader providing characters.
      Returns:
      this word reader.
    • skip

      public long skip(long n) throws IOException
      Overrides:
      skip in class Reader
      Throws:
      IOException
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in class Reader
      Throws:
      IOException
    • toSpec

      public String toSpec()
    • toString

      public String toString()
      Overrides:
      toString in class Object