Interface WordReader

All Superinterfaces:
Serializable
All Known Implementing Classes:
DelimitedWordReader, FastBufferedReader, LineWordReader

public interface WordReader extends Serializable
An interface providing methods to break the input from a reader into words.

Implementations of this interface are intended to decorate a given reader (see, for instance, FastBufferedReader). The reader can be changed at any time using setReader(Reader).

This interface is heavily oriented towards reusability and streaming. It is designed so that at most one method call is needed per word, rather than per character, and so that implementations can avoid object creation entirely: the same word reader can be reused on new input by explicitly setting the underlying reader.

The standard implementation (FastBufferedReader) breaks words in the trivial way. More complex implementations (e.g., for languages requiring segmentation) can subclass FastBufferedReader or provide their own implementation.

  • Method Summary

    Modifier and Type    Method                                             Description
    WordReader           copy()                                             Returns a copy of this word reader.
    boolean              next(MutableString word, MutableString nonWord)    Extracts the next word and non-word.
    WordReader           setReader(Reader reader)                           Resets the internal state of this word reader, which will start again reading from the given reader.
  • Method Details

    • next

      boolean next(MutableString word, MutableString nonWord) throws IOException
      Extracts the next word and non-word.

      If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. The one exception is the first call to this method after creation or after a call to setReader(Reader), which may return an empty word. This follows from both word and nonWord being maximal: when the input starts with non-word characters, the first word is empty.

      Parameters:
      word - the next word returned by the underlying reader.
      nonWord - the non-word following the next word returned by the underlying reader.
      Returns:
      true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
      Throws:
      IOException
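
      The contract of next can be illustrated with a minimal sketch. It does not use this library's classes: SketchWordReader and LetterDigitWordReader are hypothetical stand-ins, StringBuilder replaces MutableString, and treating runs of letters or digits as words is an assumption made only for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for WordReader; StringBuilder replaces MutableString. */
interface SketchWordReader {
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException;
    SketchWordReader setReader(Reader reader);
}

/** Assumed splitting rule: words are maximal runs of letters or digits. */
class LetterDigitWordReader implements SketchWordReader {
    private static final int NONE = -2; // no character read ahead
    private Reader reader;
    private int lookahead = NONE;

    @Override
    public SketchWordReader setReader(Reader reader) {
        this.reader = reader;
        lookahead = NONE;
        return this;
    }

    private int peek() throws IOException {
        if (lookahead == NONE) lookahead = reader.read();
        return lookahead;
    }

    @Override
    public boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        if (peek() == -1) return false; // end of input: both buffers stay unchanged
        word.setLength(0);
        nonWord.setLength(0);
        // Maximal word: consume as long as the next character is a letter or digit.
        while (peek() != -1 && Character.isLetterOrDigit(peek())) {
            word.append((char) lookahead);
            lookahead = NONE;
        }
        // Maximal non-word: consume everything up to the start of the next word.
        while (peek() != -1 && !Character.isLetterOrDigit(peek())) {
            nonWord.append((char) lookahead);
            lookahead = NONE;
        }
        return true;
    }
}

class NextDemo {
    /** Returns one "[word|nonWord]" entry per successful call to next. */
    static List<String> tokens(String text) {
        SketchWordReader wr = new LetterDigitWordReader().setReader(new StringReader(text));
        StringBuilder word = new StringBuilder(), nonWord = new StringBuilder();
        List<String> out = new ArrayList<>();
        try {
            // One call per word; the two buffers are reused across calls.
            while (wr.next(word, nonWord)) out.add("[" + word + "|" + nonWord + "]");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

      With this sketch, NextDemo.tokens(" hello, world") yields [| ], [hello|, ] and [world|]: the leading space makes the first word empty, and the last word has an empty non-word.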
    • setReader

      WordReader setReader(Reader reader)
      Resets the internal state of this word reader, which will start again reading from the given reader.
      Parameters:
      reader - the new reader providing characters.
      Returns:
      this word reader.
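
      Because setReader returns this word reader, one instance and one pair of buffers can serve many inputs. The sketch below shows the idea with hypothetical stand-ins (ReusableWordReader, WhitespaceWordReader, and StringBuilder in place of MutableString); the whitespace splitting rule and the one-character lookahead are assumptions of the example, not a description of any implementation in this library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for WordReader; StringBuilder replaces MutableString. */
interface ReusableWordReader {
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException;
    ReusableWordReader setReader(Reader reader);
}

/** Assumed splitting rule: words are maximal runs of non-whitespace characters. */
class WhitespaceWordReader implements ReusableWordReader {
    private static final int NONE = -2; // no character read ahead
    private Reader reader;
    private int lookahead = NONE;

    @Override
    public ReusableWordReader setReader(Reader reader) {
        this.reader = reader;
        lookahead = NONE; // discard anything read ahead from the previous reader
        return this;
    }

    private int peek() throws IOException {
        if (lookahead == NONE) lookahead = reader.read();
        return lookahead;
    }

    @Override
    public boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        if (peek() == -1) return false;
        word.setLength(0);
        nonWord.setLength(0);
        while (peek() != -1 && !Character.isWhitespace(peek())) {
            word.append((char) lookahead);
            lookahead = NONE;
        }
        while (peek() != -1 && Character.isWhitespace(peek())) {
            nonWord.append((char) lookahead);
            lookahead = NONE;
        }
        return true;
    }
}

class ReuseDemo {
    /** Collects the first non-empty word of each document, reusing one reader. */
    static List<String> firstWords(String... documents) {
        ReusableWordReader wr = new WhitespaceWordReader();
        StringBuilder word = new StringBuilder(), nonWord = new StringBuilder();
        List<String> out = new ArrayList<>();
        try {
            for (String document : documents) {
                // Switching readers mid-stream is fine: setReader resets the state.
                wr.setReader(new StringReader(document));
                String first = null;
                while (first == null && wr.next(word, nonWord)) {
                    if (word.length() > 0) first = word.toString();
                }
                if (first != null) out.add(first);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

      In the sketch each document is abandoned after its first word, so the reader still holds a character read ahead from the old input; resetting that state in setReader is what keeps the next document from starting with stale data.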
    • copy

      WordReader copy()
      Returns a copy of this word reader.

      This method must return a word reader with a behaviour that matches exactly that of this word reader.

      Returns:
      a copy of this word reader.
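
      One way to meet this requirement is for the copy to carry over the configuration that determines how words are split while sharing no reading state with the original. The following is a minimal sketch under that assumption; CopyableWordReader and DelimiterWordReader are hypothetical names, StringBuilder replaces MutableString, and leaving the copy without a reader until setReader is called is a choice made for this example only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for WordReader; StringBuilder replaces MutableString. */
interface CopyableWordReader {
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException;
    CopyableWordReader setReader(Reader reader);
    CopyableWordReader copy();
}

/** Assumed splitting rule: words are maximal runs of non-delimiter characters. */
class DelimiterWordReader implements CopyableWordReader {
    private static final int NONE = -2; // no character read ahead
    private final String delimiters;    // configuration, preserved by copy()
    private Reader reader;              // reading state, never shared
    private int lookahead = NONE;

    DelimiterWordReader(String delimiters) {
        this.delimiters = delimiters;
    }

    @Override
    public CopyableWordReader setReader(Reader reader) {
        this.reader = reader;
        lookahead = NONE;
        return this;
    }

    @Override
    public CopyableWordReader copy() {
        // Same delimiters, hence the same splitting behaviour; fresh state.
        return new DelimiterWordReader(delimiters);
    }

    private int peek() throws IOException {
        if (lookahead == NONE) lookahead = reader.read();
        return lookahead;
    }

    @Override
    public boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        if (peek() == -1) return false;
        word.setLength(0);
        nonWord.setLength(0);
        while (peek() != -1 && delimiters.indexOf(peek()) < 0) {
            word.append((char) lookahead);
            lookahead = NONE;
        }
        while (peek() != -1 && delimiters.indexOf(peek()) >= 0) {
            nonWord.append((char) lookahead);
            lookahead = NONE;
        }
        return true;
    }
}

class CopyDemo {
    /** Alternates between a reader and its copy, each on its own input. */
    static String interleave(String left, String right) {
        CopyableWordReader a = new DelimiterWordReader(",;").setReader(new StringReader(left));
        CopyableWordReader b = a.copy().setReader(new StringReader(right));
        StringBuilder word = new StringBuilder(), nonWord = new StringBuilder();
        StringBuilder out = new StringBuilder();
        boolean moreA = true, moreB = true;
        try {
            while (moreA || moreB) {
                if (moreA && (moreA = a.next(word, nonWord))) out.append(word).append(' ');
                if (moreB && (moreB = b.next(word, nonWord))) out.append(word).append(' ');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString().trim();
    }
}
```

      In this sketch, CopyDemo.interleave("a,b", "c;d") returns "a c b d": the copy splits on the same delimiters as the original, and advancing one reader does not disturb the other.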