Package it.unimi.dsi.io
Class LineWordReader
java.lang.Object
it.unimi.dsi.io.LineWordReader
- All Implemented Interfaces:
WordReader, Serializable
A trivial WordReader that considers each line of a document a single word.
This class is intended for indexing data such as lists of document identifiers: if the identifiers contain nonalphabetical characters, the default FastBufferedReader might do a poor job, for instance by breaking a single identifier into several words.
Note that the non-word returned by next(MutableString, MutableString) is always empty.
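The difference between the two tokenization strategies can be sketched with the standard library alone. The class LineAsWordSketch and its helpers lineWords and alnumWords are hypothetical names introduced for illustration; they are not part of it.unimi.dsi.io, and alnumWords only approximates a tokenizer that treats letters and digits as word characters.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class LineAsWordSketch {
    // Hypothetical helper: every line of the input is one word.
    static List<String> lineWords(String text) {
        return new BufferedReader(new StringReader(text))
                .lines()
                .collect(Collectors.toList());
    }

    // Hypothetical helper: words are maximal runs of letters and digits
    // (an assumption standing in for an alphanumeric tokenizer).
    static List<String> alnumWords(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty()) words.add(w);
        }
        return words;
    }

    public static void main(String[] args) {
        String ids = "doc-001/a\nuser@example.org\n";
        System.out.println(lineWords(ids));  // [doc-001/a, user@example.org]
        System.out.println(alnumWords(ids)); // [doc, 001, a, user, example, org]
    }
}
```

With line-based splitting each identifier survives as one indexable term, whereas the alphanumeric split scatters it across several terms.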
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
- copy(): Returns a copy of this word reader.
- boolean next(MutableString word, MutableString nonWord): Extracts the next word and non-word.
- setReader(Reader reader): Resets the internal state of this word reader, which will start again reading from the given reader.
-
Constructor Details
-
LineWordReader
public LineWordReader()
-
-
Method Details
-
next
Description copied from interface: WordReader
Extracts the next word and non-word.
If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. It is acceptable that the first call to this method after creation or after a call to WordReader.setReader(Reader) returns an empty word. In other words, both word and nonWord are maximal.
- Specified by:
next in interface WordReader
- Parameters:
word - the next word returned by the underlying reader.
nonWord - the non-word following the next word returned by the underlying reader.
- Returns:
true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
- Throws:
IOException
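The contract above (fill word, fill nonWord, return false at end of input and leave both buffers untouched) can be sketched as a read loop. This is a minimal model under stated assumptions: LineWords is a hypothetical stand-in written with StringBuilder in place of MutableString, and it is not the library's implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class NextContractSketch {
    // Hypothetical stand-in for a line-oriented word reader.
    static final class LineWords {
        private BufferedReader in;

        LineWords setReader(Reader reader) {
            in = new BufferedReader(reader);
            return this;
        }

        // Puts the next line into word and empties nonWord. At end of
        // input it returns false and leaves both buffers unchanged.
        boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
            String line = in.readLine();
            if (line == null) return false;
            word.setLength(0);
            word.append(line);
            nonWord.setLength(0); // the non-word is always empty
            return true;
        }
    }

    // Collects every word produced by the stand-in reader.
    static List<String> readAll(String text) {
        LineWords reader = new LineWords().setReader(new StringReader(text));
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        List<String> out = new ArrayList<>();
        try {
            while (reader.next(word, nonWord)) out.add(word.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readAll("id:1\nid:2\n")); // [id:1, id:2]
    }
}
```

Reusing the same two buffers across calls is the point of the two-argument signature: the caller allocates once and the reader overwrites in place.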
-
setReader
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.
- Specified by:
setReader in interface WordReader
- Parameters:
reader - the new reader providing characters.
- Returns:
this word reader.
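The reset semantics can be illustrated with a small model: after a call to setReader, reading restarts from the new source, and because the method returns the reader itself the call can be chained. LineWords, nextLine and firstWordAfterSwitch are hypothetical names for this sketch and do not belong to the library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class SetReaderSketch {
    // Hypothetical stand-in for a line-oriented word reader.
    static final class LineWords {
        private BufferedReader in;

        // Drops the previous source and starts reading from the new one.
        LineWords setReader(Reader reader) {
            in = new BufferedReader(reader);
            return this; // allows chaining
        }

        // Returns the next line, or null at end of input.
        String nextLine() {
            try {
                return in.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Reads one word from the first source, switches, and returns the
    // first word of the second source.
    static String firstWordAfterSwitch(String first, String second) {
        LineWords reader = new LineWords().setReader(new StringReader(first));
        reader.nextLine(); // consume one word; the rest of 'first' is abandoned
        return reader.setReader(new StringReader(second)).nextLine();
    }

    public static void main(String[] args) {
        System.out.println(firstWordAfterSwitch("a\nb\n", "x\ny\n")); // x
    }
}
```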
-
copy
Description copied from interface: WordReader
Returns a copy of this word reader.
This method must return a word reader with a behaviour that matches exactly that of this word reader.
- Specified by:
copy in interface WordReader
- Returns:
a copy of this word reader.
-