edu.stanford.nlp.process
Class PTBEscapingProcessor
java.lang.Object
edu.stanford.nlp.process.AbstractListProcessor
edu.stanford.nlp.process.PTBEscapingProcessor
- All Implemented Interfaces:
- Function<List<HasWord>,List<HasWord>>, ListProcessor, Processor, Serializable
public class PTBEscapingProcessor
- extends AbstractListProcessor
- implements Function<List<HasWord>,List<HasWord>>
Produces a new Document of Words in which characters that are special in
the Penn Treebank (PTB) format have been properly escaped.
- Author:
- Teg Grenager (grenager@stanford.edu)
- See Also:
- Serialized Form
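As a rough illustration of the kind of escaping described above, the following self-contained Java sketch replaces bracket tokens with their conventional PTB symbols (for example, ( becomes -LRB-) and backslash-escapes a small set of characters inside other tokens. It does not use or reproduce this class: the name PtbEscapeSketch, its methods, and the choice of * and / as the backslash-escaped characters are assumptions made for the example, and it works on plain Strings rather than HasWord objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of PTB-style escaping; not the Stanford implementation. */
final class PtbEscapeSketch {

    // Whole-token substitutions: bracket tokens map to PTB bracket symbols.
    private static final Map<String, String> STRING_SUBS = new LinkedHashMap<>();
    static {
        STRING_SUBS.put("(", "-LRB-");
        STRING_SUBS.put(")", "-RRB-");
        STRING_SUBS.put("[", "-LSB-");
        STRING_SUBS.put("]", "-RSB-");
        STRING_SUBS.put("{", "-LCB-");
        STRING_SUBS.put("}", "-RCB-");
    }

    // Characters that get a backslash prefix inside a token (assumed set).
    private static final char[] OLD_CHARS = {'*', '/'};

    private static boolean isOldChar(char c) {
        for (char old : OLD_CHARS) {
            if (old == c) {
                return true;
            }
        }
        return false;
    }

    /** Escapes one token: whole-token substitution first, then per-character escaping. */
    static String escapeToken(String token) {
        String sub = STRING_SUBS.get(token);
        if (sub != null) {
            return sub;
        }
        StringBuilder sb = new StringBuilder(token.length() + 2);
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (isOldChar(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Returns a new list of escaped tokens; the input list is left unchanged. */
    static List<String> escapeAll(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            out.add(escapeToken(token));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        tokens.add("(");
        tokens.add("1/2");
        tokens.add(")");
        // Prints: -LRB- 1\/2 -RRB-
        System.out.println(String.join(" ", escapeAll(tokens)));
    }
}
```

Like the processor documented here, the sketch returns a new list instead of modifying its input, so the original tokens remain available to the caller.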
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
stringSubs
protected Map stringSubs
oldChars
protected char[] oldChars
oldStrings
protected static final String[] oldStrings
newStrings
protected static final String[] newStrings
defaultOldChars
protected static final char[] defaultOldChars
fixQuotes
protected boolean fixQuotes
PTBEscapingProcessor
public PTBEscapingProcessor()
PTBEscapingProcessor
public PTBEscapingProcessor(Map stringSubs,
char[] oldChars,
boolean fixQuotes)
makeStringMap
protected static Map makeStringMap()
apply
public List<HasWord> apply(List<HasWord> hasWordsList)
- Escapes a List of HasWords. Implements the
Function<List<HasWord>, List<HasWord>> interface.
- Specified by:
apply
in interface Function<List<HasWord>,List<HasWord>>
process
public List process(List input)
- Description copied from interface:
ListProcessor
- Take a List (including a Sentence) of input, and return a
List that has been processed in some way.
- Specified by:
process
in interface ListProcessor
- Parameters:
input
- must be a List of objects of type HasWord
main
public static void main(String[] args)
- Escapes the contents of an input file. The input file must already be tokenized,
with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl
- Parameters:
args
- Command line argument: a file or URL
Stanford NLP Group