edu.stanford.nlp.parser.lexparser
Class ChineseUnknownWordModel
java.lang.Object
edu.stanford.nlp.parser.lexparser.ChineseUnknownWordModel
- All Implemented Interfaces:
- Serializable
public class ChineseUnknownWordModel
- extends Object
- implements Serializable
Stores, trains, and scores with an unknown word model. A small set
of filters deterministically forces the rewrite for certain proper
nouns, dates, and cardinal and ordinal numbers; when no filter
applies, the model uses either the distribution of terminals that
share the word's first character or Good-Turing smoothing. Although
this model was developed for Chinese, its training and storage
methods could be used cross-linguistically.
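The fallback described above can be illustrated with a minimal, self-contained sketch. It is not the code of this class: the class name FirstCharUnknownWordSketch and the methods observe and logProb are hypothetical. It also assumes that "the distribution of terminals with the same first character" means an estimate of P(first character | tag), and it uses a simplified Good-Turing estimate in which the probability mass reserved for unseen first characters is the fraction of a tag's tokens whose first character occurred exactly once.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a first-character unknown word model.
 * This is an illustration only, not the Stanford implementation.
 */
final class FirstCharUnknownWordSketch {
    // tag -> (first code point of the word -> count); names are assumptions
    private final Map<String, Map<Integer, Integer>> counts = new HashMap<>();
    // tag -> total number of training tokens seen with that tag
    private final Map<String, Integer> tagTotals = new HashMap<>();

    /** Records one (word, tag) pair from the training data. */
    void observe(String word, String tag) {
        if (word.isEmpty()) {
            return;
        }
        int first = word.codePointAt(0);
        counts.computeIfAbsent(tag, t -> new HashMap<>()).merge(first, 1, Integer::sum);
        tagTotals.merge(tag, 1, Integer::sum);
    }

    /**
     * Returns an estimate of log P(first character of word | tag).
     * Simplified Good-Turing: the unseen mass is N1 / N, where N1 is the
     * number of first characters seen exactly once with this tag and N is
     * the tag's token count. Seen characters share the remaining mass in
     * proportion to their counts.
     */
    double logProb(String word, String tag) {
        Map<Integer, Integer> byChar = counts.get(tag);
        if (byChar == null || word.isEmpty()) {
            return Double.NEGATIVE_INFINITY; // tag never observed in training
        }
        int total = tagTotals.get(tag);
        long singletons = byChar.values().stream().filter(c -> c == 1).count();
        double unseenMass = (double) singletons / total;
        Integer count = byChar.get(word.codePointAt(0));
        if (count == null) {
            return Math.log(unseenMass); // first character unseen with this tag
        }
        return Math.log((1.0 - unseenMass) * count / total);
    }
}
```

In this sketch a word is scored only by its first character, so two unknown words that begin with the same character receive the same score under a given tag. The deterministic filters for proper nouns, dates, and numbers are deliberately left out, since their rules are not documented here.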
- Author:
- Roger Levy
- See Also:
- Serialized Form
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ChineseUnknownWordModel
public ChineseUnknownWordModel()
score
public double score(IntTaggedWord itw)
score
public double score(TaggedWord tw)
train
public void train(Collection trees)
- Trains the first-character-based unknown word model.
- Parameters:
- trees - the collection of trees to train on
main
public static void main(String[] args)
Stanford NLP Group