edu.stanford.nlp.parser.lexparser
Class EnglishTreebankParserParams

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
      extended by edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams
All Implemented Interfaces:
TreebankLangParserParams, Serializable

public class EnglishTreebankParserParams
extends AbstractTreebankParserParams

Parser parameters for the Penn English Treebank (WSJ, Brown, Switchboard).

Author:
Roger Levy, Christopher Manning
See Also:
Serialized Form

Nested Class Summary
protected  class EnglishTreebankParserParams.EnglishSubcategoryStripper
           
static class EnglishTreebankParserParams.EnglishTest
           
static class EnglishTreebankParserParams.EnglishTrain
           
 
Nested classes/interfaces inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
AbstractTreebankParserParams.SubcategoryStripper
 
Field Summary
 
Fields inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
inputEncoding, outputEncoding, tlp
 
Constructor Summary
EnglishTreebankParserParams()
           
 
Method Summary
 TreeTransformer collinizer()
          The tree transformer used to produce trees for evaluation.
 TreeTransformer collinizerEvalb()
          The tree transformer used to produce trees for evaluation (evalb variant).
 List defaultTestSentence()
          Returns a default sentence for the language (for testing).
 DiskTreebank diskTreebank()
          Allows you to read in trees from the source you want.
 void display()
          Displays language-specific settings.
 HeadFinder headFinder()
          Returns the HeadFinder to use for your treebank.
static void main(String[] args)
           
 MemoryTreebank memoryTreebank()
          Allows you to read in trees from the source you want.
 PrintWriter pw(OutputStream o)
          The PrintWriter used to print output to OutputStream o.
 int setOptionFlag(String[] args, int i)
          Set language-specific options according to flags.
 String[] sisterSplitters()
          Returns the splitting strings used for selective splits.
 TreeTransformer subcategoryStripper()
          Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.
 MemoryTreebank testMemoryTreebank()
          Returns a MemoryTreebank appropriate to the testing treebank source.
 Tree transformTree(Tree t, Tree root)
          This method does language-specific tree transformations such as annotating particular nodes with language-relevant features.
 TreebankLanguagePack treebankLanguagePack()
          Returns the TreebankLanguagePack, which contains treebank-specific (but not parser-specific) information, such as what counts as punctuation, and information about the structure of labels.
 TreeReaderFactory treeReaderFactory()
          Makes an appropriate TreeReaderFactory with all options specified.
 
Methods inherited from class edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams
dependencyGrammarExtractor, dependencyObjectify, getInputEncoding, getOutputEncoding, lex, lex, MLEDependencyGrammarSmoothingParams, parsevalObjectify, parsevalObjectify, pw, setInputEncoding, setOutputEncoding, treeTokenizerFactory, typedDependencyClasser, typedDependencyObjectify, untypedDependencyObjectify
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EnglishTreebankParserParams

public EnglishTreebankParserParams()
Method Detail

headFinder

public HeadFinder headFinder()
Description copied from class: AbstractTreebankParserParams
Returns the HeadFinder to use for your treebank.

Specified by:
headFinder in interface TreebankLangParserParams
Specified by:
headFinder in class AbstractTreebankParserParams

diskTreebank

public DiskTreebank diskTreebank()
Allows you to read in trees from the source you want. It is the responsibility of treeReaderFactory() to deal properly with the character-set encoding of the input. It is also the responsibility of the tree reader (tr) to normalize trees properly.


memoryTreebank

public MemoryTreebank memoryTreebank()
Allows you to read in trees from the source you want. It is the responsibility of treeReaderFactory() to deal properly with the character-set encoding of the input. It is also the responsibility of the tree reader (tr) to normalize trees properly.

Specified by:
memoryTreebank in interface TreebankLangParserParams
Specified by:
memoryTreebank in class AbstractTreebankParserParams

treeReaderFactory

public TreeReaderFactory treeReaderFactory()
Makes an appropriate TreeReaderFactory with all options specified.

Returns:
A factory that vends an appropriate TreeReader

testMemoryTreebank

public MemoryTreebank testMemoryTreebank()
Returns a MemoryTreebank appropriate to the testing treebank source.

Specified by:
testMemoryTreebank in interface TreebankLangParserParams
Overrides:
testMemoryTreebank in class AbstractTreebankParserParams

collinizer

public TreeTransformer collinizer()
The tree transformer used to produce trees for evaluation. It will be applied both to the parse output tree and to the gold tree.

Specified by:
collinizer in interface TreebankLangParserParams
Specified by:
collinizer in class AbstractTreebankParserParams

collinizerEvalb

public TreeTransformer collinizerEvalb()
Description copied from class: AbstractTreebankParserParams
The tree transformer used to produce trees for evaluation. It will be applied both to the parse output tree and to the gold tree. It should strip punctuation and may perform other normalizations. The evalb version should strip some additional material.

Specified by:
collinizerEvalb in interface TreebankLangParserParams
Specified by:
collinizerEvalb in class AbstractTreebankParserParams

treebankLanguagePack

public TreebankLanguagePack treebankLanguagePack()
Returns the TreebankLanguagePack, which contains treebank-specific (but not parser-specific) information, such as what counts as punctuation, and information about the structure of labels.

Specified by:
treebankLanguagePack in interface TreebankLangParserParams
Overrides:
treebankLanguagePack in class AbstractTreebankParserParams

pw

public PrintWriter pw(OutputStream o)
The PrintWriter used to print output to OutputStream o. It's the responsibility of pw to deal properly with character encodings for the relevant treebank.

Specified by:
pw in interface TreebankLangParserParams
Overrides:
pw in class AbstractTreebankParserParams
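
The encoding responsibility described for pw can be illustrated with a small, self-contained sketch. It is not this class's implementation: the class name EncodingWriterSketch and the explicit encoding parameter are assumptions made for illustration (the documented method takes only the OutputStream, and the source does not say how the encoding is chosen).

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

// Hypothetical sketch; not the library's code.
class EncodingWriterSketch {
    // Wraps o in a PrintWriter that encodes characters with the named charset.
    // The second PrintWriter argument enables autoflush on println.
    static PrintWriter pw(OutputStream o, String encoding) {
        return new PrintWriter(new OutputStreamWriter(o, Charset.forName(encoding)), true);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        PrintWriter out = pw(utf8, "UTF-8");
        out.print("caf\u00e9");   // the last character needs two bytes in UTF-8
        out.flush();
        System.out.println(utf8.size() + " bytes written");
    }
}
```

The same four-character string occupies a different number of bytes under a different charset, which is why the writer, rather than each caller, should own the encoding decision.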

sisterSplitters

public String[] sisterSplitters()
Description copied from class: AbstractTreebankParserParams
Returns the splitting strings used for selective splits.

Specified by:
sisterSplitters in interface TreebankLangParserParams
Specified by:
sisterSplitters in class AbstractTreebankParserParams
Returns:
An array containing ancestor-annotated Strings: categories should be split according to these ancestor annotations.

subcategoryStripper

public TreeTransformer subcategoryStripper()
Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.

Specified by:
subcategoryStripper in interface TreebankLangParserParams
Overrides:
subcategoryStripper in class AbstractTreebankParserParams
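
As a rough illustration of what removing functional tags means, the sketch below reduces a single category label to its basic category. It operates on plain strings rather than on trees, and the class and method names, along with the exact cutting rule (cut at the first '-' or '=', leave labels that begin with '-' alone), are assumptions for illustration, not the behavior of the TreeTransformer this method returns.

```java
// Hypothetical sketch; not the library's stripper.
class FunctionalTagSketch {
    // Returns the label with functional tags and indices removed, e.g. "NP-TMP" becomes "NP".
    static String basicCategory(String label) {
        // Labels such as -NONE- or -LRB- begin with '-' and are left untouched.
        if (label.isEmpty() || label.charAt(0) == '-') {
            return label;
        }
        int cut = label.length();
        for (char c : new char[] {'-', '='}) {
            int idx = label.indexOf(c);
            if (idx > 0 && idx < cut) {
                cut = idx;   // keep only the text before the earliest separator
            }
        }
        return label.substring(0, cut);
    }

    public static void main(String[] args) {
        for (String label : new String[] {"NP-TMP", "NP-SBJ=2", "-NONE-", "VP"}) {
            System.out.println(label + " -> " + basicCategory(label));
        }
    }
}
```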

transformTree

public Tree transformTree(Tree t,
                          Tree root)
This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. Such parameterizations should be inside the specific TreebankLangParserParams class. This method is recursively applied to each node in the tree (depth first, left-to-right), so you shouldn't write this method to apply recursively to tree members. This method is allowed to (and in some cases does) destructively change the input tree t. It changes both labels and the tree shape.

Specified by:
transformTree in interface TreebankLangParserParams
Specified by:
transformTree in class AbstractTreebankParserParams
Parameters:
t - The input tree (with non-language-specific annotation already done, so you need to strip back to basic categories)
root - The root of the current tree (can be null for words)
Returns:
The fully annotated tree node (with daughters still as you want them in the final result)
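
The division of labor described above (a driver owns the recursion, the transform handles one node at a time) can be sketched as follows. Everything here is hypothetical: Node is a minimal stand-in rather than edu.stanford.nlp.trees.Tree, the "-U" mark on unary phrasal nodes is an invented annotation, and visiting a parent before its children is an assumption, since the description only says depth first, left-to-right.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the library's code.
class TransformTreeSketch {
    // Minimal stand-in for a tree node.
    static class Node {
        String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            if (isLeaf()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) {
                sb.append(' ').append(c);
            }
            return sb.append(')').toString();
        }
    }

    // Per-node transform: looks only at node t and does not recurse.
    // It destructively relabels t when t has exactly one non-leaf child.
    static Node transformNode(Node t, Node root) {
        if (t.children.size() == 1 && !t.children.get(0).isLeaf()) {
            t.label = t.label + "-U";
        }
        return t;
    }

    // Driver: applies transformNode to every node, parent first, children left to right.
    static Node applyDepthFirst(Node t, Node root) {
        Node result = transformNode(t, root);
        for (int i = 0; i < result.children.size(); i++) {
            result.children.set(i, applyDepthFirst(result.children.get(i), root));
        }
        return result;
    }

    public static void main(String[] args) {
        Node tree = new Node("S",
                new Node("NP", new Node("NN", new Node("dogs"))),
                new Node("VP", new Node("VBP", new Node("bark"))));
        System.out.println(applyDepthFirst(tree, tree));
    }
}
```

Because the driver replaces each child with whatever the transform returns, a transform that recursed on its own would process descendants more than once, which is the reason for the warning above.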

display

public void display()
Description copied from class: AbstractTreebankParserParams
Displays language-specific settings.

Specified by:
display in interface TreebankLangParserParams
Specified by:
display in class AbstractTreebankParserParams

setOptionFlag

public int setOptionFlag(String[] args,
                         int i)
Set language-specific options according to flags. This routine should process the option starting in args[i] (which might potentially be several arguments long if it takes arguments). It should return the index after the last index it consumed in processing. In particular, if it cannot process the current option, the return value should be i.

Specified by:
setOptionFlag in interface TreebankLangParserParams
Specified by:
setOptionFlag in class AbstractTreebankParserParams
Parameters:
args - Array of command line arguments
i - Index in command line arguments to try to process as an option
Returns:
The index of the item after arguments processed as part of this command line option.
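
The return-value contract above can be sketched with a minimal, self-contained example. The flag names -exampleFlag and -exampleLevel, the class OptionFlagSketch, and the processAll helper are all hypothetical; they are not options or methods of EnglishTreebankParserParams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the setOptionFlag contract; not the library's code.
class OptionFlagSketch {
    boolean exampleFlag = false;   // hypothetical option taking no argument
    int exampleLevel = 0;          // hypothetical option taking one argument

    // Returns the index after the last argument consumed,
    // or i itself if args[i] is not an option handled here.
    int setOptionFlag(String[] args, int i) {
        if (args[i].equalsIgnoreCase("-exampleFlag")) {
            exampleFlag = true;
            return i + 1;
        } else if (args[i].equalsIgnoreCase("-exampleLevel") && i + 1 < args.length) {
            exampleLevel = Integer.parseInt(args[i + 1]);
            return i + 2;
        }
        return i;
    }

    // Caller-side loop: an unchanged index signals an unhandled option.
    List<String> processAll(String[] args) {
        List<String> unhandled = new ArrayList<>();
        int i = 0;
        while (i < args.length) {
            int next = setOptionFlag(args, i);
            if (next == i) {
                unhandled.add(args[i]);
                next = i + 1;   // skip it so the loop always makes progress
            }
            i = next;
        }
        return unhandled;
    }

    public static void main(String[] args) {
        OptionFlagSketch params = new OptionFlagSketch();
        List<String> rest = params.processAll(
                new String[] {"-exampleLevel", "2", "-other", "-exampleFlag"});
        System.out.println(params.exampleLevel + " " + params.exampleFlag + " " + rest);
    }
}
```

Returning i for an unrecognized option lets a caller chain several option handlers and detect when none of them claimed the argument.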

defaultTestSentence

public List defaultTestSentence()
Returns a default sentence for the language (for testing).


main

public static void main(String[] args)


Stanford NLP Group