AbstractTreebankParserParams (Stanford JavaNLP API)

edu.stanford.nlp.parser.lexparser
Class AbstractTreebankParserParams

java.lang.Object
  edu.stanford.nlp.parser.lexparser.AbstractTreebankParserParams

All Implemented Interfaces:: TreebankLangParserParams, Serializable

Direct Known Subclasses:: ChineseTreebankParserParams, EnglishTreebankParserParams, NegraPennTreebankParserParams, TueBaDZParserParams

public abstract class AbstractTreebankParserParams
extends Object
implements TreebankLangParserParams

An abstract base class that supplies common method implementations for classes implementing TreebankLangParserParams.

With some extending classes you'll want to have access to special attributes of the corresponding TreebankLanguagePack while taking advantage of this class's code for making the TreebankLanguagePack accessible. A good way to do this is to pass a new instance of the appropriate TreebankLanguagePack into this class's constructor, then get it back later on by casting a call to treebankLanguagePack(). See ChineseTreebankParserParams for an example.
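The pass-then-cast pattern described above can be sketched roughly as follows. To keep the example self-contained, `LanguagePack`, `ExamplePack`, `ParamsBase`, `ExampleParams`, and `isSpecialPunctuationTag` are all hypothetical stand-ins invented for illustration; they are not the library's types, and ChineseTreebankParserParams remains the authoritative example.

```java
// Hypothetical stand-in for TreebankLanguagePack (not the real interface).
interface LanguagePack {
    String startSymbol();
}

// Hypothetical language-specific pack exposing one extra attribute.
class ExamplePack implements LanguagePack {
    public String startSymbol() { return "ROOT"; }

    // Special attribute that the general interface does not declare.
    public boolean isSpecialPunctuationTag(String tag) { return "PU".equals(tag); }
}

// Hypothetical stand-in for AbstractTreebankParserParams: stores the pack
// passed to the constructor and hands it back on request.
abstract class ParamsBase {
    protected LanguagePack tlp;

    protected ParamsBase(LanguagePack tlp) { this.tlp = tlp; }

    public LanguagePack treebankLanguagePack() { return tlp; }
}

class ExampleParams extends ParamsBase {
    ExampleParams() {
        // Pass a new instance of the language-specific pack to the base class.
        super(new ExamplePack());
    }

    boolean isPunctuation(String tag) {
        // Cast the stored pack back to its concrete type to reach the extra attribute.
        ExamplePack pack = (ExamplePack) treebankLanguagePack();
        return pack.isSpecialPunctuationTag(tag);
    }
}

class CastPatternDemo {
    public static void main(String[] args) {
        ExampleParams params = new ExampleParams();
        System.out.println(params.isPunctuation("PU")); // prints true
        System.out.println(params.isPunctuation("NN")); // prints false
    }
}
```

The cast is safe in this sketch only because the subclass constructor is the one place that chooses the pack; code that reassigns the `tlp` field elsewhere would weaken that guarantee.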

Author:: Roger Levy
See Also:: Serialized Form

Nested Class Summary
`protected class`	`AbstractTreebankParserParams.SubcategoryStripper`

Field Summary
`protected String`	`inputEncoding`
`protected String`	`outputEncoding`
`protected TreebankLanguagePack`	`tlp`

Constructor Summary
`protected`	`AbstractTreebankParserParams(TreebankLanguagePack tlp)` Stores the passed-in TreebankLanguagePack.

Method Summary

abstract TreeTransformer collinizer()
The tree transformer used to produce trees for evaluation.

abstract TreeTransformer collinizerEvalb()
The tree transformer used to produce trees for evaluation.

edu.stanford.nlp.parser.lexparser.Extractor dependencyGrammarExtractor(Options op)

static <E> Collection<E> dependencyObjectify(Tree t, HeadFinder hf, TreeTransformer collinizer, DependencyTyper<E> typer)
Returns the set of dependencies in a tree, according to some DependencyTyper.

abstract void display()
Displays language-specific settings.

String getInputEncoding()
Returns the input encoding being used.

String getOutputEncoding()
Returns the output encoding being used.

abstract HeadFinder headFinder()
The HeadFinder to use for your treebank.

Lexicon lex()

Lexicon lex(Options.LexOptions op)

abstract MemoryTreebank memoryTreebank()
Returns a MemoryTreebank appropriate to the treebank source.

double[] MLEDependencyGrammarSmoothingParams()
Give the parameters for smoothing in the MLEDependencyGrammar.

static Collection<Constituent> parsevalObjectify(Tree t, TreeTransformer collinizer)
Takes a Tree and a collinizer and returns a Collection of labeled Constituents for PARSEVAL.

static Collection<Constituent> parsevalObjectify(Tree t, TreeTransformer collinizer, boolean labelConstituents)
Takes a Tree and a collinizer and returns a Collection of Constituents for PARSEVAL evaluation.

PrintWriter pw()
The PrintWriter used to print output.

PrintWriter pw(OutputStream o)
The PrintWriter used to print output.

void setInputEncoding(String encoding)
Sets the input encoding.

abstract int setOptionFlag(String[] args, int i)
Set language-specific options according to flags.

void setOutputEncoding(String encoding)
Sets the output encoding.

abstract String[] sisterSplitters()
Returns the splitting strings used for selective splits.

TreeTransformer subcategoryStripper()
Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.

MemoryTreebank testMemoryTreebank()
You can often return the same thing for testMemoryTreebank as for memoryTreebank.

abstract Tree transformTree(Tree t, Tree root)
This method does language-specific tree transformations such as annotating particular nodes with language-relevant features.

TreebankLanguagePack treebankLanguagePack()
Returns an appropriate TreebankLanguagePack.

TokenizerFactory<Tree> treeTokenizerFactory()

static EquivalenceClasser<List<String>> typedDependencyClasser()
Returns an EquivalenceClasser that classes typed dependencies by the syntactic categories of mother, head and daughter, plus direction.

static Collection<List<String>> typedDependencyObjectify(Tree t, HeadFinder hf, TreeTransformer collinizer)
Returns a collection of word-word dependencies typed by mother, head, daughter node syntactic categories.

static Collection<List<String>> untypedDependencyObjectify(Tree t, HeadFinder hf, TreeTransformer collinizer)
Returns a collection of untyped word-word dependencies for the tree.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Methods inherited from interface edu.stanford.nlp.parser.lexparser.TreebankLangParserParams
`defaultTestSentence, diskTreebank, treeReaderFactory`

Field Detail

inputEncoding

protected String inputEncoding

outputEncoding

protected String outputEncoding

tlp

protected TreebankLanguagePack tlp

Constructor Detail

AbstractTreebankParserParams

protected AbstractTreebankParserParams(TreebankLanguagePack tlp)

Stores the passed-in TreebankLanguagePack.

Method Detail

setInputEncoding

public void setInputEncoding(String encoding)

Sets the input encoding.

Specified by:: setInputEncoding in interface TreebankLangParserParams

setOutputEncoding

public void setOutputEncoding(String encoding)

Sets the output encoding.

Specified by:: setOutputEncoding in interface TreebankLangParserParams

getOutputEncoding

public String getOutputEncoding()

Returns the output encoding being used.

Specified by:: getOutputEncoding in interface TreebankLangParserParams

getInputEncoding

public String getInputEncoding()

Returns the input encoding being used.

Specified by:: getInputEncoding in interface TreebankLangParserParams

memoryTreebank

public abstract MemoryTreebank memoryTreebank()

Returns a MemoryTreebank appropriate to the treebank source.

Specified by:: memoryTreebank in interface TreebankLangParserParams

testMemoryTreebank

public MemoryTreebank testMemoryTreebank()

You can often return the same thing for testMemoryTreebank as for memoryTreebank.

Specified by:: testMemoryTreebank in interface TreebankLangParserParams

pw

public PrintWriter pw()

The PrintWriter used to print output. It's the responsibility of pw to deal properly with character encodings for the relevant treebank.

Specified by:: pw in interface TreebankLangParserParams

pw

public PrintWriter pw(OutputStream o)

The PrintWriter used to print output. It's the responsibility of pw to deal properly with character encodings for the relevant treebank.

Specified by:: pw in interface TreebankLangParserParams
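As an illustration of what "dealing properly with character encodings" can mean for pw(OutputStream), the sketch below wraps an output stream in a writer that uses an explicitly named charset. It uses only standard java.io classes; the helper names `writerFor` and `encode` are hypothetical, and nothing here is taken from the library's actual implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;

class EncodedWriterSketch {
    // Builds a PrintWriter that encodes characters with the named charset
    // instead of relying on the platform default.
    static PrintWriter writerFor(OutputStream out, String encoding) {
        try {
            return new PrintWriter(new OutputStreamWriter(out, encoding), true);
        } catch (UnsupportedEncodingException e) {
            // Assumption of this sketch: an unknown charset name is a caller error.
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    // Helper for the demo: returns the bytes produced for a piece of text.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter pw = writerFor(buffer, encoding);
        pw.print(text);
        pw.flush(); // print() does not auto-flush, so flush explicitly
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // one CJK character
        System.out.println(encode(zhong, "UTF-8").length);    // prints 3
        System.out.println(encode(zhong, "UTF-16BE").length); // prints 2
    }
}
```

The same character yields different byte sequences under different charsets, which is why the writer should be built from the treebank's configured output encoding rather than the platform default.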

treebankLanguagePack

public TreebankLanguagePack treebankLanguagePack()

Returns an appropriate TreebankLanguagePack.

Specified by:: treebankLanguagePack in interface TreebankLangParserParams

headFinder

public abstract HeadFinder headFinder()

The HeadFinder to use for your treebank.

Specified by:: headFinder in interface TreebankLangParserParams

lex

public Lexicon lex()

lex

public Lexicon lex(Options.LexOptions op)

Specified by:: lex in interface TreebankLangParserParams

MLEDependencyGrammarSmoothingParams

public double[] MLEDependencyGrammarSmoothingParams()

Give the parameters for smoothing in the MLEDependencyGrammar. Defaults are the ones previously hard coded into MLEDependencyGrammar.

Specified by:: MLEDependencyGrammarSmoothingParams in interface TreebankLangParserParams

Returns:: an array of doubles with smooth_aT_hTWd, smooth_aTW_hTWd, smooth_stop, and interp

parsevalObjectify

public static Collection<Constituent> parsevalObjectify(Tree t,
                                                        TreeTransformer collinizer)

Takes a Tree and a collinizer and returns a Collection of labeled Constituents for PARSEVAL.

Parameters:: t - The tree to extract constituents from; collinizer - The TreeTransformer used to normalize the tree for evaluation
Returns:: The bag of Constituents for PARSEVAL.

parsevalObjectify

public static Collection<Constituent> parsevalObjectify(Tree t,
                                                        TreeTransformer collinizer,
                                                        boolean labelConstituents)

Takes a Tree and a collinizer and returns a Collection of Constituents for PARSEVAL evaluation. Some notes on this particular PARSEVAL implementation:

It is character-based, which allows it to be used for combined segmentation/parsing evaluation.
Whether it gives you labeled or unlabeled bracketings depends on the value of the labelConstituents parameter.

(Note that I haven't checked this rigorously yet with the PARSEVAL definition -- Roger.)
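To illustrate why character-based brackets suit combined segmentation/parsing evaluation, here is a small sketch that indexes constituents by character offsets instead of word offsets. The `Node` class and the `brackets` method are hypothetical and written for this example only; the sketch brackets every internal node (including preterminals) and skips the collinizer step, both of which may differ from what parsevalObjectify actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CharSpanSketch {
    // Minimal tree node: a node with no children is a word.
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() { return children.isEmpty(); }
    }

    // Adds one bracket per internal node, using character offsets.
    // Returns the character offset just past this subtree.
    private static int collect(Node node, int start, boolean labeled, List<String> out) {
        if (node.isLeaf()) {
            return start + node.label.length();
        }
        int end = start;
        for (Node child : node.children) {
            end = collect(child, end, labeled, out);
        }
        out.add((labeled ? node.label : "") + "[" + start + "," + end + ")");
        return end;
    }

    // labeled=true keeps category labels; labeled=false yields bare spans.
    static List<String> brackets(Node root, boolean labeled) {
        List<String> out = new ArrayList<>();
        collect(root, 0, labeled, out);
        return out;
    }

    public static void main(String[] args) {
        // Same three characters, segmented two different ways.
        Node a = new Node("NP", new Node("NN", new Node("ab")), new Node("NN", new Node("c")));
        Node b = new Node("NP", new Node("NN", new Node("a")), new Node("NN", new Node("bc")));
        System.out.println(brackets(a, true));  // prints [NN[0,2), NN[2,3), NP[0,3)]
        System.out.println(brackets(b, true));  // prints [NN[0,1), NN[1,3), NP[0,3)]
        System.out.println(brackets(a, false)); // prints [[0,2), [2,3), [0,3)]
    }
}
```

Although the two trees disagree on word boundaries, the NP bracket covers characters 0 to 3 in both, so it can still be counted as a match; with word-based offsets the spans would not be comparable.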

untypedDependencyObjectify

public static Collection<List<String>> untypedDependencyObjectify(Tree t,
                                                                  HeadFinder hf,
                                                                  TreeTransformer collinizer)

Returns a collection of untyped word-word dependencies for the tree.

typedDependencyObjectify

public static Collection<List<String>> typedDependencyObjectify(Tree t,
                                                                HeadFinder hf,
                                                                TreeTransformer collinizer)

Returns a collection of word-word dependencies typed by mother, head, daughter node syntactic categories.

dependencyObjectify

public static <E> Collection<E> dependencyObjectify(Tree t,
                                                    HeadFinder hf,
                                                    TreeTransformer collinizer,
                                                    DependencyTyper<E> typer)

Returns the set of dependencies in a tree, according to some DependencyTyper.
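The relationship between the generic method and its typed and untyped variants can be sketched as one traversal parameterized by a "typer" that decides what each dependency looks like. Everything below is a hypothetical illustration: `Node`, `Typer`, and the method names are invented, the head child is stored directly on each node instead of being chosen by a HeadFinder, and no collinizer is applied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DependencySketch {
    static final class Node {
        final String label;
        final int headIndex;          // index of the head child; -1 for words
        final List<Node> children;

        private Node(String label, int headIndex, List<Node> children) {
            this.label = label;
            this.headIndex = headIndex;
            this.children = children;
        }

        static Node leaf(String word) { return new Node(word, -1, new ArrayList<Node>()); }

        static Node of(String label, int headIndex, Node... children) {
            return new Node(label, headIndex, Arrays.asList(children));
        }

        boolean isLeaf() { return children.isEmpty(); }

        // Follows head children down to a word.
        String headWord() { return isLeaf() ? label : children.get(headIndex).headWord(); }
    }

    // Hypothetical counterpart of DependencyTyper<E>: builds one dependency object.
    interface Typer<E> {
        E make(Node mother, Node headChild, Node dependentChild);
    }

    // One traversal; the typer decides the shape of each dependency.
    static <E> List<E> dependencies(Node t, Typer<E> typer) {
        List<E> out = new ArrayList<>();
        collect(t, typer, out);
        return out;
    }

    private static <E> void collect(Node t, Typer<E> typer, List<E> out) {
        if (t.isLeaf()) return;
        Node head = t.children.get(t.headIndex);
        for (Node child : t.children) {
            if (child != head) out.add(typer.make(t, head, child));
            collect(child, typer, out);
        }
    }

    // Untyped: just the head word and the dependent word.
    static List<List<String>> untyped(Node t) {
        Typer<List<String>> typer = (m, h, d) -> Arrays.asList(h.headWord(), d.headWord());
        return dependencies(t, typer);
    }

    // Typed: mother, head, and daughter categories.
    static List<List<String>> typed(Node t) {
        Typer<List<String>> typer = (m, h, d) -> Arrays.asList(m.label, h.label, d.label);
        return dependencies(t, typer);
    }

    static Node sample() {
        // (S (NP (NN dogs)) (VP (VBP bark))) with VP as the head of S.
        return Node.of("S", 1,
                Node.of("NP", 0, Node.of("NN", 0, Node.leaf("dogs"))),
                Node.of("VP", 0, Node.of("VBP", 0, Node.leaf("bark"))));
    }

    public static void main(String[] args) {
        System.out.println(untyped(sample())); // prints [[bark, dogs]]
        System.out.println(typed(sample()));   // prints [[S, VP, NP]]
    }
}
```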

typedDependencyClasser

public static EquivalenceClasser<List<String>> typedDependencyClasser()

Returns an EquivalenceClasser that classes typed dependencies by the syntactic categories of mother, head and daughter, plus direction.

collinizer

public abstract TreeTransformer collinizer()

The tree transformer used to produce trees for evaluation. It is applied to both the parse output tree and the gold tree, and should strip punctuation and possibly perform other normalization.

Specified by:: collinizer in interface TreebankLangParserParams

collinizerEvalb

public abstract TreeTransformer collinizerEvalb()

The tree transformer used to produce trees for evaluation. It is applied to both the parse output tree and the gold tree, and should strip punctuation and possibly perform other normalization. The evalb version should strip off somewhat more material. (finish this doc!)

Specified by:: collinizerEvalb in interface TreebankLangParserParams

sisterSplitters

public abstract String[] sisterSplitters()

Returns the splitting strings used for selective splits.

Specified by:: sisterSplitters in interface TreebankLangParserParams

Returns:: An array containing ancestor-annotated Strings: categories should be split according to these ancestor annotations.

subcategoryStripper

public TreeTransformer subcategoryStripper()

Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories.

Specified by:: subcategoryStripper in interface TreebankLangParserParams

transformTree

public abstract Tree transformTree(Tree t,
                                   Tree root)

This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. Such parameterizations should be inside the specific TreebankLangParserParams class. This method is recursively applied to each node in the tree (depth first, left-to-right), so you shouldn't write this method to apply recursively to tree members. This method is allowed to (and in some cases does) destructively change the input tree t. It changes both labels and the tree shape.

Specified by:: transformTree in interface TreebankLangParserParams

Parameters:: t - The input tree (with non-language specific annotation already done, so you need to strip back to basic categories); root - The root of the current tree (can be null for words)
Returns:: The fully annotated tree node (with daughters still as you want them in the final result)
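The division of labor described above (the caller walks the tree, the method handles one node) can be sketched as follows. The `Node` class, the `-U` annotation, and the method names are hypothetical. The sketch visits children before their parent, which is an assumption: the text above says only that the traversal is depth first and left-to-right.

```java
import java.util.ArrayList;
import java.util.List;

class PerNodeTransformSketch {
    static final class Node {
        String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    // Plays the role of transformTree(t, root): annotates only the node it is
    // given and does NOT recurse. The root parameter mirrors the documented
    // signature and is unused in this sketch.
    static Node transformNode(Node t, Node root) {
        boolean unary = t.children.size() == 1;
        boolean preterminal = unary && t.children.get(0).children.isEmpty();
        if (unary && !preterminal) {
            t.label = t.label + "-U"; // hypothetical annotation: mark unary phrases
        }
        return t;
    }

    // Plays the role of the caller: applies the per-node hook to every node.
    static Node applyToAll(Node t, Node root) {
        for (int i = 0; i < t.children.size(); i++) {
            t.children.set(i, applyToAll(t.children.get(i), root));
        }
        return transformNode(t, root);
    }

    // Renders the tree in bracketed form.
    static String show(Node t) {
        if (t.children.isEmpty()) return t.label;
        StringBuilder sb = new StringBuilder("(").append(t.label);
        for (Node c : t.children) sb.append(' ').append(show(c));
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node tree = new Node("S",
                new Node("NP", new Node("NN", new Node("dogs"))),
                new Node("VP", new Node("VBP", new Node("bark"))));
        System.out.println(show(applyToAll(tree, tree)));
        // prints (S (NP-U (NN dogs)) (VP-U (VBP bark)))
    }
}
```

If `transformNode` recursed on its own, each subtree would be processed once per ancestor, so annotations such as `-U` could be appended repeatedly.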

display

public abstract void display()

Displays language-specific settings.

Specified by:: display in interface TreebankLangParserParams

setOptionFlag

public abstract int setOptionFlag(String[] args,
                                  int i)

Set language-specific options according to flags. This routine should process the option starting in args[i] (which might potentially be several arguments long if it takes arguments). It should return the index after the last index it consumed in processing. In particular, if it cannot process the current option, the return value should be i.

Specified by:: setOptionFlag in interface TreebankLangParserParams

Parameters:: args - Array of command line arguments; i - Index in command line arguments to try to process as an option
Returns:: The index of the item after arguments processed as part of this command line option.
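The return-value contract above can be illustrated with a minimal sketch. The flags `-splitNouns` and `-headRules` and the fields they set are hypothetical, invented only to show one option with no argument and one option that consumes an argument; the caller loop is likewise an assumption about how a client might use the returned index.

```java
class OptionFlagSketch {
    boolean splitNouns = false;   // hypothetical setting with no argument
    String headRules = "default"; // hypothetical setting that takes one argument

    // Returns the index after the last argument consumed,
    // or i itself if args[i] is not an option this class understands.
    int setOptionFlag(String[] args, int i) {
        if (args[i].equalsIgnoreCase("-splitNouns")) {
            splitNouns = true;
            return i + 1;
        }
        if (args[i].equalsIgnoreCase("-headRules") && i + 1 < args.length) {
            headRules = args[i + 1];
            return i + 2;
        }
        return i; // not processed
    }

    // A caller can detect unrecognized options by checking whether the index moved.
    static int countUnknown(OptionFlagSketch params, String[] args) {
        int unknown = 0;
        int i = 0;
        while (i < args.length) {
            int next = params.setOptionFlag(args, i);
            if (next == i) {
                unknown++;
                next = i + 1; // skip the unrecognized argument
            }
            i = next;
        }
        return unknown;
    }

    public static void main(String[] args) {
        OptionFlagSketch params = new OptionFlagSketch();
        String[] demo = {"-splitNouns", "-bogus", "-headRules", "left"};
        System.out.println(countUnknown(params, demo)); // prints 1
        System.out.println(params.splitNouns);          // prints true
        System.out.println(params.headRules);           // prints left
    }
}
```

Returning `i` unchanged for unknown options lets a caller try several handlers on the same argument in turn, which an exception or a boolean return would make clumsier.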

treeTokenizerFactory

public TokenizerFactory<Tree> treeTokenizerFactory()

Specified by:: treeTokenizerFactory in interface TreebankLangParserParams

dependencyGrammarExtractor

public edu.stanford.nlp.parser.lexparser.Extractor dependencyGrammarExtractor(Options op)

Specified by:: dependencyGrammarExtractor in interface TreebankLangParserParams

Stanford NLP Group
