TreeBinarizer (Stanford JavaNLP API)

edu.stanford.nlp.parser.lexparser
Class TreeBinarizer

java.lang.Object
  edu.stanford.nlp.parser.lexparser.TreeBinarizer

All Implemented Interfaces: TreeTransformer

public class TreeBinarizer
extends Object
implements TreeTransformer

Binarizes trees in such a way that head-argument structure is respected. Looks only at the value of input tree nodes. Produces LSTrees with CategoryWordTag (CWT) labels; the input trees must already have CWT labels. Although the binarizer always respects heads, you can get left or right binarization by defining an appropriate HeadFinder.
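To make the idea of head-respecting binarization concrete, here is a minimal, self-contained sketch. It does not use the Stanford classes: `Node`, `binarize`, the head-index function, and the `@` prefix for intermediate labels are all assumptions made for illustration, and the attachment order (right siblings first, then left siblings) is one plausible scheme rather than the library's documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

final class HeadBinarizationSketch {

    /** Minimal stand-in for a tree node; hypothetical, not the Stanford Tree class. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        @Override
        public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /**
     * Recursively binarizes a tree. headIndex returns the position of the head
     * child of a node (a stand-in for a head finder). Intermediate nodes get
     * the hypothetical label "@" + parent label and always contain the head.
     */
    static Node binarize(Node n, ToIntFunction<Node> headIndex) {
        if (n.children.isEmpty()) return n;
        List<Node> kids = new ArrayList<>();
        for (Node c : n.children) kids.add(binarize(c, headIndex));
        if (kids.size() <= 2) return new Node(n.label, kids.toArray(new Node[0]));

        String synthetic = "@" + n.label;
        int h = headIndex.applyAsInt(n);
        Node current = kids.get(h);
        int remaining = kids.size() - 1; // siblings still to attach
        // Attach siblings to the right of the head, nearest first.
        for (int i = h + 1; i < kids.size(); i++) {
            remaining--;
            current = new Node(remaining == 0 ? n.label : synthetic, current, kids.get(i));
        }
        // Then attach siblings to the left of the head, nearest first.
        for (int i = h - 1; i >= 0; i--) {
            remaining--;
            current = new Node(remaining == 0 ? n.label : synthetic, kids.get(i), current);
        }
        return current;
    }

    public static void main(String[] args) {
        Node np = new Node("NP", new Node("DT"), new Node("JJ"), new Node("NN"), new Node("PP"));
        // Leftmost head gives a left-branching tree.
        System.out.println(binarize(np, node -> 0)); // (NP (@NP (@NP DT JJ) NN) PP)
        // Rightmost head gives a right-branching tree.
        System.out.println(binarize(np, node -> node.children.size() - 1)); // (NP DT (@NP JJ (@NP NN PP)))
    }
}
```

In this sketch the choice of head alone decides the branching direction, which is the sense in which an appropriate HeadFinder yields left or right binarization.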

Author: Dan Klein, Teg Grenager, Christopher Manning

Constructor Summary
`TreeBinarizer(HeadFinder hf, TreebankLanguagePack tlp, boolean insideFactor, boolean markovFactor, int markovOrder, boolean useWrappingLabels, boolean unaryAtTop, double selectiveSplitThreshold, boolean markFinalStates)` Build a custom binarizer for Trees.

Method Summary
`protected static boolean`	`isSynthetic(String label)`
`static void`	`main(String[] args)` Lets you test out the TreeBinarizer on the command line.
`void`	`setDoSelectiveSplit(boolean doSelectiveSplit)` If this is set to true, then the binarizer will choose selectively whether or not to split states based on how many counts the states had in a previous run.
`Tree`	`transformTree(Tree t)` Binarizes the tree according to options set up in the constructor.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

TreeBinarizer

public TreeBinarizer(HeadFinder hf,
                     TreebankLanguagePack tlp,
                     boolean insideFactor,
                     boolean markovFactor,
                     int markovOrder,
                     boolean useWrappingLabels,
                     boolean unaryAtTop,
                     double selectiveSplitThreshold,
                     boolean markFinalStates)

Build a custom binarizer for Trees.

Parameters:
  hf - the HeadFinder to use in binarization
  tlp - the TreebankLanguagePack to use
  insideFactor - whether to do inside markovization
  markovFactor - whether to markovize the binary rules
  markovOrder - the markov order to use; only relevant with markovFactor=true
  useWrappingLabels - whether to use state names (labels) that allow wrapping from right to left
  unaryAtTop - whether to actually materialize the unary that rewrites a passive state to the active rule at the top of an original local tree; this is used only when compaction is happening
  selectiveSplitThreshold - if selective split is used, this will be the threshold used to decide which state splits to keep
  markFinalStates - whether or not to make the state names (labels) of the final active states distinctive
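As a rough illustration of what a markov order can mean for intermediate state names, the sketch below builds a label that remembers only the most recent sibling categories. It assumes that markovOrder bounds how many neighbouring siblings a state records; the class name, the method `syntheticLabel`, and the `@NP|<JJ,NN>` label format are hypothetical and are not taken from the library.

```java
import java.util.Arrays;
import java.util.List;

final class MarkovLabelSketch {

    /**
     * Builds a hypothetical intermediate-state label that keeps only the last
     * markovOrder sibling categories seen so far. The format is illustrative.
     */
    static String syntheticLabel(String parent, List<String> seenSiblings, int markovOrder) {
        int k = Math.max(0, markovOrder); // treat negative orders as zero
        int from = Math.max(0, seenSiblings.size() - k);
        List<String> kept = seenSiblings.subList(from, seenSiblings.size());
        return "@" + parent + "|<" + String.join(",", kept) + ">";
    }

    public static void main(String[] args) {
        List<String> seen = Arrays.asList("DT", "JJ", "NN");
        System.out.println(syntheticLabel("NP", seen, 2)); // @NP|<JJ,NN>
        System.out.println(syntheticLabel("NP", seen, 0)); // @NP|<>
    }
}
```

A lower order merges more states together, giving a smaller grammar at the cost of less context per state.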

Method Detail

setDoSelectiveSplit

public void setDoSelectiveSplit(boolean doSelectiveSplit)

If this is set to true, then the binarizer will choose selectively whether or not to split states based on how many counts the states had in a previous run. These counts are stored in an internal counter, which is added to while doSelectiveSplit is false. Passing false also initializes (clears) the counts.
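The count-then-threshold behavior described above can be sketched as follows. This is a simplified model, not the library's code: the class `SelectiveSplitSketch`, the method `chooseLabel`, the fine and coarse label strings, and the use of `>=` for the threshold comparison are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

final class SelectiveSplitSketch {
    private final Map<String, Double> stateCounts = new HashMap<>();
    private final double threshold;
    private boolean doSelectiveSplit = false;

    SelectiveSplitSketch(double threshold) {
        this.threshold = threshold;
    }

    /** Passing false clears the counts so that a fresh counting run can start. */
    void setDoSelectiveSplit(boolean doSelectiveSplit) {
        this.doSelectiveSplit = doSelectiveSplit;
        if (!doSelectiveSplit) stateCounts.clear();
    }

    /**
     * While counting (doSelectiveSplit is false), records one observation of
     * the fine-grained state and returns it. While splitting selectively,
     * keeps the fine-grained state only if it was seen often enough.
     */
    String chooseLabel(String fine, String coarse) {
        if (!doSelectiveSplit) {
            stateCounts.merge(fine, 1.0, Double::sum);
            return fine;
        }
        return stateCounts.getOrDefault(fine, 0.0) >= threshold ? fine : coarse;
    }

    public static void main(String[] args) {
        SelectiveSplitSketch s = new SelectiveSplitSketch(2.0);
        s.setDoSelectiveSplit(false);        // first pass: collect counts
        s.chooseLabel("@NP|<JJ,NN>", "@NP");
        s.chooseLabel("@NP|<JJ,NN>", "@NP");
        s.chooseLabel("@NP|<RB>", "@NP");
        s.setDoSelectiveSplit(true);         // second pass: apply the threshold
        System.out.println(s.chooseLabel("@NP|<JJ,NN>", "@NP")); // @NP|<JJ,NN>
        System.out.println(s.chooseLabel("@NP|<RB>", "@NP"));    // @NP
    }
}
```

The two-pass usage in `main` mirrors the description: a first run with the flag off gathers counts, and a second run with the flag on keeps only the frequent splits.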

Parameters: doSelectiveSplit - whether to split states selectively, based on the counts from a previous run

isSynthetic

protected static boolean isSynthetic(String label)

transformTree

public Tree transformTree(Tree t)

Binarizes the tree according to options set up in the constructor. Does the whole tree by calling itself recursively.

Specified by: transformTree in interface TreeTransformer

Parameters: t - A tree to be binarized. The non-leaf nodes must already have CategoryWordTag labels, with heads percolated.
Returns: A binary tree.

main

public static void main(String[] args)

Lets you test out the TreeBinarizer on the command line. This main method doesn't yet handle as many flags as one would like.