Lexicon (Stanford JavaNLP API)

edu.stanford.nlp.parser.lexparser
Interface Lexicon

All Superinterfaces: Serializable

All Known Implementing Classes: BaseLexicon, ChineseCharacterBasedLexicon, ChineseLexicon, ChineseLexiconAndWordSegmenter

public interface Lexicon
extends Serializable

An interface for lexicons that plug into lexparser.

Author: Galen Andrew

Field Summary
`static String`	`BOUNDARY`
`static String`	`BOUNDARY_TAG`
`static String`	`UNKNOWN_WORD`

Method Summary
`boolean`	`isKnown(int word)` Checks whether a word is in the lexicon.
`boolean`	`isKnown(String word)` Checks whether a word is in the lexicon.
`void`	`readData(BufferedReader in)` Read the lexicon from the BufferedReader in the format written by writeData.
`Iterator`	`ruleIteratorByWord(int word, int loc)` Get an iterator over all rules (pairs of (word, POS)) for this word.
`double`	`score(IntTaggedWord iTW, int loc)` Get the score of this word with this tag (as an IntTaggedWord) at this loc.
`void`	`train(Collection trees)` Trains this lexicon on the Collection of trees.
`void`	`writeData(Writer w)` Write the lexicon in human-readable format to the Writer.

Field Detail

UNKNOWN_WORD

static final String UNKNOWN_WORD

See Also: Constant Field Values

BOUNDARY

static final String BOUNDARY

See Also: Constant Field Values

BOUNDARY_TAG

static final String BOUNDARY_TAG

See Also: Constant Field Values

Method Detail

isKnown

boolean isKnown(int word)

Checks whether a word is in the lexicon.

Parameters: word - The word as an int
Returns: Whether the word is in the lexicon

isKnown

boolean isKnown(String word)

Checks whether a word is in the lexicon.

Parameters: word - The word as a String
Returns: Whether the word is in the lexicon
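
The two isKnown overloads can be kept consistent by routing the String form through the same word numbering that the int form uses. The following is a minimal sketch under that assumption, not the Stanford implementation: the class IsKnownSketch, its word index, and the helper methods idOf and observe are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in, not the Stanford implementation: shows how the int and
// String overloads of isKnown can agree when words are numbered by an index.
class IsKnownSketch {
    private final Map<String, Integer> wordToId = new HashMap<>(); // hypothetical word index
    private final Set<Integer> seenWords = new HashSet<>();        // ids observed so far

    // Returns the id for a word, assigning a new id on first sight.
    int idOf(String word) {
        Integer id = wordToId.get(word);
        if (id == null) {
            id = wordToId.size();
            wordToId.put(word, id);
        }
        return id;
    }

    // Marks a word as present in the lexicon.
    void observe(String word) {
        seenWords.add(idOf(word));
    }

    boolean isKnown(int word) {
        return seenWords.contains(word);
    }

    // The String overload looks the word up without numbering it as a side effect.
    boolean isKnown(String word) {
        Integer id = wordToId.get(word);
        return id != null && isKnown(id);
    }
}
```

In this sketch, asking about an unseen String does not add it to the index, so a query never changes what the lexicon knows.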

ruleIteratorByWord

Iterator ruleIteratorByWord(int word,
                            int loc)

Get an iterator over all rules (pairs of (word, POS)) for this word.

Parameters:
word - The word, represented as an integer in Numberer
loc - The position of the word in the sentence (counting from 0). Implementation note: the BaseLexicon class doesn't actually make use of this position information.
Returns:
An Iterator over a List of IntTaggedWords, which pair the word with possible taggings as integer pairs. (Each can be thought of as a tag -> word rule.)
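
A caller typically walks the returned iterator to collect the candidate tags for one word. The sketch below illustrates that pattern with a toy class; RuleIteratorSketch, its nested WordTag pair (a hypothetical stand-in for IntTaggedWord), and the helpers addRule and tagsFor are assumptions made for illustration. As in the BaseLexicon note, loc is accepted but ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class RuleIteratorSketch {
    // Hypothetical stand-in for IntTaggedWord: a (word id, tag id) pair.
    static final class WordTag {
        final int word;
        final int tag;

        WordTag(int word, int tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    private final Map<Integer, List<WordTag>> rulesByWord = new HashMap<>();

    // Registers one tag -> word rule.
    void addRule(int word, int tag) {
        rulesByWord.computeIfAbsent(word, k -> new ArrayList<>()).add(new WordTag(word, tag));
    }

    // loc is accepted but ignored in this sketch.
    Iterator<WordTag> ruleIteratorByWord(int word, int loc) {
        return rulesByWord.getOrDefault(word, Collections.<WordTag>emptyList()).iterator();
    }

    // Collects the tag ids proposed for a word at a position.
    List<Integer> tagsFor(int word, int loc) {
        List<Integer> tags = new ArrayList<>();
        for (Iterator<WordTag> it = ruleIteratorByWord(word, loc); it.hasNext(); ) {
            tags.add(it.next().tag);
        }
        return tags;
    }
}
```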

train
void train(Collection trees)

Trains this lexicon on the Collection of trees.

score
double score(IntTaggedWord iTW,
             int loc)

Get the score of this word with this tag (as an IntTaggedWord) at this loc. (Presumably an estimate of P(word | tag).)

Parameters:
iTW - An IntTaggedWord pairing a word and POS tag
loc - The position in the sentence. In the default implementation this is used only for unknown words, to change their probability distribution when sentence-initial
Returns:
A double-valued score, usually log P(word|tag)
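
To make the quantity concrete, the sketch below estimates P(word | tag) by relative frequency over observed (word, tag) pairs and returns its natural log. This is an illustrative assumption, not the estimator used by any implementing class: ScoreSketch and observe are hypothetical names, the sketch takes Strings rather than an IntTaggedWord, it omits loc, and the sign convention (log rather than negative log) is assumed.

```java
import java.util.HashMap;
import java.util.Map;

class ScoreSketch {
    private final Map<String, Integer> wordTagCounts = new HashMap<>(); // keyed by "tag<TAB>word"
    private final Map<String, Integer> tagCounts = new HashMap<>();

    // Records one (word, tag) observation, e.g. one leaf of a training tree.
    void observe(String word, String tag) {
        wordTagCounts.merge(tag + "\t" + word, 1, Integer::sum);
        tagCounts.merge(tag, 1, Integer::sum);
    }

    // Relative-frequency estimate of log P(word | tag); unseen pairs score negative infinity.
    double score(String word, String tag) {
        int joint = wordTagCounts.getOrDefault(tag + "\t" + word, 0);
        int total = tagCounts.getOrDefault(tag, 0);
        if (joint == 0 || total == 0) {
            return Double.NEGATIVE_INFINITY;
        }
        return Math.log((double) joint / total);
    }
}
```

A real lexicon would smooth these counts and handle unknown words separately, which is where the loc parameter comes into play in the default implementation.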

writeData
void writeData(Writer w)
               throws IOException

Write the lexicon in human-readable format to the Writer. (An optional operation.)

Parameters:
w - The writer to output to
Throws:
IOException

readData
void readData(BufferedReader in)
              throws IOException

Read the lexicon from the BufferedReader in the format written by writeData. (An optional operation.)

Parameters:
in - The BufferedReader to read from
Throws:
IOException
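
The contract between writeData and readData is a round trip: whatever one writes, the other must be able to read back. The sketch below shows that contract with a toy class. The on-disk format (one "word, tab, tag, tab, count" line per entry), the class LexiconIoSketch, and the roundTrip helper are assumptions for illustration; the format used by the implementing classes is not documented here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;
import java.util.TreeMap;

class LexiconIoSketch {
    // Keyed by "word<TAB>tag"; TreeMap keeps the written output in a stable order.
    final Map<String, Integer> counts = new TreeMap<>();

    // Writes one "word<TAB>tag<TAB>count" line per entry (hypothetical format).
    void writeData(Writer w) throws IOException {
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            w.write(e.getKey() + "\t" + e.getValue() + "\n");
        }
        w.flush();
    }

    // Reads back the format produced by writeData, replacing the current contents.
    void readData(BufferedReader in) throws IOException {
        counts.clear();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                continue;
            }
            int split = line.lastIndexOf('\t');
            counts.put(line.substring(0, split), Integer.parseInt(line.substring(split + 1)));
        }
    }

    // Serializes to an in-memory buffer and parses the result into a fresh instance.
    static LexiconIoSketch roundTrip(LexiconIoSketch original) {
        try {
            StringWriter buffer = new StringWriter();
            original.writeData(buffer);
            LexiconIoSketch copy = new LexiconIoSketch();
            copy.readData(new BufferedReader(new StringReader(buffer.toString())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```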

Stanford NLP Group
