edu.stanford.nlp.trees.international.pennchinese
Class ChineseEnglishWordMap

java.lang.Object
  extended by edu.stanford.nlp.trees.international.pennchinese.ChineseEnglishWordMap

public class ChineseEnglishWordMap
extends Object

A class for mapping Chinese words to English. Uses the free CEDict lexicon.

Author:
Galen Andrew

Constructor Summary
ChineseEnglishWordMap()
          Make a ChineseEnglishWordMap with the default CEDict path ("cedict_ts.u8")
ChineseEnglishWordMap(String dictPath)
          Make a ChineseEnglishWordMap from the CEDict file at the given path.
ChineseEnglishWordMap(String dictPath, String pattern, String delimiter, String charset)
           
 
Method Summary
 void addMap(Map<String,Set<String>> addM)
          Add all of the mappings from the specified map to the current map.
 boolean containsKey(String key)
          Does the word exist in the dictionary?
 Set<String> getAllTranslations(String key)
           
 String getFirstTranslation(String key)
           
 Map<String,Set<String>> getReverseMap()
          Returns a reversed map of the current map.
static ChineseEnglishWordMap getStaticMap()
          A method for getting the one static copy of the map.
static void main(String[] args)
          The main method reads (segmented, whitespace delimited) words from a file and prints them with their English translation(s).
 void readCEDict(String dictPath)
           
 void readCEDict(String dictPath, String pattern, String delimiter, String charset)
           
 int size()
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ChineseEnglishWordMap

public ChineseEnglishWordMap()
Make a ChineseEnglishWordMap with the default CEDict path ("cedict_ts.u8")


ChineseEnglishWordMap

public ChineseEnglishWordMap(String dictPath)
Make a ChineseEnglishWordMap from the CEDict file at the given path.

Parameters:
dictPath - the path/filename of the CEDict

ChineseEnglishWordMap

public ChineseEnglishWordMap(String dictPath,
                             String pattern,
                             String delimiter,
                             String charset)
Method Detail

getStaticMap

public static ChineseEnglishWordMap getStaticMap()
A method for getting the one static copy of the map.

Returns:
the static copy of ChineseEnglishWordMap

containsKey

public boolean containsKey(String key)
Does the word exist in the dictionary?


getAllTranslations

public Set<String> getAllTranslations(String key)
Parameters:
key - a Chinese word
Returns:
the English translations as a set (null if not in dictionary)

getFirstTranslation

public String getFirstTranslation(String key)
Parameters:
key - a Chinese word
Returns:
the first English translation (null if not in dictionary)
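
The difference between the two lookup methods can be illustrated with a small stand-in class. This is only a sketch: TranslationLookupSketch and its put helper are hypothetical and not part of the library, and it assumes that "first" means the first translation in insertion order and that both lookups return null for an unknown word.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the word map; not the Stanford implementation.
class TranslationLookupSketch {
    private final Map<String, Set<String>> map = new HashMap<>();

    // Hypothetical helper: store translations in the order given.
    void put(String word, String... translations) {
        map.put(word, new LinkedHashSet<>(Arrays.asList(translations)));
    }

    // All translations for the word, or null if the word is unknown.
    Set<String> getAllTranslations(String key) {
        return map.get(key);
    }

    // First translation in insertion order, or null if the word is unknown.
    String getFirstTranslation(String key) {
        Set<String> all = map.get(key);
        return (all == null || all.isEmpty()) ? null : all.iterator().next();
    }

    public static void main(String[] args) {
        TranslationLookupSketch m = new TranslationLookupSketch();
        m.put("\u4F60\u597D", "hello", "hi");
        System.out.println(m.getFirstTranslation("\u4F60\u597D"));
        System.out.println(m.getAllTranslations("\u4F60\u597D"));
        System.out.println(m.getFirstTranslation("unknown"));
    }
}
```

Callers of either method should check for null before using the result.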

readCEDict

public void readCEDict(String dictPath)

readCEDict

public void readCEDict(String dictPath,
                       String pattern,
                       String delimiter,
                       String charset)
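
The roles of pattern and delimiter are not documented on this page. The sketch below shows one plausible reading, assuming that pattern is a regular expression matched against each dictionary line and that delimiter separates the English glosses within the matched block. The class name, the regular expression, and the sample line are all illustrative assumptions, not the library's defaults.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one CEDict-style line; not the Stanford implementation.
class CedictLineSketch {
    // Assumed line shape: headword, optional second headword, [pinyin], /gloss/gloss/
    // Group 1 captures the first headword, group 2 the glosses between the outer slashes.
    static final Pattern LINE =
        Pattern.compile("^(\\S+)\\s+(?:\\S+\\s+)?\\[[^\\]]*\\]\\s+/(.*)/\\s*$");

    // Returns (headword, glosses), or null when the line does not match (e.g. a comment).
    static Map.Entry<String, Set<String>> parseLine(String line, String delimiter) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Set<String> glosses = new LinkedHashSet<>();
        // Quote the delimiter so it is treated literally rather than as a regex.
        for (String gloss : m.group(2).split(Pattern.quote(delimiter))) {
            String trimmed = gloss.trim();
            if (!trimmed.isEmpty()) {
                glosses.add(trimmed);
            }
        }
        return new SimpleEntry<>(m.group(1), glosses);
    }

    public static void main(String[] args) {
        String line = "\u4F60\u597D \u4F60\u597D [ni3 hao3] /hello/hi/";
        System.out.println(parseLine(line, "/").getValue());
    }
}
```

When reading a real dictionary file, the charset argument would presumably be passed to the reader that produces these lines.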

getReverseMap

public Map<String,Set<String>> getReverseMap()
Returns a reversed map of the current map.
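
This page does not spell out what "reversed" means. A minimal sketch, assuming the reversed map sends each English translation to the set of Chinese words that have it as a translation, could look like the following. ReverseMapSketch and reverse are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of reversing a word-to-translations map.
class ReverseMapSketch {
    static Map<String, Set<String>> reverse(Map<String, Set<String>> forward) {
        Map<String, Set<String>> reversed = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : forward.entrySet()) {
            for (String translation : entry.getValue()) {
                // Each translation collects every source word that maps to it.
                reversed.computeIfAbsent(translation, k -> new HashSet<>())
                        .add(entry.getKey());
            }
        }
        return reversed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> forward = new HashMap<>();
        forward.put("w1", new HashSet<>(Set.of("hello", "hi")));
        forward.put("w2", new HashSet<>(Set.of("hello")));
        System.out.println(reverse(forward).get("hello").size()); // two source words
    }
}
```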


addMap

public void addMap(Map<String,Set<String>> addM)
Add all of the mappings from the specified map to the current map.
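
The description does not say what happens when a key is already present. The sketch below assumes the translation sets are unioned rather than replaced; that behavior, along with the names AddMapSketch and addAll, is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of merging one word map into another.
class AddMapSketch {
    static void addAll(Map<String, Set<String>> target, Map<String, Set<String>> addM) {
        for (Map.Entry<String, Set<String>> entry : addM.entrySet()) {
            // Create the set on first sight of a key, then union in the new translations.
            target.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>())
                  .addAll(entry.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> target = new HashMap<>();
        target.put("w1", new LinkedHashSet<>(Set.of("hello")));
        Map<String, Set<String>> extra = new HashMap<>();
        extra.put("w1", new LinkedHashSet<>(Set.of("hi")));
        extra.put("w2", new LinkedHashSet<>(Set.of("thanks")));
        addAll(target, extra);
        System.out.println(target.get("w1").size()); // existing and added translation
    }
}
```

Copying into a fresh set for new keys keeps the target independent of later changes to the sets in addM.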


toString

public String toString()
Overrides:
toString in class Object

size

public int size()

main

public static void main(String[] args)
                 throws IOException
The main method reads (segmented, whitespace delimited) words from a file and prints them with their English translation(s). The path and filename of the CEDict Lexicon can be supplied via the "-dictPath" flag; otherwise the default filename "cedict_ts.u8" in the current directory is checked. By default, only the first translation is printed. If the "-all" flag is given, all translations are printed. The input and output encoding can be specified using the "-encoding" flag. Otherwise UTF-8 is assumed.

Throws:
IOException


Stanford NLP Group