Table of harshness scores of Enron emails generated by OASYS: harshness.enron (36MB).
Each row has a filename (followed by a colon), a score computed by the DWTF algorithm, a score by the Template_No_Topic algorithm, a score by the TF_No_Topic algorithm, and a hybrid score computed from the preceding three scores. Personally, I doubt the accuracy of the hybrid scores, but based on my spot check I more or less trust the TF_No_Topic scores. (Reference: Cesarano, Bonnie Dorr, Antonio Picariello, Diego Reforgiato, Amelia Sagoff, V.S. Subrahmanian (2006), OASYS: An Opinion Analysis System. AAAI-CAAW 2006, Palo Alto, CA.)
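
For anyone reading the table programmatically, the row layout described above could be parsed roughly as follows. This is only a sketch: it assumes the four scores are whitespace-separated decimal numbers after the colon, which the description does not confirm, and the names HarshnessRow and parse_row are made up for illustration.

```python
from typing import NamedTuple


class HarshnessRow(NamedTuple):
    """One row of the harshness table (field names are illustrative)."""
    filename: str
    dwtf: float
    template_no_topic: float
    tf_no_topic: float
    hybrid: float


def parse_row(line: str) -> HarshnessRow:
    """Parse 'filename: s1 s2 s3 s4' into a HarshnessRow.

    Assumption: scores are whitespace-separated floats after the colon.
    """
    # Split on the LAST colon so a colon inside the filename is tolerated.
    filename, sep, rest = line.rpartition(":")
    if not sep:
        raise ValueError(f"no colon found in row: {line!r}")
    scores = [float(token) for token in rest.split()]
    if len(scores) != 4:
        raise ValueError(f"expected 4 scores, got {len(scores)}: {line!r}")
    return HarshnessRow(filename.strip(), *scores)
```

Given the doubts about the hybrid column, a reader would probably want to work from the tf_no_topic field of each parsed row. If the real file uses tabs or commas between scores, only the rest.split() call needs to change.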


Table of <filename messageID> and table of generated from Enron email corpus.


Lists mapping from email addresses to mentioned names. These are extracted from NameSearchAddress.out, filtered down ONLY to cases in which there is a single unique email address for the mentioned name (i.e., not zero, and not more than one).
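
The filtering rule above (keep a mentioned name only when it resolves to exactly one email address) can be sketched as follows. The format of NameSearchAddress.out is not described here, so the sketch assumes the candidates have already been read into (name, address) pairs. The function name and the sample addresses are hypothetical.

```python
from collections import defaultdict


def unique_address_names(pairs):
    """Map each email address to the mentioned names that resolve ONLY to it.

    `pairs` is an iterable of (mentioned_name, email_address) tuples;
    this input shape is an assumption, not the actual file format.
    """
    # Collect the distinct addresses seen for each mentioned name.
    addresses_by_name = defaultdict(set)
    for name, address in pairs:
        addresses_by_name[name].add(address)

    # Keep a name only if it has exactly one distinct address.
    names_by_address = defaultdict(list)
    for name, addresses in addresses_by_name.items():
        if len(addresses) == 1:
            (address,) = addresses
            names_by_address[address].append(name)
    return dict(names_by_address)


# Made-up sample data, for illustration only.
sample = [
    ("Ken", "ken@example.com"),
    ("Ken", "ken@example.com"),      # repeated pair, still one unique address
    ("Jeff", "jeff@example.com"),
    ("Jeff", "jeff2@example.com"),   # two addresses, so "Jeff" is dropped
    ("Kenneth", "ken@example.com"),
]
mapping = unique_address_names(sample)
```

Names with zero candidate addresses never appear in the pairs at all, so the "not zero" condition holds automatically under this input shape.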

Enron email corpus annotated by LingPipe: Annotated Enron corpus (496M).

Yejun Wu (wuyj AT glue DOT umd DOT edu)
3/31/06