Rapidly Retargetable Translingual Detection

Organization: University of Maryland / Johns Hopkins University

Principal Investigator: Douglas W. Oard (Maryland) / William J. Byrne (Hopkins)


The objective of this project is to rapidly create usable systems for translingual document detection that can be employed by analysts who are fluent in English to detect potentially important documents that are written in other languages.


This objective will be met by developing a core set of technologies for automatically extracting translation knowledge from naturally occurring resources and for using that knowledge in translingual detection applications. The extraction effort is focused on three types of naturally occurring resources:
  1. printed bilingual dictionaries that can be rapidly scanned,
  2. translation-equivalent Web pages that can be automatically detected in large collections, and
  3. topically related collections of "comparable" monolingual texts in each language that can be assembled automatically.
These sources have complementary strengths and limitations; by exploiting all three, it will be possible to rapidly assemble relatively comprehensive translation lexicons. The resulting lexicons will be used with fully automatic and semi-automatic (interactive) translingual detection techniques that are tuned to the characteristics of those sources of translation knowledge.
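
As a rough illustration of how lexicons drawn from the three kinds of resources might be combined and then used to translate an English query, consider the minimal Python sketch below. It is not the project's implementation: the function names (merge_lexicons, translate_query), the per-source weights, and the toy English-to-French entries are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def merge_lexicons(sources, weights):
    """Combine per-source lexicons into one lexicon with normalized weights.

    sources: {source_name: {english_term: [translation, ...]}}
    weights: {source_name: float}, an assumed measure of trust in each source.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for name, lexicon in sources.items():
        for term, translations in lexicon.items():
            for translation in translations:
                # A translation attested by several sources accumulates weight.
                scores[term][translation] += weights[name]

    merged = {}
    for term, candidates in scores.items():
        total = sum(candidates.values())
        merged[term] = {t: s / total for t, s in candidates.items()}
    return merged


def translate_query(query_terms, lexicon):
    """Map English query terms to weighted terms in the document language."""
    translated = {}
    for term in query_terms:
        # Terms missing from the lexicon are simply skipped in this sketch.
        for translation, weight in lexicon.get(term, {}).items():
            translated[translation] = translated.get(translation, 0.0) + weight
    return translated


# Hypothetical toy data; real lexicons would be extracted automatically.
sources = {
    "dictionary": {"bank": ["banque", "rive"]},
    "web_pages": {"bank": ["banque"]},
    "comparable": {"bank": ["banque"], "loan": ["emprunt"]},
}
weights = {"dictionary": 1.0, "web_pages": 1.0, "comparable": 0.5}

lexicon = merge_lexicons(sources, weights)
query = translate_query(["bank", "loan"], lexicon)
```

In this sketch, a translation supported by more than one source ends up with a larger share of the weight for its term, which is one simple way to benefit from the complementary coverage of the three resources. The weighted terms returned by translate_query could then be passed to a retrieval system that searches the foreign-language collection.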

Recent Accomplishments:

Current Plan:

Technology Transition:

Updated by Daqing He / UMIACS / UMD / on