WETCAT – Web-Enabled Translation using Corpus-based Acquisition of Transfer Rules
At the University of Vienna, Werner Winiwarter is building a prototype of a Web-based Japanese-English machine translation system called WETCAT. A main feature of the system is that the translation engine uses no handcrafted transfer rules. All transfer knowledge is learned automatically from Japanese-English translation examples through structural matching between the parse trees. The acquisition process also includes a consolidation step in which rules are generalized to increase coverage of unseen data, as long as this does not result in conflicts with other existing rules. The training data is the JENAAD corpus, which contains 150,000 sentence pairs taken from news articles.
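The consolidation step described above can be illustrated with a toy sketch. It is written in Python for readability and is not WETCAT's actual Prolog code. The rule format (a word, an optional context, and a translation), the function name `consolidate`, and the romanized example words are all assumptions made for this illustration. The rules WETCAT learns operate on parse trees and are considerably richer.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical, highly simplified transfer rule."""
    word: str               # source-language word
    context: Optional[str]  # required context word; None means "any context"
    target: str             # target-language translation


def consolidate(rules):
    """Generalize each rule to any context unless that would conflict.

    A conflict exists when another rule for the same source word
    produces a different translation.
    """
    result = []
    for rule in rules:
        conflict = any(
            other.word == rule.word and other.target != rule.target
            for other in rules
        )
        result.append(rule if conflict else rule._replace(context=None))
    # Generalization can create duplicates; drop them but keep the order.
    return list(dict.fromkeys(result))


# Invented example data: "hashi" is ambiguous, "neko" is not.
example_rules = [
    Rule("hashi", "wataru", "bridge"),
    Rule("hashi", "taberu", "chopsticks"),
    Rule("neko", "naku", "cat"),
    Rule("neko", "neru", "cat"),
]
consolidated = consolidate(example_rules)
```

In this sketch the two "hashi" rules keep their contexts because generalizing either one would clash with the other, while the two "neko" rules collapse into a single context-free rule.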
Through the Web interface, you can not only translate Japanese sentences but also receive information about lexical, syntactic, and transfer data. This also makes WETCAT a very useful tool for language students. The system can be easily customized: if you don't agree with a translation, you can simply correct it in the Web interface and update your personal transfer rule base.
The system is implemented in Amzi! Prolog, and the Web application is built with the Amzi! Logic Server CGI Interface. After experimenting with many other software development tools, Amzi! Prolog was chosen as the perfect environment because it offers an expressive declarative programming language within the Eclipse Platform, powerful unification operations for the efficient application of the transfer rules, and full Unicode support for Japanese characters.
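To give a rough idea of why unification suits the application of transfer rules, the following Python sketch implements one-way pattern matching, a simplified relative of Prolog's built-in unification. The tuple-based tree encoding, the `?`-prefixed variable convention, the function names, and the example rule are assumptions made for illustration and are not taken from WETCAT.

```python
def match(pattern, term, bindings=None):
    """Match a pattern against a term.

    Strings starting with '?' are variables. Returns a dict of variable
    bindings on success, or None if the pattern does not match.
    """
    if bindings is None:
        bindings = {}
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            # A variable seen before must be bound to the same subterm.
            return bindings if bindings[pattern] == term else None
        return {**bindings, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        if len(pattern) != len(term):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            bindings = match(sub_pattern, sub_term, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == term else None


def substitute(template, bindings):
    """Replace every variable in the template with its bound value."""
    if isinstance(template, tuple):
        return tuple(substitute(part, bindings) for part in template)
    return bindings.get(template, template)


def apply_rule(rule, tree):
    """Apply a (source pattern, target template) rule, or return None."""
    source, target = rule
    bindings = match(source, tree)
    return None if bindings is None else substitute(target, bindings)


# Hypothetical rule: reorder object-verb into verb-object and translate the verb.
rule = (
    ("vp", ("obj", "?o"), ("verb", "taberu")),
    ("vp", ("verb", "eat"), ("obj", "?o")),
)
tree = ("vp", ("obj", "sushi"), ("verb", "taberu"))
translated = apply_rule(rule, tree)
# translated == ("vp", ("verb", "eat"), ("obj", "sushi"))
```

In Prolog, the matching and substitution steps come for free when a rule head is unified with a term, which is presumably part of what makes the language a good fit here.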
A demo version of WETCAT will be publicly available in the near future.
Additional information:
- Example with screenshots of using WETCAT
- ATR Presentation (PowerPoint)
- W. Winiwarter. Incremental Learning of Transfer Rules for Customized Machine Translation. D. Seipel, M. Hanus, U. Geske, O. Bartenstein (eds.). Applications of Declarative Programming and Knowledge Management. Lecture Notes in Artificial Intelligence, Vol. 3392, Berlin, © Springer-Verlag, 2005
- W. Winiwarter. Automatic Acquisition of Transfer Rules from Translation Examples. Proc. of España for Natural Language Processing, Lecture Notes in Artificial Intelligence, Berlin, © Springer-Verlag, 2004
- W. Winiwarter. PETRA – the Personal Embedded Translation and Reading Assistant. Proc. of the InSTIL/ICALL 2004 Symposium on NLP and Speech Technologies in Advanced Language Learning Systems, Padova, UNIPRESS, 2004
More recent publications about the system will be added here soon.
Werner Winiwarter
Department of Scientific Computing, University of Vienna