Post-Conference Tutorials: Thursday, November 4, 2010

Note: All tutorials are half-day and include a 30-minute break in the middle. Detailed descriptions follow.

T5: Statistical Machine Translation with Open Source Software, Philipp Koehn and Hieu Hoang

This tutorial will introduce open source tools for statistical machine translation, such as the Moses toolkit, with an emphasis on their functionality. It is a hands-on tutorial geared more towards using the tools than towards describing the details of the underlying methods. The tutorial will cover the following topics: overview of the training and tuning pipeline; working with an experimental management system; analyzing the output; word alignment toolkits; input modalities: words, confusion networks, lattices; output modalities: 1-best, n-best, search graph; translation models: phrasal, hierarchical, syntactified; language models: binarization, quantization, randomization; decision rules: MAP, MBR, lattice MBR, consensus; trade-offs between speed and quality; trade-offs between speed and memory efficiency; incremental updating of the translation model; adding features to the decoder.


Philipp Koehn, University of Edinburgh
Address: 10 Crichton Street, Edinburgh, EH8 9AB

Dr Koehn's research interests include machine translation and its applications, as well as large-scale natural language learning. He has been a lecturer at the University of Edinburgh since 2005, after spending a year as a post-doc at MIT and receiving his PhD from the University of Southern California in 2003. He has served as area chair for machine translation at major conferences (ACL, NAACL, MT Summit) and is known for his efforts to foster open source resources for machine translation, such as the Moses decoder and the Europarl corpus.

Hieu Hoang, University of Edinburgh
Address: 10 Crichton Street, Edinburgh, EH8 9AB

Hieu Hoang is a graduate student at the University of Edinburgh working on improving machine translation by adding linguistic information. He is a founding member of the team that created the Moses toolkit and continues to be a major contributor to and maintainer of the system. He has degrees in Computer Science and Machine Learning (London) and has over 10 years of experience as a software developer. He is expected to receive his PhD in 2010.



Last updated: August 1, 2010