tSearch: Leveraging the Automatic Analysis of Machine Translation Evaluation in Fast and Flexible Error Analysis
Omega-S208 Campus Nord - UPC
Tue Oct 01, 2013
The development cycle of machine translation (MT) systems includes important stages such as evaluation, error analysis, and system refinement. These stages have prompted several research studies and tools aimed at supporting the work of MT developers.
This master's thesis addresses the error analysis stage. We have developed tSearch, an open web-based application that aims to facilitate the qualitative analysis of translation quality. Until now, MT developers have usually had to face the tedious and time-consuming task of inspecting a large set of translations almost entirely by hand. Some existing tools focus on classifying error types. In contrast, tSearch takes a different approach: it provides mechanisms for running complex searches over a collection of translation cases evaluated with a large set of diverse translation evaluation measures, which makes it easier to discover patterns that may help MT developers improve translation quality. The error analysis can thus be carried out over a specific subset of translation examples, which makes the work more efficient and feasible.
To this end, tSearch builds on Asiya, an open toolkit for automatic machine translation evaluation that provides a large set of evaluation measures. tSearch uses the Asiya evaluation results to process user queries. The search engine offers a rich and flexible query language that allows users to find translation examples matching a combination of numerical and structural features. Furthermore, its database design permits fast response times for all supported queries on realistic-size testbeds. We also offer a friendly, easy-to-use web interface that provides interactive access to the query results.
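As a rough illustration of what combining numerical and structural features in a query can mean, the following Python sketch filters a toy collection of translation cases by metric score and by part-of-speech pattern. It is not the tSearch query language or its database design, neither of which is specified here. The names `TranslationCase`, `score_below`, `score_above`, `has_pos_sequence`, and `search` are hypothetical, and the scores and tags are invented sample data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TranslationCase:
    """One evaluated translation (hypothetical structure, for illustration only)."""
    segment_id: int
    system: str
    scores: Dict[str, float] = field(default_factory=dict)   # metric name -> score
    pos_tags: List[str] = field(default_factory=list)        # structural annotation


Predicate = Callable[[TranslationCase], bool]


def score_below(metric: str, threshold: float) -> Predicate:
    """Numerical feature: the metric score is strictly below the threshold."""
    return lambda case: case.scores.get(metric, float("inf")) < threshold


def score_above(metric: str, threshold: float) -> Predicate:
    """Numerical feature: the metric score is strictly above the threshold."""
    return lambda case: case.scores.get(metric, float("-inf")) > threshold


def has_pos_sequence(*tags: str) -> Predicate:
    """Structural feature: the translation contains this contiguous tag sequence."""
    def predicate(case: TranslationCase) -> bool:
        n = len(tags)
        return any(
            tuple(case.pos_tags[i:i + n]) == tags
            for i in range(len(case.pos_tags) - n + 1)
        )
    return predicate


def search(cases: List[TranslationCase], *predicates: Predicate) -> List[TranslationCase]:
    """Return the cases that satisfy every predicate (logical AND)."""
    return [case for case in cases if all(p(case) for p in predicates)]


# Invented sample data; the values do not come from any real evaluation.
CASES = [
    TranslationCase(1, "sys_a", {"BLEU": 0.21, "METEOR": 0.45}, ["DT", "NN", "VBZ", "JJ"]),
    TranslationCase(2, "sys_a", {"BLEU": 0.62, "METEOR": 0.70}, ["DT", "NN", "VBD"]),
    TranslationCase(3, "sys_b", {"BLEU": 0.18, "METEOR": 0.30}, ["PRP", "VBP", "RB"]),
]

# Low-scoring translations that contain a determiner followed by a noun.
hits = search(CASES, score_below("BLEU", 0.3), has_pos_sequence("DT", "NN"))
print([case.segment_id for case in hits])
```

Expressing each condition as a small predicate keeps numerical and structural constraints composable, so a developer could narrow the error analysis to a specific subset of examples by adding or removing conditions.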