WMT 2016 Test Sets

These are the test sets for the WMT shared translation task. They are small parallel data sets used for testing MT systems, and are typically created by translating a selection of crawled articles from online news sites. The core language pairs are German-English and Czech-English; other guest language pairs are introduced each year. The guest language pair for 2016 was Romanian-English. We also included Russian, Turkish, Chinese, Estonian and Kazakh with funding from other sources, as well as Finnish in 2016. The source data carry the licensing conditions of the news sites they were crawled from.
