The human evaluation (HE) dataset created for the English-to-German (EnDe) and Vietnamese-to-English (ViEn) MT tasks is a subset of the official test set of the IWSLT 2015 evaluation campaign. The resulting HE sets comprise 600 segments for EnDe and 500 segments for ViEn, each set corresponding to around 10,000 words. Human evaluation was based on post-editing, i.e. the manual correction of the MT system output, carried out by professional translators. Five primary runs submitted to the evaluation campaign were post-edited for each of the two tasks.
Data are publicly available through the WIT3 website (wit3.fbk.eu). The release contains 600 segments for EnDe and 500 segments for ViEn (about 10K tokens per set), with 5 different automatic translations of each set post-edited by professional translators. The data are intended for analysis of MT quality and for quality estimation components.
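As an illustration of how post-edited data of this kind might be used for MT quality analysis, the sketch below computes a word-level edit rate between an MT output and its post-edit. This is an assumption for illustration only: the description above does not state which metric was applied, the function names are hypothetical, and the measure is a simplified edit rate without the block shifts used by TER-style metrics.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance between two token lists (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output, post_edit):
    """Edits needed to turn the MT output into its post-edit, per post-edit word.

    Hypothetical helper: whitespace tokenization, no shift operations.
    """
    hyp = mt_output.split()
    ref = post_edit.split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    return word_edit_distance(hyp, ref) / len(ref)


# Made-up example segment, not taken from the dataset.
rate = post_edit_rate("the cat sat on mat", "the cat sat on the mat")
print(f"{rate:.3f}")
```

A lower rate means the translator changed less of the system output; averaging the rate over the segments of one run would give a rough per-system figure.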