IWSLT 2015 Human Post-Editing data – META-SHARE

Views: 95 (last view: 2026-01-06)

Updates: 4 (last update: 2018-03-19)

Downloads: 10 (last download: 2025-12-11)

IWSLT 2015 Human Post-Editing data

https://wit3.fbk.eu/show.php?release=2015-01&page=subjeval&texthead=Human%20evaluation%20data

The human evaluation (HE) datasets created for the English-to-German (EnDe) and Vietnamese-to-English (ViEn) MT tasks are subsets of the official test set of the IWSLT 2015 evaluation campaign. The resulting HE sets comprise 600 segments for EnDe and 500 segments for ViEn, each corresponding to around 10,000 words. Human evaluation was based on post-editing, i.e. the manual correction of the MT system output, carried out by professional translators. Five primary runs submitted to the evaluation campaign were post-edited for each of the two tasks.
The data are publicly available through the WIT3 website (wit3.fbk.eu). Each set contains five different automatic translations post-edited by professional translators, and is intended for the analysis of MT quality and for Quality Estimation components.
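
As an illustration of how post-edits can support the analysis of MT quality, the sketch below measures how much editing an MT output needed to reach its post-edited version. It is a simplified, word-level edit rate (insertions, deletions and substitutions only, with no shift operation), not the metric or tooling used by the IWSLT organisers. The function names and the sentence pair are hypothetical and are not taken from the dataset.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Levenshtein distance over tokens (insert, delete, substitute)."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_tok in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, ref_tok in enumerate(ref_tokens, start=1):
            cost = 0 if hyp_tok == ref_tok else 1
            curr.append(min(
                prev[j] + 1,         # delete the hypothesis token
                curr[j - 1] + 1,     # insert the reference token
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def edit_rate(mt_output, post_edit):
    """Edits needed to turn mt_output into post_edit, per post-edit token."""
    hyp = mt_output.split()  # assumption: whitespace tokenisation
    ref = post_edit.split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    return word_edit_distance(hyp, ref) / len(ref)


# Hypothetical MT output and post-edit, for illustration only.
mt = "the cat sat in mat"
pe = "the cat sat on the mat"
print(f"edit rate: {edit_rate(mt, pe):.3f}")
```

A lower value means the translator changed less of the MT output. Comparing this rate across the five post-edited runs of a task would give a rough ranking of the systems under these simplifying assumptions.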

Distribution

Availability

Available - Restricted Use

Licence

CC - BY

Distribution Access/Medium: Downloadable

Bilingual text corpus

Languages

Vietnamese, English

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Size

10,000 Tokens

500 segments

Bilingual text corpus

Languages

English, German

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Size

10,000 Tokens

600 segments

Metadata

Created: 13/12/2017

Last Updated: 19/03/2018

Usage

Foreseen Use - Nlp Applications

Use NLP Specific: Machine Translation

Actual Use - Nlp Applications

Use NLP Specific: Machine Translation
