Europarl-QTLeap WSD/NED corpus – META-SHARE

Last view: 2026-07-07

Europarl-QTLeap WSD/NED corpus

The texts are sentences from the Europarl parallel corpus (Koehn, 2005). They
contain the monolingual sentences from the parallel corpora for the following
pairs: Bulgarian-English, Czech-English, Portuguese-English and
Spanish-English. The English corpus consists of the English side of the
Spanish-English corpus.
Because Basque is not in Europarl, the corpus also contains the Basque and
English sides of the GNOME corpus (Tiedemann, 2012).
The texts have been automatically annotated with NLP tools, including Word
Sense Disambiguation, Named Entity Disambiguation and Coreference
Resolution.

Distribution

Availability

Available - Unrestricted Use

Licence

CC BY-NC-SA

Contact Person

António Branco

text

Multilingual text corpus

Languages

Czech, Bulgarian, Basque, Spanish (Castilian), English, Portuguese

Linguality

Linguality type: Multilingual

Multi-linguality type: Parallel

Size

12 GB

Metadata

Created: 13/05/2016

Last Updated: 13/05/2016

Metadata Creator

Lyubomir Zlatkov
