OSW Polish-English parallel corpus (CC-BY-NC)

PELCRA-PAR-4

Pęzik P., Ogrodniczuk M., Przepiórkowski A. (2011). Parallel and spoken corpora in an open repository of Polish language resources. Human Language Technologies as a Challenge for Computer Science and Linguistics. LTC Poznań 2011.

ID: 509

A subset of the PELCRA Polish parallel corpora released under the CC-BY-NC license. This resource contains 757 Polish-English texts from the Centre for Eastern Studies (OSW) website. The texts are sentence-aligned with the mAligna aligner, which uses the Gale & Church length-based alignment algorithm. The texts are provided both as TEI P5-compliant XML files with custom PELCRA extensions and as XLIFF files.
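As a rough illustration of how the sentence-aligned XLIFF files might be consumed, the following Python sketch extracts source/target sentence pairs from `trans-unit` elements. It assumes the XLIFF 1.2 layout (`trans-unit` containing `source` and `target`); the actual XLIFF version, the language direction, and any PELCRA-specific attributes in the distributed files are not stated here, so the sample document and the function name `read_aligned_pairs` are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def read_aligned_pairs(xliff_text):
    """Yield (source, target) text pairs from XLIFF trans-unit elements.

    Matching is done on local tag names, so the function does not depend
    on which XLIFF namespace URI the file declares (an assumption made
    because the corpus documentation does not name the XLIFF version).
    """
    root = ET.fromstring(xliff_text)
    for unit in root.iter():
        if not isinstance(unit.tag, str) or local_name(unit.tag) != "trans-unit":
            continue
        source = target = None
        for child in unit:
            name = local_name(child.tag)
            if name == "source":
                source = "".join(child.itertext()).strip()
            elif name == "target":
                target = "".join(child.itertext()).strip()
        # Skip units that lack either side of the alignment.
        if source is not None and target is not None:
            yield source, target


# Hypothetical sample, not taken from the corpus itself.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="sample" source-language="pl" target-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>To jest przykład.</source>
        <target>This is an example.</target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""

pairs = list(read_aligned_pairs(SAMPLE))
for pl, en in pairs:
    print(pl, "|", en)
```

For a real file, the same function could be called on the file's contents after reading it as text; whether Polish is the source or the target side should be checked against the `source-language` attribute of the downloaded files.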


  • xmllint