English-Persian parallel Corpus



http://catalog.elra.info/product_info.php?products_id=1111

ID:

ELRA-W0051

The corpus consists of about 3,500,000 English and Persian (Farsi) words aligned at sentence level (about 100,000 sentences, distributed over 50,021 entries). The files are Unicode-encoded. The corpus was originally created with SQL Server but is distributed as an Access file. The texts cover a variety of text types, distributed as follows:
- Art: 1804 entries (3.61%)
- Culture: 5097 entries (10.19%)
- Idiom: 435 entries (0.87%)
- Law: 2266 entries (4.53%)
- Literature: 11470 entries (22.93%)
- Medicine: 1089 entries (2.18%)
- Others: 16989 entries (33.96%)
- Poetry: 692 entries (1.38%)
- Politics: 5493 entries (10.98%)
- Proverb: 292 entries (0.58%)
- Religion: 686 entries (1.37%)
- Science: 3708 entries (7.41%)
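
The breakdown above can be checked arithmetically. The following is a minimal sketch, not part of the resource itself: the names `ENTRIES`, `STATED_TOTAL`, and `shares` are illustrative, and the counts are copied by hand from the list. It confirms that the per-type counts add up to the stated 50,021 entries and recomputes each share to two decimal places.

```python
# Entry counts per text type, copied from the catalogue description.
ENTRIES = {
    "Art": 1804,
    "Culture": 5097,
    "Idiom": 435,
    "Law": 2266,
    "Literature": 11470,
    "Medicine": 1089,
    "Others": 16989,
    "Poetry": 692,
    "Politics": 5493,
    "Proverb": 292,
    "Religion": 686,
    "Science": 3708,
}

# Total number of entries stated in the description.
STATED_TOTAL = 50_021


def shares(counts):
    """Return each category's share of the total, in percent, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# The per-type counts should account for every entry in the corpus.
assert sum(ENTRIES.values()) == STATED_TOTAL

for name, pct in shares(ENTRIES).items():
    print(f"- {name}: {ENTRIES[name]} entries ({pct:.2f}%)")
```

Run as a script, this prints the list in the same layout as the description, which makes any mismatch between a count and its percentage easy to spot.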



Distribution

Availability

Available - Restricted Use

Start date: 07/07/2009

Licence

- ELRA END USER. Restrictions: Academic - Non Commercial Use. For Non Members of ELRA. User Nature: Commercial
- ELRA VAR. Restrictions: Commercial Use. For Members of ELRA. User Nature: Commercial
- ELRA END USER. Restrictions: Academic - Non Commercial Use. For Members of ELRA. User Nature: Commercial
- ELRA VAR. Restrictions: Commercial Use. For Members of ELRA. User Nature: Academic
- ELRA END USER. Restrictions: Academic - Non Commercial Use. For Members of ELRA. User Nature: Academic
- ELRA VAR. Restrictions: Commercial Use. For Non Members of ELRA. User Nature: Commercial
- ELRA VAR. Restrictions: Commercial Use. For Non Members of ELRA. User Nature: Academic
- ELRA END USER. Restrictions: Academic - Non Commercial Use. For Non Members of ELRA. User Nature: Academic

Contact Person

Mapelli Valérie

text

Bilingual text corpus

Languages

Persian, English

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Size

no size available

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 12/03/2010
