CRATER 2 Corpus



CRATER 2

http://catalog.elra.info/product_info.php?products_id=636

ID: ELRA-W0033

The CRATER corpus was built upon the foundations of an earlier project, ET10/63, which was funded in the final phase of the Eurotra programme. The Corpus Resources and Terminology Extraction project (MLAP-93 20) extended the bilingual annotated English-French International Telecommunications Union corpus produced within ET10/63 to include Spanish.
The CRATER 2 corpus was produced by the Department of Linguistics & Modern English Language, Lancaster University (United Kingdom) with funding from ELRA, which in turn was provided by the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335). This project enhanced the CRATER corpus, available under the reference ELRA-W0003 in the ELRA catalogue. CRATER 2 expands the English/French component of the parallel corpus from 1,000,000 to approximately 1,500,000 tokens per language.
The offer consists of 1,500,000 tokens each for English and French and 1,000,000 tokens for Spanish, with human-edited morphosyntactic annotations.
CRATER 2 (ref. ELRA-W0033) includes CRATER (ref. ELRA-W0003).


The CRATER corpus was built on the foundations of an earlier project, ET10/63, funded in the final phase of the Eurotra programme. The Corpus Resources and Terminology Extraction project (MLAP-93 20) extended to Spanish the annotated bilingual English-French International Telecommunications Union corpus produced within ET10/63. The CRATER 2 corpus was produced by the Department of Linguistics and English Language at Lancaster University (UK), funded by ELRA as part of the European Commission's LRsP&P project (LE4-8335).

The CRATER resource, available under the reference ELRA W0003 in the language resources catalogue maintained by ELDA, was substantially enriched in this project. In CRATER 2, the number of tokens in the French-English part of the parallel corpus increases by 50%, for a total of 1,500,000 tokens per language.


Distribution

Availability

Available - Restricted Use

Start date: 31/01/2002

Licence

Licence         Restrictions                    For                   User Nature
ELRA END USER   Academic - Non Commercial Use   Non Members of ELRA   Commercial
ELRA VAR        Commercial Use                  Members of ELRA       Commercial
ELRA END USER   Academic - Non Commercial Use   Members of ELRA       Commercial
ELRA VAR        Commercial Use                  Members of ELRA       Academic
ELRA END USER   Academic - Non Commercial Use   Members of ELRA       Academic
ELRA VAR        Commercial Use                  Non Members of ELRA   Commercial
ELRA VAR        Commercial Use                  Non Members of ELRA   Academic
ELRA END USER   Academic - Non Commercial Use   Non Members of ELRA   Academic

Contact Person

Mapelli Valérie

text

Multilingual text corpus

Languages

English French Spanish

Variety: Castilian (Type: Dialect) (2 Gb)

Linguality

Linguality type: Multilingual

Size

no size available

Resource Creation

Funding Project

LRsP&P (Language Resources Production & Packaging - LE4-8335)

Funding Type: EU Funds

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 23/05/2012

Documentation

Samples Location: http://catalog.elra....

Document Type: Unpublished

SMP_W0033_EN001.pdf
