NE3L named entities Chinese corpus




http://catalog.elra.info/product_info.php?products_id=1227

ID: ELRA-W0079

The NE3L project (Named Entities 3 Languages) consisted of annotating corpora in several languages with named entities. The data, in text format, were extracted from newspapers and cover a variety of topics. Three languages were annotated: Arabic, Chinese and Russian.
For this project, five named entity categories were taken into account: Person, Place, Organisation, Time and Amount. Each language was annotated with only a subset of these categories: Arabic and Russian were marked up with Time and Amount tags, whereas Chinese was marked up with Person, Place and Organisation tags.
The Chinese corpus contains 79,302 words from articles published in 2001 in the newspaper “Le Monde Diplomatique”.




Distribution

Availability

Available - Restricted Use

Start date: 29/09/2014

Licence

ELRA VAR

Restrictions: Commercial Use

For Members of ELRA

Fee: 5,000.00

User Nature: Academic

ELRA END USER

Restrictions: Academic - Non Commercial Use

For Non Members of ELRA

Fee: 5,000.00

User Nature: Academic

ELRA VAR

Restrictions: Commercial Use

For Non Members of ELRA

Fee: 5,000.00

User Nature: Academic

ELRA END USER

Restrictions: Academic - Non Commercial Use

For Non Members of ELRA

Fee: 5,000.00

User Nature: Commercial

ELRA VAR

Restrictions: Commercial Use

For Non Members of ELRA

Fee: 5,000.00

User Nature: Commercial

ELRA END USER

Restrictions: Academic - Non Commercial Use

For Members of ELRA

Fee: 5,000.00

User Nature: Academic

ELRA END USER

Restrictions: Academic - Non Commercial Use

For Members of ELRA

Fee: 5,000.00

User Nature: Commercial

ELRA VAR

Restrictions: Commercial Use

For Members of ELRA

Fee: 5,000.00

User Nature: Commercial

Contact Person

Mapelli Valérie

text

Monolingual text corpus

Languages

Chinese

Linguality

Linguality type: Monolingual

Text Format

Plain text

Size

no size available

Resource Creation

Creation ended: 01/01/2014

Funding Project

NE3L

Funding Type: Other

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 29/09/2014
