PAROLE Italian Corpus

ID: ELRA-W0043

The PAROLE Italian Corpus comprises 3,135,651 words collected from four different domains:
• newspapers: 2,179,800 words from La Stampa, La Repubblica, Il Corriere della Sera, L’Unione Sarda, Il Sole 24ore, between 1992 and 1996,
• periodicals: 143,810 words from Casaviva, 100cose, Epoca, Espansione, Grazia, Panorama, Starbene, Storia Illustrata, Zerouno, between 1985 and 1988,
• books: 564,964 words, between 1970 and 1989,
• miscellaneous: 247,077 words from CNR documents, patents, maritime documents, and theater, between 1987 and 1997.

About 250,000 words were morphosyntactically annotated and lemmatized.

