ISLE Speech Corpus – META-SHARE

ISLE Speech Corpus

Resource name in French: Corpus de parole ISLE

http://catalog.elra.info/product_info.php?products_id=568

ID: ELRA-S0083

Approx. 20 minutes of speech (per speaker) from 23 German and 23 Italian intermediate learners of English. Each speaker recorded sentences from several blocks of differing types (reading simple sentences, using minimal pairs, giving answers to multiple choice questions). The prompts were of varying perplexities.

About 2/3 of the data for each speaker was annotated by one of a team of linguists. The files were corrected first at the word level, and an automatic recognizer was then used to produce phone-level annotations. The annotator then re-annotated each sentence to mark phone and stress errors (e.g., substitutions, insertions, or deletions).

Corpus details:
* a total of 46 speakers (23 German and 23 Italian)
* 11484 utterances
* 1.92 gigabytes of WAV files (4 CDs)
* 17 hours, 54 minutes, and 44 seconds of speech data
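
As a rough sizing aid, the totals listed above can be turned into per-speaker and per-utterance averages. The sketch below uses only the figures from the list; the variable names are illustrative, and it assumes the speech is spread evenly, which the record does not state.

```python
# Figures copied from the "Corpus details" list; names are illustrative.
SPEAKERS = 46
UTTERANCES = 11484
HOURS, MINUTES, SECONDS = 17, 54, 44

# Total duration of the speech data in seconds.
total_seconds = HOURS * 3600 + MINUTES * 60 + SECONDS

# Simple averages; the real distribution across speakers is not given.
minutes_per_speaker = total_seconds / SPEAKERS / 60
utterances_per_speaker = UTTERANCES / SPEAKERS
seconds_per_utterance = total_seconds / UTTERANCES

print(f"Total speech:            {total_seconds} s")
print(f"Average per speaker:     {minutes_per_speaker:.1f} min")
print(f"Utterances per speaker:  {utterances_per_speaker:.1f}")
print(f"Average utterance length: {seconds_per_utterance:.2f} s")
```

The per-speaker average comes out somewhat above the "approx. 20 minutes" quoted in the description. These are averages only and say nothing about how recording time varies between speakers.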

A much more detailed explanation of the ISLE corpus will be available in the proceedings of LREC 2000. An electronic copy of this paper may be obtained by sending an email to Dr. Wolfgang Menzel at <menzel@nats.informatik.uni-hamburg.de>.

W. Menzel, E. Atwell, P. Bonaventura, D. Herron, P. Howarth, R. Morton, and C. Souter. "The ISLE corpus of non-native spoken English", Proc. Second LREC.

Description from the French-language record:

This corpus comprises approximately 20 minutes of recordings (per speaker) from 23 German and 23 Italian speakers at an intermediate level of English learning. Each speaker recorded sentences from several blocks of different types (reading simple sentences, using minimal pairs, answering multiple-choice questions).

Nearly 2/3 of the data for each speaker was annotated by a team of linguists. The files were first corrected at the word level, and automatic recognition was then used to produce the phone-level annotations. A further annotation pass was then carried out to mark phone and stress errors (e.g., substitutions, insertions, or deletions).

Corpus contents:

* 46 speakers (23 German and 23 Italian)
* 11484 utterances
* 1.92 gigabytes of WAV files (4 CDs)
* 17 hours, 54 minutes, and 44 seconds of speech data

A more detailed description of the corpus will be available in the proceedings of LREC 2000. An electronic copy of the paper can be obtained by contacting ELRA directly (Reference: W. Menzel, E. Atwell, P. Bonaventura, D. Herron, P. Howarth, R. Morton, and C. Souter. "The ISLE corpus of non-native spoken English", Proc. Second LREC).

Distribution

Availability: Available - Restricted Use

Start date: 02/05/2000

Licence

* ELRA END USER. Restrictions: Academic - Non Commercial Use. For Non Members of ELRA. User Nature: Commercial
* ELRA END USER. Restrictions: Academic - Non Commercial Use. For Non Members of ELRA. User Nature: Academic
* ELRA END USER. Restrictions: Academic - Non Commercial Use. For Members of ELRA. User Nature: Academic
* ELRA END USER. Restrictions: Academic - Non Commercial Use. For Members of ELRA. User Nature: Commercial

Contact Person

Mapelli Valérie

audio

Monolingual audio corpus

Languages

English

Linguality

Linguality type: Monolingual

Size

no size available

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 24/01/2013
