The HIWIRE database, a noisy and non-native English speech corpus for cockpit communication – META-SHARE

Last view: 2026-03-18

The HIWIRE database, a noisy and non-native English speech corpus for cockpit communication

http://catalog.elra.info/product_info.php?products_id=1088

ID:

ELRA-S0293

This database was collected and packaged under the auspices of the IST-EU STREP project HIWIRE (Human Input that Works In Real Environments). It was designed as a tool for developing and testing speech processing and recognition techniques for robust non-native speech recognition.

The database contains 8,099 English utterances spoken by non-native speakers (31 French, 20 Greek, 20 Italian, and 10 Spanish speakers). The utterances correspond to human input in an aeronautical command-and-control application. The data were recorded in studio with a close-talking microphone, and real noise recorded in an airplane cockpit was then artificially added. The signals are provided in four conditions: clean (studio recordings with a close-talking microphone), low noise, mid noise and high noise. The three noise levels correspond approximately to signal-to-noise ratios (SNR) of 10 dB, 5 dB and -5 dB respectively.

The clean audio data were recorded in different office rooms using a close-talking microphone (Plantronics USB-45) to minimize ambient acoustic effects. The sampling frequency is 16 kHz and the data are stored in Windows PCM WAV format, 16 bits, mono.
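As an illustration of the stated format (16 kHz, 16 bits, mono), the following minimal sketch checks a WAV file's header using Python's standard `wave` module. The helper names (`wav_format`, `matches_hiwire_format`, `make_silent_wav`) are hypothetical and are not part of the database distribution; the in-memory test file stands in for a real recording.

```python
import io
import wave

# Format stated in the description: mono, 16-bit (2 bytes per sample), 16 kHz.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def wav_format(source):
    """Read channel count, sample width and sample rate from a WAV path or file object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def matches_hiwire_format(source):
    """True if the WAV header matches the format given in the description."""
    return wav_format(source) == EXPECTED


def make_silent_wav(sample_rate_hz=16000, channels=1, sample_width_bytes=2, n_frames=160):
    """Build a short silent WAV in memory (a stand-in for a real file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00" * (n_frames * channels * sample_width_bytes))
    buffer.seek(0)
    return buffer


print(matches_hiwire_format(make_silent_wav()))
print(matches_hiwire_format(make_silent_wav(sample_rate_hz=8000)))
```

With a real copy of the data, a file path can be passed to `matches_hiwire_format` in place of the in-memory buffer.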

The recordings correspond to prompts extracted from an aeronautical command-and-control application. A total of 8,099 utterances were recorded from 81 speakers, nominally 100 utterances each; the Spanish subset contains 999 utterances rather than 1,000, which accounts for the total of 8,099. The speaker distribution is as follows:

<table border="0" width="100%" cellspacing="0" cellpadding="2" class="infoBoxContents">
<tr align=center><td>Country</td><td># Speakers</td><td># Utterances</td></tr>
<tr align=center><td>France</td><td>31 (38.3%)</td><td>3100</td></tr>
<tr align=center><td>Greece</td><td>20 (24.7%)</td><td>2000</td></tr>
<tr align=center><td>Italy</td><td>20 (24.7%)</td><td>2000</td></tr>
<tr align=center><td>Spain</td><td>10 (12.3%)</td><td>999</td></tr>
<tr align=center><td>Total</td><td>81</td><td>8099</td></tr>
</table>
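The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below only restates the counts given above; the variable and function names are illustrative.

```python
# Speaker and utterance counts copied from the table above.
distribution = {
    "France": (31, 3100),
    "Greece": (20, 2000),
    "Italy": (20, 2000),
    "Spain": (10, 999),
}

total_speakers = sum(speakers for speakers, _ in distribution.values())
total_utterances = sum(utterances for _, utterances in distribution.values())


def speaker_share(country):
    """Percentage of all speakers who come from the given country, to one decimal."""
    speakers, _ = distribution[country]
    return round(100.0 * speakers / total_speakers, 1)


for country, (speakers, utterances) in distribution.items():
    print(f"{country}: {speakers} speakers ({speaker_share(country)}%), {utterances} utterances")
print(f"Total: {total_speakers} speakers, {total_utterances} utterances")
```

The totals come to 81 speakers and 8,099 utterances, one utterance short of 81 × 100 because of the 999 utterances in the Spanish subset.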

To generate the noisy utterances, the speech level is kept unchanged and only the noise amplitude is modified to obtain the desired SNR. The noise amplitude is adjusted to obtain three average SNR values of 10 dB, 5 dB and -5 dB, which are referred to as the low noise (LN), mid noise (MN) and high noise (HN) conditions. Within each condition, the noise level remains constant.
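The idea of keeping the speech level fixed and scaling only the noise can be sketched as follows. This is not the tool used to build the database: the description does not say how the average SNR was measured, so the sketch assumes a simple power-ratio SNR computed over a single signal, and it uses synthetic signals. All function names are hypothetical.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB, defined here as a ratio of mean powers."""
    return 10 * math.log10(power(speech) / power(noise))


def noise_gain_for_snr(speech, noise, target_snr_db):
    """Gain to apply to the noise (speech untouched) to reach the target SNR."""
    target_ratio = 10 ** (target_snr_db / 10)
    return math.sqrt(power(speech) / (power(noise) * target_ratio))


def mix_at_snr(speech, noise, target_snr_db):
    """Add scaled noise to the speech; both signals must have the same length."""
    gain = noise_gain_for_snr(speech, noise, target_snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a 440 Hz tone for speech, seeded random values for cockpit noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]

# Labels follow the conditions named above: LN = 10 dB, MN = 5 dB, HN = -5 dB.
for label, target in (("LN", 10), ("MN", 5), ("HN", -5)):
    gain = noise_gain_for_snr(speech, noise, target)
    scaled_noise = [gain * n for n in noise]
    print(label, round(gain, 4), round(snr_db(speech, scaled_noise), 2))
```

Because only the noise is scaled, the speech samples in the mixture are identical across the three conditions, and a lower target SNR simply means a larger noise gain.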

The speech data are PCM WAV files (16 kHz / 16 bits / mono) stored on one DVD. The total size is 3.03 GB for 33,053 files.

Distribution

Availability

Available - Restricted Use

Start date: 25/11/2008

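Licence

<table border="0" width="100%" cellspacing="0" cellpadding="2" class="infoBoxContents">
<tr align=center><td>Licence</td><td>Restrictions</td><td>Membership</td><td>User Nature</td></tr>
<tr align=center><td>ELRA VAR</td><td>Commercial Use</td><td>Members of ELRA</td><td>Academic</td></tr>
<tr align=center><td>ELRA END USER</td><td>Academic - Non Commercial Use</td><td>Non Members of ELRA</td><td>Academic</td></tr>
<tr align=center><td>ELRA VAR</td><td>Commercial Use</td><td>Non Members of ELRA</td><td>Academic</td></tr>
<tr align=center><td>ELRA END USER</td><td>Academic - Non Commercial Use</td><td>Non Members of ELRA</td><td>Commercial</td></tr>
<tr align=center><td>ELRA VAR</td><td>Commercial Use</td><td>Non Members of ELRA</td><td>Commercial</td></tr>
<tr align=center><td>ELRA END USER</td><td>Academic - Non Commercial Use</td><td>Members of ELRA</td><td>Academic</td></tr>
<tr align=center><td>ELRA END USER</td><td>Academic - Non Commercial Use</td><td>Members of ELRA</td><td>Commercial</td></tr>
<tr align=center><td>ELRA VAR</td><td>Commercial Use</td><td>Members of ELRA</td><td>Commercial</td></tr>
</table>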

Contact Person

Mapelli Valérie

audio

Monolingual audio corpus

Languages

English

Linguality

Linguality type: Monolingual

Size

no size available

Resource Creation

Creation ended: 01/01/2007

Funding Project

HIWIRE (Human Input that Works In Real Environments)

Funding Type: Eu Funds

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 25/11/2008

Usage

Actual Use - Nlp Applications

Use NLP Specific: Speech Recognition
