Siemens Synthesis Corpus - SI1000P – META-SHARE

Last view: 2026-06-25

Siemens Synthesis Corpus - SI1000P

http://catalog.elra.info/product_info.php?products_id=567

ID: ELRA-S0082

The SI1000P recordings were made to provide material for high-quality concatenative speech synthesis. The corpus contains 1000 newspaper sentences read by two professional German broadcast announcers in studio quality, together with the laryngographic signal and the glottal pulse stream. Parts of the corpus were labelled and segmented phonemically (SAM-PA) and prosodically (boundaries and accents).

Both speakers are trained and experienced announcers at the local state broadcaster. They were asked to read the texts in a broadcast-announcing style: very correct, but fluent and without pauses between words.

The recordings were made in an anechoic studio at the Institute of Phonetics at the University of Munich. The recording channels were:
- speech signal, recorded with a Sennheiser MKH20 omnidirectional microphone 30 cm from the mouth
- laryngograph signal (LxProc, Laryngograph Ltd., London)
- glottal pulse stream from the laryngograph
- start/stop pulse at the beginning and end of each utterance
The recording machine was a high-quality 4-channel DAT recorder (48 kHz, 16 bit). The data were copied to hard disk and cut into separate utterances (one utterance per file) according to the pulse information in the fourth channel.
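The cutting step can be illustrated with a minimal sketch. It is not the procedure actually used for the corpus: the function names, the threshold, and the assumption that pulses strictly alternate between start and stop are all hypothetical.

```python
def find_utterances(marker, threshold=0.5):
    """Return (start, end) sample index pairs from a start/stop pulse channel.

    Assumption: each pulse is a short burst whose absolute value exceeds
    `threshold`, and pulses alternate start, stop, start, stop, ...
    """
    onsets = []
    above = False
    for i, value in enumerate(marker):
        is_high = abs(value) > threshold
        if is_high and not above:  # rising edge marks a pulse onset
            onsets.append(i)
        above = is_high
    # Pair consecutive onsets as (start, stop); an unpaired trailing pulse is dropped.
    return [(onsets[k], onsets[k + 1]) for k in range(0, len(onsets) - 1, 2)]


def cut_channel(samples, segments):
    """Slice one channel into per-utterance sample lists."""
    return [samples[start:end] for start, end in segments]


# Toy data: two utterances marked by four pulses in the marker channel.
marker = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0]
speech = list(range(len(marker)))
segments = find_utterances(marker)
utterances = cut_channel(speech, segments)
```

The same segment list would be applied to the speech, laryngograph, and glottal pulse channels so that the per-utterance files stay aligned.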

Both the speech signals and the laryngograph signals were filtered and downsampled from 48 kHz to 16 kHz. The format of the signal files is PhonDat 2.
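Going from 48 kHz to 16 kHz is decimation by a factor of 3: the signal is low-pass filtered below the new Nyquist frequency (8 kHz) and then every third sample is kept. The sketch below shows the idea with a Hamming-windowed sinc filter. The filter design, tap count, and cutoff are assumptions chosen for illustration; the source does not say which filter was used.

```python
import math


def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass filter.

    `cutoff` is a fraction of the input sampling rate (between 0 and 0.5);
    `num_taps` should be odd so the filter has a centre tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def decimate(samples, factor=3, num_taps=61):
    """Low-pass filter, then keep every `factor`-th sample."""
    # Cutoff slightly below the new Nyquist frequency (0.5 / factor of the old rate).
    taps = lowpass_fir(num_taps, 0.45 / factor)
    half = num_taps // 2
    out = []
    for centre in range(0, len(samples), factor):
        acc = 0.0
        for k, tap in enumerate(taps):
            idx = centre + k - half
            if 0 <= idx < len(samples):  # samples outside the signal count as zero
                acc += tap * samples[idx]
        out.append(acc)
    return out


# A 20 kHz tone sampled at 48 kHz lies above the new 8 kHz Nyquist limit,
# so it should be strongly attenuated rather than aliased into the output.
tone = [math.sin(2 * math.pi * 20000 / 48000 * n) for n in range(480)]
reduced = decimate(tone)
```

In practice a library routine such as a polyphase resampler would be the usual choice; the explicit loop is only meant to make the filter-then-drop structure visible.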

The resulting segmentation and all information accompanying the signal are summarised in the corresponding Partitur File. The Partitur File format is an open structure that allows easy description and processing of information aligned to a speech signal.
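As a rough illustration of how such a file might be processed, the sketch below groups lines by a leading label. It assumes the line-oriented convention of the BAS Partitur Format, in which each line begins with a short label followed by a colon; the tier names and the sample content are hypothetical and are not taken from the corpus.

```python
from collections import defaultdict


def parse_partitur(text):
    """Group the lines of a Partitur-style file by their leading label."""
    tiers = defaultdict(list)
    for line in text.splitlines():
        # Split on the first colon only, since phonetic symbols may contain ':'.
        label, sep, value = line.partition(":")
        if not sep or not label.strip():
            continue  # skip blank or malformed lines
        tiers[label.strip()].append(value.strip())
    return dict(tiers)


# Hypothetical example content (assumed layout: label, word index, entry).
example = """\
LHD: Partitur 1.2
SAM: 16000
LBD:
ORT: 0 guten
ORT: 1 Morgen
KAN: 0 gu:t@n
KAN: 1 mO6g@n
"""

tiers = parse_partitur(example)
# Drop the leading word index to recover the orthographic word sequence.
words = [entry.split(maxsplit=1)[1] for entry in tiers["ORT"]]
```

Keeping every tier as a list of raw strings leaves the interpretation of each tier (word index, sample offsets, symbols) to the caller, which suits an open format where new tiers can be added.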

The database also provides an ordered list of all occurring words together with the standard pronunciation in SAM-PA and the orthography of all spoken utterances in the corpus.


Distribution

Availability: Available - Restricted Use

Start date: 06/04/2000

Licence

- ELRA VAR; Restrictions: Commercial Use; For Members of ELRA; User Nature: Academic
- ELRA END USER; Restrictions: Academic - Non Commercial Use; For Non Members of ELRA; User Nature: Academic
- ELRA VAR; Restrictions: Commercial Use; For Non Members of ELRA; User Nature: Academic
- ELRA END USER; Restrictions: Academic - Non Commercial Use; For Non Members of ELRA; User Nature: Commercial
- ELRA VAR; Restrictions: Commercial Use; For Non Members of ELRA; User Nature: Commercial
- ELRA END USER; Restrictions: Academic - Non Commercial Use; For Members of ELRA; User Nature: Academic
- ELRA END USER; Restrictions: Academic - Non Commercial Use; For Members of ELRA; User Nature: Commercial
- ELRA VAR; Restrictions: Commercial Use; For Members of ELRA; User Nature: Commercial

Contact Person

Mapelli Valérie

audio

Monolingual audio corpus

Languages

German

Linguality

Linguality type: Monolingual

Size

no size available

Metadata

Created: 12/05/2005

Version

Version: 1.0

Last Updated: 09/05/2005
