SmartWeb Motorbike Corpus (SMC)
Corpus SMC (SmartWeb Motorbike Corpus)
The SMARTWEB UMTS data collection was created within the publicly funded German SmartWeb project in the years 2004-2006. It comprises a collection of user queries to a naturally spoken Web interface, with the main focus on the soccer World Cup in 2006. The recordings include field recordings using a hand-held UMTS device (one person, SmartWeb Handheld Corpus SHC, ref. ELRA-S0278), field recordings with video capture of the primary speaker and a secondary speaker (SmartWeb Video Corpus SVC, ref. ELRA-S0279), as well as mobile recordings performed on a BMW motorbike (one speaker, SmartWeb Motorbike Corpus SMC, ref. ELRA-S0280).
This corpus corresponds to the mobile recordings performed on a BMW motorbike (SmartWeb Motorbike Corpus SMC) and contains recordings of 36 speakers in a human-machine query situation on a running BMW motorcycle. Bikers were asked to solve several tasks with a spoken query system to the WWW, using an integrated system connected to a speech server via a UMTS connection. Recorded channels are the Bluetooth helmet microphone over UMTS (telephone quality) and, for part of the recordings, the Bluetooth helmet microphone and an additional neck microphone in high quality.
The corpus contains:
- Total number of recorded queries: 2,315
- Total duration of segmented speech: 377 minutes
- Formats: WAV 44.1 kHz, 16 bit; ALAW 8 kHz, 8 bit; Verbmobil transliteration; BAS Partitur Format (BPF)
- Segmentation: automatic segmentation into queries by the recording server
- Distribution: 3 DVD-R
See also ELRA-S0278 and ELRA-S0279.
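For users who want to compare the telephone-quality channel with the high-quality WAV channel, the ALAW 8 kHz, 8 bit data listed above can be expanded to 16-bit linear PCM. The following is a minimal Python sketch of standard G.711 A-law decoding. It assumes that the ALAW files are headerless streams of one A-law byte per sample, which this description does not confirm; the function names are illustrative and not part of the corpus distribution.

```python
def alaw_byte_to_pcm16(code: int) -> int:
    """Decode one G.711 A-law byte into a signed 16-bit linear PCM sample."""
    code ^= 0x55                        # undo the bit inversion applied by the encoder
    sign_positive = bool(code & 0x80)   # in A-law, a set top bit marks a positive sample
    exponent = (code & 0x70) >> 4       # 3-bit segment number
    mantissa = code & 0x0F              # 4-bit position within the segment
    magnitude = (mantissa << 4) + 8     # segment 0 is linear
    if exponent >= 1:
        # higher segments add an implicit leading bit and scale by the segment
        magnitude = (magnitude + 0x100) << (exponent - 1)
    return magnitude if sign_positive else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode a raw A-law byte stream (assumed headerless) into PCM samples."""
    return [alaw_byte_to_pcm16(b) for b in data]


if __name__ == "__main__":
    # Smallest and largest magnitudes representable in A-law.
    print(decode_alaw(bytes([0xD5, 0x55, 0xAA, 0x2A])))  # [8, -8, 32256, -32256]
```

If the files on the DVDs turn out to carry a header (for example a WAV or NIST SPHERE header), the header would have to be skipped or parsed before the sample bytes are passed to `decode_alaw`.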