Monolingual text corpus
Languages
Portuguese
(92,000 Tokens)
Linguality
Linguality type: Monolingual
Text Format Size Character encoding
UTF-8
(92,000 Tokens)
Domains
Annotation
Morphosyntactic Annotation - POS Tagging
Segmentation level: Word
Format: text/plain
Annotation Mode: Automatic
Alignment
Segmentation level: Utterance
Annotation Mode: Manual
Time Coverage
1970-2001
(92,000 Tokens)
Geographic coverage
Portugal, Brazil, Angola, Cape Verde, Guinea-Bissau, Mozambique, Macao, Goa, East Timor
(92,000 Tokens)
Creation
Creation mode: Manual
Monolingual audio corpus
Languages
Portuguese
(524 Minutes)
Variety:
Portuguese from Guinea-Bissau
(Type: Other)
(20 Minutes)
Variety:
Portuguese from Macau
(Type: Other)
(36 Minutes)
Variety:
Portuguese from Mozambique
(Type: Other)
(29 Minutes)
Variety:
European Portuguese
(Type: Other)
(160 Minutes)
Variety:
Portuguese from Angola
(Type: Other)
(41 Minutes)
Variety:
Portuguese from Cape Verde
(Type: Other)
(33 Minutes)
Variety:
Portuguese from Goa
(Type: Other)
(13 Minutes)
Variety:
Portuguese from São Tomé and Príncipe
(Type: Other)
(35 Minutes)
Variety:
Brazilian Portuguese
(Type: Other)
(117 Minutes)
Variety:
Portuguese from Timor
(Type: Other)
(16 Minutes)
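The per-variety durations listed above can be cross-checked against the overall duration given for Portuguese. The following is a minimal Python sketch of that arithmetic; the figures are copied from the entries above, while the variable names are chosen purely for illustration and are not part of the record. The record itself does not explain any difference the check reveals.

```python
# Per-variety durations in minutes, copied from the variety entries above.
variety_minutes = {
    "Portuguese from Guinea-Bissau": 20,
    "Portuguese from Macau": 36,
    "Portuguese from Mozambique": 29,
    "European Portuguese": 160,
    "Portuguese from Angola": 41,
    "Portuguese from Cape Verde": 33,
    "Portuguese from Goa": 13,
    "Portuguese from São Tomé and Príncipe": 35,
    "Brazilian Portuguese": 117,
    "Portuguese from Timor": 16,
}

# Overall duration stated for Portuguese in the record.
declared_total = 524

# Sum of the listed varieties and the remainder not attributed to any of them.
listed_total = sum(variety_minutes.values())
unaccounted = declared_total - listed_total

print(f"listed: {listed_total} min, declared: {declared_total} min, "
      f"difference: {unaccounted} min")
```

With the figures as listed, the varieties sum to 500 minutes, which leaves 24 minutes of the declared 524 not attributed to any variety.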
Linguality
Linguality type: Monolingual
Size
Effective speech duration: 524 Minutes
Domains
Annotation
Speech Annotation - Orthographic Transcription
Segmentation level: Word
Format: text/plain
Annotation Mode: Manual
Speech Annotation - Sound To Text Alignment
Segmentation level: Utterance
Format: text/xml
Annotation Mode: Manual
Content
Speech items: Other
Non-speech items: Other
Noise Level: Low
Setting
Naturality: Spontaneous
Conversational type: Monologue
Scenario: Other
Audience: No
Interactivity: Interactive
Audio Formats
wav
Recording quality: High
Quantization: 16
Time Coverage
1970-2001
Geographic coverage
Portugal, Brazil, Angola, Cape Verde, Guinea-Bissau, Mozambique, Macao, Goa, East Timor
Recording
Recording environment: Closed Public Place, Other
Recording device type: Other
Capture
Capturing device type: Microphone
Person SourceSet
Origin of persons: Native
Age of persons: Adult
Sex of persons: Mixed
Number of persons: 86
Age range start: 17
Age range end: 82
Hearing impairment of persons: No
Speaking impairment of persons: No
Geographic distribution of persons: Portugal, Brazil, Angola, Cape Verde, Guinea-Bissau, Mozambique, Macao, Goa, East Timor