Monolingual text corpus
Languages: Portuguese (80,000 Tokens)
Linguality
Linguality type: Monolingual
Text Format
Size
Character encoding: UTF-8 (80,000 Tokens)
Domains
Modalities
Annotation
Alignment
Segmentation level: Utterance
Format: text/xml
Annotation Mode: Manual
Size: 80,000 Tokens
Morphosyntactic Annotation - POS Tagging
Segmentation level: Word
Format: text/plain
Annotation Mode: Automatic
Size: 80,000 Tokens
Time Coverage: 1970-1975 (80,000 Tokens)
Geographic coverage: Portugal (80,000 Tokens)
Creation
Creation mode: Manual
Monolingual audio corpus
Languages: Portuguese (520 Minutes)
Linguality
Linguality type: Monolingual
Size
Effective speech duration: 520 Minutes
Audio duration: 520 Minutes
Domains
Annotation
Speech Annotation - Sound To Text Alignment
Segmentation level: Utterance
Format: text/xml
Annotation Mode: Manual
Content
Speech items: Other
Non-speech items: Other
Noise Level: Medium
Setting
Naturality: Spontaneous
Conversational type: Monologue
Scenario: Other
Audience: No
Interactivity: Interactive
Audio Formats: wav
Recording quality: Medium
Quantization: 16
Time Coverage: 1970-1975
Geographic coverage: Portugal (520 Minutes)
Recording
Recording environment: Closed Public Place, Office, Other
Recording device type: Other
Capture
Person SourceSet
Origin of persons: Native
Age of persons: Adult
Sex of persons: Mixed
Number of persons: 212
Age range start: 17
Age range end: 62
Hearing impairment of persons: No
Speaking impairment of persons: No
Geographic distribution of persons: Portugal