META-SHARE Portal
Corpus of Colloquial Bulgarian
BgSpeech
http://www.bgspeech.net/index.html
ID:
821
The Corpus of Colloquial Bulgarian is a selection of spoken contemporary Bulgarian data amounting to 357,584 signs.
Distribution
Availability:
Available
Availability Start date:
21/01/2013
Licences
CC BY-NC-SA 4.0
Distribution Details
Download location:
hidden
Distribution Access/Medium:
Accessible Through Interface
Execution location:
hidden
IPR Holders:
Sofia University St. Kliment Ohridski
Department of Bulgarian Language
Sofia University St. Kliment Ohridski
SU
15 Tsar Osvoboditel Blvd.
1504 Sofia
Bulgaria (BG)
Tel.: +3592 9308 248
Contact Person
Yovka Tisheva
Sofia University St. Kliment Ohridski
SU
Associate Professor
15 Tsar Osvoboditel Blvd.
1504 Sofia
Bulgaria (BG)
Department of Bulgarian Language
SU
15 Tsar Osvoboditel Blvd.
1504 Sofia
Bulgaria
text
Monolingual text corpus
Languages
Bulgarian
Language Script:
Cyrillic
Linguality
Linguality type:
Monolingual
Size
357,584 Phonetic Units
Character encoding
UTF-8
Modalities
Spoken Language
Annotation
Speech Annotation - Speaker Turns
Segmentation level:
Paragraph, Phoneme, Prosodic Boundaries, Sentence, Word
Resource Creation
Resource Creator
Sofia University St. Kliment Ohridski
Department of Bulgarian Language
Sofia University St. Kliment Ohridski
SU
15 Tsar Osvoboditel Blvd.
1504 Sofia
Bulgaria (BG)
Creation started:
01/05/2000
Funding Project
Central and South-East European Resources
(CESAR)
URL:
http://cesar.nytud.hu/
Funding Type:
EU Funds
Project duration:
01/02/2011 - 30/01/2013
Elaboration of the transcription system for the spoken Bulgarian
Funding Type:
National Funds
Funding Country:
Bulgaria (BG)
Syntactic characteristics of the contemporary spoken Bulgarian
Funding Type:
National Funds
Funding Country:
Bulgaria (BG)
Annotating corpora of spoken Bulgarian
Funding Type:
National Funds
Funding Country:
Bulgaria (BG)
Maintaining and updating the data base of contemporary Bulgarian language
Funding Type:
National Funds
Funding Country:
Bulgaria (BG)
Metadata
Created:
29/01/2013
Last Updated:
01/02/2013
Metadata Creator
Tsvetana Dimitrova
Institute for Bulgarian Language
IBL
Assistant Professor
52 Shipchenski prohod Blvd.
1113 Sofia
Bulgaria (BG)
Department of Computational Linguistics
IBL
Version
Version:
4.0
Last Updated:
15/01/2013
Validation
Validated
Usage
Access tools
http://www.bgspeech....
Foreseen Use
Human Use
Actual Use - Human Use
Documentation
Tool Documentation:
Online
Samples Location:
http://www.bgspeech....
Atanas Atanasov. Encoding Bulgarian Colloquial Speech Using TEI Specification. Computer Applications in Slavic Studies. “Boyan Penev” Publishing Center, Sofia, 2006, pp. 233-240
Atanas Atanasov. Problems in Creating Language Corpora of Transcribed Bulgarian Colloquial Speech (in Bulgarian). Paisii Readings. Scientific Works, vol. 44, book 1, part A, 2006. “Paisii Hilendarski” University Press, Plovdiv, 2006, pp. 289-296
Yovka Tisheva, Marina Dzhonova. Electronic Resources for Bulgarian Colloquial Speech (the BgSpeech Initiative) (in Bulgarian). Littera et Lingua, Summer 2010.
Yovka Tisheva, Marina Dzhonova. A Corpus of Spoken Bulgarian: Specifics and Structure (in Bulgarian). Balgarski ezik 58 (2011), 3, pp. 34-53