Last update: 2013-01-21
Cultural Parallels: Study and promotion of the cultural inheritance of the neighbouring areas in Greece and Bulgaria through internet technology
Cultural Parallels corpus
Collection of Greek and Bulgarian literary texts for the promotion of the common cultural inheritance of the neighbouring areas in Greece and Bulgaria.
Distribution
Availability
Available - Restricted Use
Licence
Under Negotiation
Restrictions:
Academic - Non Commercial Use
Distribution Access/Medium:
Accessible Through Interface
Contact Persons
Nikos Glaros
http://www.ilsp.gr/i...
Institute for Language and Speech Processing / Athena R.C.
ILSP / Athena R.C.
Senior Research Fellow
Tel.: +302106875451
Fax: +302106854270
http://www.ilsp.gr/
ILSP / Athena R.C.
Artemidos 6 & Epidavrou
GR-15125 Maroussi
GR
Tel.: +302106875300
Fax: +302106854270, +302106856794
Voula Giouli
http://www.ilsp.gr/i...
Institute for Language and Speech Processing / Athena R.C.
ILSP / Athena R.C.
Scientific Associate
Tel.: +302106875448
Fax: +302106854270
http://www.ilsp.gr/
ILSP / Athena R.C.
Artemidos 6 & Epidavrou
GR-15125 Maroussi
GR
Tel.: +302106875300
Fax: +302106854270, +302106856794
text
Bilingual text corpus
Languages
Bulgarian
Modern Greek (1453-)
Linguality
Linguality type:
Bilingual
Multi-linguality type:
Comparable (the corpus consists of a subset of Greek texts (fairy tales) and their translations into Bulgarian, a subset of Bulgarian texts (fairy tales) and their translations into Greek, and a subset of comparable Greek and Bulgarian texts in the same domain)
Text Format
text/txt
Size
700,000 Words
Character encoding
UTF-8
ISO-8859-7
Domains
fiction
Modalities
Written Language
Classification
Text type:
Poems
Conformance to classification scheme:
Other
Text type:
Fiction
Conformance to classification scheme:
Other
Text type:
Fairy tales
Conformance to classification scheme:
Other
Annotation
Lemmatization
Tagset:
ILSP tagset
StandOff:
False
Format:
XML
Standard practices conformance:
XCES
Annotation Mode:
Mixed
Semantic Annotation - Named Entities
Tagset:
ACE-extended
StandOff:
False
Format:
XML
Standard practices conformance:
Other
Annotation Tools:
MENER
Segmentation
StandOff:
False
Segmentation level:
Sentence
Format:
TIPSTER
Annotation Mode:
Automatic
Morphosyntactic Annotation - Basic POS Tagging
Tagset:
ILSP tagset
StandOff:
False
Format:
XML
Standard practices conformance:
XCES
Geographic coverage
Thrace
Creation
Creation mode details:
Scanning & OCR followed by manual checks
Creation mode:
Mixed
Original Sources
Scanned texts
Resource Creation
Creation lasted:
01/10/2005 - 30/09/2007
Funding Project
Cultural Parallels: Study and promotion of the cultural inheritance of the neighbouring areas in Greece and Bulgaria through internet technology
(Cultural Parallels)
Funding Type:
National Funds
Funder:
INTERREG IIIA / PHARE CBC GREECE – BULGARIA
Funding Country:
GR
Project duration:
01/10/2005 - 30/09/2007
Metadata
Created:
02/02/2012
Last Updated:
21/01/2013
Source:
META-SHARE/ILSP
Validation
Validated
Type of Validation:
Content
Validation Mode:
Mixed
Mode Details:
Manual correction of the annotations, mainly of the mixed language texts
Usage
Foreseen Use
NLP Applications
Human Use
Actual Use - Human Use
Documentation
Document Type:
In Proceedings
Voula Giouli and E. Marzelou and Prokopis Prokopidis and M. Zourari,
Language Technologies for Processing Greek textual Cultural Heritage data
9th International Conference on Greek Linguistics, 2009
Book Title:
Proceedings of the 9th International Conference on Greek Linguistics
Document Type:
In Proceedings
Voula Giouli and Nick Glaros and K. Simov and P. Osenova,
A web-enabled and speech-enhanced parallel corpus of Greek - Bulgarian cultural texts
EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities and Education - LaTeCH - SHELT&R 2009, 2009
Book Title:
Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities and Education - LaTeCH - SHELT&R 2009
Document Type:
In Book
Voula Giouli and K. Simov and P. Osenova,
Linguistic Resources for CH/SSH. A parallel Greek-Bulgarian corpus: a digital resource of the shared Cultural Heritage
2011
Editor:
Sporleder, C., van den Bosch, A., and Zervanou, K.
Publisher:
Springer
Book Title:
Language Technology for Cultural Heritage: Selected papers from the LaTeCH Workshop Series