Views: 99 (last view: 2024-11-28)
Updates: 2 (last update: 2015-11-25)
Downloads: 23 (last download: 2024-05-22)
INTERA Corpus - the Bulgarian-English part
The Bulgarian-English part of the INTERA corpus: a written, domain-specific (law, education) parallel subcorpus of 2 million words (1 million words per language) in TMX format.
Distribution
Availability
Available - Restricted Use
Licence
CC-BY
Restrictions:
Attribution
Distribution Access/Medium:
Downloadable
Attribution Details:
The INTERA Corpus - Bulgarian-English part of ILSP/RC Athena licensed under CC-BY as accessed via META-SHARE
Contact Person
Maria Gavrilidou
http://www.ilsp.gr/i...
Senior Research Fellow
Artemidos 6
GR-151 25 Maroussi
GR
Tel.: +302106875441
Fax: +302106854270
text
Bilingual text corpus
Languages
Bulgarian (1,000,000 Words)
English (1,000,000 Words)
Linguality
Linguality type:
Bilingual
Multi-linguality type:
Parallel
Text Format
application/x-tmx+xml
Size
2,000,000 Words
Character encoding
UTF-8
Domains
law
education
Modalities
Written Language
Annotation
Alignment
StandOff:
False
Segmentation level:
Sentence
Format:
application/x-tmx+xml
Standard practices conformance:
TMX
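Since the record describes a sentence-aligned, UTF-8 encoded TMX file, a reader could extract the Bulgarian-English sentence pairs roughly as sketched below. This is a minimal illustration using only the Python standard library, not a tool shipped with the corpus. The function name `read_tmx_pairs`, the language codes `bg` and `en`, and the sample translation unit are assumptions made for the example; the actual file may use different language codes or TMX attributes.

```python
import io
import xml.etree.ElementTree as ET

# Qualified name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="bg", tgt_lang="en"):
    """Yield (source, target) segment pairs from a TMX file or file-like object.

    The language codes are assumptions; adjust them to the codes
    actually used in the downloaded file.
    """
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segments = {}
        for tuv in elem.findall("tuv"):
            # TMX 1.4 uses xml:lang; older versions use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary subtag, e.g. "bg-BG" becomes "bg".
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            yield segments[src_lang], segments[tgt_lang]
        elem.clear()  # free memory; useful for a corpus of this size


# Made-up sample unit for illustration only; it is not taken from the corpus.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="bg" datatype="plaintext" segtype="sentence"
          adminlang="en" o-tmf="unknown" creationtool="example"
          creationtoolversion="1"/>
  <body>
    <tu>
      <tuv xml:lang="bg"><seg>Закон за образованието</seg></tuv>
      <tuv xml:lang="en"><seg>Education Act</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = list(read_tmx_pairs(io.BytesIO(SAMPLE.encode("utf-8"))))
for bg_text, en_text in pairs:
    print(bg_text, "|", en_text)
```

To read the downloaded corpus instead of the sample, pass the path to the TMX file as `source`. Streaming with `iterparse` avoids loading all 2,000,000 words into memory at once.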
Creation
Creation mode details:
web crawling; manual selection; semi-automatic conversion to the desired formats
Creation mode:
Mixed
Resource Creation
Creation lasted:
01/01/2003 - 31/12/2004
Funding Project
Integrated European language data Repository Area
(INTERA - e-content EDC-22076 INTERA / 27924)
URL:
http://www.elda.org/...
Funding Type:
EU Funds
Funder:
eContent
Project duration:
01/01/2003 - 31/12/2004
Metadata
Created:
02/02/2012
Last Updated:
26/11/2015
Usage
Foreseen Use
NLP Applications
Use NLP Specific:
Machine Translation
Actual Use - NLP Applications
Use NLP Specific:
Terminology Extraction
Relation
Related Resource:
INTERA corpus
Relation Type:
isPartOf
Documentation
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Elina Desipri, Voula Giouli et al.
Building parallel corpora for eContent professionals
COLING 2004, 2004
Book Title:
Proceedings of COLING 2004
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Stelios Piperidis et al.
Language resources production models: the case of INTERA multilingual corpus and terminology
5th International Conference on Language Resources and Evaluation (LREC-2006), 2006
Book Title:
Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC-2006)
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Monica Monachini, Stelios Piperidis and Claudia Soria
Building Multilingual Terminological Resources
RANLP 2005 International Workshop on Language and Speech Infrastructure for Information Access in the Balkan Countries, 2005
Book Title:
Proceedings of the RANLP 2005 International Workshop on Language and Speech Infrastructure for Information Access in the Balkan Countries
Document Type:
Tech Report
Maria Gavrilidou, Voula Giouli, Elina Desipri, Penny Labropoulou, Monica Monachini et al.
D5.2 - Report on the multilingual resources production
http://www.elda.org/...
2004
People who looked at this resource also viewed the following:
INTERA Corpus - the Bulgarian POS annotated part of the BG-EN pair
INTERA Corpus - the Bulgarian-English terms from the BG-EN pair
Insurance (Termcat)
Ingrian Corpus (UHLCS)
People who downloaded this resource also downloaded the following:
INTERA Corpus - the Bulgarian POS annotated part of the BG-EN pair
INTERA Corpus - the Greek-English part
INTERA English-Slovene SVEZ ACQUIS Corpus
INTERA Corpus - the Serbian-English part