INTERA Corpus - the English structurally annotated part of the SR-EN pair
The English part of the SR-EN pair of the INTERA corpus: written, domain-specific text (law, education, health, finance); 1 million words; XCES format.
Distribution
Availability
Available - Restricted Use
Licence
CC-BY
Restrictions:
Attribution
Distribution Access/Medium:
Downloadable
Attribution Details:
The INTERA Corpus - the English part of the SR-EN pair of the ILSP/RC Athena licensed under CC-BY as accessed via META-SHARE
Contact Person
Maria Gavrilidou
http://www.ilsp.gr/i...
Senior Research Fellow
Artemidos 6
GR-151 25 Maroussi
GR
Tel.: +302106875441
Fax: +302106854270
text
Monolingual text corpus
Languages
English (1,000,000 Words)
Linguality
Linguality type:
Monolingual
Text Format
application/x-xces+xml
Size
1,000,000 Words
Character encoding
UTF-8
Domains
law
education
health
finance
Modalities
Written Language
Annotation
Structural Annotation
StandOff:
False
Segmentation level:
Sentence
Format:
application/x-xces+xml
Standard practices conformance:
XCES
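Because the annotation is inline (StandOff: False) and segmented at sentence level, the sentences can in principle be read straight out of each XCES file. The following is a minimal sketch under stated assumptions, not a description of the corpus files themselves: it assumes sentences are marked with an `s` element carrying an `id` attribute, and the sample document, the helper names `local_name` and `iter_sentences`, and the `sentence_tag` parameter are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, NOT taken from the INTERA corpus. It only imitates
# an inline, sentence-segmented XCES-style document.
SAMPLE = """<cesDoc version="4">
  <text>
    <body>
      <p id="p1">
        <s id="p1s1">The contract enters into force on signature.</s>
        <s id="p1s2">Either party may terminate it in writing.</s>
      </p>
    </body>
  </text>
</cesDoc>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_sentences(root, sentence_tag="s"):
    """Yield (id, text) for every sentence element below root.

    Assumption: sentences are tagged `s`; pass another tag name if the
    actual files differ.
    """
    for elem in root.iter():
        if local_name(elem.tag) == sentence_tag:
            # Join nested text nodes and normalise whitespace.
            text = " ".join("".join(elem.itertext()).split())
            yield elem.get("id"), text


sentences = list(iter_sentences(ET.fromstring(SAMPLE)))
for sentence_id, text in sentences:
    print(sentence_id, text)
```

Matching on the local tag name keeps the sketch usable whether or not the files declare an XML namespace; for a real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`.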
Creation
Creation mode details:
web crawling; manual selection; semi-automatic conversion to the desired formats
Creation mode:
Mixed
Original Sources
various texts found mainly over the internet
Resource Creation
Creation lasted:
01/01/2003 - 31/12/2004
Funding Project
Integrated European language data Repository Area
(INTERA - e-content EDC-22076 INTERA / 27924)
URL:
http://www.elda.org/...
Funding Type:
EU Funds
Funder:
eContent
Project duration:
01/01/2003 - 31/12/2004
Metadata
Created:
02/02/2012
Last Updated:
08/01/2016
Usage
Foreseen Use
NLP Applications
Use NLP Specific:
Machine Translation
Actual Use - NLP Applications
Use NLP Specific:
Terminology Extraction
Relation
Related Resource:
INTERA corpus
Relation Type:
isPartOf
Documentation
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Elina Desipri, Voula Giouli et al., "Building parallel corpora for eContent professionals", COLING 2004, 2004
Book Title:
Proceedings of COLING 2004
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Stelios Piperidis et al., "Language resources production models: the case of INTERA multilingual corpus and terminology", 5th International Conference on Language Resources and Evaluation (LREC-2006), 2006
Book Title:
Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC-2006)
Document Type:
In Proceedings
Maria Gavrilidou, Penny Labropoulou, Monica Monachini, Stelios Piperidis and Claudia Soria, "Building Multilingual Terminological Resources", RANLP 2005 International Workshop on Language and Speech Infrastructure for Information Access in the Balkan Countries, 2005
Book Title:
Proceedings of the RANLP 2005 International Workshop on Language and Speech Infrastructure for Information Access in the Balkan Countries
Document Type:
Tech Report
Maria Gavrilidou, Voula Giouli, Elina Desipri, Penny Labropoulou, Monica Monachini et al., "D5.2 - Report on the multilingual resources production", http://www.elda.org/..., 2004