Bilingual term pairs extracted from comparable Web resources using the TaaS Bilingual Term Extraction System
ID: TAAS-FMC-1
The resource contains bilingual term pairs automatically extracted from comparable resources found on the Web using the TaaS Bilingual Term Extraction System. The bilingual term extraction workflow consisted of the following steps:
1) Focussed Monolingual Crawler for comparable corpora collection from the Web and for plaintext extraction.
2) DictMetric for cross-lingual document level alignment of the collected comparable corpora.
3) Tilde's Wrapper System for CollTerm for identification of terms in plaintext documents.
4) Term normalisation tools developed by Tilde and the University of Sheffield for acquisition of normalised (canonical) forms of terms that occur in different surface forms.
5) MPAligner for extraction of bilingual term pairs (term alignment) from term-tagged Wikipedia document pairs.
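
The data flow of steps 2 to 5 above can be illustrated with a minimal Python sketch. It does not use or reproduce DictMetric, CollTerm, the normalisation tools, or MPAligner: every function name, the toy English-Spanish dictionary, the term lists, and the 0.2 threshold are hypothetical stand-ins chosen for illustration only, and step 1 (crawling) is replaced by hard-coded plaintext documents.

```python
# Toy illustration of the workflow's data flow. All names and data here are
# hypothetical; the real TaaS tools work differently and are not called.

def align_documents(src_docs, tgt_docs, dictionary, threshold=0.2):
    """Step 2 stand-in: pair each source document with the target document
    that shares the most dictionary translations of its words."""
    pairs = []
    for s_id, s_text in src_docs.items():
        translated = {dictionary[w] for w in s_text.lower().split() if w in dictionary}
        best_id, best_score = None, 0.0
        for t_id, t_text in tgt_docs.items():
            t_words = set(t_text.lower().split())
            score = len(translated & t_words) / max(len(t_words), 1)
            if score > best_score:
                best_id, best_score = t_id, score
        if best_id is not None and best_score >= threshold:
            pairs.append((s_id, best_id))
    return pairs


def tag_terms(text, known_terms):
    """Step 3 stand-in: report which terms from a fixed list occur in the text."""
    lowered = text.lower()
    return [term for term in known_terms if term in lowered]


def normalise(term):
    """Step 4 stand-in: lowercase and collapse whitespace (no lemmatisation)."""
    return " ".join(term.lower().split())


def align_terms(src_terms, tgt_terms, dictionary):
    """Step 5 stand-in: pair terms whose word-by-word translations match,
    ignoring word order."""
    pairs = []
    for s in src_terms:
        translated = {dictionary.get(w) for w in s.split()}
        for t in tgt_terms:
            if translated == set(t.split()):
                pairs.append((s, t))
    return pairs


def extract_term_pairs(src_docs, tgt_docs, src_terms, tgt_terms, dictionary):
    """Chain the stand-ins: document alignment, tagging, normalisation, term alignment."""
    result = []
    for s_id, t_id in align_documents(src_docs, tgt_docs, dictionary):
        s_found = [normalise(t) for t in tag_terms(src_docs[s_id], src_terms)]
        t_found = [normalise(t) for t in tag_terms(tgt_docs[t_id], tgt_terms)]
        result.extend(align_terms(s_found, t_found, dictionary))
    return result


# Hard-coded toy data standing in for crawled plaintext (step 1).
EN_DOCS = {"en1": "A neural network runs on a computer"}
ES_DOCS = {
    "es1": "Una red neuronal funciona en un ordenador",
    "es2": "El gato duerme",
}
TOY_DICT = {"neural": "neuronal", "network": "red", "computer": "ordenador"}
EN_TERMS = ["neural network", "computer"]
ES_TERMS = ["red neuronal", "ordenador"]

if __name__ == "__main__":
    for pair in extract_term_pairs(EN_DOCS, ES_DOCS, EN_TERMS, ES_TERMS, TOY_DICT):
        print(pair)
```

The sketch only shows how the output of each stage feeds the next; the quality of a real extraction depends on the actual tools and their language resources.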