MLRS Corpus

142,397 Maltese texts from 10 genres.

The file “” expands into a folder “corpus”, containing the file “”, which expands into the folder “”. This folder contains the files:
• filelist.txt
• malti02.academic.txt
• malti02.literature.txt
• malti02.metadata.txt
• malti02.misc.txt
• malti02.parl.txt
• malti02.parl.txt.bak
• malti02.religion.txt
• malti02.speeches.txt
• malti02.web.genral.txt
• README.txt
• removed-from-corpus.txt
• tend.txt
• tstart.txt

All texts of a genre are stored in a single .txt file for that genre. Within each file, texts are marked with the XML tags <t>…</t>, paragraphs with <p>…</p>, and sentences with <s>…</s>. Each word appears on its own line, followed by a tab and its POS tag.
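
The layout described above can be read with a few lines of Python. This is a minimal sketch, not a tool shipped with the corpus: the function name iter_sentences, the assumption that each tag sits on its own line (possibly with attributes), and the placeholder POS tags in the sample are all illustrative and should be checked against the actual files.

```python
import re
from typing import Iterable, Iterator, List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# Matches an opening or closing <t>, <p> or <s> tag on a line of its own.
# Allowing attributes inside the opening tag is an assumption.
TAG_RE = re.compile(r"^<(/?)(t|p|s)(\s[^>]*)?>$")


def iter_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield each <s>...</s> block as a list of (word, tag) pairs."""
    sentence = None  # None means we are currently outside a sentence
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        match = TAG_RE.match(line.strip())
        if match:
            closing, name = match.group(1), match.group(2)
            if name == "s":
                if closing:
                    if sentence is not None:
                        yield sentence
                    sentence = None
                else:
                    sentence = []
            continue  # <t> and <p> boundaries are ignored in this sketch
        if sentence is not None:
            # A token line: the word, a tab, then the POS tag.
            word, _, tag = line.partition("\t")
            sentence.append((word, tag))


if __name__ == "__main__":
    # Hypothetical sample; TAG_A etc. are placeholders, not real corpus tags.
    sample = (
        "<t>\n<p>\n<s>\n"
        "dan\tTAG_A\n"
        "test\tTAG_B\n"
        "</s>\n</p>\n</t>\n"
    )
    for sent in iter_sentences(sample.splitlines(keepends=True)):
        print(sent)
```

To process a genre file, pass an open file handle instead of the sample lines, for example open("malti02.academic.txt", encoding="utf-8"). The UTF-8 encoding is also an assumption.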
