The Corpus of Spoken and Written Ter Saami

ID:

http://urn.fi/urn:nbn:fi:lb-2015102002

Ter Saami belongs to the peninsular subbranch of East-Saamic and is not used in daily communication today.

Ter Saami used to be spoken in the eastern inland and eastern coastal parts of the Kola Peninsula. Today no children learn the language at home; the handful of remaining speakers belong to the grandparent generation and live scattered across the Kola Peninsula and in other parts of Russia.

Ter Saami has no standardized written form. However, Ter Saami written texts have occasionally been printed using either Cyrillic or Latin script or even phonemic transcription.

This corpus contains spoken and written samples of Ter Saami. The written texts originate from a small booklet of Ter Saami poems by Oktyabrina Voronova (1989); Pushkin's "Tale of the Fisherman and the Fish", translated into Ter Saami and published in phonological transcription in 1971; and a few other small texts, among them Ter Saami words, phrases and sentences in Latin script from Chernyakov's small manuscript for a Ter Saami primer from 1929. The spoken text samples included in this corpus originate from recordings collected and transcribed by different researchers since the 1850s.

To make corpus searches easier, all texts in this corpus are meant to be represented in a uniform orthography, i.e. the contemporary Kildin Saami alphabet with slight modifications; however, the unification of the different original orthographies is still in progress. Note also that the current version of the corpus includes only the orthographic representation and no morphosyntactic annotations. A new version of the corpus will be annotated for parts of speech, and in the future we also plan to annotate the corpus morphologically and syntactically.

All data, including audio and video files where linked multimedia data exist, are also available from the DoBeS archive of the Kola Saami Documentation Project (http://dobes.mpi.nl/projects/sami/) at The Language Archive (https://corpus1.mpi.nl/ds/asv/?29&openhandle=hdl:1839/00-0000-0000-0005-8A34-E). Note, however, that access to raw data and annotations through the archive might be restricted.

Scrambled sentences from the corpus will be made available in Korp (https://korp.csc.fi/).

Distribution

Availability

Available - Unrestricted Use

Licence

CC - BY

Restrictions: Attribution

Attribution Details: The Corpus of Spoken and Written Ter Saami. Compiled by Michael Rießler with the help of Anja Harder, David Pineda Dijkerman, Maryna Litvak and Niko Partanen. Freiburg and Helsinki. 2015.

Licensors:

Michael Rießler

Distribution rights holders:

Michael Rießler

IPR Holder

Michael Rießler

Contact Person

Michael Rießler

text

Monolingual text corpus

Languages

Ter Sami

Linguality

Linguality type: Monolingual

Size

Modalities

Written Language

Metadata

Created: 21/10/2015

Last Updated: 21/10/2015

Metadata Language: English (en)

Metadata Creator

Imre Bartis

Usage

Foreseen Use: Human Use

Use NLP Specific: Linguistic Research
