LUNA.PL Corpus


LUNA.PL

http://zil.ipipan.waw.pl/LUNA

ID:

407

The corpus contains human-human spoken dialogues in Polish. It is annotated on several levels, from transcription of the dialogues and their morphosyntactic analysis to semantic annotation of concepts, predicates, and anaphora. Annotation on the morphosyntactic and semantic levels was done automatically and then corrected manually. At the concept level, the annotation scheme comprises about 200 concepts from an ontology designed specifically for the project. The set of frames for predicate-level annotation was defined as a FrameNet-like resource.


Distribution

Availability

Available - Unrestricted Use

Licence

BSD-style

Restrictions: Attribution

Fee: free of charge

Download location: hidden

Distribution Access/Medium: Downloadable

Contact Person

Małgorzata Marciniak

text

Monolingual text corpus

Languages

Polish

Linguality

Linguality type: Monolingual

Size

81,049 Words

500 Files

12,778 Utterances

1.2 GB

Annotation

Segmentation

Segmentation level: Utterance, Word, Word Group

Semantic Annotation

Segmentation level: Clause, Word Group

Creation

Creation mode details: Manual transcription of recorded data, automatic creation of files with information about speakers’ turns...

Creation mode: Mixed

Original Sources

Warsaw Transport Authority information center recordings

Resource Creation

Funding Project

Spoken Language UNderstanding in multilinguAl communication systems (LUNA)

URL: http://www.ist-luna.eu

Funding Types: EU Funds, National Funds

Funders: European Commission, Polish Ministry of Science and Higher Education

Funding Country: Poland

Project duration: 04/09/2006 - 03/09/2009

Metadata

Created: 17/10/2011

Source: CESAR

Metadata Creator

Małgorzata Marciniak

Maciej Ogrodniczuk

Michał Lenart

Version

Version: 1.0

Last Updated: 12/11/2011
