NKJP1mEcono corpus
NKJP1mEcono
http://zil.ipipan.waw.pl/NKJP1mEcono
ID:
450
Economy-related subcorpus of the National Corpus of Polish, containing a manually created word-sense annotation layer.
Distribution
Availability
Available - Unrestricted Use
Licence
GPL
Restrictions:
Share Alike
Fee:
free of charge
Download location:
hidden
Distribution Access/Medium:
Downloadable
Contact Person
Łukasz Kobyliński
http://zil.ipipan.wa...
Research Assistant
Jana Kazimierza 5
01-248 Warsaw
Tel.: +48 22 38 00 559
Fax: +48 22 38 00 510
text
Monolingual text corpus
Languages
Polish
Linguality
Linguality type:
Monolingual
Size
87,816 Words
11 MB
Annotation
Segmentation
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Lemmatization
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Segmentation
StandOff:
True
Segmentation level:
Sentence
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Semantic Annotation - Word Senses
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Manual (manually disambiguated using AnotEk)
Annotation Tools:
AnotEk
Start date:
01/06/2010
End date:
01/09/2010
Morphosyntactic Annotation - Basic POS Tagging
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Morphosyntactic Annotation - POS Tagging
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Segmentation
StandOff:
True
Segmentation level:
Paragraph
Format:
text/xml
Standard practices conformance:
TEI
Start date:
01/02/2010
End date:
28/02/2010
Creation
Creation mode:
Mixed
Creation mode details:
The corpus was created by selecting economy-related paragraphs from the 1-million-word subcorpus of the National Corpus of Polish. Linguists then added the word-sense annotation manually using AnotEk.
Original Sources
1-million-word subcorpus of the National Corpus of Polish (1MNKJP)
Creation Tools
Java code
AnotEk 1.0
TaKIPI 1.8
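Every annotation layer in this record is marked StandOff: True, meaning the annotations are stored separately from the text and point back to it by identifier. The sketch below shows the general idea of joining a word-level segmentation layer with a sense layer. It is only an illustration: the element and attribute names (`seg`, `sense`, `corresp`, `key`), the sample words, and the sense keys are hypothetical and are not taken from the corpus files, which may use TEI namespaces and a different layout.

```python
import xml.etree.ElementTree as ET

# Expanded name of the xml:id attribute as ElementTree reports it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical word-level segmentation layer (not copied from the corpus).
SEGMENTATION = """
<s xml:id="s1">
  <seg xml:id="seg1">bank</seg>
  <seg xml:id="seg2">obniżył</seg>
  <seg xml:id="seg3">stopy</seg>
</s>
"""

# Hypothetical stand-off sense layer: each entry points at a segment by ID.
SENSES = """
<senses>
  <sense corresp="#seg1" key="bank.1"/>
  <sense corresp="#seg3" key="stopa.2"/>
</senses>
"""


def join_layers(segmentation_xml, senses_xml):
    """Return (word, sense_key) pairs in text order; sense_key is None if unannotated."""
    seg_root = ET.fromstring(segmentation_xml)
    # Map segment ID -> surface form, keeping document order.
    words = {seg.get(XML_ID): seg.text for seg in seg_root.iter("seg")}

    sense_root = ET.fromstring(senses_xml)
    # Map referenced segment ID -> sense key, dropping the leading '#'.
    senses = {
        s.get("corresp").lstrip("#"): s.get("key")
        for s in sense_root.iter("sense")
    }
    return [(word, senses.get(seg_id)) for seg_id, word in words.items()]


if __name__ == "__main__":
    for word, sense in join_layers(SEGMENTATION, SENSES):
        print(word, sense)
```

Keeping the layers in separate structures and joining them by ID means a layer such as word senses can be added or revised without touching the segmentation it refers to.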
Metadata
Created:
23/01/2013
Last Updated:
23/01/2013
Source:
CESAR
Metadata Creator
Łukasz Kobyliński
http://zil.ipipan.wa...
Research Assistant
Jana Kazimierza 5
01-248 Warsaw
Tel.: +48 22 38 00 559
Fax: +48 22 38 00 510
Version
Version:
1.0