NKJP1mEcono corpus
NKJP1mEcono
http://zil.ipipan.waw.pl/NKJP1mEcono
ID:
450
Economy-related subcorpus of the National Corpus of Polish, containing a manually created word sense annotation layer.
Distribution
Availability:
Available
Licences
GPL
Conditions:
Share Alike
Distribution Details
Fee:
free of charge
Download location:
hidden
Distribution Access/Medium:
Downloadable
Contact Person
Łukasz Kobyliński
http://zil.ipipan.wa...
Research Assistant
Jana Kazimierza 5
01-248 Warsaw
Tel.: +48 22 38 00 559
Fax: +48 22 38 00 510
text
Monolingual text corpus
Languages
Polish
Language Script:
Latin
Linguality
Linguality type:
Monolingual
Size
87,816 Words
11 MB
Annotation
Segmentation
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Lemmatization
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Segmentation
StandOff:
True
Segmentation level:
Sentence
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Semantic Annotation - Word Senses
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Manual (manually disambiguated using AnotEk)
Annotation Tools:
AnotEk
Start date:
01/06/2010
End date:
01/09/2010
Morphosyntactic Annotation - Basic POS Tagging
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Morphosyntactic Annotation - POS Tagging
Tagset:
NKJP tagset
StandOff:
True
Segmentation level:
Word
Format:
text/xml
Standard practices conformance:
TEI
Annotation Mode:
Automatic
Annotation Tools:
TaKIPI 1.8
Start date:
01/02/2010
End date:
28/02/2010
Segmentation
StandOff:
True
Segmentation level:
Paragraph
Format:
text/xml
Standard practices conformance:
TEI
Start date:
01/02/2010
End date:
28/02/2010
Creation
Creation mode:
Mixed
Creation mode details:
The corpus was created by selecting economy-related paragraphs from the 1-million-word subcorpus of the National Corpus of Polish. The manual word sense annotation was produced by linguists using AnotEk.
Original Sources
1-million-word subcorpus of the National Corpus of Polish (1MNKJP)
Creation Tools
Java code
AnotEk 1.0
TaKIPI 1.8
Metadata
Created:
23/01/2013
Last Updated:
23/01/2013
Source:
CESAR
Metadata Creator
Łukasz Kobyliński
http://zil.ipipan.wa...
Research Assistant
Jana Kazimierza 5
01-248 Warsaw
Tel.: +48 22 38 00 559
Fax: +48 22 38 00 510
Version
Version:
1.0