LX-Tagger


http://lxcenter.di.fc.ul.pt/tools/en/LXTaggerEN.html

This tool, which was built to deal with Portuguese-specific issues of syntactic categorization, assigns a single morpho-syntactic tag, from the tagset below, to every token. The tag is attached to the token with a slash (/) as separator:

um exemplo → um/IA exemplo/CN
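
The slash-separated output can be read back into (token, tag) pairs with a few lines of code. The following is a minimal sketch, not part of LX-Tagger: the function name parse_tagged is hypothetical, and it assumes that items are separated by whitespace and that the last slash in each item is the separator.

```python
def parse_tagged(line: str) -> list[tuple[str, str]]:
    """Split whitespace-separated 'token/TAG' items into (token, tag) pairs."""
    pairs = []
    for item in line.split():
        # Assumption: the LAST slash is the separator, so a token that
        # itself contains a slash keeps its full form.
        token, sep, tag = item.rpartition("/")
        if not sep or not token or not tag:
            raise ValueError(f"not a token/TAG item: {item!r}")
        pairs.append((token, tag))
    return pairs


print(parse_tagged("um/IA exemplo/CN"))
```

Splitting on the last slash rather than the first is a design choice of this sketch; the page does not say how tokens containing a slash are handled.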

Each individual token in multi-token expressions gets the tag of that expression prefixed by "L" and followed by the number of its position within the expression:

de maneira a que → de/LCJ1 maneira/LCJ2 a/LCJ3 que/LCJ4
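
The multi-token convention can be illustrated in both directions: building the positional tags for an expression, and merging them back into one unit. This is a hedged sketch only. The helper names tag_expression and group_expressions are hypothetical, and the regular expression assumes that every tag made of "L", uppercase letters and a trailing number is a multi-token tag, which may not hold for the full tagset.

```python
import re

# Assumed shape of a multi-token tag: "L" + expression tag + 1-based position.
MWE_TAG = re.compile(r"^L(?P<tag>[A-Z]+)(?P<pos>\d+)$")


def tag_expression(tokens, tag):
    """Attach 'L<tag><position>' to each token of a multi-token expression."""
    return [f"{tok}/L{tag}{i}" for i, tok in enumerate(tokens, start=1)]


def group_expressions(pairs):
    """Merge consecutive multi-token items back into (expression, tag) units."""
    units = []
    expected = None  # (tag, next position) of the expression being assembled
    for token, tag in pairs:
        m = MWE_TAG.match(tag)
        if m is None:
            # Ordinary single-token tag: keep as is.
            units.append((token, tag))
            expected = None
            continue
        base, pos = m.group("tag"), int(m.group("pos"))
        if pos > 1 and expected == (base, pos):
            # Continuation of the expression currently being assembled.
            prev_token, _ = units[-1]
            units[-1] = (f"{prev_token} {token}", base)
        else:
            # Position 1, or an out-of-sequence item: start a new unit.
            units.append((token, base))
        expected = (base, pos + 1)
    return units


print(tag_expression(["de", "maneira", "a", "que"], "CJ"))
print(group_expressions([("de", "LCJ1"), ("maneira", "LCJ2"),
                         ("a", "LCJ3"), ("que", "LCJ4")]))
```

Applied to the example above, the second call yields a single unit for "de maneira a que" carrying the expression tag CJ.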

This tagger was developed with the TnT software on a small (260 Ktoken), accurately hand-tagged corpus. It reached an accuracy of 96.87% when trained on 90% of the corpus and evaluated on the held-out 10%, with the result averaged over 10 different test runs.
LX-Tagger was developed and is maintained at the University of Lisbon by NLX, the Natural Language and Speech Group of the Department of Informatics.
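
The evaluation protocol described above (train on 90%, test on the held-out 10%, repeat 10 times, average) can be sketched as follows. Several things here are assumptions rather than facts about the original experiment: the split is made over randomly shuffled sentences, the page does not say whether the runs were random splits or folds, and the model is a most-frequent-tag baseline standing in for TnT, which this sketch does not call. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_baseline(train):
    """Stand-in model (NOT TnT): most frequent tag seen for each token."""
    counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in train:
        for token, tag in sentence:
            counts[token][tag] += 1
            all_tags[tag] += 1
    default = all_tags.most_common(1)[0][0]  # fallback for unseen tokens
    lexicon = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    return lambda token: lexicon.get(token, default)


def accuracy(tagger, test):
    """Share of test tokens whose predicted tag equals the gold tag."""
    pairs = [(tok, tag) for sentence in test for tok, tag in sentence]
    correct = sum(tagger(tok) == tag for tok, tag in pairs)
    return correct / len(pairs)


def repeated_holdout(sentences, runs=10, train_share=0.9, seed=0):
    """Average accuracy over `runs` random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        shuffled = sentences[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_share)
        if cut == 0 or cut == len(shuffled):
            raise ValueError("corpus too small for this split")
        tagger = train_baseline(shuffled[:cut])
        scores.append(accuracy(tagger, shuffled[cut:]))
    return sum(scores) / len(scores)


# Toy corpus built from the example on this page; real data would be
# the hand-tagged corpus, which is not distributed here.
corpus = [[("um", "IA"), ("exemplo", "CN")] for _ in range(10)]
print(repeated_holdout(corpus))
```

Fixing the seed keeps the ten splits reproducible, so the averaged figure can be compared across different taggers.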


Distribution

Availability

Available - Restricted Use

Licence

Proprietary

Restrictions: Academic - Non Commercial Use

User Nature: Academic

Download location: hidden

Distribution Access/Medium: Downloadable

Licensors:

António Branco

Distribution rights holders:

António Branco

IPR Holder

António Branco

Contact Person

António Branco

toolService

Tool

Language Dependent

Input

Media type: Text

Resource type: Corpus

Modality: Written Language

Output

Media type: Text

Resource type: Corpus

Modality: Written Language

Segmentation level: Word

Operation

Operating system: Linux

Evaluation

Evaluated: True

Resource Creation

Resource Creator

António Branco

Metadata

Created: 07/11/2012

Last Updated: 07/11/2012

Source: METANET4U

META-SHARE

Metadata Language: English (en)

Metadata Creator

Catarina Carvalheiro

Version

Version: 1.0

Last Updated: 07/11/2012

Documentation

Tool Documentation: Online

Document Type: Other

Catarina Carvalheiro, LX-Tagger Narrative Description, http://194.117.45.19...

Document Type: Masters Thesis

João Silva, Shallow Processing of Portuguese: From Sentence Chunking to Nominal Lemmatization, http://docs.di.fc.ul... , 2007

Document Language: English
