LX-UTagger is a POS tagger for Portuguese that adopts the Universal Part-of-Speech tagset (UPOS) of the Universal Dependencies framework, with an initial performance of 99.06% under a ten-fold cross-validation scheme.
It is described in this article:
António Branco, João Ricardo Silva, Luís Gomes and João Rodrigues, 2022, "Universal Grammatical Dependencies for Portuguese with CINTIL Data, LX Processing and CLARIN support", In Proceedings, 13th Conference on Language Resources and Evaluation (LREC2022).
This article should be used as the canonical citation for the tagger, and interested users are referred to it for detailed information.
This tagger is trained on its companion CINTIL-UPos corpus, which contains around 1 million manually annotated tokens and can be obtained here: https://hdl.handle.net/21.11129/0000-000E-8B30-F.
You may also be interested in the following related resources, which can be found in this repository:
LX-USuite (https://hdl.handle.net/21.11129/0000-000F-327C-E),
LX-UDParser (https://hdl.handle.net/21.11129/0000-000E-8B31-E),
LX-Suite (https://hdl.handle.net/21.11129/0000-000E-5991-A),
LX-Tagger (https://hdl.handle.net/21.11129/0000-000B-D325-D),
LX-DepParser (https://hdl.handle.net/21.11129/0000-000E-598D-0),
LX-Parser (https://hdl.handle.net/21.11129/0000-000E-5999-2).