CroSloEngual BERT

Trilingual BERT (Bidirectional Encoder Representations from Transformers) model, trained on Croatian, Slovenian, and English data. A state-of-the-art tool that represents words/tokens as contextually dependent word embeddings and is used for various NLP classification tasks by fine-tuning the model end-to-end. CroSloEngual BERT consists of neural network weights and configuration files in PyTorch format (i.e., to be used with the PyTorch library).
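The end-to-end fine-tuning use described above can be sketched with the Hugging Face transformers library. The checkpoint identifier "EMBEDDIA/crosloengual-bert", the toy sentence, and the two-class setup are assumptions for illustration; the authoritative distribution is the PyTorch weights at the handle listed below.

```python
# Hedged sketch: fine-tuning CroSloEngual BERT for a two-class task.
# The Hub identifier "EMBEDDIA/crosloengual-bert" is an assumption; the
# canonical release is the PyTorch weight/config files at the PID handle.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "EMBEDDIA/crosloengual-bert"  # assumed Hub mirror of the checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# One toy batch; in practice, iterate over a labelled dataset with an optimizer.
batch = tokenizer(["To je primer stavka."], return_tensors="pt", padding=True)
labels = torch.tensor([1])
outputs = model(**batch, labels=labels)
outputs.loss.backward()  # end-to-end: gradients reach every BERT layer
```

From here, a standard optimizer step (e.g. AdamW) over batches completes the fine-tuning loop.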

Identifier
PID http://hdl.handle.net/11356/1317
Related Identifier https://arxiv.org/abs/2006.07890
Related Identifier http://hdl.handle.net/11356/1330
Related Identifier http://embeddia.eu
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1317
Provenance
Creator Ulčar, Matej; Robnik-Šikonja, Marko
Publisher Faculty of Computer and Information Science, University of Ljubljana
Publication Year 2020
Funding Reference info:eu-repo/grantAgreement/EC/H2020/825153
Rights Creative Commons - Attribution 4.0 International (CC BY 4.0); https://creativecommons.org/licenses/by/4.0/; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Croatian; Slovenian; English
Resource Type toolService
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 3
Discipline Linguistics