Slovene Conformer CTC BPE E2E Automated Speech Recognition model RSDO-DS2-ASR-E2E 2.0

This end-to-end (E2E) Conformer CTC BPE Automated Speech Recognition model was trained following the NVIDIA NeMo Conformer-CTC recipe (for details see the official NVIDIA NeMo ASR documentation, https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/intro.html, and the NVIDIA NeMo GitHub repository, https://github.com/NVIDIA/NeMo). It transcribes Slovene speech to text.

The training, development and test datasets were based on the Artur dataset and consisted of 630.38, 16.48 and 15.12 hours of transcribed speech in standardised form, respectively. The model was trained for 200 epochs and reached a word error rate (WER) of 0.0429 (4.29%) on the development dataset and 0.0558 (5.58%) on the test dataset.
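For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level WER is commonly computed: the word-level edit distance (substitutions, insertions, deletions) summed over all utterances, divided by the total number of reference words. The function names `edit_distance` and `word_error_rate` are illustrative assumptions, and this is not the evaluation script used to produce the figures above; text normalisation details may differ.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        hyp_tokens = hyp.split()
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_words += len(ref_tokens)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_edits / total_words


# One deleted word out of four reference words.
print(word_error_rate(["danes je lep dan"], ["danes je dan"]))  # 0.25
```

On this scale, a WER of 0.0429 corresponds to roughly 4.3 word errors per 100 reference words.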

Identifier
PID http://hdl.handle.net/11356/1737
Related Identifier https://github.com/clarinsi/Slovene_ASR_e2e
Related Identifier https://rsdo.slovenscina.eu/en/speech-technologies
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1737
Provenance
Creator Lebar Bajec, Iztok; Bajec, Marko; Bajec, Žan; Rizvič, Mitja
Publisher Faculty of Computer and Information Science, University of Ljubljana
Publication Year 2022
Rights Apache License 2.0; https://opensource.org/licenses/Apache-2.0; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Language Slovenian; Slovene
Resource Type toolService
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 1
Discipline Linguistics