Slovene Conformer CTC BPE E2E Automated Speech Recognition model RSDO-DS2-ASR-E2E 2.0

This Conformer CTC BPE end-to-end (E2E) Automated Speech Recognition model transcribes Slovene speech to text. It was trained following the NVIDIA NeMo Conformer-CTC recipe; for details, see the official NVIDIA NeMo ASR documentation (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/intro.html) and the NVIDIA NeMo GitHub repository (https://github.com/NVIDIA/NeMo).
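The following is a minimal sketch of how a NeMo Conformer-CTC BPE checkpoint is typically loaded and used for transcription. It assumes the NeMo toolkit is installed and that the downloaded archive contains a `.nemo` checkpoint; the file names `conformer_ctc_bpe.nemo` and `sample.wav` are hypothetical placeholders, not names taken from this record. The return type of `transcribe` differs between NeMo versions (plain strings in older releases, hypothesis objects in newer ones), so the result is returned unchanged.

```python
from typing import Any, List


def transcribe_files(checkpoint_path: str, wav_paths: List[str], batch_size: int = 4) -> List[Any]:
    """Transcribe audio files with a NeMo CTC BPE checkpoint.

    checkpoint_path: path to a .nemo file (hypothetical name, see the lead-in).
    wav_paths: paths to audio files; 16 kHz mono WAV is assumed here, which is
               the usual input for NeMo ASR models.
    """
    # Imported lazily so that this module can be loaded without NeMo installed.
    import nemo.collections.asr as nemo_asr

    # Restore the model (weights, tokenizer and config) from the checkpoint.
    model = nemo_asr.models.EncDecCTCModelBPE.restore_from(restore_path=checkpoint_path)
    model.eval()

    # The list of paths is passed positionally because the keyword name of
    # this argument changed between NeMo releases.
    return model.transcribe(wav_paths, batch_size=batch_size)


# Example call (placeholder file names, an assumption):
# results = transcribe_files("conformer_ctc_bpe.nemo", ["sample.wav"])
# for item in results:
#     print(item)
```

The related GitHub repository listed under Identifier may describe a different, service-oriented way of running the model; the sketch above only illustrates direct use through the NeMo Python API.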

The training, development and test datasets were based on the Artur dataset and consisted of 630.38, 16.48 and 15.12 hours of transcribed speech in standardised form, respectively. The model was trained for 200 epochs and reached a word error rate (WER) of 0.0429 (4.29%) on the development dataset and 0.0558 (5.58%) on the test dataset.
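For readers unfamiliar with the metric, the sketch below shows the standard definition of WER: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the model output, divided by the number of reference words. It is an illustration of the general formula, not the evaluation script used for this model; text normalisation and the way scores are aggregated over the dataset are not specified in this record. The example sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("dober dan vsem skupaj", "dober dan vsi skupaj"))  # 0.25
```

On this scale, the reported 0.0429 corresponds to roughly 4 word errors per 100 reference words.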

Identifier
PID	http://hdl.handle.net/11356/1737
Related Identifier	https://github.com/clarinsi/Slovene_ASR_e2e
Related Identifier	https://rsdo.slovenscina.eu/en/speech-technologies
Metadata Access	http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1737

Provenance
Creator	Lebar Bajec, Iztok; Bajec, Marko; Bajec, Žan; Rizvič, Mitja
Publisher	Faculty of Computer and Information Science, University of Ljubljana
Publication Year	2022
Rights	Apache License 2.0; https://opensource.org/licenses/Apache-2.0; PUB
OpenAccess	true
Contact	info(at)clarin.si

Representation
Language	Slovenian; Slovene
Resource Type	toolService
Format	text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 1
Discipline	Linguistics