Replication data for: "Solving morphological analogies: from retrieval to generation"

This repository contains the models trained for the article "Solving morphological analogies: from retrieval to generation".

The data is split into three folders: "models", "results", and "logs".

The folders "models" and "results", distributed as "model.zip" and "results.zip" respectively, share the path structure "[model]/[dataset]/[langage]/[random_initialization_id]", where "[dataset]" follows the Siganalogies labels: "2016" for Sigmorphon2016 and JBATS, and "2019" for Sigmorphon2019.
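The path structure above can be split into its four components programmatically. The following is a minimal sketch, not part of the deposited material: the names `RunPath` and `parse_run_path` are hypothetical, and the example path (language "german", initialization id "0") is made up for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Dataset labels as described in the record (Siganalogies convention).
DATASET_LABELS = {
    "2016": "Sigmorphon2016 and JBATS",
    "2019": "Sigmorphon2019",
}


@dataclass(frozen=True)
class RunPath:
    model: str
    dataset: str
    language: str
    init_id: str


def parse_run_path(path: str) -> RunPath:
    """Split '.../[model]/[dataset]/[langage]/[random_initialization_id]'."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components, got {path!r}")
    # Only the last four components matter; any prefix (e.g. 'models') is ignored.
    model, dataset, language, init_id = parts[-4:]
    if dataset not in DATASET_LABELS:
        raise ValueError(f"unknown dataset label {dataset!r}")
    return RunPath(model, dataset, language, init_id)


# Hypothetical example path; the actual language folder names may differ.
run = parse_run_path("models/clf/2019/german/0")
print(run, "->", DATASET_LABELS[run.dataset])
```

The identifiers are kept as strings because the record does not state whether the initialization ids are always integers.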

The data contained is as follows:

"models/[...]/model.pkl": PyTorch model file;
"models/[...]/summary.csv": evaluation results and other metadata about the training and the structure of the model, including the timestamp at which the model finished training;
"models/[...]/version_1.0/": PyTorch-Lightning training logs, viewable with TensorBoard;
"models/[...]/fails.csv": list of all the test analogies that the model failed to predict correctly, in an extensive format, so that for most purposes it is not necessary to consult Siganalogies to analyse the results.

The two folders cover the following models:

"clf": CNN+ANNc for classification;
"ret": CNN+ANNr for retrieval;
"3cosmul": CNN+3CosMul for retrieval, only contains "summary.csv" and reuses the embedding model of "clf";
"ret-annc": CNN+ANNc for retrieval, only contains "summary.csv" and reuses the embedding model of "clf".

The folder "logs" is stored unpacked in Dorel, so each file can be downloaded separately. Its path structure is "ae_annr/[dataset]/[langage]/model[random_initialization_id]-data[random_data_split_id]".
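The run folder name in "logs" encodes two identifiers, which a regular expression can separate. The sketch below is illustrative only: `parse_log_run` is a hypothetical helper, the example path is made up, and the ids are kept as strings because their exact format is not stated.

```python
import re
from pathlib import PurePosixPath

# 'model[random_initialization_id]-data[random_data_split_id]'
RUN_NAME = re.compile(r"^model(?P<init_id>.+?)-data(?P<split_id>.+)$")


def parse_log_run(path):
    """Split '.../ae_annr/[dataset]/[langage]/model[id]-data[id]'."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4 or parts[-4] != "ae_annr":
        raise ValueError(f"not an ae_annr run folder: {path!r}")
    _, dataset, language, run_name = parts[-4:]
    match = RUN_NAME.match(run_name)
    if match is None:
        raise ValueError(f"unexpected run folder name: {run_name!r}")
    return {
        "dataset": dataset,
        "language": language,
        "init_id": match.group("init_id"),
        "split_id": match.group("split_id"),
    }


# Hypothetical example path.
print(parse_log_run("logs/ae_annr/2016/german/model0-data1"))
```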

The data contained is as follows:

"logs/ae_annr/[...]/debug/checkpoints/[...].pkl": PyTorch-Lightning model file;
"logs/ae_annr/[...]/summary.csv": evaluation results and other metadata about the training and the structure of the model, including the timestamp at which the model finished training;
"logs/ae_annr/[...]/debug/": PyTorch-Lightning training logs, viewable with TensorBoard;
"logs/ae_annr/[...]/fails.csv": list of all the test analogies that the model failed to predict correctly, in an extensive format, so that for most purposes it is not necessary to consult Siganalogies to analyse the results.

This folder only covers the AE+ANNr model.
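Because the "logs" files are stored individually, it can be convenient to index a local copy before analysing it. The following sketch assumes the files were downloaded into a local directory that preserves the "ae_annr/..." layout; `index_log_runs` is a hypothetical helper, and the checkpoint search is recursive because the file names below "debug/checkpoints/" are elided in the description.

```python
from pathlib import Path


def index_log_runs(logs_root):
    """Map each AE+ANNr run folder to its summary, fails file and checkpoints."""
    root = Path(logs_root)
    index = {}
    # One summary.csv is expected per run: ae_annr/[dataset]/[langage]/[run]/
    for summary in sorted(root.glob("ae_annr/*/*/*/summary.csv")):
        run_dir = summary.parent
        ckpt_dir = run_dir / "debug" / "checkpoints"
        checkpoints = sorted(ckpt_dir.rglob("*.pkl")) if ckpt_dir.is_dir() else []
        index[run_dir.relative_to(root).as_posix()] = {
            "summary": summary,
            "fails": run_dir / "fails.csv",
            "checkpoints": checkpoints,
        }
    return index
```

Runs without a downloaded checkpoint still appear in the index, with an empty checkpoint list.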

PyTorch, 1.13.1

PyTorch-Lightning, 1.9.3
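Pickled PyTorch files can be sensitive to library versions, so it may be worth comparing the local environment with the two versions listed above before loading anything. This is a sketch only: `EXPECTED`, `same_release`, and `check_environment` are hypothetical names, and the comparison ignores local build suffixes such as "+cu117".

```python
import importlib

# Import names of the two packages and the versions listed in the record.
EXPECTED = {"torch": "1.13.1", "pytorch_lightning": "1.9.3"}


def same_release(installed, expected):
    """Compare versions while ignoring a local build suffix (e.g. '+cu117')."""
    return installed.split("+", 1)[0] == expected


def check_environment():
    """Report, per package, whether the installed version matches the record."""
    report = {}
    for module_name, expected in EXPECTED.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[module_name] = "not installed"
            continue
        installed = getattr(module, "__version__", "unknown")
        if same_release(installed, expected):
            report[module_name] = "ok"
        else:
            report[module_name] = f"installed {installed}, record lists {expected}"
    return report
```

Once the environment matches, `torch.load(path, map_location="cpu")` is the usual way to read a PyTorch pickle on a machine without a GPU. The record does not state whether "model.pkl" holds a full module or only a state dictionary; in the former case the original model classes would need to be importable.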

Identifier
DOI https://doi.org/10.12763/I5ED78
Metadata Access https://dorel.univ-lorraine.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.12763/I5ED78
Provenance
Creator Marquer, Esteban; Couceiro, Miguel
Publisher Université de Lorraine
Contributor Marquer, Esteban
Publication Year 2023
Rights Etalab (CC-BY); info:eu-repo/semantics/openAccess; https://www.etalab.gouv.fr/wp-content/uploads/2017/04/ETALAB-Licence-Ouverte-v2.0.pdf
OpenAccess true
Contact Marquer, Esteban (Université de Lorraine)
Representation
Resource Type Model; Dataset
Format application/octet-stream; text/tab-separated-values; application/zip
Size 9671; 475; 9303; 7831; 8015; 9487; 6543; 7279; 6727; 9119; 7095; 5991; 8199; 7647; 7463; 8383; 3783; 288166; 246298; 92001; 132350; 800043; 777042; 102578; 396064; 75909; 640942; 182310; 122164; 812243; 646510; 970871; 117234; 422484; 190463; 79825; 764977; 58487; 119640; 484132; 1829523; 182365; 67983; 134441; 466632; 529178; 1027004; 324142; 968139; 257362; 194644; 869412; 87054; 390848; 288259; 330382; 156436; 532306; 735377; 84429; 291063; 355577; 175465; 249998; 366480; 214205; 233738; 794093; 596411; 592924; 247492; 775474; 81735; 103178; 200142; 422931; 767018; 62693; 66373; 352666; 126816; 100621; 227314; 389929; 555057; 148366; 535922; 120191; 576988; 349989; 74580; 557899; 296071; 336923; 496806; 1077370; 570521; 67454; 478411; 511318; 529679; 76738; 474617; 204465; 195305; 550384; 303238; 829575; 559692; 356227; 1174585; 443978; 195577; 59100; 246138; 561669; 537814; 136216; 480914; 449505; 416750; 208222; 65362; 305055; 930905; 243993; 252870; 361352; 243062; 256818; 91071; 406355; 230446; 467726; 337468; 87044; 522712; 252372; 61828; 187658; 486886; 395916; 65981; 508966; 755637; 133643; 758077; 78877; 306519; 242332; 894525; 753945; 244271; 939784; 93032; 252199; 431985; 198650; 304809; 62912; 268297; 80453; 366316; 61331; 453753; 269424; 513851; 344149; 303405; 75554; 91645; 60163; 281988; 320694; 453993; 112479; 395098; 1274; 1820; 816; 1542; 1870; 1372; 1060; 1560; 574; 1340; 884; 932; 996; 716; 1116; 1200; 71774254; 400248434; 11195597; 11181709; 11209357; 11292429; 11361869; 11486477; 11597389; 11126349; 11264781; 11500365; 11625101; 11527949; 11056909; 11112269; 11361805; 173; 164; 168; 179; 171; 182; 157; 166; 178; 175; 169; 184; 1744; 185; 167; 170; 1759; 1797; 186; 165; 162; 174; 1689; 1850; 172; 180; 1717; 176; 177; 190; 1669; 1737; 181; 163; 187; 191; 1693; 183; 1687; 188; 1760; 161; 1755; 1691; 1747; 1655
Version 1.0
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Humanities; Life Sciences; Linguistics; Social Sciences; Social and Behavioural Sciences; Soil Sciences