PANACEA Environment Bilingual Glossary FR-EN (French-English)

This folder contains files for creating a bilingual glossary from factored phrase tables that include part-of-speech-tagged text for the FR-EN language pair. The tables are first filtered using part-of-speech tag sequences for each language, so that entries with unsuitable part-of-speech sequences are removed. Feature scores from the phrase table are then combined in a log-linear model to score each remaining entry. The user specifies how large the output glossary should be (relative to the input), and the lowest-ranking entries are discarded to produce a glossary of the desired size.
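The three steps described above (part-of-speech filtering, log-linear scoring, and pruning to a target size) can be illustrated with a minimal Python sketch. It is not the code shipped in this dataset. It assumes a Moses-style line layout (`source ||| target ||| scores`) with `word|POS` factors, and the tag names, allowed tag patterns, feature weights, and function names are all hypothetical. It also assumes that the output size is a fraction of the number of input entries.

```python
import math

# Hypothetical allowed POS sequences; the real patterns are not given in the description.
ALLOWED_SRC = {("NOUN",), ("NOUN", "ADJ"), ("NOUN", "PREP", "NOUN")}  # French side
ALLOWED_TGT = {("NOUN",), ("ADJ", "NOUN"), ("NOUN", "NOUN")}          # English side


def split_factors(phrase):
    """Split 'word|POS word|POS' into a word string and a tuple of tags."""
    words, tags = [], []
    for token in phrase.split():
        word, tag = token.rsplit("|", 1)
        words.append(word)
        tags.append(tag)
    return " ".join(words), tuple(tags)


def parse_line(line):
    """Parse 'src ||| tgt ||| s1 s2 ...' (assumed layout) into its three parts."""
    fields = [field.strip() for field in line.split("|||")]
    return fields[0], fields[1], [float(s) for s in fields[2].split()]


def log_linear_score(features, weights):
    """Weighted sum of log feature values (features must be positive)."""
    return sum(w * math.log(f) for w, f in zip(weights, features))


def build_glossary(lines, weights, keep_ratio):
    """Filter by POS sequence, score, and keep the top-ranked entries.

    keep_ratio is interpreted here as a fraction of the input entry count.
    """
    lines = list(lines)
    scored = []
    for line in lines:
        src, tgt, features = parse_line(line)
        src_text, src_tags = split_factors(src)
        tgt_text, tgt_tags = split_factors(tgt)
        if src_tags not in ALLOWED_SRC or tgt_tags not in ALLOWED_TGT:
            continue  # unsuitable part-of-speech sequence
        scored.append((log_linear_score(features, weights), src_text, tgt_text))
    scored.sort(key=lambda entry: entry[0], reverse=True)
    keep = int(len(lines) * keep_ratio)
    return [(src, tgt) for _, src, tgt in scored[:keep]]
```

Calling `build_glossary(lines, weights=[1.0, 1.0, 1.0, 1.0], keep_ratio=0.5)` on a list of phrase-table lines returns at most half as many (French, English) pairs as there were input lines, ordered from highest to lowest score. Uniform weights are used here only for illustration.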

Identifier
DOI https://doi.org/10.34810/data356
Metadata Access https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data356
Provenance
Creator Dublin City University. School of Computing
Publisher CORA.Repositori de Dades de Recerca
Contributor Universitat Pompeu Fabra. Institut Universitari de Lingüística Aplicada (IULA)
Publication Year 2023
Funding Reference European Commission 248064
Rights Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data356
OpenAccess true
Representation
Resource Type Textual data; Dataset
Format application/pdf; text/plain; text/html; application/zip
Size 172184; 205; 8889; 2143; 13296014
Version 1.0
Discipline Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Humanities; Life Sciences; Social Sciences; Social and Behavioural Sciences; Soil Sciences