This paper presents a new set of lemma embeddings for the Latin language. Embeddings are trained on a manually annotated corpus of texts belonging to the Classical era: different models, architectures and dimensions are tested and evaluated using a novel benchmark for the synonym selection task. A qualitative evaluation is also performed on the embeddings of rare lemmas. In addition, we release vectors pre-trained on the “Opera Maiora” by Thomas Aquinas, thus providing a resource to analyze Latin in a diachronic perspective.
Vir is to Moderatus as Mulier is to Intemperans. Lemma Embeddings for Latin / Sprugnoli, Rachele; Passarotti, Marco; Moretti, Giovanni. - (2019), pp. 1-7. (Paper presented at the Sixth Italian Conference on Computational Linguistics, held in Bari, Italy, 13-15 November 2019) [10.5281/zenodo.3565572].
Vir is to Moderatus as Mulier is to Intemperans. Lemma Embeddings for Latin
Sprugnoli Rachele
2019-01-01