Content Type Dataset - v1.5 / Caselli, Tommaso; Sprugnoli, Rachele; Moretti, Giovanni. - Content Type Dataset - v1.5:(2021).

Content Type Dataset - v1.5

Rachele Sprugnoli
2021-01-01

Abstract

This repository contains:
- the Content Type Dataset (CTD) Version 1.5 (in the folder "Datasets");
- the latest version of the guidelines for annotating Content Types;
- the data statement for CTD V1.5;
- a set of spreadsheets with metadata about the documents included in the dataset, e.g. year of publication and the author's name, nationality, and gender (in the folder "Documents_Metadata");
- the data needed to replicate a set of experiments on the identification of Content Types (in the folder "Datasets");
- the best model for the identification of Content Types, obtained with the BiLSTM-CNN-CRF with ELMo-Representations for Sequence Tagging implementation by Nils Reimers and Iryna Gurevych (in the folder "Best_Model");
- the data used to calculate the Inter-Annotator Agreement (in the folder "IAA"). The script used to calculate Cohen's kappa is available at https://github.com/johnnymoretti/CAT_R_Kappa_Cohen

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2934573