This paper presents L-KD, a tool that relies on available linguistic and knowledge resources to perform keyphrase clustering and labelling. L-KD aims to help find and trace themes in English and Italian text data, where themes are represented by groups of keyphrases and their associated domains. We evaluate the top-ranked domains on the 20 Newsgroups dataset and show that 8 out of 10 domains match the manually assigned labels. This result indicates that the approach, which requires no supervision, achieves good accuracy.
KD Strikes Back: from Keyphrases to Labelled Domains Using External Knowledge Sources / Moretti, Giovanni; Sprugnoli, Rachele; Tonelli, Sara. - (2016), pp. 216-221. (Paper presented at the Third Italian Conference on Computational Linguistics (CLiC-it 2016), held in Naples, Italy, 5-7 December 2016).
KD Strikes Back: from Keyphrases to Labelled Domains Using External Knowledge Sources
Rachele Sprugnoli
2016-01-01