A probabilistic matrix factorization algorithm for approximation of sparse matrices in natural language processing / Tarantino, G.; Monica, S.; Bergenti, F. - In: ICT EXPRESS. - ISSN 2405-9595. - 4:2(2018), pp. 87-90. [10.1016/j.icte.2018.04.005]
A probabilistic matrix factorization algorithm for approximation of sparse matrices in natural language processing
Monica S.; Bergenti F.
2018-01-01
Abstract
This paper proposes a variation of a well-known probabilistic matrix factorization algorithm that is commonly used in data analysis and scientific computing and has recently been considered for natural language processing. The variation exploits the fact that the matrices processed in natural language processing tasks are typically sparse and rectangular, with one dimension much larger than the other, in order to achieve adequate accuracy with acceptable computation time. Preliminary experiments on real-world textual corpora show that the proposed algorithm achieves relevant improvements over the original one.