Hybrid GP/PSO Representation of 1-D Signals in an Autoencoder Fashion / Magnani, G.; Mordonini, M.; Cagnoni, S. - 1977 (2024), pp. 228-238. (17th Italian Workshop on Artificial Life and Evolutionary Computation, WIVACE 2023, Italy, 2023) [10.1007/978-3-031-57430-6_18].
Hybrid GP/PSO Representation of 1-D Signals in an Autoencoder Fashion
Magnani G.; Mordonini M.; Cagnoni S.
2024-01-01
Abstract
A popular way to obtain a low-dimensional latent representation that retains as much information as possible about a data set is to take the output of a hidden layer of an autoencoder: a neural network with homomorphic input and output layers, trained to reproduce its input at its output. This exploratory paper suggests that the same goal can be achieved by reformulating the problem as symbolic regression based on genetic programming (GP). In this case, the latent representation is obtained as a byproduct of finding a parametric equation that models a family of signals (functions) sharing the same equation and differing only in the values of a set of free parameters that appear in their definition. The hypothesis is supported by a simple proof of concept based on the symbolic regression of a set of Gaussian functions. The paper concludes with a discussion of issues that may arise when the method is applied to more complex real-world data, and of possible countermeasures.
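The idea of using the free parameters of a shared equation as a latent code can be illustrated with a minimal sketch. It assumes the parametric form (a Gaussian with amplitude `a`, mean `mu` and width `sigma`) is already known, whereas in the paper it is the result of symbolic regression. The parameter estimation below uses simple moment formulas as a stand-in and is not the PSO step named in the title. The names `gaussian`, `encode` and `decode` are hypothetical and chosen only for illustration.

```python
import math

def gaussian(x, a, mu, sigma):
    # Shared parametric equation of the signal family (assumed known here).
    return a * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def encode(xs, ys):
    # "Encoder": compress a sampled signal into three latent values.
    # Moment-based estimate on a uniform grid; a stand-in for an optimizer.
    total = sum(ys)
    mu = sum(x * y for x, y in zip(xs, ys)) / total
    var = sum((x - mu) ** 2 * y for x, y in zip(xs, ys)) / total
    sigma = math.sqrt(var)
    dx = xs[1] - xs[0]
    # Area under a Gaussian is a * sigma * sqrt(2 * pi); invert for a.
    a = total * dx / (sigma * math.sqrt(2 * math.pi))
    return a, mu, sigma

def decode(xs, latent):
    # "Decoder": evaluate the shared equation with the latent parameters.
    a, mu, sigma = latent
    return [gaussian(x, a, mu, sigma) for x in xs]

# A 201-sample signal is represented by 3 numbers and then reconstructed.
xs = [-10 + 0.1 * i for i in range(201)]
signal = [gaussian(x, 2.0, 1.5, 0.8) for x in xs]
latent = encode(xs, signal)
recon = decode(xs, latent)
max_err = max(abs(r - s) for r, s in zip(recon, signal))
print(latent, max_err)
```

In this sketch the equation plays the role of the decoder and the parameter-estimation step plays the role of the encoder; the three parameters are the latent representation of each signal.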


