Line Search Stochastic Gradient Algorithm with A-priori Rule for Monitoring the Control of the Variance / Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.; Zanni, L. - LNCS 14476 (2025), pp. 94-107. (Paper presented at the conference Numerical Computations: Theory and Algorithms, NUMTA 2023, held in Pizzo Calabro, Italy, 14-20 June 2023) [10.1007/978-3-031-81241-5_7].
Line Search Stochastic Gradient Algorithm with A-priori Rule for Monitoring the Control of the Variance
Trombini, I.
2025-01-01
Abstract
The aim of this paper is to analyze a different practical implementation of the LIne Search based stochastic gradient Algorithm (LISA) recently proposed by Franchini et al. The LISA scheme belongs to the class of stochastic gradient methods and relies on a line search strategy to select the learning rate and on a dynamic technique to increase the mini-batch size used to build the stochastic directions. Despite the promising performance of LISA in solving optimization problems arising from machine learning applications, its mini-batch increasing strategy requires checking a condition that may be computationally expensive and memory demanding, especially in the presence of both deep neural networks and very large-scale datasets. In this work we investigate an a-priori procedure for selecting the size of the current mini-batch, which reduces the execution time of the LISA method and controls its memory requirements, especially when dealing with hardware accelerators. Numerical experiments on training both statistical models for binary classification and deep neural networks for multi-class image classification confirm the effectiveness of the proposal.
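To make the a-priori idea concrete, the following is a minimal, hypothetical sketch, not the authors' LISA implementation: a stochastic gradient loop in which the mini-batch size follows a pre-defined geometric growth schedule instead of being increased through a per-iteration variance-control test, while the learning rate is chosen by an Armijo-type backtracking line search on the sampled loss. The helper name `apriori_batch_size`, the growth factor `rho`, and all parameter values are illustrative assumptions.

```python
import numpy as np

def apriori_batch_size(k, n0=16, rho=1.05, n_max=4096):
    """Pre-defined (a-priori) mini-batch size at iteration k: geometric growth,
    capped at n_max. No runtime variance test is needed to decide the size."""
    return min(int(n0 * rho ** k), n_max)

def sgd_line_search_apriori(loss, grad, x0, data, n_iters=100,
                            alpha0=1.0, beta=0.5, c=1e-4, seed=0):
    """Sketch of a stochastic gradient loop combining an a-priori batch
    schedule with Armijo-type backtracking on the sampled loss.
    `loss(x, batch)` and `grad(x, batch)` are user-supplied callables."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for k in range(n_iters):
        nk = apriori_batch_size(k, n_max=len(data))
        idx = rng.choice(len(data), size=nk, replace=False)
        batch = data[idx]
        g = grad(x, batch)            # mini-batch gradient direction
        f0 = loss(x, batch)
        alpha = alpha0
        # Backtracking line search: shrink alpha until sufficient decrease
        # of the mini-batch loss (Armijo condition) holds.
        while loss(x - alpha * g, batch) > f0 - c * alpha * (g @ g):
            alpha *= beta
            if alpha < 1e-10:         # safeguard against stalling
                break
        x = x - alpha * g
    return x
```

Because the batch size at each iteration is known in advance, memory for the mini-batch can be pre-allocated and no variance estimate has to be computed or stored per iteration, which is the source of the time and memory savings the abstract refers to.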