Predictive Power of a Bayesian Effective Action for Fully Connected One Hidden Layer Neural Networks in the Proportional Limit

IRIS

We perform accurate numerical experiments with fully connected one hidden layer neural networks trained with a discretized Langevin dynamics on the MNIST and CIFAR10 datasets. Our goal is to empirically determine the regimes of validity of a recently derived Bayesian effective action for shallow architectures in the proportional limit. We explore the predictive power of the theory as a function of the parameters (the temperature T, the magnitude of the Gaussian priors λ1, λ0, the size of the hidden layer N1, and the size of the training set P) by comparing the experimental and predicted generalization error. The very good agreement between the effective theory and the experiments represents an indication that global rescaling of the infinite-width kernel is a main physical mechanism for kernel renormalization in fully connected Bayesian standard-scaled shallow networks.
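The training procedure named in the abstract, a discretized Langevin dynamics at temperature T with Gaussian priors, can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes a squared-error loss, a tanh activation, standard 1/sqrt(width) scaling, and that λ0 acts on the first-layer weights and λ1 on the readout weights. The names `forward`, `langevin_step`, `eta`, `lam0` and `lam1` are hypothetical.

```python
import numpy as np

def forward(W, v, X):
    """Output and hidden activations of a one-hidden-layer network.

    Assumed (not from the source): tanh activation and standard scaling,
    f(x) = v . tanh(W x / sqrt(N0)) / sqrt(N1).
    """
    N1, N0 = W.shape
    H = np.tanh(X @ W.T / np.sqrt(N0))        # shape (P, N1)
    return H @ v / np.sqrt(N1), H

def langevin_step(W, v, X, y, T, lam0, lam1, eta, rng):
    """One Euler step of Langevin dynamics at temperature T.

    Assumed energy: U = L + T * (lam0/2 * |W|^2 + lam1/2 * |v|^2),
    with L = 0.5 * sum of squared residuals, so that the stationary
    density is proportional to exp(-L/T) times the Gaussian priors.
    """
    N1, N0 = W.shape
    f, H = forward(W, v, X)
    r = f - y                                  # residuals, shape (P,)

    # Gradients of the squared-error loss.
    grad_v = H.T @ r / np.sqrt(N1)
    delta = (r[:, None] * v[None, :] / np.sqrt(N1)) * (1.0 - H**2)
    grad_W = delta.T @ X / np.sqrt(N0)

    # Contribution of the Gaussian priors to the energy gradient.
    grad_W = grad_W + T * lam0 * W
    grad_v = grad_v + T * lam1 * v

    # Drift plus Gaussian noise with variance 2 * eta * T per coordinate.
    noise = np.sqrt(2.0 * eta * T)
    W_new = W - eta * grad_W + noise * rng.standard_normal(W.shape)
    v_new = v - eta * grad_v + noise * rng.standard_normal(v.shape)
    return W_new, v_new
```

Under these assumptions, setting T = 0 removes both the noise and the prior term, so the step reduces to plain gradient descent on the loss. Iterating the step with T > 0 and a small `eta` would draw approximate samples from the posterior, with a discretization bias that shrinks as `eta` decreases.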

Predictive Power of a Bayesian Effective Action for Fully Connected One Hidden Layer Neural Networks in the Proportional Limit / Baglioni, P.; Pacelli, R.; Aiudi, R.; Di Renzo, F.; Vezzani, A.; Burioni, R.; Rotondo, P.. - In: PHYSICAL REVIEW LETTERS. - ISSN 0031-9007. - 133:2(2024). [10.1103/PhysRevLett.133.027301]

Baglioni P.;Pacelli R.;Aiudi R.;Di Renzo F.;Vezzani A.;Burioni R.;Rotondo P.

2024-01-01

Product year: 2024
Appears in types: 1.1 Journal article

Files in this product:

File	Type	License	Size	Format
2401.11004v1.pdf (open access)	Post-print document	Creative Commons	1.24 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2991834
