Subnetwork Learning for Laplace Approximations
Background
The Laplace approximation is a promising approach to posterior approximation that can address some core issues of deep learning, such as poor calibration. Scaling this method to large parameter spaces is intractable because the covariance matrix grows quadratically with the number of neural network parameters and hence cannot be stored in memory. A proposed solution to this problem is to treat only a subset of the parameters as stochastic [1, 2, 3] and the rest as deterministic. However, how to select this subnetwork is still an open problem. In this project we will explore the possibility of learning the optimal subnetwork structure by instantiating the small covariance matrix and backpropagating through a Bayesian loss function (ELBO, marginal likelihood, predictive posterior distribution). ...
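To make the setting concrete, below is a minimal PyTorch sketch of the core building block, not the project's implementation: the toy model, data, subnetwork size k, and the largest-magnitude selection rule are all illustrative assumptions. It instantiates the full covariance only over a chosen subnetwork and evaluates a Laplace approximation to the log marginal likelihood, the kind of differentiable Bayesian objective one could backpropagate through to learn the subnetwork structure.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

torch.manual_seed(0)

# Toy regression data and a small MLP (illustrative stand-ins)
X, y = torch.randn(128, 4), torch.randn(128, 1)
model = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 1))

# Fit a MAP estimate first; the Laplace approximation is built around it
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    nn.functional.mse_loss(model(X), y).backward()
    opt.step()

# Flatten all parameters and pick a subnetwork; largest-magnitude selection is
# the kind of hand-crafted heuristic the project aims to replace with a learned rule
params = nn.utils.parameters_to_vector(model.parameters()).detach()
k = 32
subnet_idx = params.abs().topk(k).indices

names = [n for n, _ in model.named_parameters()]
shapes = [p.shape for _, p in model.named_parameters()]
numels = [p.numel() for _, p in model.named_parameters()]

def unflatten(theta):
    """Split a flat parameter vector back into a dict of named parameters."""
    out, i = {}, 0
    for n, s, m in zip(names, shapes, numels):
        out[n] = theta[i:i + m].view(s)
        i += m
    return out

def nll(theta_sub):
    """Negative log likelihood as a function of the subnetwork parameters only."""
    theta = params.index_copy(0, subnet_idx, theta_sub)  # gradients flow through theta_sub
    preds = functional_call(model, unflatten(theta), (X,))
    return 0.5 * ((preds - y) ** 2).sum()

theta_sub = params[subnet_idx].clone()
# Full Hessian over the k subnetwork parameters only: a k x k matrix, so it fits in memory
H = torch.autograd.functional.hessian(nll, theta_sub)

prior_prec = 1.0
posterior_prec = H + prior_prec * torch.eye(k)

# Laplace log marginal likelihood over the subnetwork (up to additive constants);
# differentiable quantities like this are candidates for the "Bayesian loss"
# that the project proposes to backpropagate through
log_marglik = (-nll(theta_sub)
               - 0.5 * prior_prec * theta_sub.pow(2).sum()
               - 0.5 * torch.logdet(posterior_prec)
               + 0.5 * k * torch.log(torch.tensor(prior_prec)))
print(float(log_marglik))
```

The point of the sketch is the memory argument from the paragraph above: with k small, the k x k posterior precision over the subnetwork can be formed and differentiated explicitly, whereas the full covariance over all parameters could not be stored.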