Correcting Stochastic Gradient Descent for Scalable Bayesian Inference
Besides its use as an optimization algorithm for training neural network models, Stochastic Gradient Descent (SGD) has been widely applied as a scalable algorithm for Bayesian inference and uncertainty quantification. This approach is particularly attractive in the large-dataset regime, where most traditional Markov Chain Monte Carlo (MCMC) samplers become computationally intractable. In this talk, I will first discuss some accuracy limitations of SGD-based MCMC methods, and then show how such limitations can be overcome, asymptotically, through a post-hoc correction of the algorithm's output. Finally, I will discuss how this correction method extends naturally to further sources of bias in covariance estimation, such as the (random) approximation of intractable integrals in likelihoods of Generalized Linear Mixed Models. Based on joint work with Sayan Mukherjee and Samuel Berchuk.
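To make the setting concrete, here is a minimal, hypothetical sketch (not the talk's method) of using constant step-size SGD iterates as approximate posterior samples, on a toy Gaussian mean model with a flat prior. The data, step size, and batch size below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: i.i.d. Gaussian observations with unknown mean (illustrative only)
n = 10_000
data = rng.normal(loc=2.0, scale=1.0, size=n)

def minibatch_grad(theta, batch):
    # Unbiased minibatch estimate of the full-data gradient of the
    # negative log-likelihood: sum_i (theta - x_i), rescaled from the batch.
    return n * np.mean(theta - batch)

step = 1e-5       # constant step size: iterates form a Markov chain whose
batch_size = 100  # stationary distribution approximates the posterior
theta = 0.0
samples = []
for t in range(5000):
    batch = rng.choice(data, size=batch_size, replace=False)
    theta -= step * minibatch_grad(theta, batch)
    if t > 1000:  # discard burn-in before treating iterates as samples
        samples.append(theta)

samples = np.asarray(samples)
print(samples.mean())  # close to the posterior mean, i.e. data.mean()
```

Note that while the chain's mean tracks the posterior mean, its stationary variance generally does not match the true posterior covariance (it depends on the step size and minibatch noise). That miscalibration is exactly the kind of accuracy limitation the talk addresses via a post-hoc correction.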
Area: IS3 - Mathematics of Machine Learning (Andrea Agazzi)
Keywords: Gradient Descent