Speaker: Jason Rocks
Department of Physics
Boston University
Boston, MA
Title: Memorizing without overfitting: Bias, variance and interpolation in over-parameterized models
Abstract: Over the last decade, advances in Machine Learning, and in particular Deep Learning, have resulted in incredible progress in the ability to learn statistical relationships from large data sets and make accurate predictions. At the same time, we have experienced an explosion in the quantity and quality of biological data available from experiments. However, the application of Deep Learning methods to biological data poses interesting challenges due to data heterogeneity, the need for biological interpretability, and new potential sources of bias. While some of these challenges arise from the nature of the data, others stem from the Deep Learning models themselves. In contrast to models from classical statistics, Deep Learning models almost always have many more fit parameters than data points, a setting in which classical statistical intuitions such as the bias-variance trade-off no longer apply. This raises fundamental questions about how these methods work and what new, unaccounted-for biases arise in these "over-parameterized" models that are not present in classical statistics. In this presentation, we use methods from statistical physics to derive analytic expressions for bias and variance in three minimal models of over-parameterization (linear regression and two-layer neural networks with linear and nonlinear activation functions), allowing us to disentangle properties stemming from the model architecture from those stemming from the random sampling of data. Using these three models, we identify the root causes of bias and variance, allowing us to construct a holistic understanding of generalization error and the bias-variance trade-off in over-parameterized models.