Geometric Analysis of Deep Representations

Background: Modern deep neural networks, especially overparameterized ones with a very large number of parameters, perform impressively well. Traditional learning theories fail to explain these empirical results and even appear to contradict them, which has led to new approaches that aim to understand why deep learning generalizes. A common belief is that flat minima [1] in the parameter space lead to models that generalize well. For instance, such models may learn to extract high-quality features from the data, known as representations....

May 27, 2024 · Georgios Arvanitidis