Haputhanthri, U., Storan, L., Jiang, Y., Raheja, T., Shai, A., Akengin, O., Miolane, N., Schnitzer, M. J., Dinc, F., & Tanaka, H.

Abstract

Training recurrent neural networks (RNNs) is a high-dimensional process that requires updating numerous parameters, which makes it difficult to pinpoint the underlying learning mechanisms. To address this challenge, we propose to gain mechanistic insight into the phenomenon of abrupt learning by studying RNNs trained to perform diverse short-term memory tasks. In these tasks, RNN training begins with an initial search phase; after a long plateau in accuracy, the loss suddenly drops, indicating abrupt learning. Analyzing the neural computation performed by these RNNs reveals geometric restructuring (GR) in their phase spaces prior to the drop. To promote these GR events, we introduce a temporal consistency regularization that accelerates (bioplausible) training, facilitates attractor formation, and enables efficient learning in strongly connected networks. Our findings offer testable predictions for neuroscientists and emphasize the need for goal-agnostic secondary mechanisms to facilitate learning in biological and artificial networks.
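
The abstract does not spell out the form of the temporal consistency regularization; one plausible reading is a penalty on hidden-state drift between consecutive time steps, which pushes the network toward persistent, attractor-like states. The sketch below is a minimal illustration under that assumption, not the paper's exact method; the names VanillaRNN, temporal_consistency_loss, and the weight lam are illustrative choices, and the input/target tensors are toy placeholders.

import torch
import torch.nn as nn

class VanillaRNN(nn.Module):
    """A plain RNN with a linear readout, used only to illustrate the penalty."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        hidden_states, _ = self.rnn(x)          # (batch, time, hidden)
        outputs = self.readout(hidden_states)   # (batch, time, output)
        return outputs, hidden_states

def temporal_consistency_loss(hidden_states):
    """Mean squared change of the hidden state across consecutive time steps.
    Assumed form of the regularizer: penalizing this drift encourages the
    network to settle into stable (attractor-like) memory states."""
    diffs = hidden_states[:, 1:, :] - hidden_states[:, :-1, :]
    return diffs.pow(2).mean()

# Hypothetical training step combining a task loss with the regularizer.
model = VanillaRNN(input_size=3, hidden_size=128, output_size=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
lam = 0.1  # regularization strength (illustrative value, not from the paper)

inputs = torch.randn(16, 50, 3)    # toy batch: 16 trials, 50 time steps
targets = torch.randn(16, 50, 1)

optimizer.zero_grad()
outputs, hidden_states = model(inputs)
loss = criterion(outputs, targets) + lam * temporal_consistency_loss(hidden_states)
loss.backward()
optimizer.step()

In this reading, the regularizer is goal-agnostic: it does not reference the task targets, only the temporal smoothness of the network's own dynamics.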

Citation

Haputhanthri, U., Storan, L., Jiang, Y., Raheja, T., Shai, A., Akengin, O., Miolane, N., Schnitzer, M. J., Dinc, F., & Tanaka, H. (2025). Understanding and controlling the geometry of memory organization in RNNs. arXiv preprint arXiv:2502.07256. https://arxiv.org/abs/2502.07256

BibTeX

@article{haputhanthri2025understanding,
  title={Understanding and controlling the geometry of memory organization in {RNNs}},
  author={Haputhanthri, U. and Storan, L. and Jiang, Y. and Raheja, T. and Shai, A. and Akengin, O. and Miolane, N. and Schnitzer, M. J. and Dinc, F. and Tanaka, H.},
  journal={arXiv preprint arXiv:2502.07256},
  year={2025}
}


Files