Submitted by miolane on
Bibliography

Kunin, D., Luca Marchetti, G., Chen, F., Karkada, D., B. Simon, J., R. DeWeese, M., Ganguli, S., Miolane, N.

This paper introduces Alternating Gradient Flows (AGF), a framework that models feature learning in two-layer networks trained from small initialization as an alternating process of neuron activation and loss minimization. AGF explains the timing and structure of loss drops during training, unifying prior analyses and revealing that networks learn features like principal components or Fourier modes in a predictable order.

Thumbnail
Image
Publication Year