🎉 My research showing how to pretrain Large Language Models (LLMs) with small batch sizes and simple SGD (without momentum) is out on arXiv! Huge shout-out to my mentor Prof. Micah Goldblum and my amazing collaborators at the Wilson Lab at NYU Courant. In it, we offer practical recommendations that let slightly broke practitioners train language models and still compete with the giants and their inordinate compute! Do reach out if you're interested in talking about training large language models and optimizers, or share my general distaste for gradient accumulation! [Summer 2025]
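
For the curious, here's a minimal sketch of what the "small batch, plain SGD" recipe looks like in practice (assuming PyTorch; the toy model, batch size, and learning rate below are placeholders, not the paper's actual setup):

```python
import torch

# Toy stand-in for a language model; a real run would use an actual LM on tokenized text.
model = torch.nn.Linear(128, 128)

# Plain SGD with momentum explicitly set to 0 -- no Adam, no momentum buffers.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.0)

small_batch = 8  # small per-step batch: update immediately instead of accumulating gradients
for step in range(100):
    x = torch.randn(small_batch, 128)
    y = torch.randn(small_batch, 128)
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```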

🎉 I'm going to be in California this summer! I'll be interning with the ML team at Numenta to work on some amazing tech and research! Cannot wait to roam the beaches around the Bay :) [Summer 2025]

🎉 I won the best poster award at the 15th Annual Machine Learning Symposium hosted by the New York Academy of Sciences for my work on learning without backpropagation! Do check out the poster and let me know what you think! [Fall 2024]