Feel free to refer to my Google Scholar profile as well.
- Representation Learning Using a Single Forward Pass
A Somasundaram, P Mishra, A Borthakur
[Poster] Best poster award @ 15th Annual Machine Learning Symposium at the New York Academy of Sciences
- Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful
Martin Marek, Sanae Lotfi, Aditya Somasundaram, Andrew Gordon Wilson, Micah Goldblum (under review)