Hi! I'm Hiroki Naganuma, a Ph.D. candidate in Computer Science at Université de Montréal and Mila - Quebec AI Institute, advised by Prof. Ioannis Mitliagkas.
My background is in high-performance computing and distributed deep learning, particularly for training large language models (LLMs). I'm interested in semi-synchronous training approaches, the challenges of large-batch training, and efficient optimization algorithms. Early in my Ph.D., I conducted extensive benchmarks on out-of-distribution generalization and confidence calibration. I have also analyzed the optimization dynamics of generative adversarial networks, and I am currently exploring efficient fine-tuning techniques and fairness considerations in LLMs.
Most recently, I completed a research internship at Meta Superintelligence Lab (MSL, Menlo Park), working with Dr. Hao-Jun Michael Shi and Dr. Parameswaran Raman on large-batch training and optimization. Prior to that, I was a Student Researcher at Google DeepMind (Mountain View), where I worked with Dr. George E. Dahl on learning rate scheduling, and I also interned at Microsoft Research (Redmond) with Dr. Philipp Witte and Dr. Russell J. Hewett on efficient pretraining algorithms for large language models.
I am a recipient of the Masason Foundation Fellowship. I earned my B.Sc. (2017) and M.Sc. (2019) from the Tokyo Institute of Technology as Valedictorian, where I worked closely with Prof. Rio Yokota and collaborators.
Collaboration & Mentorship: If you're interested in my work, feel free to reach out!
I expect to complete my Ph.D. in Spring/Summer 2026 and am currently exploring both postdoctoral and industry research opportunities. Please feel free to reach out if you see a potential fit.

CV (Last updated: Aug 2025) / Résumé (Last updated: Aug 2025)