## About

Hi! I'm Hiroki Naganuma, a Ph.D. candidate in Computer Science at Université de Montréal and Mila - Quebec AI Institute, advised by Prof. Ioannis Mitliagkas.

My background is in high-performance computing and distributed deep learning, particularly for training large language models (LLMs). I'm interested in semi-synchronous approaches, the challenges of large-batch training, and efficient optimization algorithms. Early in my Ph.D., I conducted extensive benchmarks on out-of-distribution generalization and confidence calibration. I have also analyzed the optimization dynamics of generative adversarial networks, and I am currently exploring efficient fine-tuning techniques and fairness considerations in LLMs.

Most recently, I completed a research internship at Meta Superintelligence Lab (MSL) in Menlo Park, working with Dr. Hao-Jun Michael Shi and Dr. Parameswaran Raman on large-batch training and optimization. Before that, I was a Student Researcher at Google DeepMind in Mountain View, where I worked with Dr. George E. Dahl on learning rate scheduling, and I interned at Microsoft Research in Redmond with Dr. Philipp Witte and Dr. Russell J. Hewett on efficient pretraining algorithms for large language models.

I am a recipient of the Masason Foundation Fellowship. I earned my B.Sc. (2017) and M.Sc. (2019) from the Tokyo Institute of Technology as valedictorian, where I worked closely with Prof. Rio Yokota and collaborators.

**Collaboration & Mentorship:** If you're interested in my work, feel free to reach out!

I expect to complete my Ph.D. in spring/summer 2026 and am currently exploring both postdoctoral and industry research opportunities. If you see a potential fit, please get in touch.

CV / Résumé (both last updated: Aug 2025)

## Pinned

  1. **horovod/horovod** (Python): Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
  2. **LossLandscapeGeometry** (Shell): No Wrong Turns: The Simple Geometry of Neural Networks Optimization Paths (ICML 2024).
  3. **Optimizer_Comparison_OOD** (Python): Empirical Study on Optimizer Selection for Out-of-Distribution Generalization (TMLR 2023).
  4. **ConjugateGradient_GAN** (Python): Conjugate Gradient Method for Generative Adversarial Networks (AISTATS 2023).
  5. **Timm_OOD_Calibration** (Python): An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration (TMLR 2025).
  6. **Pseudo-Asynchronous-LocalSGD** (Python): Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training (TMLR 2025).