Adding StateDictAdapter #1601

Merged

wwwjn merged 12 commits into pytorch:main from HosseinKaviani-H:main

Aug 26, 2025

Contributor

HosseinKaviani-H commented Aug 19, 2025

This PR adds the StateDictAdapter for Qwen3 to enable loading HF checkpoints. The adapter converts a checkpoint from the HF format to the format that the torchtitan model can load, and vice versa. This lets us run a parity test against the HF implementation and confirm that our results are aligned with it.
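For readers unfamiliar with the idea, here is a minimal sketch of what such a key-renaming adapter might look like. It is not the code from this PR: the key templates, the `_convert` helper, and the `from_hf` / `to_hf` function names are all assumptions chosen for illustration, and plain Python values stand in for tensors.

```python
import re

# Hypothetical key templates; "{}" stands for the layer index.
# These names are illustrative and are not taken from this PR's diff.
HF_TO_NATIVE = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.layers.{}.self_attn.q_proj.weight": "layers.{}.attention.wq.weight",
    "model.layers.{}.mlp.gate_proj.weight": "layers.{}.feed_forward.w1.weight",
    "model.norm.weight": "norm.weight",
}
# The reverse direction is the inverted mapping.
NATIVE_TO_HF = {native: hf for hf, native in HF_TO_NATIVE.items()}


def _convert(state_dict, mapping):
    """Rename every key in state_dict according to mapping; values are untouched."""
    converted = {}
    for key, value in state_dict.items():
        # Find the first ".<number>." segment, which we treat as the layer index.
        match = re.search(r"\.(\d+)\.", key)
        if match:
            template = key[: match.start(1)] + "{}" + key[match.end(1):]
            new_key = mapping[template].format(match.group(1))
        else:
            new_key = mapping[key]
        converted[new_key] = value
    return converted


def from_hf(hf_state_dict):
    """HF-style keys -> native-style keys (assumed naming)."""
    return _convert(hf_state_dict, HF_TO_NATIVE)


def to_hf(native_state_dict):
    """Native-style keys -> HF-style keys (assumed naming)."""
    return _convert(native_state_dict, NATIVE_TO_HF)
```

In this sketch an unmapped key raises `KeyError`, which surfaces a missing mapping early instead of silently dropping a weight. A round trip through `from_hf` and then `to_hf` should return the original keys, which is a cheap sanity check to run before a full parity test.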


          Fix config file path in run_train.sh

a09953b

meta-cla bot added the CLA Signed label


          Adding state_dict_adapter

8dbceeb

wwwjn reviewed

torchtitan/experiments/qwen3/model/state_dict_adapter.py (outdated, resolved)


          Adding state_dict_adapter

a755f53

tianyu-l reviewed

torchtitan/experiments/qwen3/model/state_dict_adapter.py (resolved)

Hossein Kavianihamedani and others added 2 commits

August 19, 2025 16:19


          Resolve README conflict

0a04bde


          Merge branch 'main' into main

ee4485f

wwwjn previously approved these changes

Contributor

wwwjn left a comment (edited)

Update: Please fix __init__.py first

wwwjn requested changes

Contributor

wwwjn left a comment (edited)

Almost forgot to mention: you need to plug in the StateDictAdapter in __init__.py.

Hossein Kavianihamedani added 3 commits

August 20, 2025 14:25


          Resolve README conflict and add StateDictAdapter changes

41f6589


          Merge branch 'main' of https://github.com/HosseinKaviani-H/torchtitan…

8c93715

…_local


          Update __init__.py file

4c790fb

Contributor Author

HosseinKaviani-H commented Aug 20, 2025

> Update: Please fix __init__.py first

Fixed

wwwjn previously approved these changes

facebook-github-bot commented Aug 20, 2025

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.


          Merge branch 'pytorch:main' into main

43c5565

facebook-github-bot commented Aug 21, 2025

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.


          Merge branch 'pytorch:main' into main

665525f

facebook-github-bot commented Aug 22, 2025

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.


          Merge branch 'pytorch:main' into main

fe9ca45

facebook-github-bot commented Aug 25, 2025

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.


          Merge branch 'pytorch:main' into main

174da03

Contributor

fegin commented Aug 26, 2025

[Meta Internal-Only Changes Check](https://github.com/pytorch/torchtitan/pull/1601/checks?check_run_id=48932323894)

Can you check again what's going on with this PR? Did you export the latest version of the diff?

facebook-github-bot commented Aug 26, 2025

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.

HosseinKaviani-H closed this

Contributor Author

HosseinKaviani-H commented Aug 26, 2025

> [Meta Internal-Only Changes Check](https://github.com/pytorch/torchtitan/pull/1601/checks?check_run_id=48932323894)
>
> Can you check again what's going on with this PR? Did you export the latest version of the diff?

I did and imported to the base again. I don't know why it's throwing an error like this.

HosseinKaviani-H reopened this

pytorch-bot bot dismissed stale reviews from wwwjn and wwwjn

August 26, 2025 17:23

This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.

Contributor

fegin commented Aug 26, 2025

Please check the internal diff message.

Contributor

wwwjn commented Aug 26, 2025

I haven't seen this error before. Please check and fix the failing signal.

wwwjn approved these changes

wwwjn merged commit e65ef30 into pytorch:main

6 of 7 checks passed

wwwjn pushed a commit that referenced this pull request


          Adding StateDictAdapter (#1601)

c5083c1

In this PR, I'm adding the StateDictAdapter for Qwen3 to enable loading
HF checkpoints. We can use this script to adapt the checkpoint from HF
to the format that we can load into the torchtitan model and vice versa.
This can enable us to do a parity test with the HF implementation and
make sure that our results are aligned with the HF implementation.

---------

Co-authored-by: Hossein Kavianihamedani <[email protected]>
