Conversation

HosseinKaviani-H
Contributor

In this PR, I'm adding the StateDictAdapter for Qwen3 to enable loading HF checkpoints. The adapter converts a checkpoint from the HF format to the format that the torchtitan model can load, and vice versa. This lets us run a parity test against the HF implementation and confirm that our results match.
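
For readers unfamiliar with the idea, here is a minimal sketch of what a two-way state-dict key adapter might look like. It is not the code in this PR: the key templates, the `HF_TO_NATIVE` table, and the `from_hf` / `to_hf` function names are assumptions chosen for illustration, and a real adapter may also need to reshape or permute tensors, which this sketch does not attempt.

```python
import re

# Hypothetical key templates (assumed for illustration); "{}" stands for the layer index.
HF_TO_NATIVE = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.layers.{}.self_attn.q_proj.weight": "layers.{}.attention.wq.weight",
    "model.layers.{}.mlp.up_proj.weight": "layers.{}.feed_forward.w3.weight",
    "model.norm.weight": "norm.weight",
}
# The reverse table is derived from the forward one so the two stay in sync.
NATIVE_TO_HF = {native: hf for hf, native in HF_TO_NATIVE.items()}

_LAYER_INDEX = re.compile(r"\.(\d+)\.")


def _remap(state_dict, table):
    """Rename every key in state_dict according to table; values are passed through."""
    remapped = {}
    for key, value in state_dict.items():
        # Replace the first numeric path component with "{}" so that a single
        # template covers every layer.
        match = _LAYER_INDEX.search(key)
        template = _LAYER_INDEX.sub(".{}.", key, count=1)
        if template not in table:
            raise KeyError(f"no mapping for key {key!r}")
        new_key = table[template]
        if match:
            new_key = new_key.format(match.group(1))
        remapped[new_key] = value
    return remapped


def from_hf(hf_state_dict):
    """HF-style keys -> native-style keys (hypothetical naming)."""
    return _remap(hf_state_dict, HF_TO_NATIVE)


def to_hf(native_state_dict):
    """Native-style keys -> HF-style keys (hypothetical naming)."""
    return _remap(native_state_dict, NATIVE_TO_HF)
```

Deriving the reverse table from the forward one keeps the two directions consistent, so a round trip (`to_hf(from_hf(sd))`) returns the original keys. Raising on an unmapped key, rather than silently skipping it, makes a missing entry visible before a parity test is run.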

@meta-cla meta-cla bot added the CLA Signed label Aug 19, 2025
wwwjn previously approved these changes Aug 20, 2025
Contributor

@wwwjn wwwjn left a comment

Update: Please fix __init__.py first

Contributor

@wwwjn wwwjn left a comment

Almost forgot to mention: you need to plug the StateDictAdapter into __init__.py

@HosseinKaviani-H
Contributor Author

Update: Please fix __init__.py first

Fixed

wwwjn previously approved these changes Aug 20, 2025
@facebook-github-bot

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.

@fegin
Contributor

fegin commented Aug 26, 2025

[Meta Internal-Only Changes Check](https://github.com/pytorch/torchtitan/pull/1601/checks?check_run_id=48932323894)

Can you check again what's going on with this PR? Did you export the latest version of the diff?

@facebook-github-bot

@HosseinKaviani-H has imported this pull request. If you are a Meta employee, you can view this in D80660953.

@HosseinKaviani-H
Contributor Author

[Meta Internal-Only Changes Check](https://github.com/pytorch/torchtitan/pull/1601/checks?check_run_id=48932323894)

Can you check again what's going on with this PR? Did you export the latest version of the diff?

I did, and I imported it to the base again. I don't know why it's throwing this error.

@pytorch-bot pytorch-bot bot dismissed stale reviews from wwwjn and wwwjn August 26, 2025 17:23

This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.

@fegin
Contributor

fegin commented Aug 26, 2025

Please check the internal diff message.

@wwwjn
Contributor

wwwjn commented Aug 26, 2025

I haven't seen this error before. Please check and fix the signal.

@wwwjn wwwjn merged commit e65ef30 into pytorch:main Aug 26, 2025
6 of 7 checks passed
wwwjn pushed a commit that referenced this pull request Sep 3, 2025
In this PR, I'm adding the StateDictAdapter for Qwen3 to enable loading
HF checkpoints. We can use this script to adapt the checkpoint from HF
to the format that we can load into the torchtitan model and vice versa.
This can enable us to do a parity test with the HF implementation and
make sure that our results are aligned with the HF implementation.

---------

Co-authored-by: Hossein Kavianihamedani <[email protected]>