-
For me, this doesn't sound right. I just tested it on my setup (3080 Ti) and I reached the point
- Do you have the 8 GB or 16 GB version of the 4060 Ti? I suspect the model is not actually running on the GPU, even though it shows utilization during training.
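A quick way to test that suspicion is to check which device the model's parameters actually live on. This is a minimal sketch, not the book's code — `nn.Linear` here is just a stand-in for the GPT model:

```python
import torch
import torch.nn as nn

# Stand-in model; in the book this would be the GPTModel instance
model = nn.Linear(10, 10)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# If this prints "cpu" even though torch.cuda.is_available() is True,
# the .to(device) call was skipped for the model or for the input batches
print("Model parameters live on:", next(model.parameters()).device)
```

Note that the input batches must be moved to the same device as well; a mismatch either raises an error or silently falls back to slow paths depending on the code.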
-
Thanks for the amazing course — I’ve really learned a lot from it!
I'm currently working through CH07 to fine-tune the pre-trained model on instruction data.
I'm using the exact code from Section 7.6 (no changes) with device="cuda" on my 4060 Ti, and it takes about 25 minutes just to reach Ep1 Step 000025.
Then, I switched to device="cpu", and it finishes the entire fine-tuning in ~24 minutes.
GPU is detected (torch.cuda.is_available() == True) and shows ~97% utilization during training, but it still runs significantly slower than CPU.
Just wondering — is this expected? Or could something be misconfigured on my end?
Appreciate any advice, and thanks again for the great learning experience!
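One way to rule out a driver or setup problem independently of the training loop is a small matmul benchmark on both devices. This is a hypothetical sanity check, not part of the book's code; if the CUDA timing is not clearly faster than the CPU timing, the slowdown is in the environment rather than in the Section 7.6 code:

```python
import time
import torch

def bench(device, size=1024, reps=20):
    """Time `reps` square matmuls of shape (size, size) on `device`."""
    x = torch.randn(size, size, device=device)
    # Warm-up runs so one-time kernel/launch overhead isn't measured
    for _ in range(3):
        x @ x
    if device == "cuda":
        torch.cuda.synchronize()  # CUDA ops are async; wait before timing
    start = time.perf_counter()
    for _ in range(reps):
        x @ x
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

print(f"cpu:  {bench('cpu'):.4f} s")
if torch.cuda.is_available():
    print(f"cuda: {bench('cuda'):.4f} s")
```

The `torch.cuda.synchronize()` calls matter: without them the GPU timing only measures kernel launches, not the actual work, and looks misleadingly fast.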