Description
This is for bugs only
Did you already ask in the Discord?
Yes
You verified that this is a bug and not a feature request or question by asking in the Discord?
Yes
Describe the bug
When trying to replicate the workflow from this official Ostris YouTube channel video, I get an OOM error before the actual training starts, while the samples are being generated.
According to another Discord help request, this issue seems to be common and has appeared within the last few weeks (did some update break it?).
I'm running Ubuntu 24.04 (kernel 6.14.0-33), Python 3.13.9, nvidia-driver-580-open (CUDA 13.0), and PyTorch 2.9.0+cu128.
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.06 GiB. GPU 0 has a total capacity of 31.36 GiB of which 336.62 MiB is free. Including non-PyTorch memory, this process has 30.68 GiB memory in use. Of the allocated memory 27.17 GiB is allocated by PyTorch, and 2.92 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
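For reference, below is a minimal sketch of how the allocator setting suggested by the error message could be applied before launching the trainer (equivalent to `export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` in the shell). I have not confirmed whether it actually avoids the OOM here; it only mitigates fragmentation, not the overall memory usage.

```python
import os

# Allocator hint taken from the error message; it must be set before torch
# initializes CUDA, so it goes before the torch import (or in the shell
# environment of the launch command).
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch  # noqa: E402

# Quick sanity check of how much memory the allocator sees on GPU 0.
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info(0)
    print(f"GPU 0: {free / 1e9:.2f} GB free of {total / 1e9:.2f} GB total")
```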