OOM with 32GB 5090 when trying Qwen-Image training #484

@NeedForSpeed73

Description

This is for bugs only

Did you already ask in the discord?

Yes

You verified that this is a bug and not a feature request or question by asking in the discord?

Yes

Describe the bug

When trying to replicate the workflow from this official Ostris YouTube channel video, I get an OOM error before the actual training starts, while the trainer is still generating the initial samples.
According to another Discord help request, this seems to be a common issue that appeared within the last few weeks (perhaps an update broke it?).

I'm running Ubuntu 24.04 (kernel 6.14.0-33), Python 3.13.9, nvidia-driver-580-open (CUDA 13.0), and PyTorch 2.9.0+cu128.
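
For reference, a quick way to sanity-check the stack from Python; the expected values in the comments are just the versions reported above:

```python
import torch

print(torch.__version__)              # reported: 2.9.0+cu128
print(torch.version.cuda)             # CUDA version PyTorch was built against
print(torch.cuda.get_device_name(0))  # should show the RTX 5090
print(f"{torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GiB")  # ~31.36 GiB
```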

```
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 5.06 GiB. GPU 0 has a total capacity of 31.36 GiB of which 336.62 MiB is free. Including non-PyTorch memory, this process has 30.68 GiB memory in use. Of the allocated memory 27.17 GiB is allocated by PyTorch, and 2.92 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```
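
The traceback itself points at fragmentation (2.92 GiB is reserved by PyTorch but unallocated) and suggests `expandable_segments:True`. A minimal sketch of that mitigation, assuming nothing about the trainer beyond it importing torch; the variable must be set before the first CUDA allocation:

```python
import os

# Suggested by the error message above: let the caching allocator grow
# segments instead of fragmenting fixed-size ones. Must be set before
# torch initializes CUDA, hence before the torch import.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch
```

Equivalently, it can be exported in the shell before launching the trainer (e.g. `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python run.py <config>`, assuming ai-toolkit is started via its `run.py`). Note this only works around fragmentation; it won't help if the run genuinely needs more than the 5090's 32 GB.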
