Allow SD pipeline to use newer schedulers, eg: FlowMatch #12015
Conversation
hi @ppbrown feel free to send a PR :)
Hello,
thanks
sorry, I got confused and thought it was an issue
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
curious what you're doing exactly @ppbrown! i have a project where i'm currently converting SD2 to flow matching (since v-pred is a good chunk of the way to flow-matching anyway)..
It... ain't easy. Initially I just tried a brute-force "plug it in and retrain the unet" strategy, since that worked for prior experiments like swapping the VAE for the SDXL VAE. But the noise schedule is way more disruptive, so the naive approach ended up doing weird stuff: the initial steps threw off the unet's knowledge so that it couldn't render things properly... my guess is I would have to approach it like a "retrain from noise" project. No thanks! So then I tried freezing everything except the "time" layers. Next I tried unfreezing the time layers, and a couple more. Now I'm attempting a hybrid approach, where I do the partial training at least until the loss is within some more reasonable range. Then I'll do the switchover to full unet training from that, and see if that gets me where I want.
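The "freeze everything except the time layers" step above could be sketched roughly like this. The `keys` tuple is an assumption: `time_embedding` and `time_emb_proj` follow diffusers' `UNet2DConditionModel` parameter naming, which may not be the exact set used in the experiment.

```python
def split_time_layer_params(param_names, keys=("time_embedding", "time_emb_proj")):
    """Partition parameter names into (trainable, frozen) for a
    partial-training phase where only time-conditioning layers adapt.

    `keys` is a guess at diffusers' UNet naming, not the experiment's
    actual filter.
    """
    trainable = [n for n in param_names if any(k in n for k in keys)]
    frozen = [n for n in param_names if n not in set(trainable)]
    return trainable, frozen


# With a real model, the result would drive requires_grad, e.g.:
#   trainable, _ = split_time_layer_params(n for n, _ in unet.named_parameters())
#   for name, p in unet.named_parameters():
#       p.requires_grad = name in set(trainable)
```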
Interesting @ppbrown! in my case it's been really straightforward, but I'm resuming from SD2, which uses v-pred, already 90% of the way to flow-matching. I've had to add QK-norm (#12051), I'm using an effective batch size of 1000, and I've been using an SD3-inspired loss-scale-per-timestep. All the code is over here → https://github.com/damian0815/EveryDream2trainer/tree/flow_matching (flow_matching branch). WIP model on huggingface
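One plausible reading of "SD3-inspired loss-scale-per-timestep" is the logit-normal timestep weighting from the SD3 rectified-flow paper; whether that matches the trainer linked above is an assumption, as are the default parameters `m=0, s=1`. A minimal sketch:

```python
import math


def logit_normal_weight(t, m=0.0, s=1.0):
    """Logit-normal density over t in (0, 1), the timestep weighting
    described in the SD3 paper. Treating this as the scheme used in the
    trainer above is an assumption, not confirmed by the thread.
    """
    logit = math.log(t / (1.0 - t))
    gauss = math.exp(-((logit - m) ** 2) / (2.0 * s * s))
    return gauss / (s * math.sqrt(2.0 * math.pi) * t * (1.0 - t))
```

With `m=0` the weight is symmetric around `t=0.5` and peaks mid-schedule, so the loss emphasizes intermediate noise levels rather than the extremes.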
Allow SD pipeline to use newer schedulers, eg: FlowMatch,
by skipping an attribute that doesn't exist there
(scale_model_input)
I am currently experimenting with SD + FlowMatchEuler, and had to do
an ugly hand hack in my training code to get the pipeline to continue:
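The kind of guard being described might look like the following. This is an illustrative sketch of the idea (skip `scale_model_input` when the scheduler lacks it), not the PR's actual diff, and the helper name is hypothetical.

```python
def maybe_scale_model_input(scheduler, latents, timestep):
    """Call scheduler.scale_model_input only when the scheduler defines it.

    FlowMatchEulerDiscreteScheduler has no scale_model_input, so the
    stock SD denoising loop needs a guard like this to keep going.
    (Hypothetical helper, not the code from this PR.)
    """
    scale = getattr(scheduler, "scale_model_input", None)
    if callable(scale):
        return scale(latents, timestep)
    return latents
```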
Would be nice to have core diffusers handle this so I can share training results with people cleanly.
Who can review?