-
The Stable Diffusion model is trained on 512px images, so going over that can cause problems. There is a "Highres. fix" checkbox in the A1111 UI; use it to fix that issue. Also try training with higher-resolution images, e.g. 768x768 or 640x640, and set the training to fp16 to save time.
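As a rough illustration of the training-side suggestion, here is a minimal sketch (not part of the colab itself; the folder names and target size are placeholders, and the colab may already do its own resizing) that center-crops and resizes a set of reference images to 768x768 with Pillow before handing them to the trainer. The fp16 part corresponds to a mixed-precision setting in whichever training script you use, not to this preprocessing step.

```python
from pathlib import Path
from PIL import Image

SRC = Path("raw_images")   # hypothetical folder with the original reference images
DST = Path("train_768")    # hypothetical output folder for the training set
SIZE = 768                 # target resolution, e.g. 768x768 as suggested above

DST.mkdir(exist_ok=True)
for path in SRC.glob("*"):
    if path.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
        continue
    img = Image.open(path).convert("RGB")
    # Center-crop to a square first so the resize does not distort the subject.
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((SIZE, SIZE), Image.LANCZOS)
    img.save(DST / f"{path.stem}.png")
```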
-
I just finished training my first model (an anime character on AnythingV3 fp32) and I'm having a couple of issues with its outputs.
The first and most important is that larger resolutions break things really badly. Fidelity is lost with regard to the original design, and anything past about 1024x1024 ends up completely deformed (including img2img results). Example.
Secondly, the images themselves seem to struggle when trying to "change" anything from the base design, such as different poses, expressions, clothes, etc. For example, if I want the character to wear a black dress, it really struggles with the change. Example.
Lastly, if I try to generate anything else, the features of the trained character noticeably bleed into it. Example.
For reference, I trained the model on 15 reference images at 3000 steps with the text encoder set to 50%, using the colab. Everything that wasn't a prompt on the colab was left at its default (i.e., I didn't change anything in the code itself).
My main issues are things breaking at higher resolutions and the images being "samey"; the bleeding matters less, since I can always switch back to the base model when I want to generate anything other than the trained character (although being able to work in tandem would be nice). Is there anything I can do to fix this, or should I look for another fork/colab if I want more fine-tuned results?
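For the resolution problem specifically, the "Highres. fix" workflow from the reply above can also be reproduced outside A1111: generate at the model's native 512px, then upscale the result and denoise it again with img2img. A minimal sketch with diffusers follows; the model path, prompt token, and strength value are all placeholders, and it assumes a recent diffusers version where the img2img pipeline takes an `image` argument.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

MODEL = "path/to/trained_model"  # placeholder: the diffusers folder your training produced

# 1) Generate at the model's native 512px resolution first.
pipe = StableDiffusionPipeline.from_pretrained(MODEL, torch_dtype=torch.float16).to("cuda")
prompt = "mychar, wearing a black dress"  # placeholder trained token/prompt
base = pipe(prompt, width=512, height=512).images[0]

# 2) Resize the result, then denoise it again with img2img at moderate
#    strength so detail is added without re-inventing the composition.
#    Reuse the already-loaded weights instead of loading the model twice.
img2img = StableDiffusionImg2ImgPipeline(**pipe.components)
upscaled = base.resize((1024, 1024), Image.LANCZOS)
hires = img2img(prompt, image=upscaled, strength=0.5).images[0]
hires.save("character_1024.png")
```

Lowering `strength` keeps the 512px composition more intact; raising it lets the model repaint more detail but risks the same deformation you are seeing in plain high-resolution generation.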