MEGATHREAD (lemmy.dbzer0.com)

This is a copy of the /r/stablediffusion wiki, kept here to help people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze and I have collected these resources and links to help you enjoy Stable Diffusion, whether you are here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated, but not necessary as you being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, or provided by Stability AI.

# Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your Stable Diffusion skills, whether you're a beginner or an expert.

Dream Booth

How to train a custom model, plus resources on doing so.

Models

Specially trained towards certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Either bots you can self-host, or bots you can use directly on various websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, Gimp, etc.

Other useful tools

# Community

Games

  • PictionAIry: (Video | 2-6 Players) - The image guessing game where AI does the drawing!

Podcasts

Databases or Lists

Still updating this with more links as I collect them all here.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion requires a GPU with at least 4 GB of VRAM to run locally, and beefier cards (10-, 20-, or 30-series Nvidia) are needed for high-resolution or high-step images. Anyone can also run it online through DreamStudio or by hosting it on their own GPU compute cloud server. A quick self-check sketch follows this list.
  • Only Nvidia cards are officially supported.
  • AMD support is available here unofficially.
  • Apple M1 Chip support is available here unofficially.
  • Intel based Macs currently do not work with Stable Diffusion.
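
If you're not sure what your card has, the sketch below (assuming a Python environment with PyTorch and CUDA installed) reports the detected GPU and its VRAM against the 4 GB floor mentioned above; it's only a convenience check, not part of any installer.

```python
# Minimal self-check sketch (assumes PyTorch with CUDA support is installed).
# Reports the detected GPU and whether it meets the ~4 GB VRAM floor above.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    print("Meets the 4 GB minimum" if vram_gb >= 4 else "Below the 4 GB minimum")
else:
    print("No CUDA GPU detected; only Nvidia cards are officially supported.")
```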

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out via modmail or message me.

submitted 1 week ago* (last edited 1 week ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Models for generating highly consistent automotive, apparel, and consumer goods videos using a single reference image (ideally one with a white background). A rough loading sketch follows the links below.

ClothConsistency: https://civitai.com/models/1993310/clothconsistency-wan22-i2v-consistencylora2

ProductConsistency: https://civitai.com/models/2000699/productconsistency-wan22-i2v-consistencylora3

CarConsistency: https://civitai.com/models/1990350/carconsistency-wan22-i2v-consistencylora1
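
As a rough illustration of how LoRAs like these are typically wired up outside of ComfyUI, here is a hedged diffusers-based sketch; the base-model repo ID and LoRA filename are placeholders I've assumed, not values taken from the Civitai pages, so substitute whatever checkpoint and file you actually download.

```python
# Rough sketch only: applying a downloaded consistency LoRA with diffusers.
# The repo ID and LoRA filename below are assumed placeholders, not the
# exact names used on the Civitai pages above.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.2-I2V-A14B-Diffusers",  # placeholder Wan2.2 I2V base model
    torch_dtype=torch.bfloat16,
)
# Most recent pipelines expose load_lora_weights(); point it at the file
# you downloaded from Civitai.
pipe.load_lora_weights("ProductConsistency_wan22_i2v.safetensors")
pipe.to("cuda")
# The reference image should match the LoRA's expectations: a single
# product/car/garment shot, ideally on a white background.
```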

submitted 2 weeks ago* (last edited 2 weeks ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com
Qwen-Image-Edit-2509 (huggingface.co)

Abstract

Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: this https URL

Technical Report: https://arxiv.org/abs/2508.18966

Code: https://github.com/bytedance/USO

USO in ComfyUI tutorial: https://docs.comfy.org/tutorials/flux/flux-1-uso

Project Page: https://bytedance.github.io/USO/

submitted 1 month ago* (last edited 1 month ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Abstract

Text-to-image (T2I) diffusion models excel at generating photorealistic images but often fail to render accurate spatial relationships. We identify two core issues underlying this common failure: 1) the ambiguous nature of data concerning spatial relationships in existing datasets, and 2) the inability of current text encoders to accurately interpret the spatial semantics of input descriptions. We propose CoMPaSS, a versatile framework that enhances spatial understanding in T2I models. It first addresses data ambiguity with the Spatial Constraints-Oriented Pairing (SCOP) data engine, which curates spatially-accurate training data via principled constraints. To leverage these priors, CoMPaSS also introduces the Token ENcoding ORdering (TENOR) module, which preserves crucial token ordering information lost by text encoders, thereby reinforcing the prompt's linguistic structure. Extensive experiments on four popular T2I models (UNet and MMDiT-based) show CoMPaSS sets a new state of the art on key spatial benchmarks, with substantial relative gains on VISOR (+98%), T2I-CompBench Spatial (+67%), and GenEval Position (+131%). Code is available at this https URL.

Paper: https://arxiv.org/abs/2412.13195

Code: https://github.com/blurgyy/CoMPaSS

Project Page: https://compass.blurgy.xyz/


QwenEdit InStyle is a LoRA fine-tune for QwenEdit that significantly improves its ability to generate images based on a style reference. While the base model has style transfer capabilities, it often misses the nuances of styles and can transplant unwanted details from the input image. This LoRA addresses these limitations to provide more accurate style-based image generation.


Major Updates


Chroma1-Base: 512x512 model

Chroma1-HD: 1024x1024 model

Chroma1-Flash: an experimental fine-tune of Chroma1-Base

Chroma1-Radiance [WIP]: a pixel-space variant of Chroma1-Base

submitted 1 month ago* (last edited 1 month ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Without paywall: https://archive.is/4oEi2

Qwen Image Edit (qianwen-res.oss-cn-beijing.aliyuncs.com)
submitted 1 month ago* (last edited 1 month ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Introduction

We are excited to introduce Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Image’s unique text rendering capabilities to image editing tasks, enabling precise text editing. Furthermore, Qwen-Image-Edit simultaneously feeds the input image into Qwen2.5-VL (for visual semantic control) and the VAE Encoder (for visual appearance control), achieving capabilities in both semantic and appearance editing. To experience the latest model, visit Qwen Chat and select the "Image Editing" feature.

Technical Report: https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf

Code: https://github.com/QwenLM/Qwen-Image

Hugging Face: https://huggingface.co/Qwen/Qwen-Image-Edit

GGUFs: https://huggingface.co/QuantStack/Qwen-Image-Edit-GGUF
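
If you'd rather run it locally than through Qwen Chat, the sketch below shows roughly what diffusers usage looks like, assuming a recent diffusers build that ships QwenImageEditPipeline; treat the class name and arguments as assumptions and check the Hugging Face model card for the canonical snippet.

```python
# Rough sketch (assumption: a recent diffusers release exposes
# QwenImageEditPipeline with the usual pipeline call arguments).
import torch
from diffusers import QwenImageEditPipeline
from PIL import Image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

image = Image.open("input.png").convert("RGB")
edited = pipe(
    image=image,
    prompt="Change the sign text to 'OPEN'",  # text editing is a highlighted capability
    num_inference_steps=50,
).images[0]
edited.save("edited.png")
```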

submitted 1 month ago* (last edited 1 month ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

SD.Next Release 2025-08-15

New release two weeks after the last one, and it's a big one with over 150 commits!

  • Several new models: Qwen-Image (plus Lightning variant) and FLUX.1-Krea-Dev
  • Several updated models: Chroma, SkyReels-V2, Wan-VACE, HunyuanDiT
  • Plus continued major UI work: new embedded Docs/Wiki search, redesigned real-time hints, a wildcards UI selector, a built-in GPU monitor, CivitAI integration, and more!

submitted 2 months ago* (last edited 2 months ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

An open-source implementation for training LoRA (Low-Rank Adaptation) layers for Qwen/Qwen-Image models by FlyMy.AI.
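
As background for anyone new to the technique, here is a small self-contained PyTorch illustration of the LoRA idea itself (a frozen base weight plus a trainable low-rank update); it is illustrative only and not code from the FlyMy.AI repository.

```python
# Illustrative LoRA layer (not from the FlyMy.AI repo): the frozen base
# linear layer is augmented with a trainable low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the original weights
            p.requires_grad = False
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

# Usage: wrap an existing layer, then train only lora_a / lora_b.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
```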


Stable Diffusion

5120 readers

Discuss matters related to our favourite AI Art generation technology


founded 2 years ago