MEGATHREAD (lemmy.dbzer0.com)

This is a copy of the /r/stablediffusion wiki, for people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze and I have collected these resources and links to help you enjoy Stable Diffusion, whether you are here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated, but not necessary; being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, or provided by Stability AI.

# Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your skills in using Stable Diffusion, whether you're a beginner or an expert.

DreamBooth

How to train a custom model, and resources for doing so.

Models

Models specially trained on certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Bots you can self-host, or bots you can use directly on websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, GIMP, etc.

Other useful tools

# Community

Games

  • PictionAIry: (Video | 2-6 Players) - The image-guessing game where AI does the drawing!

Podcasts

Databases or Lists

I'm still updating this with more links as I collect them.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion requires a GPU with 4 GB+ of VRAM to run locally (see the quick check below). Beefier graphics cards (10-, 20-, and 30-series Nvidia cards) are needed to generate high-resolution or high-step images. Alternatively, anyone can run it online through DreamStudio or by hosting it on their own GPU compute cloud server.
  • Only Nvidia cards are officially supported.
  • Unofficial AMD support is available.
  • Unofficial Apple M1 chip support is available.
  • Intel-based Macs currently do not work with Stable Diffusion.
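
If you already have Python and PyTorch installed, a quick way to check whether your card clears the 4 GB bar is to query it directly. This is just a convenience sketch (it assumes a CUDA build of PyTorch), not part of any installer:

```python
# Quick VRAM check for a local Stable Diffusion install (assumes PyTorch with CUDA).
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"{props.name}: {vram_gb:.1f} GB VRAM")
    if vram_gb >= 4:
        print("Meets the ~4 GB minimum; expect lower resolutions and step counts.")
    else:
        print("Below the ~4 GB minimum; consider DreamStudio or another hosted option.")
else:
    print("No CUDA GPU detected; use DreamStudio or another hosted option.")
```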

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out to us via modmail or message me.

submitted 1 day ago* (last edited 1 day ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Abstract

Visual storytelling requires generating multi-shot videos with cinematic quality and long-range consistency. Inspired by human memory, we propose StoryMem, a paradigm that reformulates long-form video storytelling as iterative shot synthesis conditioned on explicit visual memory, transforming pre-trained single-shot video diffusion models into multi-shot storytellers. This is achieved by a novel Memory-to-Video (M2V) design, which maintains a compact and dynamically updated memory bank of keyframes from historical generated shots. The stored memory is then injected into single-shot video diffusion models via latent concatenation and negative RoPE shifts with only LoRA fine-tuning. A semantic keyframe selection strategy, together with aesthetic preference filtering, further ensures informative and stable memory throughout generation. Moreover, the proposed framework naturally accommodates smooth shot transitions and customized story generation applications. To facilitate evaluation, we introduce ST-Bench, a diverse benchmark for multi-shot video storytelling. Extensive experiments demonstrate that StoryMem achieves superior cross-shot consistency over previous methods while preserving high aesthetic quality and prompt adherence, marking a significant step toward coherent minute-long video storytelling.

Paper: https://arxiv.org/abs/2512.19539

Code: https://github.com/Kevin-thu/StoryMem

Model: https://huggingface.co/Kevin-thu/StoryMem

Project Page: https://kevin-thu.github.io/StoryMem/
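
A rough sketch of the Memory-to-Video loop as described in the abstract. The class and function names here are hypothetical placeholders, not the authors' code; see the repo for the real implementation:

```python
# Minimal sketch of the Memory-to-Video (M2V) loop from the abstract.
# MemoryBank, model.generate, and select_keyframes are hypothetical placeholders.
import torch

class MemoryBank:
    def __init__(self, capacity: int = 16):
        self.capacity = capacity
        self.frames: list[torch.Tensor] = []          # latent keyframes from past shots

    def update(self, keyframes: list[torch.Tensor]) -> None:
        # Keep the bank compact and dynamically updated: append new keyframes, drop the oldest.
        self.frames = (self.frames + keyframes)[-self.capacity:]

    def as_condition(self) -> torch.Tensor:
        # Memory is injected by concatenating stored latents with the target shot's latents;
        # shifted (negative) RoPE indices let the model tell memory frames from new frames.
        return torch.stack(self.frames) if self.frames else torch.empty(0)

def tell_story(shot_prompts, model, select_keyframes, k: int = 4):
    memory, shots = MemoryBank(), []
    for prompt in shot_prompts:                        # iterative shot synthesis
        shot = model.generate(prompt, memory=memory.as_condition())
        memory.update(select_keyframes(shot, k=k))     # semantic keyframe selection + filtering
        shots.append(shot)
    return shots
```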

submitted 6 days ago* (last edited 6 days ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Abstract

Recent visual generative models often struggle with consistency during image editing due to the entangled nature of raster images, where all visual content is fused into a single canvas. In contrast, professional design tools employ layered representations, allowing isolated edits while preserving consistency. Motivated by this, we propose **Qwen-Image-Layered**, an end-to-end diffusion model that decomposes a single RGB image into multiple semantically disentangled RGBA layers, enabling **inherent editability**, where each RGBA layer can be independently manipulated without affecting other content. To support variable-length decomposition, we introduce three key components: (1) an RGBA-VAE to unify the latent representations of RGB and RGBA images; (2) a VLD-MMDiT (Variable Layers Decomposition MMDiT) architecture capable of decomposing a variable number of image layers; and (3) a Multi-stage Training strategy to adapt a pretrained image generation model into a multilayer image decomposer. Furthermore, to address the scarcity of high-quality multilayer training images, we build a pipeline to extract and annotate multilayer images from Photoshop documents (PSD). Experiments demonstrate that our method significantly surpasses existing approaches in decomposition quality and establishes a new paradigm for consistent image editing. Our code and models are released at the links below.

Paper: https://arxiv.org/abs/2512.15603

Code: https://github.com/QwenLM/Qwen-Image-Layered

Blog: https://qwenlm.github.io/blog/qwen-image-layered/

Hugging Face: https://huggingface.co/Qwen/Qwen-Image-Layered

Demo: https://huggingface.co/spaces/Qwen/Qwen-Image-Layered

Modelscope: https://modelscope.cn/models/Qwen/Qwen-Image-Layered

Comfy-Org files: https://huggingface.co/Comfy-Org/Qwen-Image-Layered_ComfyUI/tree/main

GGUFs: https://huggingface.co/QuantStack/Qwen-Image-Layered-GGUF/tree/main
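
For intuition, the editing workflow the paper targets looks roughly like this: decompose an RGB image into RGBA layers, edit one layer in isolation, and alpha-composite back. `decompose_into_layers` below is a hypothetical stand-in for the released pipeline; the compositing itself is standard "over" blending:

```python
# Sketch of layered editing: decompose -> edit one RGBA layer -> alpha-composite back to RGB.
import numpy as np

def composite(layers: list[np.ndarray]) -> np.ndarray:
    """Alpha-composite RGBA layers (H, W, 4), bottom layer first, into an RGB image."""
    out = np.zeros(layers[0].shape[:2] + (3,), dtype=np.float32)
    for layer in layers:
        rgb = layer[..., :3].astype(np.float32)
        alpha = layer[..., 3:4].astype(np.float32) / 255.0
        out = rgb * alpha + out * (1.0 - alpha)   # standard "over" blending
    return out.astype(np.uint8)

# layers = decompose_into_layers(image)   # hypothetical call into the released pipeline
# layers[2] = recolor(layers[2])          # edit one layer; all other content stays untouched
# edited = composite(layers)              # consistency is preserved by construction
```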


Abstract

We introduce TurboDiffusion, a video generation acceleration framework that can speed up end-to-end diffusion generation by 100-200x while maintaining video quality. TurboDiffusion mainly relies on several components for acceleration: (1) Attention acceleration: TurboDiffusion uses low-bit SageAttention and trainable Sparse-Linear Attention (SLA) to speed up attention computation. (2) Step distillation: TurboDiffusion adopts rCM for efficient step distillation. (3) W8A8 quantization: TurboDiffusion quantizes model parameters and activations to 8 bits to accelerate linear layers and compress the model. In addition, TurboDiffusion incorporates several other engineering optimizations. We conduct experiments on the Wan2.2-I2V-14B-720P, Wan2.1-T2V-1.3B-480P, Wan2.1-T2V-14B-720P, and Wan2.1-T2V-14B-480P models. Experimental results show that TurboDiffusion achieves 100-200x speedup for video generation even on a single RTX 5090 GPU, while maintaining comparable video quality. The GitHub repository, which includes model checkpoints and easy-to-use code, is available at the link below.

Paper: https://arxiv.org/pdf/2512.16093

Code: https://github.com/thu-ml/TurboDiffusion

Models: https://huggingface.co/TurboDiffusion
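
As a rough illustration of the W8A8 idea (not TurboDiffusion's actual kernels, which run the int8 matmul on tensor cores), here is a generic symmetric per-tensor quantization of a linear layer:

```python
# Generic W8A8 sketch: quantize weights and activations to int8, matmul, dequantize.
# Real kernels accumulate the int8 matmul in int32 on tensor cores; we emulate it in float.
import torch

def quantize_int8(x: torch.Tensor):
    scale = x.abs().max() / 127.0
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def w8a8_linear(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    qx, sx = quantize_int8(x)          # 8-bit activations
    qw, sw = quantize_int8(weight)     # 8-bit weights
    acc = qx.float() @ qw.float().T    # emulated low-precision matmul
    return acc * (sx * sw)             # dequantize back to float

x, w = torch.randn(4, 64), torch.randn(128, 64)
err = (w8a8_linear(x, w) - x @ w.T).abs().max()
print(f"max abs error vs. full-precision matmul: {err:.3f}")
```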

submitted 5 days ago* (last edited 5 days ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

NewBie image Exp0.1 is a 3.5B-parameter DiT model developed through research on the Lumina architecture. Building on these insights, it adopts Next-DiT as the foundation for a new NewBie architecture tailored to text-to-image generation. The NewBie image Exp0.1 model is trained within this newly constructed system and represents the first experimental release of the NewBie text-to-image generation framework.

Text Encoder

We use Gemma3-4B-it as the primary text encoder, conditioning on its penultimate-layer token hidden states. We also extract pooled text features from Jina CLIP v2, project them, and fuse them into the time/AdaLN conditioning pathway. Together, Gemma3-4B-it and Jina CLIP v2 provide strong prompt understanding and improved instruction adherence.

VAE

The FLUX.1-dev 16-channel VAE is used to encode images into latents, delivering richer, smoother color rendering and finer texture detail, helping safeguard the stunning visual quality of NewBie image Exp0.1.

Checkpoint: https://huggingface.co/NewBie-AI/NewBie-image-Exp0.1

LoRA Trainer: https://github.com/NewBieAI-Lab/NewbieLoraTrainer
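
A sketch of the dual text-conditioning pathway described above. Module names and hidden sizes are illustrative guesses, not the actual NewBie implementation:

```python
# Illustrative sketch: Gemma3-4B-it penultimate-layer tokens provide per-token context,
# while pooled Jina CLIP v2 features are projected into the time/AdaLN conditioning path.
# Dimensions below are placeholders, not confirmed model sizes.
import torch
import torch.nn as nn

class DualTextConditioning(nn.Module):
    def __init__(self, gemma_dim=2560, clip_dim=1024, model_dim=2304, time_dim=1024):
        super().__init__()
        self.token_proj = nn.Linear(gemma_dim, model_dim)    # per-token context for the DiT
        self.pooled_proj = nn.Linear(clip_dim, time_dim)     # pooled vector fused with time embedding

    def forward(self, gemma_penultimate, clip_pooled, time_emb):
        context = self.token_proj(gemma_penultimate)            # (B, T, model_dim)
        adaln_cond = time_emb + self.pooled_proj(clip_pooled)   # drives AdaLN modulation
        return context, adaln_cond
```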

submitted 1 week ago* (last edited 1 week ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

This model is a LoRA designed to repair the acceleration capability of LoRAs trained on Z-Image Turbo.

LoRAs directly trained on Z-Image Turbo lose their acceleration ability, resulting in blurry images when generated under accelerated settings (steps=8, cfg=1), while the images generated under non-accelerated settings (steps=30, cfg=2) are normal.

Github: https://github.com/modelscope/DiffSynth-Studio/blob/main/docs/en/Model_Details/Z-Image.md
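
The reason a second LoRA can "repair" the first is that LoRA updates are additive deltas on the same base weights, so the style LoRA and the repair LoRA can simply be stacked. A toy sketch of that arithmetic (shapes and names are illustrative only; use the DiffSynth-Studio loading code linked above in practice):

```python
# Toy illustration of stacking two LoRAs on one base weight:
# W' = W + up_style @ down_style + up_repair @ down_repair
import torch

def apply_lora(weight, down, up, scale=1.0):
    return weight + scale * (up @ down)

base = torch.randn(1024, 1024)                    # a base linear/attention weight
style_down, style_up = torch.randn(16, 1024) * 0.01, torch.randn(1024, 16) * 0.01
repair_down, repair_up = torch.randn(16, 1024) * 0.01, torch.randn(1024, 16) * 0.01

w = apply_lora(base, style_down, style_up)        # your LoRA trained on Z-Image Turbo
w = apply_lora(w, repair_down, repair_up)         # repair LoRA restores steps=8, cfg=1 generation
```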


Abstract

Recent advances in large multi-modal generative models have demonstrated impressive capabilities in multi-modal generation, including image and video generation. These models are typically built upon multi-step frameworks like diffusion and flow matching, which inherently limits their inference efficiency (requiring 40-100 function evaluations (NFEs)). While various few-step methods aim to accelerate the inference, existing solutions have clear limitations. Prominent distillation-based methods, such as progressive and consistency distillation, either require an iterative distillation procedure or show significant degradation at very few steps (< 4-NFE). Meanwhile, integrating adversarial training into distillation (e.g., DMD/DMD2 and SANA-Sprint) to enhance performance introduces training instability, added complexity, and high GPU memory overhead due to the auxiliary trained models. To this end, we propose TwinFlow, a simple yet effective framework for training 1-step generative models that bypasses the need for fixed pretrained teacher models and avoids standard adversarial networks during training, making it ideal for building large-scale, efficient models. On text-to-image tasks, our method achieves a GenEval score of 0.83 in 1-NFE, outperforming strong baselines like SANA-Sprint (a GAN loss-based framework) and RCGM (a consistency-based framework). Notably, we demonstrate the scalability of TwinFlow by full-parameter training on Qwen-Image-20B and transform it into an efficient few-step generator. With just 1-NFE, our approach matches the performance of the original 100-NFE model on both the GenEval and DPG-Bench benchmarks, reducing computational cost by 100x with minor quality degradation. Project page is available at the link below.

They are also working on applying TwinFlow to Z-Image-Turbo to make it faster.

Paper: https://arxiv.org/abs/2512.05150

Code: https://github.com/inclusionAI/TwinFlow

Project Page: https://zhenglin-cheng.com/twinflow
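
To make the NFE terminology concrete, here is a generic contrast between standard flow-matching sampling and a distilled one-step generator. The models are dummy stand-ins; TwinFlow's actual training and sampling code is in the repo above:

```python
# 100-NFE Euler integration of a velocity field vs. a 1-NFE distilled generator.
import torch

def multi_step_sample(velocity_model, shape, nfe=100):
    # Standard flow matching: integrate dx/dt = v(x, t) from noise with `nfe` Euler steps.
    x = torch.randn(shape)
    dt = 1.0 / nfe
    for i in range(nfe):
        t = torch.full((shape[0],), i * dt)
        x = x + dt * velocity_model(x, t)
    return x

def one_step_sample(generator, shape):
    # A distilled one-step generator maps noise to data in a single forward pass (1 NFE).
    return generator(torch.randn(shape))

dummy_velocity = lambda x, t: -x              # stand-in network
dummy_generator = lambda z: torch.tanh(z)     # stand-in one-step generator
x_slow = multi_step_sample(dummy_velocity, (2, 4), nfe=100)   # 100 forward passes
x_fast = one_step_sample(dummy_generator, (2, 4))             # 1 forward pass
```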

