MEGATHREAD (lemmy.dbzer0.com)

This is a copy of the /r/stablediffusion wiki, to help people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze, and I've collected these resources and links to help you enjoy Stable Diffusion, whether you're here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated, but not necessary as you being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, or provided by Stability AI.

#Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your skills with Stable Diffusion, whether you're a beginner or an expert.

DreamBooth

How to train a custom model, plus resources on doing so.

Models

Models specially trained toward certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Bots you can self-host, or bots you can use directly on various websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, GIMP, etc.

Other useful tools

#Community

Games

  • PictionAIry: (Video | 2-6 Players) - The image-guessing game where AI does the drawing!

Podcasts

Databases or Lists

I'm still updating this with more links as I collect them.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion requires a GPU with 4 GB+ of VRAM to run locally, though much beefier graphics cards (10-, 20-, or 30-series Nvidia cards) are needed to generate high-resolution or high-step images. Alternatively, anyone can run it online through DreamStudio or by hosting it on their own GPU compute cloud server (see the quick VRAM check after this list).
  • Only Nvidia cards are officially supported.
  • AMD support is available here unofficially.
  • Apple M1 Chip support is available here unofficially.
  • Intel-based Macs currently do not work with Stable Diffusion.
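
If you're unsure what your GPU has, here's a minimal local sanity check, assuming PyTorch with CUDA support is installed. The 4 GB figure is the guideline from the list above, not a hard limit:

```python
# Quick check: do we have a CUDA GPU, and how much VRAM does it report?
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    verdict = "meets the 4 GB guideline" if vram_gb >= 4 else "below the 4 GB guideline"
    print(f"{props.name}: {vram_gb:.1f} GB VRAM ({verdict})")
else:
    print("No CUDA GPU detected; consider an online service like DreamStudio.")
```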

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out to us via modmail or message me.


Abstract

Recent advances in video generation enable a new paradigm for 3D scene creation: generating camera-controlled videos that simulate scene walkthroughs, then lifting them to 3D via feed-forward reconstruction techniques. This generative reconstruction approach combines the visual fidelity and creative capacity of video models with 3D outputs ready for real-time rendering and simulation. Scaling to large, complex environments requires 3D-consistent video generation over long camera trajectories with large viewpoint changes and location revisits, a setting where current video models degrade quickly. Existing methods for long-horizon generation are fundamentally limited by two forms of degradation: spatial forgetting and temporal drifting. As exploration proceeds, previously observed regions fall outside the model's temporal context, forcing the model to hallucinate structures when revisited. Meanwhile, autoregressive generation accumulates small synthesis errors over time, gradually distorting scene appearance and geometry. We present Lyra 2.0, a framework for generating persistent, explorable 3D worlds at scale. To address spatial forgetting, we maintain per-frame 3D geometry and use it solely for information routing -- retrieving relevant past frames and establishing dense correspondences with the target viewpoints -- while relying on the generative prior for appearance synthesis. To address temporal drifting, we train with self-augmented histories that expose the model to its own degraded outputs, teaching it to correct drift rather than propagate it. Together, these enable substantially longer and 3D-consistent video trajectories, which we leverage to fine-tune feed-forward reconstruction models that reliably recover high-quality 3D scenes.
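
To make the "geometry for information routing" idea concrete, here is a conceptual sketch of retrieving past frames by how much their stored geometry covers a target viewpoint. Every name here (the functions, the frame dict layout) is my own illustration of what the abstract describes, not the released Lyra 2.0 code:

```python
import numpy as np

def visible_overlap(points_world: np.ndarray, K: np.ndarray, w2c: np.ndarray,
                    hw: tuple[int, int]) -> float:
    """Fraction of a frame's stored 3D points that project inside the target view."""
    h, w = hw
    # World -> camera coordinates for the target viewpoint.
    cam = (w2c[:3, :3] @ points_world.T + w2c[:3, 3:4]).T
    in_front = cam[:, 2] > 1e-6
    if not in_front.any():
        return 0.0
    # Perspective projection into pixel coordinates.
    uv = (K @ cam[in_front].T).T
    uv = uv[:, :2] / uv[:, 2:3]
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return float(inside.mean())

def retrieve_history(frames: list[dict], target_K: np.ndarray,
                     target_w2c: np.ndarray, hw: tuple[int, int], k: int = 4) -> list[dict]:
    """Pick the k past frames whose geometry best covers the target view;
    these are what a model would condition on, with appearance left to the
    generative prior."""
    scores = [visible_overlap(f["points"], target_K, target_w2c, hw) for f in frames]
    best = np.argsort(scores)[::-1][:k]
    return [frames[i] for i in best]
```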

Paper: https://arxiv.org/abs/2604.13036

Code: https://github.com/nv-tlabs/lyra

Weights: https://huggingface.co/nvidia/Lyra-2.0

Project Page: https://research.nvidia.com/labs/sil/projects/lyra2/


cross-posted from: https://lemmy.world/post/45437770

What do people here think about local AI-based solutions for generating alt text? Could a tool be developed to use the AI Horde to help generate alt text? I think having ways to do it in an automated fashion would be useful to the community and could help increase the adoption and usage of alt text.
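
As a starting point for discussion, here's a hedged sketch of what such a tool might look like against the AI Horde's image-interrogation API. The endpoint paths, payload fields, and response shape below are my reading of the public AI Horde docs and should be verified before relying on them:

```python
# Sketch: request an image caption from the AI Horde, poll until done.
import time
import requests

API = "https://stablehorde.net/api/v2"
HEADERS = {"apikey": "0000000000"}  # anonymous key; a registered key gets priority

def generate_alt_text(image_url: str, timeout_s: int = 120) -> str:
    payload = {"forms": [{"name": "caption"}], "source_image": image_url}
    r = requests.post(f"{API}/interrogate/async", json=payload, headers=HEADERS)
    r.raise_for_status()
    job_id = r.json()["id"]

    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = requests.get(f"{API}/interrogate/status/{job_id}").json()
        if status.get("state") == "done":
            # Assumed response shape; check the Horde API docs.
            return status["forms"][0]["result"]["caption"]
        time.sleep(5)
    raise TimeoutError("AI Horde caption job did not finish in time")
```

A Fediverse bot could run this on media posts that lack alt text and suggest a caption to the author, keeping a human in the loop rather than auto-publishing.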

Sorry if this isn't the right community for this kind of discussion.

original post text

I've been thinking a lot about the Fediverse ALT issue.

Some people are annoyed by posts without alt text, and others get reminded¹ to add it.

The core question is: How can we improve accessibility?

Proposal:

☑️ Add a user filter to hide media posts without an alt tag
☑️ Reduced engagement on hidden posts would encourage adding alt text
☑️ People who need accessibility wouldn't have to encounter unlabeled media

If this gets traction, I'll open a Mastodon GitHub issue (maybe on others too?).

¹ https://mastodon.social/@madeindex/113996311493021102


Abstract

While proprietary systems such as Seedance-2.0 have achieved remarkable success in omni-capable video generation, open-source alternatives lag significantly behind. Most academic models remain heavily fragmented, and the few existing efforts toward unified video generation still struggle to seamlessly integrate diverse tasks within a single framework. To bridge this gap, we propose OmniWeaving, an omni-level video generation model featuring powerful multimodal composition and reasoning-informed capabilities. By leveraging a massive-scale pretraining dataset that encompasses diverse compositional and reasoning-augmented scenarios, OmniWeaving learns to temporally bind interleaved text, multi-image, and video inputs while acting as an intelligent agent to infer complex user intentions for sophisticated video creation. Furthermore, we introduce IntelligentVBench, the first comprehensive benchmark designed to rigorously assess next-level intelligent unified video generation. Extensive experiments demonstrate that OmniWeaving achieves state-of-the-art performance among open-source unified models. The code and models are already publicly available; see the project page linked below.
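
For readers unfamiliar with what "interleaved text, multi-image, and video inputs" means in practice, here is a purely illustrative sketch of such a prompt structure; none of these names come from the OmniWeaving release:

```python
from dataclasses import dataclass
from typing import Literal, Union

@dataclass
class TextChunk:
    text: str

@dataclass
class MediaChunk:
    kind: Literal["image", "video"]
    path: str

# An interleaved prompt mixing text, a reference image, and a video clip in one
# sequence, which a unified model would tokenize and bind into a single context.
prompt: list[Union[TextChunk, MediaChunk]] = [
    TextChunk("Put the character from"),
    MediaChunk("image", "ref_character.png"),
    TextChunk("into the scene of"),
    MediaChunk("video", "beach_clip.mp4"),
    TextChunk("walking toward the camera at sunset."),
]
```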

Paper: https://arxiv.org/abs/2603.24458

Code: https://github.com/Tencent-Hunyuan/OmniWeaving

Models: https://huggingface.co/tencent/HY-OmniWeaving

Project Page: https://omniweaving.github.io/
