998
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
this post was submitted on 25 Jul 2024
998 points (100.0% liked)
Technology
76310 readers
2824 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related news or articles.
- Be excellent to each other!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
- Check for duplicates before posting, duplicates may be removed
- Accounts 7 days and younger will have their posts automatically removed.
Approved Bots
founded 2 years ago
MODERATORS
Generating multiple pictures of the same character is actually pretty hard. For example, let's say you're making a visual novel with a bunch of anime girls. You spin up your generative AI, and it gives you a great picture of a girl with a good design in a neutral pose. We'll call her Alice. Well, now you need a happy Alice, a sad Alice, a horny Alice, an Alice with her face covered with cum, a nude Alice, and a hyper breast expansion Alice. Getting the AI to recreate Alice, who does not exist in the training data, is going to be very difficult even once.
And all of this is multiplied ten times over if you want granular changes to a character. Let's say you're making a fat fetish game and Alice is supposed to gain weight as the player feeds her. Now you need everything I described, at 10 different weights. You're going to need to be extremely specific with the AI and it's probably going to produce dozens of incorrect pictures for every time it gets it right. Getting it right might just plain be impossible if the AI doesn't understand the assignment well enough.
Not from what I have seen on Civitai. You can train a model on specific character or person. Same goes for facial expressions.
Of course you need to generate hundreds of images to get only a few that you might consider acceptable.
This is a solvable problem. Just make a LoRA of the Alice character. For modifications to the character, you might also need to make more LoRAs, but again totally doable. Then at runtime, you are just shuffling LoRAs when you need to generate.
You're correct that it will struggle to give you exactly what you want because you need to have some "machine sympathy." If you think in smaller steps and get the machine to do those smaller, more do-able steps, you can eventually accomplish the overall goal. It is the difference in asking a model to write a story versus asking it to first generate characters, a scenario, plot and then using that as context to write just a small part of the story. The first story will be bland and incoherent after awhile. The second, through better context control, will weave you a pretty consistent story.
These models are not magic (even though it feels like it). That they follow instructions at all is amazing, but they simply will not get the nuance of the overall picture and be able to accomplish it un-aided. If you think of them as natural language processors capable of simple, mechanical tasks and drive them mechanistically, you'll get much better results.