Consequences
(pawb.social)
"We did it, Patrick! We made a technological breakthrough!"
A place for all those who loathe AI to discuss things, post articles, and ridicule the AI hype. Proud supporter of working people. And proud booer of SXSW 2024.
If some alien species asked you to draw a part of its anatomy that can move into a wide array of configurations, but you were required to do so based only on pictures the aliens sent you, which they tell you show that part among other things, would you do better?
Like, what you said is specifically why it's bad at hands and table legs and the like - they can appear in many different ways, and its only reference point for them is pictures of them it's seen. You understand hands and think logically about them mostly because you have a not just wider but deeper set of experiences to work from. Even then, four-fingered hands have been common in cartoons because, even having hands, being surrounded by other beings with hands, and living in a culture that makes heavy use of hands, a lot of artists have trouble doing them quite right.
Yes, I would do better. I would take a look at the pictures and think about the angles / geometry and the reasons for the differences between the pictures, and being able to count sure helps. If they were to show me pictures in a vastly different style, I would make assumptions, like that it is a different representation of the same concept. I would not just mash them together based on color values.
I get where you're coming from, but the only reason these models seem to be able to get stuff done is the insane amount of training data and iterations.
Enjoying this discussion, by the way! It's fun to think about.