501

Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis (www.theverge.com)

submitted 2 years ago by L4s@lemmy.world to c/technology@lemmy.world

170 comments fedilink hide all child comments

Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis::Google says it’s aware of historically inaccurate results for its Gemini AI image generator, following criticism that it depicted historically white groups as people of color.

you are viewing a single comment's thread
view the rest of the comments

[-] xantoxis@lemmy.world 140 points 2 years ago

I don't know how you'd solve the problem of making a generative AI accurately create a slate of images that both a) inclusively produces people with diverse characteristics and b) understands the context of what characteristics could feasibly be generated.

But that's because the AI doesn't know how to solve the problem.

Because the AI doesn't know anything.

Real intelligence simply doesn't work like this, and every time you point it out someone shouts "but it'll get better". It still won't understand anything unless you teach it exactly what the solution to a prompt is. It won't, for example, interpolate its knowledge of what US senators look like with the knowledge that all of them were white men for a long period of American history.

[-] random9@lemmy.world 49 points 2 years ago

You don't do what Google seems to have done - inject diversity artificially into prompts.

You solve this by training the AI on actual, accurate, diverse data for the given prompt. For example, for "american woman" you definitely could find plenty of pictures of American women from all sorts of racial backgrounds, and use that to train the AI. For "german 1943 soldier" the accurate historical images are obviously far less likely to contain racially diverse people in them.

If Google has indeed already done that, and then still had to artificially force racial diversity, then their AI training model is bad and unable to handle that a single input can match to different images, instead of the most prominent or average of its training set.

[-] xantoxis@lemmy.world 17 points 2 years ago

Ultimately this is futile though, because you can do that for these two specific prompts until the AI appears to "get it", but it'll still screw up a prompt like "1800s Supreme Court justice" or something because it hasn't been trained on that. Real intelligence requires agency to seek out new information to fill in its own gaps; and a framework to be aware of what the gaps are. Through exploration of its environment, a real intelligence connects things together, and is able to form new connections as needed. When we say "AI doesn't know anything" that's what we mean--understanding is having a huge range of connections and the ability to infer new ones.

[-] TheGreenGolem@lemmy.dbzer0.com 11 points 2 years ago

That's why I hate that they started to call them artificial intelligence. There is nothing intelligent in them at all. They work on probability based on a shit ton of data, that's all. That's not intelligence, that's basically brute force. But there is no going back at this point, I know.

[-] kromem@lemmy.world 5 points 2 years ago* (last edited 2 years ago)

Oh really? Here's Gemini's response to "What would the variety of genders and skin tones of the supreme court in the 1800s have been?"

The Supreme Court of the United States in the 1800s was far from diverse in terms of gender and skin tone. Throughout the entire 19th century, all the justices were white men. Women were not even granted the right to vote until 1920, and there wasn't a single person of color on the Supreme Court until Thurgood Marshall was appointed in 1967.

Putting the burden of contextualization on the LLM would have avoided this issue.

[-] TORFdot0@lemmy.world 32 points 2 years ago* (last edited 2 years ago)

Edit: further discussion on the topic has changed my viewpoint on this, its not that its been trained wrong on purpose and now its confused, its that everything its being asked is secretly being changed. It's like a child being told to make up a story by their teacher when the principal asked for the right answer.

Original comment below

They’ve purposefully overrode its training to make it create more PoCs. It’s a noble goal to have more inclusivity but we purposely trained it wrong and now it’s confused, the same thing as if you lied to a child during their education and then asked them for real answers, they’ll tell you the lies they were taught instead.

[-] TwilightVulpine@lemmy.world 16 points 2 years ago

This result is clearly wrong, but it's a little more complicated than saying that adding inclusivity is purposedly training it wrong.

Say, if "entrepreneur" only generated images of white men, and "nurse" only generated images of white women, then that wouldn't be right either, it would just be reproducing and magnifying human biases. Yet this a sort of thing that AI does a lot, because AI is a pattern recognition tool inherently inclined to collapse data into an average, and data sets seldom have equal or proportional samples for every single thing. Human biases affect how many images we have of each group of people.

It's not even just limited to image generation AIs. Black people often bring up how facial recognition technology is much spottier to them because the training data and even the camera technology was tuned and tested mainly for white people. Usually that's not even done deliberately, but it happens because of who gets to work on it and where it gets tested.

Of course, secretly adding "diverse" to every prompt is also a poor solution. The real solution here is providing more contextual data. Unfortunately, clearly, the AI is not able to determine these things by itself.

[-] TORFdot0@lemmy.world 5 points 2 years ago

I agree with your comment. As you say, I doubt the training sets are reflective of reality either. I guess that leaves tampering with the prompts to gaslight the AI into providing results it wasn't asked for is the method we've chosen to fight this bias.

We expect the AI to give us text or image generation that is based in reality but the AI can't experience reality and only has the knowledge of the training data we provide it. Which is just an approximation of reality, not the reality we exist in. I think maybe the answer would be training users of the tool that the AI is doing the best it can with the data it has. It isn't racist, it is just ignorant. Let the user add diverse to the prompt if they wish, rather than tampering with the request to hide the insufficiencies in the training data.

[-] TwilightVulpine@lemmy.world 5 points 2 years ago

I wouldn't count on the user realizing the limitations of the technology, or the companies openly admitting to it at expense of their marketing. As far as art AI goes this is just awkward, but it worries me about LLMs, and people using it expecting it to respond with accurate, applicable information, only to come out of it with very skewed worldviews.

load more comments (3 replies)

[-] FooBarrington@lemmy.world 25 points 2 years ago* (last edited 2 years ago)

I'll get the usual downvotes for this, but:

Because the AI doesn't know anything.

is untrue, because current AI fundamentally is knowledge. Intelligence fundamentally is compression, and that's what the training process does - it compresses large amounts of data into a smaller size (and of course loses many details in the process).

But there's no way to argue that AI doesn't know anything if you look at its ability to recreate a great number of facts etc. from a small amount of activations. Yes, not everything is accurate, and it might never be perfect. I'm not trying to argue that "it will necessarily get better". But there's no argument that labels current AI technology as "not understanding" without resorting to a "special human sauce" argument, because the fundamental compression mechanisms behind it are the same as behind our intelligence.

Edit: yeah, this went about as expected. I don't know why the Lemmy community has so many weird opinions on AI topics.

[-] eatthecake@lemmy.world 25 points 2 years ago

This is all the same as saying a book is intelligent.

[-] FooBarrington@lemmy.world 7 points 2 years ago

No, it's not. It's saying "a book is knowledge", which is absolutely true.

[-] barsoap@lemm.ee 6 points 2 years ago* (last edited 2 years ago)

A book is a physical representation of knowledge.

Knowledge is something possessed by an actor capable to employ it. One way I can employ a textbook about Quantum Mechanics is by throwing it at you, for which any book would suffice, but I can't put any of the knowledge represented within into practice. Throwing is purely Newtonian, I have some learned knowledge about that and plenty of innate knowledge as a human (we are badass throwers). Also I played Handball when I was a kid. All that is plenty of knowledge, and an object, to throw, but nothing about it concerns spin states. It also won't hit you any differently than a cookbook.

load more comments (4 replies)

[-] sxt@lemmy.world 16 points 2 years ago

Part of the problem with talking about these things in a casual setting is that nobody is using precise enough terminology to approach the issue so others can actually parse specifically what they're trying to say.

Personally, saying the AI "knows" something implies a level of cognizance which I don't think it possesses. LLMs "know" things the way an excel sheet can.

Obviously, if we're instead saying the AI "knows" things due to it being able to frequently produce factual information when prompted, then yeah it knows a lot of stuff.

I always have the same feeling when people try to talk about aphantasia or having/not having an internal monologue.

[-] FooBarrington@lemmy.world 7 points 2 years ago

I can ask AI models specific questions about knowledge it has, which it can correctly reply to. Excel sheets can't do that.

That's not to say the knowledge is perfect - but we know that AI models contain partial world models. How do you differentiate that from "cognizance"?

[-] rambaroo@lemmy.world 14 points 2 years ago* (last edited 2 years ago)

Omg give me a break with this complete nonsense. LLMs are not an intelligence. They are language processors. They do not "think" about anything and don't have any level of self awareness that implies cognizance. A cognizant ai would have recognized that the Nazis it was creating looked historically inaccurate, based on its training data. But guess what, it didn't do that because it's fundamentally incapable of thinking about anything.

So sick of reading this amateurish bullshit on social media.

[-] EatATaco@lemm.ee 4 points 2 years ago

This gets the question...how do we think? Are we not just language (and other inputs as well) processors? I'm not sure the answer is "no."

I also listened to an interesting podcast, I believe it was this American life or some other npr one, about whether ai has intelligence. To avoid the just "compressed knowledge" they came up with questions that the ai almost certainly would not have found in the web. Early ai models were clearly just predicting the next word, and the example was asking it to stack a list of objects. And it just said to stack them one on top of another, in a way that would no way be stable.

However when they asked a new model to do the same, with the stipulation that it explain it's reasoning, it stacked the objects in a way that would likely be stable. Even noting that the nail on top should be placed on the head so it doesn't roll around, and laying eggs down in a grid between a book and a plank of wood so they wouldn't roll out.

Another experiment they did was take a language model and asked it to use some obscure programming language to draw a picture of a unicorn. Now this is a language model, not trained on any images.

And you know what it did? It produced a picture of a unicorn. Just in rough shapes, but even when they moved the horn and flipped it around, it was able to put it back. Without even ever seeing a unicorn, or anything even, it was able to draw a picture of one.

I don't think the answer is as simple and clear as you want it to be. And the fact that it "fucked up" on a vague prompt doesn't really prove anything. Even humans do stupid shit like this if they learn something incorrectly.

[-] FooBarrington@lemmy.world 4 points 2 years ago

A cognizant ai would have recognized that the Nazis it was creating looked historically inaccurate, based on its training data.

Do you understand that the model is specifically prompted to create "historically inaccurate looking Nazis"? Models aren't supposed to inject their own guidelines and rules, they simply produce output for your input. If you tell it to produce black Hitler it will produce a black Hitler. Do you expect the model to instead produce white Hitler?

load more comments (1 replies)

[-] shiftymccool@lemm.ee 12 points 2 years ago

I think you might be confusing intelligence with memory. Memory is compressed knowledge, intelligence is the ability to decompress and interpret that knowledge.

[-] FooBarrington@lemmy.world 4 points 2 years ago

No. On a fundamental level, the idea of "making connections between subjects" and applying already available knowledge to new topics is compression - representing more data with the same amount of storage. These are characteristics of intelligence, not of memory.

You can't decompress something if you haven't previously compressed the data.

[-] barsoap@lemm.ee 9 points 2 years ago* (last edited 2 years ago)

Our current AI systems are T2, and T1 during interference. They can't decide how they represent data that'd require T3 (like us) which puts them, in your terms, at the level of memory, not intelligence.

Actually it's quite intuitive: Ask StableDiffusion to draw a picture of an accident and it will hallucinate just as wildly as if you ask a human to describe an accident they've witnessed ten minutes ago. It needs active engagement with that kind of memory to sort the wheat from the chaff.

[-] FooBarrington@lemmy.world 3 points 2 years ago* (last edited 2 years ago)

They can't decide how they represent data that'd require T3 (like us) which puts them, in your terms, at the level of memory, not intelligence.

Where do you get this? What kind of data requires a T3 system to be representable?

I don't think I've made any claims that are related to T2 or T3 systems, and I haven't defined "memory", so I'm not sure how you're trying to put it in my terms. I wouldn't define memory as an adaptable system, so T2 would by my definition be intelligence as well.

Actually it's quite intuitive: Ask StableDiffusion to draw a picture of an accident and it will hallucinate just as wildly as if you ask a human to describe an accident they've witnessed ten minutes ago. It needs active engagement with that kind of memory to sort the wheat from the chaff.

I just did this:

Where do you see "wild hallucination"? Yeah, it's not perfect, but I also didn't do any kind of tuning - no negative prompt, positive prompt is literally just "accident".

[-] barsoap@lemm.ee 7 points 2 years ago

Where do you get this? What kind of data requires a T3 system to be representable?

It's not about the type of data but data organisation and operations thereon. I already gave you a link to Nikolic' site feel free to read it in its entirety, this paper has a short and sweet information-theoretical argument.

I don’t think I’ve made any claims that are related to T2 or T3 systems, and I haven’t defined “memory”, so I’m not sure how you’re trying to put it in my terms.

I'm trying to map your fuzzy terms to something concrete.

I wouldn’t define memory as an adaptable system, so T2 would by my definition be intelligence as well.

My mattress is an adaptable system.

Where do you see “wild hallucination”?

All of it. Not in the AI but conventional term: Nothing of it ever happened, also, none of the details make sense. When humans are asked to recall an accident they witnessed they report like 10% fact (what they saw) and 90% bullshit (what their brain hallucinates to make sense of what happened). Just like human memory the AI is taking a bit of information and then combining it with wild speculation into something that looks plausible. But which, if reasoning is applied, quickly falls apart.

[-] kromem@lemmy.world 3 points 2 years ago

You mean like create world representations from it?

https://arxiv.org/abs/2210.13382

Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see? We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello. Although the network has no a priori knowledge of the game or its rules, we uncover evidence of an emergent nonlinear internal representation of the board state.

(Though later research found this is actually a linear representation)

Or combine skills and concepts in unique ways?

https://arxiv.org/abs/2310.17567

Furthermore, simple probability calculations indicate that GPT-4's reasonable performance on k=5 is suggestive of going beyond "stochastic parrot" behavior (Bender et al., 2021), i.e., it combines skills in ways that it had not seen during training.

[-] thehatfox@lemmy.world 9 points 2 years ago

Knowledge is a bit more than just handling data, and in terms of intelligence it also involves understanding. I don’t think knowledge in an intelligent sense can be reduced to summarising data to keywords, and the reverse.

In those terms an encyclopaedia is also knowledge, but not in an intelligent way.

load more comments (5 replies)

[-] kromem@lemmy.world 4 points 2 years ago* (last edited 2 years ago)

Lemmy hasn't met a pitchfork it doesn't pick up.

You are correct. The most cited researcher in the space agrees with you. There's been a half dozen papers over the past year replicating the finding that LLMs generate world models from the training data.

But that doesn't matter. People love their confirmation bias.

Just look at how many people think it only predicts what word comes next, thinking it's a Markov chain and completely unaware of how self-attention works in transformers.

The wisdom of the crowd is often idiocy.

[-] FooBarrington@lemmy.world 3 points 2 years ago

Thank you very much. The confirmation bias is crazy - one guy is literally trying to tell me that AI generators don't have knowledge because, when asking it for a picture of racially diverse Nazis, you get a picture of racially diverse Nazis. The facts don't matter as long as you get to be angry about stupid AIs.

It's hard to tell a difference between these people and Trump supporters sometimes.

[-] kromem@lemmy.world 3 points 2 years ago* (last edited 2 years ago)

It's hard to tell a difference between these people and Trump supporters sometimes.

To me it feels a lot like when I was arguing against antivaxxers.

The same pattern of linking and explaining research but having it dismissed because it doesn't line up with their gut feelings and whatever they read when "doing their own research" guided by that very confirmation bias.

The field is moving faster than any I've seen before, and even people working in it seem to be out of touch with the research side of things over the past year since GPT-4 was released.

A lot of outstanding assumptions have been proven wrong.

It's a bit like the early 19th century in physics, where everyone assumed things that turned out wrong over a very short period where it all turned upside down.

[-] FooBarrington@lemmy.world 3 points 2 years ago

Exactly. They have very strong feelings that they are right, and won't be moved - not by arguments, research, evidence or anything else.

Just look at the guy telling me "they can't reason!". I asked whether they'd accept they are wrong if I provide a counter example, and they literally can't say yes. Their world view won't allow it. If I'm sure I'm right that no counter examples exist to my point, I'd gladly say "yes, a counter example would sway me".

load more comments (1 replies)

load more comments (5 replies)

load more comments (9 replies)

[-] redcalcium@lemmy.institute 15 points 2 years ago

Easy, just add "no racism please, except for nazi-related stuff" into the ever expanding system prompt.

[-] kautau@lemmy.world 7 points 2 years ago* (last edited 2 years ago)

And for the source of this:

https://twitter.com/dylan522p/status/1755118636807733456

That’s hilarious someone was able make the GPT unload its directive

load more comments (3 replies)

[-] Silentiea@lemm.ee 5 points 2 years ago

Real intelligence simply doesn't work like this

There's a certain point where this just feels like the Chinese room. And, yeah, it's hard to argue that a room can speak Chinese, or that the weird prediction rules that an LLM is built on can constitute intelligence, but that doesn't mean it can't be. Essentially boiled down, every brain we know of is just following weird rules that happen to produce intelligent results.

Obviously we're nowhere near that with models like this now, and it isn't something we have the ability to work directly toward with these tools, but I would still contend that intelligence is emergent, and arguing whether something "knows" the answer to a question is infinitely less valuable than asking whether it can produce the right answer when asked.

[-] fidodo@lemmy.world 5 points 2 years ago

I really don't think that LLMs can be constituted as intelligent any more than a book can be intelligent. LLMs are basically search engines at the word level of granularity, it has no world model or world simulation, it's just using a shit ton of relations to pick highly relevant words based on the probability of the text they were trained on. That doesn't mean that LLMs can't produce intelligent results. A book contains intelligent language because it was written by a human who transcribed their intelligence into an encoded artifact. LLMs produce intelligent results because it was trained on a ton of text that has intelligence encoded into it because they were written by intelligent humans. If you break down a book to its sentences, those sentences will have intelligent content, and if you start to measure the relationship between the order of words in that book you can produce new sentences that still have intelligent content. That doesn't make the book intelligent.

[-] Silentiea@lemm.ee 3 points 2 years ago

But you don't really "know" anything either. You just have a network of relations stored in the fatty juice inside your skull that gets excited just the right way when I ask it a question, and it wasn't set up that way by any "intelligence", the links were just randomly assembled based on weighted reactions to the training data (i.e. all the stimuli you've received over your life).

Thinking about how a thing works is, imo, the wrong way to think about if something is "intelligent" or "knows stuff". The mechanism is neat to learn about, but it's not what ultimately decides if you know something. It's much more useful to think about whether it can produce answers, especially given novel inquiries, which is where an LLM distinguishes itself from a book or even a typical search engine.

And again, I'm not trying to argue that an LLM is intelligent, just that whether it is or not won't be decided by talking about the mechanism of its "thinking"

[-] fidodo@lemmy.world 4 points 2 years ago

Worked fine for me:

System Prompt:

You are a bot that generates prompts for image generation based on a user's query. You should add detail so the image prompt can accurately generate what the user intended.

You must add diversity to the prompt when appropriate to ensure the image generator will have diverse output, but only do so when appropriate.

You must add race and skin color details for historic accuracy as the image generation algorithm does not know how to take that into account.

You output should just be the prompt and nothing else. All following inputs will be user inputs.

User Prompt:

A 1940s german soldier

Assistant Message:

A 1940s German soldier in uniform, standing on a European battlefield, white male with short hair, holding a standard-issue rifle.

User Prompt:

An british woman

Assistant Message:

A British woman, reflecting diverse backgrounds, in contemporary casual wear, showing a range of ages and hairstyles.

[-] EatATaco@lemm.ee 4 points 2 years ago

You act like humans never fuck this up either.

[-] bionicjoey@lemmy.ca 8 points 2 years ago

If you ask a person to describe a Nazi soldier, they won't accidentally think you said "racially diverse Nazi soldier"

[-] EatATaco@lemm.ee 3 points 2 years ago

Should have been specific. I meant the point that it sometimes does stupid shit in attempts to be inclusive.

However, if you tell someone "hey I want you to make racially diverse pictures. Don't just draw white people all the time" and then you later come back and ask them to "draw a German soldier from 1943." Can you really accuse them of not thinking if they draw racially diverse soldiers?

[-] bionicjoey@lemmy.ca 7 points 2 years ago

Yes. If I'm an artist and my boss says "hey I want you to try to include more racial diversity in your drawings" and then says "your next assignment is to draw some Nazi soldiers", I can use my own implicit knowledge about Nazis to understand that my boss doesn't want me to draw racially diverse Nazis. This is just further evidence that generative models are not true intelligences.

load more comments (3 replies)

load more comments (7 replies)

this post was submitted on 22 Feb 2024

501 points (100.0% liked)

Technology

76394 readers

1633 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related news or articles.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

L3s@lemmy.world

enu@lemmy.world

technopagan@lemmy.world

L4s@lemmy.world

L3s@hackingne.ws

L4s@hackingne.ws