submitted 2 years ago by EndlessApollo@lemmy.world to c/196

KICK TECH BROS OUT OF 196

[-] Even_Adder@lemmy.dbzer0.com 1 points 2 years ago

What about this? These weird little dictionaries have lots of emergent properties we're still exploring.

[-] huginn@feddit.it 2 points 2 years ago

The paper states that the graphs representing those relations are the result of training LLMs on a very small subset of unambiguous true and false statements.

While these emergent properties may provide interesting avenues for model refinement and for inspecting outputs, that doesn't change the fact that these weird little dictionaries aren't doing anything truly unexpected. We're just learning about the extra data associated with the training data.

It's not far removed from the primary complaint of Gebru et al.'s On the Dangers of Stochastic Parrots, where the authors point out the ways that our biases are implicitly trained into LLMs because of uncontrolled and unexamined inputs. Except in this case, those biases are the linguistics of truth and lies in unambiguous boolean inputs.

[-] Even_Adder@lemmy.dbzer0.com 1 points 2 years ago

This may provide interesting avenues for model refinement that don't involve the model spitting things out and being retrained by a “consciousness” telling it yes or no, or feeding it additional info.

[-] huginn@feddit.it 1 points 2 years ago

Only if the "direction of truth" exists in the wild with unchecked training data.

That clustering is a representation of the nature of the data fed to the model: all their training data was unambiguously true or false... It's not surprising that it clusters.
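To make that point concrete, here is a toy sketch. It is not the paper's method: random vectors stand in for model activations, and every name and number in it (`DIM`, the cluster centers, the noise levels) is made up for illustration. It fits a difference-of-means "truth direction" on tightly clustered, cleanly labeled points, then checks how well that same direction separates a much noisier set, which is only an assumed stand-in for ambiguous statements in the wild.

```python
import random

def mean(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def make_points(center, n, noise, rng):
    # Hypothetical "activations": Gaussian noise around a class center.
    return [[c + rng.gauss(0, noise) for c in center] for _ in range(n)]

def fit_direction(true_pts, false_pts):
    # Difference of class means, plus the midpoint between the two means.
    mt, mf = mean(true_pts), mean(false_pts)
    direction = [a - b for a, b in zip(mt, mf)]
    midpoint = [(a + b) / 2 for a, b in zip(mt, mf)]
    return direction, midpoint

def accuracy(true_pts, false_pts, direction, midpoint):
    # Classify by the sign of the projection onto the direction.
    def score(p):
        return sum((x - m) * d for x, m, d in zip(p, midpoint, direction))
    correct = sum(score(p) > 0 for p in true_pts)
    correct += sum(score(p) <= 0 for p in false_pts)
    return correct / (len(true_pts) + len(false_pts))

rng = random.Random(0)
DIM = 8  # assumed toy dimensionality
true_center = [1.0] * DIM
false_center = [-1.0] * DIM

# Curated data: unambiguous statements modeled as tight clusters.
clean_true = make_points(true_center, 200, 0.3, rng)
clean_false = make_points(false_center, 200, 0.3, rng)

# "Wild" data: ambiguous statements modeled as heavily overlapping clusters.
wild_true = make_points(true_center, 200, 4.0, rng)
wild_false = make_points(false_center, 200, 4.0, rng)

direction, midpoint = fit_direction(clean_true, clean_false)
clean_acc = accuracy(clean_true, clean_false, direction, midpoint)
wild_acc = accuracy(wild_true, wild_false, direction, midpoint)

print(f"curated accuracy: {clean_acc:.2f}")
print(f"wild accuracy:    {wild_acc:.2f}")
```

The clean separation on the curated set falls straight out of how the data was built, which is the worry here. Whether real, unchecked training data behaves more like the curated case or the noisy one is exactly the open question, and this sketch can't answer it.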

[-] Even_Adder@lemmy.dbzer0.com 1 points 2 years ago

Cool.

this post was submitted on 23 Oct 2023

82 points (100.0% liked)

196

Be sure to follow the rule before you head out.

Rule: You must post before you leave.

Other rules

Behavior rules:

No bigotry (transphobia, racism, etc…)
No genocide denial
No support for authoritarian behaviour (incl. Tankies)
No namecalling
Accounts from lemmygrad.ml, threads.net, or hexbear.net are held to higher standards
Other things seen as clearly bad

Posting rules:

No AI generated content (DALL-E etc…)
No advertisements
No gore / violence
Mutual aid posts are not allowed

NSFW: NSFW content is permitted but it must be tagged and have content warnings. Anything that doesn't adhere to this will be removed. Content warnings should be added like: [penis], [explicit description of sex]. Non-sexualized breasts of any gender are not considered inappropriate and therefore do not need to be blurred/tagged.

If you have any questions, feel free to contact us on our matrix channel or email.

founded 3 years ago