[-] impure9435@kbin.run 235 points 1 year ago

The thing I find funniest about this post is that you call this Italian

[-] lseif@sopuli.xyz 211 points 1 year ago

how am i supposed to know how italians speak. i've never seen one

[-] jballs@sh.itjust.works 44 points 1 year ago

From my experience, they speak mostly with their hands

[-] thesporkeffect@lemmy.world 24 points 1 year ago

They're not real, but they can hurt you.

[-] lseif@sopuli.xyz 5 points 1 year ago

like reverse vampires ?

[-] Meowie_Gamer@lemmy.world 6 points 1 year ago

It's a me, Mario!

[-] Phoenix3875@lemmy.world 122 points 1 year ago

Let me simplify it: proceeds to print the same expression

[-] ChanchoManco@lemm.ee 53 points 1 year ago* (last edited 1 year ago)

Typical AI behavior

Edit: and then it will gaslight you if you say the answer is the same.

[-] driving_crooner@lemmy.eco.br 18 points 1 year ago

Fucking hate when it does that.

You are repeating the same mistake.

I'm sorry for repeating the same mistake, here's a new solution with corrections *proceeds to write the exact thing it was already told was wrong*

[-] Wappen@lemmy.world 13 points 1 year ago

Nope, they replaced an asterisk with an arrow!

[-] samus12345@lemmy.world 4 points 1 year ago

Oh, right, now I get it!

[-] abrahambelch@programming.dev 78 points 1 year ago

Which language uses these signs? It truly looks like some kind of alien language

[-] chapapa@discuss.tchncs.de 128 points 1 year ago* (last edited 1 year ago)

Glagolitic script. Oldest known Slavic alphabet according to Wikipedia.

[-] sunoc@sh.itjust.works 9 points 1 year ago

I would like to know too! Never saw that writing system before.

[-] nimpnin@sopuli.xyz 7 points 1 year ago
[-] stingpie@lemmy.world 69 points 1 year ago

This might be happening because of the 'elegant' (incredibly hacky) way OpenAI encodes multiple languages into their models. Instead of using all character sets, they apply a modulo operator to each character, so that all Unicode characters are represented by a small range of values. On the back end, it somehow detects which language is being spoken and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.

[-] PlexSheep@infosec.pub 31 points 1 year ago

Do you have a source for that? Seems like an internal detail a corpo wouldn't publish

[-] stingpie@lemmy.world 19 points 1 year ago

Can't find the exact source (I'm on mobile right now), but the code for the GPT-2 encoder uses a UTF-8 to Unicode lookup table to shrink the vocab size. https://github.com/openai/gpt-2/blob/master/src/encoder.py

[-] NeatNit@discuss.tchncs.de 18 points 1 year ago

I suppose it's conceivable that there's a bug in converting between different representations of Unicode, but I'm not buying any of this "detected which language is being spoken" nonsense or the use of character sets. It would just use Unicode.

The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there's soooooo many tokens. It would make no sense to make those tokens ambiguous.

[-] stingpie@lemmy.world 8 points 1 year ago

I completely agree that it's a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 & GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here's the code for GPT-2: https://github.com/openai/gpt-2/blob/master/src/encoder.py

I did apparently make a mistake, as the vocab reduction is done through a LUT instead of a simple mod.
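
For anyone who doesn't want to open the link, here is a rough sketch of what that lookup table does, modelled on the `bytes_to_unicode` helper in the linked file. Treat the details as my reading of it rather than a verbatim copy; `to_printable` and `from_printable` are hypothetical names I made up for the demo. As far as I can tell, the table is a one-to-one mapping from the 256 possible byte values to printable characters, so nothing becomes ambiguous.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Map each of the 256 byte values to a distinct printable character."""
    # Bytes that already render as visible characters keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = keep[:]
    code_points = keep[:]
    shift = 0
    for b in range(256):
        if b not in keep:
            # Whitespace and control bytes are moved to code points 256 and up.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


byte_encoder = bytes_to_unicode()
byte_decoder = {ch: b for b, ch in byte_encoder.items()}


def to_printable(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then map each byte through the table."""
    return "".join(byte_encoder[b] for b in text.encode("utf-8"))


def from_printable(mapped: str) -> str:
    """Hypothetical helper: invert the table and decode the bytes as UTF-8."""
    return bytes(byte_decoder[ch] for ch in mapped).decode("utf-8")
```

Running any string, Glagolitic included, through `to_printable` and then `from_printable` gives back the original, which is why this step on its own would not explain the model switching scripts.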

[-] Redex68@lemmy.world 62 points 1 year ago

Damn, wild Glagolitic script found. I didn't even realise it was in the Unicode standard.

[-] Hupf@feddit.de 60 points 1 year ago

Well, it certainly doesn't overflow on 32 bit systems

[-] Annoyed_Crabby@monyet.cc 49 points 1 year ago

That's not Italian, that's obviously Unown

[-] Vitaly@feddit.uk 35 points 1 year ago

It looks so badass. I could have been using that script now because I'm Ukrainian, but instead I have the Cyrillic script, which is so boring

[-] match@pawb.social 8 points 1 year ago

rebel against Russian imperialism, return to Glagolitic

[-] Vitaly@feddit.uk 5 points 1 year ago* (last edited 1 year ago)

It's not Russian. If my Bulgarian friend is right, it was created by a Bulgarian guy

[-] TwilightKiddy@programming.dev 4 points 1 year ago

There is no single person responsible for the Cyrillic script. It is generally believed to have been created by the scholars of the Preslav Literary School, which was indeed in Bulgaria, by mixing and adapting the Greek and Glagolitic scripts. After a while, Peter the Great changed it a lot. And then Stalin stomped out almost all the deviations in the usage of the script.

The last part is mostly why it is considered Russian. A lot of languages suffered because of Moscow just forcing them to use the version of Cyrillic that Russians were using.

[-] ICastFist@programming.dev 18 points 1 year ago

Title mentions speaking Italian

Not a single hand gesture anywhere

I've been duped

[-] RacoonVegetable@reddthat.com 18 points 1 year ago

I felt that when he said *83h400+93)*38hpfhi0

[-] 9point6@lemmy.world 16 points 1 year ago

Never go full APL

[-] iAvicenna@lemmy.world 15 points 1 year ago
[-] QuazarOmega@lemy.lol 14 points 1 year ago

You may not understand, but we do.
This secret will remain jealously guarded by the Italic lineage. ◉‿◉

[-] MazonnaCara89@lemmy.ml 6 points 1 year ago

No, brother, we can't keep this secret until the end

[-] supercriticalcheese@lemmy.world 4 points 1 year ago

Why not? A bit like the secret of how to prepare pasta

[-] Vitaly@feddit.uk 9 points 1 year ago

Kind of looks like the Georgian writing system, but I'm not sure

[-] TwilightKiddy@programming.dev 18 points 1 year ago

Nah, Georgian is arcs and circles everywhere, like this: ეს ქართული დამწერლობაა.

[-] Vitaly@feddit.uk 6 points 1 year ago

Well, then I was wrong

[-] r00ty@kbin.life 9 points 1 year ago

Wow, an alien ion drive formula! Try to get warp drive out of it too!

[-] BlueMagma@sh.itjust.works 7 points 1 year ago

Looks like Uiua: uiua.org

[-] NotSpez@lemmy.ml 5 points 1 year ago

We are so cooked

this post was submitted on 12 Jun 2024
724 points (100.0% liked)

Programmer Humor

Post funny things about programming here! (Or just rant about your favourite programming language.)
