
ChatGPT gets code questions wrong 52% of the time (www.theregister.com)

submitted 1 year ago by ylai@lemmy.ml to c/artificial_intel@lemmy.ml


[-] SirGolan@lemmy.sdf.org 7 points 1 year ago

Yeah. They buried it in there (and for some of their experiments just said "ChatGPT", which could mean either 3.5 or 4), but they used 3.5, and oddly enough, 3.5 gets 48% on HumanEval.

[-] fristislurper@feddit.nl 6 points 1 year ago* (last edited 1 year ago)

They "buried" it in the methodology section, where they describe how they generate prompts. This is the place I would expect it to be mentioned, or am I missing something? Where else would they put it?

[-] SirGolan@lemmy.sdf.org 4 points 1 year ago

It's a pretty important fact since there's a huge difference between 3.5 and 4. Mentioning it once in one place is not great, plus they also just mention ChatGPT without specifying 3.5 or 4 earlier in that paragraph. The problem I have is that this has led the press (and hence many other people) to think ChatGPT is terrible at coding, when in fact the GPT-4 version is actually pretty decent.

this post was submitted on 10 Aug 2023

157 points (100.0% liked)
