submitted 1 year ago by throws_lemy@lemmy.nz to c/science@mander.xyz
[-] andrew_bidlaw@sh.itjust.works 4 points 1 year ago

Wilkinson ... has examined several data sets generated by earlier versions of the large language model, which he says lacked convincing elements when scrutinized, because they struggled to capture realistic relationships between variables.

This revealed a mismatch in many ‘participants’ between designated sex and the sex that would typically be expected from their name. Furthermore, no correlation was found between preoperative and postoperative measures of vision capacity and the eye-imaging test. Wilkinson and Lu also inspected the distribution of numbers in some of the columns in the data set to check for non-random patterns. The eye-imaging values passed this test, but some of the participants’ age values clustered in a way that would be extremely unusual in a genuine data set: there was a disproportionate number of participants whose age values ended with 7 or 8.

This has "it's 2 am and the homework is due this morning" energy. It seems they were careless and probably thought their data wouldn't be studied at all. Relationships between columns are where forgeries like these will always suffer. It takes a good amount of understanding to fake those relationships convincingly, and LLMs lack it unless a human explicitly guides them to take the relationships into account. Otherwise they come up with their own, where post-op condition may depend on the patient's last name and 8s and 9s are the most popular second digits for an age.
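
For anyone curious what checks like the ones in the quote could look like, here is a rough sketch in plain Python. It is not what Wilkinson and Lu actually ran (the article excerpt doesn't say which tests they used); the function names and the toy age lists are made up for illustration. It covers two ideas: a chi-square statistic on the last digits of a column, which should be roughly uniform for something like age, and a Pearson correlation between two columns that ought to be related, such as pre-op and post-op measures.

```python
from collections import Counter
from math import sqrt

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF9_05 = 16.919


def last_digit_chi2(values):
    """Chi-square statistic of the last digits against a uniform distribution."""
    counts = Counter(int(v) % 10 for v in values)
    expected = len(values) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy data, invented for this example (not from the study).
# Every last digit appears equally often here:
plausible_ages = list(range(20, 80))
# Here most ages end in 7 or 8:
suspicious_ages = [27, 28, 37, 38, 47, 48, 57, 58, 67, 68] * 5 + list(range(20, 30))

plausible_stat = last_digit_chi2(plausible_ages)
suspicious_stat = last_digit_chi2(suspicious_ages)

# A statistic above the critical value suggests the last digits are not uniform.
suspicious_flagged = suspicious_stat > CHI2_CRIT_DF9_05
plausible_flagged = plausible_stat > CHI2_CRIT_DF9_05
```

On a real data set you would then run `pearson_r` on the pre-op and post-op columns: a value near zero where clinical experience says there should be a clear relationship is the kind of red flag the quote describes. A skewed last-digit distribution alone doesn't prove fabrication, since rounding habits can cause it too, so treat this as a screening step only.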

this post was submitted on 24 Nov 2023
93 points (100.0% liked)
