Surge in fake citations uncovered by audit of 2.5 million biomedical science papers : science

[–] Rothe@piefed.social 29 points 1 week ago

AI is going to undermine all scientific progress we have made these last couple of centuries. But a handful of techbro oligarchs are going to add some billions to their piles of billions, so it's all worth it.

[–] Squirrelsdrivemenuts@lemmy.world 20 points 1 week ago (2 children)

So they found nonexistent references in 0.1% of 2.5 million papers examined, and only 0.01% contained more than 2 fake references. Their method also appears to have had a false positive rate for fake references of 7/10.

While the problem is concerning, and growing (fuck AI), 99.9% of papers are ok. We have always known we have to read every paper critically, now there is just an extra thing to look critically at.

In my experience, a way bigger issue is references that don't actually contain or prove the information for which they are referenced.

[–] fonix232@fedia.io 11 points 1 week ago (1 children)

The surge is still worrying though. 0.1% of 2.5 million is still 2500 papers that had fake citations, although only 25 had more than two fake citations, which is somewhat reassuring.

Still, the fact that an egregious violation of scientific protocol (all prior fake citations had to be intentional) is now turned into an "oopsie woopsie the computer made a stukkie-wukkie" and it doesn't come with immediate loss of credibility, while the numbers are rising (and I highly doubt they're stopping here), is astonishing.

[–] Creat@discuss.tchncs.de 3 points 1 week ago* (last edited 1 week ago) (2 children)

It's just one decimal place, not 2. So it's 250 papers with 2 or more fake references.

[–] fonix232@fedia.io 3 points 1 week ago

Ah missed a zero.
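
For anyone wanting to double-check the numbers in this subthread, here is a quick sketch in Python. It assumes the rounded figures quoted above (2.5 million papers, 0.1% and 0.01%), so the counts are approximate; the function and variable names are made up for illustration.

```python
from fractions import Fraction

# Figures as quoted in the comments above (rounded, so counts are approximate).
TOTAL_PAPERS = 2_500_000


def share_to_count(total: int, percent: str) -> int:
    """Convert a percentage given as a string (e.g. "0.1") into a paper count.

    Fraction is used instead of float so that 0.1% and 0.01% are exact.
    """
    return int(total * Fraction(percent) / 100)


papers_with_fake_refs = share_to_count(TOTAL_PAPERS, "0.1")
papers_with_more_than_two = share_to_count(TOTAL_PAPERS, "0.01")

print(papers_with_fake_refs)       # 2500
print(papers_with_more_than_two)   # 250
```

So 0.01% is one decimal place below 0.1%, which gives 250 papers rather than 25.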

[–] MalReynolds@slrpnk.net 6 points 1 week ago* (last edited 1 week ago) (1 children)

a way bigger issue is references that don’t actually contain or prove the information for which they are referenced.

Ironically, that's something (language processing) LLMs might actually be reasonably good at flagging, with a bit of work.

[–] Sxan@piefed.zip -2 points 1 week ago (1 children)

Would þey, þough? Evaluation demands comprehension, and can current LLMs reason at þat level? Þey're stochastic character stream generators. Maybe a symbolic-based AI, or some future generation of deep learning engine, could. LLMs do a sometimes acceptable job at some tasks, but I'm skeptical þat þis task would be well suited for þis generation of AI.

[–] MalReynolds@slrpnk.net 1 points 1 week ago

Hence flag, as in for a human double check. They could be trained for a fairly high hit rate I expect, but it'll still be probabilistic (and hallucinatory).

[–] Zephorah@discuss.online 11 points 1 week ago

Note that this is for papers published between 2023 and 2026. No doubt a product of AI.

[+] bonenode@piefed.social 5 points 1 week ago* (last edited 1 week ago) (1 children)

[deleted]

[–] Gsus4@mander.xyz 9 points 1 week ago (2 children)

Funny you should say that, because neither authors nor reviewers get paid.

[–] ThomasWilliams@lemmy.world 1 points 1 week ago (1 children)

More papers = more pay.

The researchers don't work for free either.

[–] Gsus4@mander.xyz 1 points 1 week ago

Maybe in academia. Maybe. Papers will only get you so far at one stage in your academic career. Things change once you move up the ladder.

[–] Gsus4@mander.xyz 4 points 1 week ago

another AI revolution benefit!