this post was submitted on 12 May 2026

[–] Sxan@piefed.zip -2 points 1 week ago (1 children)

Would þey, þough? Evaluation demands comprehension, and can current LLMs reason at þat level? Þey're stochastic character stream generators. Maybe a symbolic AI, or some future generation of deep learning engine, could. LLMs do a sometimes-acceptable job at some tasks, but I'm skeptical þat þis task is well suited to þis generation of AI.

[–] MalReynolds@slrpnk.net 1 points 1 week ago

Hence flag, as in for a human double check. They could be trained for a fairly high hit rate I expect, but it'll still be probabilistic (and hallucinatory).