this post was submitted on 18 Nov 2023
LocalLLaMA
This seems, to me, a terrible riddle. Not only can you play chess online, not only can you play chess against a computer, but you can literally play chess alone.
GPT is correct: this is an open-ended question and there's not enough information to actually answer it beyond a clever guess.
Open-ended questions are the best for evaluating LLMs, because they require common sense, world knowledge, doxa, and human-like behavior.
Saying "I don't know" is just a cop-out response. It should at least say something like "It could be X, but ..." and be a little creative.
Another (less?) open-ended question with the same premise would be "Where are they?", and I'd expect the answer to be "In a garden".
GPT-4 Turbo (with custom instructions) answers very well: https://chat.openai.com/share/c305568e-f89e-4e71-bb97-79f7710c441a
Perhaps there's a language barrier here, but none of those activities hint at a garden? In my locale, a garden is a small patch used to grow veggies, herbs, and/or flowers. So I would answer this with "their back yard."
This is a much better riddle for children IMO, because it's barely open-ended at all. The original has almost infinite answers without any leaps or tricks, but yours has a very limited domain: a yard/garden. Though if someone were extra clever, the problem space does open back to nearly infinity (if brother 4 is playing a video game).
For personal testing, that's certainly a valid opinion! But it's not very productive from an objective standpoint because it can't be graded and tests a "gotcha" path of thinking, when we're still focusing on fundamentals like uniform context attention, consistency over time, etc.