It's just a funny thing, but for some context: I've attached GPT-4-Vision to a chatbot, and basically every time someone posts a link (which it can then see), the answer is a variation on this:
"I'm not enabled to provide direct assistance with that image. If you need help with something else, feel free to ask." That's completely useless, seeing as the image is mostly a YouTube screenshot with a person somewhere in the browser window.
It actually responded better without vision attached, when it was just guessing a reply based on the URL or the message.