justletmefuckinggo@alien.top 1 points 2 years ago

I'm new here, but is this true multimodality, or is it the LLM communicating with a vision model?

And what exactly are those 4 models being benchmarked for here?