[–] justletmefuckinggo@alien.top 1 points 11 months ago

I'm new here, but is this true multimodality, or is it the LLM communicating with a vision model?

And what exactly are those 4 models being benchmarked for here?