this post was submitted on 21 Nov 2023

Machine Learning

With the ongoing turmoil at OpenAI, has anyone found an alternative to their vision endpoint that offers comparable functionality? I'm aware of LLaVA, which still seems early in its maturity, but are there any commercial offerings?

top 4 comments
[–] vatsadev@alien.top 1 points 11 months ago

There's Fuyu-8B, but it has no commercial license.

It can really cover the "GPT-4 reads websites" use case and similar tasks, and it's helpful with complex charts too. Other than that, LLaVA is your best hope.

[–] thomasxin@alien.top 1 points 11 months ago

https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/neva-22b

https://replicate.com/joehoover/instructblip-vicuna13b/api

Here are a couple that haven't been mentioned; they're quite a lot weaker than GPT-4V though, as is to be expected from small models.

[–] mincksthethird@alien.top 1 points 11 months ago

Have you checked out the new release from OpenVL? Their vision API is gaining traction and might fit your needs.

[–] brunoezechutari@alien.top 1 points 11 months ago

Have you checked out LLaVA? It's still early in its maturity, but it seems like a promising alternative. Not sure about commercial offerings, though.