durden111111

joined 10 months ago
[–] durden111111@alien.top 1 points 10 months ago

Nice. From my tests it seems to be about the same as LLaVA v1.5 13B and BakLLaVA. I'm starting to suspect that the CLIP-Large vision encoder all of these multimodal LLMs are using is holding them back.

[–] durden111111@alien.top 1 points 10 months ago (5 children)

I found it to be worse than OpenHermes 2.5. It just gives shorter, more robotic responses.

[–] durden111111@alien.top 1 points 10 months ago

OpenHermes 2.5 still feels significantly better, imo.

[–] durden111111@alien.top 1 points 10 months ago (2 children)

Hopefully we get GGUFs soon

[–] durden111111@alien.top 1 points 10 months ago

Text Generation WebUI for general inference.

llama.cpp server for multimodal.
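For multimodal use, the llama.cpp server takes the language model GGUF plus a separate multimodal projector file via `--mmproj`. A minimal sketch of the invocation (the file names here are placeholders, not specific models from this thread):

```shell
# Launch the llama.cpp server with a LLaVA-style multimodal projector.
# -m      : path to the language model GGUF (placeholder name)
# --mmproj: path to the vision projector GGUF (placeholder name)
# -c      : context size in tokens
# -ngl    : number of layers to offload to the GPU
./server \
  -m models/llava-v1.5-13b.Q5_K_M.gguf \
  --mmproj models/mmproj-model-f16.gguf \
  -c 4096 \
  -ngl 35
```

Once running, the built-in web UI lets you attach an image alongside the text prompt.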

[–] durden111111@alien.top 1 points 10 months ago

I'm wondering too. OpenHermes 2.5 works fine for me in Oobabooga, but it just stops outputting tokens once it reaches 4k context, despite everything being set for 8k (I'm running the GGUF offloaded to GPU).
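One thing worth double-checking with GGUF loaders is that the context size is applied at model load time, not just in the generation settings. A sketch of explicitly requesting 8k context when loading with llama.cpp directly (the model filename is a placeholder):

```shell
# Load the GGUF with an 8192-token context window set at load time.
# If -c is left at its default, generation can cut off at the smaller
# default context even though the UI sliders say 8k.
./server \
  -m models/openhermes-2.5-mistral-7b.Q5_K_M.gguf \
  -c 8192 \
  -ngl 35
```

In text-generation-webui the equivalent is the `n_ctx` field on the model load tab; changing it requires reloading the model.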
