uhuge

joined 11 months ago
[–] uhuge@alien.top 1 points 11 months ago

Some 70B Llama in 8-bit GGUF would be cool; you can play with Goliath 120B at <8 bpw.

[–] uhuge@alien.top 1 points 11 months ago

Over Skype, right? ;)

[–] uhuge@alien.top 1 points 11 months ago (1 children)

I am not able to select and copy any text while it's generating. It seems like a UX bug where the selection disappears with each token streamed in.

[–] uhuge@alien.top 1 points 11 months ago

Not sure if usable, but "rounds" or "amount" seem like good alternatives.

[–] uhuge@alien.top 1 points 11 months ago

Maybe this is the wrong suggestion, but I've gotten used to having a /docs endpoint describing the available endpoints; would you consider adding one too, u/Evening_Ad6637?
It could point to/render https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#api-endpoints at first; either way, it seems helpful to have it served.
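
For illustration, here's a minimal sketch of what I have in mind, written with FastAPI rather than the actual llama.cpp server (the route and setup are just assumptions on my part):

```python
# Hypothetical sketch only: a tiny FastAPI app exposing a /docs endpoint
# that, for now, just redirects to the upstream llama.cpp server README.
from fastapi import FastAPI
from fastapi.responses import RedirectResponse

# Disable FastAPI's built-in Swagger page so /docs is free for our own handler.
app = FastAPI(docs_url=None, redoc_url=None)

README_URL = (
    "https://github.com/ggerganov/llama.cpp/blob/master/"
    "examples/server/README.md#api-endpoints"
)

@app.get("/docs")
def docs() -> RedirectResponse:
    # First version: point at the README; later it could render the markdown inline.
    return RedirectResponse(README_URL)
```

Run it with `uvicorn app:app` and /docs is served right away; rendering the markdown locally could come later.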

[–] uhuge@alien.top 1 points 11 months ago

Can it serve on a CPU-only machine?

[–] uhuge@alien.top 1 points 11 months ago

I've had mixed experiences with Bavarder: native UI and a fair choice of models to grab, but it often doesn't work reliably. They seem to be improving it slowly but steadily.

[–] uhuge@alien.top 1 points 11 months ago (1 children)

What is needed to get it done? Can anyone help, or is it expected to take just a few days of your focused time?

[–] uhuge@alien.top 1 points 11 months ago

I assume auth_token is for storing the merged model on HF? Seems worth noting/clarifying.
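
If that is the case, I'd guess it's used roughly like this — just a sketch based on huggingface_hub, not the actual script (the repo id and folder are placeholders):

```python
# Hypothetical sketch of how auth_token could be used to push the merged model to HF.
# The repo id and local folder below are placeholders, not taken from the tool itself.
from huggingface_hub import HfApi

auth_token = "hf_..."  # write-enabled personal access token

api = HfApi(token=auth_token)
api.create_repo(repo_id="your-username/merged-model", exist_ok=True)
api.upload_folder(
    folder_path="./merged-model",            # directory holding the merged weights
    repo_id="your-username/merged-model",
)
```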

I'll get back with more feedback when I get to test it. :)

[–] uhuge@alien.top 1 points 11 months ago (1 children)

Keep my friends at https://alignmentjam.com/jams cool,
they are amazing and fun!

Most alignment folks do not care about the polite-correctness sh*t at all; they just want humanity not to be killed or enslaved.
