this post was submitted on 14 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

I've been playing with a lot of models around 7B, but I'm now prototyping something that I think would be fine with a 1B model. The only model of that size I've seen is Phi-1.5, and I haven't found a way to run it efficiently so far; llama.cpp still hasn't implemented it, for instance.
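
For reference, the only baseline I know of is loading it with plain transformers, roughly like the sketch below (it runs, just not what I'd call efficient; as far as I know the checkpoint needed `trust_remote_code` at the time):

```python
# Rough sketch: running Phi-1.5 with plain Hugging Face transformers.
# Not efficient, but it works without llama.cpp support.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Identify the fallacy: everyone I know likes X, therefore X is true."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```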

Does anyone have an idea of what to use?

top 9 comments
[–] SnooSquirrels3380@alien.top 1 points 10 months ago
[–] kristaller486@alien.top 1 points 10 months ago

You can try MLC-LLM (https://llm.mlc.ai/); it has tooling for running inference on quantized models, including on the web.
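
For local prototyping it also has a Python package; a minimal sketch using the `mlc_chat` ChatModule API from around that time (the model id is just an example of one of their prebuilt quantized models, and the exact API may differ between versions):

```python
# Minimal sketch of local inference with MLC-LLM's Python package (mlc_chat).
# Assumes a prebuilt quantized model has already been downloaded/compiled;
# the model id below is only an example.
from mlc_chat import ChatModule
from mlc_chat.callback import StreamToStdout

cm = ChatModule(model="Llama-2-7b-chat-hf-q4f16_1")  # example prebuilt model id
cm.generate(
    prompt="What fallacy is this: he's rich, so he must be right?",
    progress_callback=StreamToStdout(callback_interval=2),  # stream tokens to stdout
)
```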

[–] LyPreto@alien.top 1 points 10 months ago (1 children)

Deepseek-Coder has a 1B model that I believe outperforms some 13B models. I'll check back once I find a link.

Edit: found it https://evalplus.github.io/leaderboard.html
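
If you want to give it a quick try, here's a rough transformers sketch (the checkpoint name is my guess at the ~1.3B instruct variant; double-check it on the Hugging Face hub):

```python
# Rough sketch: trying the ~1.3B Deepseek-Coder instruct model with transformers.
# The checkpoint name is an assumption; verify it on the Hugging Face hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Write a Python function that checks whether a list is sorted."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```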

[–] palpapeen@alien.top 1 points 10 months ago (1 children)

Thanks! But I'm not looking for one that does coding, more for one that's good at detecting fallacies and reasoning. Phi-1.5 seems like a better fit for that.

[–] LyPreto@alien.top 1 points 10 months ago

I would still give it a try. It's misleading to think these coding models are only good at that; being good at coding has actually been shown to improve scores across multiple benchmarks.

[–] vatsadev@alien.top 1 points 10 months ago

RWKV 1.5B. It's SOTA for its size, outperforms TinyLlama, and uses no extra VRAM to fit its whole context length, even running in the browser.
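
Since it's an RNN, its state stays a fixed size no matter how long the context gets, which is where the "no extra VRAM" part comes from. A rough transformers sketch for a ~1.5B RWKV checkpoint (the model id is my assumption; newer RWKV-5 "World" weights may need the standalone `rwkv` package instead):

```python
# Rough sketch: a ~1.5B RWKV checkpoint via transformers (native RWKV support
# landed in v4.29). The checkpoint name is an assumption; check the hub, and
# note that RWKV-5 "World" models may need the standalone `rwkv` package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/rwkv-4-1b5-pile"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The most common logical fallacy is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```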

[–] Regular_Instruction@alien.top 1 points 10 months ago (1 children)
[–] palpapeen@alien.top 1 points 10 months ago

I mean, yeah, but it's not done training AFAIK, and it's not fine-tuned either.

[–] darxkies@alien.top 1 points 10 months ago