this post was submitted on 01 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

founded 10 months ago

I'm using Llama models for local inference with LangChain, and I get a lot of hallucinations with GGML models. I've tried both the base LLM and chat variants (7B and 13B), because I have 16 GB of RAM.
Now I'm exploring new models and want to find a good one. Should I try the GGUF format?
Please share suggestions if you're using local models with LangChain in production.
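As a rough sanity check for the 16 GB constraint mentioned above: a quantized model's RAM footprint is approximately parameter count times bits per weight, plus some runtime overhead. The sketch below is my own back-of-the-envelope arithmetic, not anything from LangChain or llama.cpp; the ~4.5 bits/weight figure (ballpark for a Q4_K_M quant) and the 1.5 GB overhead allowance are assumptions.

```python
def approx_ram_gb(n_params_billion, bits_per_weight=4.5, overhead_gb=1.5):
    """Rough RAM estimate for a quantized local model.

    bits_per_weight ~4.5 approximates a Q4_K_M quant; overhead_gb is a
    ballpark allowance for the KV cache and runtime buffers.
    """
    weights_gb = n_params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb

for size in (7, 13, 34):
    print(f"{size}B ~ {approx_ram_gb(size):.1f} GB")
```

By this estimate a Q4-quantized 7B (~5.4 GB) or 13B (~8.8 GB) fits comfortably in 16 GB of RAM, while a 34B (~20.6 GB) does not.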

top 3 comments
[–] tortistic_turtle@alien.top 1 points 10 months ago

GGUF won't change the level of hallucination, but you are right that most newer language models are quantized to GGUF, so it makes sense to use one.

[–] __SlimeQ__@alien.top 1 points 10 months ago

GGML is totally deprecated, so much so that the make-ggml.py script in llama.cpp now produces GGUFs
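If you're not sure which format a downloaded or converted file is actually in, GGUF is easy to identify: every GGUF file starts with the 4-byte ASCII magic `GGUF`, followed by a version number. A minimal sketch (the `looks_like_gguf` helper name is mine):

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == GGUF_MAGIC

# Write a fake header and check it (a real file would follow the magic
# with tensor and metadata counts after the uint32 version).
with open("fake.gguf", "wb") as f:
    f.write(GGUF_MAGIC + struct.pack("<I", 3))  # magic + version 3

print(looks_like_gguf("fake.gguf"))  # → True
```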

[–] StrikePrice@alien.top 1 points 10 months ago