this post was submitted on 01 Nov 2023

LocalLLaMA


A community for discussing Llama, the family of large language models created by Meta AI.


I'm using Llama models for local inference with LangChain, but I get a lot of hallucinations with GGML models. I've tried both the base and chat variants (7B and 13B), since I have 16 GB of RAM.
So now I'm exploring new models and want to find a good one. Should I try the GGUF format?
Please share suggestions if you're running local models with LangChain at production level.
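For reference, a minimal sketch of loading a GGUF model through LangChain's LlamaCpp wrapper (backed by llama-cpp-python); the model path and parameter values below are placeholders, not a recommendation from this thread:

```python
# Minimal sketch: running a local GGUF model through LangChain.
# Assumes `pip install langchain llama-cpp-python`; the model path
# is a placeholder for whichever GGUF file you download.
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./models/llama-2-13b-chat.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window size
    temperature=0.1,   # lower temperature tends to curb rambling
    max_tokens=512,
    verbose=False,
)

print(llm("Explain the difference between GGML and GGUF in one sentence."))
```

With 16 GB of RAM, a 13B model at Q4_K_M quantization (roughly 8 GB on disk) should fit comfortably alongside the context buffers.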

top 3 comments
[–] tortistic_turtle@alien.top 1 points 1 year ago

GGUF won't change the level of hallucination, but you are right that most newer language models are quantized to GGUF, so it makes sense to use one.
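Since the quantization format doesn't address hallucination itself, a common mitigation on the LangChain side is to ground the model in supplied context and tell it to refuse otherwise; a minimal sketch (the prompt wording is illustrative, not from this thread):

```python
from langchain.prompts import PromptTemplate

# Illustrative grounding prompt: the model is instructed to answer
# only from the supplied context, which tends to reduce hallucination.
grounded_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)
```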

[–] __SlimeQ__@alien.top 1 points 1 year ago

GGML is totally deprecated, so much so that the make-ggml.py script in llama.cpp now produces GGUFs.

[–] StrikePrice@alien.top 1 points 1 year ago