this post was submitted on 28 Nov 2023
LocalLLaMA
Community for discussing Llama, the family of large language models created by Meta AI.
load-in-4bit takes a long time to load a model, and the loaded model performs poorly in both speed and output quality.
If you are interested, I have compared a number of quantization methods for Mistral, llama-7b, and orca2-13b at https://desync.xyz/.
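
If you want to check the load-time and speed claims on your own hardware, a minimal timing sketch could look like the one below. It is not the method used for the comparison linked above. `benchmark`, `load_fn`, and `generate_fn` are hypothetical names introduced here for illustration: `load_fn` is assumed to load and return a model, and `generate_fn` is assumed to run generation and return the number of new tokens produced. Output quality is not measured.

```python
import time
from typing import Any, Callable, Dict


def benchmark(load_fn: Callable[[], Any],
              generate_fn: Callable[[Any], int]) -> Dict[str, float]:
    """Time a model load and one generation call.

    load_fn:     hypothetical callable that loads and returns a model.
    generate_fn: hypothetical callable that takes the model, runs
                 generation, and returns the number of new tokens.
    """
    # Wall-clock time spent loading the model.
    start = time.perf_counter()
    model = load_fn()
    load_seconds = time.perf_counter() - start

    # Wall-clock time spent generating, and how many tokens came out.
    start = time.perf_counter()
    new_tokens = generate_fn(model)
    gen_seconds = time.perf_counter() - start

    # Guard against a zero-length interval on very fast dummy runs.
    if gen_seconds > 0:
        tokens_per_second = new_tokens / gen_seconds
    else:
        tokens_per_second = float("inf")

    return {
        "load_seconds": load_seconds,
        "gen_seconds": gen_seconds,
        "new_tokens": float(new_tokens),
        "tokens_per_second": tokens_per_second,
    }


if __name__ == "__main__":
    # Dummy stand-ins so the script runs without any model installed.
    def fake_load() -> str:
        time.sleep(0.05)
        return "model"

    def fake_generate(model: str) -> int:
        time.sleep(0.05)
        return 32

    for name, value in benchmark(fake_load, fake_generate).items():
        print(f"{name}: {value:.3f}")
```

With Hugging Face transformers, `load_fn` could wrap a call such as `AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)`, assuming bitsandbytes is installed. Run each quant method with the same prompt and token budget so the numbers are comparable.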