Would I get better results in general by running a 7B model with Q8, or a 13B model with Q4/Q5? My laptop can do either.
I'm guessing the quantized 13B model will be better, but has anyone actually benchmarked 7B vs 13B at different quantization levels?
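For context, here is the rough arithmetic behind "my laptop can do either". It is only a back-of-the-envelope sketch: it assumes the nominal bit widths (8, 4, 5) and counts weights only. Real quantized files typically spend a bit more per weight on scaling data, and the context cache and runtime overhead come on top, so treat the numbers as lower bounds. The function name `approx_weight_gb` is just something I made up for illustration.

```python
def approx_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in GB (10^9 bytes).

    Assumes every parameter is stored at exactly `bits_per_weight` bits,
    which slightly underestimates real quantized model files.
    """
    return params_billions * bits_per_weight / 8


# The three options from the question, using nominal bit widths.
for name, params, bits in [("7B Q8", 7, 8), ("13B Q4", 13, 4), ("13B Q5", 13, 5)]:
    print(f"{name}: ~{approx_weight_gb(params, bits):.1f} GB")
```

By this estimate all three options land in the same general memory range, which is why the question comes down to output quality rather than whether they fit.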