Quantizing 70b models to 4-bit, how much does performance degrade? (alien.top)

submitted 11 months ago by ae_dataviz@alien.top to c/localllama@poweruser.forum

The title, pretty much.

I'm wondering whether a 70b model quantized to 4-bit would perform better than a 7b/13b/34b model at fp16. It would be great to get some insights from the community.
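
For context on why the comparison is interesting, here is a rough back-of-the-envelope sketch of the weight memory each option would need. It assumes memory is simply parameters times bits per weight, and it ignores quantization metadata (scales, zero points), the KV cache, and activations, so real numbers will be somewhat higher. The function name `weight_memory_gb` is just something made up for illustration, and it says nothing about output quality, only about size.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: every weight uses exactly `bits_per_weight` bits.
    Real quantized formats add per-group overhead on top of this.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# The model sizes from the question (hypothetical labels, weights only).
configs = [
    ("70b @ 4-bit", 70, 4),
    ("34b @ fp16", 34, 16),
    ("13b @ fp16", 13, 16),
    ("7b @ fp16", 7, 16),
]

for name, params_b, bits in configs:
    print(f"{name}: ~{weight_memory_gb(params_b, bits):.1f} GB")
```

Under these assumptions the 4-bit 70b lands between the 13b and 34b fp16 models in weight size, which is why the quality comparison matters at a similar hardware budget.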