My office PC has an RTX 2060 12GB and it runs 13B models at 4-bit no problem.
That's pretty much its limit though. 13B at 4-bit with 4096 context maxes out the VRAM, but it is stable.
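
For anyone wondering why that combination lands right at the edge of 12 GB, here is a rough back-of-the-envelope sketch. It assumes a Llama-style 13B layout (40 layers, hidden size 5120) and an fp16 KV cache; those numbers and the function name are my assumptions for illustration, not something measured on this card, and real usage adds overhead for quantization scales, activations, and the CUDA context.

```python
def estimate_vram_gib(n_params, bits_per_weight, n_layers, hidden_size,
                      context_len, kv_bytes=2):
    """Rough VRAM estimate in GiB: quantized weights plus KV cache only.

    Hypothetical helper for illustration. Ignores quantization metadata,
    activations, and framework overhead, so real usage will be higher.
    """
    # Weights: one value per parameter at the given bit width.
    weights_bytes = n_params * bits_per_weight / 8
    # KV cache: a key and a value vector (hence the factor of 2)
    # per layer, per token, each of size hidden_size.
    kv_bytes_total = 2 * n_layers * hidden_size * context_len * kv_bytes
    gib = 2 ** 30
    return weights_bytes / gib, kv_bytes_total / gib


# Assumed Llama-style 13B shape: 40 layers, hidden size 5120, fp16 cache.
weights_gib, kv_gib = estimate_vram_gib(
    n_params=13e9, bits_per_weight=4,
    n_layers=40, hidden_size=5120, context_len=4096,
)
total_gib = weights_gib + kv_gib
print(f"weights ~{weights_gib:.2f} GiB, KV cache ~{kv_gib:.2f} GiB, "
      f"total ~{total_gib:.2f} GiB before overhead")
```

Under those assumptions the weights and a full 4096-token cache already take roughly three quarters of the card, so once the remaining overhead is added there is not much room left for a bigger model or a longer context.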