[–] Grimulkan@alien.top 1 points 9 months ago

Do you find any repetition problems at longer context lengths (closer to 4K)?

[–] Grimulkan@alien.top 1 points 9 months ago

A 70B model trained at 16K context, quantized to 4.x bpw, fits with exllamav2 with room to spare. If you can add a 3090 or 4090 as well, you can run a 6-bpw 70B at 32K. That's my standard inference setup, and it covers a lot of ground.
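A rough sanity check on those numbers: the sketch below estimates VRAM for weights plus KV cache. It assumes a Llama-2-70B-style architecture (80 layers, 8 GQA key/value heads, head dim 128, fp16 cache) and a 48 GB base GPU budget; none of those specifics are stated in the comment, so treat the figures as ballpark only.

```python
# Back-of-the-envelope VRAM estimate for a quantized 70B model at long
# context. Architecture numbers are an assumption (Llama-2-70B-style:
# 80 layers, 8 GQA key/value heads, head dim 128, fp16 KV cache).

def vram_estimate_gib(params=70e9, bpw=4.5, ctx=16384,
                      n_layers=80, n_kv_heads=8, head_dim=128):
    weights = params * bpw / 8                         # bytes for weights
    # K and V, 2 bytes each (fp16), per token per layer
    kv_per_token = 2 * n_kv_heads * head_dim * 2 * n_layers
    kv_cache = ctx * kv_per_token                      # bytes for KV cache
    return (weights + kv_cache) / 2**30

print(f"4.5 bpw @ 16K: ~{vram_estimate_gib():.1f} GiB")
print(f"6.0 bpw @ 32K: ~{vram_estimate_gib(bpw=6.0, ctx=32768):.1f} GiB")
```

Under these assumptions the 4.5-bpw/16K case lands around 42 GiB (inside a 48 GB budget, with some headroom for activations), and the 6-bpw/32K case around 59 GiB, which is why it needs the extra 24 GB card.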